Abstract
We prove an extension of Szarek’s optimal Khinchin inequality (1976) for distributions close to the Rademacher one, when all the weights are uniformly bounded by a \(1/\sqrt{2}\) fraction of their total \(\ell _2\)-mass. We also show a similar extension of the probabilistic formulation of Ball’s cube slicing inequality (1986). These results establish the distributional stability of these optimal Khinchin-type inequalities. The underpinning of such estimates is the Fourier-analytic approach going back to Haagerup (1981).
1 Introduction
Let \(\varepsilon _1, \varepsilon _2, \dots \) be independent identically distributed (i.i.d.) Rademacher random variables, that is, symmetric random signs satisfying \(\mathbb {P}\left( \varepsilon _j = \pm 1 \right) = \frac{1}{2}\). Motivated by his study of bilinear forms on infinitely many variables, Littlewood conjectured in [26] (see also [15]) the following inequality: for every \(n \ge 1\) and every unit vector a in \(\mathbb {R}^n\), we have
\[
\mathbb{E}\Big|\sum_{j=1}^n a_j \varepsilon_j\Big| \ge \frac{1}{\sqrt{2}}, \qquad\qquad (1)
\]
which is clearly best possible. Not until 46 years after it had been posed was this proved, by Szarek in [34]. His result was later generalised in a stunning way to the setting of vector-valued coefficients \(a_j\) in an arbitrary normed space by Latała and Oleszkiewicz in [24] (see also [30, Section 4.2] for a modern presentation of their proof using discrete Fourier analysis). Szarek’s original proof was based mainly on an intricate inductive scheme (see also [35]). Note that (1) holds trivially if \(\Vert a\Vert _\infty = \max _j |a_j| \ge \frac{1}{\sqrt{2}}\), for if, say, we have \(|a_1| \ge \frac{1}{\sqrt{2}}\), then thanks to independence and convexity,
\[
\mathbb{E}\Big|\sum_{j=1}^n a_j\varepsilon_j\Big| \ge \mathbb{E}\Big|\mathbb{E}\Big[\sum_{j=1}^n a_j\varepsilon_j\,\Big|\,\varepsilon_1\Big]\Big| = \mathbb{E}|a_1\varepsilon_1| = |a_1| \ge \frac{1}{\sqrt{2}}.
\]
Haagerup in his pioneering work [14] on Khinchin inequalities offered a very different approach to the nontrivial regime \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\), using classical Fourier-analytic integral representations along with tricky estimates for a special function.
Taking that route, the point of this paper is to illustrate the robustness of Haagerup’s method and extend (1) to i.i.d. sequences of random variables whose distribution is close to the Rademacher one in the \(\textsf{W}_2\)-Wasserstein distance. Using the same framework, we also treat Ball’s cube slicing inequality from [3], which asserts that the maximal-volume hyperplane section of the cube \([-1,1]^n\) in \(\mathbb {R}^n\) is attained at \((1,1,0,\dots , 0)^\perp \). This can be equivalently stated in probabilistic terms as an inequality akin to (1) as follows (see, e.g. equation (2) in [6]). Let \(\xi _1, \xi _2, \dots \) be i.i.d. random vectors uniform on the unit Euclidean sphere in \(\mathbb {R}^3\). For every \(n \ge 1\) and every unit vector a in \(\mathbb {R}^n\), we have
\[
\mathbb{E}\Big|\sum_{j=1}^n a_j \xi_j\Big|^{-1} \le \mathbb{E}\Big|\frac{\xi_1+\xi_2}{\sqrt{2}}\Big|^{-1} = \sqrt{2}, \qquad\qquad (2)
\]
where here and throughout \(|\cdot |\) denotes the standard Euclidean norm.
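As a quick sanity check of the value of the constant in (2) (this computation is ours, not part of the original argument): since \(|\xi _1+\xi _2|^2 = 2 + 2\left\langle \xi _1, \xi _2\right\rangle \) and \(\left\langle \xi _1, \xi _2\right\rangle \) is uniform on \([-1,1]\) by Archimedes’ Hat-Box theorem,
\[
\mathbb{E}\Big|\frac{\xi_1+\xi_2}{\sqrt{2}}\Big|^{-1} = \mathbb{E}\,(1+U)^{-1/2} = \frac{1}{2}\int_{-1}^{1}\frac{\textrm{d}u}{\sqrt{1+u}} = \sqrt{2},
\]
where \(U\) is uniform on \([-1,1]\).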
Szarek’s inequality (1), Ball’s inequality (2), as well as these extensions fall under the umbrella of so-called Khinchin-type inequalities. The archetype was Khinchin’s result asserting that all \(L_p\) norms of Rademacher sums \(\sum a_j\varepsilon _j\) are comparable to their \(L_2\)-norm, established in his work [18] on the law of the iterated logarithm (and perhaps discovered independently by Littlewood in [26]). Due to the intricacies of the methods involved, sharp Khinchin inequalities are known only for a handful of distributions, most notably random signs [14, 29], but also uniforms [2, 5, 6, 8, 19, 22, 25], type L [17, 32], Gaussian mixtures [1, 10], marginals of \(\ell _p\)-balls [4, 11], or distributions with good spectral properties [23, 33]. The present work takes a first step towards more general distributions satisfying only a closeness-type assumption instead of imposing structural properties. Viewing sharp Khinchin-type inequalities as maximization problems for functionals on the sphere, our results show, perhaps surprisingly, that such inequalities are stable with respect to perturbations of the law of the underlying random vectors. These distributional stability results are novel in the context of optimal probabilistic inequalities.
2 Main results
For \(p > 0\) and a random vector X in \(\mathbb {R}^d\), we denote its \(L_p\)-norm with respect to the standard Euclidean norm \(|\cdot |\) on \(\mathbb {R}^d\) by \(\Vert X\Vert _p = (\mathbb {E}|X|^p)^{1/p}\), whereas for a (deterministic) vector a in \(\mathbb {R}^n\), \(\Vert a\Vert _\infty = \max _{j \le n} |a_j|\) is its \(\ell _\infty \)-norm. We say that the random vector X in \(\mathbb {R}^d\) is symmetric if \(-X\) has the same distribution as X. We also recall that the vector X is called rotationally invariant if for every orthogonal map U on \(\mathbb {R}^d\), UX has the same distribution as X. Equivalently, X has the same distribution as \(|X|\xi \), where \(\xi \) is uniformly distributed on the unit sphere \(\mathbb {S}^{d-1}\) in \(\mathbb {R}^d\) and independent of |X|. Recall that the \(\textsf{W}_2\)-Wasserstein distance \(\textsf{W}_2(X,Y)\) between (the distributions of) two random vectors X and Y in \(\mathbb {R}^d\) is defined as \(\inf _{(X',Y')} \Vert X'-Y'\Vert _2\), where the infimum is taken over all couplings of X and Y, that is, all random vectors \((X',Y')\) in \(\mathbb {R}^{2d}\) such that \(X'\) has the same distribution as X and \(Y'\) has the same distribution as Y.
Our first result is an extension of Szarek’s inequality (1) which reads as follows.
Theorem 1
There is a positive universal constant \(\delta _0\) such that if \(X_1, X_2, \dots \) are i.i.d. symmetric random variables satisfying
\[
\big\Vert |X_1| - 1 \big\Vert_2 = \Big(\mathbb{E}\big(|X_1|-1\big)^2\Big)^{1/2} \le \delta_0, \qquad\qquad (3)
\]
then for every \(n \ge 3\) and every unit vector a in \(\mathbb {R}^n\) with \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\), we have
\[
\mathbb{E}\Big|\sum_{j=1}^n a_j X_j\Big| \ge \mathbb{E}\Big|\frac{X_1+X_2}{\sqrt{2}}\Big|. \qquad\qquad (4)
\]
Moreover, we can take \(\delta _0 = 10^{-4}\).
Note that the left-hand side of (3) is nothing but the \(\textsf{W}_2\)-Wasserstein distance between the distribution of \(X_1\) and the Rademacher distribution, since \(|x\pm 1| \ge \big ||x|-1\big |\) for \(x \in \mathbb {R}\), and thus the optimal coupling of the two distributions is \(\big (X_1,\textrm{sign}(X_1)\big )\).
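In more detail (our elaboration): for any coupling \((X', Y')\) of \(X_1\) with a Rademacher variable, \(|X'-Y'| = |X' \mp 1| \ge \big ||X'|-1\big |\), whence
\[
\textsf{W}_2(X_1,\varepsilon_1)^2 \ge \mathbb{E}\big(|X_1|-1\big)^2 = \mathbb{E}\big(X_1 - \textrm{sign}(X_1)\big)^2,
\]
and the coupling \(\big (X_1, \textrm{sign}(X_1)\big )\) attains this lower bound.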
Our second main result provides an analogous extension for Ball’s inequality (2).
Theorem 2
Let \(X_1, X_2, \dots \) be i.i.d. symmetric random vectors in \(\mathbb {R}^3\). Suppose their common characteristic function \(\phi (t) = \mathbb {E}e^{i\!\left\langle t, X_1 \right\rangle \!}\) satisfies
\[
|\phi(t)| \le \frac{C_0}{|t|}, \qquad t \in \mathbb{R}^3{\setminus }\{0\}, \qquad\qquad (5)
\]
for some constant \(C_0 > 0\). Assume that
\[
\textsf{W}_2(X_1,\xi) \le 10^{-38}\,C_1^{-9}\,\min\big\{\big(\mathbb{E}|X_1|^{3}\big)^{-6},\,1\big\}, \qquad\qquad (6)
\]
where \(C_1 = \max \{C_0,1\}\) and \(\xi \) is a random vector uniform on the unit Euclidean sphere \(\mathbb {S}^2\) in \(\mathbb {R}^3\). Then for every \(n \ge 3\) and every unit vector a in \(\mathbb {R}^n\) with \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\), we have
\[
\mathbb{E}\Big|\sum_{j=1}^n a_j X_j\Big|^{-1} \le \mathbb{E}\Big|\frac{X_1+X_2}{\sqrt{2}}\Big|^{-1}. \qquad\qquad (7)
\]
Plainly, if we know that \(X_1\) and \(\xi \) are sufficiently close in \(\textsf{W}_3\), then the parameter \(\mathbb {E}|X_1|^{3}\) in (6) is redundant. In contrast to Theorem 1, here the closeness assumption (6) is put in terms of two parameters of the distribution: its third moment and the polynomial decay of its characteristic function. It is not clear whether this is essential. At the technical level of our proofs, the third moment is needed to carry out a certain Gaussian approximation, whilst the decay assumption has to do with an a priori lack of integrability in the Fourier-analytic representation of the \(L_{-1}\) norm (as opposed to the \(L_1\)-norm handled in Theorem 1).
On the other hand, neither of these is very restrictive. In particular, if \(X_1\) has a density f on \(\mathbb {R}^3\) vanishing at \(\infty \) whose gradient is integrable, then
\[
|t|\,|\phi(t)| \le \sqrt{3}\,\max_{j\le 3}\big|t_j\,\phi(t)\big| = \sqrt{3}\,\max_{j\le 3}\Big|\int_{\mathbb{R}^3}\partial_j f(x)\,e^{i\!\left\langle t, x\right\rangle\!}\,\textrm{d}x\Big| \le \sqrt{3}\int_{\mathbb{R}^3}|\nabla f|,
\]
so (5) holds with \(C_0 = \sqrt{3}\int _{\mathbb {R}^3}|\nabla f|\).
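For instance (an illustration we add, with parameters of our choosing): if \(X_1\) is a centered Gaussian vector with density \(f(x) = (2\pi \sigma ^2)^{-3/2}e^{-|x|^2/(2\sigma ^2)}\), then \(|\nabla f| = \frac{|x|}{\sigma ^2}f\), so
\[
\int_{\mathbb{R}^3}|\nabla f| = \frac{1}{\sigma^2}\,\mathbb{E}|X_1| = \frac{1}{\sigma^2}\cdot \sigma\,\frac{2\sqrt{2}}{\sqrt{\pi}} = \frac{2\sqrt{2}}{\sqrt{\pi}\,\sigma},
\]
and (5) holds with \(C_0 = \frac{2\sqrt{6}}{\sqrt{\pi }\,\sigma }\).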
Another natural sufficient condition is the rotational invariance of \(X_1\): if, say, \(X_1\) has the same distribution as \(R\xi \), for a nonnegative random variable R and an independent random vector \(\xi \) uniform on the unit sphere \(\mathbb {S}^2\), then Archimedes’ Hat-Box theorem implies that \(\left\langle t, R\xi \right\rangle \!\), conditioned on the value of R, is uniform on \([-R|t|,R|t|]\) and thus
\[
\phi(t) = \mathbb{E}\,\frac{\sin(R|t|)}{R|t|}, \qquad\text{so that } |\phi(t)| \le \frac{\mathbb{E}R^{-1}}{|t|}
\]
and (5) holds with \(C_0 = \mathbb{E}R^{-1}\), provided this expectation is finite.
Moreover, in this case \(\textsf{W}_2(X_1,\xi ) = \Vert R-1\Vert _2\) (since for all unit vectors \(\theta , \theta '\) in \(\mathbb {R}^d\) and \(R \ge 0\), we have \(|R\theta -\theta '| \ge |R-1|\), as is easily seen by squaring). Probabilistically, this is an important special case as it yields results for symmetric unimodal distributions on \(\mathbb {R}\). Indeed, if X is of the form \(R\xi \) as above, for \(q > -1\), we have the identity
\[
\mathbb{E}\Big|\sum_{j=1}^n a_j X_j\Big|^{q} = (1+q)\,\mathbb{E}\Big|\sum_{j=1}^n a_j R_j U_j\Big|^{q},
\]
where the \(R_j\) are i.i.d. copies of R and the \(U_j\) are i.i.d. uniform random variables on \([-1,1]\), independent of the \(R_j\) (see Proposition 4 in [20]). The \(R_jU_j\) showing up in this formula can have any symmetric unimodal distribution, uniquely defined by the distribution of \(R_j\). Thus, if \(V_1, V_2, \dots \) are i.i.d. symmetric unimodal random variables, Theorem 2 then immediately yields a sharp upper bound on \(\lim _{q\downarrow -1}(1+q)\mathbb {E}\left| \sum _{j=1}^n a_jV_j\right| ^{q}\) for all unit vectors a with \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\) (cf. [5, 6, 11, 25]).
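A sketch of the identity (our addition): fix a unit vector \(e \in \mathbb{S}^2\). The sum \(\sum _j a_jX_j\) is rotationally invariant and, by Archimedes’ theorem, \(\left\langle X_j, e\right\rangle \) has the same distribution as \(R_jU_j\), so
\[
\mathbb{E}\Big|\sum_{j=1}^n a_jR_jU_j\Big|^{q} = \mathbb{E}\Big|\Big\langle \sum_{j=1}^n a_jX_j,\, e\Big\rangle\Big|^{q} = \mathbb{E}\Big|\sum_{j=1}^n a_jX_j\Big|^{q}\,\mathbb{E}|U_1|^{q} = \frac{1}{1+q}\,\mathbb{E}\Big|\sum_{j=1}^n a_jX_j\Big|^{q},
\]
using that for a rotationally invariant vector Y, \(\left\langle Y, e\right\rangle \) has the same distribution as \(|Y|U\) with U uniform on \([-1,1]\) independent of \(|Y|\), and \(\mathbb{E}|U|^{q} = \frac{1}{1+q}\) for \(q > -1\).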
A result in the same vein as Theorem 2 is König and Koldobsky’s extension [20] of Ball’s cube slicing inequality to product measures with densities satisfying certain regularity and moment assumptions. Their result also applies specifically to vectors of weights satisfying the small coefficient condition \(\Vert a\Vert _\infty \le \tfrac{1}{\sqrt{2}}\).
Approached differently, full extensions of (1) and (2) (i.e. without the small coefficient restriction on a) have been obtained in our recent work [12] for a very special family of distributions corresponding geometrically to extremal sections and projections of \(\ell _p\)-balls.
3 Proof of Theorem 1
Our approach builds on Haagerup’s slick Fourier-analytic proof from [14]. We let
\[
\phi(t) = \mathbb{E}\,e^{itX_1} = \mathbb{E}\cos(tX_1), \qquad t \in \mathbb{R},
\]
be the characteristic function of \(X_1\). Using the elementary Fourier-integral representation
\[
|x| = \frac{2}{\pi}\int_0^\infty \frac{1-\cos(tx)}{t^2}\,\textrm{d}t, \qquad x \in \mathbb{R},
\]
as well as the symmetry and independence of the \(X_j\), we have
\[
\mathbb{E}\Big|\sum_{j=1}^n a_jX_j\Big| = \frac{2}{\pi}\int_0^\infty \Big(1 - \prod_{j=1}^n \phi(a_jt)\Big)\frac{\textrm{d}t}{t^2} \qquad\qquad (10)
\]
(see also Lemma 1.2 in [14]). If a is a unit vector in \(\mathbb {R}^n\) with nonzero components, using the AM-GM inequality, we obtain Haagerup’s lower bound
\[
\mathbb{E}\Big|\sum_{j=1}^n a_jX_j\Big| \ge \sum_{j=1}^n a_j^2\,\Psi\big(a_j^{-2}\big), \qquad\qquad (11)
\]
where
\[
\Psi(s) = \frac{2}{\pi}\int_0^\infty \Big(1 - \big|\phi\big(t/\sqrt{s}\big)\big|^{s}\Big)\frac{\textrm{d}t}{t^2}, \qquad s > 0, \qquad\qquad (12)
\]
(see Lemma 1.3 in [14]). The crucial lemma reads as follows.
Lemma 3
Under the assumptions of Theorem 1, we have \(\Psi (s) \ge \Psi (2)\) for every \(s\ge 2\).
If we take the lemma for granted, the proof of Theorem 1 is finished because the small coefficient assumption \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\) gives \(\Psi (a_j^{-2}) \ge \Psi (2)\) for each j, and as a result we get
\[
\mathbb{E}\Big|\sum_{j=1}^n a_jX_j\Big| \ge \sum_{j=1}^n a_j^2\,\Psi(a_j^{-2}) \ge \Psi(2)\sum_{j=1}^n a_j^2 = \Psi(2) = \mathbb{E}\Big|\frac{X_1+X_2}{\sqrt{2}}\Big|,
\]
where the last equality is justified by (10).
It remains to prove Lemma 3. To this end, we recall that if the \(X_j\) were Rademacher random variables, then the special function \(\Psi \) becomes
\[
\Psi_0(s) = \frac{2}{\pi}\int_0^\infty \Big(1 - \big|\cos\big(t/\sqrt{s}\big)\big|^{s}\Big)\frac{\textrm{d}t}{t^2}. \qquad\qquad (13)
\]
Haagerup showed that for every \(s>0\),
\[
\Psi_0(s) = \frac{2}{\sqrt{\pi s}}\,\frac{\Gamma\big(\frac{s+1}{2}\big)}{\Gamma\big(\frac{s}{2}\big)} \qquad\qquad (14)
\]
and concluded by the product representation that \(\Psi _0\) is strictly increasing. In particular, Lemma 3 holds in the Rademacher case due to monotonicity. The rest of the proof builds exactly on this observation: we show that the closeness of distributions guarantees that \(\Psi \) and \(\Psi _0\) are close for, say, \(s \ge 3\), and that their derivatives are close for \(2 \le s \le 3\). Crucially, not only do we know that \(\Psi _0\) is strictly monotone, but we can also get a good bound on its derivative near the endpoint \(s=2\), which we record now for future use.
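For the reader’s convenience, here is one way to see the monotonicity from (14) (a standard computation which we add): taking the logarithmic derivative and using the series \(\psi (\frac{s+1}{2}) - \psi (\frac{s}{2}) = 2\sum _{k\ge 0}\frac{1}{(s+2k)(s+2k+1)}\) for the digamma function \(\psi \),
\[
\frac{\Psi_0'(s)}{\Psi_0(s)} = \frac{1}{2}\,\psi\Big(\frac{s+1}{2}\Big) - \frac{1}{2}\,\psi\Big(\frac{s}{2}\Big) - \frac{1}{2s} = \sum_{k=0}^\infty \frac{1}{(s+2k)(s+2k+1)} - \frac{1}{2s} > 0,
\]
because \(\frac{1}{(s+2k)(s+2k+1)} > \frac{1}{(s+2k)(s+2k+2)} = \frac{1}{2}\big(\frac{1}{s+2k} - \frac{1}{s+2k+2}\big)\) and the latter sum telescopes to \(\frac{1}{2s}\).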
Lemma 4
We have
\[
\Psi_0'(s) \ge \frac{1}{60} \qquad\text{for every } 2 \le s \le 3.
\]
Proof
Differentiating Haagerup’s product expression (14) term-by-term yields
\(\square \)
The rest of this section is devoted to the proof of Lemma 3. We break it into several parts.
3.1 A uniform bound on the characteristic function
Lemma 5
Let X be a symmetric random variable satisfying (3). Then its characteristic function \(\phi (t) = \mathbb {E}e^{itX}\) satisfies
\[
\big|\phi(t) - \cos t\big| \le \frac{\delta_0(\delta_0+2)}{2}\,t^2 \qquad\text{for every } t \in \mathbb{R}.
\]
Proof
By symmetry, the triangle inequality and the bound \(|\sin u| \le |u|\), we get
\[
\big|\phi(t) - \cos t\big| = \big|\mathbb{E}\big(\cos(t|X|) - \cos t\big)\big| = 2\,\Big|\mathbb{E}\,\sin\frac{t(|X|-1)}{2}\,\sin\frac{t(|X|+1)}{2}\Big| \le \frac{t^2}{2}\,\mathbb{E}\,\big||X|-1\big|\,\big||X|+1\big| \le \frac{t^2}{2}\,\big\Vert |X|-1\big\Vert_2\,\big\Vert |X|+1\big\Vert_2,
\]
using the Cauchy–Schwarz inequality in the last estimate. Moreover,
\[
\big\Vert |X|+1\big\Vert_2 \le \big\Vert |X|-1\big\Vert_2 + 2 \le \delta_0 + 2.
\]
Plugging in the assumption \(\big \Vert |X|-1\big \Vert _2 \le \delta _0\) completes the proof. \(\square \)
3.2 Uniform bounds on the special function and its derivative
Lemma 6
Assuming (3) and the symmetry of \(X_1\), the functions \(\Psi \) and \(\Psi _0\) defined in (12) and (13) respectively satisfy
\[
\big|\Psi(s) - \Psi_0(s)\big| \le \frac{2}{\pi}\sqrt{2\delta_0(\delta_0+2)} \qquad\text{for every } s \ge 1.
\]
Proof
Fix \(T>0\). Breaking the integral defining \(\Psi \) into \(\int _0^T + \int _T^\infty \) and using that \(|a-b| \le 1\) for \(a, b \in [0,1]\), we obtain
\[
\big|\Psi(s) - \Psi_0(s)\big| \le \frac{2}{\pi}\int_0^T \Big|\big|\phi(t/\sqrt{s})\big|^{s} - \big|\cos(t/\sqrt{s})\big|^{s}\Big|\,\frac{\textrm{d}t}{t^2} + \frac{2}{\pi T}.
\]
We also have \(\big ||a|^s - |b|^s\big | \le s|a-b|\) for \(a, b \in [-1,1]\), \(s \ge 1\), thus Lemma 5 yields
\[
\frac{2}{\pi}\int_0^T \Big|\big|\phi(t/\sqrt{s})\big|^{s} - \big|\cos(t/\sqrt{s})\big|^{s}\Big|\,\frac{\textrm{d}t}{t^2} \le \frac{2}{\pi}\int_0^T s\cdot\frac{\delta_0(\delta_0+2)}{2}\cdot\frac{t^2}{s}\cdot\frac{\textrm{d}t}{t^2} = \frac{\delta_0(\delta_0+2)}{\pi}\,T.
\]
Optimizing over the parameter T gives the desired bound. \(\square \)
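Explicitly (our elaboration), the resulting bound is
\[
\big|\Psi(s)-\Psi_0(s)\big| \le \min_{T>0}\,\frac{2}{\pi}\Big(\frac{1}{T} + \frac{\delta_0(\delta_0+2)}{2}\,T\Big) = \frac{2}{\pi}\sqrt{2\delta_0(\delta_0+2)},
\]
the minimum being attained at \(T = \sqrt{2/(\delta_0(\delta_0+2))}\).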
Lemma 7
For \(s \ge 2\) and \(0< u, v < 1\), we have
\[
\big|u^s\log u - v^s\log v\big| \le |u-v|.
\]
Proof
Let \(f(x)=x^s \log x\). It suffices to prove that on (0, 1) we have \(|f'(x)| \le 1\), which is equivalent to \(|\alpha t \log t + t| \le 1\) with \(t = x^{s-1} \in (0,1)\) and \(\alpha = \frac{s}{s-1} \in [1,2]\). To prove this observe that for \(t\in (0,1)\) we have \(\alpha t \log t + t \le t \le 1\) and
\[
\alpha t\log t + t \ge \alpha t\log t \ge -\frac{\alpha}{e} \ge -\frac{2}{e} > -1.
\]
\(\square \)
Lemma 8
Assuming (3) and the symmetry of \(X_1\), the functions \(\Psi \) and \(\Psi _0\) defined in (12) and (13) satisfy
\[
\big|\Psi'(s) - \Psi_0'(s)\big| \le 0.612\,\sqrt{\delta_0(\delta_0+2)} \qquad\text{for every } s \ge 2.
\]
Proof
Changing the variables and differentiating gives
\[
\Psi(s) = \frac{2}{\pi\sqrt{s}}\int_0^\infty \big(1 - |\phi(u)|^{s}\big)\frac{\textrm{d}u}{u^2}, \qquad \Psi'(s) = -\frac{1}{2s}\,\Psi(s) - \frac{2}{\pi\sqrt{s}}\int_0^\infty |\phi(u)|^{s}\log|\phi(u)|\,\frac{\textrm{d}u}{u^2}.
\]
Thus,
\[
\big|\Psi'(s) - \Psi_0'(s)\big| \le \frac{1}{2s}\,\big|\Psi(s)-\Psi_0(s)\big| + \frac{2}{\pi\sqrt{s}}\int_0^\infty \Big| |\phi(u)|^{s}\log|\phi(u)| - |\cos u|^{s}\log|\cos u| \Big|\,\frac{\textrm{d}u}{u^2}.
\]
To estimate the integral, we proceed along the same lines as in the proof of Lemma 6. We fix \(T > 0\), write \(\int _0^\infty = \int _0^T + \int _T^\infty \) and for the second integral use \(|u^s\log u| = \frac{1}{s}|u^s\log (u^s)| \le \frac{1}{es}\), \(0< u < 1\), to get a bound on it by \(\frac{2}{esT}\), whilst for the first integral, using first Lemma 7 and then Lemma 5, we obtain
\[
\int_0^T \Big| |\phi(u)|^{s}\log|\phi(u)| - |\cos u|^{s}\log|\cos u| \Big|\,\frac{\textrm{d}u}{u^2} \le \int_0^T \big|\phi(u) - \cos u\big|\,\frac{\textrm{d}u}{u^2} \le \frac{\delta_0(\delta_0+2)}{2}\,T.
\]
Altogether, with the aid of Lemma 6,
\[
\big|\Psi'(s)-\Psi_0'(s)\big| \le \frac{1}{2s}\cdot\frac{2}{\pi}\sqrt{2\delta_0(\delta_0+2)} + \frac{2}{\pi\sqrt{s}}\Big(\frac{\delta_0(\delta_0+2)}{2}\,T + \frac{2}{esT}\Big).
\]
Minimising the second term over \(T > 0\) leads to the bound by
\[
\frac{\sqrt{2}}{\pi s}\sqrt{\delta_0(\delta_0+2)} + \frac{4}{\pi s\sqrt{e}}\sqrt{\delta_0(\delta_0+2)} = \frac{1}{\pi s}\Big(\sqrt{2} + \frac{4}{\sqrt{e}}\Big)\sqrt{\delta_0(\delta_0+2)}.
\]
For \(s \ge 2\), we have \(\frac{1}{\pi s}\left( \sqrt{2}+\frac{4}{\sqrt{e}}\right) \le \frac{1}{2\pi }\left( \sqrt{2}+\frac{4}{\sqrt{e}}\right) = 0.611\ldots < 0.612\) and this completes the proof. \(\square \)
3.3 Proof of Lemma 3
First we assume that \(s \ge 3\). Using Lemma 6 and letting \(\eta = \frac{2}{\pi }\sqrt{2\delta _0(\delta _0+2)}\) for brevity, we get
\[
\Psi(s) \ge \Psi_0(s) - \eta.
\]
Since \(\Psi _0\) is increasing, \(\Psi _0(s) \ge \Psi _0(3) = \Psi _0(3) - \Psi _0(2) + \Psi _0(2)\) and \(\Psi _0(2) \ge \Psi (2) -\eta \), again using Lemma 6. Therefore,
\[
\Psi(s) \ge \Psi(2) + \big(\Psi_0(3) - \Psi_0(2)\big) - 2\eta.
\]
It is now clear that as long as \(\delta _0\) is sufficiently small, namely \(2\eta \le \Psi _0(3) - \Psi _0(2)\), we get \(\Psi (s) \ge \Psi (2)\), as desired. It can be checked that \(\Psi _0(3) - \Psi _0(2) = \frac{4}{\pi \sqrt{3}} - \frac{1}{\sqrt{2}} = 0.027\ldots \) and a choice of \(\delta _0 \le 10^{-4}\) suffices for the estimate \(\Psi (s)\ge \Psi (2)\) to hold for \(s\ge 3\).
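Indeed (a quick numerical check we include): with \(\delta _0 = 10^{-4}\),
\[
2\eta = \frac{4}{\pi}\sqrt{2\cdot 10^{-4}\,(2+10^{-4})} = 0.0254\ldots < 0.0279\ldots = \frac{4}{\pi\sqrt{3}} - \frac{1}{\sqrt{2}}.
\]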
Now we assume that \(2< s < 3\). We have
\[
\Psi(s) - \Psi(2) = (s-2)\,\Psi'(\theta)
\]
for some \(2< \theta < s\). Using Lemmas 4 and 8, we get
\[
\Psi'(\theta) \ge \Psi_0'(\theta) - 0.612\,\sqrt{\delta_0(\delta_0+2)} \ge \frac{1}{60} - 0.612\,\sqrt{\delta_0(\delta_0+2)},
\]
which is positive for all \(\delta _0 \le 3.7\cdot 10^{-4}\). Thus, \(\Psi (s) \ge \Psi (2)\) holds in both cases.\(\square \)
4 Proof of Theorem 2
The approach is the same as for Theorem 1; however, certain technical details are substantially more involved. We begin with a Fourier-analytic representation for negative moments due to Gorin and Favorov [13].
Lemma 9
(Lemma 3 in [13]) For a random vector X in \(\mathbb {R}^d\) and \(-d< q < 0\), we have
\[
\mathbb{E}|X|^{q} = \beta_{q,d}\int_{\mathbb{R}^d} \mathbb{E}\,e^{i\!\left\langle t, X \right\rangle\!}\;|t|^{-d-q}\,\textrm{d}t,
\]
where \(\beta _{q,d} = 2^{q}\pi ^{-d/2}\frac{\Gamma \left( (d+q)/2\right) }{\Gamma (-q/2)}\), provided that the integral on the right hand side exists.
Specialised to \(d = 3\), \(q = -1\) (\(\beta _{-1,3} = \frac{1}{2\pi ^2}\)) and \(X = \sum _{j=1}^n a_jX_j\) with \(X_1,\ldots ,X_n\) independent random vectors, we obtain
\[
\mathbb{E}\Big|\sum_{j=1}^n a_jX_j\Big|^{-1} = \frac{1}{2\pi^2}\int_{\mathbb{R}^3} \prod_{j=1}^n \phi(a_jt)\,\frac{\textrm{d}t}{|t|^2}.
\]
Note that thanks to the decay assumption (5), the integral on the right hand side converges as long as \(n \ge 2\) (assuming the \(a_j\) are nonzero). As in Ball’s proof from [3], Hölder’s inequality yields
\[
\mathbb{E}\Big|\sum_{j=1}^n a_jX_j\Big|^{-1} \le \prod_{j=1}^n \Phi\big(a_j^{-2}\big)^{a_j^2},
\]
where
\[
\Phi(s) = \frac{1}{2\pi^2}\int_{\mathbb{R}^3} \big|\phi\big(t/\sqrt{s}\big)\big|^{s}\,\frac{\textrm{d}t}{|t|^2}, \qquad\qquad (22)
\]
with
\[
\phi(t) = \mathbb{E}\,e^{i\!\left\langle t, X_1 \right\rangle\!} \qquad\qquad (23)
\]
denoting the characteristic function of \(X_1\). Exactly as in the proof of Theorem 1, the following pivotal lemma allows us to finish the proof.
Lemma 10
Under the assumptions of Theorem 2, we have \(\Phi (s) \le \Phi (2)\) for every \(s\ge 2\).
If the \(X_j\) are uniform on the unit sphere \(\mathbb {S}^2\) in \(\mathbb {R}^3\), we have \(\phi (t) = \frac{\sin |t|}{|t|}\) (because \(\left\langle t, X_1 \right\rangle \!\) is uniform on \([-|t|,|t|]\)), in which case the special function \(\Phi \) defined in (22) becomes
\[
\Phi_0(s) = \frac{2\sqrt{s}}{\pi}\int_0^\infty \Big|\frac{\sin t}{t}\Big|^{s}\,\textrm{d}t \qquad\qquad (24)
\]
(after integrating in polar coordinates). Ball’s celebrated integral inequality states that \(\Phi _0(s) \le \Phi _0(2)\) for all \(s \ge 2\) (see Lemma 3 in [3], as well as [28, 31] for different proofs). Our proof of Lemma 10 relies on this, on additional bounds on the derivative \(\Phi _0'(s)\) near \(s=2\), as well as, crucially, on bounds quantifying how close \(\Phi \) is to \(\Phi _0\). In the following subsections we gather such results and then conclude with the proof of Lemma 10.
4.1 A uniform bound on the characteristic function
Throughout these sections \(\xi \) always denotes a random vector uniform on the unit sphere \(\mathbb {S}^2\) in \(\mathbb {R}^3\).
Lemma 11
Let X be a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X,\xi )\). Then, its characteristic function \(\phi (t) = \mathbb {E}e^{i\!\left\langle t, X \right\rangle \!}\) satisfies
\[
\Big|\phi(t) - \frac{\sin|t|}{|t|}\Big| \le \frac{\delta(\delta+2)}{2}\,|t|^2 \qquad\text{for every } t \in \mathbb{R}^3{\setminus}\{0\}.
\]
Proof
Let \(\xi \) be uniform on \(\mathbb {S}^2\) such that for the joint distribution of \((X, \xi )\), we have \(\Vert X-\xi \Vert _2 = \textsf{W}_2(X,\xi ) = \delta \). By symmetry, the bound \(|\sin u| \le |u|\) and the Cauchy–Schwarz inequality (used twice), we get
\[
\Big|\phi(t) - \frac{\sin|t|}{|t|}\Big| = \Big|\mathbb{E}\big(\cos\left\langle t, X\right\rangle - \cos\left\langle t, \xi\right\rangle\big)\Big| = 2\,\Big|\mathbb{E}\,\sin\frac{\left\langle t, X-\xi\right\rangle}{2}\,\sin\frac{\left\langle t, X+\xi\right\rangle}{2}\Big| \le \frac{|t|^2}{2}\,\mathbb{E}\,|X-\xi|\,|X+\xi| \le \frac{|t|^2}{2}\,\Vert X-\xi\Vert_2\,\Vert X+\xi\Vert_2.
\]
To conclude we use the triangle inequality
\[
\Vert X+\xi\Vert_2 \le \Vert X-\xi\Vert_2 + 2\Vert\xi\Vert_2 = \delta + 2.
\]
\(\square \)
4.2 Bounds on the special function
We begin with a bound on the difference \(\Phi (s) - \Phi _0(s)\) obtained from the uniform bound on the characteristic functions (Lemma 11 above). In contrast to Lemma 6, the bound is not uniform in s. For s not too large (the bulk), we incur the factor \(s^{3/4}\). To fight it off for large values of s, we shall employ a Gaussian approximation. For that part to work, it is crucial that \(\Phi _0(2) - \Phi _0(\infty ) = \sqrt{2} - \sqrt{\frac{6}{\pi }} > 0\).
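For later orientation, the two values entering this difference can be computed directly (a computation we include): \(\Phi _0(2) = \frac{2\sqrt{2}}{\pi }\int _0^\infty \frac{\sin ^2 u}{u^2}\,\textrm{d}u = \sqrt{2}\), while, substituting \(t = v/\sqrt{s}\) in (24) and using the pointwise limit \(\big |\frac{\sin (v/\sqrt{s})}{v/\sqrt{s}}\big |^s \rightarrow e^{-v^2/6}\),
\[
\Phi_0(\infty) = \lim_{s\rightarrow\infty}\Phi_0(s) = \frac{2}{\pi}\int_0^\infty e^{-v^2/6}\,\textrm{d}v = \frac{2}{\pi}\cdot\frac{\sqrt{6\pi}}{2} = \sqrt{\frac{6}{\pi}}.
\]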
4.2.1 The bulk
Lemma 12
Let X be a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X,\xi )\) and characteristic function \(\phi \) satisfying (5) for some \(C_0 > 0\). Let \(\Phi \) and \(\Phi _0\) be defined through (22) and (24) respectively. For every \(s \ge 2\), we have
\[
\big|\Phi(s) - \Phi_0(s)\big| \le \frac{2^{3/4}\,4}{3\pi}\,\big(\delta(\delta+2)\big)^{1/4}\,\big((C_0^2+1)\,s\big)^{3/4}.
\]
Proof
Given the definitions, we have
\[
\big|\Phi(s) - \Phi_0(s)\big| \le \frac{1}{2\pi^2}\int_{\mathbb{R}^3}\Big|\,\big|\phi(t/\sqrt{s})\big|^{s} - \Big|\frac{\sin(|t|/\sqrt{s})}{|t|/\sqrt{s}}\Big|^{s}\,\Big|\,\frac{\textrm{d}t}{|t|^2}.
\]
We fix \(T>0\) and split the integration into two regions.
Small t. Using Lemma 11 and \(||a|^s-|b|^s| \le s|a-b|\) when \(|a|, |b| \le 1\), we obtain
\[
\frac{1}{2\pi^2}\int_{|t|\le T}\Big|\,\big|\phi(t/\sqrt{s})\big|^{s} - \Big|\frac{\sin(|t|/\sqrt{s})}{|t|/\sqrt{s}}\Big|^{s}\,\Big|\,\frac{\textrm{d}t}{|t|^2} \le \frac{1}{2\pi^2}\cdot\frac{\delta(\delta+2)}{2}\cdot\frac{4\pi T^3}{3} = \frac{\delta(\delta+2)}{3\pi}\,T^3.
\]
Large t. Since \(s \ge 2\), we have
\[
\frac{1}{2\pi^2}\int_{|t|>T}\Big|\,\big|\phi(t/\sqrt{s})\big|^{s} - \Big|\frac{\sin(|t|/\sqrt{s})}{|t|/\sqrt{s}}\Big|^{s}\,\Big|\,\frac{\textrm{d}t}{|t|^2} \le \frac{1}{2\pi^2}\int_{|t|>T}\Big(\big|\phi(t/\sqrt{s})\big|^{2} + \Big|\frac{\sin(|t|/\sqrt{s})}{|t|/\sqrt{s}}\Big|^{2}\Big)\frac{\textrm{d}t}{|t|^2}.
\]
By virtue of the decay assumption (5), this is at most
\[
\frac{(C_0^2+1)\,s}{2\pi^2}\int_{|t|>T}\frac{\textrm{d}t}{|t|^4} = \frac{2(C_0^2+1)\,s}{\pi T}.
\]
Adding up these two bounds and optimising over T yields
\[
\big|\Phi(s)-\Phi_0(s)\big| \le \min_{T>0}\Big(\frac{\delta(\delta+2)}{3\pi}\,T^3 + \frac{2(C_0^2+1)\,s}{\pi T}\Big) = \frac{2^{3/4}\,4}{3\pi}\,\big(\delta(\delta+2)\big)^{1/4}\big((C_0^2+1)\,s\big)^{3/4}.
\]
Plugging this back gives the assertion. \(\square \)
4.2.2 The Gaussian approximation
We now present a bound on \(\Phi (s)\) which does not grow as \(s\rightarrow \infty \) and which will allow us to prove Lemma 10 for s sufficiently large.
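The idea behind this step (our paraphrase of the strategy): since \(\mathbb{E}\left\langle \xi , t\right\rangle ^2 = \frac{|t|^2}{3}\), for X close to \(\xi \) one expects
\[
\Big|\phi\Big(\frac{t}{\sqrt{s}}\Big)\Big|^{s} \approx \Big(1 - \frac{\mathbb{E}\left\langle X, t\right\rangle^2}{2s}\Big)^{s} \xrightarrow[s\rightarrow\infty]{} e^{-\frac{1}{2}\mathbb{E}\left\langle X, t\right\rangle^2} \approx e^{-|t|^2/6},
\]
so that \(\Phi (s)\) should approach the Gaussian value \(\sqrt{6/\pi }\); the proof below makes this heuristic quantitative.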
Lemma 13
Let X be a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X,\xi )\) and characteristic function \(\phi \) satisfying (5) for some \(C_0 > 0\). Let \(\Phi \) be defined through (22). Assuming that \(\delta \le \min \{\frac{1}{\sqrt{3}}, (15C_0)^{-2}\}\), we have
with arbitrary \(0< \theta < \frac{(1-\delta \sqrt{3})^2}{3\mathbb {E}|X|^3}\).
Proof
We split the integral defining \(\Phi (s) = \frac{1}{2\pi ^2}\int _{\mathbb {R}^3} |\phi (s^{-1/2}t)|^s|t|^{-2} \textrm{d}t\) into several regions.
Large t. Using the decay condition (5), we get
Thus, for \(s \ge 2\),
as \(\frac{2 e\sqrt{s}}{\pi (s-1)} < \frac{4}{\sqrt{s}}\) for \(s \ge 2\).
Moderate t. This case is vacuous unless \(C_0 > \pi /e\). We use Lemma 11 to obtain
In this case, the condition \(\delta < (15C_0)^{-2}\) suffices to guarantee that \(\frac{1}{\pi }+\frac{\delta (\delta +2)}{2}\left( eC_0\right) ^2 < \frac{1}{e}\) (also using, say \(\delta +2 < 3\)). Then we get
Small t. For \(0< u < \pi \), we have
\[
\frac{\sin u}{u} \le e^{-u^2/6}. \qquad\qquad (28)
\]
Fix \(0< \theta < \pi \). Then, first using Lemma 11 and then (28), we obtain
Integrating using polar coordinates and invoking the standard tail bound
\[
\int_x^\infty e^{-u^2/2}\,\textrm{d}u \le \frac{1}{x}\,e^{-x^2/2}, \qquad x > 0,
\]
the last integral gets upper bounded by
Summarising, we have shown that
Very small t. Taylor-expanding \(\phi \) at 0 with the Lagrange remainder,
for some point \(\eta \) in the segment \([0,s^{-1/2}t]\). To bound the error term, we note that
thus
We also note that in the domain \(\{|t| \le \theta \sqrt{s}\}\), the leading term \(1 - \frac{1}{2}\mathbb {E}\!\left\langle X, s^{-1/2}t \right\rangle \!^2\) is nonnegative, provided that \(\frac{1}{2}\theta ^2\mathbb {E}|X|^2 \le 1\). Since \(\Vert X\Vert _2 \le \delta + 1\) under the assumption (6), it suffices that \(\theta < \frac{\sqrt{2}}{1+\delta }\). Assuming this, we thus get
Evoking (6), let \(\xi \) be uniform on \(\mathbb {S}^2\) such that \(\Vert X-\xi \Vert _2 \le \delta \) with respect to some coupling. Then, for a fixed vector v in \(\mathbb {R}^3\), we obtain the bound
Thus, provided that \(\delta < \frac{1}{\sqrt{3}}\), this yields
where we have set \(\alpha = (\frac{1}{\sqrt{3}}-\delta )^2 - \frac{1}{3}\theta \mathbb {E}|X|^3\) and assumed that \(\alpha \) is positive in the last equality (guaranteed by choosing \(\theta \) sufficiently small). Then we finally obtain
Putting these three bounds together gives the assertion. Note that we have imposed the conditions \(\delta < \frac{1}{\sqrt{3}}\) and \(\delta < (15C_0)^{-2}\) when \(C_0 > \frac{\pi }{e}\), as well as \(\theta < \pi \), \(\theta < \frac{\sqrt{2}}{1+\delta }\) and \(\theta < \frac{(1-\delta \sqrt{3})^2}{3\mathbb {E}|X|^3}\). Since \(\Vert X\Vert _3 \ge \Vert X\Vert _2 \ge 1 - \delta \) and \(\delta < \frac{1}{\sqrt{3}}\), we have \(\frac{(1-\delta \sqrt{3})^2}{3\mathbb {E}|X|^3}< \frac{(1-\delta \sqrt{3})^2}{3(1-\delta )^3} = \frac{1}{3(1-\delta )}\left( \frac{1-\delta \sqrt{3}}{1-\delta }\right) ^2< \frac{1}{3-\sqrt{3}} < 0.79\). Moreover, \(\frac{\sqrt{2}}{1+\delta }> \frac{\sqrt{2}}{1+1/\sqrt{3}} > 0.89\), so the condition \(\theta < \frac{(1-\delta \sqrt{3})^2}{3\mathbb {E}|X|^3}\) implies the other two conditions on \(\theta \). \(\square \)
4.3 Bounds on the derivative of the special function
Lemma 14
Let X be a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X,\xi )\) and characteristic function \(\phi \) satisfying (5) for some \(C_0 > 0\). Let \(\Phi \) and \(\Phi _0\) be defined through (22) and (24) respectively. For every \(s \ge 2\), we have
Proof
First we take the derivative,
For the resulting \(\Phi -\Phi _0\) term, we use Lemma 12. To bound the difference of the integrals resulting from the second term, we fix \(T>0\) and split the integration into two regions.
Small t. Using Lemmas 7 and 11, we obtain
Large t. Note that for \(s \ge 2\), and \(0< u < 1\) we have,
Thus,
which, after applying the decay condition (5), gets upper bounded by
Adding up these two bounds and optimising over T yields
Going back to the difference of the derivatives, we arrive at the desired bound using
\(\square \)
4.4 Bounds on Ball’s special function
We will need two estimates on \(\Phi _0\) defined in (24), that is
\[
\Phi_0(s) = \frac{2\sqrt{s}}{\pi}\int_0^\infty \Big|\frac{\sin t}{t}\Big|^{s}\,\textrm{d}t.
\]
First, we have a bound on the derivative near \(s=2\).
Lemma 15
For \(2 \le s \le 2.01\), we have \(\Phi _0'(s) \le -0.02\).
Second, on the complementary range, \(\Phi _0(s)\) is separated from its supremal value \(\Phi _0(2)\).
Lemma 16
For \(s \ge 2.01\), we have \(\Phi _0(s) \le \Phi _0(2) - 2\cdot 10^{-4}.\)
We begin with a numerical bound which will be used in the proofs of these assertions.
Lemma 17
We have
Proof
Using (28), we get
Moreover,
Therefore our integral is bounded above by
\(\square \)
We let
\[
I(s) = \int_0^\infty \Big|\frac{\sin t}{t}\Big|^{s}\,\textrm{d}t, \qquad\text{so that } \Phi_0(s) = \frac{2\sqrt{s}}{\pi}\,I(s).
\]
Proof of Lemma 15
First we observe that
\[
\Phi_0'(s) = \frac{1}{\pi\sqrt{s}}\,I(s) + \frac{2\sqrt{s}}{\pi}\,I'(s).
\]
Note that I is decreasing. We have,
since \(I(2)=\frac{\pi }{2}\). Moreover,
With the aid of Lemma 17, we therefore have
Thus, for \(2 \le s \le 2.01\), we have
where in the first inequality we used that the term in parentheses is negative. \(\square \)
For the proof of Lemma 16, we need several more estimates. First, we record a lower bound on the derivative of \(\Phi _0(s)\) for arbitrary s.
Lemma 18
For \(s \ge 2\), we have \(\Phi _0'(s) \ge -\frac{12\sqrt{s}}{\pi e}\).
Proof
We have
\[
\Phi_0'(s) = \frac{1}{\pi\sqrt{s}}\,I(s) + \frac{2\sqrt{s}}{\pi}\,I'(s) \ge \frac{2\sqrt{s}}{\pi}\,I'(s),
\]
so it is enough to upper bound \(|I'(s)|\). Note that
\(\square \)
Second, we obtain a quantitative drop-off of the values of \(\Phi _0\).
Lemma 19
Let \(a \in [1, \frac{\pi }{3}]\) and suppose that for some \(s_0 \ge 2\), we have \(\Phi _0(s_0)= \sqrt{\frac{2}{a}}\). Then
\[
\Phi_0(s) \le \sqrt{\frac{2}{a}} \qquad\text{for every } s \ge s_0.
\]
To prove this, we build on the argument of Nazarov and Podkorytov from [31]. For a somewhat similar bound, we refer to Proposition 7 in König and Koldobsky’s work [21] on maximal-perimeter sections of the cube. For convenience and completeness, we include all arguments in detail. We consider functions
\[
f_a(x) = e^{-\pi a x^2/2}, \qquad g(x) = \Big|\frac{\sin(\pi x)}{\pi x}\Big|, \qquad x > 0,
\]
and their distribution functions
\[
F_a(y) = \big|\{x>0:\ f_a(x) > y\}\big|, \qquad G(y) = \big|\{x>0:\ g(x) > y\}\big|, \qquad 0< y < 1.
\]
Lemma 20
For \(a \in [1,\frac{\pi }{3}]\) the function \(F_a-G\) has precisely one sign change point \(y_0\), at which it changes sign from “−” to “\(+\)”.
Proof
Note that \(F_a(y)=G(y)=0\) for \(y \ge 1\), so we only consider \(y \in (0,1)\). We have \(F_a(y)=\sqrt{\frac{2}{\pi a} \ln (\frac{1}{y})}\).
The function g(x) has zeros for \(x \in \mathbb {Z}\). For \(m\in \mathbb {N}\), let \(y_m = \max _{[m,m+1]} g\). We clearly have \(y_m< \frac{1}{\pi m}\) and \(y_m> g(m+\frac{1}{2}) = \frac{1}{\pi (m+\frac{1}{2})}\). Thus \(y_m \in (\frac{1}{\pi (m+\frac{1}{2})}, \frac{1}{\pi m})\), which shows that the sequence \(y_m\) is decreasing. We have the following claims.
Claim 1. The function \(F_a-G\) is positive on \((y_1,1)\).
Note that if \(g(x)>y_1\) then \(x \in (0,1)\). Moreover \(g(x) \le { f_a(x)}\) for \(x \in [0,1]\), since
\[
g(x) = \Big|\frac{\sin(\pi x)}{\pi x}\Big| \le e^{-\pi^2 x^2/6} \le e^{-\pi a x^2/2} = f_a(x), \qquad 0 \le x \le 1,
\]
by (28) and the assumption \(a \le \frac{\pi}{3}\).
Thus, for \(y\in (y_1,1)\), we have
\[
G(y) = \big|\{x \in (0,1):\ g(x) > y\}\big| < \big|\{x \in (0,1):\ f_a(x) > y\}\big| = F_a(y).
\]
Claim 2. The function \(F_a-G\) changes sign at least once in (0, 1).
Due to Claim 1 it is enough to show that \(F_a-G\) is sometimes negative. We have \(F_a-G \le F_1 -G\) and \(\int _0^\infty 2y(F_1(y)-G(y)) \textrm{d}y = \int (f_1^2 -g^2)=0\), so \(F_1-G\), being positive on \((y_1,1)\), must be negative somewhere, and hence so is \(F_a-G\).
Claim 3. The function \(F_a-G\) is increasing on \((0,y_1)\).
Clearly \(F_a'>F_1'\) and thus the claim follows from the fact that \(F_1-G\) is increasing on \((0,y_1)\), which was proved in [31] (Chapter I, Step 5). \(\square \)
Proof of Lemma 19
The assumption \(\Phi _0(s_0) = { \sqrt{\frac{2}{a}}}\) is equivalent to
\[
\int_0^\infty g(x)^{s_0}\,\textrm{d}x = \int_0^\infty f_a(x)^{s_0}\,\textrm{d}x.
\]
After changing variables and using Lemma 20, we get from the Nazarov–Podkorytov lemma (Chapter I, Step 4 in [31]) that for \(s \ge s_0\)
\[
\int_0^\infty g(x)^{s}\,\textrm{d}x \le \int_0^\infty f_a(x)^{s}\,\textrm{d}x, \qquad\text{that is}\qquad \Phi_0(s) = 2\sqrt{s}\int_0^\infty g(x)^{s}\,\textrm{d}x \le 2\sqrt{s}\int_0^\infty f_a(x)^{s}\,\textrm{d}x = \sqrt{\frac{2}{a}}.
\]
\(\square \)
Proof of Lemma 16
Take \(s_0=2.01\) and \(a=2\Phi _0(s_0)^{-2}\) in Lemma 19. Since \(\Phi _0(2) = \sqrt{2}\), Ball’s inequality gives that \(a \ge 1\). We need to check that \(a \le \frac{\pi }{3}\). From Lemma 18, we have that for \(s \in [2,2.01]\), \(\Phi _0'(s) \ge -\frac{12\sqrt{2.01}}{\pi e} > -2\). Thus, \(\Phi _0(s_0) \ge \Phi _0(2)-2(s_0-2) = \sqrt{2} - 0.02\). Therefore, \(a< 2\cdot (\sqrt{2}-0.02)^{-2}< 1.03 < \frac{\pi }{3}\), as needed. By Lemmas 19 and 15, we thus get that for \(s \ge s_0=2.01\),
\[
\Phi_0(s) \le \Phi_0(s_0) \le \Phi_0(2) - 0.02\,(s_0-2) = \Phi_0(2) - 2\cdot 10^{-4}.
\]
\(\square \)
4.5 Proof of Lemma 10
Recall that we assume X is a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X, \xi )\) and characteristic function \(\phi \) satisfying (5), that is \(|\phi (t)| \le C_0/|t|\), for all \(t \in \mathbb {R}^3{\setminus }\{0\}\). Let \(C_1 = \max \{C_0,1\}\). Our goal is to show that if (6) holds, that is
\[
\delta \le 10^{-38}\,C_1^{-9}\,\min\big\{\big(\mathbb{E}|X_1|^{3}\big)^{-6},\,1\big\},
\]
then \(\Phi (s) \le \Phi (2)\) for all \(s \ge 2\), where \(\Phi \) is defined in (22). For the sake of clarity, we shall be fairly lavish with choosing constants. Since \(C_1 \ge 1\), the above assumes in particular that \(\delta \le 10^{-38}\). With this in mind, we note the following consequences of Lemmas 12 and 14 respectively: for \(s \ge 2\),
\[
\big|\Phi(s) - \Phi_0(s)\big| \le \frac{3}{2}\,C_1^{3/2}\,\delta^{1/4}\,s^{3/4} \qquad\qquad (34)
\]
and similarly
We also remark that \(\Vert X\Vert _3 \ge \Vert X\Vert _2 \ge \Vert \xi \Vert _2 - \Vert X-\xi \Vert _2 = 1 - \delta \ge 1-10^{-38}\).
We break the argument into several regimes for the parameter s.
Large s. With hindsight, we set
In particular, \(s_0 \ge 10^5\). Using Lemma 13, that is
we will show that \(\Phi (s) \le \Phi (2)\) for all \(s \ge s_0\). We take \(\theta = \frac{1}{100\mathbb {E}|X|^3}\) which satisfies the conditions of the lemma and then, for the first term \(A_1\), we use
Thanks to (34), we also have
so it suffices to show that each of the second and third terms \(A_2\), \(A_3\) as well as this additional error \(A_4\) do not exceed \(\frac{1}{150}\). Using \(\delta < 10^{-38}C_1^{-9}\), we get
For the exponent in the second term \(A_2\), observe that
and, consequently,
Thus, using \(s \ge s_0 \ge 10^6(\mathbb {E}|X|^3)^2\), we get
Finally, for the third term, since \(s \ge s_0 \ge 10^5\),
therefore, since \(s \ge s_0 \ge 2\log C_1\),
Moderate s. We now assume that \(2.01 \le s \le s_0\). Using (34) twice and Lemma 16,
\[
\Phi(s) \le \Phi_0(s) + \frac{3}{2}\,C_1^{3/2}\delta^{1/4}s^{3/4} \le \Phi_0(2) - 2\cdot 10^{-4} + \frac{3}{2}\,C_1^{3/2}\delta^{1/4}s_0^{3/4} \le \Phi(2) - 2\cdot 10^{-4} + 3\,C_1^{3/2}\,\delta^{1/4}\,s_0^{3/4}.
\]
Inserting the bound on \(\delta \),
\[
3\,C_1^{3/2}\,\delta^{1/4}\,s_0^{3/4} \le 3\cdot 10^{-19/2}\,C_1^{-3/4}\,\min\big\{\big(\mathbb{E}|X|^{3}\big)^{-3/2},\,1\big\}\,s_0^{3/4}.
\]
If \(s_0 = 10^6(\mathbb {E}|X|^3)^2\), then using the \((\mathbb {E}|X|^3)^{-3/2}\) term in the minimum and \(C_1^{-3/4} \le 1\), we get the above bounded by \(3\cdot 10^{-19/2+9/2} = 3\cdot 10^{-5}\). If \(s_0 = 2\log C_1\), then using the other term in the minimum, we get the bound by \(3\cdot 2^{3/4}10^{-19/2} C_1^{-3/4} (\log C_1)^{3/4}< 3(2/e)^{3/4}10^{-19/2} < 10^{-4}\) since \(u^{-1}\log u \le e^{-1}\) for \(u > 1\). In either case, we get the conclusion \(\Phi (s) \le \Phi (2)\).
Small s. We finally assume that \(2 \le s \le 2.01\). To argue that \(\Phi (s) \le \Phi (2)\), we will show that \(\Phi '(s) < 0\). By virtue of (35) and Lemma 15,
Since \(\delta C_1^6 \le \delta C_1^9 \le 10^{-38}\), this is clearly negative and the proof is complete.\(\square \)
5 Concluding remarks
Remark 1
Assumption (3) seems natural: plainly, there are distributions which are not close to the Rademacher one, for which the unit vector attaining \(\inf \mathbb {E}|\sum a_jX_j|\) is different from \(a = (\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}}, 0, \dots , 0)\), for instance it is \(a = (1,0,\dots , 0)\) for Gaussian mixtures (see [1, 10]), or for the Rademacher distribution with a large atom at 0 (see Theorem 4 and Remark 14 in [16]).
Remark 2
Handling the complementary case \(\Vert a\Vert _\infty > \frac{1}{\sqrt{2}}\) which is not covered by Theorems 1 and 2 is a different story. The trivial convexity argument presented in the introduction works in fact only for the Rademacher case, as it requires \(\frac{1}{\sqrt{2}}\mathbb {E}|X_1| \ge \mathbb {E}\left| \frac{X_1+X_2}{\sqrt{2}}\right| \), and only for the \(L_1\)-norm (see Remark 21 in [6]). To circumvent this, several different approaches have been used: Haagerup’s ad hoc approximation (see §3 in [14]), Nazarov and Podkorytov’s induction with a strengthened hypothesis (see Ch. II, Step 5 in [31]) which has also been adapted to other distributions (see [5, 6, 8]), and very recently a different inductive scheme near the extremiser (without a strengthening) needed in a geometric context (see [12]). None of these techniques appears amenable to the broad setting of general distributions that is treated in this paper.
Remark 3
De, Diakonikolas and Servedio obtained in [9] a stable version of Szarek’s inequality (1) with respect to the unit vector a, namely
\[
\mathbb{E}\Big|\sum_{j=1}^n a_j\varepsilon_j\Big| \ge \frac{1}{\sqrt{2}} + \kappa\,\sqrt{\delta(a)}, \qquad\qquad (37)
\]
for a universal positive constant \(\kappa \), where the deficit is given by \(\delta (a) = |a - (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0, \dots , 0)|^2\), assuming that \(a_1 \ge a_2 \ge \dots \ge a_n \ge 0\). Note that in the setting of Theorem 1, taking \(\varepsilon _j = \textrm{sign}(X_j)\), we have
\[
\Big|\,\mathbb{E}\Big|\sum_{j=1}^n a_jX_j\Big| - \mathbb{E}\Big|\sum_{j=1}^n a_j\varepsilon_j\Big|\,\Big| \le \mathbb{E}\Big|\sum_{j=1}^n a_j\big(X_j - \varepsilon_j\big)\Big| \le \Big(\sum_{j=1}^n a_j^2\,\big\Vert |X_j|-1\big\Vert_2^2\Big)^{1/2} \le \delta_0,
\]
by a simple application of the triangle inequality and \(\Vert \cdot \Vert _1 \le \Vert \cdot \Vert _2\). Thus, applying this (twice) and the bound (37) of De Diakonikolas and Servedio, we conclude that Theorem 1 also holds for unit vectors a with \(\delta (a) \ge (2\delta _0/\kappa )^2\). The same will apply to Theorem 2 with the aid of Theorem 1.2 from [7], a strengthening of Ball’s inequality (2) (see also [27]). See [12] for numerical values of the constants \(\kappa \).
Remark 4
We have used the \(\textsf{W}_2\)-distance in Theorems 1 and 2 for concreteness and convenience. Of course, for every \(p \ge 1\), if we use the \(\textsf{W}_p\)-distance in (3) and assume that \(X_1\) is in \(L_{\frac{p}{p-1}}\), then the proofs of Lemmas 5 and 11 go through with the Cauchy–Schwarz inequality replaced by Hölder’s inequality and the rest of the proof remains unchanged. It might be of interest to examine weaker distances in such statements.
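For instance (our illustration): in the proof of Lemma 5, the Cauchy–Schwarz step would be replaced by
\[
\mathbb{E}\,\big||X|-1\big|\,\big||X|+1\big| \le \big\Vert |X|-1\big\Vert_p\,\big\Vert |X|+1\big\Vert_{\frac{p}{p-1}},
\]
which is finite precisely under the stated assumption that \(X_1 \in L_{\frac{p}{p-1}}\).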
Remark 5
Szarek’s sharp \(L_1 - L_2\) inequality (1) was extended to sharp \(L_p - L_2\) bounds for all \(p > 0\) by Haagerup in [14], using Fourier-integral representations of \(|x|^p\). It therefore seems plausible that our techniques allow to extend Theorem 1 to sharp bounds on \(L_p\) norms, but additional (nontrivial and technical) work is needed to treat the analogues of the special function \(\Psi _0\), (13), relevant to Haagerup’s \(L_p\) bounds. Similarly, the main result from [6] which extends (2) to sharp \(L_p - L_2\) bounds for all \(-1< p < 0\) could be a starting point for extensions of Theorem 2 to \(L_p\) norms with \(-1< p < 0\).
Data availability
This manuscript has no associated data.
References
Averkamp, R., Houdré, C.: Wavelet thresholding for non-necessarily Gaussian noise: idealism. Ann. Stat. 31, 110–151 (2003)
Baernstein, A., II., Culverhouse, R.: Majorization of sequences, sharp vector Khinchin inequalities, and bisubharmonic functions. Studia Math. 152(3), 231–248 (2002)
Ball, K.: Cube slicing in \({ R}^n\). Proc. Am. Math. Soc. 97(3), 465–473 (1986)
Barthe, F., Naor, A.: Hyperplane projections of the unit ball of \(\ell _p^n\). Discrete Comput. Geom. 27(2), 215–226 (2002)
Chasapis, G., Gurushankar, K., Tkocz, T.: Sharp bounds on \(p\)-norms for sums of independent uniform random variables, \(0<p<1\). J. Anal. Math. 149(2), 529–553 (2023)
Chasapis, G., König, H., Tkocz, T.: From Ball’s cube slicing inequality to Khinchin-type inequalities for negative moments. J. Funct. Anal. 281(9), 109185 (2021)
Chasapis, G., Nayar, P., Tkocz, T.: Slicing \(\ell _p\)-balls reloaded: stability, planar sections in \(\ell _1\). Ann. Probab. 50(6), 2344–2372 (2022)
Chasapis, G., Singh, S., Tkocz, T.: Haagerup’s phase transition at polydisc slicing. Anal. PDE (2022). arXiv:2206.01026 (preprint, to appear)
De, A., Diakonikolas, I., Servedio, R.A.: A robust Khintchine inequality, and algorithms for computing optimal constants in Fourier analysis and high-dimensional geometry. SIAM J. Discrete Math. 30(2), 1058–1094 (2016)
Eskenazis, A., Nayar, P., Tkocz, T.: Gaussian mixtures: entropy and geometric inequalities. Ann. Probab. 46(5), 2908–2945 (2018)
Eskenazis, A., Nayar, P., Tkocz, T.: Sharp comparison of moments and the log-concave moment problem. Adv. Math. 334, 389–416 (2018)
Eskenazis, A., Nayar, P., Tkocz, T.: Resilience of cube slicing in \(\ell _p\) (2022). arXiv:2211.01986 (preprint)
Gorin, E.A., Favorov, S.Yu.: Generalizations of the Khinchin inequality (Russian). Teor. Veroyatnost. i Primenen. 35(4), 762–767 (1990) [translation in Theory Probab. Appl. 35(4), 766–771 (1991)]
Haagerup, U.: The best constants in the Khintchine inequality. Studia Math. 70(3), 231–283 (1981)
Hall, R.R.: On a conjecture of Littlewood. Math. Proc. Camb. Philos. Soc. 78(3), 443–445 (1975)
Havrilla, A., Tkocz, T.: Sharp Khinchin-type inequalities for symmetric discrete uniform random variables. Isr. J. Math. 246(1), 281–297 (2021)
Havrilla, A., Nayar, P., Tkocz, T.: Khinchin-type inequalities via Hadamard’s factorisation. Int. Math. Res. Not. IMRN 3, 2429–2445 (2023)
Khintchine, A.: Über dyadische Brüche. Math. Z. 18(1), 109–116 (1923)
König, H.: On the best constants in the Khintchine inequality for Steinhaus variables. Isr. J. Math. 203(1), 23–57 (2014)
König, H., Koldobsky, A.: On the maximal measure of sections of the \(n\)-cube. Geometric analysis, mathematical relativity, and nonlinear partial differential equations, 123–155, Contemp. Math., vol. 599, Amer. Math. Soc., Providence (2013)
König, H., Koldobsky, A.: On the maximal perimeter of sections of the cube. Adv. Math. 346, 773–804 (2019)
König, H., Kwapień, S.: Best Khintchine type inequalities for sums of independent, rotationally invariant random vectors. Positivity 5(2), 115–152 (2001)
Kwapień, S., Latała, R., Oleszkiewicz, K.: Comparison of moments of sums of independent random variables and differential inequalities. J. Funct. Anal. 136(1), 258–268 (1996)
Latała, R., Oleszkiewicz, K.: On the best constant in the Khinchin–Kahane inequality. Studia Math. 109(1), 101–104 (1994)
Latała, R., Oleszkiewicz, K.: A note on sums of independent uniformly distributed random variables. Colloq. Math. 68(2), 197–206 (1995)
Littlewood, J.E.: On bounded bilinear forms in an infinite number of variables. Q. J. Math. Oxf. Ser. 1, 164–174 (1930)
Melbourne, J., Roberto, C.: Quantitative form of Ball’s cube slicing in \(\mathbb{R} ^n\) and equality cases in the min-entropy power inequality. Proc. Am. Math. Soc. 150(8), 3595–3611 (2022)
Melbourne, J., Roberto, C.: Transport-majorization to analytic and geometric inequalities. J. Funct. Anal. 284(1), Paper No. 109717 (2023)
Nayar, P., Oleszkiewicz, K.: Khinchine type inequalities with optimal constants via ultra log-concavity. Positivity 16(2), 359–371 (2012)
Nayar, P., Tkocz, T.: Extremal sections and projections of certain convex bodies: a survey. In: Koldobsky, A., Volberg, A. (eds.) Harmonic Analysis and Convexity, pp. 343–390. De Gruyter, Berlin, Boston (2023). https://www.degruyter.com/document/doi/10.1515/9783110775389-008/html
Nazarov, F.L., Podkorytov, A.N.: Ball, Haagerup, and distribution functions. Complex analysis, operators, and related topics, pp. 247–267, Oper. Theory Adv. Appl., vol. 113. Birkhäuser, Basel (2000)
Newman, C.M.: An extension of Khintchine’s inequality. Bull. Am. Math. Soc. 81(5), 913–915 (1975)
Oleszkiewicz, K.: Comparison of moments via Poincaré-type inequality. Advances in stochastic inequalities (Atlanta, GA, 1997), pp. 135–148, Contemp. Math., vol. 234. Amer. Math. Soc., Providence (1999)
Szarek, S.: On the best constant in the Khintchine inequality. Studia Math. 58, 197–208 (1976)
Tomaszewski, B.: A simple and elementary proof of the Kchintchine inequality with the best constant. Bull. Sci. Math. (2) 111(1), 103–109 (1987)
Acknowledgements
We should very much like to thank an anonymous referee for their careful reading of the manuscript and helpful suggestions, particularly the one leading to Remark 5.
Funding
Open Access funding provided by Carnegie Mellon University
Ethics declarations
Conflict of interest
The authors state that there is no conflict of interest.
This material is based upon work supported by the NSF grant DMS-1929284 while A.E. was in residence at ICERM for the Harmonic Analysis and Convexity program. P.N.’s research was supported by the National Science Centre, Poland, grant 2018/31/D/ST1/0135. T.T.’s research was supported by the NSF grant DMS-2246484.