## Abstract

We prove an extension of Szarek’s optimal Khinchin inequality (1976) for distributions close to the Rademacher one, when all the weights are uniformly bounded by a \(1/\sqrt{2}\) fraction of their total \(\ell _2\)-mass. We also show a similar extension of the probabilistic formulation of Ball’s cube slicing inequality (1986). These results establish the distributional stability of these optimal Khinchin-type inequalities. The underpinning to such estimates is the Fourier-analytic approach going back to Haagerup (1981).

## 1 Introduction

Let \(\varepsilon _1, \varepsilon _2, \dots \) be independent identically distributed (i.i.d.) Rademacher random variables, that is, symmetric random signs satisfying \(\mathbb {P}\left( \varepsilon _j = \pm 1 \right) = \frac{1}{2}\). Motivated by his study of bilinear forms on infinitely many variables, Littlewood conjectured in [26] (see also [15]) the following inequality: for every \(n \ge 1\) and every unit vector *a* in \(\mathbb {R}^n\), we have

\[ \mathbb{E}\left|\sum_{j=1}^n a_j\varepsilon_j\right| \ge \frac{1}{\sqrt{2}}, \tag{1} \]

which is clearly best possible. It was not until 46 years after it had been posed that this was proved by Szarek in [34]. His result was later generalised in a stunning way to the setting of vector-valued coefficients \(a_j\) in an arbitrary normed space by Latała and Oleszkiewicz in [24] (see also [30, Section 4.2] for a modern presentation of their proof using discrete Fourier analysis). Szarek’s original proof was based mainly on an intricate inductive scheme (see also [35]). Note that (1) holds trivially if \(\Vert a\Vert _\infty = \max _j |a_j| \ge \frac{1}{\sqrt{2}}\), for if, say, \(|a_1| \ge \frac{1}{\sqrt{2}}\), then thanks to independence and convexity,

\[ \mathbb{E}\left|\sum_{j=1}^n a_j\varepsilon_j\right| \ge \mathbb{E}\left|a_1\varepsilon_1 + \mathbb{E}\sum_{j=2}^n a_j\varepsilon_j\right| = |a_1| \ge \frac{1}{\sqrt{2}}. \]

Haagerup in his pioneering work [14] on Khinchin inequalities offered a very different approach to the nontrivial regime \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\), using classical Fourier-analytic integral representations along with tricky estimates for a special function.
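
Szarek's inequality (1), namely \(\mathbb{E}|\sum a_j\varepsilon_j| \ge \frac{1}{\sqrt{2}}\) for unit vectors *a*, can be sanity-checked in small dimension by enumerating all sign patterns. The following is only an illustrative sketch: the helper names `rademacher_l1` and `random_unit_vector`, the dimension 6 and the sample size are our own choices and not part of the paper.

```python
import itertools
import math
import random


def rademacher_l1(a):
    """Exact E|sum_j a_j eps_j|, obtained by enumerating all 2^n sign patterns."""
    n = len(a)
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        total += abs(sum(s * x for s, x in zip(signs, a)))
    return total / 2 ** n


def random_unit_vector(n, rng):
    """Normalised Gaussian vector, i.e. a uniformly random direction in R^n."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


rng = random.Random(0)  # fixed seed so that the run is reproducible
extremal = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0, 0.0]
samples = [random_unit_vector(6, rng) for _ in range(200)]
smallest = min(rademacher_l1(a) for a in samples)

# The extremal vector attains 1/sqrt(2); no sampled vector should fall below it.
print(rademacher_l1(extremal), smallest, 1 / math.sqrt(2))
```

Such a check is of course no substitute for a proof; it only confirms that the extremal value is attained at \((\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0, \dots , 0)\) and is not undercut by the sampled directions.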

Taking that route, the point of this paper is to illustrate the robustness of Haagerup’s method and extend (1) to i.i.d. sequences of random variables whose distribution is *close* to the Rademacher one in the \(\textsf{W}_2\)-Wasserstein distance. Using the same framework, we also treat Ball’s cube slicing inequality from [3] which asserts that the maximal-volume hyperplane section of the cube \([-1,1]^n\) in \(\mathbb {R}^n\) is attained at \((1,1,0,\dots , 0)^\perp \). This can be equivalently stated in probabilistic terms as an inequality akin to (1) as follows (see, e.g. equation (2) in [6]). Let \(\xi _1, \xi _2, \dots \) be i.i.d. random vectors uniform on the unit Euclidean sphere in \(\mathbb {R}^3\). For every \(n \ge 1\) and every unit vector *a* in \(\mathbb {R}^n\), we have

\[ \mathbb{E}\left|\sum_{j=1}^n a_j\xi_j\right|^{-1} \le \sqrt{2}, \tag{2} \]

where here and throughout \(|\cdot |\) denotes the standard Euclidean norm.

Szarek’s inequality (1), Ball’s inequality (2), as well as these extensions fall under the umbrella of so-called Khinchin-type inequalities. The archetype was Khinchin’s result asserting that all \(L_p\) norms of Rademacher sums \(\sum a_j\varepsilon _j\) are comparable to its \(L_2\)-norm, established in his work [18] on the law of the iterated logarithm (and perhaps discovered independently by Littlewood in [26]). Due to the intricacies of the methods involved, sharp Khinchin inequalities are known only for a handful of distributions, most notably random signs [14, 29], but also uniforms [2, 5, 6, 8, 19, 22, 25], type L [17, 32], Gaussian mixtures [1, 10], marginals of \(\ell _p\)-balls [4, 11], or distributions with good spectral properties [23, 33]. The present work makes a first step towards more general distributions satisfying only a closeness-type assumption instead of imposing structural properties. Viewing sharp Khinchin-type inequalities as maximization problems for functionals on the sphere, our results assert, perhaps surprisingly, the fact that such inequalities are stable with respect to perturbations of the law of the underlying random vectors. These *distributional stability* results are novel in the context of optimal probabilistic inequalities.

## 2 Main results

For \(p > 0\) and a random vector *X* in \(\mathbb {R}^d\), we denote its \(L_p\)-norm with respect to the standard Euclidean norm \(|\cdot |\) on \(\mathbb {R}^d\) by \(\Vert X\Vert _p = (\mathbb {E}|X|^p)^{1/p}\), whereas for a (deterministic) vector *a* in \(\mathbb {R}^n\), \(\Vert a\Vert _\infty = \max _{j \le n} |a_j|\) is its \(\ell _\infty \)-norm. We say that the random vector *X* in \(\mathbb {R}^d\) is symmetric if \(-X\) has the same distribution as *X*. We also recall that the vector *X* is called rotationally invariant if for every orthogonal map *U* on \(\mathbb {R}^d\), *UX* has the same distribution as *X*. Equivalently, *X* has the same distribution as \(|X|\xi \), where \(\xi \) is uniformly distributed on the unit sphere \(\mathbb {S}^{d-1}\) in \(\mathbb {R}^d\) and independent of |*X*|. Recall that the \(\textsf{W}_2\)-Wasserstein distance \(\textsf{W}_2(X,Y)\) between (the distributions of) two random vectors *X* and *Y* in \(\mathbb {R}^d\) is defined as \(\inf _{(X',Y')} \Vert X'-Y'\Vert _2\), where the infimum is taken over all couplings of *X* and *Y*, that is, all random vectors \((X',Y')\) in \(\mathbb {R}^{2d}\) such that \(X'\) has the same distribution as *X* and \(Y'\) has the same distribution as *Y*.

Our first result is an extension of Szarek’s inequality (1) which reads as follows.

### Theorem 1

There is a positive universal constant \(\delta _0\) such that if we let \(X_1, X_2, \dots \) be i.i.d. symmetric random variables satisfying

\[ \big\Vert |X_1| - 1 \big\Vert_2 \le \delta_0, \tag{3} \]

then for every \(n \ge 3\) and unit vectors *a* in \(\mathbb {R}^n\) with \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\), we have

\[ \mathbb{E}\left|\sum_{j=1}^n a_jX_j\right| \ge \mathbb{E}\left|\frac{X_1+X_2}{\sqrt{2}}\right|. \tag{4} \]

Moreover, we can take \(\delta _0 = 10^{-4}\).

Note that the left-hand side of (3) is nothing but the \(\textsf{W}_2\)-Wasserstein distance between the distribution of \(X_1\) and the Rademacher distribution since \(|x\pm 1| \ge \big ||x|-1\big |\) for \(x \in \mathbb {R}\) and thus the optimal coupling of the two distributions is \(\big (X_1,\textrm{sign}(X_1)\big )\).
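
To make the closeness parameter concrete, here is a minimal sketch that evaluates \(\big \Vert |X|-1\big \Vert _2\) for a discrete symmetric law, that is, the cost of the coupling \(\big (X,\textrm{sign}(X)\big )\). The function name `coupling_cost` and the four-point example are hypothetical and chosen only for illustration.

```python
import math


def coupling_cost(atoms, probs):
    """|| |X| - 1 ||_2 for a discrete law: the L2 cost of pairing X with sign(X)."""
    return math.sqrt(sum(p * (abs(x) - 1.0) ** 2 for x, p in zip(atoms, probs)))


# Hypothetical example: a symmetric four-point perturbation of a random sign.
atoms = [-1.1, -0.9, 0.9, 1.1]
probs = [0.25, 0.25, 0.25, 0.25]
delta = coupling_cost(atoms, probs)

# Pointwise inequality behind the optimality of the coupling (X, sign X).
pointwise_ok = all(abs(x - e) >= abs(abs(x) - 1.0) - 1e-15
                   for x in atoms for e in (-1.0, 1.0))
print(delta, pointwise_ok)
```

In this example the distance is about 0.1, far larger than \(\delta _0 = 10^{-4}\), so the law lies outside the regime covered by Theorem 1; the snippet only illustrates how the quantity is computed.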

Our second main result provides an analogous extension for Ball’s inequality (2).

### Theorem 2

Let \(X_1, X_2, \dots \) be i.i.d. symmetric random vectors in \(\mathbb {R}^3\). Suppose their common characteristic function \(\phi (t) = \mathbb {E}e^{i\!\left\langle t, X_1 \right\rangle \!}\) satisfies

\[ |\phi(t)| \le \frac{C_0}{|t|}, \qquad t \in \mathbb{R}^3\setminus\{0\}, \tag{5} \]

for some constant \(C_0 > 0\). Assume that

\[ \textsf{W}_2(X_1,\xi) \le 10^{-38}\, C_1^{-9}\min\left\{\big(\mathbb{E}|X_1|^3\big)^{-6},\, 1\right\}, \tag{6} \]

where \(C_1 = \max \{C_0,1\}\) and \(\xi \) is a random vector uniform on the unit Euclidean sphere \(\mathbb {S}^2\) in \(\mathbb {R}^3\). Then for every \(n \ge 3\) and unit vectors *a* in \(\mathbb {R}^n\) with \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\), we have

\[ \mathbb{E}\left|\sum_{j=1}^n a_jX_j\right|^{-1} \le \mathbb{E}\left|\frac{X_1+X_2}{\sqrt{2}}\right|^{-1}. \]

Plainly, if we know that \(X_1\) and \(\xi \) are sufficiently close in \(\textsf{W}_3\), then the parameter \(\mathbb {E}|X_1|^{3}\) in (6) is redundant. In contrast to Theorem 1, here the closeness assumption (6) is put in terms of two parameters of the distribution: its third moment and the polynomial decay of its characteristic function. It is not clear whether this is essential. At the technical level of our proofs, the third moment is needed to carry out a certain Gaussian approximation, whilst the decay assumption has to do with an a priori lack of integrability in the Fourier-analytic representation of the \(L_{-1}\) norm (as opposed to the \(L_1\)-norm handled in Theorem 1).

On the other hand, neither of these is very restrictive. In particular, if \(X_1\) has a density *f* on \(\mathbb {R}^3\) vanishing at \(\infty \) whose gradient is integrable, then

\[ |\phi(t)| \le \frac{\sqrt{3}}{|t|}\int_{\mathbb{R}^3}|\nabla f|, \qquad t \in \mathbb{R}^3\setminus\{0\}, \]

so (5) holds with \(C_0 = \sqrt{3}\int _{\mathbb {R}^3}|\nabla f|\).

Another natural sufficient condition is the rotational invariance of \(X_1\): if, say, \(X_1\) has the same distribution as \(R\xi \), for a nonnegative random variable *R* and a random vector \(\xi \) uniform on the unit sphere \(\mathbb {S}^2\) and independent of *R*, then Archimedes’ Hat-Box theorem implies that \(\left\langle t, R\xi \right\rangle \!\), conditioned on the value of *R*, is uniform on \([-R|t|,R|t|]\) and thus

\[ |\phi(t)| = \left|\mathbb{E}\,\frac{\sin (R|t|)}{R|t|}\right| \le \frac{\mathbb{E}R^{-1}}{|t|}, \qquad t \in \mathbb{R}^3\setminus\{0\}. \]

Moreover, in this case \(\textsf{W}_2(X_1,\xi ) = \Vert R-1\Vert _2\) (since for all unit vectors \(\theta , \theta '\) in \(\mathbb {R}^d\) and \(R \ge 0\), we have \(|R\theta -\theta '| \ge |R-1|\), as is easily seen by squaring). Probabilistically, this is an important special case as it yields results for symmetric unimodal distributions on \(\mathbb {R}\). Indeed, if *X* is of the form \(R\xi \) as above, then for \(q > -1\) we have the identity

\[ \mathbb{E}\left|\sum_{j=1}^n a_jX_j\right|^{q} = (1+q)\,\mathbb{E}\left|\sum_{j=1}^n a_jR_jU_j\right|^{q}, \]

where the \(R_j\) are i.i.d. copies of *R* and the \(U_j\) are i.i.d. uniform random variables on \([-1,1]\), independent of the \(R_j\) (see Proposition 4 in [20]). The \(R_jU_j\) showing up in this formula can have any symmetric unimodal distribution, uniquely defined by the distribution of \(R_j\). Thus, if \(V_1, V_2, \dots \) are i.i.d. symmetric unimodal random variables, Theorem 2 immediately yields a sharp upper bound on \(\lim _{q\downarrow -1}(1+q)\mathbb {E}\left| \sum _{j=1}^n a_jV_j\right| ^{q}\) for all unit vectors *a* with \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\) (cf. [5, 6, 11, 25]).

A result in the same vein as Theorem 2 is König and Koldobsky’s extension [20] of Ball’s cube slicing inequality to product measures with densities satisfying certain regularity and moment assumptions. Their result also applies specifically to vectors of weights satisfying the small coefficient condition \(\Vert a\Vert _\infty \le \tfrac{1}{\sqrt{2}}\).

Approached differently, *full* extensions of (1) and (2) (i.e. without the small coefficient restriction on *a*) have been obtained in our recent work [12] for a very special family of distributions corresponding geometrically to extremal sections and projections of \(\ell _p\)-balls.

## 3 Proof of Theorem 1

Our approach builds on Haagerup’s slick Fourier-analytic proof from [14]. We let

\[ \phi(t) = \mathbb{E}e^{itX_1}, \qquad t \in \mathbb{R}, \]

be the characteristic function of \(X_1\). Using the elementary Fourier-integral representation

\[ |x| = \frac{2}{\pi}\int_0^\infty \frac{1-\cos (tx)}{t^2}\,\textrm{d}t, \qquad x \in \mathbb{R}, \]

as well as the symmetry and independence of the \(X_j\), we have,

\[ \mathbb{E}\left|\sum_{j=1}^n a_jX_j\right| = \frac{2}{\pi}\int_0^\infty \left[1 - \prod_{j=1}^n \phi(a_jt)\right]t^{-2}\,\textrm{d}t \tag{10} \]

(see also Lemma 1.2 in [14]). If *a* is a unit vector in \(\mathbb {R}^n\) with nonzero components, using the AM-GM inequality, we obtain Haagerup’s lower bound

\[ \mathbb{E}\left|\sum_{j=1}^n a_jX_j\right| \ge \sum_{j=1}^n a_j^2\,\Psi\big(a_j^{-2}\big), \tag{11} \]

where

\[ \Psi(s) = \frac{2}{\pi}\int_0^\infty \left[1 - \left|\phi\left(\frac{t}{\sqrt{s}}\right)\right|^s\right]t^{-2}\,\textrm{d}t, \qquad s > 0 \tag{12} \]

(see Lemma 1.3 in [14]). The crucial lemma reads as follows.

### Lemma 3

Under the assumptions of Theorem 1, we have \(\Psi (s) \ge \Psi (2)\) for every \(s\ge 2\).

If we take the lemma for granted, the proof of Theorem 1 is finished because the small coefficient assumption \(\Vert a\Vert _\infty \le \frac{1}{\sqrt{2}}\) gives \(\Psi (a_j^{-2}) \ge \Psi (2)\) for each *j*, and as a result we get

\[ \mathbb{E}\left|\sum_{j=1}^n a_jX_j\right| \ge \sum_{j=1}^n a_j^2\,\Psi\big(a_j^{-2}\big) \ge \Psi(2)\sum_{j=1}^n a_j^2 = \Psi(2) = \mathbb{E}\left|\frac{X_1+X_2}{\sqrt{2}}\right|, \]

where the last equality is justified by (10).

It remains to prove Lemma 3. To this end, we recall that if the \(X_j\) were Rademacher random variables, then the special function \(\Psi \) becomes

\[ \Psi_0(s) = \frac{2}{\pi}\int_0^\infty \left[1 - \left|\cos \left(\frac{t}{\sqrt{s}}\right)\right|^s\right]t^{-2}\,\textrm{d}t. \tag{13} \]

Haagerup showed that for every \(s>0\),

\[ \Psi_0(s) = \frac{2}{\sqrt{\pi s}}\,\frac{\Gamma\left(\frac{s+1}{2}\right)}{\Gamma\left(\frac{s}{2}\right)} = \sqrt{\frac{2}{\pi}}\,\prod_{k=0}^\infty \left(1 - \frac{1}{(s+2k+1)^2}\right)^{1/2} \tag{14} \]

and concluded by the product representation that \(\Psi _0\) is strictly increasing. In particular, Lemma 3 holds in the Rademacher case due to monotonicity. The rest of the proof builds exactly on this observation: we show that the closeness of distributions guarantees that \(\Psi \) and \(\Psi _0\) are close for, say \(s \ge 3\), and that their derivatives are close for \(2 \le s \le 3\). Crucially, not only do we know that \(\Psi _0\) is strictly monotone, but also we can get a good bound on its derivative near the endpoint \(s=2\), which we record now for future use.
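
The monotonicity of \(\Psi _0\) and the endpoint values used later can be checked numerically. The sketch below assumes the Gamma-function closed form \(\Psi _0(s) = \frac{2}{\sqrt{\pi s}}\Gamma (\frac{s+1}{2})/\Gamma (\frac{s}{2})\), which reproduces the values \(\Psi _0(2) = \frac{1}{\sqrt{2}}\) and \(\Psi _0(3) = \frac{4}{\pi \sqrt{3}}\) quoted in Sect. 3.3; the function name `psi0` and the grid are our own choices.

```python
import math


def psi0(s):
    """Assumed closed form 2/sqrt(pi*s) * Gamma((s+1)/2) / Gamma(s/2), via log-gamma."""
    log_ratio = math.lgamma((s + 1) / 2) - math.lgamma(s / 2)
    return 2.0 / math.sqrt(math.pi * s) * math.exp(log_ratio)


grid = [2 + 0.05 * k for k in range(400)]  # s from 2 up to just below 22
values = [psi0(s) for s in grid]
increasing = all(x < y for x, y in zip(values, values[1:]))

# Endpoint values, monotonicity on the grid, and the Gaussian limit sqrt(2/pi).
print(psi0(2), psi0(3), increasing, psi0(1e6), math.sqrt(2 / math.pi))
```

Working with `math.lgamma` rather than `math.gamma` avoids overflow for large *s*, where \(\Psi _0(s)\) approaches the Gaussian value \(\sqrt{2/\pi }\).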

### Lemma 4

We have

### Proof

Differentiating Haagerup’s product expression (14) term-by-term yields

\(\square \)

The rest of this section is devoted to the proof of Lemma 3. We break it into several parts.

### 3.1 A uniform bound on the characteristic function

### Lemma 5

Let *X* be a symmetric random variable satisfying (3). Then its characteristic function \(\phi (t) = \mathbb {E}e^{itX}\) satisfies,

\[ |\phi(t) - \cos t| \le \frac{\delta_0(\delta_0+2)}{2}\,t^2, \qquad t \in \mathbb{R}. \]

### Proof

By symmetry, the triangle inequality and the bound \(|\sin u| \le |u|\), we get

\[ |\phi(t) - \cos t| = \Big|\mathbb{E}\big[\cos (t|X|) - \cos t\big]\Big| \le 2\,\mathbb{E}\left|\sin \left(\frac{t(|X|-1)}{2}\right)\sin \left(\frac{t(|X|+1)}{2}\right)\right| \le \frac{t^2}{2}\,\mathbb{E}\Big[\big||X|-1\big|\,\big(|X|+1\big)\Big] \le \frac{t^2}{2}\,\big\Vert |X|-1\big\Vert_2\,\big\Vert |X|+1\big\Vert_2, \]

using the Cauchy–Schwarz inequality in the last estimate. Moreover,

\[ \big\Vert |X|+1\big\Vert_2 \le \big\Vert |X|-1\big\Vert_2 + 2. \]

Plugging in the assumption \(\big \Vert |X|-1\big \Vert _2 \le \delta _0\) completes the proof. \(\square \)
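
The steps of this proof produce a quadratic bound of the form \(|\phi (t) - \cos t| \le \frac{1}{2}\delta _0(\delta _0+2)t^2\); we assume this form here. As a hedged illustration, it can be tested on the simplest perturbation, \(X = \pm (1+h)\) with equal probabilities, for which \(\phi (t) = \cos ((1+h)t)\) and \(\big \Vert |X|-1\big \Vert _2 = h\). The name `check_quadratic_bound` and the grid parameters are hypothetical.

```python
import math


def check_quadratic_bound(h, t_max=50.0, steps=5000):
    """Largest value of |cos((1+h)t) - cos t| - h(h+2)t^2/2 over a grid of t in [0, t_max]."""
    worst = -math.inf
    for k in range(steps + 1):
        t = t_max * k / steps
        gap = abs(math.cos((1 + h) * t) - math.cos(t)) - h * (h + 2) * t * t / 2
        worst = max(worst, gap)
    return worst


# A non-positive result (up to rounding) means the bound holds on the grid.
for h in (1e-4, 1e-2, 0.3):
    print(h, check_quadratic_bound(h))
```

The bound is tight to first order as \(t \rightarrow 0\), which is why the largest grid value of the difference is attained at \(t = 0\).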

### 3.2 Uniform bounds on the special function and its derivative

### Lemma 6

Assuming (3) and the symmetry of \(X_1\), the functions \(\Psi \) and \(\Psi _0\) defined in (12) and (13) respectively satisfy

\[ |\Psi(s) - \Psi_0(s)| \le \frac{2}{\pi}\sqrt{2\delta_0(\delta_0+2)}, \qquad s \ge 1. \]

### Proof

Fix \(T>0\). Breaking the integral defining \(\Psi \) into \(\int _0^T + \int _T^\infty \) and using that \(|a-b| \le 1\) for \(a, b \in [0,1]\), we obtain

\[ |\Psi(s) - \Psi_0(s)| \le \frac{2}{\pi}\int_0^T \left|\left|\phi\left(\frac{t}{\sqrt{s}}\right)\right|^s - \left|\cos \left(\frac{t}{\sqrt{s}}\right)\right|^s\right|t^{-2}\,\textrm{d}t + \frac{2}{\pi T}. \]

We also have \(\big ||a|^s - |b|^s\big | \le s|a-b|\) for \(a, b \in [-1,1]\), \(s \ge 1\), thus Lemma 5 yields

\[ \int_0^T \left|\left|\phi\left(\frac{t}{\sqrt{s}}\right)\right|^s - \left|\cos \left(\frac{t}{\sqrt{s}}\right)\right|^s\right|t^{-2}\,\textrm{d}t \le \int_0^T s\cdot \frac{\delta_0(\delta_0+2)}{2}\cdot \frac{t^2}{s}\,t^{-2}\,\textrm{d}t = \frac{\delta_0(\delta_0+2)}{2}\,T. \]

Optimizing over the parameter *T* gives the desired bound. \(\square \)

### Lemma 7

For \(s \ge 2\) and \(0< u, v < 1\), we have

\[ \big|u^s\log u - v^s\log v\big| \le |u - v|. \]

### Proof

Let \(f(x)=x^s \log x\). It suffices to prove that on (0, 1) we have \(|f'(x)| \le 1\), which is equivalent to \(|\alpha t \log t + t| \le 1\) with \(t = x^{s-1} \in (0,1)\) and \(\alpha = \frac{s}{s-1} \in [1,2]\). To prove this observe that for \(t\in (0,1)\) we have \(\alpha t \log t + t \le t \le 1\) and

\[ \alpha t\log t + t \ge 2t\log t \ge -\frac{2}{e} > -1. \]

\(\square \)
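
The derivative bound \(|f'(x)| \le 1\) for \(f(x) = x^s\log x\) on (0, 1), on which this proof rests, is easy to probe numerically. The sketch below evaluates \(f'(x) = x^{s-1}(s\log x + 1)\) on a grid; the helper name `max_abs_derivative` and the grid size are illustrative choices.

```python
import math


def max_abs_derivative(s, steps=20000):
    """Grid maximum of |f'(x)| for f(x) = x^s log x on the open interval (0, 1)."""
    worst = 0.0
    for k in range(1, steps):
        x = k / steps
        deriv = x ** (s - 1) * (s * math.log(x) + 1)  # f'(x) = x^(s-1) (s log x + 1)
        worst = max(worst, abs(deriv))
    return worst


for s in (2, 2.5, 3, 10, 100):
    print(s, max_abs_derivative(s))
```

The grid maxima approach 1 from below, reflecting that \(f'(x) \rightarrow 1\) as \(x \rightarrow 1\), so the Lipschitz constant 1 cannot be improved.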

### Lemma 8

Assuming (3) and the symmetry of \(X_1\), the functions \(\Psi \) and \(\Psi _0\) defined in (12) and (13) satisfy

### Proof

Changing the variables and differentiating gives

\[ \Psi(s) = \frac{2}{\pi \sqrt{s}}\int_0^\infty \big[1 - |\phi(t)|^s\big]t^{-2}\,\textrm{d}t, \qquad \Psi'(s) = -\frac{1}{2s}\Psi(s) - \frac{2}{\pi \sqrt{s}}\int_0^\infty |\phi(t)|^s\log |\phi(t)|\,t^{-2}\,\textrm{d}t. \]

Thus,

\[ |\Psi'(s) - \Psi_0'(s)| \le \frac{1}{2s}|\Psi(s) - \Psi_0(s)| + \frac{2}{\pi \sqrt{s}}\int_0^\infty \Big||\phi(t)|^s\log |\phi(t)| - |\cos t|^s\log |\cos t|\Big|\,t^{-2}\,\textrm{d}t. \]

To estimate the integral, we proceed along the same lines as in the proof of Lemma 6. We fix \(T > 0\), write \(\int _0^\infty = \int _0^T + \int _T^\infty \) and for the second integral use \(|u^s\log u| = \frac{1}{s}|u^s\log (u^s)| \le \frac{1}{es}\), \(0< u < 1\), to get a bound on it by \(\frac{2}{esT}\), whilst for the first integral, using first Lemma 7 and then Lemma 5, we obtain

\[ \int_0^T \Big||\phi(t)|^s\log |\phi(t)| - |\cos t|^s\log |\cos t|\Big|\,t^{-2}\,\textrm{d}t \le \int_0^T |\phi(t) - \cos t|\,t^{-2}\,\textrm{d}t \le \frac{\delta_0(\delta_0+2)}{2}\,T. \]

Altogether, with the aid of Lemma 6,

\[ |\Psi'(s) - \Psi_0'(s)| \le \frac{1}{\pi s}\sqrt{2\delta_0(\delta_0+2)} + \frac{2}{\pi \sqrt{s}}\left(\frac{\delta_0(\delta_0+2)}{2}\,T + \frac{2}{esT}\right). \]

Minimising the second term over \(T > 0\) leads to the bound by

\[ \frac{1}{\pi s}\left(\sqrt{2} + \frac{4}{\sqrt{e}}\right)\sqrt{\delta_0(\delta_0+2)}. \]

For \(s \ge 2\), we have \(\frac{1}{\pi s}\left( \sqrt{2}+\frac{4}{\sqrt{e}}\right) < 0.61\ldots \) and this completes the proof. \(\square \)

### 3.3 Proof of Lemma 3

First we assume that \(s \ge 3\). Using Lemma 6 and letting \(\eta = \frac{2}{\pi }\sqrt{2\delta _0(\delta _0+2)}\) for brevity, we get

Since \(\Psi _0\) is increasing, \(\Psi _0(s) \ge \Psi _0(3) = \Psi _0(3) - \Psi _0(2) + \Psi _0(2)\) and \(\Psi _0(2) \ge \Psi (2) -\eta \), again using Lemma 6. Therefore,

It is now clear that as long as \(\delta _0\) is sufficiently small, namely \(2\eta \le \Psi _0(3) - \Psi _0(2)\), we get \(\Psi (s) \ge \Psi (2)\), as desired. It can be checked that \(\Psi _0(3) - \Psi _0(2) = \frac{4}{\pi \sqrt{3}} - \frac{1}{\sqrt{2}} = 0.027..\) and a choice of \(\delta _0 \le 10^{-4}\) suffices for the estimate \(\Psi (s)\ge \Psi (2)\) to hold for \(s\ge 3\).
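
The arithmetic behind the choice \(\delta _0 = 10^{-4}\) in this regime can be reproduced in a few lines. The sketch below uses only quantities stated above, \(\eta = \frac{2}{\pi }\sqrt{2\delta _0(\delta _0+2)}\) and \(\Psi _0(3) - \Psi _0(2) = \frac{4}{\pi \sqrt{3}} - \frac{1}{\sqrt{2}}\); the function name `admissible` is our own.

```python
import math

# Psi_0(3) - Psi_0(2), as stated in the text.
GAP = 4 / (math.pi * math.sqrt(3)) - 1 / math.sqrt(2)


def eta(delta0):
    """The error eta = (2/pi) * sqrt(2 * delta0 * (delta0 + 2)) from Lemma 6."""
    return (2 / math.pi) * math.sqrt(2 * delta0 * (delta0 + 2))


def admissible(delta0):
    """True when 2 * eta <= Psi_0(3) - Psi_0(2), the condition needed for s >= 3."""
    return 2 * eta(delta0) <= GAP


print(GAP, 2 * eta(1e-4), admissible(1e-4), admissible(2e-4))
```

With \(\delta _0 = 10^{-4}\) the left-hand side \(2\eta \) is roughly 0.0255 against a gap of roughly 0.0280, so the margin in this step is modest; doubling \(\delta _0\) already violates the condition.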

Now we assume that \(2< s < 3\). We have

for some \(2< \theta < s\). Using Lemmas 4 and 8, we get

which is positive for all \(\delta _0 \le 3.7\cdot 10^{-4}\). Thus, \(\Psi (s) \ge \Psi (2)\) holds in both cases.\(\square \)

## 4 Proof of Theorem 2

The approach is the same as for Theorem 1, however certain technical details are substantially more involved. We begin with a Fourier-analytic representation for *negative* moments due to Gorin and Favorov [13].

### Lemma 9

(Lemma 3 in [13]) For a random vector *X* in \(\mathbb {R}^d\) and \(-d< q < 0\), we have

\[ \mathbb{E}|X|^q = \beta_{q,d}\int_{\mathbb{R}^d} \mathbb{E}e^{i\left\langle t, X\right\rangle}\,|t|^{-q-d}\,\textrm{d}t, \]

where \(\beta _{q,d} = 2^{q}\pi ^{-d/2}\frac{\Gamma \left( (d+q)/2\right) }{\Gamma (-q/2)}\), provided that the integral on the right hand side exists.
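
As a quick check of the normalising constant, the formula for \(\beta _{q,d}\) can be evaluated directly; at \(d = 3\), \(q = -1\) it returns \(\frac{1}{2\pi ^2}\), the value used in the specialisation that follows. The function name `beta` is an illustrative choice.

```python
import math


def beta(q, d):
    """beta_{q,d} = 2^q * pi^(-d/2) * Gamma((d+q)/2) / Gamma(-q/2), valid for -d < q < 0."""
    if not -d < q < 0:
        raise ValueError("need -d < q < 0")
    return 2.0 ** q * math.pi ** (-d / 2) * math.gamma((d + q) / 2) / math.gamma(-q / 2)


print(beta(-1, 3), 1 / (2 * math.pi ** 2))
```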

Specialised to \(d = 3\), \(q = -1\) (\(\beta _{-1,3} = \frac{1}{2\pi ^2}\)) and \(X = \sum _{j=1}^n a_jX_j\) with \(X_1,\ldots ,X_n\) independent random vectors, we obtain

\[ \mathbb{E}\left|\sum_{j=1}^n a_jX_j\right|^{-1} = \frac{1}{2\pi^2}\int_{\mathbb{R}^3}\left(\prod_{j=1}^n \mathbb{E}e^{i\left\langle a_jt, X_j\right\rangle}\right)|t|^{-2}\,\textrm{d}t. \]

Note that thanks to the decay assumption (5), the integral on the right hand side converges as long as \(n \ge 2\) (assuming the \(a_j\) are nonzero). As in Ball’s proof from [3], Hölder’s inequality yields

\[ \mathbb{E}\left|\sum_{j=1}^n a_jX_j\right|^{-1} \le \prod_{j=1}^n \Phi\big(a_j^{-2}\big)^{a_j^2}, \]

where

\[ \Phi(s) = \frac{1}{2\pi^2}\int_{\mathbb{R}^3}\left|\phi\left(s^{-1/2}t\right)\right|^s|t|^{-2}\,\textrm{d}t \tag{22} \]

with

\[ \phi(t) = \mathbb{E}e^{i\left\langle t, X_1\right\rangle}, \qquad t \in \mathbb{R}^3, \]

denoting the characteristic function of \(X_1\). Exactly as in the proof of Theorem 1, the following pivotal lemma allows us to finish the proof.

### Lemma 10

Under the assumptions of Theorem 2, we have \(\Phi (s) \le \Phi (2)\) for every \(s\ge 2\).

If the \(X_j\) are uniform on the unit sphere \(\mathbb {S}^2\) in \(\mathbb {R}^3\), we have \(\phi (t) = \frac{\sin |t|}{|t|}\) (because \(\left\langle t, X_1 \right\rangle \!\) is uniform on \([-|t|,|t|]\)), in which case the special function \(\Phi \) defined in (22) becomes

\[ \Phi_0(s) = \frac{2}{\pi}\int_0^\infty \left|\frac{\sin (t/\sqrt{s})}{t/\sqrt{s}}\right|^s\textrm{d}t = \frac{2\sqrt{s}}{\pi}\int_0^\infty \left|\frac{\sin t}{t}\right|^s\textrm{d}t \tag{24} \]

(after integrating in polar coordinates). Ball’s celebrated integral inequality states that \(\Phi _0(s) \le \Phi _0(2)\), for all \(s \ge 2\) (see Lemma 3 in [3], as well as [28, 31] for different proofs). Our proof of Lemma 10 relies on this, additional bounds on the derivative \(\Phi _0'(s)\) near \(s=2\), as well as, crucially, bounds quantifying how close \(\Phi \) is to \(\Phi _0\). In the following subsections we gather such results and then conclude with the proof of Lemma 10.
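
Ball's integral inequality can be illustrated numerically. The sketch below assumes the one-dimensional form \(\Phi _0(s) = \frac{2\sqrt{s}}{\pi }\int _0^\infty \left| \frac{\sin t}{t}\right| ^s \textrm{d}t\), obtained from (22) with \(\phi (t) = \frac{\sin |t|}{|t|}\) by passing to polar coordinates, and approximates it by a truncated trapezoid rule. The name `phi0`, the cut-off and the step size are our own choices, and the truncation error at \(s = 2\) is of order \(10^{-4}\).

```python
import math


def phi0(s, t_max=1000.0, steps=200_000):
    """Trapezoid approximation of (2*sqrt(s)/pi) * int_0^t_max |sin t / t|^s dt."""
    h = t_max / steps
    total = 0.5  # half-weight for the endpoint t = 0, where |sin t / t|^s equals 1
    for k in range(1, steps):
        t = k * h
        total += abs(math.sin(t) / t) ** s
    total += 0.5 * abs(math.sin(t_max) / t_max) ** s
    return 2.0 * math.sqrt(s) / math.pi * total * h


values = {s: phi0(s) for s in (2, 3, 4, 6)}
for s, v in values.items():
    print(s, v)
print(math.sqrt(2), math.sqrt(6 / math.pi))  # Phi_0(2) and the limit Phi_0(infinity)
```

The computed values stay below \(\Phi _0(2) = \sqrt{2}\), in line with Ball's inequality, and at \(s = 4\) the exact value \(\frac{4}{3}\) (from \(\int _0^\infty (\sin t/t)^4 \textrm{d}t = \frac{\pi }{3}\)) is recovered to high accuracy.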

### 4.1 A uniform bound on the characteristic function

Throughout these sections \(\xi \) always denotes a random vector uniform on the unit sphere \(\mathbb {S}^2\) in \(\mathbb {R}^3\).

### Lemma 11

Let *X* be a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X,\xi )\). Then, its characteristic function \(\phi (t) = \mathbb {E}e^{i\!\left\langle t, X \right\rangle \!}\) satisfies

\[ \left|\phi(t) - \frac{\sin |t|}{|t|}\right| \le \frac{\delta(\delta+2)}{2}\,|t|^2, \qquad t \in \mathbb{R}^3\setminus\{0\}. \]

### Proof

Let \(\xi \) be uniform on \(\mathbb {S}^2\) such that for the joint distribution of \((X, \xi )\), we have \(\Vert X-\xi \Vert _2 = \textsf{W}_2(X,\xi ) = \delta \). By symmetry, the bound \(|\sin u| \le |u|\) and the Cauchy–Schwarz inequality (used twice), we get

\[ \left|\phi(t) - \mathbb{E}e^{i\left\langle t, \xi\right\rangle}\right| = \Big|\mathbb{E}\big[\cos \left\langle t, X\right\rangle - \cos \left\langle t, \xi\right\rangle\big]\Big| \le 2\,\mathbb{E}\left|\sin \frac{\left\langle t, X-\xi\right\rangle}{2}\,\sin \frac{\left\langle t, X+\xi\right\rangle}{2}\right| \le \frac{|t|^2}{2}\,\mathbb{E}\big[|X-\xi|\,|X+\xi|\big] \le \frac{|t|^2}{2}\,\Vert X-\xi\Vert_2\,\Vert X+\xi\Vert_2. \]

To conclude we use the triangle inequality

\[ \Vert X+\xi\Vert_2 \le \Vert X-\xi\Vert_2 + 2\Vert \xi\Vert_2 = \delta + 2. \]

\(\square \)

### 4.2 Bounds on the special function

We begin with a bound on the difference \(\Phi (s) - \Phi _0(s)\) obtained from the uniform bound on the characteristic functions (Lemma 11 above). In contrast to Lemma 6, the bound is not uniform in *s*. For *s* *not* too large (the bulk), we incur the factor \(s^{3/4}\). To fight it off for large values of *s*, we shall employ a Gaussian approximation. For that part to work, it is crucial that \(\Phi _0(2) - \Phi _0(\infty ) = \sqrt{2} - \sqrt{\frac{6}{\pi }} > 0\).

#### 4.2.1 The bulk

### Lemma 12

Let *X* be a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X,\xi )\) and characteristic function \(\phi \) satisfying (5) for some \(C_0 > 0\). Let \(\Phi \) and \(\Phi _0\) be defined through (22) and (24) respectively. For every \(s \ge 2\), we have

### Proof

Given the definitions, we have

We fix \(T>0\) and split the integration into two regions.

*Small* *t*. Using Lemma 11 and \(||a|^s-|b|^s| \le s|a-b|\) when \(|a|, |b| \le 1\), we obtain

*Large* *t*. Since \(s \ge 2\), we have

By virtue of the decay assumption (5), this is at most

Adding up these two bounds and optimising over *T* yields

Plugging this back gives the assertion. \(\square \)

#### 4.2.2 The Gaussian approximation

We now present a bound on \(\Phi (s)\) which does not grow as \(s\rightarrow \infty \) that will allow us to prove Lemma 10 for *s* sufficiently large.

### Lemma 13

Let *X* be a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X,\xi )\) and characteristic function \(\phi \) satisfying (5) for some \(C_0 > 0\). Let \(\Phi \) be defined through (22). Assuming that \(\delta \le \min \{\frac{1}{\sqrt{3}}, (15C_0)^{-2}\}\), we have

with arbitrary \(0< \theta < \frac{(1-\delta \sqrt{3})^2}{3\mathbb {E}|X|^3}\).

### Proof

We split the integral defining \(\Phi (s) = \frac{1}{2\pi ^2}\int _{\mathbb {R}^3} |\phi (s^{-1/2}t)|^s|t|^{-2} \textrm{d}t\) into several regions.

*Large t.* Using the decay condition (5), we get

Thus, for \(s \ge 2\),

as \(\frac{2 e\sqrt{s}}{\pi (s-1)} < \frac{4}{\sqrt{s}}\) for \(s \ge 2\).

*Moderate* *t*. This case is vacuous unless \(C_0 > \pi /e\). We use Lemma 11 to obtain

In this case, the condition \(\delta < (15C_0)^{-2}\) suffices to guarantee that \(\frac{1}{\pi }+\frac{\delta (\delta +2)}{2}\left( eC_0\right) ^2 < \frac{1}{e}\) (also using, say \(\delta +2 < 3\)). Then we get

*Small* *t*. For \(0< u < \pi \), we have

Fix \(0< \theta < \pi \). Then, first using Lemma 11 and then (28), we obtain

Integrating using polar coordinates and invoking the standard tail bound

the last integral gets upper bounded by

Summarising, we have shown that

*Very small* *t*. Taylor-expanding \(\phi \) at 0 with the Lagrange remainder,

for some point \(\eta \) in the segment \([0,s^{-1/2}t]\). To bound the error term, we note that

thus

We also note that in the domain \(\{|t| \le \theta \sqrt{s}\}\), the leading term \(1 - \frac{1}{2}\mathbb {E}\!\left\langle X, s^{-1/2}t \right\rangle \!^2\) is nonnegative, provided that \(\frac{1}{2}\theta ^2\mathbb {E}|X|^2 \le 1\). Since \(\Vert X\Vert _2 \le \delta + 1\) under the assumption (6), it suffices that \(\theta < \frac{\sqrt{2}}{1+\delta }\). Assuming this, we thus get

Evoking (6), let \(\xi \) be uniform on \(\mathbb {S}^2\) such that \(\Vert X-\xi \Vert _2 \le \delta \) with respect to some coupling. Then, for a fixed vector *v* in \(\mathbb {R}^3\), we obtain the bound

Thus, provided that \(\delta < \frac{1}{\sqrt{3}}\), this yields

where we have set \(\alpha = (\frac{1}{\sqrt{3}}-\delta )^2 - \frac{1}{3}\theta \mathbb {E}|X|^3\) and assumed that \(\alpha \) is positive in the last equality (guaranteed by choosing \(\theta \) sufficiently small). Then we finally obtain

Putting these three bounds together gives the assertion. Note that we have imposed the conditions \(\delta < \frac{1}{\sqrt{3}}\) and \(\delta < (15C_0)^{-2}\) when \(C_0 > \frac{\pi }{e}\), as well as \(\theta < \pi \), \(\theta < \frac{\sqrt{2}}{1+\delta }\) and \(\theta < \frac{(1-\delta \sqrt{3})^2}{3\mathbb {E}|X|^3}\). Since \(\Vert X\Vert _3 \ge \Vert X\Vert _2 \ge 1 - \delta \) and \(\delta < \frac{1}{\sqrt{3}}\), we have \(\frac{(1-\delta \sqrt{3})^2}{3\mathbb {E}|X|^3}< \frac{(1-\delta \sqrt{3})^2}{3(1-\delta )^3} = \frac{1}{3(1-\delta )}\left( \frac{1-\delta \sqrt{3}}{1-\delta }\right) ^2< \frac{1}{3-\sqrt{3}} < 0.79\). Moreover, \(\frac{\sqrt{2}}{1+\delta }> \frac{\sqrt{2}}{1+1/\sqrt{3}} > 0.89\), so the condition \(\theta < \frac{(1-\delta \sqrt{3})^2}{3\mathbb {E}|X|^3}\) implies the other two conditions on \(\theta \). \(\square \)
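
The numerical comparisons closing this proof can be verified directly. The sketch below scans \(\delta \in [0, \frac{1}{\sqrt{3}})\) on a grid and checks that \(\frac{(1-\delta \sqrt{3})^2}{3(1-\delta )^3}\) stays below \(\frac{1}{3-\sqrt{3}} < 0.79\) while \(\frac{\sqrt{2}}{1+\delta }\) stays above 0.89; the variable names and the grid size are illustrative.

```python
import math

limit = 1 / math.sqrt(3)
grid = [limit * k / 10_000 for k in range(10_000)]  # delta in [0, 1/sqrt(3))

# Upper bound for the admissible range of theta, using E|X|^3 >= (1 - delta)^3.
upper = max((1 - d * math.sqrt(3)) ** 2 / (3 * (1 - d) ** 3) for d in grid)
# The competing constraint theta < sqrt(2) / (1 + delta).
lower = min(math.sqrt(2) / (1 + d) for d in grid)

print(upper, 1 / (3 - math.sqrt(3)), lower)
```

On this grid the first quantity never exceeds its value \(\frac{1}{3}\) at \(\delta = 0\), so the stated bound \(\frac{1}{3-\sqrt{3}}\) is comfortable rather than tight.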

### 4.3 Bounds on the derivative of the special function

### Lemma 14

Let *X* be a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X,\xi )\) and characteristic function \(\phi \) satisfying (5) for some \(C_0 > 0\). Let \(\Phi \) and \(\Phi _0\) be defined through (22) and (24) respectively. For every \(s \ge 2\), we have

### Proof

First we take the derivative,

For the resulting \(\Phi -\Phi _0\) term, we use Lemma 12. To bound the difference of the integrals resulting from the second term, we fix \(T>0\) and split the integration into two regions.

*Small t.* Using Lemmas 7 and 11, we obtain

*Large* *t*. Note that for \(s \ge 2\), and \(0< u < 1\) we have,

Thus,

which, after applying the decay condition (5), gets upper bounded by

Adding up these two bounds and optimising over *T* yields

Going back to the difference of the derivatives, we arrive at the desired bound using

\(\square \)

### 4.4 Bounds on Ball’s special function

We will need two estimates on \(\Phi _0\) defined in (24), that is

\[ \Phi_0(s) = \frac{2\sqrt{s}}{\pi}\int_0^\infty \left|\frac{\sin t}{t}\right|^s\textrm{d}t. \]

First, we have a bound on the derivative near \(s=2\).

### Lemma 15

For \(2 \le s \le 2.01\), we have \(\Phi _0'(s) \le -0.02\).

Second, on the complementary range, \(\Phi _0(s)\) is separated from its supremal value \(\Phi _0(2)\).

### Lemma 16

For \(s \ge 2.01\), we have \(\Phi _0(s) \le \Phi _0(2) - 2\cdot 10^{-4}.\)

We begin with a numerical bound which will be used in the proofs of these assertions.

### Lemma 17

We have

### Proof

Using (28), we get

Moreover,

Therefore our integral is bounded above by

\(\square \)

We let

\[ I(s) = \int_0^\infty \left|\frac{\sin t}{t}\right|^s\textrm{d}t. \]

### Proof of Lemma 15

First we observe that

Note that *I* is decreasing. We have,

since \(I(2)=\frac{\pi }{2}\). Moreover,

With the aid of Lemma 17, we therefore have

Thus, for \(2 \le s \le 2.01\), we have

where in the first inequality we used that the term in parenthesis is negative. \(\square \)

For the proof of Lemma 16, we need several more estimates. First, we record a lower bound on the derivative of \(\Phi _0(s)\) for arbitrary *s*.

### Lemma 18

For \(s \ge 2\), we have \(\Phi _0'(s) \ge -\frac{12\sqrt{s}}{\pi e}\).

### Proof

We have,

so it is enough to upper bound \(|I'(s)|\). Note that

\(\square \)

Second, we obtain a quantitative drop-off of the values of \(\Phi _0\).

### Lemma 19

Let \(a \in [1, \frac{\pi }{3}]\) and suppose that for some \(s_0 \ge 2\), we have \(\Phi _0(s_0)= \sqrt{\frac{2}{a}}\). Then

\[ \Phi_0(s) \le \sqrt{\frac{2}{a}} \qquad \text{for every } s \ge s_0. \]

To prove this, we build on the argument of Nazarov and Podkorytov from [31]. For a somewhat similar bound, we refer to Proposition 7 in König and Koldobsky’s work [21] on maximal-perimeter sections of the cube. For convenience and completeness, we include all arguments in detail. We consider functions

\[ f_a(x) = e^{-\pi a x^2/2}, \qquad g(x) = \left|\frac{\sin (\pi x)}{\pi x}\right|, \qquad x > 0, \]

and their distribution functions

\[ F_a(y) = \big|\{x > 0:\ f_a(x) > y\}\big|, \qquad G(y) = \big|\{x > 0:\ g(x) > y\}\big|, \qquad y > 0. \]

### Lemma 20

For \(a \in [1,\frac{\pi }{3}]\) the function \(F_a-G\) has precisely one sign change point \(y_0\) and at this point changes sign from “−” to “\(+\)”.

### Proof

Note that \(F_a(y)=G(y)=0\) for \(y \ge 1\), so we only consider \(y \in (0,1)\). We have \(F_a(y)=\sqrt{\frac{2}{\pi a} \ln (\frac{1}{y})}\).

The function *g*(*x*) has zeros for \(x \in \mathbb {Z}\). For \(m\in \mathbb {N}\), let \(y_m = \max _{[m,m+1]} g\). We clearly have \(y_m< \frac{1}{\pi m}\) and \(y_m> g(m+\frac{1}{2}) = \frac{1}{\pi (m+\frac{1}{2})}\). Thus \(y_m \in (\frac{1}{\pi (m+\frac{1}{2})}, \frac{1}{\pi m})\), which shows that the sequence \(y_m\) is decreasing. We have the following claims.

**Claim 1**. The function \(F_a-G\) is positive on \((y_1,1)\).

Note that if \(g(x)>y_1\) then \(x \in (0,1)\). Moreover \(g(x) \le { f_a(x)}\) for \(x \in [0,1]\), since

\[ g(x) = \prod_{k=1}^\infty \left(1 - \frac{x^2}{k^2}\right) \le \exp \left(-x^2\sum_{k=1}^\infty \frac{1}{k^2}\right) = e^{-\pi^2x^2/6} \le e^{-\pi a x^2/2} = f_a(x). \]

Thus, for \(y\in (y_1,1)\), we have

**Claim 2**. The function \(F_a-G\) changes sign at least once in (0, 1).

Due to Claim 1 it is enough to show that \(F_a-G\) is sometimes negative. We have \(F_a-G \le F_1 -G\) and \(\int _0^\infty 2y(F_1(y)-G(y)) \textrm{d}y = \int (f_1^2 -g^2)=0\), so \(F_1-G\) can be negative.

**Claim 3**. The function \(F_a-G\) is increasing on \((0,y_1)\).

Clearly \(F_a'>F_1'\) and thus the claim follows from the fact that \(F_1-G\) is increasing on \((0,y_1)\), which was proved in [31] (Chapter I, Step 5). \(\square \)

### Proof of Lemma 19

The assumption \(\Phi _0(s_0) = { \sqrt{\frac{2}{a}}}\) is equivalent to

\[ \int_0^\infty \left|\frac{\sin t}{t}\right|^{s_0}\textrm{d}t = \int_0^\infty e^{-\frac{a s_0 t^2}{2\pi}}\,\textrm{d}t. \]

After changing variables and using Lemma 20, we get from the Nazarov–Podkorytov lemma (Chapter I, Step 4 in [31]) that for \(s \ge s_0\)

\[ \int_0^\infty g(x)^s\,\textrm{d}x \le \int_0^\infty f_a(x)^s\,\textrm{d}x, \qquad \text{that is,} \qquad \Phi_0(s) \le \sqrt{\frac{2}{a}}. \]

\(\square \)

### Proof of Lemma 16

Take \(s_0=2.01\) and \(a=2\Phi _0(s_0)^{-2}\) in Lemma 19. Since \(\Phi _0(2) = \sqrt{2}\), Ball’s inequality gives that \(a \ge 1\). We need to check that \(a \le \frac{\pi }{3}\). From Lemma 18, we have that for \(s \in [2,2.01]\), \(\Phi _0'(s) \ge -\frac{12\sqrt{2.01}}{\pi e} > -2\). Thus, \(\Phi _0(s_0) \ge \Phi _0(2)-2(s_0-2) = \sqrt{2} - 0.02\). Therefore, \(a< 2\cdot (\sqrt{2}-0.02)^{-2}< 1.03 < \frac{\pi }{3}\), as needed. By Lemmas 19 and 15, we thus get that for \(s \ge s_0=2.01\),

\[ \Phi_0(s) \le \Phi_0(s_0) \le \Phi_0(2) - 0.02\,(s_0 - 2) = \Phi_0(2) - 2\cdot 10^{-4}. \]

\(\square \)
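
The numerical claims in this proof are elementary and can be replayed as follows. The sketch evaluates the slope bound of Lemma 18 at \(s = 2.01\) and the resulting upper bound on \(a = 2\Phi _0(s_0)^{-2}\); the variable names are our own.

```python
import math

s0 = 2.01
slope = 12 * math.sqrt(s0) / (math.pi * math.e)  # Lemma 18 bound on |Phi_0'| at s = 2.01
phi0_lower = math.sqrt(2) - 2 * (s0 - 2)         # uses slope < 2 on the interval [2, 2.01]
a_upper = 2 / phi0_lower ** 2                    # upper bound on a = 2 * Phi_0(s0)^(-2)

print(slope, phi0_lower, a_upper, math.pi / 3)
```

The slope bound evaluates to about 1.992, just under the value 2 used in the proof, while the bound on *a* (about 1.029) leaves some room below \(\frac{\pi }{3} \approx 1.047\).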

### 4.5 Proof of Lemma 10

Recall that we assume *X* is a symmetric random vector in \(\mathbb {R}^3\) with \(\delta = \textsf{W}_2(X, \xi )\) and characteristic function \(\phi \) satisfying (5), that is \(|\phi (t)| \le C_0/|t|\), for all \(t \in \mathbb {R}^3{\setminus }\{0\}\). Let \(C_1 = \max \{C_0,1\}\). Our goal is to show that if (6) holds, that is

\[ \delta \le 10^{-38}\,C_1^{-9}\min \left\{\big(\mathbb{E}|X|^3\big)^{-6},\, 1\right\}, \]

then \(\Phi (s) \le \Phi (2)\) for all \(s \ge 2\), where \(\Phi \) is defined in (22). For the sake of clarity, we shall be fairly lavish with choosing constants. Since \(C_1 \ge 1\), the above assumes in particular that \(\delta \le 10^{-38}\). With this in mind, we note the following consequences of Lemmas 12 and 14 respectively: for \(s \ge 2\),

and similarly

We also remark that \(\Vert X\Vert _3 \ge \Vert X\Vert _2 \ge \Vert \xi \Vert _2 - \Vert X-\xi \Vert _2 = 1 - \delta \ge 1-10^{-38}\).

We break the argument into several regimes for the parameter *s*.

*Large s.* With hindsight, we set

\[ s_0 = \max \left\{10^6\big(\mathbb{E}|X|^3\big)^2,\ 2\log C_1\right\}. \]

In particular, \(s_0 \ge 10^5\). Using Lemma 13, that is

we will show that \(\Phi (s) \le \Phi (2)\) for all \(s \ge s_0\). We take \(\theta = \frac{1}{100\mathbb {E}|X|^3}\) which satisfies the conditions of the lemma and then, for the first term \(A_1\), we use

Thanks to (34), we also have

so it suffices to show that each of the second and third terms \(A_2\), \(A_3\) as well as this additional error \(A_4\) do not exceed \(\frac{1}{150}\). Using \(\delta < 10^{-38}C_1^{-9}\), we get

For the exponent in the second term \(A_2\), observe that

and, consequently,

Thus, using \(s \ge s_0 \ge 10^6(\mathbb {E}|X|^3)^2\), we get

Finally, for the third term, since \(s \ge s_0 \ge 10^5\),

therefore, since \(s \ge s_0 \ge 2\log C_1\),

*Moderate* *s*. We now assume that \(2.01 \le s \le s_0\). Using (34) twice and Lemma 16,

Inserting the bound on \(\delta \),

If \(s_0 = 10^6(\mathbb {E}|X|^3)^2\), then using the \((\mathbb {E}|X|^3)^{-3/2}\) term in the minimum and \(C_1^{-3/4} \le 1\), we get the above bounded by \(3\cdot 10^{-19/2+9/2} = 3\cdot 10^{-5}\). If \(s_0 = 2\log C_1\), then using the other term in the minimum, we get the bound by \(3\cdot 2^{3/4}10^{-19/2} C_1^{-3/4} (\log C_1)^{3/4}< 3(2/e)^{3/4}10^{-19/2} < 10^{-4}\) since \(u^{-1}\log u \le e^{-1}\) for \(u > 1\). In either case, we get the conclusion \(\Phi (s) \le \Phi (2)\).

*Small* *s*. We finally assume that \(2 \le s \le 2.01\). To argue that \(\Phi (s) \le \Phi (2)\), we will show that \(\Phi '(s) < 0\). By virtue of (35) and Lemma 15,

Since \(\delta C_1^6 \le \delta C_1^9 \le 10^{-38}\), this is clearly negative and the proof is complete.\(\square \)

## 5 Concluding remarks

### Remark 1

Assumption (3) seems natural: plainly, there are distributions which are *not* close to the Rademacher one, for which the unit vector attaining \(\inf \mathbb {E}|\sum a_jX_j|\) is different than \(a = (\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}}, 0, \dots , 0)\), for instance it is \(a = (1,0,\dots , 0)\) for Gaussian mixtures (see [1, 10]), or for the Rademacher distribution with a large atom at 0 (see Theorem 4 and Remark 14 in [16]).

### Remark 2

Handling the complementary case \(\Vert a\Vert _\infty > \frac{1}{\sqrt{2}}\) which is not covered by Theorems 1 and 2 is a different story. The trivial convexity argument presented in the introduction works in fact only for the Rademacher case, as it requires \(\frac{1}{\sqrt{2}}\mathbb {E}|X_1| \ge \mathbb {E}\left| \frac{X_1+X_2}{\sqrt{2}}\right| \), and only for the \(L_1\)-norm (see Remark 21 in [6]). To circumvent this, several different approaches have been used: Haagerup’s ad hoc approximation (see §3 in [14]), Nazarov and Podkorytov’s induction with a strengthened hypothesis (see Ch. II, Step 5 in [31]) which has also been adapted to other distributions (see [5, 6, 8]), and very recently a different inductive scheme near the extremiser (without a strengthening) needed in a geometric context (see [12]). None of these techniques appears amenable to the broad setting of general distributions that is treated in this paper.

### Remark 3

De, Diakonikolas and Servedio obtained in [9] a stable version of Szarek’s inequality (1) with respect to the unit vector *a*, namely

\[ \mathbb{E}\left|\sum_{j=1}^n a_j\varepsilon_j\right| \ge \frac{1}{\sqrt{2}} + \kappa\sqrt{\delta(a)}, \tag{37} \]

for a universal positive constant \(\kappa \), where the deficit is given by \(\delta (a) = |a - (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0, \dots , 0)|^2\), assuming that \(a_1 \ge a_2 \ge \dots \ge a_n \ge 0\). Note that in the setting of Theorem 1, we have

\[ \left|\,\mathbb{E}\left|\sum_{j=1}^n a_jX_j\right| - \mathbb{E}\left|\sum_{j=1}^n a_j\varepsilon_j\right|\,\right| \le \delta_0, \qquad \varepsilon_j = \textrm{sign}(X_j), \]

by a simple application of the triangle inequality and \(\Vert \cdot \Vert _1 \le \Vert \cdot \Vert _2\). Thus, applying this (twice) and the bound (37) of De, Diakonikolas and Servedio, we conclude that Theorem 1 also holds for unit vectors *a* with \(\delta (a) \ge (2\delta _0/\kappa )^2\). The same will apply to Theorem 2 with the aid of Theorem 1.2 from [7], a strengthening of Ball’s inequality (2) (see also [27]). See [12] for numerical values of the constants \(\kappa \).

### Remark 4

We have used the \(\textsf{W}_2\)-distance in Theorems 1 and 2 for concreteness and convenience. Of course, for every \(p \ge 1\), if we use the \(\textsf{W}_p\)-distance in (3) and assume that \(X_1\) is in \(L_{\frac{p}{p-1}}\), then the proofs of Lemmas 5 and 11 go through with the Cauchy–Schwarz inequality replaced by Hölder’s inequality and the rest of the proof remains unchanged. It might be of interest to examine weaker distances in such statements.

### Remark 5

Szarek’s sharp \(L_1 - L_2\) inequality (1) was extended to sharp \(L_p - L_2\) bounds for all \(p > 0\) by Haagerup in [14], using Fourier-integral representations of \(|x|^p\). It therefore seems plausible that our techniques allow one to extend Theorem 1 to sharp bounds on \(L_p\) norms, but additional (nontrivial and technical) work is needed to treat the analogues of the special function \(\Psi _0\) (see (13)) relevant to Haagerup’s \(L_p\) bounds. Similarly, the main result from [6], which extends (2) to sharp \(L_p - L_2\) bounds for all \(-1< p < 0\), could be a starting point for extensions of Theorem 2 to \(L_p\) norms with \(-1< p < 0\).

## Data availability

This manuscript has no associated data.

## References

1. Averkamp, R., Houdré, C.: Wavelet thresholding for non-necessarily Gaussian noise: idealism. Ann. Stat. **31**, 110–151 (2003)
2. Baernstein, A., II., Culverhouse, R.: Majorization of sequences, sharp vector Khinchin inequalities, and bisubharmonic functions. Studia Math. **152**(3), 231–248 (2002)
3. Ball, K.: Cube slicing in \({ R}^n\). Proc. Am. Math. Soc. **97**(3), 465–473 (1986)
4. Barthe, F., Naor, A.: Hyperplane projections of the unit ball of \(\ell _p^n\). Discrete Comput. Geom. **27**(2), 215–226 (2002)
5. Chasapis, G., Gurushankar, K., Tkocz, T.: Sharp bounds on \(p\)-norms for sums of independent uniform random variables, \(0<p<1\). J. Anal. Math. **149**(2), 529–553 (2023)
6. Chasapis, G., König, H., Tkocz, T.: From Ball’s cube slicing inequality to Khinchin-type inequalities for negative moments. J. Funct. Anal. **281**(9), 109185 (2021)
7. Chasapis, G., Nayar, P., Tkocz, T.: Slicing \(\ell _p\)-balls reloaded: stability, planar sections in \(\ell _1\). Ann. Probab. **50**(6), 2344–2372 (2022)
8. Chasapis, G., Singh, S., Tkocz, T.: Haagerup’s phase transition at polydisc slicing. Anal. PDE (2022). arXiv:2206.01026 (preprint, to appear)
9. De, A., Diakonikolas, I., Servedio, R.A.: A robust Khintchine inequality, and algorithms for computing optimal constants in Fourier analysis and high-dimensional geometry. SIAM J. Discrete Math. **30**(2), 1058–1094 (2016)
10. Eskenazis, A., Nayar, P., Tkocz, T.: Gaussian mixtures: entropy and geometric inequalities. Ann. Probab. **46**(5), 2908–2945 (2018)
11. Eskenazis, A., Nayar, P., Tkocz, T.: Sharp comparison of moments and the log-concave moment problem. Adv. Math. **334**, 389–416 (2018)
12. Eskenazis, A., Nayar, P., Tkocz, T.: Resilience of cube slicing in \(\ell _p\). (2022) arXiv:2211.01986 (preprint)
13. Gorin, E.A., Favorov, S.Yu.: Generalizations of the Khinchin inequality (Russian). Teor. Veroyatnost. i Primenen. **35**(4), 762–767 (1990) [translation in Theory Probab. Appl. 35(4), 766–771 (1991)]
14. Haagerup, U.: The best constants in the Khintchine inequality. Studia Math. **70**(3), 231–283 (1981)
15. Hall, R.R.: On a conjecture of Littlewood. Math. Proc. Camb. Philos. Soc. **78**(3), 443–445 (1975)
16. Havrilla, A., Tkocz, T.: Sharp Khinchin-type inequalities for symmetric discrete uniform random variables. Isr. J. Math. **246**(1), 281–297 (2021)
17. Havrilla, A., Nayar, P., Tkocz, T.: Khinchin-type inequalities via Hadamard’s factorisation. Int. Math. Res. Not. IMRN **3**, 2429–2445 (2023)
18. Khintchine, A.: Über dyadische Brüche. Math. Z. **18**(1), 109–116 (1923)
19. König, H.: On the best constants in the Khintchine inequality for Steinhaus variables. Isr. J. Math. **203**(1), 23–57 (2014)
20. König, H., Koldobsky, A.: On the maximal measure of sections of the \(n\)-cube. Geometric analysis, mathematical relativity, and nonlinear partial differential equations, 123–155, Contemp. Math., vol. 599, Amer. Math. Soc., Providence (2013)
21. König, H., Koldobsky, A.: On the maximal perimeter of sections of the cube. Adv. Math. **346**, 773–804 (2019)
22. König, H., Kwapień, S.: Best Khintchine type inequalities for sums of independent, rotationally invariant random vectors. Positivity **5**(2), 115–152 (2001)
23. Kwapień, S., Latała, R., Oleszkiewicz, K.: Comparison of moments of sums of independent random variables and differential inequalities. J. Funct. Anal. **136**(1), 258–268 (1996)
24. Latała, R., Oleszkiewicz, K.: On the best constant in the Khinchin–Kahane inequality. Studia Math. **109**(1), 101–104 (1994)
25. Latała, R., Oleszkiewicz, K.: A note on sums of independent uniformly distributed random variables. Colloq. Math. **68**(2), 197–206 (1995)
26. Littlewood, J.E.: On bounded bilinear forms in an infinite number of variables. Q. J. Math. Oxf. Ser. **1**, 164–174 (1930)
27. Melbourne, J., Roberto, C.: Quantitative form of Ball’s cube slicing in \(\mathbb{R} ^n\) and equality cases in the min-entropy power inequality. Proc. Am. Math. Soc. **150**(8), 3595–3611 (2022)
28. Melbourne, J., Roberto, C.: Transport-majorization to analytic and geometric inequalities. J. Funct. Anal. **284**(1), Paper No. 109717 (2023)
29. Nayar, P., Oleszkiewicz, K.: Khinchine type inequalities with optimal constants via ultra log-concavity. Positivity **16**(2), 359–371 (2012)
30. Nayar, P., Tkocz, T.: Extremal sections and projections of certain convex bodies: a survey. In: Koldobsky, A., Volberg, A. (eds.) Harmonic Analysis and Convexity, pp. 343–390. De Gruyter, Berlin, Boston (2023). https://www.degruyter.com/document/doi/10.1515/9783110775389-008/html
31. Nazarov, F.L., Podkorytov, A.N.: Ball, Haagerup, and distribution functions. Complex analysis, operators, and related topics, pp. 247–267, Oper. Theory Adv. Appl., vol. 113. Birkhäuser, Basel (2000)
32. Newman, C.M.: An extension of Khintchine’s inequality. Bull. Am. Math. Soc. **81**(5), 913–915 (1975)
33. Oleszkiewicz, K.: Comparison of moments via Poincaré-type inequality. Advances in stochastic inequalities (Atlanta, GA, 1997), pp. 135–148, Contemp. Math., vol. 234. Amer. Math. Soc., Providence (1999)
34. Szarek, S.: On the best constant in the Khintchine inequality. Studia Math. **58**, 197–208 (1976)
35. Tomaszewski, B.: A simple and elementary proof of the Kchintchine inequality with the best constant. Bull. Sci. Math. (2) **111**(1), 103–109 (1987)

## Acknowledgements

We should very much like to thank an anonymous referee for their careful reading of the manuscript and helpful suggestions, particularly the one leading to Remark 5.

## Funding

Open Access funding provided by Carnegie Mellon University.

## Ethics declarations

### Conflict of interest

The authors state that there is no conflict of interest.

## Additional information

### Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This material is based upon work supported by the NSF grant DMS-1929284 while A.E. was in residence at ICERM for the Harmonic Analysis and Convexity program. P.N.’s research was supported by the National Science Centre, Poland, grant 2018/31/D/ST1/0135. T.T.’s research was supported by the NSF grant DMS-2246484.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Eskenazis, A., Nayar, P. & Tkocz, T. Distributional stability of the Szarek and Ball inequalities.
*Math. Ann.* **389**, 1161–1185 (2024). https://doi.org/10.1007/s00208-023-02669-9

DOI: https://doi.org/10.1007/s00208-023-02669-9