An Almost Constant Lower Bound of the Isoperimetric Coefficient in the KLS Conjecture

We prove an almost constant lower bound of the isoperimetric coefficient in the KLS conjecture. The lower bound has the dimension dependency d^{−o_d(1)}. When the dimension is large enough, our lower bound is tighter than the previous best bound, which has the dimension dependency d^{−1/4}. Improving the current best lower bound of the isoperimetric coefficient in the KLS conjecture has many implications, including improvements of the current best bounds in Bourgain's slicing conjecture and in the thin-shell conjecture, better concentration inequalities for Lipschitz functions of log-concave measures and better mixing time bounds for MCMC sampling algorithms on log-concave measures.


Introduction
Given a distribution, the isoperimetric coefficient of a subset is the ratio of the measure of the subset boundary to the minimum of the measures of the subset and its complement. Taking the minimum of such ratios over all subsets defines the isoperimetric coefficient of the distribution, also called the Cheeger isoperimetric coefficient of the distribution.
Kannan, Lovász and Simonovits (KLS) [12] conjecture that for any log-concave distribution, the Cheeger isoperimetric coefficient equals, up to a universal constant factor, that achieved by half-spaces. If the conjecture is true, the Cheeger isoperimetric coefficient can be determined, up to a constant, by going through all half-spaces instead of all subsets. For this reason, the KLS conjecture is also called the KLS hyperplane conjecture. To make this precise, we first formally define log-concave distributions and then state the conjecture.
A probability density function p : R^d → R is log-concave if its logarithm is concave, i.e., if for any x, y ∈ R^d and any λ ∈ [0, 1],

log p(λx + (1 − λ)y) ≥ λ log p(x) + (1 − λ) log p(y). (1)

Common probability distributions such as the Gaussian, exponential and logistic distributions are log-concave. This definition also includes any uniform distribution over a convex set, defined as follows. A subset K ⊂ R^d is convex if for all x, y ∈ K, z ∈ [x, y] =⇒ z ∈ K.
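As a quick numerical illustration (a sanity check only, not part of any proof; the logistic density and the grid are our own choices), one can verify the defining inequality on a grid:

```python
import math

# Check midpoint log-concavity of the logistic density
# p(x) = e^{-x} / (1 + e^{-x})^2 on a grid: for lam in [0, 1],
# log p(lam*x + (1-lam)*y) >= lam*log p(x) + (1-lam)*log p(y).
def log_logistic_density(x):
    # log p(x) = -|x| - 2*log(1 + e^{-|x|}) by symmetry, written stably
    return -abs(x) - 2 * math.log1p(math.exp(-abs(x)))

grid = [i * 0.5 for i in range(-20, 21)]
lam = 0.3
ok = all(
    log_logistic_density(lam * x + (1 - lam) * y)
    >= lam * log_logistic_density(x) + (1 - lam) * log_logistic_density(y) - 1e-12
    for x in grid
    for y in grid
)
print(ok)
```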
The isoperimetric coefficient ψ(p) of a density p in R^d is defined as

ψ(p) := inf_{S ⊂ R^d} p^+(∂S) / min(p(S), p(S^c)), (2)

where p(S) = ∫_{x∈S} p(x) dx and the boundary measure of the subset is

p^+(∂S) := liminf_{ε→0^+} [p({x : d(x, S) ≤ ε}) − p(S)] / ε,

where d(x, S) is the Euclidean distance between x and the subset S. The KLS conjecture is stated by Kannan, Lovász and Simonovits [12] as follows.

Conjecture 1.
There exists a universal constant c such that for any log-concave density p in R^d, we have

ψ(p) ≥ c / √ρ(p),

where ρ(p) is the spectral norm of the covariance matrix of p. In other words, ρ(p) = ‖A‖_2, where A = Cov_{X∼p}(X) is the covariance matrix.
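For intuition, the half-space side of the conjecture can be computed explicitly in one dimension. For the standard Gaussian and half-spaces S = (−∞, a], the ratio in the definition of ψ is φ(a)/min(Φ(a), 1 − Φ(a)), and it is minimized at a = 0 with value 2φ(0) = √(2/π) ≈ 0.798. A short script (illustrative only) confirms this:

```python
import math

def phi(a):  # standard Gaussian density
    return math.exp(-a * a / 2) / math.sqrt(2 * math.pi)

def Phi(a):  # standard Gaussian CDF
    return 0.5 * (1 + math.erf(a / math.sqrt(2)))

# Ratio in Equation (2) restricted to half-spaces S = (-inf, a]:
# the boundary measure of S is just the density value at a.
def halfspace_ratio(a):
    return phi(a) / min(Phi(a), 1 - Phi(a))

best = min(halfspace_ratio(a * 0.01) for a in range(-300, 301))
# The minimum over half-spaces is attained at a = 0, giving sqrt(2/pi).
print(round(best, 4))
```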
An upper bound on ψ(p) of the same form is relatively easy and was shown to be achieved by half-spaces [12]. Proving the lower bound on ψ(p) up to some small factors in Conjecture 1 is the main goal of this paper. We say a log-concave density is isotropic if its mean E_{X∼p}[X] equals 0 and its covariance Cov_{X∼p}(X) equals I_d. In the case of isotropic log-concave densities, the KLS conjecture states that any isotropic log-concave density has its isoperimetric coefficient lower bounded by a universal constant. There have been many attempts to lower bound the Cheeger isoperimetric coefficient in the KLS conjecture. We refer readers to the survey paper by Lee and Vempala [18] for a detailed exposition of these attempts. In particular, the original KLS paper [12] (Theorem 5.1) shows that for any log-concave density p with covariance matrix A,

ψ(p) ≥ log(2) / (Tr(A))^{1/2}. (3)

The original KLS paper [12] only deals with uniform distributions over convex sets, but their proof techniques can be easily extended to show that the same result holds for all log-concave densities. Remark that Equation (3) implies ψ(p) ≥ log(2) / (d^{1/2} √ρ(p)). The current best bound is shown in Lee and Vempala [17], where they show that there exists a universal constant c such that for any log-concave density p with covariance matrix A,

ψ(p) ≥ c / (Tr(A²))^{1/4}.

It implies that ψ(p) ≥ c / (d^{1/4} · √ρ(p)). Note that in Lee and Vempala [17], their notation for ψ(p) is the reciprocal of ours; it is later switched in Theorem 32 of the survey paper [18] by the same authors. As a result, the above bound is not a misstatement of the results in Lee and Vempala [17]; it is simply translated into our notation. In this paper, we improve the dimension dependency d^{−1/4} to d^{−o_d(1)} in the lower bound of the isoperimetric coefficient.
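The dimension dependencies quoted above follow from elementary eigenvalue comparisons: since every eigenvalue of A is at most ‖A‖_2 = ρ(p), we have Tr(A) ≤ d ρ(p) and Tr(A²) ≤ d ρ(p)². In short:

```latex
% From the KLS bound \psi(p) \ge \log(2)/\sqrt{\mathrm{Tr}(A)}:
\mathrm{Tr}(A) \le d\,\rho(p)
\;\Longrightarrow\;
\psi(p) \ge \frac{\log 2}{\sqrt{\mathrm{Tr}(A)}} \ge \frac{\log 2}{d^{1/2}\sqrt{\rho(p)}} .

% From the Lee--Vempala bound \psi(p) \ge c/(\mathrm{Tr}(A^2))^{1/4}:
\mathrm{Tr}(A^2) \le d\,\rho(p)^2
\;\Longrightarrow\;
\psi(p) \ge \frac{c}{(\mathrm{Tr}(A^2))^{1/4}} \ge \frac{c}{d^{1/4}\sqrt{\rho(p)}} .
```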
There are many implications of improving the lower bound in the KLS conjecture. The two closely related conjectures are Bourgain's slicing conjecture [3, 4] and the thin-shell conjecture [2]. It is worth noting that Bourgain [4] stated the slicing conjecture before the introduction of the KLS conjecture. In terms of their connections to the KLS conjecture, Eldan and Klartag [9] proved that the thin-shell conjecture implies Bourgain's slicing conjecture up to a universal constant factor. Later, Eldan [8] showed that the reciprocal of the lower bound of the isoperimetric coefficient is equivalent, up to logarithmic factors, to an upper bound of the thin-shell constant in the thin-shell conjecture. Combining these two results, we have that a lower bound in the KLS conjecture implies upper bounds in the thin-shell conjecture and in Bourgain's slicing conjecture.
The current best upper bound of the thin-shell constant has the dimension dependency d^{1/4}, due to Lee and Vempala's [17] improvement in the KLS conjecture. The current best bound of the slicing constant in Bourgain's slicing conjecture also has the dimension dependency d^{1/4}, proved by Klartag [13] without using the KLS conjecture. Klartag's slicing constant bound is a slight improvement over Bourgain's earlier slicing bound [4], which has the dimension dependency d^{1/4} log(d). Given the current best bounds in these three conjectures and the relations among them, we conclude that improving the current best lower bound in the KLS conjecture improves the current best bounds for the other two conjectures, as noted in Lee and Vempala [18]. For a detailed exposition of the three conjectures and related results since the introduction of Bourgain's slicing conjecture, we refer readers to Klartag and Milman [14].
Additionally, improving the lower bound in the KLS conjecture also improves concentration inequalities for Lipschitz functions of log-concave measures. It also leads to faster mixing time bounds of Markov chain Monte Carlo (MCMC) sampling algorithms on log-concave measures. Despite the great importance of these results, deriving them from our new bound in the KLS conjecture is not the main focus of our paper. We refer readers to Milman [20] and Lee and Vempala [18] for more details about the abundant implications of the KLS conjecture.

Notation. For two sequences a_n and b_n indexed by an integer n, we say that a_n = o_n(b_n) if lim_{n→∞} a_n/b_n = 0. The Euclidean norm of a vector x ∈ R^d is denoted by ‖x‖_2. The spectral norm of a square matrix A ∈ R^{d×d} is denoted by ‖A‖_2. The Euclidean ball with center x and radius r is denoted by B(x, r). For a real number x ∈ R, we denote its ceiling by ⌈x⌉ = min{m ∈ Z | m ≥ x}. We say a density p is more log-concave than a Gaussian density ϕ if p can be written in the product form p = ν · ϕ, where ϕ is the Gaussian density and ν is a log-concave function (that is, ν is proportional to a log-concave density). For a martingale (M_t, t ∈ R_+), we use [M]_t to denote its quadratic variation, defined as the limit in probability of Σ_k (M_{t_{k+1}} − M_{t_k})² over partitions 0 = t_0 < t_1 < · · · < t_n = t as the mesh size goes to 0.

Main results
We prove the following lower bound on the isoperimetric coefficient of any log-concave density.

Theorem 1.
There exists a universal constant c such that for any log-concave density p in R^d and any integer ℓ ≥ 1, we have where ρ(p) is the spectral norm of the covariance matrix of p.
As a corollary, choosing ℓ to grow slowly with d, for d large enough the above lower bound is better than any lower bound of the form c · d^{−ε} / √ρ(p) with a fixed ε > 0. The proof of the main theorem uses the stochastic localization scheme introduced by Eldan [8]. Eldan uses this stochastic localization scheme to show that the thin-shell conjecture is equivalent to the KLS conjecture up to a logarithmic factor. The construction of the stochastic localization scheme uses elementary properties of semimartingales and stochastic integration. The main idea of Eldan's proof to derive the KLS conjecture from the thin-shell conjecture is to smoothly multiply the log-concave density by a Gaussian factor, so that the modified density is more log-concave than a Gaussian density. When the Gaussian factor is large enough, one can then easily prove the isoperimetric inequality.
The same scheme was refined in Lee and Vempala [17] to obtain the current best lower bound in the KLS conjecture. Lee and Vempala directly attack the KLS conjecture while following the same stochastic localization scheme to smoothly multiply the log-concave density by a Gaussian factor. Their use of a new potential function leads to the current best lower bound in the KLS conjecture. The proof in this paper builds on Lee and Vempala's [17] refinements of Eldan's method, while it improves the handling of several quantities involved in the stochastic localization scheme. Figure 1 provides a diagram showing the relationship between the main lemmas.
To ensure the existence and the uniqueness of the stochastic localization construction, we first prove a lemma that deals with log-concave densities with compact support. Then we relate back to the main theorem by finding a compact support which contains most of the probability measure of a log-concave density.

Lemma 1. There exists a universal constant c such that for any log-concave density p in R^d with compact support and any integer ℓ ≥ 1, we have

The proof of Lemma 1 is provided in Section 2.5 after we introduce the intermediate lemmas. The use of the integer ℓ in the lemma indicates that we control the Cheeger isoperimetric coefficient in an iterative fashion. In fact, we prove Lemma 1 by induction over ℓ, starting from the known bound in Equation (3). For this, we define the infimum of the product of the isoperimetric coefficient and the square root of the spectral norm over all log-concave densities in R^d with compact support:

ψ_d := inf { ψ(p) · √ρ(p) : p is a log-concave density in R^d with compact support }.

Then we prove the following lemma on the lower bound of ψ_d, which serves as the main induction argument.
Lemma 2. Suppose that ψ_k ≥ 1/(α k^β) for all k ≤ d, for some 0 ≤ β ≤ 1/2 and α ≥ 1. Take q = ⌈1/β⌉ + 1. Then there exists a universal constant c such that we have

The proof of Lemma 2 is provided towards the end of this section, in Section 2.4. To get a good understanding of how we get there, we start by introducing the stochastic localization scheme of Eldan [8].

Eldan's stochastic localization scheme.
Given a log-concave density p in R^d with covariance matrix A, we define the following stochastic differential equation (SDE):

dc_t = A^{−1/2} dW_t + A^{−1} μ_t dt, c_0 = 0, dB_t = A^{−1} dt, B_0 = 0, (8)

where W_t is the standard Wiener process in R^d, and the density p_t, the mean μ_t and the covariance A_t are defined as follows:

p_t(x) = exp(c_t^⊤ x − ½ x^⊤ B_t x) p(x) / ∫_{R^d} exp(c_t^⊤ y − ½ y^⊤ B_t y) p(y) dy, (9)

μ_t = ∫_{R^d} x p_t(x) dx, A_t = ∫_{R^d} (x − μ_t)(x − μ_t)^⊤ p_t(x) dx.

The next lemma shows the existence and the uniqueness of the SDE solution.
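To build intuition for the scheme, the following toy simulation (our own illustration, not the construction used in the proofs) runs an Euler–Maruyama discretization of the localization dynamics on a discrete surrogate measure, taking A = I_d so that A^{−1/2} = I_d. Each step tilts the weights by the linear factor (x − μ_t)^⊤ ΔW and renormalizes; the weights remain a probability vector while the measure gradually localizes:

```python
import random, math

random.seed(0)
d, n, dt, steps = 3, 200, 1e-3, 2000

# Discrete surrogate for p: uniform weights on n random points in R^d.
pts = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
w = [1.0 / n] * n  # p_0

def mean(w, pts):
    return [sum(wi * x[k] for wi, x in zip(w, pts)) for k in range(d)]

# Localization dynamics: tilt each weight by 1 + (x - mu_t)^T dW,
# clamp tiny negative values from the discretization, renormalize.
for _ in range(steps):
    mu = mean(w, pts)
    dW = [random.gauss(0, math.sqrt(dt)) for _ in range(d)]
    w = [wi * (1 + sum((x[k] - mu[k]) * dW[k] for k in range(d)))
         for wi, x in zip(w, pts)]
    w = [max(wi, 0.0) for wi in w]
    s = sum(w)
    w = [wi / s for wi in w]

# The weights remain a probability vector, and the measure concentrates:
# the largest weight grows well beyond its initial value 1/n.
print(abs(sum(w) - 1) < 1e-9, max(w) > 2.0 / n)
```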

Lemma 3. Given a density p in R^d with compact support whose covariance matrix A is invertible, the SDE (8) is well defined and has a unique solution on the time interval [0, T] for any T > 0.
The proof of Lemma 3 follows from the standard existence and uniqueness theorem for SDEs (Theorem 5.2 in Øksendal [21]). The proof is provided in Appendix A. Before we dive into the proof of Lemma 2, we discuss how the stochastic localization scheme allows us to control the boundary measure of a subset. First, according to the concavity of the isoperimetric profile (Theorem 2.8 in Sternberg and Zumbrun [25] or Theorem 1.8 in Milman [20]), it is sufficient to consider subsets of measure 1/2 in the definition of the isoperimetric coefficient in Equation (2). Second, the density p_t is log-concave and it is more log-concave than the Gaussian density proportional to e^{−½ x^⊤ B_t x}. It can be shown via the KLS localization lemma [12] that a density which is more log-concave than a Gaussian has an isoperimetric coefficient lower bound that depends on the covariance of the Gaussian (see e.g. Theorem 2.7 in Ledoux [16] or Theorem 4.4 in Cousins and Vempala [7]). Third, given an initial subset E of R^d with measure p(E) = 1/2, using the martingale property of p_t(E), we observe that

Inequality (i) uses the isoperimetric inequality for a log-concave density which is more log-concave than a Gaussian density proportional to e^{−½ x^⊤ B_t x} [7, 16]. Inequality (ii) uses the fact that p_t(E) is nonnegative.
Based on the above observation, the high-level idea of the proof consists of two main steps: • There exists some time t > 0 such that the Gaussian component ½ x^⊤ B_t x of the density p_t is large enough, so that we can apply the known isoperimetric inequality for densities more log-concave than a Gaussian.
• We need to control the quantity p t (E) so that the obtained isoperimetric inequality at time t can be related back to that at time 0.
The first step is immediate, since our construction in Equation (9) explicitly enforces the density p_t to have a Gaussian component ½ x^⊤ B_t x. The remaining question is whether we can run the SDE long enough to make the Gaussian component large enough, while still keeping p_t(E) of the same order as p(E) = 1/2 with large probability.

Lemma 4. Under the same assumptions of Lemma 3, for any measurable subset E of R^d, we have
This lemma is proved in Lemma 29 of Lee and Vempala [17]. We provide a proof here for completeness.
Proof of Lemma 4. Let g_t = p_t(E). Using Equation (13), we obtain the following derivative of g_t:

dg_t = (∫_E (x − μ_t) p_t(x) dx)^⊤ A^{−1/2} dW_t.

Its quadratic variation satisfies

d[g]_t = ‖A^{−1/2} ∫_E (x − μ_t) p_t(x) dx‖_2² dt ≤ ‖A^{−1/2} A_t A^{−1/2}‖_2 dt,

where the inequality follows from the Cauchy–Schwarz inequality. Applying the Dambis, Dubins–Schwarz theorem (see e.g. Revuz and Yor [23], Section V.1, Theorem 1.7), there exists a Wiener process W̃_t such that g_t − g_0 has the same distribution as W̃_{[g]_t}. The last inequality then follows from the fact that P(ξ > 2) < 0.023 for ξ following the standard Gaussian distribution.
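The Gaussian tail constant used in the last step can be verified numerically (a sanity check only):

```python
import math

# P(xi > 2) for xi ~ N(0, 1), via the complementary error function:
# P(xi > t) = 0.5 * erfc(t / sqrt(2)).
tail = 0.5 * math.erfc(2 / math.sqrt(2))
print(round(tail, 5))  # 0.02275
```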

Control the evolution of the spectral norm.
According to Lemma 4, to control the evolution of the measures of subsets, we need to control the spectral norm of A^{−1/2} A_t A^{−1/2}. The following lemma serves this purpose.

Lemma 5. In addition to the same assumptions of Lemma 3, suppose the assumptions of Lemma 2 hold. Then
Direct control of the largest eigenvalue of A^{−1/2} A_t A^{−1/2} is not trivial; instead we use a potential function Γ_t to upper bound the largest eigenvalue. Define

Γ_t := Tr(Q_t^q), where Q_t := A^{−1/2} A_t A^{−1/2}. (14)

It is clear that Γ_t^{1/q} ≥ ‖Q_t‖_2. The advantage of using Γ_t is that it is differentiable. We have the following differentials for A_t and Γ_t:

Obtaining these differentials uses Itô's formula and the proofs are provided in Appendix A. The next lemma upper bounds the terms in the potential Γ_t.
Lemma 6. Under the same assumptions of Lemma 5, the potential Γ_t defined in Equation (14) can be written as follows:

and

The proof of Lemma 6 is provided in Section 3.1. Remark that bounds similar to the first bound on δ_t in Lemma 6 have appeared in Lee and Vempala [17], whereas the second bound on δ_t in Lemma 6 is novel. The second bound on δ_t also leads to the following Lemma 8, which gives better control of the potential than the previous proof by Lee and Vempala [17] when t is large. Using the bounds in Lemma 6, we state two lemmas which control the potential Γ_t in two ways.

Lemma 7.
Under the same assumptions of Lemma 6, using the following transformation

Lemma 8. Under the same assumptions of Lemma 6, using the following transformation

The proofs of Lemmas 7 and 8 are provided in Section 3.2. Now we are ready to prove Lemma 5.
Proof of Lemma 5. We take

We bound the spectral norm of A^{−1/2} A_t A^{−1/2} in the two time intervals via Lemma 7 and Lemma 8. In the first time interval [0, T_1], we have

Inequality (i) follows from the condition βq ≥ 1. (ii) follows from the fact that Tr(·) bounds the spectral norm as above. (iv) follows from Lemma 7. In the first time interval, we can also bound the expectation of Γ_{T_1}^{1/q}. Since the density p_{T_1} is more log-concave than a Gaussian density with covariance matrix A/T_1, the covariance matrix of p_{T_1} is upper bounded as follows (see Theorem 4.1 in Brascamp and Lieb [5] or Lemma 5 in Eldan and Lehec [10]):

A_{T_1} ⪯ A/T_1.

Consequently, all the eigenvalues of Q_{T_1} are less than 1/T_1 and Γ_{T_1} is upper bounded by d/T_1^q. Using the above bound, we can bound the expectation of Γ_{T_1}^{1/q} as follows:

Inequality (i) follows from Lemma 7, the inequality 3^q d ≥ 2^q (d + 1) (similar to what we did in the last four steps of Equation (17)) and Equation (18), which gives the bound 40/T_1. Using the above bound, we control the spectral norm in the second time interval via Markov's inequality:

where inequality (i) follows from Markov's inequality and (ii) follows from Equation (20). (iii) follows from the definition of T_2 and β/2 + 1/q ≤ 2β − β/(4q) when βq ≥ 1 and q ≥ 2.
Combining the bounds in the first and the second time intervals in Equations (17) and (21), we obtain

Proof of Lemma 2.

The proof of Lemma 2 follows the strategy described after Lemma 3. We make the arguments rigorous here. We consider a log-concave density p in R^d with compact support. Without loss of generality, we can assume that the covariance matrix A of the density p is invertible. Otherwise, the density p is degenerate and we can instead prove the results in a lower dimension.
According to the concavity of the isoperimetric profile (Theorem 2.8 in Sternberg and Zumbrun [25] or Theorem 1.8 in Milman [20]), it is sufficient to consider subsets of measure 1/2 in the definition of the isoperimetric coefficient (2). Given an initial subset E of R^d with p(E) = 1/2, using the martingale property of p_{T_2}(E), we have

Inequality (i) uses the isoperimetric inequality for a log-concave density which is more log-concave than a Gaussian density proportional to e^{−½ x^⊤ B_t x} (see e.g. Theorem 2.7 in Ledoux [16] or Theorem 4.4 in Cousins and Vempala [7]). Inequality (ii) follows from the fact that p_t(E) is nonnegative. (iii) follows from Lemma 4 and Lemma 5 (for d ≥ 3). (iv) follows from the construction that B_t = tA^{−1}. We conclude the proof since T_2 is taken as in Lemma 5.

For ℓ ≥ 1, we define α_ℓ and β_ℓ recursively as follows:

where c is the constant in Lemma 2. It is not difficult to show by induction that α_ℓ and β_ℓ satisfy

We start with the known bound from the original KLS paper [12]:

In the induction step, suppose that we have

From the above inequality, we obtain, for any 1 ≤ k ≤ d,

with α = α_ℓ (log(d) + 1)/2. Using the above lower bounds for ψ_k, we can apply Lemma 2. For the integer ℓ + 1, we have

where inequality (i) follows from Lemma 2, inequality (ii) follows from q ≤ 2/β_ℓ and the last equality follows from the definition of α_{ℓ+1} and β_{ℓ+1}. We conclude Lemma 1 using the bounds on α_ℓ and β_ℓ in Equation (24).
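To see why iterating Lemma 2 yields a d^{−o_d(1)} dependency, here is a toy calculation. It is illustrative only: it replaces the exact recursion for α_ℓ and β_ℓ by the simplifying assumption that the exponent β roughly halves at each application of Lemma 2, and ignores the growth of α_ℓ:

```python
import math

# Toy illustration (not the exact recursion of the paper): start from the
# KLS exponent beta_0 = 1/2 and assume each application of Lemma 2
# roughly halves the exponent beta.
d = 10 ** 6
beta = 0.5
iters = math.ceil(math.log2(math.log(d)))  # ~ log2(log d) iterations
for _ in range(iters):
    beta /= 2
# After ~log2(log d) halvings, d**beta is O(1) even though d is large:
print(iters, round(d ** beta, 3))
```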

Proof of Theorem 1.
To derive Theorem 1 from Lemma 1, it is sufficient to show that for any log-concave density p in R^d, most of its probability measure lies in a compact set. Let μ be the mean of the density p. Since r ↦ p(B(μ, r)^c) is a non-increasing function of r with limit 0 at ∞, there exists a radius R > 0 such that p(B(μ, R)^c) ≤ 0.2. Note that it is possible to get a better bound via e.g. log-concave concentration bounds from Paouris [22], but knowing the existence of such a radius R is sufficient for the proof here.
Denote B = B(μ, R). Then p(B^c) ≤ 0.2. Let p̄ be the density obtained by truncating p on the ball B (and renormalizing). Then p̄ is log-concave and it has compact support. For a subset E ⊂ R^d such that p(E) = 1/2, we have

The last inequality follows because p(E^c) − p(B^c) ≥ 0.5 − 0.2 ≥ 1/4. Since it is sufficient to consider subsets of measure 1/2 in the definition of the isoperimetric coefficient [20, 25], we conclude that the isoperimetric coefficient of p is lower bounded by half of that of p̄. Applying Lemma 1 to the isoperimetric coefficient of p̄, we obtain Theorem 1.
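As a concrete one-dimensional instance of the radius R above (illustrative only): for the standard Gaussian, bisection on the tail probability gives R ≈ 1.2816 with P(|X| > R) ≤ 0.2:

```python
import math

def Phi(a):  # standard Gaussian CDF
    return 0.5 * (1 + math.erf(a / math.sqrt(2)))

# Find (by bisection) an R with P(|X| > R) <= 0.2 for X ~ N(0, 1);
# hi is always kept at a feasible (tail <= 0.2) point.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    if 2 * (1 - Phi(mid)) <= 0.2:
        hi = mid
    else:
        lo = mid
print(round(hi, 4))  # ~1.2816
```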

Proof of auxiliary lemmas
In this section, we prove auxiliary Lemmas 6, 7 and 8.

Tensor bounds and proof of Lemma 6.
In this subsection, we prove Lemma 6. Since Lemma 6 involves the third-order moment tensor of a log-concave density, to simplify notation we define the following 3-Tensor for any probability density p on R^d with mean μ.
For three matrices A, B, C in R^{d×d}, we can write T_p(A, B, C) equivalently as

Before we prove Lemma 6, we prove the following properties related to the 3-Tensor.
Lemma 9. Suppose p is a log-concave density with mean μ and covariance A. Then for any positive semi-definite matrices B and C, we have

Lemma 11. Given τ > 0, suppose p is a log-concave density which is more log-concave than N(0, (1/τ) I_d). Let A be its covariance matrix. Suppose A is invertible; then for q ≥ 3, we have

Lemma 12. Suppose p is a log-concave density in R^d. For any δ ∈ [0, 1] and A, B, C positive semi-definite matrices, we have

The proofs of the above lemmas are provided in Section 3.3. Now we are ready to prove Lemma 6.
Proof of Lemma 6. We first prove the bound on ‖v_t‖_2. Applying Lemma 9 and knowing that the covariance of p_t is A_t, we obtain

Equality (i) uses the definition Q_t = A^{−1/2} A_t A^{−1/2}. Equality (ii) uses the fact that ‖MM^⊤‖_2 = ‖M^⊤M‖_2 for any square matrix M ∈ R^{d×d}. Inequality (iii) uses that ‖M‖_2 ≤ Tr(M^q)^{1/q} for any positive semi-definite matrix M. Next, we bound δ_t in two ways. We can ignore the negative term in δ_t to obtain the following:

where p̃_t is the density of the linearly transformed random variable A^{−1/2}(X − μ_t) for X drawn from p_t and μ_t is the mean of p_t. p̃_t is still log-concave, since any linear transformation of a log-concave density is log-concave (see e.g. Saumard and Wellner [24]). p̃_t has covariance A^{−1/2} A_t A^{−1/2}, which is exactly Q_t. For a ∈ {0, · · · , q − 2}, we have

Inequality (i) follows from Lemma 12. Inequality (ii) follows from Lemma 10. Since there are q − 1 terms in the sum, we conclude the first part of the bound on δ_t. On the other hand, since p_t is more log-concave than the Gaussian density proportional to e^{−(t/2)(x−μ_t)^⊤ A^{−1}(x−μ_t)}, p̃_t is more log-concave than the Gaussian density proportional to e^{−(t/2) x^⊤ x}. Applying Lemma 12 and Lemma 11 to each term in Equation (27), we obtain

This concludes the second part of the bound on δ_t.

Control of the potential in two time intervals.
In this subsection, we prove Lemma 7 and Lemma 8.
Proof of Lemma 7. The function h has the following derivatives:

Using Itô's formula, we obtain

where inequality (i) plugs in the bounds in Lemma 6. Define a martingale Y_t such that

According to the upper bound on ‖v_t‖_2 in Lemma 6, we have

Hence the martingale Y_t is well defined. According to the Dambis, Dubins–Schwarz theorem (see e.g. Revuz and Yor [23], Section V.1, Theorem 1.7), there exists a Wiener process W̃_t such that Y_t has the same distribution as W̃_{[Y]_t}. Then we have, for any

Inequality (i) follows from the choice of T. (ii) uses Equation (28). (iii) follows by plugging in Ψ = ½ (d + 1)^{−1/q} and 3^q d² ≥ 2^q (d + 1)². (iv) follows from βq ≥ 1, d ≥ 3, q ≥ 2 and 3^{−4/3} < 0.3.
Proof of Lemma 8. The function f has the following derivatives:

Using Itô's formula, we obtain

Using the bounds in Lemma 6 and the martingale property of the stochastic integral term, we obtain

Solving the above differential equation, we obtain

for all t_2 > t_1 > 0.
Proof of Lemma 9. Since C is positive semi-definite, we can write its eigenvalue decomposition as follows:

Inequality (i) follows from the triangle inequality. (ii) follows from the Cauchy–Schwarz inequality. (iii) follows from the statement below, which upper bounds the fourth moment of a log-concave density via its second moment.
For any log-concave density ν and any vector θ ∈ R^d, we have, for a ≥ b > 0,

where μ_ν is the mean of ν. Equation (29) is proved e.g. in Corollary 5.7 of Guédon et al. [11] and the exact constant is provided in Proposition 3.8 of Latała and Wojtaszczyk [15]. In order to prove Lemma 10, we need to introduce one additional lemma as follows.
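Before turning to that lemma, note that the message of Equation (29), namely that higher moments of a log-concave density are controlled by lower moments, can be spot-checked by Monte Carlo. The check below uses the exponential distribution with a = 4, b = 2 and the factor a/b = 2 as a stand-in for the exact constant of [15] (our own illustrative choice):

```python
import random

random.seed(1)
# Monte Carlo check of the moment comparison for the (log-concave)
# exponential distribution with a = 4, b = 2: the claim checked is
# (E|X - mu|^4)^{1/4} <= (4/2) * (E|X - mu|^2)^{1/2}.
n = 200_000
xs = [random.expovariate(1.0) for _ in range(n)]
mu = sum(xs) / n
m2 = sum((x - mu) ** 2 for x in xs) / n
m4 = sum((x - mu) ** 4 for x in xs) / n
# Exact values: m2 = 1 and m4 = 9, so the left side is 3^{1/2} < 2.
print(m4 ** 0.25 <= 2 * m2 ** 0.5)
```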
Lemma 13. Suppose that ψ_k ≥ 1/(α k^β) for all k ≤ d, for some 0 < β ≤ 1/2 and α ≥ 1. For an isotropic log-concave density p in R^d and a unit vector v ∈ R^d, define Δ = E_{X∼p}[(X^⊤ v) · XX^⊤]. Then we have: 1. For any orthogonal projection matrix P ∈ R^{d×d} with rank r, we have

2. For any positive semi-definite matrix A, we have
This lemma was proved as Lemma 41 in an older version (arXiv version 2) of Lee and Vempala [17]. The main proof idea for the first part of Lemma 13 appeared in Eldan [8] (Lemma 6). We provide a proof here for completeness.
Proof of Lemma 13. For the first part, we have

Since E_{X∼p}[X^⊤ v] = 0, we can subtract the mean of the first term X^⊤ ΔP X without changing the value of Tr(ΔPΔ). Then

≤ 4 ψ_{min(2r,d)}^{−1} (Tr(ΔPΔ))^{1/2}.

Inequality (i) follows from the Cauchy–Schwarz inequality. Inequality (ii) follows from the fact that E_{X∼p}[(X^⊤ v)²] = 1 as p is isotropic, and that the inverse Poincaré constant is upper bounded by twice the inverse of the squared isoperimetric coefficient (also known as Cheeger's inequality [6, 19], or Theorem 1.1 in Milman [20]). The matrix ΔP + PΔ has rank at most min(2r, d). Rearranging the terms in the above equation, we conclude the first part of Lemma 13.
For the second part, we write the matrix A in its eigenvalue decomposition and group the terms by eigenvalues. We have

where A_i has eigenvalues in the interval (‖A‖_2 e^{i−1}/d, ‖A‖_2 e^i/d] and B has eigenvalues smaller than or equal to ‖A‖_2/d. Because the right endpoints of the intervals increase exponentially, we have J = ⌈log(d)⌉. Let P_i be the orthogonal projection matrix formed by the eigenvectors in A_i. Then we have

where inequality (i) follows from the first part of Lemma 13 and inequality (ii) follows from the hypothesis of Lemma 13. Similarly, for the matrix B, we have

where inequality (i) follows from the hypothesis of Lemma 13 and inequality (ii) follows from the fact that ‖B‖_2 ≤ ‖A‖_2/d and 2β ≤ 1. Putting the bounds (30) and (31) together, we have

Inequality (i) follows from Hölder's inequality and inequality (ii) follows from the fact that ‖A_j‖_2^{1/(2β)} rank(A_j) ≤ e Tr(A_j^{1/(2β)}) due to the construction of A_j. This concludes the second part of Lemma 13.
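The eigenvalue grouping used in the second part can be sketched in code (an illustration with hypothetical eigenvalues; the buckets follow the intervals (‖A‖_2 e^{i−1}/d, ‖A‖_2 e^i/d] and the remainder bucket collects eigenvalues at most ‖A‖_2/d):

```python
import math

# Group eigenvalues of a PSD matrix A (represented by its eigenvalue
# list) into buckets (||A|| e^{i-1}/d, ||A|| e^i/d], plus a remainder
# bucket of eigenvalues <= ||A||/d, as in the proof of Lemma 13.
def bucket(eigs):
    d = len(eigs)
    top = max(eigs)
    J = math.ceil(math.log(d))
    buckets = {i: [] for i in range(1, J + 1)}
    remainder = []
    for lam in eigs:
        if lam <= top / d:
            remainder.append(lam)
        else:
            # smallest i >= 1 with lam <= top * e^i / d
            i = max(1, math.ceil(math.log(lam * d / top)))
            buckets[i].append(lam)
    return buckets, remainder

eigs = [2.0 ** (-k) for k in range(16)]  # hypothetical spectrum, d = 16
buckets, rem = bucket(eigs)
# Every eigenvalue lands in one of ceil(log d) buckets or the remainder.
print(len(buckets), sum(len(b) for b in buckets.values()) + len(rem) == len(eigs))
```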
Proof of Lemma 10. Let μ be the mean of p. First, for X a random vector in R^d drawn from p, we define the standardized random variable A^{−1/2}(X − μ) and denote its density by p̃; p̃ is an isotropic log-concave density. Then, through a change of variables, we have

where the last inequality follows from Lemma 12. The matrix A^q is positive semi-definite and we write down its eigenvalue decomposition:

Since p̃ is isotropic, we can rewrite the 3-Tensor in a summation form and apply Lemma 13.
where we define Δ_i = ∫ (x^⊤ v_i) x x^⊤ p̃(x) dx, inequality (i) follows from Lemma 13 and the fact that p̃ is isotropic, and inequality (ii) follows from the Cauchy–Schwarz inequality and the assumption that q ≥ 1/(2β).

Proof of Lemma 11. Without loss of generality, we can assume that the density p has mean 0. Its covariance matrix A is positive semi-definite and invertible. We can write down its eigenvalue decomposition as follows:

where the v_i are eigenvectors with norm 1. Then A^q has an eigenvalue decomposition with the same eigenvectors:

Next we bound the terms Tr(Δ_i^⊤ Δ_i). We have

where the last step uses the Brascamp–Lieb inequality (Theorem 4.1 in Brascamp and Lieb [5]) together with the assumption that p is more log-concave than N(0, (1/τ) I_d). Plugging the bounds on the terms Tr(Δ_i^⊤ Δ_i) into Equation (32), we obtain

Inequality (i) follows from the Cauchy–Schwarz inequality. For q ≥ 3, inequality (ii) follows from Lemma 12. From the above equation, after rearranging the terms, we obtain

Proof of Lemma 12. This lemma was proved as Lemma 43 in an older version (arXiv version 2) of Lee and Vempala [17]; we provide a proof here for completeness. Without loss of generality, we can assume that the density p has mean 0. For i ∈ {1, · · · , d}, we define Δ_i = E_{X∼p}[B^{1/2} X X^⊤ B^{1/2} · X^⊤ C^{1/2} e_i], where e_i ∈ R^d is the vector with ith coordinate 1 and 0 elsewhere. We have Σ_{i=1}^d e_i e_i^⊤ = I_d. We can rewrite the tensor on the left-hand side as a sum of traces.
For any symmetric matrix F, a positive semi-definite matrix G and δ ∈ [0, 1], we have

Tr(F G^δ F G^{1−δ}) ≤ Tr(F² G). (34)

Applying the trace inequality (34), which we prove below for completeness (see also Lemma 2.1 in Zhu et al. [1]), we obtain

Writing the sum of traces in Equation (33) back in the 3-Tensor form, we conclude Lemma 12. It remains to prove the trace inequality in Equation (34). Without loss of generality, we can assume G is diagonal. Hence, we have

where the inequality follows from Jensen's inequality and the fact that the logarithm function is concave (or the inequality of arithmetic and geometric means).

Acknowledgments. We thank ...wright for helpful discussions. We thank Bo'az Klartag and Joseph Lehec for pointing out a mistake in the previous revision. We also thank the anonymous reviewers for their careful reading of our manuscript and their suggestions on the presentation and writing.

A Proof of Lemma 3 and derivatives
In this section, we first prove the existence and uniqueness of the SDE solution in Lemma 3 and then derive the derivatives of p_t, A_t and Γ_t in Equations (13), (15) and (16) using Itô's calculus. Similar results are also proved in Eldan [8] and Lee and Vempala [17], since a similar stochastic localization scheme is used. We provide a proof here for completeness.
Proof of Lemma 3. We can rewrite the stochastic differential equation (8) as follows to make the dependency clear:

Since p has compact support, given x ∈ R^d, the drift and diffusion coefficients, as functions of (c, B), are Lipschitz in c and B. Similarly, μ is also Lipschitz in c and B. Consequently, A^{−1/2}, A^{−1} μ(c_t, B_t) and A^{−1} are all bounded and Lipschitz in c_t and B_t on the compact support. Applying the existence and uniqueness theorem for SDE solutions (Theorem 5.2 in Øksendal [21]), we conclude that the SDE solution exists and is unique on the time interval [0, T] for any T > 0. Next, we derive the derivative of p_t. Define

Then p_t(x) can be written as G_t(x)/V_t. Let S_t(x) denote the quadratic variation of the process c_t^⊤ x. We have dS_t(x) = x^⊤ A^{−1} x dt.
Using Itô's formula, we have

Using Itô's formula on the inverse of V_t, we have

Using Itô's formula on p_t, with the above derivatives, we obtain

Then we derive the derivative of A_t. By the definition of A_t, we have

where μ_t = ∫_{R^d} x p_t(x) dx. Using Itô's formula on μ_t, we obtain

Using Itô's formula on A_t, viewing it as a function of μ_t and p_t, we obtain

We observe that ∫ dμ_t (x − μ_t)^⊤ p_t(x) dx = 0 and ∫ (x − μ_t)(dμ_t)^⊤ p_t(x) dx = 0. Then,

Combining all the terms together, we have

Finally, we derive the derivative of Γ_t. Define the function Γ : R^{d×d} → R as Γ(X) = Tr(X^q). The first-order and second-order derivatives of Γ are given by

∇Γ(X)[H_1] = q Tr(X^{q−1} H_1), ∇²Γ(X)[H_1, H_2] = q Σ_{a=0}^{q−2} Tr(X^a H_2 X^{q−2−a} H_1).
Using the above derivatives and Itô's formula, we obtain