An Almost Constant Lower Bound of the Isoperimetric Coefficient in the KLS Conjecture

We prove an almost constant lower bound of the isoperimetric coefficient in the KLS conjecture. The lower bound has dimension dependency $d^{-o_d(1)}$. When the dimension is large enough, our lower bound is tighter than the previous best bound, which has dimension dependency $d^{-1/4}$. Improving the lower bound on the isoperimetric coefficient in the KLS conjecture has many implications, including improved bounds for the thin-shell conjecture and the slicing conjecture, better concentration inequalities for Lipschitz functions of log-concave measures, and better mixing time bounds for MCMC sampling algorithms on log-concave measures.


Introduction
Given a distribution, the isoperimetric coefficient of a subset is the ratio of the measure of the subset's boundary to the minimum of the measures of the subset and its complement. Taking the infimum of this ratio over all subsets defines the isoperimetric coefficient of the distribution, also called the Cheeger isoperimetric coefficient of the distribution.
Kannan, Lovász and Simonovits (KLS) [11] conjecture that for any log-concave distribution, the Cheeger isoperimetric coefficient equals, up to a universal constant factor, that achieved by half-spaces. If the conjecture is true, the Cheeger isoperimetric coefficient can be determined, up to a constant, by going through all half-spaces instead of all subsets. For this reason, the KLS conjecture is also called the KLS hyperplane conjecture. To make this precise, we start by formally defining log-concave distributions, and we then state the conjecture.
A probability density function p : R^d → R is log-concave if its logarithm is concave, i.e., for any x, y ∈ R^d and any λ ∈ [0, 1],

$$ p(\lambda x + (1-\lambda) y) \;\ge\; p(x)^{\lambda}\, p(y)^{1-\lambda}. \quad (1) $$

Common probability distributions such as the Gaussian, exponential and logistic distributions are log-concave. This definition also includes the uniform distribution over any convex set, defined as follows: a subset K of R^d is convex if for all x, y ∈ K, every point z on the segment [x, y] satisfies z ∈ K. The isoperimetric coefficient ψ(p) of a density p in R^d is defined as

$$ \psi(p) \;=\; \inf_{S \subseteq \mathbb{R}^d} \frac{p^+(\partial S)}{\min\{p(S),\, p(S^c)\}}, \qquad p^+(\partial S) = \liminf_{\varepsilon \to 0^+} \frac{p(S_\varepsilon) - p(S)}{\varepsilon}, \quad (2) $$

where S_ε = {x ∈ R^d : d(x, S) ≤ ε} and d(x, S) is the Euclidean distance between x and the subset S. The KLS conjecture is stated by Kannan, Lovász and Simonovits [11] as follows.

Conjecture 1.
There exists a universal constant c such that for any log-concave density p in R^d, we have

$$ \psi(p) \;\ge\; \frac{c}{\sqrt{\rho(p)}}, $$

where ρ(p) is the spectral norm of the covariance matrix of p. In other words, ρ(p) = ‖A‖₂, where A = Cov_{X∼p}(X) is the covariance matrix.
An upper bound on ψ(p) of the same form is comparatively easy to obtain, and it was shown to be achieved by half-spaces [11]. Proving the lower bound on ψ(p) in Conjecture 1, up to some small factors, is the main goal of this paper. We say a log-concave density p is isotropic if its mean E_{X∼p}[X] equals 0 and its covariance Cov_{X∼p}(X) equals I_d. For isotropic log-concave densities, the KLS conjecture states that the isoperimetric coefficient is lower bounded by a universal constant.
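As a quick illustration of why half-spaces are the conjectured extremizers, the following sketch (illustrative only, not part of the paper's argument) numerically evaluates the isoperimetric ratio of the half-spaces {x : x₁ ≤ t} under the standard Gaussian, an isotropic log-concave density; the minimum is attained by the half-space through the mean.

```python
import numpy as np
from scipy.stats import norm

# Isoperimetric ratio of the half-space {x : x_1 <= t} under the standard
# Gaussian: the boundary measure is phi(t) (the 1-d Gaussian density), and
# the smaller side has mass min(Phi(t), 1 - Phi(t)).
t = np.linspace(-3, 3, 601)
ratio = norm.pdf(t) / np.minimum(norm.cdf(t), norm.sf(t))

print("minimizing ratio:", ratio.min())          # ~0.7979 = 2 * phi(0)
print("attained near t =", t[np.argmin(ratio)])  # ~0, the half-space through the mean
```

The minimizing ratio 2φ(0) ≈ 0.798 is dimension-independent, which is exactly the behavior the conjecture predicts for all isotropic log-concave densities.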
There have been many attempts to lower bound the Cheeger isoperimetric coefficient in the KLS conjecture. We refer readers to the survey paper by Lee and Vempala [14] for a detailed exposition of these attempts. In particular, the original KLS paper [11] (Theorem 5.1) shows that for any log-concave density p with covariance matrix A,

$$ \psi(p) \;\ge\; \frac{\log 2}{\sqrt{\mathrm{Tr}(A)}}. \quad (3) $$

The original KLS paper [11] only deals with uniform distributions over convex sets, but their proof techniques can be easily extended to show that the same result holds for all log-concave densities. Remark that Equation (3) implies ψ(p) ≥ log(2) · d^{-1/2} ‖A‖₂^{-1/2}, since Tr(A) ≤ d‖A‖₂. The current best bound is shown in Lee and Vempala [13], where they show that there exists a universal constant c such that for any log-concave density p with covariance matrix A,

$$ \psi(p) \;\ge\; \frac{c}{\left(\mathrm{Tr}(A^2)\right)^{1/4}} \;\ge\; \frac{c}{d^{1/4}\, \|A\|_2^{1/2}}. $$

Note that in Lee and Vempala [13], their notation for ψ(p) is the reciprocal of ours; it is later switched in Theorem 32 of the survey paper [14] by the same authors. As a result, the above bound is not a misstatement of the results in Lee and Vempala [13]; it is simply translated into our notation. In this paper, we improve the dimension dependency d^{-1/4} to d^{-o_d(1)} in the lower bound of the isoperimetric coefficient.
There are many implications of improving the bound in the KLS conjecture. The KLS conjecture implies the thin-shell conjecture [2] and the slicing conjecture [4, 3]. Improving the bound in the KLS conjecture also improves concentration inequalities for Lipschitz functions of log-concave measures, and it leads to faster mixing time bounds for Markov chain Monte Carlo (MCMC) sampling algorithms on log-concave measures. We refer readers to Milman [16] and Lee and Vempala [14] for more details.
Notation: For two sequences a_n and b_n indexed by an integer n, we say that a_n = o_n(b_n) if lim_{n→∞} a_n/b_n = 0. The Euclidean norm of a vector x ∈ R^d is denoted by ‖x‖₂. The spectral norm of a square matrix A ∈ R^{d×d} is denoted by ‖A‖₂. The Euclidean ball with center x and radius r is denoted by B(x, r). For a real number x ∈ R, we denote its ceiling by ⌈x⌉ = min {m ∈ Z | m ≥ x}. For a martingale (M_t, t ∈ R₊), we use [M]_t to denote its quadratic variation, defined as

$$ [M]_t \;=\; \lim_{\|\Pi\| \to 0} \sum_{k=1}^{n} \left( M_{t_k} - M_{t_{k-1}} \right)^2, $$

where the limit is over partitions Π = {0 = t₀ < t₁ < ⋯ < t_n = t} with mesh size ‖Π‖ = max_k (t_k − t_{k−1}) going to zero.
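As a small illustration of the quadratic variation just defined, the following sketch checks numerically that the quadratic variation of a Wiener process over [0, 1] converges to t = 1 as the partition mesh shrinks.

```python
import numpy as np

# Empirical check that [W]_t ~= t for Brownian motion: refine a partition
# of [0, 1] and sum the squared increments of the path.
rng = np.random.default_rng(1)
t = 1.0
for n in [10, 100, 1000, 10000]:
    dW = rng.normal(scale=np.sqrt(t / n), size=n)  # increments on a mesh of size t/n
    print(n, (dW ** 2).sum())                      # converges to t = 1 as the mesh shrinks
```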

Main results
We prove the following bound.
Theorem 1. There exists a universal constant c such that for any log-concave density p in R^d and any integer l ≥ 1, we have

$$ \psi(p) \;\ge\; \frac{1}{c^{\,l}\, d^{4/l}\, \sqrt{\rho(p)}}, $$

where ρ(p) is the spectral norm of the covariance matrix of p.
As a corollary, take l = ⌈log(d)/log log(d)⌉. This yields

$$ \psi(p) \;\ge\; \frac{d^{-o_d(1)}}{\sqrt{\rho(p)}}. $$

Since lim_{d→∞} log log(d)/log(d) = 0, for d large enough, the above lower bound is better than any lower bound of the form c_ε d^{-ε}/√ρ(p) with a fixed ε > 0. The proof of the main theorem uses the stochastic localization scheme introduced by Eldan [8], who used it to show that the thin-shell conjecture is equivalent to the KLS conjecture up to a logarithmic factor. The same scheme was used in Lee and Vempala [13] to obtain the previous best lower bound in the KLS conjecture. The construction of the stochastic localization scheme uses elementary properties of semimartingales and stochastic integration. The main idea of the previous proof in Lee and Vempala [13] is to smoothly multiply the log-concave density by a Gaussian part, so that the modified density is strongly log-concave. When a density is more log-concave than a Gaussian density, it has a dimension-independent isoperimetric coefficient [7]. We say a density p is more log-concave than a Gaussian density ϕ if p can be written in the product form p = ν · ϕ, where ϕ is the Gaussian density and ν is a log-concave function (that is, ν is proportional to a log-concave density). The proof in this paper follows a similar strategy to Lee and Vempala [13] but with a refined analysis. Figure 1 provides a diagram showing the relationship between the main lemmas.
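To see how the two bounds compare quantitatively, the following sketch evaluates (on a log scale) a bound of the shape c^{-l} d^{-4/l}, as reconstructed in Theorem 1 above, against d^{-1/4}. The constant c = 8 is a placeholder assumption, not taken from the paper; only the asymptotics matter, and the crossover point depends on the unknown universal constants.

```python
import numpy as np

# Hypothetical comparison of log(c^{-l} d^{-4/l}) with log(d^{-1/4}) for
# l = log(d)/log log(d). The constant c = 8 is a placeholder assumption.
c = 8.0
for logd in [100, 1000, 5000, 10000, 20000]:   # log(d), i.e. d = e^100 ... e^20000
    l = max(1, round(logd / np.log(logd)))     # l = log(d) / log log(d)
    new = -l * np.log(c) - (4.0 / l) * logd    # log of c^{-l} d^{-4/l}
    old = -0.25 * logd                         # log of d^{-1/4}
    print(f"log d = {logd:6d}: new exponent {new:10.1f} vs old {old:10.1f}")
```

With these placeholder constants the new bound overtakes d^{-1/4} only for astronomically large d, which is consistent with the statement that the improvement holds "for d large enough".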
To ensure the existence and the uniqueness of the stochastic localization construction, we first prove a lemma that deals with log-concave densities with compact support. Then we relate it back to the main theorem via known concentration inequalities for log-concave densities.

Lemma 1. There exists a universal constant c such that for any log-concave density p in R^d with compact support and any integer l ≥ 1, we have

$$ \psi(p) \;\ge\; \frac{1}{c^{\,l}\, d^{4/l}\, \sqrt{\rho(p)}}. $$

The proof of Lemma 1 is provided in Section 2.5, after we introduce the intermediate lemmas. The use of the integer l in the lemma indicates that we control the Cheeger isoperimetric coefficient in an iterative fashion. In fact, we prove Lemma 1 by induction over l, starting from the known bound in Equation (3). For this, we define the infimum of the product of the isoperimetric coefficient and the square root of the spectral norm over all log-concave densities in R^d with compact support:

$$ \psi_d \;=\; \inf_{\substack{p \text{ log-concave on } \mathbb{R}^d \\ \text{with compact support}}} \psi(p)\, \sqrt{\rho(p)}. $$

Then we prove the following lemma on the lower bound of ψ_d, which serves as the main induction argument.
Lemma 2. Suppose that ψ_k ≥ 1/(α k^β) for all k ≤ d, for some 0 < β ≤ 1/2 and α ≥ 1, and take q = ⌈1/β⌉ + 1. Then there exists a universal constant c such that

$$ \psi_d \;\ge\; \frac{1}{c\, \alpha\, d^{\,\beta\left(1 - \frac{1}{2q}\right)}}. $$

The proof of Lemma 2 is provided towards the end of this section, in Section 2.4. To understand how we get there, we start by introducing the stochastic localization scheme of Eldan [8].

Eldan's stochastic localization scheme
Given a log-concave density p in R^d with covariance matrix A, we define the following stochastic differential equation (SDE):

$$ dp_t(x) \;=\; p_t(x)\, (x - \mu_t)^\top C_t\, dW_t, \qquad p_0(x) = p(x), \quad (9) $$

where W_t is the Wiener process, and the matrix C_t, the mean μ_t and the covariance A_t are defined as follows:

$$ C_t = A^{-1/2}, \qquad \mu_t = \int_{\mathbb{R}^d} x\, p_t(x)\, dx, \qquad A_t = \int_{\mathbb{R}^d} (x - \mu_t)(x - \mu_t)^\top p_t(x)\, dx. $$

The solution p_t takes the product form p_t(x) ∝ e^{c_t^\top x - \frac{1}{2} x^\top B_t x} p(x) with B_t = t A^{-1}, so that a Gaussian component is progressively added to the density.
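The following is a minimal simulation sketch of this scheme on a finite sample, where a discrete measure over n support points plays the role of p_t. The Euler–Maruyama discretization, the Laplace test density, and the clipping step are illustrative assumptions, not the paper's exact setup; the choice C_t = A^{-1/2} matches the construction above.

```python
import numpy as np

# Finite-sample sketch of stochastic localization: evolve weights w_i that
# represent p_t on n sample points, following dp_t = p_t (x - mu_t)^T C dW_t.
rng = np.random.default_rng(0)
d, n = 5, 20000
X = rng.laplace(size=(n, d))                 # samples from a log-concave density
A = np.cov(X.T)                              # empirical covariance of p
C = np.linalg.inv(np.linalg.cholesky(A)).T   # C C^T = A^{-1}, i.e. A^{-1/2} up to rotation

w = np.full(n, 1.0 / n)                      # discrete density p_0
dt, T = 1e-3, 0.5
E = X[:, 0] <= np.median(X[:, 0])            # a test subset E with p_0(E) = 1/2

for _ in range(int(T / dt)):
    mu = w @ X                               # current mean mu_t
    dW = rng.normal(size=d) * np.sqrt(dt)
    w = w * (1.0 + (X - mu) @ (C @ dW))      # Euler-Maruyama step of the SDE (9)
    # The exact step conserves total mass since sum_i w_i (x_i - mu) = 0;
    # clip rare negative weights and renormalize to absorb the error.
    w = np.clip(w, 0.0, None)
    w = w / w.sum()

print("p_T(E) =", w[E].sum())  # fluctuates around 1/2 across runs (martingale property)
```

Across independent runs, p_T(E) averages to 1/2, previewing the martingale property used repeatedly below.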
The next lemma shows the existence and the uniqueness of the SDE solution.

Lemma 3. Suppose the initial density p is log-concave with compact support. Then the SDE (9) has a unique solution (p_t)_{t ≥ 0}.

The proof of Lemma 3 follows from the standard existence and uniqueness theorem for SDEs in [17] (Theorem 5.2 in Chapter 5). The proof is provided in Appendix A.
Before we dive into the proof of Lemma 2, we discuss how the stochastic localization scheme allows us to control the boundary measure of a subset. First, according to the concavity of the isoperimetric profile [16] (Theorem 1.8), it is sufficient to consider subsets of measure 1/2 in the definition of the isoperimetric coefficient in Equation (2). Second, the density p_t is log-concave, and it is more log-concave than the Gaussian density proportional to e^{-\frac{1}{2} x^\top B_t x}. It can be shown via the KLS localization lemma [11] that a density which is more log-concave than a Gaussian has an isoperimetric coefficient lower bound that depends on the covariance of the Gaussian (see e.g. Theorem 4.4 in [7] or Theorem 30 in [13]). Third, given an initial subset E of R^d with measure p(E) = 1/2, using the martingale property of p_t(E), we observe that

$$ p^+(\partial E) \;\ge\; \mathbb{E}\left[ p_t^+(\partial E) \right] \;\overset{(i)}{\ge}\; c \sqrt{\frac{t}{\|A\|_2}}\; \mathbb{E}\left[ \min\left( p_t(E),\, 1 - p_t(E) \right) \right] \;\overset{(ii)}{\ge}\; \frac{c}{4} \sqrt{\frac{t}{\|A\|_2}}\; \mathbb{P}\left( p_t(E) \in \left[\tfrac14, \tfrac34\right] \right). $$

Inequality (i) uses the isoperimetric inequality for a log-concave density which is more log-concave than a Gaussian density proportional to e^{-\frac{1}{2} x^\top B_t x} (Theorem 4.4 in [7]). Inequality (ii) uses the fact that p_t(E) is nonnegative.
Based on the above observation, the high-level idea of the proof requires two main steps:
• Show that there exists some time t > 0 at which the Gaussian component ½x^⊤B_tx of the density p_t is large enough, so that we can apply the known isoperimetric inequality for densities more log-concave than a Gaussian.
• Control the quantity p_t(E), so that the isoperimetric inequality obtained at time t can be related back to that at time 0.
The first step is immediate, since our construction explicitly enforces the density p_t to have a Gaussian component ½x^⊤B_tx (Equation (9)). The remaining question is whether we can run the SDE long enough to make the Gaussian component large enough, while still keeping p_t(E) of the same order as p(E) = 1/2 with large probability.

Control the evolution of the measure of a subset
Lemma 4. Under the same assumptions as Lemma 3, for any measurable subset E of R^d with p(E) = 1/2 and any t > 0, the solution p_t of the SDE (9) satisfies

$$ \mathbb{P}\left( p_t(E) \in \left[\tfrac14, \tfrac34\right] \right) \;\ge\; 0.97 \;-\; \mathbb{P}\left( \int_0^t \left\| A^{-1/2} A_s A^{-1/2} \right\|_2 ds \;\ge\; \tfrac{1}{64} \right). $$

This lemma is proved in Lemma 29 of Lee and Vempala [13]. We provide a proof here for completeness.
Proof of Lemma 4: Let g_t = p_t(E). Using Equation (9), we obtain the following derivative of g_t:

$$ dg_t \;=\; \left( \int_E p_t(x)\, (x - \mu_t)\, dx \right)^\top C_t\, dW_t. $$

Its quadratic variation satisfies

$$ d[g]_t \;=\; \left\| C_t \int_E p_t(x)\, (x - \mu_t)\, dx \right\|_2^2 dt \;\le\; \left\| A^{-1/2} A_t A^{-1/2} \right\|_2\, dt, $$

where the inequality follows from the Cauchy–Schwarz inequality. Applying the Dambis–Dubins–Schwarz theorem (see e.g. Theorem 1.7 in Section V.1 of [19]), there exists a Wiener process W̃_t such that g_t − g_0 has the same distribution as W̃_{[g]_t}. Since g_0 = 1/2, on the event [g]_t ≤ 1/64 the deviation |g_t − 1/2| can exceed 1/4 only if W̃ deviates by at least two standard deviations, and we conclude from the fact that P(ξ > 2) < 0.03 if ξ follows the standard Gaussian distribution.
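The following sketch (illustrative only) checks the tail constant used at the end of the proof, and the reflection-type bound behind the time-change argument, on Brownian motion itself.

```python
import numpy as np
from scipy.stats import norm

# Tail constant used at the end of the proof: P(xi > 2) for xi ~ N(0, 1).
print("P(xi > 2) =", norm.sf(2.0))   # ~0.0228 < 0.03

# Monte Carlo check of the time-change intuition: a martingale whose
# quadratic variation is at most sigma^2 = 1/64 drops by 2*sigma = 1/4
# only with small probability (reflection gives 2 * P(xi > 2) ~ 0.046).
rng = np.random.default_rng(2)
n, steps, sigma2 = 20000, 200, 1.0 / 64.0
paths = np.cumsum(rng.normal(scale=np.sqrt(sigma2 / steps), size=(n, steps)), axis=1)
print("P(min path <= -1/4) =", (paths.min(axis=1) <= -0.25).mean())
```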

Control the evolution of the spectral norm
According to Lemma 4, to control the evolution of the measures of subsets, we need to control the spectral norm of A^{-1/2}A_tA^{-1/2}. The following lemma serves this purpose.
Lemma 5. In addition to the assumptions of Lemma 3, suppose that ψ_k ≥ 1/(α k^β) for all k ≤ d, for some 0 < β ≤ 1/2 and α ≥ 1. Then there exists a universal constant c such that for q = ⌈1/β⌉ + 1 and T₂ = d^{β/q}/(c α² d^{2β}),

$$ \mathbb{P}\left( \int_0^{T_2} \left\| A^{-1/2} A_t A^{-1/2} \right\|_2 dt \;\ge\; \tfrac{1}{64} \right) \;\le\; 0.01. $$

Direct control of the largest eigenvalue of A^{-1/2}A_tA^{-1/2} is not trivial; instead, we use a potential function Γ_t to upper bound the largest eigenvalue. Define

$$ \Gamma_t \;=\; \mathrm{Tr}\left( Q_t^q \right), \qquad Q_t = A^{-1/2} A_t A^{-1/2}. \quad (14) $$

It is clear that Γ_t ≥ ‖Q_t‖₂^q, so that Γ_t^{1/q} upper bounds the largest eigenvalue of Q_t. The advantage of using Γ_t is that it is differentiable in t. We have the following differential for A_t,

$$ dA_t \;=\; \int_{\mathbb{R}^d} (x - \mu_t)(x - \mu_t)^\top\, (x - \mu_t)^\top C_t\, dW_t\; p_t(x)\, dx \;-\; A_t C_t^\top C_t A_t\, dt, \quad (15) $$

and a corresponding differential for Γ_t. Obtaining these differentials uses Itô's formula, and the proofs are provided in Appendix A.
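To see concretely why the potential controls the spectral norm, note that ‖Q‖₂ ≤ Tr(Q^q)^{1/q} ≤ d^{1/q}‖Q‖₂ for any positive semi-definite Q, so for q of order log d the potential tracks the largest eigenvalue up to a constant factor. The following sketch verifies this numerically for a random PSD matrix.

```python
import numpy as np

# For PSD Q: ||Q||_2 <= Tr(Q^q)^{1/q} <= d^{1/q} ||Q||_2.
rng = np.random.default_rng(3)
d = 200
M = rng.normal(size=(d, d))
Q = M @ M.T / d                       # a random PSD matrix
q = int(np.ceil(np.log(d)))           # q ~ log d makes d^{1/q} ~ e
eig = np.linalg.eigvalsh(Q)
gamma = (eig ** q).sum()              # Tr(Q^q), computed via the spectrum
print("||Q||_2           =", eig.max())
print("Tr(Q^q)^(1/q)     =", gamma ** (1.0 / q))
print("d^(1/q) * ||Q||_2 =", d ** (1.0 / q) * eig.max())
```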
The next lemma upper bounds the terms in the differential of the potential Γ_t.

Lemma 6. Under the same assumptions as Lemma 5, the differential of the potential Γ_t defined in Equation (14) can be written as the sum of a martingale term and a drift term δ_t dt, and the drift δ_t admits two upper bounds: one in terms of α, β and Tr(Q_t^q), via the tensor bounds of Lemmas 10 and 12, and one in terms of the Gaussian component at time t, via Lemma 11.

The proof of Lemma 6 is provided in Section 3.1. Using Lemma 6, the following two lemmas show that the potential Γ_t can be bounded in two ways.

Lemma 7. Under the same assumptions as Lemma 6, applying Itô's formula to a transformation h of the potential Γ_t yields a drift bound that controls E[h(Γ_t)] on an initial time interval.

Lemma 8. Under the same assumptions as Lemma 6, applying Itô's formula to a second transformation f of the potential Γ_t yields a differential inequality that controls the growth of E[f(Γ_t)] at later times.

The proofs of Lemmas 7 and 8 are provided in Section 3.2. Now we are ready to prove Lemma 5.
Proof of Lemma 5: We take T₁ = 1/(c α² d^{2β}) and T₂ = (d^{β/q}/40) T₁. We bound the spectral norm of A^{-1/2}A_tA^{-1/2} in the two time intervals via Lemma 7 and Lemma 8 respectively. In the first time interval [0, T₁], we obtain a high-probability bound (Equation (17)). Inequality (i) follows from the condition βq ≥ 1. Inequality (ii) follows from the fact that Tr(A^q)^{1/q} ≥ ‖A‖₂. Inequality (iii) holds because 3^q d ≥ 2^q(d + 1) when q ≥ 2 and d ≥ 1; here h is the transformation defined in Lemma 7. Inequality (iv) follows from Lemma 7.
In the first time interval, we can also bound the expectation of Γ_{T₁}^{1/q}. Since the density p_{T₁} is more log-concave than a Gaussian density with covariance matrix B_{T₁}^{-1} = A/T₁, the covariance matrix of p_{T₁} is upper bounded as follows (Theorem 4.1 in Brascamp–Lieb [5], or Lemma 5 in Eldan and Lehec [9]):

$$ A_{T_1} \;\preceq\; \frac{1}{T_1}\, A. \quad (18) $$

Consequently, all the eigenvalues of Q_{T₁} are at most 1/T₁, and Γ_{T₁} is upper bounded by d/T₁^q.
Using the above bound, we can bound the expectation of Γ_{T₁}^{1/q} via the probability bound in Equation (17); the resulting estimate is Equation (19). Inequality (i) follows from Lemma 7 and Equations (17) and (18). Inequality (ii) follows from q ≥ 2, β ≤ 1/2 and d^{1/2} ≥ log(d) for d ≥ 3. In the second time interval, for t ∈ [T₁, T₂], we bound E[Γ_t^{1/q}] as follows: inequality (i) follows from Lemma 8, inequality (ii) holds because t ≤ T₂, and inequality (iii) follows from T₂ = (d^{β/q}/40) T₁. Using this bound, we control the spectral norm in the second time interval via Markov's inequality (Equation (21)), where inequality (i) follows from Markov's inequality and inequality (ii) follows from Equation (19). Inequality (iii) follows from the definition of T₂ and from β/2 + 1/q ≤ 2β − β/q when βq ≥ 1 and q ≥ 2.
Combining the bounds for the first and second time intervals, Equations (17) and (21), we obtain the claimed bound.

Proof of Lemma 2
The proof of Lemma 2 follows the strategy described after Lemma 3. We make the arguments rigorous here. We consider a log-concave density p in R^d with compact support. Without loss of generality, we can assume that the covariance matrix A of the density p is invertible; otherwise, the density p is degenerate and we can instead prove the result in a lower dimension.
First, according to the concavity of the isoperimetric profile [16] (Theorem 1.8), it is sufficient to consider subsets of measure 1/2 in the definition of the isoperimetric coefficient (2). Second, the density p_t is log-concave and it has a Gaussian component of the form ½x^⊤B_tx. It can be shown via the localization lemma [11] that a density which is more log-concave than a Gaussian has an isoperimetric coefficient that depends on the covariance of the Gaussian part (see e.g. Theorem 4.4 in [7]). Third, given an initial subset E of R^d with p(E) = 1/2, using the martingale property of p_{T₂}(E), we apply the chain of inequalities stated after Lemma 3 at time t = T₂. Inequality (i) uses the isoperimetric inequality for a log-concave density more log-concave than a Gaussian (Theorem 4.4 in [7]). Inequality (ii) follows from the fact that p_t(E) is nonnegative. Inequality (iii) follows from Lemma 4 and Lemma 5 (for d ≥ 3). Inequality (iv) follows from the construction B_t = tA^{-1}. We conclude the proof since T₂ is taken as

$$ T_2 \;=\; \frac{d^{\beta/q}}{c\, \alpha^2\, d^{2\beta}} $$

with c a universal constant, so that the resulting lower bound matches the statement of Lemma 2. The above proof only works for d ≥ 3; it is easy to verify that Lemma 2 still holds for d = 1, 2 from the original KLS bound in Equation (3).

Proof of Lemma 1
The proof of Lemma 1 consists of applying Lemma 2 recursively. We start with the known bound from the original KLS paper [11],

$$ \psi_d \;\ge\; \frac{1}{\alpha_1\, d^{\beta_1}}, \qquad \forall d \ge 1, $$

with α₁ = 1/log(2) and β₁ = 1/2. For l ≥ 1, we define α_l and β_l recursively as follows:

$$ \alpha_{l+1} = c\, \alpha_l, \qquad \beta_{l+1} = \beta_l \left( 1 - \frac{\beta_l}{4} \right), $$

where c is the constant in Lemma 2. It is not difficult to show by induction that α_l and β_l satisfy

$$ \alpha_l \le \alpha_1\, c^{\,l}, \qquad \beta_l \le \frac{4}{l + 7} \le \frac{4}{l}. \quad (24) $$
In the induction, suppose that we have ψ_k ≥ 1/(α_l k^{β_l}) for all k ≤ d. Then for the integer l + 1, applying Lemma 2 with q = ⌈1/β_l⌉ + 1, we have

$$ \psi_d \;\ge\; \frac{1}{c\, \alpha_l\, d^{\,\beta_l \left(1 - \frac{1}{2q}\right)}} \;\ge\; \frac{1}{c\, \alpha_l\, d^{\,\beta_l \left(1 - \frac{\beta_l}{4}\right)}} \;=\; \frac{1}{\alpha_{l+1}\, d^{\,\beta_{l+1}}}, $$

where the last inequality follows from q ≤ 2/β_l and the last equality follows from the definition of α_l and β_l. We conclude Lemma 1 using the bounds on α_l and β_l in Equation (24).
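The following sketch iterates the recursion as reconstructed above (the exact constants in the paper may differ), showing that the exponent β_l decays like 4/l, which is what drives the d^{-o_d(1)} rate after l ≈ log(d)/log log(d) iterations.

```python
import math

# Iterate beta_{l+1} = beta_l * (1 - 1/(2 q_l)) with q_l = ceil(1/beta_l) + 1,
# starting from the KLS exponent beta_1 = 1/2.
beta = 0.5
for l in range(1, 31):
    if l == 1 or l % 5 == 0:
        print(f"l = {l:2d}: beta_l = {beta:.4f}  (4/l = {4.0 / l:.4f})")
    q = math.ceil(1.0 / beta) + 1
    beta *= 1.0 - 1.0 / (2 * q)
```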

Proof of Theorem 1
To derive Theorem 1 from Lemma 1, it is sufficient to show that for any log-concave density p in R^d, most of its probability measure is concentrated on a compact set. We invoke the log-concave concentration bound of Paouris [18] (Theorem 1): there exists a constant c > 0 such that for X a random vector in R^d with an isotropic log-concave density,

$$ \mathbb{P}\left( \|X\|_2 \ge c\, t\, \sqrt{d} \right) \;\le\; e^{-t \sqrt{d}}, \qquad \forall t \ge 1. \quad (25) $$

Theorem 1 in Paouris [18] only deals with convex bodies; however, it is also valid for any isotropic log-concave measure, according to the remarks in its Section 8. Let μ and A be the mean and the covariance matrix of the density p. Define R = c‖A‖₂^{1/2}√d, where c is the constant in Equation (25). For Y a random vector in R^d with density p, A^{-1/2}(Y − μ) has an isotropic log-concave density. Applying Equation (25) with t = 1, we obtain

$$ \mathbb{P}\left( \|Y - \mu\|_2 \ge R \right) \;\le\; \mathbb{P}\left( \left\| A^{-1/2}(Y - \mu) \right\|_2 \ge c \sqrt{d} \right) \;\le\; e^{-\sqrt{d}}. $$

Let B = B(μ, R) be the Euclidean ball with center μ and radius R. Then p(B^c) ≤ exp(−√d) ≤ 0.2 for d ≥ 3. Let ̺ be the density obtained by truncating p to the ball B(μ, R); then ̺ has compact support. For a subset E ⊆ R^d with p(E) = 1/2, the isoperimetric ratio of E under p is at least half of the corresponding ratio under ̺; the last inequality follows because p(B^c) ≤ 0.2, so the measures of E and its complement under ̺ differ from their counterparts under p by at most a constant factor.
Since it is sufficient to consider subsets of measure 1/2 in the definition of the isoperimetric coefficient [16] (Theorem 1.8), we conclude that the isoperimetric coefficient of p is lower bounded by half of that of ̺. Applying Lemma 1 to the isoperimetric coefficient of ̺, we obtain Theorem 1.
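The following sketch illustrates the concentration phenomenon behind the truncation step, using a product of centered exponentials as the isotropic log-concave density (an illustrative choice): the norm concentrates around √d, so a ball of radius a constant multiple of √d captures almost all of the mass.

```python
import numpy as np

# Norm concentration for an isotropic log-concave vector: coordinates are
# centered standard exponentials (mean 0, variance 1).
rng = np.random.default_rng(4)
d, n = 100, 100000
X = rng.exponential(size=(n, d)) - 1.0
r = np.linalg.norm(X, axis=1) / np.sqrt(d)
for c in [1.2, 1.3, 1.5]:
    # Rare events may show as 0 with this sample size.
    print(f"P(||X||_2 > {c} sqrt(d)) ~= {(r > c).mean():.1e}")
```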

Proof of auxiliary lemmas
In this section, we prove the auxiliary lemmas: Lemma 6, Lemmas 7 and 8, and the tensor bounds of Lemmas 9 to 13.

Tensor bounds and proof of Lemma 6
In this subsection, we prove Lemma 6. Since Lemma 6 involves the third-order moment tensor of a log-concave density, we define the following 3-Tensor for any probability density p on R^d with mean μ, to simplify notation:

$$ T_p(A, B, C) \;=\; \sum_{i,j,k,i',j',k'} \mathbb{E}\left[ \bar{X}_i \bar{X}_j \bar{X}_k \right] \mathbb{E}\left[ \bar{X}_{i'} \bar{X}_{j'} \bar{X}_{k'} \right] A_{ii'} B_{jj'} C_{kk'}, \qquad \bar{X} = X - \mu,\quad X \sim p. $$
For A, B, C three matrices in R^{d×d}, we can write T_p(A, B, C) equivalently as

$$ T_p(A, B, C) \;=\; \mathbb{E}\left[ (X - \mu)^\top A\, (Y - \mu) \cdot (X - \mu)^\top B\, (Y - \mu) \cdot (X - \mu)^\top C\, (Y - \mu) \right], $$

where X and Y are independent random vectors drawn from p.
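The following sketch checks by Monte Carlo that the two expressions for the 3-Tensor given above (as reconstructed here) agree, using a small dimension so that the third-moment tensor can be contracted directly.

```python
import numpy as np

# Check: contracting the empirical third-moment tensor with itself through
# A, B, C matches E[<Y, A Y'><Y, B Y'><Y, C Y'>] for an independent copy Y'.
rng = np.random.default_rng(5)
d, n = 3, 400000
Y = rng.exponential(size=(n, d)) - 1.0         # a centered log-concave sample
A, B, C = [np.diag(rng.uniform(0.5, 1.5, size=d)) for _ in range(3)]

T = np.einsum('ni,nj,nk->ijk', Y, Y, Y) / n    # empirical third-moment tensor
lhs = np.einsum('ijk,lmn,il,jm,kn->', T, T, A, B, C)

Yp = Y[rng.permutation(n)]                     # an (approximately) independent copy
rhs = np.mean((Y * (Yp @ A)).sum(1) * (Y * (Yp @ B)).sum(1) * (Y * (Yp @ C)).sum(1))
print(lhs, rhs)                                # the two estimates should be close
```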
Before we prove Lemma 6, we prove the following properties of the 3-Tensor.

Lemma 9. Suppose p is a log-concave density with mean μ and covariance A. Then for any positive semi-definite matrices B and C, the 3-Tensor is upper bounded in terms of traces of products of A, B and C, with a universal constant coming from the fourth-moment bound (29) below.

Lemma 10. Suppose that ψ_k ≥ 1/(α k^β) for all k ≤ d, for some 0 < β ≤ 1/2 and α ≥ 1. Suppose p is a log-concave density in R^d with covariance A, and A is invertible. Then for q ≥ 1/(2β), the 3-Tensor T_p(A^{q−2}, I_d, I_d) is upper bounded in terms of α and Tr(A^q).

Lemma 11. Given τ > 0, suppose p is a log-concave density which is more log-concave than N(0, (1/τ)I_d). Let A be its covariance matrix, and suppose A is invertible. Then for q ≥ 3, we have

$$ T_p\left( A^{q-2}, I_d, I_d \right) \;\le\; \frac{1}{\tau}\, \mathrm{Tr}\left( A^q \right). $$

Lemma 12. Suppose p is a log-concave density in R^d. For any δ ∈ [0, 1] and A, B, C positive semi-definite matrices, the 3-Tensor with arguments interpolated through A^δ and A^{1−δ} is upper bounded by the un-interpolated one, via the extended Lieb–Thirring inequality.

The proofs of the above lemmas are provided in Section 3.3. Now we are ready to prove Lemma 6.
Proof of Lemma 6: We first prove the bound on the martingale part. Applying Lemma 9, and knowing that the covariance of p_t is A_t, we obtain the claimed expression. Equality (i) uses the definition Q_t = A^{-1/2}A_tA^{-1/2}. Equality (ii) uses the fact that ‖MM^⊤‖₂ = ‖M^⊤M‖₂ for any square matrix M ∈ R^{d×d}. Inequality (iii) uses ‖M‖₂ ≤ Tr(M^q)^{1/q} for any positive semi-definite matrix M.
Next, we bound the drift δ_t in two ways. We can ignore the negative term in δ_t to obtain the first bound, in which ̺_t denotes the density of the linearly transformed random variable A^{-1/2}(X − μ_t) for X drawn from p_t, where μ_t is the mean of p_t. The density ̺_t is still log-concave, since any linear transformation of a log-concave density is log-concave (see e.g. [20]), and ̺_t has covariance A^{-1/2}A_tA^{-1/2}, which is Q_t. For a ∈ {0, …, q−2}, inequality (i) follows from Lemma 12 and inequality (ii) follows from Lemma 10. Since there are q − 1 terms in the sum, we conclude the first part of the bound for δ_t. On the other hand, since p_t is more log-concave than the Gaussian density proportional to e^{-\frac{1}{2}x^\top B_t x} with B_t = tA^{-1}, the density ̺_t is more log-concave than the Gaussian density proportional to e^{-\frac{t}{2}x^\top x}. Applying Lemma 11 with τ = t, we obtain the second part of the bound for δ_t.

Control of the potential in two time intervals
In this subsection, we prove Lemma 7 and Lemma 8.

Proof of Lemma 7:
The function h has the following derivatives. Using Itô's formula, and plugging the bounds from Lemma 6 into inequality (i), we obtain the stated bound.

Proof of Lemma 8:
The function f has the following derivatives. Using Itô's formula, together with the bounds in Lemma 6 and the fact that the term involving (1/q)Γ_t^{1/q−1} integrated against the martingale part has zero expectation, we obtain a differential inequality for E[f(Γ_t)]. Solving this differential inequality yields the stated bound for all t₂ > t₁ > 0.

Proof of tensor bounds
In this subsection, we prove Lemma 9, 10, 11 and 12.
Proof of Lemma 9: Since C is positive semi-definite, we can write its eigenvalue decomposition as C = Σ_i λ_i v_i v_i^⊤ with λ_i ≥ 0. Inequality (i) follows from the triangle inequality. Inequality (ii) follows from the Cauchy–Schwarz inequality.
Inequality (iii) follows from the statement below, which upper bounds higher moments of a log-concave density via its lower moments: for any log-concave density ν, any vector θ ∈ R^d and any a ≥ b > 0, we have

$$ \left( \mathbb{E}_{X \sim \nu} \left| \langle \theta, X - \mu_\nu \rangle \right|^a \right)^{1/a} \;\le\; \frac{a}{b} \left( \mathbb{E}_{X \sim \nu} \left| \langle \theta, X - \mu_\nu \rangle \right|^b \right)^{1/b}, \quad (29) $$

where μ_ν is the mean of ν. Equation (29) is proved in Corollary 5.7 of Guédon et al. [10], and the exact constant is provided in Proposition 3.8 of Latała and Wojtaszczyk [12].
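The following sketch checks the moment comparison (29) numerically for the centered exponential distribution, an example log-concave density.

```python
import numpy as np

# Check of (29) in one dimension: for a >= b, the L_a norm of a centered
# exponential variable is at most (a/b) times its L_b norm.
rng = np.random.default_rng(6)
x = rng.exponential(size=1000000) - 1.0
for a, b in [(4, 2), (3, 2), (6, 2)]:
    la = np.mean(np.abs(x) ** a) ** (1 / a)
    lb = np.mean(np.abs(x) ** b) ** (1 / b)
    print(f"a={a}, b={b}: L_a/L_b = {la / lb:.3f} <= a/b = {a / b:.2f}")
```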
In order to prove Lemma 10, we need the following lemma.

Lemma 13. Suppose that ψ_k ≥ 1/(α k^β) for all k ≤ d, for some 0 < β ≤ 1/2 and α ≥ 1. For an isotropic log-concave density p in R^d and a unit vector v ∈ R^d, define ∆ = E_{X∼p}[X^⊤v · XX^⊤]. Then:
1. For any orthogonal projection matrix P ∈ R^{d×d} with rank r, the quantity Tr(∆P∆) is upper bounded in terms of α, β and min(2r, d).
2. For any positive semi-definite matrix A, the quantity Tr(∆A∆) is upper bounded in terms of α, β, Tr(A^{1/(2β)}) and log(d).

This lemma was proved as Lemma 41 in an older version (arXiv version 2) of Lee and Vempala [13]; we provide a proof here for completeness.
Proof of Lemma 13: For the first part, since E_{X∼p}[X^⊤v] = 0, we can subtract the mean of the first factor X^⊤∆PX without changing the value of Tr(∆P∆). Inequality (i) then follows from the Cauchy–Schwarz inequality. Inequality (ii) follows from the fact that E_{X∼p}[(X^⊤v)²] = 1, as p is isotropic, and from the fact that the inverse Poincaré constant is upper bounded by twice the inverse of the isoperimetric coefficient ([15, 6], or Theorem 1.1 in [16]). The matrix ∆P + P^⊤∆ has rank at most min(2r, d). Rearranging the terms in the resulting inequality, we conclude the first part of Lemma 13.
For the second part, we write the matrix A in its eigenvalue decomposition and group the terms by eigenvalue: A = Σ_{i=1}^{J} A_i + B, where A_i has eigenvalues in the interval (‖A‖₂e^{i−1}/d, ‖A‖₂e^i/d] and B has eigenvalues smaller than or equal to ‖A‖₂/d. Because the right endpoints of the intervals increase geometrically, J = ⌈log(d)⌉ buckets suffice. Let P_i be the orthogonal projection matrix onto the span of the eigenvectors in A_i. Then inequality (i) follows from the first part of Lemma 13 and inequality (ii) follows from the hypothesis of Lemma 13. Similarly, for the matrix B, we bound Tr(∆A_j∆) + Tr(∆B∆), where inequality (i) follows from Hölder's inequality and inequality (ii) follows from the fact that ‖A_j‖₂^{1/(2β)} rank(A_j) ≤ e · Tr(A_j^{1/(2β)}), due to the construction of A_j. This concludes the second part of Lemma 13.
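The bucketing step above can be made concrete; the following sketch splits the spectrum of a random PSD matrix into the J = ⌈log d⌉ geometric buckets used in the proof, plus the remainder of eigenvalues below ‖A‖₂/d.

```python
import numpy as np

# Eigenvalue bucketing: bucket i covers (||A||_2 e^{i-1}/d, ||A||_2 e^i/d]
# for i = 1, ..., J with J = ceil(log d); the rest go to the remainder B.
rng = np.random.default_rng(7)
d = 500
M = rng.normal(size=(d, d))
eig = np.linalg.eigvalsh(M @ M.T / d)   # spectrum of a random PSD matrix
top = eig.max()
J = int(np.ceil(np.log(d)))
counts = [((eig > top * np.exp(i - 1) / d) & (eig <= top * np.exp(i) / d)).sum()
          for i in range(1, J + 1)]
print("J =", J, "buckets cover", sum(counts), "eigenvalues")
print("remainder (<= ||A||_2/d):", (eig <= top / d).sum())
```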
Proof of Lemma 10: Let μ be the mean of p. First, for X a random vector in R^d drawn from p, we define the standardized random variable A^{-1/2}(X − μ) and denote its density by ̺; ̺ is an isotropic log-concave density. Through a change of variables, we rewrite the 3-Tensor in terms of ̺, where the last inequality follows from Lemma 12. The matrix A^q is positive semi-definite, and we write down its eigenvalue decomposition. Since ̺ is isotropic, we can rewrite the 3-Tensor in summation form and apply Lemma 13.
Here, inequality (i) follows from Lemma 13 and the fact that ̺ is isotropic, and inequality (ii) follows from the Cauchy–Schwarz inequality together with the assumption that q ≥ 1/(2β).
Proof of Lemma 11: Without loss of generality, we can assume that the density p has mean 0. Its covariance matrix A is positive semi-definite and invertible. We write its eigenvalue decomposition as A = Σ_i λ_i v_i v_i^⊤, where the v_i are eigenvectors with unit norm. Then A^q has an eigenvalue decomposition with the same eigenvectors, A^q = Σ_i λ_i^q v_i v_i^⊤. Next, we bound the terms Tr(∆_i∆_i). Equality (i) holds because E_{X∼p}[X] = 0. Inequality (ii) follows from the Cauchy–Schwarz inequality. Equality (iii) follows from the definition of the covariance matrix, E_{X∼p}[XX^⊤] = A. Inequality (iv) follows from the Brascamp–Lieb inequality (or Hessian Poincaré inequality, see Theorem 4.1 in Brascamp and Lieb [5]), together with the assumption that p is more log-concave than N(0, (1/τ)I_d). Plugging the bounds on the terms Tr(∆_i∆_i) into Equation (32), inequality (i) follows from the Cauchy–Schwarz inequality and, for q ≥ 3, inequality (ii) follows from Lemma 12. After rearranging the terms, we obtain

$$ T_p\left( A^{q-2}, I_d, I_d \right) \;\le\; \frac{1}{\tau}\, \mathrm{Tr}\left( A^q \right). $$

Proof of Lemma 12:
This lemma was proved as Lemma 43 in an older version (arXiv version 2) of Lee and Vempala [13]; we provide a proof here for completeness.
Without loss of generality, we can assume that the density p has mean 0. For i ∈ {1, …, d}, we define ∆_i = E_{X∼p}[B^{1/2}XX^⊤B^{1/2} · X^⊤C^{1/2}e_i], where e_i ∈ R^d is the vector with i-th coordinate 1 and 0 elsewhere; note that Σ_{i=1}^{d} e_ie_i^⊤ = I_d. We can then rewrite the tensor on the left-hand side as a sum of traces.
Applying the extended Lieb–Thirring inequality (see e.g. Lemma 2.1 in [1]) to each trace, and writing the sum of traces back in 3-Tensor form, we conclude Lemma 12.
Finally, we derive the differential of Γ_t. Define the function Γ : R^{d×d} → R by Γ(X) = Tr(X^q). The first-order and second-order derivatives of Γ are given by

$$ D\Gamma(X)[H] = q\, \mathrm{Tr}\left( X^{q-1} H \right), \qquad D^2\Gamma(X)[H_1, H_2] = q \sum_{a=0}^{q-2} \mathrm{Tr}\left( X^a H_2 X^{q-2-a} H_1 \right). $$
Using the above derivatives and Itô's formula, we obtain an expression for dΓ_t (Equation (33)), where E_{ij} is the matrix with 1 in entry (i, j) and 0 elsewhere, and Q_{ij,t} is the stochastic process given by the (i, j) entry of Q_t. Using the differential of A_t in Equation (15), we compute dQ_t and the cross-variations, where z(x)_i denotes the i-th coordinate of A^{-1/2}(x − μ_t). Plugging the expressions for dA_t and d[A_{ij}, A_{kl}]_t into Equation (33), we obtain the differential of Γ_t used in Lemma 6.