Optimal regularity for degenerate Kolmogorov equations in non-divergence form with rough-in-time coefficients

We consider a class of degenerate equations in non-divergence form satisfying a parabolic Hörmander condition, with coefficients that are measurable in time and Hölder continuous in the space variables. By utilizing a generalized notion of strong solution, we establish the existence of a fundamental solution and its optimal Hölder regularity, as well as Gaussian estimates. These results are key to studying the backward Kolmogorov equations associated with a class of Langevin diffusions.


Introduction
In recent years, several sharp Schauder estimates have been proved for Kolmogorov equations with coefficients that are Hölder continuous in the space variables but only measurable in time. In this article we prove global Schauder estimates which we claim to be optimal, meaning that the inherent Hölder spaces are the strongest possible under the given assumptions on the coefficients. In particular, our results include and improve some known estimates in the framework of non-divergence form operators satisfying a parabolic Hörmander condition.
A prototype example of the class under consideration is
$$\frac{\sigma^2(t,v,x)}{2}\,\partial_{vv} + v\,\partial_x + \partial_t, \qquad (t,v,x) \in \mathbb{R}^3,$$
which is the backward Kolmogorov operator of the system of stochastic differential equations
$$\begin{cases} dV_t = \sigma(t, V_t, X_t)\,dW_t, \\ dX_t = V_t\,dt, \end{cases}$$
where W is a real Brownian motion. The study of these models is motivated by several applications, including kinetic theory and finance. In the classical Langevin model, (V, X) describes velocity and position of a particle in the phase space and is a pilot example of more complex kinetic models (cf. [13], [10], [11]). In mathematical finance, (V, X) represents the log-price and average processes utilized in modeling path-dependent financial derivatives, such as Asian options (cf. [1], [22]).
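As an illustration of the prototype system, the following Python sketch simulates one path with the Euler-Maruyama scheme. It assumes that the position equation is dX_t = V_t dt, as in the classical Langevin model; the function name `simulate_langevin` and its parameters are illustrative and not taken from the paper.

```python
import math
import random


def simulate_langevin(sigma, v0, x0, T, n_steps, seed=0):
    """Euler-Maruyama path of dV = sigma(t, V, X) dW, dX = V dt on [0, T].

    Assumption: the position is driven by the velocity only (dX = V dt).
    Returns a list of (t, v, x) triples of length n_steps + 1.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    t, v, x = 0.0, v0, x0
    path = [(t, v, x)]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        v_new = v + sigma(t, v, x) * dw     # the noise acts on the velocity only
        x = x + v * dt                      # no noise in the position component
        v = v_new
        t += dt
        path.append((t, v, x))
    return path


if __name__ == "__main__":
    # Diffusion coefficient that is merely measurable (piecewise constant) in time.
    rough_sigma = lambda t, v, x: 1.0 if t < 0.5 else 2.0
    t_end, v_end, x_end = simulate_langevin(rough_sigma, 0.0, 0.0, 1.0, 1000)[-1]
    print(t_end, v_end, x_end)
```

The noise enters only through V, which is the degeneracy discussed in the text: X is regularized only indirectly, through the drift.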
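We now introduce the general class under consideration. Let d, N ∈ N, with d ≤ N, be fixed throughout the paper. We denote by (t, x) the point in R × R^N and consider the second-order operator in non-divergence form
$$L := A + Y,$$
where
$$A := \frac{1}{2}\sum_{i,j=1}^{d} a_{ij}(t,x)\,\partial_{x_i x_j} + \sum_{i=1}^{d} a_i(t,x)\,\partial_{x_i} + a(t,x), \qquad Y := \partial_t + \langle Bx, \nabla \rangle = \partial_t + \sum_{i,j=1}^{N} b_{ij}\,x_j\,\partial_{x_i},$$
and B = (b_{ij})_{i,j=1,…,N} is a constant matrix of dimension N × N. The diffusion part A is an elliptic operator on R^d, while the drift (or transport) term Y is a first order differential operator on R^{N+1}. We impose two main structural hypotheses:
(H.1) the matrix A = (a_{ij})_{i,j=1,…,d} is symmetric and there exists a positive constant µ such that
$$\mu^{-1}|\xi|^2 \le \sum_{i,j=1}^{d} a_{ij}(t,x)\,\xi_i\xi_j \le \mu|\xi|^2, \qquad x \in \mathbb{R}^N,\ \xi \in \mathbb{R}^d,$$
for almost every t ∈ [0, T], where T > 0 is fixed once and for all;
(H.2) the following Hörmander condition is satisfied:
$$\operatorname{rank}\,\operatorname{Lie}(\partial_{x_1}, \dots, \partial_{x_d}, Y) = N + 1. \tag{1.1}$$
The focus of the paper is on the case d < N, that is, L is fully degenerate, namely no coercivity condition on R^N is satisfied. Condition (1.1) is also known as the parabolic Hörmander condition since the drift Y plays a key role in the generation of the Lie algebra.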
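For a constant matrix B, the Hörmander condition can be tested through the Kalman rank condition recalled in Section 2.1. The Python sketch below is a minimal check, assuming the convention Y = ∂_t + ⟨Bx, ∇⟩ with the diffusion acting on the first d coordinates, so that the relevant pair is (B, E) with E the embedding of R^d into R^N. The helper names `rank` and `kalman_rank` are illustrative.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(m):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def kalman_rank(B, d):
    """Rank of the controllability matrix [E, BE, ..., B^(N-1) E].

    Assumption: E is the N x d matrix embedding R^d into the first d
    coordinates of R^N, matching a diffusion acting on x_1, ..., x_d.
    """
    N = len(B)
    E = [[1 if i == j else 0 for j in range(d)] for i in range(N)]
    blocks, current = [], E
    for _ in range(N):
        blocks.append(current)
        current = mat_mul(B, current)
    controllability = [sum((blk[i] for blk in blocks), []) for i in range(N)]
    return rank(controllability)


if __name__ == "__main__":
    B_langevin = [[0, 0], [1, 0]]      # Y = d_t + v d_x for the prototype
    print(kalman_rank(B_langevin, 1))  # full rank N = 2: condition satisfied
```

With B = 0 and d < N the rank stays equal to d, which corresponds to an operator that does not regularize in the remaining N − d directions.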
We consider the Cauchy problem
$$\begin{cases} Lu = f & \text{on } S_T, \\ u(T, \cdot) = g & \text{on } \mathbb{R}^N, \end{cases} \tag{1.2}$$
posed on the strip
$$S_T := \,]0, T[\, \times \mathbb{R}^N. \tag{1.3}$$
Generally speaking, Schauder estimates give a bound of some Hölder norm of the solution u in terms of some (possibly different) Hölder norms of the data, namely the coefficients of L, the non-homogeneous term f and the datum g. The "strength" of a Schauder estimate depends on the norms involved and this is a sensitive issue in the theory of degenerate PDEs: clearly, for a given Hölder norm on the solution u, the weaker the norms on the data, the stronger the Schauder estimate; conversely, if the norms on the data are given, then the stronger the norm on u the stronger the Schauder estimate. Now, in the literature on degenerate Kolmogorov equations, we may recognize at least two notions of Hölder norm as well as variants of them: the so-called anisotropic and intrinsic norms, whose precise definitions are given in Section 2.1. Intuitively, the former norm takes into account the anisotropic behavior in space induced by the underlying diffusion, but does not require any time-regularity. The intrinsic norm, by contrast, is induced by the geometric properties of the differential operator L and takes into account the regularity along the vector field Y (and thus along the time variable). Therefore, the intrinsic norm is stronger in the sense that it captures the full regularizing effect (in both space and time) of the fundamental solution of L.
Roughly speaking, we may catalogue the known Schauder estimates for solutions to (1.2) as follows:
• anisotropic-to-anisotropic: the anisotropic norms of the data bound the anisotropic norm of the solution, as in [18], [16], [26] and [3];
• intrinsic-to-intrinsic: the intrinsic norms of the data bound the intrinsic norm of the solution, as in [19], [5], [9], [10] and [25];
• anisotropic-to-intrinsic: the anisotropic norms of the data bound the intrinsic norm of the solution. This class is stronger than the two above: only recently, partial results were proved in [6] and [2].
Our main result, Theorem 2.7, provides an optimal anisotropic-to-intrinsic global Schauder estimate which improves the results in [6] and [2] in a subtle but crucial way, as explained in Section 2.3. For instance, the Hölder norm we adopt for the solution u is strong enough to derive intrinsic Taylor formulas and therefore also an Itô formula for the underlying diffusion processes, which is a fundamental tool in stochastic calculus.
Our estimate is global in that it holds all the way up to the boundary, with an explosion factor that depends on the regularity of g, namely (T − t)^{−(2+α−β)/2}: here α and β represent the Hölder exponents of the solution u and of the terminal datum g, respectively. In particular, it is possible to recognize two limiting cases: β = 2 + α, no explosion close to the boundary; β = 0 (g only continuous), maximum explosion rate. As a corollary, we obtain a sharp regularity estimate (Corollary 2.8) in the Y direction, at the boundary t = T.
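The two limiting cases can be read off numerically from the exponent. The short Python sketch below evaluates the factor (T − t)^{−(2+α−β)/2}; the function name `explosion_factor` and the admissible range 0 ≤ β ≤ 2 + α enforced in the code are assumptions made for illustration.

```python
def explosion_factor(t, T, alpha, beta):
    """Value of (T - t)^(-(2 + alpha - beta) / 2) for t < T.

    Assumption: beta ranges in [0, 2 + alpha], so the exponent is non-positive.
    """
    if not 0.0 <= beta <= 2.0 + alpha:
        raise ValueError("beta is expected in [0, 2 + alpha]")
    if not t < T:
        raise ValueError("t must be strictly smaller than T")
    return (T - t) ** (-(2.0 + alpha - beta) / 2.0)


if __name__ == "__main__":
    alpha, T = 0.5, 1.0
    for t in (0.9, 0.99, 0.999):
        smooth = explosion_factor(t, T, alpha, 2.0 + alpha)  # beta = 2 + alpha
        rough = explosion_factor(t, T, alpha, 0.0)           # beta = 0
        print(t, smooth, rough)
```

For β = 2 + α the factor is identically 1, while for β = 0 it grows like (T − t)^{−(2+α)/2} as t approaches T.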
Although this is an expected phenomenon, the quantitative characterization of the penalty term is novel, to the best of our knowledge, for degenerate Kolmogorov operators in the context of variable coefficients or of intrinsic Hölder spaces. We refer the reader to [27] for the case of constant diffusion coefficients and anisotropic Besov-Hölder spaces. We also mention that intrinsic embedding theorems of Sobolev type were recently proved in [8] and [23].
Theorem 2.7 comprises a well-posedness result for (1.2). The proof of Theorem 2.7 goes as follows.
First we define a candidate solution u via the Duhamel principle. After proving that it is actually a solution to (1.2), we prove the regularity estimates for the two convolution terms that constitute u, from which the Schauder estimate follows. The proof critically relies on the recent results in [17], where the existence of the fundamental solution of L was established, together with optimal Hölder estimates, by means of a suitable modification of the parametrix technique, already employed in [24] and [4] in the case of intrinsic Hölder-continuous coefficients.
The rest of the article is organized as follows. Section 2 contains the main results and a detailed comparison with the related literature. Precisely, in Section 2.1 we define both the anisotropic and intrinsic Hölder spaces; in Section 2.2 we state our main result, Theorem 2.7, and comment on it; Section 2.3 contains the comparison with the literature. Section 3 is entirely devoted to the proof of Theorem 2.7.

Schauder estimates
We begin by fixing some general notation that will be utilized throughout the rest of the article. Let g : R^N → R. For any i = 1, …, N, we denote by ∂_i g(x) the partial derivative of g with respect to x_i.
We also denote by ∇_d the gradient operator (∂_1, …, ∂_d) and introduce the following spaces:
- C_b, the set of bounded continuous functions g : R^N → R, equipped with the norm ‖g‖_{C_b} := sup_{x ∈ R^N} |g(x)|;
- L^∞_t, for t > 0, the set of measurable functions f, defined on the strip S_t (cf. (1.3)), such that the norm ‖f‖_{L^∞_t} := ess sup_{S_t} |f| is finite.
Finally, all the normed spaces in this article are defined for scalar valued functions and naturally extend to vector valued functions by considering the sum of the norms of their single components.

Hölder spaces
In this section, we introduce the anisotropic and intrinsic Hölder spaces that appear in the Schauder estimates for (1.2). Loosely speaking, the general idea behind the definition of these spaces is the following:
• anisotropic Hölder spaces are defined for functions of x ∈ R^N, assuming regularity in all N directions w.r.t. an anisotropic distance that reflects the different time-scaling properties of the underlying diffusion process. This distance is defined in terms of an anisotropic norm that assigns to each component of x ∈ R^N a different weight corresponding to the number of commutators of ∇_d and Y that are required to reach that direction. The definition then extends to functions defined on R^{N+1} by only requiring measurability and local boundedness with respect to the time variable.
• intrinsic Hölder spaces are defined for functions of (t, x) ∈ R^{N+1} that are assumed to be anisotropically Hölder continuous, in the sense above, uniformly in time. Additional Hölder regularity in the direction of the drift Y is also assumed. By means of the Hörmander condition, it is then possible to infer Hölder regularity jointly with respect to all variables.
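The weighting described in the first item can be made concrete as follows. This Python sketch assumes the usual form of the anisotropic norm for Kolmogorov operators, in which the components of the j-th block of variables (reached with j commutators with Y) enter with exponent 1/(2j + 1). The block sizes passed as `block_sizes` and the function names are illustrative assumptions.

```python
def anisotropic_norm(x, block_sizes):
    """Sum over blocks j = 0, 1, ... of |x_i|^(1 / (2j + 1)) for x_i in block j.

    Assumption: block 0 collects the d diffusion directions, block j the
    directions reached after j commutators with the drift Y.
    """
    if sum(block_sizes) != len(x):
        raise ValueError("block sizes must add up to the dimension of x")
    total, start = 0.0, 0
    for j, size in enumerate(block_sizes):
        for xi in x[start:start + size]:
            total += abs(xi) ** (1.0 / (2 * j + 1))
        start += size
    return total


def dilate(x, block_sizes, lam):
    """Anisotropic dilation: block j is multiplied by lam^(2j + 1)."""
    out, start = [], 0
    for j, size in enumerate(block_sizes):
        out.extend(xi * lam ** (2 * j + 1) for xi in x[start:start + size])
        start += size
    return out


if __name__ == "__main__":
    x, blocks = [4.0, 8.0], [1, 1]      # one velocity and one position variable
    print(anisotropic_norm(x, blocks))  # |4| + |8|^(1/3)
    print(anisotropic_norm(dilate(x, blocks, 3.0), blocks))
```

Under this assumption the norm is homogeneous of degree one with respect to the dilations: scaling the velocity by λ and the position by λ³ multiplies the norm by λ.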
Let us first recall that the parabolic Hörmander condition (1.1) is equivalent to the well-known Kalman rank condition for controllability in linear systems theory (cf., for instance, [22]). Also, it was shown in [14] that (1.1) is equivalent to B having the block-form where the *-blocks are arbitrary and B_j is a (d_{j−1} × d_j)-matrix of rank d_j with The block decomposition (2.1) induces naturally an anisotropic norm for x ∈ R^N defined as
• The anisotropic Hölder norms on R^N are defined recursively as We denote by C^α the set of functions g : R^N → R such that the norm ‖g‖_{C^α} is finite. Set also
• For t > 0, the anisotropic Hölder norms on S_t are We denote by , respectively, are finite.
Before introducing the intrinsic Hölder spaces, we recall the following
Definition 2.2 (Lie derivative). For t > 0 and α ∈ ]0, 2] we set .
Moreover, we say that f is a.e. Lie-differentiable along Y on S_t if there exists In that case, we set Y f = F and call it an a.e. Lie derivative of f on S_t.
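To make the notion of derivative along Y concrete, the following Python sketch approximates it by a difference quotient along the integral curve s ↦ (s, e^{(s−t)B}x). It is written for the prototype case only, assuming B = [[0, 0], [1, 0]] so that e^{hB}(v, x) = (v, x + hv), and for a smooth test function; the names `flow` and `lie_derivative` are illustrative.

```python
import math


def flow(h, v, x):
    """e^{hB}(v, x) for B = [[0, 0], [1, 0]]: frozen velocity, drifting position."""
    return v, x + h * v


def lie_derivative(f, t, v, x, h=1e-6):
    """Forward difference quotient of f along the integral curve of Y at (t, v, x)."""
    v_h, x_h = flow(h, v, x)
    return (f(t + h, v_h, x_h) - f(t, v, x)) / h


if __name__ == "__main__":
    f = lambda t, v, x: t * v + math.sin(x)
    # For this smooth f, the derivative along Y is d_t f + v d_x f = v + v cos(x).
    approx = lie_derivative(f, 0.3, 2.0, 1.0)
    exact = 2.0 + 2.0 * math.cos(1.0)
    print(approx, exact)
```

The quotient only probes f along the curve, which is why a derivative along Y can exist even when ∂_t f and the derivatives in the degenerate directions do not exist separately.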
Definition 2.3 (Intrinsic Hölder spaces). Let t > 0. The intrinsic Hölder norms on S_t are defined recursively as: For α ∈ ]0, 3], we denote by C^α_t the set of functions f : S_t → R such that the norm ‖f‖_{C^α_t} is finite.
Remark 2.4 (Intrinsic vs anisotropic spaces). Obviously, the intrinsic space C^α_t is strictly included in the anisotropic space L^∞_t(C^α). Note that, for α ∈ ]0, 1], the addition in the yields Hölder regularity jointly in the time and space variables: in particular, it is standard to show that if
Remark 2.5 (Intrinsic Taylor formula). For α ∈ ]0, 1], the intrinsic spaces C^α_t and C^{1+α}_t are equivalent to those in [20]. However, C^{2+α}_t is slightly weaker than the one in [20] in that the Lie derivative Y f is not required to be in ). This difference is dictated by our assumptions on the coefficients that are merely measurable in the temporal variable: so, for a solution u of (1.2), one may expect that Y u exists in the strong sense but, in general, is not more than bounded in the Y-direction. Despite this, C^{2+α}_t in Definition 2.3 is still strong enough to prove the following intrinsic Taylor formula as in [20, Theorem where T_2 f is the second order intrinsic Taylor polynomial Furthermore, by adding the term (τ − s)Y u(s, y) to T_2, the estimate can be improved by obtaining a term of order o(|τ − s|), as τ − s → 0, in place of |τ − s| in (2.3).
It is worth noting that, for f in the anisotropic space L^∞_t(C^{2+α}), estimate (2.3) generally holds only for s = τ: this is the best result that one can deduce from Schauder estimates of anisotropic-to-anisotropic type.

Main result
Indeed, u is not regular enough to support even the first-order derivatives ∂_{x_i} for d < i ≤ N, and the full gradient ∇u appearing in the drift of L must be interpreted in a suitable way. Thus, in accordance with the intrinsic Hölder spaces defined above, we employ the following natural definition of solution to the kinetic equation.
For g ∈ C(R^N), a solution to the Cauchy problem (1.2) is a solution u to (2.4), which can be extended continuously to ]0, T] × R^N with u(T, ·) = g.
Theorem 2.7. Let assumptions (H.1) and (H.2) be in force and assume also the coefficients there exists a unique bounded strong Lie solution u to the Cauchy problem (1.2). Furthermore, we have where C is a positive constant which depends only on T, B, ᾱ, α, β, γ and on the norms L^∞_T(C^ᾱ) of the coefficients of A. In particular, C does not depend on t.
We illustrate the Schauder estimate through particular instances; by linearity, we can treat the cases g = 0 and f = 0 separately: In particular, if γ = 0 then f is bounded and (2.6) holds true up to the boundary t = T:
• [Case f = 0] We have two extreme cases: , that is g is only bounded and continuous, then the solution has the same explosion behavior as the fundamental solution as t → T^− (cf. [17]), which is
⋄ If β = 2 + α then the solution is (2 + α)-Hölder continuous up to the boundary (2.7)
Estimate (2.7) entails a regularity result along Y at the boundary t = T, which is reported in the following
Corollary 2.8. Let the assumptions of Theorem 2.7 be in force with γ = 0. The solution u to the Cauchy
Remark 2.9. Recall that (see Remark 2.5) Theorem 2.7 does not imply, under the assumptions therein, that the Lie derivative Y u is jointly continuous in the time and space variables. However, under the additional assumption that f and a_{ij}, a_i, a are continuous on S_T, it can be seen by Definitions 2.3 and 2.6, together with Remark 2.4, that Y u turns out to be continuous on S_T. In particular, u is Lie differentiable along Y everywhere on S_T, and thus equation (2.4) is satisfied pointwise everywhere on S_T.

Comparison with the literature
Global anisotropic-to-anisotropic Schauder estimates were proved by Lunardi [18], Lorenzi [16], Priola [26], Menozzi [3] under different assumptions on the coefficients, stronger or equivalent to ours. These estimates read as follows: if u is a solution of the Cauchy problem (1.2) then Estimate (2.8) is similar to (2.5) for the particular choice β = 2 and γ = 0; however, by Remark 2.4, estimate (2.8) is weaker than (2.5) due to the strict inclusion of intrinsic into anisotropic spaces. Moreover, the above-mentioned papers assume a smooth datum, g ∈ C^{2+α}, missing the smoothing effect of the equation, which is well-known in the uniformly parabolic case (cf. [15]). On the other hand, the class of equations considered by Menozzi [3] allows for more general drift terms. Zhang [27] proved general estimates in the context of Besov-Hölder spaces, which, at order two, read as (2.9) Once more, this estimate is proved for the anisotropic Hölder norm and thus only catches the smoothing effect of the kernel with respect to the spatial variables. Also, (2.9) is proved in the case of A with constant coefficients.
Global intrinsic-to-intrinsic Schauder estimates were obtained by Imbert and Mouhot [10] for B as in (2.11), assuming coefficients in C^α_T. As noticed in [6], the results in [10] do not cover the case of some elementary smooth functions, for instance, d = 1 and f(t, x_1, x_2) = sin x_2. We also mention that interior estimates were obtained by Manfredini [19], Di Francesco and Polidoro [5], Henderson and Snelson [9] and very recently by Polidoro, Rebucci and Stroffolini [25] for operators with Dini continuous coefficients.
Global anisotropic-to-intrinsic Schauder estimates were obtained by Biagi and Bramanti [2] under the additional assumption that the *-blocks in (2.1) are null and therefore L is homogeneous w.r.t. a suitable family of dilations: if u is a solution to Lu = f then Estimate (2.10) is weaker than (2.5) because the norm on the l.h.s. of (2.10) is smaller than the norm in and ∇ 2 d u C α Y,T along the direction Y are missing. Moreover, the optimal regularity for the second derivatives ∇_d^2 u is obtained only locally. The closest results to ours were recently obtained by Dong and Yastrzhembskiy [6] for the Langevin operator in R^{2d+1}, that is for B in (2.1) in the particular form
$$B = \begin{pmatrix} 0 & 0 \\ I_d & 0 \end{pmatrix}, \tag{2.11}$$
where I_d is the d × d identity matrix, and for the Cauchy problem with null terminal datum, g = 0. The techniques in [6] are based on a kernel-free approach inspired by Campanato's ideas. Although the proof is quite different, the estimates are very similar to ours except that they miss the optimal regularity along Y of the first order derivatives ∇_d u: we remark that this piece of information is crucial to guarantee the validity of Taylor formulas like (2.3). As already mentioned, the latter is a basic tool to prove probabilistic results such as the Itô formula (cf. [12]), as well as analytical (cf. [10]) and asymptotic results (cf. [21]).

Proof of Theorem 2.7
The proof of Theorem 2.7 relies on the results of [17] where a fundamental solution to the operator L = A + Y was constructed in the form
p(t, x; T, y) = P(t, x; T, y) + Φ(t, x; T, y), 0 < t < T, x, y ∈ R^N,
where:
• the function P is a so-called parametrix, which is defined as where, for (τ, v) ∈ S_T, we set
• the function Φ(t, x; T, y) is a remainder enjoying suitable regularity estimates, which are recalled in Proposition 3.6 below.
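For intuition on the Gaussian structure behind a parametrix of this kind, the following Python sketch treats only the simplest instance: the prototype operator (1/2) a(t) ∂_vv + v ∂_x + ∂_t with a coefficient depending on time alone. In that case the transition kernel is Gaussian with covariance ∫_t^T a(r) e^{(T−r)B} E E^T e^{(T−r)B^T} dr, where B = [[0, 0], [1, 0]] and E = (1, 0)^T. This is the standard formula for linear stochastic differential equations and is not taken from the text; the names `covariance` and `a` are illustrative.

```python
def covariance(a, t, T, n=2000):
    """Midpoint-rule approximation of the covariance of (V_T, X_T) given time t.

    Assumption: dV = sqrt(a(r)) dW, dX = V dr, so that the integrand is
    a(r) * [[1, T - r], [T - r, (T - r)^2]].
    """
    dr = (T - t) / n
    c_vv = c_vx = c_xx = 0.0
    for k in range(n):
        r = t + (k + 0.5) * dr
        lag = T - r       # time left for the velocity noise to reach the position
        w = a(r) * dr
        c_vv += w
        c_vx += w * lag
        c_xx += w * lag * lag
    return [[c_vv, c_vx], [c_vx, c_xx]]


if __name__ == "__main__":
    s = 2.0
    C = covariance(lambda r: 1.0, 0.0, s)
    # Closed form for a = 1: [[s, s^2/2], [s^2/2, s^3/3]].
    print(C, [[s, s ** 2 / 2], [s ** 2 / 2, s ** 3 / 3]])
    # A coefficient that is only measurable in time is handled in the same way.
    print(covariance(lambda r: 1.0 if r < 1.0 else 4.0, 0.0, s))
```

For a ≡ 1 the determinant equals (T − t)^4/12, which is positive for t < T: the kernel is a non-degenerate Gaussian in both variables even though noise acts on the velocity alone, with the position variance scaling like (T − t)^3.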
The strategy of our proof is to define a candidate solution to the Cauchy problem (1.2) in the form and prove that u: (i) is the unique bounded solution to (1.2) and (ii) satisfies the estimate (2.5).
Note that u in (3.1) can be written as
We now prove our main result, Theorem 2.7. The proof is based on the sharp regularity estimates for V_g, V_{P,f} and V_{Φ,f} contained in Propositions 3.11 and 3.10.
Proof of Theorem 2.7 (well-posedness of (1.2)). The uniqueness of the solution follows from standard arguments: we refer to [7] for a detailed proof.
We prove that u as defined in (3.1) is a solution to the Cauchy problem (1.2) in the sense of Definition 2.6. The facts that ∇_d u, ∇_d^2 u have the required regularity on S_T, and that u can be extended continuously to the closure of S_T in a way that u(T, ·) ≡ g, are straightforward consequences of the estimates of Propositions 3.10-3.11 and of the Dirac delta property of p. To prove that Y u = f − Au in the sense of Definition 2.2, we can consider separately, by linearity, two cases. We state here, once and for all, that all the applications of Fubini's theorem throughout this proof are justified by the estimates of Propositions 3.10 and 3.11.
Case f ≡ 0. We have Therefore, by Fubini's theorem, we have .
By moving the operator A out of the integral in dηdτ , we obtain Au(r, e (r−t)B x)dr, and thus, to show (Y + A)u = 0 on S T , we need to prove that By applying Fubini's theorem we obtain Ap(r, e (r−t)B x; τ, η)f (τ, η)dη dr dτ.
As (Y + A)p(·, ·; τ, η) = 0 on S_τ, we obtain and since f(τ, ·) is bounded and continuous, the Dirac delta property of p yields On the other hand, the estimates of Lemma 3.4 and of Proposition 3.6 yield This and (3.4), by the Lebesgue dominated convergence theorem, yield (3.3).
Proof of Theorem 2.7 (estimate (2.5)). As u is a solution to (1.2), Y u = f − Au is a Lie derivative of u on S_T. Therefore, we have Now we recall that, for α ∈ ]0, 1], the intrinsic spaces C^α_t, C^{1+α}_t are exactly equivalent to those in [20]. This is a consequence of the intrinsic Taylor formula of Theorem 2.10 in the latter reference, with n = 0, 1. In particular, we have where Therefore, the estimates of Propositions 3.10 and 3.11 yield To obtain the same estimate for u which can be done by proceeding as in the proofs of (3.20) and (3.31): we omit the details for brevity.
Furthermore, by the regularity assumptions on the coefficients of A, we have Finally, we have f and thus (2.5).
The rest of the section is devoted to proving the regularity estimates employed in the proof of Theorem 2.7. Hereafter, we denote, indistinctly, by C any positive constant depending at most on T, B, ᾱ, α, β, γ and on the L^∞_T(C^ᾱ) norms of the coefficients of A. We also introduce the following
Notation 3.1. For any f = f(t, x; T, y) and i = 1, …, N, we set and we adopt analogous notations for the higher-order derivatives. Thus, ∂_i always denotes a derivative with respect to the first set of space variables. Some caution is necessary when considering the composition of f with a given function F = F(x): ∂_i f(t, F(x); T, y) denotes the derivative ∂_{z_i} f(t, z; T, y)|_{z=F(x)}, and similarly for higher order derivatives. We also denote by e_k the k-th element of the canonical basis of R^N.

Preliminary results
We first recall the useful result [ in the sense of Definition 2.6.
In order to state the next preliminary lemma, we fix the following (2j + 1) We have the following potential estimates, whose proof is identical to the one of [17, Proposition B.2].
By Lemma 3.4 and Proposition 3.6, we have the following direct consequence.
Proposition 3.7. For F = P, Φ, we have for any 0 < t < T and x ∈ R^N.
The following identities directly stem from the boundedness assumption on g and from Propositions 3.5 and 3.6.
In the sequel we will make use of the special functions 2 F 1 and B, which denote the Gaussian hypergeometric function and the incomplete Beta function, respectively.
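The two special functions are linked by the classical identity B(x; a, b) = (x^a / a) · 2F1(a, 1 − b; a + 1; x) for 0 ≤ x < 1, where B(x; a, b) = ∫_0^x u^{a−1}(1 − u)^{b−1} du is the non-regularized incomplete Beta function. The Python sketch below checks this identity numerically with elementary implementations. It illustrates the functions only, not the specific computations of this section, and the function names are illustrative.

```python
def hyp2f1(a, b, c, x, terms=200):
    """Gauss series sum_n (a)_n (b)_n / ((c)_n n!) x^n, convergent for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
    return total


def incomplete_beta(x, a, b, n=20000):
    """Midpoint rule for the non-regularized integral of u^(a-1) (1-u)^(b-1) on [0, x]."""
    du = x / n
    acc = 0.0
    for k in range(n):
        u = (k + 0.5) * du
        acc += u ** (a - 1) * (1 - u) ** (b - 1)
    return acc * du


if __name__ == "__main__":
    x, a, b = 0.5, 1.5, 0.5
    lhs = incomplete_beta(x, a, b)
    rhs = x ** a / a * hyp2f1(a, 1 - b, a + 1, x)
    print(lhs, rhs)  # the two values should agree up to quadrature error
```

Numerical libraries often expose the regularized incomplete Beta function instead, which differs from the one above by the factor 1/B(a, b).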
Remark 3.9. We recall the following known properties. For any γ ∈ [0, 1) and α ∈ (0, 1) we have:

Hölder estimates for V_{P,f} and V_{Φ,f}
In this section we prove the following Hölder estimates for V_{P,f} and V_{Φ,f}, on which the proof of Theorem 2.7 relies.
Proposition 3.10.For F = P, Φ, and for any i, j, k = 1, . . ., d, we have and for any 0 < t < s < T , x ∈ R N and h ∈ R.
Proof of Proposition 3.10 for F = P. Estimates (3.17We now fix 0 < t < T , x ∈ R N and prove (3.20) in two separate cases.
We consider I_1. By the mean-value theorem, there exists a real h̄ with |h̄| ≤ |h| such that Therefore, by the estimate (3.6) with ∂^κ_x = ∂_{ijk} we have Now, a direct computation yields where Γ_E and 2F1 denote, respectively, the Euler Gamma and the Gaussian hypergeometric functions. This, together with (3.23), 2h^2 ≤ T − t and Lemma 3.9-(a), proves We now consider I_2. By employing the triangle inequality and estimate (3.6) with ∂^κ_x = ∂_{ij} we obtain A direct computation yields where B denotes the incomplete Beta function. This, together with (3.24) and Lemma 3.9-(b), proves and thus (3.20) when 2h^2 ≤ T − t.
Case 2h^2 > T − t. By employing the triangle inequality and estimate (3.6) with ∂^κ_x = ∂_{ij} we obtain We now prove (3.21). By adding and subtracting, we have We now prove Set h := s − t and consider, once more, two separate cases.
Case 2h < T − s.We split the integral .
We estimate H_1. By the triangle inequality and (3.6) with ∂^κ_x = ∂_i we have (by a direct computation) Therefore by Lemma 3.9-(b) we have We now consider H_2. By Lemma 3.2 and by Fubini's theorem we have Therefore, the estimates (3.6) with [κ]_B = 3 and the regularity assumptions on the coefficients yield (as 2h ≤ T − s and by Lemma 3.9-(a)) which, together with (3.26), yields (3.25).
Case 2h ≥ T − s. By the triangle inequality, and by (3.6) with ∂^κ_x = ∂_i, we obtain which is (3.25). The proof of (3.22) is completely analogous, and thus is omitted for brevity.
We now prove (3.21).By adding and subtracting, we have We first bound the first integral.By applying (3.12) in the case s − t < τ − s, and (3.10) in the case s − t ≥ τ − s, we obtain As for L, estimate (3.10) yields (by integrating in η, and since T which, together with (3.27), proves (3.21).

Hölder estimates for V g
In this section we prove the following Hölder estimates for V g , on which the proof of Theorem 2.7 relies.
Recall that, by assumption, g ∈ C^β with β ∈ [0, 2 + α]. Therefore, for any fixed x ∈ R^N, the following truncated Taylor polynomials are well defined for any y ∈ R^N, where ψ is a cut-off function such that ψ(x) = 1 if |x|_B ≤ 1. We also set the remainder The next lemma is a straightforward consequence of the definition of the anisotropic norm ‖·‖_{C^β}.
By the first part of Theorem 2.7, V_g is the solution to the Cauchy problem (1.2) with f = 0. Therefore, it is easy to check that u_x is the solution to the Cauchy problem (1.2) with and terminal datum given by R g x,β. In particular (see (3.2)), u_x is of the form Therefore, owing to Proposition 3.10 and to (3.34), in order to prove the inequalities in Proposition 3.11, it is sufficient to prove them for V R g x,β, with an arbitrary x ∈ R^N.
Proof of Proposition 3.11. Let 0 < t < s < T, x ∈ R^N and h ∈ R be fixed. For brevity, we only prove (3.30), (3.31) and (3.33), the proofs of (3.28), (3.29) and (3.32) being simpler.
We first prove (3.30). By Remark 3.13, it is enough to prove the estimate for V R g ξ,β with ξ := e^{(T−t)B} x. By (3.16), which remains true for V R g ξ,β, and Lemma 3.12, we obtain We now prove (3.31) by considering two separate cases.
Case h^2 ≥ T − t. By employing the triangle inequality, we have (by (3.30) that we have just proved) where we used T − t ≤ h^2 in the last inequality.