Regularity for Subelliptic PDE Through Uniform Estimates in Multi-Scale Geometries

We review and extend a number of recent results addressing the stability of certain geometric and analytic estimates in the Riemannian approximation of subRiemannian structures. In particular we extend the recent work of the authors with Rea [19] and Manfredini [17] concerning stability of doubling properties, Poincaré inequalities, Gaussian estimates on heat kernels and Schauder estimates from the Carnot group setting to the general case of Hörmander vector fields.

distance function $d_0(\cdot,\cdot)$) with a one-parameter family of degenerating Riemannian metrics (associated to distance functions $d_\varepsilon(\cdot,\cdot)$), which converge in the Gromov-Hausdorff sense as $\varepsilon \to 0$ to the original one. This approximation is described in detail from the point of view of the distance functions in Sect. 2.2 and from the point of view of the Riemannian setting in Definition 3.7. The approximating distance functions $d_\varepsilon$ can be defined in terms of an extended generating frame of smooth vector fields $X_1^\varepsilon, \dots, X_p^\varepsilon$, with $p \ge n$ and $X_i^\varepsilon = X_i$ for $i = 1, \dots, m$, that converges/collapses uniformly on compact sets to the original family $X_1, \dots, X_m$ as $\varepsilon \to 0$. This frame includes all the higher order commutators needed to bracket generate the tangent space. When coupled with uniform estimates, this method provides a strategy to extend known Riemannian results to the subRiemannian setting. Such approximations have been widely used since the mid-80's in a variety of contexts. As examples we recall the work of Debiard [33], Koranyi [55,56], Ge [45], Rumin [77] as well as the references in [67] and [68]. More recently this technique has been used in the study of minimal surfaces and mean curvature flow in the Heisenberg group, starting from the existence theorems of Pauls [71] and Cheng et al. [24] and continuing to the regularity results by Manfredini and the authors [15,16]. Our work is largely inspired by the results of Manfredini and one of us [27], where the Nagel et al. estimates for the fundamental solution of subLaplacians have been extended to the Riemannian approximants uniformly as $\varepsilon \to 0$.

In the following we list in more detail the nature of the stability estimates we investigate. Given a Riemannian manifold $(M^n, g)$, with a smooth Riemannian volume form expressed in local coordinates $(x_1, \dots, x_n)$ as $d\mathrm{vol} = \sqrt{g}\, dx_1 \cdots dx_n$, one can consider the corresponding heat operator $L_g$ acting on functions $u : M \to \mathbb{R}$. The study of such operators is closely related to certain geometric and analytic estimates, namely: for $K \subset\subset M$ and $r_0 > 0$ there exist positive constants $C_D, C_P, \dots$ below, depending on $K, r_0, g$, such that for all $x \in K$ and $0 < r < r_0$ one has

• (Doubling property) $\mathrm{vol}(B(x, r)) \ge C_D\, \mathrm{vol}(B(x, 2r))$; (1.2)

• (Poincaré inequality) $\int_{B(x,r)} |u - u_{B(x,r)}|\, d\mathrm{vol} \le C_P\, r \int_{B(x,2r)} |\nabla_g u|\, d\mathrm{vol}$;

• (Gaussian estimates) If $h_g$ denotes the heat kernel of $L_g$, $x, y \in M$ and $t > 0$, one has
$$C_g^{-1} \big(\mathrm{vol}(B(x, \sqrt{t}))\big)^{-n/2} \exp\Big(A_g \frac{d(x, y)^2}{t}\Big) \le |h(x, y, t)| \le C_g \big(\mathrm{vol}(B(x, \sqrt{t}))\big)^{-n/2} \exp\Big(B_g \frac{d(x, y)^2}{t}\Big) \quad (1.3)$$
and if appropriate curvature conditions hold
$$|\partial_t^s \partial_{i_1} \cdots \partial_{i_k} h(x, y, t, s)| \le C_{s,k,g}\, t^{-s-\frac{k}{2}} \big(\mathrm{vol}(B(x, t - s))\big)^{-n/2} \exp\Big(B_g \frac{d(x, y)^2}{t - s}\Big); \quad (1.4)$$

• (Parabolic Harnack inequality) If $L_g u = 0$ in $Q = M \times (0, T)$ and $u \ge 0$ then
$$\sup_{B(x,r) \times (t - r^2,\, t - r^2/2)} u \le C_g \inf_{B(x,r) \times (t + r^2/2,\, t + r^2)} u. \quad (1.5)$$

The connections between such estimates were made evident in the work of Saloff-Coste [78] and Grigoryan [46], who independently established the equivalence between the conjunction of the doubling property and the Poincaré inequality on one side and the parabolic Harnack inequality on the other. See also related works by Biroli and Mosco [6], and Sturm [80]. This paper aims at describing the behavior of such estimates along a sequence of metrics $g_\varepsilon$ that collapse to a subRiemannian structure as $\varepsilon \to 0$. We will prove that the estimates are stable as $\varepsilon \to 0$ and explore some of the consequences of this stability. Although, thanks to the work of Jerison [52], Nagel et al. [70] and Jerison and Sanchez-Calle [53], the Poincaré inequality, the doubling property and the Gaussian bounds are well known for subRiemannian structures, it is not immediate that they continue to hold uniformly in the approximation as $\varepsilon \to 0$. For one thing, the Riemannian curvature tensor is unbounded as $\varepsilon \to 0$, thus preventing the use of Li-Yau's estimates.
Moreover, as $\varepsilon \to 0$ the Hausdorff dimension of the metric spaces $(M, d_\varepsilon)$, where $d_\varepsilon$ denotes the distance function associated to $g_\varepsilon$, typically does not remain constant and in fact increases at $\varepsilon = 0$ to the homogeneous dimension associated to the subRiemannian structure. The term multiscale from the title reflects the fact that the blow up of the metric as $\varepsilon \to 0$ is Riemannian at scales less than $\varepsilon$ and subRiemannian at larger scales.
To illustrate our work we introduce a prototype for the class of spaces we investigate: the manifold $M = \mathbb{R}^2 \times S^1$, with coordinates $(x_1, x_2, \theta)$. The horizontal distribution is given by $\Delta = \mathrm{span}\{X_1, X_2\}$, with $X_1 = \cos\theta\, \partial_{x_1} + \sin\theta\, \partial_{x_2}$ and $X_2 = \partial_\theta$.
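The bracket-generating property of this prototype can be checked numerically. The following is a minimal sketch, not part of the original argument: it approximates the commutator $[X_1, X_2]$ by central finite differences and verifies that $X_1$, $X_2$ and $[X_1, X_2]$ span the tangent space at a sample point. The helper names (`directional_derivative`, `bracket`, `triple_product`) and the step size are illustrative assumptions.

```python
import math

def X1(p):
    # Coefficients of X1 = cos(theta) d/dx1 + sin(theta) d/dx2 at p = (x1, x2, theta)
    _, _, th = p
    return (math.cos(th), math.sin(th), 0.0)

def X2(p):
    # Coefficients of X2 = d/dtheta
    return (0.0, 0.0, 1.0)

def directional_derivative(field, direction, p, h=1e-5):
    # Central difference of the coefficient vector of `field` at p along `direction`
    fwd = field(tuple(pi + h * di for pi, di in zip(p, direction)))
    bwd = field(tuple(pi - h * di for pi, di in zip(p, direction)))
    return tuple((f - b) / (2 * h) for f, b in zip(fwd, bwd))

def bracket(A, B, p):
    # In coordinates, [A, B]^k = A(B^k) - B(A^k)
    dB = directional_derivative(B, A(p), p)
    dA = directional_derivative(A, B(p), p)
    return tuple(b - a for b, a in zip(dB, dA))

def triple_product(a, b, c):
    # a . (b x c), i.e. the determinant of the 3x3 matrix with columns a, b, c
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            + a[1] * (b[2] * c[0] - b[0] * c[2])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

p = (0.3, -1.2, 0.7)          # arbitrary sample point (hypothetical choice)
comm = bracket(X1, X2, p)     # approximately (sin(theta), -cos(theta), 0)
print(comm, triple_product(X1(p), X2(p), comm))
```

A nonzero triple product at every point is exactly Hörmander's condition with one commutator, which is why a single extra direction suffices in the Riemannian approximation of this example.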
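The subRiemannian metric $g_0$ is defined so that $X_1$ and $X_2$ form an orthonormal basis. This is the group of Euclidean isometries defined below in Example 2.1. For each $\varepsilon > 0$ we also consider the Riemannian metric $g_\varepsilon$ on $M$ uniquely defined by the requirement that $X_1, X_2, \varepsilon X_3$ is an orthonormal basis, with $X_3 = -\sin\theta\, \partial_{x_1} + \cos\theta\, \partial_{x_2}$. Denote by $d_\varepsilon$ the corresponding Riemannian distance, by $X_i^{\varepsilon *}$ the adjoint of the $i$-th vector $X_i^\varepsilon$ of this frame with respect to Lebesgue measure, and by $\Gamma_\varepsilon$ the fundamental solution of the Laplace-Beltrami operator $L_\varepsilon = \sum_{i=1}^{3} X_i^{\varepsilon *} X_i^\varepsilon$. Since $L_\varepsilon$ is uniformly elliptic, there exist $C_\varepsilon, R_\varepsilon > 0$ such that for $d_\varepsilon(x, y) < R_\varepsilon$ the fundamental solution satisfies $C_\varepsilon^{-1} d_\varepsilon(x, y)^{-1} \le \Gamma_\varepsilon(x, y) \le C_\varepsilon\, d_\varepsilon(x, y)^{-1}$.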
As $\varepsilon \to 0$ this estimate will degenerate in the following way: $R_\varepsilon \to 0$, $C_\varepsilon \to \infty$, and for $\varepsilon = 0$ one will eventually have $\Gamma_0(x, y) \approx d_0(x, y)^{-2}$.
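As a result of the work in [70] one has that for each $\varepsilon > 0$ there exists $C_\varepsilon > 0$ such that
$$C_\varepsilon^{-1} \frac{d_\varepsilon(x, y)^2}{|B_\varepsilon(x, d_\varepsilon(x, y))|} \le \Gamma_\varepsilon(x, y) \le C_\varepsilon \frac{d_\varepsilon(x, y)^2}{|B_\varepsilon(x, d_\varepsilon(x, y))|}.$$
The main result of [27] was to provide stable bounds for the fundamental solution by proving that one can choose $C_\varepsilon$ independent of $\varepsilon$ as $\varepsilon \to 0$. In this paper we extend such stable bounds to the degenerate parabolic setting and to the more general subRiemannian setting.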
Since our results will be local in nature, unless explicitly stated we will always assume that $M = \mathbb{R}^n$ and use the Lebesgue measure as volume. The first result we present is due to Rea and the authors [18] and concerns the stability of the doubling property.

Here we have denoted by $B_\varepsilon$ the balls related to the $d_\varepsilon$ distance function.
We present here a rather detailed proof of this result, amending some minor gaps in the exposition in [18]. If the subRiemannian structure is equiregular, as an original contribution of this paper, in Theorem 3.10 we also present a quantitative version of this result, by introducing explicit quasi-norms equivalent to $d_\varepsilon$. These families of quasi-norms play a role analogous to the one played by the Koranyi gauge quasi-norm (2.5) in the Heisenberg group. We also sketch the proof of the stability of Jerison's Poincaré inequality from [18], that is, of an inequality of the form
$$\int_{B_\varepsilon(x,r)} |u - u_{B_\varepsilon(x,r)}|\, dx \le C_P\, r \int_{B_\varepsilon(x,2r)} |\nabla_\varepsilon u|\, dx$$
with a constant $C_P$ depending on $K$, $\varepsilon_0$ and the subRiemannian structure, but independent of $\varepsilon$. Here we have denoted by $\nabla_\varepsilon u$ the gradient of $u$ along the frame $X_1^\varepsilon, \dots, X_p^\varepsilon$.
Our next result concerns the stability, as $\varepsilon \to 0$, of the Gaussian estimates for the heat kernels associated to the family of second order, subelliptic differential equations in non-divergence form $L_{\varepsilon,A} u = 0$, with spatial part $\sum_{i,j=1}^{p} a^\varepsilon_{ij} X_i^\varepsilon X_j^\varepsilon u$, in a cylinder $Q = \Omega \times (0, T)$. Here $\{a^\varepsilon_{ij}\}_{i,j=1,\dots,p}$ is a constant real matrix satisfying the ellipticity bounds (1.6) for all $\xi \in \mathbb{R}^p$, uniformly in $\varepsilon > 0$, and (1.7) for all $\xi \in \mathbb{R}^m$ and $\varepsilon > 0$, with a fixed constant $\Lambda > 0$.

Theorem 1.3 Let $K \subset\subset \mathbb{R}^n$, $\Lambda > 0$ and $\varepsilon_0 > 0$. The fundamental solution $\Gamma_{\varepsilon,A}$ of the operator $L_{\varepsilon,A}$ is a kernel with exponential decay of order 2, uniform with respect to $\varepsilon \in [0, \varepsilon_0]$ and for any coefficient matrix $A$ satisfying the bounds above for the fixed $\Lambda > 0$. In particular, the following estimates hold:

• For every $K \subset\subset \Omega$ there exists a constant $C_\Lambda > 0$ depending on $\Lambda$ but independent of $\varepsilon \in [0, \varepsilon_0]$ and of the matrix $A$ such that for each $\varepsilon \in [0, \varepsilon_0]$, $x, y \in K$ and $t > 0$ one has (1.8)

• For $s \in \mathbb{N}$ and any $k$-tuple $(i_1, \dots, i_k) \in \{1, \dots, m\}^k$ there exists a constant $C_{s,k} > 0$ depending only on $k, s, X_1, \dots, X_m, \Lambda$ such that for all $x, y \in K$ and $t > 0$.

• For any $A_1, A_2 \in M_\Lambda$, $s \in \mathbb{N}$ and any $k$-tuple $(i_1, \dots, i_k) \in \{1, \dots, m\}^k$ there exists $C_{s,k} > 0$ depending only on $k, s, X_1, \dots, X_m, \Lambda$ such that as $\varepsilon \to 0$ uniformly on compact sets and in a dominated way on subcompacts of $\Omega$.
This theorem extends to the general setting of Hörmander vector fields analogous results proved by Manfredini and the authors in [17] in the setting of Carnot groups.
In a similar fashion, one of our main results in this paper is the extension to the Hörmander vector fields setting of the Carnot group Schauder estimates established in previous work with Manfredini in [17]. To prove such an extension we combine the Gaussian bounds above with a refined version of the Rothschild and Stein [76] freezing and lifting scheme, adapted to the multi-scale setting, to establish Schauder type estimates which are uniform in $\varepsilon \in [0, \varepsilon_0]$, for the family of second order, subelliptic differential equations in non-divergence form in a cylinder $Q = \Omega \times (0, T)$. Our standing assumption is that the coefficients of the operator satisfy (1.6) and (1.7) for some fixed $\Lambda > 0$.

Theorem 1.4
Let $\alpha \in (0, 1)$, $f \in C^\infty(Q)$ and let $w$ be a smooth solution of $L_{\varepsilon,A} w = f$ on $Q$. Let $K$ be a compact set such that $K \subset\subset Q$, set $2\delta = d_0(K, \partial_p Q)$ and denote by $K_\delta$ the $\delta$-tubular neighborhood of $K$. Assume that there exists a constant $C > 0$ such that

for some value $k \in \mathbb{N}$ and for every $\varepsilon \in [0, \varepsilon_0]$. There exists a constant $C_1 > 0$ depending on $\alpha, C, \varepsilon_0, \delta$, and the constants in Proposition 5.2, but independent of $\varepsilon$, such that

Here we have set

and if $k \ge 1$ we have let $u \in C^{k,\alpha}_{\varepsilon,X}(Q)$ if for all $i = 1, \dots, m$ one has $X_i u \in C^{k-1,\alpha}_{\varepsilon,X}(Q)$.
Analogous estimates in the $L^p$ spaces, for operators independent of $\varepsilon$, are well known (see for instance [76] for the constant coefficient case and [9] for the Carnot group setting). Our result yields a stable version, as $\varepsilon \to 0$, of such estimates, which is valid for any family of Hörmander vector fields.

Theorem 1.5 Let $\alpha \in (0, 1)$, $f \in C^\infty(Q)$ and let $w$ be a smooth solution of $L_{\varepsilon,A} w = f$ on $Q$. Let $K$ be a compact set such that $K \subset\subset Q$, set $2\delta = d_0(K, \partial_p Q)$ and denote by $K_\delta$ the $\delta$-tubular neighborhood of $K$. Assume that there exists a constant $C > 0$ such that

for some value $k \in \mathbb{N}$ and for every $\varepsilon \in [0, \varepsilon_0]$. For any $p > 1$, there exists a constant $C_1 > 0$ depending on $p, \alpha, C, \varepsilon_0, \delta$, and the constants in Proposition 5.2, but independent of $\varepsilon$, such that

Here we have set

We conclude the paper with two related groups of applications of our stability results. In the first we recall the notion of $p$-admissible structure (Definition 7.1), originally introduced by Hajlasz and Koskela in [48]. This class of spaces supports a rich analytic structure and allows for the development of a first-order (in the sense of derivatives up to order one) potential theory. We review some recent results by the authors and collaborators [1,18] concerning Harnack inequalities for weak solutions of classes of quasilinear degenerate parabolic PDE in such spaces. The main point of the section is that, in view of Theorems 1.1 and 1.2, the Riemannian approximations of a subRiemannian structure satisfy the hypotheses of a $p$-admissible structure uniformly in $\varepsilon \ge 0$. Consequently, the Harnack inequalities hold uniformly across all scales. This provides a powerful technique in the study of degenerate elliptic and parabolic problems through the process of regularization and approximation. To exemplify this observation in a simple case, we consider approximating Riemannian metrics $g_\varepsilon$ with generating frame $X_1^\varepsilon, \dots, X_n^\varepsilon$ defined in an open set $\Omega \subset \mathbb{R}^n$, and a family of divergence form parabolic linear equations analogous to (1.12), i.e.

in a cylinder $Q = \Omega \times (0, T)$. We assume that the coefficients of the operator depend smoothly on $u$ and satisfy (1.6) and (1.7) for some fixed $\Lambda > 0$. Thanks to the stability estimates one can prove that there exist positive constants $C, R > 0$ depending only on the fixed subRiemannian structure, but independent of $\varepsilon$, such that for any $\varepsilon \ge 0$ and any non-negative weak solution $u \ge 0$ of (1.13), one has a parabolic Harnack inequality for $u$, with constant $C$, on any metric ball $B_\varepsilon(x, 4r) \subset \Omega$ with $0 < r < R$. Clearly this yields Hölder regularity for $u$ that is stable as $\varepsilon \to 0$. Applying the Schauder estimates from Theorem 1.4 one obtains higher order regularity, uniformly as $\varepsilon \to 0$, and so in particular we obtain smoothness of solutions in the case $\varepsilon = 0$. For further details and for a more general version of this result, applied to weak solutions of quasilinear equations, we refer the reader to Theorem 7.5.
In the last section we discuss one of the motivating applications of our work. We outline how the structure stability results, the stability of the Schauder estimates and of the Harnack inequalities can be used to prove regularity and long time existence theorems for solutions of the subRiemannian mean curvature flow and the total curvature flow of graphs over bounded sets in step 2 Carnot groups and even in some non-nilpotent Lie groups. This is part of the work developed by the authors jointly with Maria Manfredini in [14,17]. The notion of horizontal, or p-mean, curvature has arisen in the last 10 years thanks to the work of many researchers. The two main motivations are Pansu's conjecture, concerning the isoperimetric profile of the Heisenberg group [19,26,43,49,51,69,73-75], and the existence, regularity and uniqueness of minimal surfaces [20-25,31,32,50,71,72]. The mean curvature flow and the total curvature flow arise in connection with gradient descent for the perimeter functional and as such can be used for both applications. Very little is known about either flow in the subRiemannian setting and, as far as we know, the results in [14,17] are the first to establish existence of long time smooth flows. For other contributions to this topic, from different points of view, we recall the recent work in [35,36].

Definitions and preliminary results
Let $X = (X_1, \dots, X_m)$ denote a collection of smooth vector fields defined in an open subset $\Omega \subset \mathbb{R}^n$ satisfying Hörmander's finite rank condition (1.1), that is, there exists an integer $s$ such that the set of all vector fields, along with their commutators up to order $s$, spans $\mathbb{R}^n$ at every point in $\Omega$:
$$\mathrm{rank}\, \mathrm{Lie}(X_1, \dots, X_m)(x) = n, \quad \text{for all } x \in \Omega. \quad (2.1)$$
The standard example for such families is the Heisenberg group $\mathbb{H}^1$. This is a Lie group whose underlying manifold is $\mathbb{R}^3$, endowed with a group law with respect to which the horizontal vector fields $X_1$, $X_2$ are left-invariant. Together with their commutator $[X_1, X_2] = 2\partial_{x_3}$ they yield a basis of $\mathbb{R}^3$. A second example is given by the classical group of rigid motions of the plane, also known as the roto-translation group $RT$. This is a Lie group with underlying manifold $\mathbb{R}^2 \times S^1$ and group law
$$(x_1, x_2, \theta_1)(y_1, y_2, \theta_2) = (x_1 + y_1 \cos\theta_1 - y_2 \sin\theta_1,\ x_2 + y_1 \sin\theta_1 + y_2 \cos\theta_1,\ \theta_1 + \theta_2).$$
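As an illustration of how the roto-translation group law determines the frame, the sketch below (an illustrative check, not taken from the original text) implements the multiplication, tests associativity on sample points, and recovers left-invariant vector fields by differentiating $s \mapsto g \cdot (s\,v)$ at $s = 0$. The function names and sample points are hypothetical, and the angle is treated as a real number rather than reduced modulo $2\pi$.

```python
import math

def rt_mul(g, h):
    # Roto-translation group law on R^2 x S^1 (angle kept as a real number here)
    x1, x2, t1 = g
    y1, y2, t2 = h
    return (x1 + y1 * math.cos(t1) - y2 * math.sin(t1),
            x2 + y1 * math.sin(t1) + y2 * math.cos(t1),
            t1 + t2)

def left_invariant_field(g, v, h=1e-6):
    # d/ds of g * (s v) at s = 0, via a central difference
    plus = rt_mul(g, tuple(h * c for c in v))
    minus = rt_mul(g, tuple(-h * c for c in v))
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

g = (0.4, -0.9, 1.1)   # sample group elements (arbitrary)
a = (1.5, 0.2, -0.3)
b = (-0.7, 2.0, 0.8)

lhs = rt_mul(rt_mul(g, a), b)
rhs = rt_mul(g, rt_mul(a, b))
print(max(abs(u - w) for u, w in zip(lhs, rhs)))   # associativity defect, near 0

print(left_invariant_field(g, (1.0, 0.0, 0.0)))    # ~ (cos t, sin t, 0)
print(left_invariant_field(g, (0.0, 0.0, 1.0)))    # ~ (0, 0, 1)
```

Under these assumptions the two printed fields agree with $\cos\theta\,\partial_{x_1} + \sin\theta\,\partial_{x_2}$ and $\partial_\theta$, the horizontal frame used for this group in the introduction.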
Following Nagel, Stein and Wainger [70, page 104] we define

letting $X^{(k)}$ denote the set of all commutators of order $k = 1, \dots, r$. Indicate by $Y_1, \dots, Y_p$ an enumeration of the components of $X^{(1)}, X^{(2)}, \dots, X^{(r)}$. If we instead consider the vectors arising from the group of roto-translations one has

Example 2.3
Note that the sets X (i) may have non-trivial intersection. For instance, consider the vector fields In this case r = 3 and

Carnot-Caratheodory distance
For each $x, y \in \Omega$ and $\delta > 0$ denote by $\Gamma(\delta)$ the space of all absolutely continuous curves $\gamma : [0, 1] \to \mathbb{R}^n$ joining $x$ to $y$ (i.e., $\gamma(0) = x$ and $\gamma(1) = y$) which are tangent a.e. to the horizontal distribution $\mathrm{span}\{X_1, \dots, X_m\}$, and such that if we write

In [70], the authors introduce several other distances that eventually are proved to be equivalent to $d_0(x, y)$. The equivalence itself yields new insight into the Carnot-Caratheodory distance. Because of this, we will remind the reader of one of these distances. For each $x, y \in \Omega$ and $\delta > 0$ denote by $\hat\Gamma(\delta)$ the space of all absolutely continuous curves $\gamma : [0, 1] \to \mathbb{R}^n$ joining $x$ to $y$ and such that if one writes

It is fairly straightforward (see [70, Proposition 1.1]) to see that

Proposition 2.4
The function $\hat d$ is a distance function in $\Omega$ and for any $K \subset\subset \Omega$ there exists $C = C(X_1, \dots, X_m, K) > 0$ such that

It is far less trivial to prove the following (see [70, Theorem 4]).

Theorem 2.5
The distance functions $d_0$ and $\hat d$ are equivalent.

The approximating distances
There are several possible definitions for Riemannian distance functions which approximate a Carnot-Caratheodory metric in the Gromov-Hausdorff sense.

Definition 2.6
Let $\{Y_1, \dots, Y_p\}$ be a generating family of vector fields constructed as in (2.2) from a family of Hörmander vector fields $X_1, \dots, X_m$. For every $\varepsilon > 0$ denote by $d_\varepsilon(\cdot, \cdot)$ the Carnot-Caratheodory metric associated to the family of vector fields $(X_1^\varepsilon, \dots, X_p^\varepsilon)$, defined as

We will also define an extension of the degree function, setting $d_\varepsilon(i) = 1$ for all $i \le p$, and $d_\varepsilon(i) = d(i - p + m)$ if $i \ge p + 1$. In order to simplify notation we will denote $X = X^0$, $d_0 = d$ and use the same notation for both families of vector fields (dependent or independent of $\varepsilon$).
Note that for every $\varepsilon \in (0, \bar\varepsilon)$ the set $\{X_i^\varepsilon\}$ extends the original family of vector fields $(X_i)$ to a new family of vector fields satisfying assumption (I) on page 107 of [70], i.e. there exist smooth functions $c^l_{jk}$, depending on $\varepsilon$, such that

span $\mathbb{R}^n$ at every point.

Remark 2.7
Note that the coefficients $c^l_{jk}$ will be unbounded as $\varepsilon \to 0$. In principle this could be a problem, as the doubling constant in the proof in [70] depends indirectly on the $C^r$ norm of these functions. In this survey we will describe a result, originally proved in [18], showing that this is not the case.

Remark 2.8
It follows immediately from the definition that for fixed $x, y \in \Omega$ the function $d_\varepsilon(x, y)$ is decreasing in $\varepsilon$ and for every $\varepsilon \in (0, \bar\varepsilon)$,

Remark 2.9
Let us consider a special case where $\dim \mathrm{span}(X_1, \dots, X_m)$ is constant and the vector fields $X_1, \dots, X_p$ are chosen to be linearly independent in $\Omega$. In this case we can consider two positive definite symmetric quadratic forms $g_0$ and $\lambda$ defined respectively on the distribution $H(x) = \mathrm{span}(X_1, \dots, X_m)(x)$, for $x \in \Omega$, and on $H^\perp(x)$. The product metric $g_0 \oplus \lambda$ is then a Riemannian metric on all of $T\Omega$. The form $g_0$ is called a subRiemannian metric on $\Omega$, corresponding to $H$. Next, for every $\varepsilon \in (0, \bar\varepsilon]$ consider the rescaled metric $g_\varepsilon := g_0 \oplus \varepsilon^{-1} \lambda$ and the corresponding Riemannian distance function in $\Omega$. The latter is bi-Lipschitz equivalent to the distance $d_\varepsilon$ defined above. In [45, Theorem 1.1] Ge proved that, as metric spaces, the sequence $(\Omega, d_\varepsilon)$ converges to $(\Omega, d_0)$ as $\varepsilon \to 0$ in the sense of Gromov-Hausdorff. In this limit the Hausdorff dimension of the space degenerates from coinciding with the topological dimension, for $\varepsilon > 0$, to a value $Q > n$ which may change from open set to open set. We will go into more detail about this point in the next section. In this sense the limiting approximation scheme we are using can be described by the collapsing of a family of Riemannian metrics to a subRiemannian metric. See also [68, Theorem 1.2.1] for yet another related Riemannian approximation scheme.

Remark 2.10
From a different perspective, note that the subLaplacian associated to the family $X_1^\varepsilon, \dots, X_p^\varepsilon$, i.e. $L_\varepsilon u = \sum_{i=1}^{p} (X_i^\varepsilon)^2 u$, is an elliptic operator for all $\varepsilon > 0$, degenerating to a subelliptic operator for $\varepsilon = 0$.
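The loss of ellipticity as $\varepsilon \to 0$ can be seen on the principal symbol. The sketch below is a minimal illustration under explicit assumptions: it uses one common normalization of the Heisenberg frame, $X_1 = \partial_{x_1} - x_2\partial_{x_3}$, $X_2 = \partial_{x_2} + x_1\partial_{x_3}$, together with the rescaled vertical field $\varepsilon\,\partial_{x_3}$ (both choices are assumptions made for this example, not formulas taken from the text). It builds the symbol matrix $A_\varepsilon = \sum_i X_i^\varepsilon \otimes X_i^\varepsilon$ and evaluates its determinant.

```python
def heisenberg_frame(x, eps):
    # Assumed normalization: X1 = d1 - x2 d3, X2 = d2 + x1 d3, X3^eps = eps d3
    x1, x2, _ = x
    return [(1.0, 0.0, -x2), (0.0, 1.0, x1), (0.0, 0.0, eps)]

def symbol_matrix(frame):
    # A = sum_i v_i v_i^T, the principal symbol of sum_i (X_i)^2
    return [[sum(v[r] * v[c] for v in frame) for c in range(3)] for r in range(3)]

def det3(m):
    # Determinant of a 3x3 matrix given as a list of rows
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

x = (0.5, -2.0, 1.0)   # arbitrary sample point
for eps in (1.0, 0.1, 0.01, 0.0):
    # The determinant equals eps**2 up to rounding: positive for eps > 0, zero at eps = 0
    print(eps, det3(symbol_matrix(heisenberg_frame(x, eps))))
```

Since the determinant of the symbol is the square of the determinant of the frame, it scales like $\varepsilon^2$ in this model, so any ellipticity constant obtained from it alone blows up as $\varepsilon \to 0$; this is one way to see why estimates uniform in $\varepsilon$ require a separate argument.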

A special case: the Heisenberg group H 1
In this section we describe the behavior of the distance $d_\varepsilon$ (and of the corresponding metric balls $B_\varepsilon(x, r)$) as $\varepsilon \to 0$, by looking at the special case of the Heisenberg group.
In this setting we will also provide an elementary argument showing that the doubling property holds uniformly as $\varepsilon \to 0$.
Consider the vector fields from Example 2.1. The Carnot-Carathéodory distance $d_0$ associated to the subRiemannian metric defined by the orthonormal frame $X_1, X_2$ is equivalent to a more explicitly defined pseudo-distance function, which we call gauge distance, defined as
$$|x|^4 = (x_1^2 + x_2^2)^2 + x_3^2, \quad \text{and} \quad \rho(x, y) = |y^{-1} x|, \quad (2.5)$$
where $y^{-1} x$ is the Heisenberg group multiplication.

Lemma 2.11
For each $x \in \mathbb{R}^3$,
$$A^{-1} |x| \le d_0(x, 0) \le A |x| \quad (2.6)$$
for some constant $A > 0$.

Remark 2.12
Since the Heisenberg group is a Lie group, it is natural to use a left-invariant volume form to measure the size of sets, namely the Haar measure. It is not difficult to see [29] that the Haar measure coincides with the Lebesgue measure in $\mathbb{R}^3$. It follows immediately from the previous lemma that the corresponding volume of a ball $B(x, r)$ is comparable to $r^4$. As a consequence one can show that the Hausdorff dimension of the metric space $(\mathbb{H}^1, d_0)$ is 4. The Hausdorff dimension of any horizontal curve (i.e. tangent to the distribution generated by $X_1$ and $X_2$) is 1, while the Hausdorff dimension of the vertical $x_3$-axis is 2.
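The exponent 4 can be read off from the anisotropic dilations $\delta_\lambda(x) = (\lambda x_1, \lambda x_2, \lambda^2 x_3)$. The following sketch assumes the gauge $|x|^4 = (x_1^2 + x_2^2)^2 + x_3^2$ (one standard normalization of the Koranyi gauge, taken here as an assumption) and checks that the gauge is homogeneous of degree 1 under $\delta_\lambda$, while the Jacobian of $\delta_\lambda$ is $\lambda^4$. Function names are illustrative.

```python
def gauge(x):
    # Assumed Koranyi-type gauge: |x|^4 = (x1^2 + x2^2)^2 + x3^2
    x1, x2, x3 = x
    return ((x1 * x1 + x2 * x2) ** 2 + x3 * x3) ** 0.25

def dilate(x, lam):
    # Anisotropic dilation: degree 1 in the horizontal variables, degree 2 in x3
    return (lam * x[0], lam * x[1], lam * lam * x[2])

def dilation_jacobian(lam):
    # Determinant of diag(lam, lam, lam^2)
    return lam * lam * lam ** 2

x = (0.3, -0.8, 0.5)    # arbitrary sample point
lam = 2.0
print(gauge(dilate(x, lam)), lam * gauge(x))   # equal up to rounding
print(dilation_jacobian(lam))                  # lam**4
```

Because the gauge ball of radius $r$ is the image of the unit gauge ball under $\delta_r$, its Lebesgue measure is $r^4$ times that of the unit ball, which matches the homogeneous dimension $1 + 1 + 2 = 4$.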
Next we turn our attention to the balls in the metrics $g_\varepsilon$ and the associated distance functions $d_\varepsilon$. To better describe the approximate shape of such balls we define the pseudo-distance function $d_{G,\varepsilon}(x, y) = N_\varepsilon(y^{-1} x)$ corresponding to the regularized gauge function

Our next goal is to show that the Riemannian distance function $d_\varepsilon$ is well approximated by the gauge pseudo-distance $d_{G,\varepsilon}$.

Lemma 2.13
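There exists $A > 0$ independent of $\varepsilon$ such that for all $x, y \in \mathbb{R}^3$
$$A^{-1} d_{G,\varepsilon}(x, y) \le d_\varepsilon(x, y) \le A\, d_{G,\varepsilon}(x, y). \quad (2.9)$$
The estimate (2.9) yields immediately

Corollary 2.14 The doubling property holds uniformly in $\varepsilon > 0$.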
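To see why a two-sided comparison with an explicit gauge gives a doubling constant that does not depend on $\varepsilon$, one can look at a model volume function. The sketch below assumes, as a modeling hypothesis that is not stated in the text above, that $d_\varepsilon$-balls of radius $r$ are comparable to boxes of size $r \times r \times \max(r^2, \varepsilon r)$: Euclidean-like in the vertical direction below scale $\varepsilon$ and subRiemannian above it. The function names are hypothetical.

```python
def model_volume(r, eps):
    # Assumed model: ball ~ box of size r x r x max(r^2, eps * r)
    return r * r * max(r * r, eps * r)

def doubling_ratio(r, eps):
    # V(2r) / V(r) for the model volume
    return model_volume(2 * r, eps) / model_volume(r, eps)

ratios = []
for eps in (0.0, 1e-3, 1e-2, 1e-1, 1.0):
    for k in range(-40, 11):
        r = 2.0 ** (k / 4.0)          # radii from 2**-10 up to 2**2.5
        ratios.append(doubling_ratio(r, eps))

# In this model the ratio stays between 2**3 (Riemannian regime)
# and 2**4 (subRiemannian regime), whatever the value of eps.
print(min(ratios), max(ratios))
```

In this model the ratio interpolates between $2^3$, the value for a 3-dimensional Riemannian ball, and $2^4$, the value dictated by the homogeneous dimension, so the constant in (1.2) can be taken to be $1/16$ for every $\varepsilon \ge 0$.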

Remark 2.15
Before proving (2.9) it is useful to examine a specific example: compare two trajectories from the origin $0 = (0, 0, 0)$ to the point $x = (0, 0, x_3)$. The first is the segment $\gamma_1$ defined by $s \mapsto (0, 0, x_3 s)$, for $s \in [0, 1]$. The length of this segment in the Riemannian metric $g_\varepsilon$ given by the orthonormal frame $X_1^\varepsilon, X_2^\varepsilon, X_3^\varepsilon$ is proportional to $|x_3|/\varepsilon$. We also consider a second trajectory $\gamma_2$ given by the subRiemannian geodesic between the two points. In view of (2.6) the length of this curve in the subRiemannian metric $g_0$ defined by the orthonormal frame $X_1, X_2$ is proportional to $\sqrt{|x_3|}$ and coincides with its length in the Riemannian metric $g_\varepsilon$.
Since $d_\varepsilon$ is computed by selecting the shortest path between two points in the $g_\varepsilon$ metric, $d_\varepsilon(x, 0)$ is bounded above by the smaller of these two lengths. From this simple example one can expect that at large scale (i.e. for points with $d_0(x, 0) > \varepsilon$) the Riemannian and the subRiemannian distances are approximately the same.

Proof From the invariance by left translations of both $d_{G,\varepsilon}$ and $d_\varepsilon$ it is sufficient to prove that $d_\varepsilon(x, 0)$ and $N_\varepsilon(x)$ are equivalent. We begin by establishing the first inequality in (2.9), i.e. we want to show that there exists a positive constant $A$ such that

and three curves
• A length minimizing curve $\gamma : [0, 1] \to \mathbb{R}^3$ for the metric $g_\varepsilon$, such that

Observe that in view of the equivalence (2.6),

for some constant $C > 0$. On the other hand one also has

The latter yields immediately that $d_\varepsilon(x, 0) \ge C^{-1} N_\varepsilon(x)$, for some value of $C > 0$ independent of $\varepsilon > 0$. Next we consider the case $|p_3| > \frac{1}{2} |x_3|$. This yields

where $|P|$ is defined as in (2.6). In summary, so far we have proved the first half of (2.9).
To prove the second half of the inequality we consider an horizontal segment 1 joining the origin to In view of (2.10) one has The latter completes the proof of (2.9).

Remark 2.16
Similar arguments continue to hold more in general, in the setting of Carnot groups.
As a consequence of Lemma 2.13, one has that for $\varepsilon > 0$ the metric space $(\mathbb{R}^3, d_\varepsilon)$ is locally bi-Lipschitz to the Euclidean space, and hence its Hausdorff dimension will be 3. As $\varepsilon \to 0$ the non-horizontal directions are penalized, causing a sharp phase transition between the regime at $\varepsilon > 0$ and $\varepsilon = 0$.
The intuition developed through this example hints at the multiple scale aspect of the $d_\varepsilon$ metrics: at scales smaller than $\varepsilon > 0$ the local geometry of the metric space $(\mathbb{R}^3, d_\varepsilon)$ is roughly Euclidean; for scales larger than $\varepsilon > 0$ it is subRiemannian. This intuition will inform the proofs of the stability for the doubling property in the next section.
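The crossover between the two regimes can be made concrete by comparing the two competing paths from the origin to a vertical point $(0, 0, x_3)$. The sketch below rests on explicit assumptions that are not fixed by the text: the frame is normalized as $X_1 = \partial_{x_1} - x_2\partial_{x_3}$, $X_2 = \partial_{x_2} + x_1\partial_{x_3}$, and $\varepsilon\,\partial_{x_3}$ is taken to be a unit vector for $g_\varepsilon$. Under these assumptions the vertical segment has length $|x_3|/\varepsilon$, while the horizontal lift of a circle enclosing area $|x_3|/2$ has length $\sqrt{2\pi |x_3|}$. All function names are hypothetical.

```python
import math

def vertical_segment_length(x3, eps):
    # Length of s -> (0, 0, x3 * s) when eps * d/dx3 is a unit vector (assumption)
    return abs(x3) / eps

def horizontal_loop_length(x3):
    # Horizontal lift of a circle: the height gained is twice the enclosed area,
    # so a perimeter L gives |x3| = L**2 / (2 * pi) in the assumed normalization
    return math.sqrt(2.0 * math.pi * abs(x3))

def distance_upper_bound(x3, eps):
    # d_eps(0, (0, 0, x3)) is at most the length of either admissible path
    return min(vertical_segment_length(x3, eps), horizontal_loop_length(x3))

def crossover_height(eps):
    # Height at which the two lengths coincide: |x3| / eps = sqrt(2 pi |x3|)
    return 2.0 * math.pi * eps * eps

eps = 0.1
for x3 in (1e-4, 1e-2, crossover_height(eps), 1.0):
    print(x3, vertical_segment_length(x3, eps), horizontal_loop_length(x3))
```

Below the crossover height, which is of order $\varepsilon^2$ and therefore corresponds to subRiemannian distance of order $\varepsilon$, the straight vertical segment is shorter; above it the horizontal loop wins, in agreement with the Euclidean and subRiemannian regimes described above.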

Stability of the homogeneous structure
The volume of Carnot-Caratheodory balls, and its doubling property, has been studied in Nagel, Stein and Wainger's seminal work [70]. In this section we recall the main results in that paper and show how to modify their proof so that the stability of the doubling constant as $\varepsilon \to 0$ becomes evident.
For every n-tuple I = (i 1 , . . . , i n ) ∈ {1, . . . , 2 p − m} n , and for¯ ≥ ≥ 0 define the coefficient For a fixed 0 ≤ ≤¯ and for a fixed constant 0 where the maximum ranges over all n-tuples. Denote J the family of remaining indices, so that . . , X 2 p−m . When = 0 we will refer to I 0 as a choice corresponding to the n-tuple X 0 i 01 , . . . , X 0 i 0n realizing (3.1). One of the main contributions in Nagel, Stein and Wainger's seminal work [70], consists in the proof that for a v and a x fixed, and letting denote a weighted cube in R n , then the quantity |λ I (x)| provides an estimates of the Jacobian of the exponential mapping u → ,v,x (u) defined for u ∈ Q(r ) as More precisely, for ≥ 0 and fixed one has Theorem 3.1 [70,Theorem 7] For every ≥ 0, and K ⊂⊂ R n there exist R > 0 and constants 0 < C 1, , C 2, < 1 such that for every x ∈ K and 0 < r < R , if I is such that (3.1) holds, then As a corollary one has that the volume of a Carnot-Caratheodory ball centered in x can be estimated by the measure of the corresponding cube and the Jacobian determinant of ,v,x .

Corollary 3.2 ([70, Theorem 1])
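For every $\varepsilon \ge 0$ and $K \subset\subset \mathbb{R}^n$, and for $R_\varepsilon > 0$ as in Theorem 3.1, there exist constants $C_3, C_4 > 0$ depending on $K$, $R_\varepsilon$, $C_{1,\varepsilon}$ and $C_{2,\varepsilon}$ such that for all $x \in K$ and $0 < r < R_\varepsilon$ one has

Estimate (3.3) in turn implies the doubling condition (1.2) with constants depending eventually on $R_\varepsilon$, $C_{1,\varepsilon}$ and $C_{2,\varepsilon}$.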

Uniform estimates as $\varepsilon \to 0$
Having already proved the stability of the doubling property in the special case of the Heisenberg group, in this section we turn to the general case of Hörmander vector fields and describe in some detail results from [18] establishing that the constants $C_{1,\varepsilon}$, $C_{2,\varepsilon}$ do not vanish as $\varepsilon \to 0$. Without loss of generality one may assume that both constants are non-decreasing in $\varepsilon$. In fact, if that is not the case one may consider a new pair of constants $\tilde C_{i,\varepsilon} = \inf_{s \in [\varepsilon, \bar\varepsilon]} C_{i,s}$, for $i = 1, 2$.

Proposition 3.3
For every $\varepsilon \in [0, \bar\varepsilon]$, the constants $R_\varepsilon$, $C_{1,\varepsilon}$ and $C_{2,\varepsilon}$ in Theorem 3.1 may be chosen to be independent of $\varepsilon$, depending only on the $C^{r+1}$ norm of the vector fields, on the number $\bar\varepsilon$, and on the compact set $K$.
Proof The proof is split in two cases: first we study the range $\varepsilon < r < R_0$, which roughly corresponds to the balls of radius $r$ having a subRiemannian shape. In this range we show that one can select the constants $C_{i,\varepsilon}$ to be approximately $C_{i,0}$. The second case consists in the analysis of the range $r < \varepsilon < \bar\varepsilon$. In this regime the balls are roughly of Euclidean shape and we show that the constants $C_{i,\varepsilon}$ can be approximately chosen to be $C_{i,\bar\varepsilon}$.
Let us fix ∈ (0,¯ ], R = R 0 and r < R 0 . We can start by describing the family I defined in (3.1), which maximize λ I (x). We first note that for every > 0 and for In the range 0 < r < <¯ one can assume without loss of generality that the n-tuple satisfying the maximality condition (3.1) will include only vectors of In fact, if this were not the case and the n-tuple were to include a vector of the form X j = Y j− p+m for some p < j, then we could substitute such vector with Similarly, in the range 0 < < r <¯ one can assume that the n-tuple satisfying the maximality condition (3.1) will include only vectors of the form would then be one of the terms in the left hand side of (3.1) for = 0, and thus is maximized by C −1 2,0 |λ 0 I 0 (x)|r d(I 0 )−1 . Case 1: In view of the argument above, for every < r < R 0 the indices I defined by the maximality condition (3.1) can be chosen to coincide with indices of the family I 0 and do not depend on . On the other hand the vector excluded from I will be not only those in J 0 but also the ones that have been added with a weight factor of a power of , In correspondence with this decomposition of the set of indices we define a splitting in the v-variables in (5.14) as Consequently for every < r the function ,v,x (u) can be written as Let us define mappings and In view of (3.5) we can write Note that for any ≥ 0 and for a fixed v, the mapping In view of (3.6) and of Theorem 3.1, as a function of u, the mapping ,v,x (u) is defined, invertible, and satisfies the Jacobian estimates in Theorem 3.1 (ii) for all u such that F 1, ,v (u) ∈ Q 0 (C 1,0 r ) and for v such that The completion of the proof of Case 1 rests on the following two claims: Proof of the claim If we choose C 6 < min{C 1,0 , C 2,0 } and it follows that So that completing the proof of the claim.
One has that

for some constant $C_5 > 0$ independent of $\varepsilon \ge 0$.
Proof of the claim Choose C 5 sufficiently large so that 2 max{C −1 5 , . This proves the first inclusion in the claim. To establish the second inclusion we choose C 5 large enough so that 2( . The corresponding estimate for the range k = 1, . . . , m is immediate. In view of Claims 1 and 2, and of Theorem 3.1 It follows that for < r and these choices of constants (independent of ) 1 the function ,v,x (u) is invertible on Q 0 (C 1,0 r ) and (i), (ii) and (iii) are satisfied.
Case 2: As remarked above, in the range 0 < r < <¯ one can assume that the n-tuple satisfying the maximality condition (3.1) will include only vectors of the form Note that in view of (3.4) and the maximality condition (3.1) the corresponding term can be rewritten and estimated as follows It is then clear that the maximizing n-tuple I in (3.1) will be identified by the lowest degree d(I ) among all n-tuples corresponding to non-vanishing determinants det(Y i 1 , . . . , Y i n ) in a neighborhood of the point x. Since this choice does not depend on > r , then one has that I = I¯ . In other words, if we denote then the maximality condition (3.1) in the range 0 < r < <¯ can be satisfied independently from by selecting the family of vector fields: The complementary family J becomes If we denote A , and B these three sets, and split the v-variable from (5.14) as v = (v,ṽ), then it is clear that and in this case the values of d and d¯ are the same on the corresponding indices. Analogously Y ∈ B iff Y ∈ B¯ and the degrees are the same. For every > r the map ,v,x (u) then can be written as This function is defined and invertible for Recall that with the present choice of r < <¯ , we have and argue similarly to Case 1, then the function ,v,x will satisfy conditions i), ii), and iii) on Q(C 1,¯ r ) and hence on Q(C 1, r ), with constants independent of .

Equiregular subRiemannian structures and equivalent pseudo-distances
The intrinsic definition of the Carnot-Caratheodory metric, based on a minimizing choice, is not convenient when one needs to produce quantitative estimates, as we will do in the following sections. It is then advantageous to use equivalent pseudo-distances which are explicitly defined in terms of certain systems of coordinates. In the last section we have already encountered two special cases, i.e. the norm $|\cdot|$ defined in (2.5) and its Riemannian approximation (2.8). In this section we extend this construction to all equiregular subRiemannian structures. For $\Omega \subset \mathbb{R}^n$ consider the subRiemannian manifold $(\Omega, \Delta, g)$ and iteratively set $\Delta^1 := \Delta$, and $\Delta^{i+1} := \Delta^i + [\Delta^i, \Delta]$ for $i \in \mathbb{N}$. The bracket generating condition is expressed by saying that there exists an integer $s \in \mathbb{N}$ such that $\Delta^s_p = \mathbb{R}^n$ for all $p \in M$.
coincides with the Hausdorff dimension with respect to the Carnot-Caratheodory distance.
This class is generic, as any subRiemannian manifold has a dense open subset on which the restriction of the subRiemannian metric is equiregular.

Example 3.5 Systems of free vector fields, as defined in Definition 5.4, yield a distribution that supports an equiregular subRiemannian structure for any choice of the horizontal metric $g$.

Example 3.6 An analytic Lie group $G$ is called a homogeneous stratified Lie group if its Lie algebra admits a stratification

Given a positive definite bilinear form $g_0$ on $V_1$ we call the pair $(G, g_0)$ a Carnot group, and the corresponding left invariant metric $g_0$ is an equiregular subRiemannian metric.
Next we assume we have an equiregular subRiemannian manifold $(\Omega, \Delta, g)$ and consider an orthonormal horizontal basis $X_1, \dots, X_m$ of $\Delta$. Following the process in The equiregularity hypothesis allows one to choose $Y_1, \dots, Y_n$ linearly independent. Next we extend g to a Riemannian metric $g_1$ on all of $T\Omega$ by imposing that $Y_1, \dots, Y_n$ is an orthonormal basis.

Definition 3.7 For any $\epsilon \in (0, \bar\epsilon]$ we define the Riemannian metric $g_\epsilon$ by setting We define canonical coordinates around a point $x_0 \in \Omega$ as follows. Since $Y_1, \dots, Y_n$ is a generating frame for $T\Omega$, for any point x in a neighborhood ω of $x_0$ there exists a unique n-tuple $(x_1, \dots, x_n)$ such that (3.9) We will set $x = (x_1, \dots, x_n)$ and use this n-tuple as local coordinates in ω.

Definition 3.9 For every
Theorem 3.10 For every compact set $K \subset \omega$ with $x_0 \in K$ there exists $C = C(K, \Delta, g, \omega) > 0$, independent of $\epsilon \in (0, \bar\epsilon]$, such that for all $x \in K$.

Remark 3.11 Note that for $\epsilon = 0$ the equivalence is a direct consequence of the Ball-Box theorem proved by Nagel et al. [70] or Mitchell [65, Lemma 3.4]. This observation replaces the estimates (2.6) from the Heisenberg group setting.
The proof of Theorem 3.10 follows as a corollary of the following

Proposition 3.12 Under the hypotheses of Theorem 3.10 there exist $R = R(K, \Delta, g, \omega) > 0$ and $C = C(K, \Delta, g, \omega) > 0$, independent of $\epsilon \in (0, \bar\epsilon]$, such that for all $x \in K$ and $r \in (0, R)$,

Proof The proof follows closely the arguments in the previous section and is based on the results in [70]. In view of the equiregularity hypothesis note that $Y_1, \dots, Y_n$ are linearly independent and the construction in with constants independent of $\epsilon \ge 0$, where $Q = \{u \in \mathbb{R}^n : |u_j| \le r^{d_\epsilon(i_j)}\}$, and The n-tuple I contains n indices related either to the horizontal vector fields $X_1, \dots, X_m$ or to the commutators X m+1 , . . . , X n . The latter may consist of weighted versions X m+1 , . . . , X n or unweighted versions X n+1 , . . . , X 2n−m . In either case the same vector will appear both in the weighted and in the unweighted version (either among the I indices or in the complement J ). Comparing the representation ,v,x 0 with the x-coordinates representation (3.9) one has and we let for each k = 1, . . . , n From the latter we obtain that for all k = 1, . . . , n .
This shows that for r > 0 sufficiently small, and for some choice of To prove the reverse inclusion we consider a point Cr). Select I as in (3.1) and set v = 0 to represent x in the basis X i 1 , . . . , X i n as In view of Theorem 3.1 and (3.11), to prove the proposition it suffices to show that there exists a constant C > 0 independent of $\epsilon > 0$ such that for each j = 1, . . . , n one has $|u_j| \le C r^{d_\epsilon(i_j)}$.
We distinguish two cases. In the range $\epsilon \ge 2r$ one can argue as in (3.4) to deduce that for each j = 1, . . . , n we may assume without loss of generality that the contribution due to $u_j X^\epsilon_{i_j}$ follows from the choice of a weighted vector, and hence is of the form On the other hand, since $\epsilon \ge 2r$ one must also have that Consequently one has In the range $\epsilon < 2r$ we observe that one must have $|x_k| \le C r^{d(k)}$. Arguing as in (3.4) we see that without loss of generality, for each j = 1, . . . , n, the contribution due to $u_j X^\epsilon_{i_j}$ follows from the choice of an unweighted vector, and hence is of the form $u_j Y_k$ for some k > m. Consequently one has $d_\epsilon(i_j) = d(k) > 1$ and $x_k = u_j$, concluding the proof.

Stability of the Poincaré inequality
In this section we focus on the Poincaré inequality and prove that it holds with a constant that is stable as $\epsilon \to 0$. Our argument rests on results of Lanconelli and Morbidelli [59] whose proof, in some respects, simplifies the method used by Jerison in [52]. Using some Jacobian estimates from [44] or [40] we will establish that the assumptions required in the key result [59, Theorem 2.1] are satisfied independently of $\epsilon \ge 0$. We start by recalling Then there exists a constant $C_P$ depending only on the constants $\alpha_1, \alpha_2, \alpha_3$ and the doubling constant $C_D$ such that (P) is satisfied.
We are now ready to prove Theorem 1.2.

Proof All one needs to establish is that the assumptions of Theorem 4.1 are satisfied uniformly in $\epsilon$ on a metric ball. Apply Proposition 3.3 and Theorem 3.1 with K = B (x 0 , r ) and choose the constants C i produced by these results. Set Q = Q ( 3C 1 C 2 r ) and let To establish assumption (i) of Theorem 4.1 it suffices to note that by virtue of condition (iii) in Theorem 3.1 one has that for x ∈ B (x 0 , r ), Assumption (ii) in Theorem 4.1 is a direct consequence of condition (ii) in Theorem 3.1, with $\alpha_1 = 16$. Chow's connectivity theorem implies that E(x, u) satisfies assumption (iii), with a function γ piecewise expressed as exponential mappings of vector fields of $\epsilon$-degree one. Let us denote by (X i ) i∈I the required vector fields. With this choice of path, it is known (see for example [44,Lemma 2.2] or [40, pp 99- for a suitable function ψ(x, u, t) satisfying Since the constant c depends solely on the Lipschitz constant of the vector fields (X i ) i∈I , it can be chosen independently of $\epsilon$. As a consequence condition (iv) is satisfied and the proof is concluded.

Hörmander type parabolic operators in non divergence form
The results in this section concern uniform Gaussian estimates for the heat kernel of certain degenerate parabolic differential equations, and their parabolic regularizations. We will consider a collection of smooth vector fields $X = (X_1, \dots, X_m)$ satisfying Hörmander's finite rank condition (1.1) in an open set $\Omega \subset \mathbb{R}^n$. We will use throughout the section the definition of degree d(i) relative to the stratification (2.2). A second order, non-divergence form, ultra-parabolic operator with constant coefficients $a_{ij}$ can be expressed as: where $A = (a_{ij})_{i,j=1,\dots,m}$ is a symmetric, real-valued, positive definite m × m matrix satisfying for a suitable constant $\Lambda$. We will also call $M_{m,\Lambda}$ the set of symmetric m × m real-valued matrices satisfying (5.2). (5.3)

If A is the identity matrix then the existence of a heat kernel for the operator $L_A$ is by now a classical result due to Folland [37] and Rothschild and Stein [76]. Gaussian estimates have been provided by Jerison and Sanchez-Calle [53], and by Kusuoka and Stroock [58]. There is a broad, more recent literature dealing with Gaussian estimates for non-divergence form operators with Hölder continuous coefficients $a_{ij}$. Such estimates have been systematically studied in [7][8][9]11] where a self-contained proof is provided. A natural technique for studying the properties of the operator $L_A$ is to consider a parabolic regularization induced by the vector fields $X^\epsilon_i$ defined in (2.4). More precisely, we will define the operator where $a^\epsilon_{ij}$ is any p × p positive definite matrix belonging to $M_{p,2\Lambda}$ and such that $a^\epsilon_{ij} = a_{ij}$ for i, j = 1, . . . , m.
We will denote by $M_{p,2\Lambda}$ (5.5) the set of such matrices. Formally, the operator $L_A$ can be recovered as the limit as $\epsilon \to 0$ of the operators $L_{\epsilon,A}$. Here we are interested in understanding which properties of solutions of $L_{\epsilon,A}$ are preserved in the limit. For $\epsilon > 0$ consider a Riemannian metric $g_\epsilon$ defined as in Remark 2.9, such that the vector fields $X^\epsilon_i$ are orthonormal. The induced distance function $d_\epsilon$ is biLipschitz equivalent to the Euclidean norm $|\cdot|_E$. Consequently, the operator $L_{\epsilon,A}$ has a fundamental solution $\Gamma_{\epsilon,A}$, which can be estimated as for some positive constant $C_\epsilon$ depending on A, $\epsilon$ and $X_1, \dots, X_m$.
Unfortunately the constant $C_\epsilon$ blows up as $\epsilon$ approaches 0, so the Riemannian estimate (5.7) alone does not provide Gaussian bounds for the fundamental solution $\Gamma_A$ of the limit operator (5.1) as $\epsilon$ goes to 0. In [57] the elliptic regularization technique has been used to obtain $L^p$ and $C^\alpha$ regularity of the solutions; these estimates, however, are far from optimal. In [27], new estimates uniform in $\epsilon$ were provided in the time independent setting, which are optimal with respect to the decay of the limit operator. In [17] the result was extended to parabolic operators, in the special case of Carnot groups.
In order to further extend these estimates, we need to formulate the following definition: to be the set of all kernels $(P_{\epsilon,A})_{\epsilon>0,\, A \in M_{p,2\Lambda}}$, defined on $\mathbb{R}^{2n} \times ]0, \infty[$, that have an exponential decay of order 2 + h, uniformly with respect to a family of distances $(d_\epsilon)$ and of matrices $A \in M_{p,2\Lambda}$ (see definition 5.5), on any compact set of an open set $\Omega$. More precisely, we will say that $P_{\epsilon,A} \in E(2 + h, d_\epsilon, M_{p,2\Lambda})$ if the following three conditions hold:
• For every $K \subset\subset \Omega$ there exists a constant C > 0 depending on $\Lambda$ but independent of $\epsilon > 0$ and of the matrix $A \in M_{p,2\Lambda}$ such that for each $\epsilon > 0$, x, y ∈ K and t > 0 one has
• For s ∈ N and every k-tuple $(i_1, \dots, i_k) \in \{1, \dots, m\}^k$ there exists a constant $C_{s,k} > 0$ depending only on k, s, $X_1, \dots, X_m$, such that for all x, y ∈ K and t > 0.
• For any $A_1, A_2 \in M_{p,2\Lambda}$, s ∈ N and every k-tuple $(i_1, \dots, i_k) \in \{1, \dots, m\}^k$ there exists $C_{s,k} > 0$ depending only on k, s, $X_1, \dots, X_m$, such that where $\|A\|^2 := \sum_{i,j=1}^n a_{ij}^2$.

With these notations we will now extend all these previous results to vector fields which only satisfy the Hörmander condition, establishing estimates which are uniform in the variable $\epsilon$ as $\epsilon \to 0$, and in the choice of the matrix $A \in M_{p,2\Lambda}$, for the fundamental solutions $\Gamma_{\epsilon,A}$ of the operators $L_{\epsilon,A}$. To be more specific, we will prove:

as $\epsilon \to 0$ uniformly on compact sets and in a dominated way on subcompacts of $\Omega$.
Our main contribution is that all the constants are independent of $\epsilon$. The proof of this assertion is based on a lifting procedure, which allows one to express the fundamental solution of the operator $L_{\epsilon,A}$ in terms of the fundamental solution of a new operator L A independent of $\epsilon$. The lifting procedure consists of a first step in which we apply the delicate Rothschild and Stein lifting technique [76]. After that, when the vector fields are free up to a specific step, we apply a second lifting which was introduced in [27], where the time independent case was studied, and in [17], where the Carnot group setting is considered.
The simplest example of such an equation is the heat equation associated to the Kohn Laplacian in the Heisenberg group, $\partial_t - X_1^2 - X_2^2$, where the vector fields $X_1$ and $X_2$ have been expressed in coordinates in Example 2.1. In order to present our approach we will give an outline of the proof in this special setting.

Example 5.3 Denote by $(x_1, x_2, x_3)$ points of $\mathbb{R}^3$, let $X_1, X_2, X_3$ be the vector fields defined in Example 2.1, and let I denote the identity matrix: Consider the parabolic operator
and note that it becomes degenerate parabolic as $\epsilon \to 0$. Let $d_\epsilon$ denote the Carnot-Caratheodory distance associated to the distribution $X_1, X_2, \epsilon X_3$. In order to handle such degeneracy we introduce new variables $(z_1, z_2, z_3)$ and a new set of vector fields replicating the same structure of the initial ones, i.e., with $(x_1, x_2, x_3, z_1, z_2, z_3) \in \mathbb{H}^1 \times \mathbb{H}^1$. The next step consists in lifting $L_{\epsilon,I}$ to an operator $\bar L_\epsilon$ defined on $\mathbb{H}^1 \times \mathbb{H}^1$, and denoting by $\bar\Gamma_\epsilon$ its fundamental solution. Let $\bar d_\epsilon$ denote the Carnot-Caratheodory distance generated by $X_1, X_2, Z_1, Z_2, (Z_3 + \epsilon X_3)$ and, arguing as in (5.22), note that $\bar d_\epsilon((x, z), (y, z)) \ge d_\epsilon(x, y) - C_0$, for some constant $C_0$ independent of $\epsilon$. Consider the change of variables on the Lie algebra of $\mathbb{H}^1 \times \mathbb{H}^1$, Note that the Jacobian of such change of variables does not depend on $\epsilon$ and that it reduces the operator $\bar L_\epsilon$ to It is clear that the operator $\bar L$ is independent of $\epsilon$, and consequently its fundamental solution $\bar\Gamma$ satisfies standard Gaussian estimates with constants independent of $\epsilon$ where $\bar d$ denotes the Carnot-Caratheodory distance in $\mathbb{H}^1 \times \mathbb{H}^1$ generated by the distribution of vector fields $X_1, X_2, Z_1, Z_2, Z_3$. Changing back to the original variables we see that $\bar\Gamma_\epsilon$ also satisfies analogous estimates with the same constants, with the distance $\bar d$ replaced by the distance $\bar d_\epsilon$ naturally associated to the operator $\bar L_\epsilon$. Finally, integrating with respect to the added variables $(z_1, z_2, z_3)$, we obtain a uniform bound for the fundamental solution of the operator $L_{\epsilon,I}$ in terms of the distance $d_\epsilon$.
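The degeneration described in this example can be checked numerically. The sketch below assumes one common coordinate expression of the Heisenberg frame, $X_1 = \partial_1 - (x_2/2)\partial_3$, $X_2 = \partial_2 + (x_1/2)\partial_3$, $X_3 = \partial_3$; Example 2.1 may use a different but equivalent normalization. The helper names (`jacobian`, `bracket`, `frame_det`) are hypothetical and chosen only for this illustration. It checks that $[X_1, X_2] = X_3$ and that the frame $X_1, X_2, \epsilon X_3$ has determinant $\epsilon$, so the Riemannian frame collapses as $\epsilon \to 0$ while the bracket generating property is unaffected.

```python
# Assumed Heisenberg frame on R^3 (a common normalization, not copied from Example 2.1).
def X1(p): return [1.0, 0.0, -p[1] / 2.0]
def X2(p): return [0.0, 1.0, p[0] / 2.0]
def X3(p): return [0.0, 0.0, 1.0]

def jacobian(field, p, h=1e-4):
    """Central-difference Jacobian of a vector field at the point p."""
    n = len(p)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = field(plus), field(minus)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def bracket(X, Y, p):
    """Commutator [X, Y](p) = DY(p) X(p) - DX(p) Y(p)."""
    JX, JY, Xp, Yp = jacobian(X, p), jacobian(Y, p), X(p), Y(p)
    n = len(p)
    return [sum(JY[i][j] * Xp[j] - JX[i][j] * Yp[j] for j in range(n))
            for i in range(n)]

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def frame_det(eps, p):
    """Determinant of the frame X1, X2, eps * X3 at p."""
    return det3(X1(p), X2(p), [eps * c for c in X3(p)])

p = [0.3, -1.2, 0.7]
comm = bracket(X1, X2, p)                      # numerically close to X3(p)
dets = [frame_det(eps, p) for eps in (1.0, 0.1, 0.01)]
print(comm, dets)
```

The determinant equals $\epsilon$ at every point, which is why Riemannian constants that depend on the ellipticity of the frame cannot be expected to stay bounded without an argument such as the lifting outlined above.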

The Rothschild-Stein freezing and lifting theorems
Let us first recall a local lifting procedure introduced by Rothschild and Stein in [76] which, starting from a family (X i ) i=1,...,m of Hörmander type vector fields of step s in a neighborhood of R n , leads to the construction of a new family of vector fields which are free, and of Hörmander type with the same step s, in a neighborhood of a larger space. The projection of the new free vector fields on R n yields the original vector fields, and that is why they are called liftings. Let us start with some definitions: where the sets X j are as defined in (2.2). We shall say that X 1 , . . . , X m are free up to step s if for any 1 ≤ r ≤ s we have n m,s = dim(V (s) ).
If a point x 0 ∈ R n is fixed, the lifting procedure of Rothschild-Stein locally introduces new variablesz and new vector fields (Z i ) expressed in terms of the new variables such that in a neighborhood U of x 0 the vector fieldsX  , and smooth functions λ i j (x,z), with x ∈ R n andz = (z n+1 , . . . , z ν ) ∈ V , defined in a neighborhoodŨ ofx = (x, 0) ∈ U × V ⊂ R ν , such that the vector fieldsX 1 , . . . ,X m given bỹ are free up to step r at every point inŨ . Remark 5.6 In the literature the lifting procedure described above is often coupled with another key result introduced in [76], a nilpotent approximation which is akin to the classical freezing technique for elliptic operators. Let us explicitly note that in Sect. 5.3 we need only to apply the lifting theorem mentioned above, and not the freezing procedure. In particular, in the example of the so called Grushin vector fields they would need to be lifted through this procedure to the Heisenberg group structure On the other hand the vector fields X 1 = cos θ∂ x 1 + sin θ∂ x 2 and X 2 = ∂ θ will be unchanged by the lifting process, since they are already free up to step 2.
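The claim that $X_1 = \cos\theta\,\partial_{x_1} + \sin\theta\,\partial_{x_2}$ and $X_2 = \partial_\theta$ are already free up to step 2 can be illustrated numerically. With two generators, the free nilpotent Lie algebra of step 2 has dimension 3, so it suffices that $X_1$, $X_2$ and $[X_1, X_2]$ be linearly independent at every point of $\mathbb{R}^3$. The sketch below is only a sanity check under that reading; the function names are hypothetical, and the bracket is approximated by a central difference in $\theta$.

```python
import math

def X1(theta):
    """Coefficients of X1 = cos(theta) d_x1 + sin(theta) d_x2 in the basis (d_x1, d_x2, d_theta)."""
    return (math.cos(theta), math.sin(theta), 0.0)

X2 = (0.0, 0.0, 1.0)  # X2 = d_theta has constant coefficients

def bracket_X1_X2(theta, h=1e-5):
    """[X1, X2] = -(d/dtheta)(coefficients of X1), because X2 = d_theta is constant."""
    plus, minus = X1(theta + h), X1(theta - h)
    return tuple(-(a - b) / (2.0 * h) for a, b in zip(plus, minus))

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

# The determinant of (X1, X2, [X1, X2]) is cos^2 + sin^2 = 1 for every theta.
thetas = [k * math.pi / 7.0 for k in range(14)]
dets = [det3(X1(th), X2, bracket_X1_X2(th)) for th in thetas]
print(min(dets), max(dets))
```

By contrast, two vector fields on $\mathbb{R}^2$ can never span a three-dimensional space, which is consistent with the remark that the Grushin fields need to be lifted.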
Later on, in Sect. 5.4, we will apply Rothschild and Stein's freezing theorem to a family of vector fields $X_1, \dots, X_m$ free up to step r. This will allow us to approximate a given family of vector fields with homogeneous ones. Note that in this case the function in (5.14) is independent of v and its expression reduces to: The pertinent theorem from [76] is the following.

Theorem 5.7 Let $X_1, \dots, X_m$ be a family of vector fields that are free up to step r at every point. Then for every x there exists a neighborhood V of x and a neighborhood U of the identity in $G_{m,r}$, such that: (a) the map x : U → V is a diffeomorphism onto its image. We will call x its inverse map where $R_i$ is a vector field of local degree less than or equal to zero, depending smoothly on x.
Hence the operator $R_i$ will be represented in the form: where σ is a homogeneous polynomial of degree $d(X_i) - 1$.

A lifting procedure uniform in
So far we have started with a set of Hörmander vector fields X 1 , . . . , X m in ⊂ R n and we have lifted them through Theorem 5.5 to a setX 1 , . . . ,X m of Hörmander vector fields that are free up to a step s in a neighborhood˜ ⊂ R ν . Next, we perform a second lifting inspired by the work in [17]. We will consider the augmented space R ν × R ν defined in terms of ν new coordinatesẑ = (ẑ 1 , . . . ,ẑ ν ). Set z = (z,ẑ) and denote points of R ν × R ν byx = (x,z,ẑ) = (x, z). Denote byẐ 1 , . . . ,Ẑ m a family of vector fields free up to step s.X 1 , . . . ,X m , i.e. a family of vector fields free of step s in the variablesẑ, and letẐ m+1 , . . .Ẑ ν denote the complete set of their commutators, as we did in (2.2). Note that the subRiemannian structure generated byẐ 1 , . . . ,Ẑ m coincides with the structure generated by the familyX i , but are defined in terms of new variablesẑ. For every ∈ [0, 1) consider a sub-Riemannian structure determined by the choice of horizontal vector fields given by Since the space is free up to step r the function in (5.14) is independent of v and its expression reduces to: (5.14) In the sequel, when we will need to explicitly indicate the vector fields defining we will also use the notation: ,x,X (u) = ,x (u), and ,x,X i (u) its components, (5.15) and analogous notations will be used for the inverse map ,x,X For every > 0 andx,x 0 , in view of Theorem 3.10 the associated ball box distances reduce to:

Proof of the stability result
The sub-Laplacian/heat operator associated to this structure is and I is the identity matrix of dimension ν × ν. We denote by¯ ,A the heat kernels of the corresponding heat operators, and prove a lemma analogous to Lemma 5.2 for the lifted operator: uniformly on compact sets, in a dominated way on allḠ.
Proof The result for the limit operatorL 0,A is well known and contained for example in [11]. Hence we only have to estimate the fundamental solution of the operatorsL ,A in terms of the one ofL 0,A . In order to do so, we first define a change of variable on the Lie algebra: Then from a fixed pointz we apply the exponential map to induce on the Lie group a volume preserving change of variables. Using the notation introduced in (5.15), we will denoteF Since the distances are defined in terms of the exponential maps, this change of variables induces a relation between the distancesd 0 andd : Analogously we also havē Hence assertions (5.7) follow from the estimates of¯ 0,A contained for instance in [53]. Indeed the second inequality can be established as follows: The proof of the first inequality in (5.7) and (5.8) is analogous, while (5.9) follows from the estimates of the fundamental solution contained in ( [11]).
The pointwise convergence (5.16) is also an immediate consequence of (5.18) and (5.19). In order to prove the dominated convergence result we need to relate the distances $\bar d_0$ and $\bar d_\epsilon$. On the other hand, the change of variable (5.17) allows us to express the exponential coordinates $u^\epsilon_i$ in terms of $u^0_i$ as follows: where $C_0$ is independent of $\epsilon$. The latter and (5.8) imply that there is a constant $\tilde C_{s,k}$ independent of $\epsilon$ such that and this implies dominated convergence with respect to the $\epsilon$ variable.
In order to conclude the proof of Proposition 5.2, we need to study the relation between the fundamental solution $\Gamma_A(x, y, t)$ and its lifting $\bar\Gamma_{0,A}((x, 0), (y, z), t)$, as well as the relation between $\Gamma_{\epsilon,A}(x, y, t)$ and $\bar\Gamma_{\epsilon,A}((x, 0), (y, z), t)$.

Remark 5.9 We first note that every $f \in C_0^\infty(\mathbb{R}^n \times \mathbb{R}^+)$ can be identified with a $C^\infty$ and bounded function defined on $\mathbb{R}^{n+\nu} \times \mathbb{R}^+$ and constant in the z-variables. Hence A ((x, 0, s), (y, z, t)
We conclude this section with the proof of the main result Proposition 5.2.
Proof In view of the previous remark and the (global) dominated convergence of the derivatives of $\bar\Gamma_{\epsilon,A}$ to the corresponding derivatives of $\bar\Gamma_{0,A}$ as $\epsilon \to 0$, we deduce that as $\epsilon \to 0$. The Gaussian estimates of $\Gamma_{\epsilon,A}$ follow from the corresponding estimates on $\bar\Gamma_{\epsilon,A}$ and the fact that in view of (5.20), Indeed the latter shows that there exists a constant C > 0 depending only on G, σ 0 such that for every x ∈ G, The conclusion follows at once.

Differential of the integral operator associated to
In this subsection we will show how to differentiate a functional F expressed as follows: In order to do so, we will need to differentiate both with respect to x and to y, so that we will denote X ,x i ,A (x, y, t) the derivative with respect to the variable x and X ,y i ,A (x, y, t) the derivative with respect to the variable y. Analogously, we will denote the derivative with the first variable of the lifted fundamental solutionX ,x i¯ , A ((x, w), (y, z), t).
The derivative with respect to the second variable will be denoted $\bar X^{0,\bar y}_i$. If $\Gamma_{\epsilon,A}$ is the Euclidean heat kernel, there is a simple relation between the derivatives with respect to the two variables: indeed in this case $\Gamma_{\epsilon,A}(x, y, t) = \Gamma_{\epsilon,A}(x - y, 0, t)$, so that $X^{\epsilon,x}_i \Gamma_{\epsilon,A}(x, y, t) = -X^{\epsilon,y}_i \Gamma_{\epsilon,A}(x, y, t)$. (5.23) Consequently for every function This is no longer the case in general Lie groups, or for Hörmander vector fields. However we will see that there is a relation between the two derivatives, which allows us to prove the following: (Let us note explicitly that the term $R_{\epsilon,i,h}(x, y, t)$ plays the role of an error term).
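The Euclidean relation between the two derivatives can be tested on the one-dimensional heat kernel of $\partial_t - \partial_{xx}$. The sketch below is an illustration under assumptions: the test function $f(y) = e^{-y^2}$, the quadrature rule and all function names are choices made here, and the integration-by-parts identity it checks is our reading of the consequence stated after (5.23). It verifies that $\partial_x \Gamma = -\partial_y \Gamma$ and that the derivative of the integral operator can be moved onto $f$.

```python
import math

def heat_kernel(x, y, t):
    """One-dimensional Euclidean heat kernel of d_t - d_xx."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def d_dx(x, y, t, h=1e-5):
    """Central difference in the first variable."""
    return (heat_kernel(x + h, y, t) - heat_kernel(x - h, y, t)) / (2.0 * h)

def d_dy(x, y, t, h=1e-5):
    """Central difference in the second variable."""
    return (heat_kernel(x, y + h, t) - heat_kernel(x, y - h, t)) / (2.0 * h)

def trapezoid(g, a=-10.0, b=10.0, n=4000):
    """Composite trapezoid rule; very accurate for rapidly decaying smooth integrands."""
    step = (b - a) / n
    total = 0.5 * (g(a) + g(b)) + sum(g(a + k * step) for k in range(1, n))
    return total * step

def f(y):
    return math.exp(-y * y)

def df(y):
    return -2.0 * y * math.exp(-y * y)

x, t = 0.4, 0.5
# The kernel depends only on x - y, so the x- and y-derivatives cancel.
antisymmetry_gap = max(abs(d_dx(x, y, t) + d_dy(x, y, t)) for y in (-1.0, 0.0, 0.7, 2.0))
# Differentiating the integral operator in x equals applying it to f'.
lhs = trapezoid(lambda y: d_dx(x, y, t) * f(y))
rhs = trapezoid(lambda y: heat_kernel(x, y, t) * df(y))
print(antisymmetry_gap, lhs, rhs)
```

For kernels that are not functions of $x - y$, as in the Hörmander setting, the two derivatives no longer cancel exactly, which is the role of the error term in the statement that follows.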
Proof We can apply the lifting procedure described in Sects. 5.2 and 5.3, and representing the fundamental solution as in (5.19) and (5.21), we obtain the following expression for F : (F (x, 0),F (y, z), t)dz f (y)dy.
By differentiating with respect to X i we get: F (y, z), t)dz f (y)dy. (5.24) Note that the family of vectorsX 0 i is independent of and free of step r . Hence, in view of [76, page 295, line 3 from below] one has that for every i, j = 1, · · · m, there exist families of indices I i, j , and polynomialsp ih homogeneous of degree ≥ h such that:X A (x,ȳ, t).
In particular using this expression in the variable z alone, and integrating by parts we deduceˆˆZ w i¯ 0,A (F (x, 0),F (y, z), t)dz = 0 We now call This kernel, being obtained by multiplication of 0,A (x,ȳ, t) by a polynomial, has locally the same decay as 0,A (x,ȳ, t). In particular it is clear that the conditions 5.7, 5.8, 5.9 are satisfied uniformly with respect to , since there is no dependence on . As a consequence, if we set Now we use the fact that¯ 0,A ∈ E(2,d, M m+ν, ) together with the fact thatp ih is a polynomial of the degree equal of the order ofX 0,y h to conclude that Setting P ,i,h (x, y, t) =ˆP 0,i,h (F (x, 0),F (y, z), t)dz it follows that P ,i,h (x, y, t) ∈ E(2, d , M m, )

Stability of interior Schauder estimates
In this section we will prove uniform estimates in spaces of Hölder continuous functions and in Sobolev spaces for solutions of second order sub-elliptic differential equations in non divergence form in a cylinder $Q = \Omega \times (0, T)$ that are stable as $\epsilon \to 0$. The proof of both estimates is largely based on the heat kernel estimates established above. Interior Schauder estimates for this type of operators are well known. We recall the results of Xu [83], Bramanti and Brandolini [10] for heat-type operators, and the results of Lunardi [62], Polidoro and Di Francesco [34], and Gutierrez and Lanconelli [47], which apply to a large class of squares of vector fields plus a drift term. We also recall [64] where uniform Schauder estimates for a particular elliptic approximation of subLaplacians are proved.
Here the novelty is the uniformity of the estimates with respect to $\epsilon$. This is accomplished by using the uniform Gaussian bounds established in the previous section. This result extends to Hörmander type operators the analogous assertion proved by Manfredini and the authors in [14] in the setting of Carnot groups.

Uniform Schauder estimates
Let us start with the definition of classes of Hölder continuous functions in this setting.

Definition 6.1 Let 0 < α < 1, $Q \subset \mathbb{R}^{n+1}$ and u be defined on Q. We say that $u \in C^\alpha_{\epsilon,X}(Q)$ if there exists a positive constant M such that for every $(x, t), (x_0, t_0) \in Q$, $|u(x, t) - u(x_0, t_0)| \le M\, \tilde d_\epsilon^{\,\alpha}((x, t), (x_0, t_0))$. (6.1) We put Iterating this definition, if k ≥ 1 we say that $u \in C^{k,\alpha}_{\epsilon,X}(Q)$ if for all i = 1, . . . , m, $X_i u \in C^{k-1,\alpha}_{\epsilon,X}(Q)$, where we have set $C^{0,\alpha}_{\epsilon,X}(Q) = C^\alpha_{\epsilon,X}(Q)$.
The main result of this section, which generalizes to the Hörmander vector fields setting our previous result with Manfredini in [14], is

Proposition 6.2 Let w be a smooth solution of $L_{\epsilon,A} w = f$ on Q. Let K be a compact set such that K ⊂⊂ Q, set $2\delta = d_0(K, \partial_p Q)$ and denote by $K_\delta$ the δ-tubular neighborhood of K. Assume that there exists a constant C > 0 such that for any $\epsilon \in (0, 1)$. There exists a constant $C_1 > 0$ depending on α, C, δ, and the constants in Proposition 5.2, but independent of $\epsilon$, such that We will first consider a constant coefficient operator, for which we will obtain a representation formula; then we will show how to obtain the claimed result from it.
Precisely, we will consider the constant coefficient frozen operator: where $(x_0, t_0) \in Q$. We explicitly note that for $\epsilon > 0$ fixed the operator $L_{\epsilon,(x_0,t_0)}$ is uniformly parabolic, so that its heat kernel can be studied through standard singular integral theory in the corresponding Riemannian balls. As a direct consequence of the definition of fundamental solution one has the following representation formula , t), (y, τ )) where we have denoted by $\Gamma^\epsilon_{(x_0,t_0)}$ the heat kernel of $L_{\epsilon,(x_0,t_0)}$.
Iterating the previous lemma we get the following

Lemma 6.4 Let k ∈ N and consider a k-tuple $(i_1, \dots, i_k) \in \{1, \dots, m\}^k$. There exists a differential operator B of order k + 1, depending on horizontal derivatives of $a_{ij}$ of order at most k, such that

Proof The proof is by induction. Indeed the statement is true with B = 0 by definition if k = 0; if it is true for a fixed value of k then we have By the properties of B it follows that $\tilde B$ is a differential operator of order k + 2, depending on horizontal derivatives of $a_{ij}$ of order at most k + 1. This concludes the proof.
We can go back to our operator $L_\epsilon$ and establish the following regularity results, differentiating twice the representation formula:

Proposition 6.5 Let 0 < α < 1 and w be a smooth solution of $L_\epsilon w = f \in C^\alpha_{\epsilon,X}(Q)$ in the cylinder Q. Let K be a compact set such that K ⊂⊂ Q, set $2\delta = d_0(K, \partial_p Q)$ and denote by $K_\delta$ the δ-tubular neighborhood of K. Assume that there exists a constant C > 0 such that for every $\epsilon \in (0, 1)$ There exists a constant $C_1 > 0$ depending on δ, α, C and the constants in Proposition 5.2 such that

Proof The proof follows the outline of the standard case, as in [41], and rests crucially on the Gaussian estimates proved in Proposition 5.2. Choose a parabolic sphere $B_{\epsilon,\delta} \subset\subset K$, where δ > 0 will be fixed later, and a cut-off function $\phi \in C_0^\infty(\mathbb{R}^{n+1})$ identically 1 on $B_{\epsilon,\delta/2}$ and compactly supported in $B_{\epsilon,\delta}$. This implies that for some constant C > 0 depending only on G and σ 0 , in Q. Now we represent the function wφ through the formula (6.3) and take two derivatives in the direction of the vector fields. We remark once more that the operator is uniformly elliptic due to the $\epsilon$-regularization, hence the differentiation under the integral can be considered standard. As a consequence, for every multi-index $I = (i_1, i_2) \in \{1, \dots, m\}^2$ and for every $(x_0, t_0) \in B_{\epsilon,\delta}$ one has: In order to study the Hölder continuity of the second derivatives, we note that the uniform Hölder continuity of $a_{ij}$ and Proposition 5.2 ensure that the kernels satisfy the classical singular integral properties (see [37]): with C > 0 independent of $\epsilon$. From here, proceeding as in [41, Theorem 2, Chapter 4], the first term in the right hand side of formula (6.3) can be estimated as follows: where $C_1$ and $\tilde C_1$ are stable as $\epsilon \to 0$.
Similarly, if φ is fixed, the Hölder norm of the second term in the representation formula (6.3) is bounded by ˆX From (6.3), (6.5) and (6.6) we deduce that Choosing δ sufficiently small we prove the assertion on the fixed sphere $B_{\epsilon,\delta}$. The conclusion follows from a standard covering argument.
We can now conclude the proof of Proposition 6.2: Proof The proof is similar to the previous one for k = 1. We start by differentiating the representation formula (6.2) along an arbitrary direction X i 1 Now we apply Theorem 5.10 and deduce that there exist kernels with the same decay of the fundamental solution such that (6.9) Using Lemma 6.4, this yields the existence of new kernels , t), (y, τ )) with the behavior of a fundamental solution (and the same dependence on ) such that , t), (y, τ )) where φ is as in the proof of Proposition 6.5 and B is a differential operator of order k + 1. The conclusion follows by further differentiating the previous representation formula along two horizontal directions X j 1 X j 2 and arguing as in the proof of Proposition 6.5.

Application I: Harnack inequalities for degenerate parabolic quasilinear equations hold uniformly in
The results we have presented so far show that for any $\epsilon_0 > 0$, the 1-parameter family of metric spaces $(M, d_\epsilon)$ associated to the Riemannian approximations of a subRiemannian metric space $(M, d_0)$ satisfies, uniformly in $\epsilon \in [0, \epsilon_0]$, the hypotheses in the definition of p-admissible structure in the sense of [48, Theorem 13.1]. This class of metric measure spaces has a very rich analytic structure (Sobolev-Poincaré inequalities, John-Nirenberg lemma, ...) that allows for the development of a basic regularity theory for weak solutions of classes of nonlinear degenerate parabolic and elliptic PDE. In the following we will remind the reader of the definition and basic properties of p-admissible structures and sketch some of the main regularity results from the recent papers [1] and [18]. We will conclude the section with a sample application of these techniques to the global (in time) existence of solutions for a class of evolution equations that include the subRiemannian total variation flow [14].

Admissible ambient space geometry
Consider a smooth real manifold M and let μ be a locally finite Borel measure on M which is absolutely continuous with respect to the Lebesgue measure when represented in local charts. Let d(·, ·) : M × M → R + denote the control distance generated by a system of bounded, μ-measurable, Lipschitz vector fields $X = (X_1, \dots, X_m)$ on M. As in [3] and [44] one needs to assume as basic hypothesis (1) Doubling property: |B(x, 2r )| ≤ C D |B(x, r )| whenever x ∈ K and 0 < r < R.
• Our setting is also sufficiently broad to include some non-smooth vector fields such as the Baouendi-Grushin frames, e.g., consider, for γ ≥ 1 and (x, y) ∈ R 2 , the vector fields X 1 = ∂ x and X 2 = |x| γ ∂ y . Unless γ is a positive even integer these vector fields fail to satisfy Hörmander's finite rank hypothesis. However, the doubling inequality as well as the Poincaré inequality hold and have been used in the work of Franchi and Lanconelli [38] to establish Harnack inequalities for linear equations. • Consider a smooth manifold M endowed with a complete Riemannian metric g.
Let μ denote the Riemannian volume measure, and let X denote a g-orthonormal frame. If the Ricci curvature is bounded from below (Ricci ≥ −K g) then one has a 2-admissible structure. In fact, in this setting the Poincaré inequality follows from Buser's inequality while the doubling condition is a consequence of the Bishop-Gromov comparison principle. If K = 0, i.e. the Ricci tensor is non-negative, then these assumptions hold globally and so does the Harnack inequality.
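To make the doubling property (1) concrete in the setting of this paper, the following sketch uses a heuristic model, assumed here and not taken from the text: in the Heisenberg group with the frame $X_1, X_2, \epsilon X_3$ orthonormal, a ball of radius r is taken to be comparable to a box of size $r \times r \times \max(r^2, \epsilon r)$, so that its volume is modelled by $r^2 \max(r^2, \epsilon r)$. The function names are hypothetical. Under this model the doubling ratio stays between 8 and 16 for every $\epsilon \ge 0$, which illustrates the kind of uniformity in $\epsilon$ that the admissible structure requires.

```python
def box_volume(eps, r):
    """Model volume r * r * max(r**2, eps * r) of a ball-box of radius r (heuristic assumption)."""
    return r * r * max(r * r, eps * r)

def doubling_ratio(eps, r):
    """Ratio |B(2r)| / |B(r)| in the box model."""
    return box_volume(eps, 2.0 * r) / box_volume(eps, r)

epsilons = [0.0, 1e-6, 1e-3, 0.1, 1.0]
radii = [10.0 ** (-k) for k in range(0, 7)]
ratios = [doubling_ratio(eps, r) for eps in epsilons for r in radii]

# The ratio interpolates between the Riemannian value 2**3 (r much smaller than eps)
# and the subRiemannian value 2**4 (eps = 0 or r much larger than eps).
print(min(ratios), max(ratios))
```

The bound follows from $2\max(r^2, \epsilon r) \le \max(4r^2, 2\epsilon r) \le 4\max(r^2, \epsilon r)$, so in this model a single doubling constant $C_D = 16$ works for all $\epsilon$.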
Spaces with a p-admissible structure support a homogeneous structure in the sense of Coifman and Weiss [28]. support in the norm u p 1, p = u p + u p with respect to μ. In the following we will omit μ in the notation for Lebesgue and Sobolev spaces. Given t 1 < t 2 , and 1 ≤ p ≤ ∞, we let Ω t 1 ,t 2 ≡ Ω × (t 1 , t 2 ) and we let L p (t 1 , t 2 ; W 1, p (Ω)), t 1 < t 2 , denote the parabolic Sobolev space of real-valued functions defined on Ω t 1 ,t 2 such that for almost every t, t 1 < t < t 2 , the function x → u(x, t) belongs to W 1, p (Ω) and The space L p (t 1 , t 2 ; W 1, p ,0 (Ω)) is defined analogously. We let W 1, p (t 1 , t 2 ; L p (Ω)) consist of real-valued functions η ∈ L p (t 1 , t 2 ; L p (Ω)) such that the weak derivative ∂ t η(x, t) exists and belongs to L p (t 1 , t 2 ; L p (Ω)). Consider the set of functions φ ∈ W 1, p (t 1 , t 2 ; L p (Ω)) that have compact support in (t 1 , t 2 ). We let W 1, p 0 (t 1 , t 2 ; L p (Ω)) denote the closure of this space under the norm in W 1, p (t 1 , t 2 ; L p (Ω)).
A 0 and A 1 are called the structural constants of A. If A andÃ are both admissible symbols, with the same structural constants A 0 and A 1 , then we say that the symbols are structurally similar. Let E be a domain in M × R. We say that the function u : E → R is a weak solution to in E, where X * i is the formal adjoint w.r.t. dμ, if whenever t 1 ,t 2 E for some domain ⊂ M, u ∈ L p (t 1 , t 2 ; W 1, p ( )) and for every test function A function u is a weak super-solution (sub-solution) to (7.6) in E if whenever , and the left hand side of (7.7) is non-negative (non-positive) for all non-negative test func- ,0 ( )). The main results in [1] and [18] can be summarized in the following theorem.

Theorem 7.5 Let (M, μ, d) be a p-admissible structure for some fixed p ∈ [2, ∞).
For a bounded open subset $\Omega \subset M$, let $u$ be a non-negative weak solution to (7.6) in an open set containing the cylinder $\Omega \times [0, T_0]$ and assume that the structure conditions (7.5) are satisfied.
We conclude this section with a corollary of the proof in [18, Lemma 3.6], a weak Harnack inequality that plays an important role in the proof of the regularity of the mean curvature flow for graphs over certain Lie groups established in [14]. Consider a weak supersolution $w \in L^p(t_1, t_2; W^{1,p}_X(\Omega))$ of the linear equation (7.10), with $t_1$, $t_2$, $\Omega$ as defined above. Assume the coercivity hypothesis (7.11) for a.e. $(x,t)$ and all $\xi \in \mathbb{R}^m$, for a suitable constant.

Proposition 7.6 Let $(M, \mu, d)$ be a 2-admissible structure. For a bounded open subset $\Omega \subset M$, let $w$ be a non-negative weak supersolution to (7.10) in an open set containing the cylinder $\Omega \times [0, T_0]$ and assume that conditions (7.11) are satisfied. For any subcylinder $Q_{3\rho} = B(x, 3\rho) \times (t - 9\rho^2, t) \subset Q$ there exists a constant $C > 0$, depending on $C_D$, $C_L$, $C_P$, the structure conditions (7.3) and on $\rho$, such that a weak Harnack inequality holds for $w$, with $Q^+$, $Q^-$ as defined in (7.9).

Application II: regularity for quasilinear subelliptic PDE through Riemannian approximation
Let $G$ be a Carnot group of step two. We denote by $\mathfrak{g}$ its Lie algebra and by $\mathfrak{g} = V^1 \oplus V^2$ its stratification. If $g_0$ is a subRiemannian metric on $V^1$ we let $d_0$ denote its corresponding CC metric, and $X_1, \ldots, X_m$ a left-invariant orthonormal basis of $V^1$. Several definitions of the mean curvature $h_0$ of a surface are available in this setting. To quote a few: $h_0$ can be defined in terms of the Legendrian foliation [22], as the first variation of the area functional [19,22,30,49,75], as the horizontal divergence of the horizontal unit normal, or as the limit of the mean curvatures $h_\epsilon$ of suitable Riemannian approximating metrics $\sigma_\epsilon$ [19]. If the surface is not regular, the notion of curvature can be expressed in the viscosity sense (we refer to [2,4,5,13,61,63,81,82] for viscosity solutions of PDE in the sub-Riemannian setting). The formulation we use here is pointwise: $h_0$ is defined at every non-characteristic point $p$ in the level set of a function $f$.

The mean curvature flow is the motion of a surface in which each point moves in the direction of the normal with speed equal to the mean curvature. In the total variation flow the speed is the mean curvature, scaled by the modulus of the gradient. Both flows play key roles in digital image processing and provide prototypes for modeling the motion of interfaces in a variety of physical settings.
As an illustration of the usefulness of the uniform estimates established above, in this section we briefly sketch the strategy used in [14] and [17], where the Riemannian approximation scheme is used to establish regularity for the graph solutions of the total variation flow and of the mean curvature flow. In both cases $\Omega \subset G$ is a bounded open set and $G$ is a Lie group, free up to step two, but not necessarily nilpotent. These equations describe the motions of the (non-characteristic) hypersurfaces given by the graphs of the solutions in $G \times \mathbb{R}$.
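We will consider solutions arising as limits of solutions $u_\epsilon$ of the analogous Riemannian flows, i.e. evolution equations for $\partial u_\epsilon / \partial t$ whose right-hand sides are expressed through the mean curvature $h_\epsilon$ of the graph of $u_\epsilon(\cdot, t)$ and, in non-divergence form, through $\sum_{i,j} a^\epsilon_{ij}(\nabla_\epsilon u_\epsilon) X^\epsilon_i X^\epsilon_j u_\epsilon$ for $x \in \Omega$, $t > 0$ (8.4), with coefficients $a^\epsilon_{ij}(\xi)$ defined for all $\xi \in \mathbb{R}^n$. The main results in [14] and [17] concern long-time existence of solutions of the initial value problems (8.6) in $Q = \Omega \times (0, T)$, with $\partial_p Q = (\Omega \times \{t = 0\}) \cup (\partial\Omega \times (0, T))$ denoting the parabolic boundary of $Q$.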
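To see why gradient bounds are the key step for flows of this type, it may help to keep in mind the coefficients that typically appear in the mean curvature flow of graphs. The following is a sketch under the assumption, not confirmed by the displays above, that $a_{ij}(\xi) = \delta_{ij} - \xi_i \xi_j / (1 + |\xi|^2)$. For this choice the Cauchy-Schwarz inequality $(\xi \cdot \eta)^2 \le |\xi|^2 |\eta|^2$ gives, for all $\xi, \eta \in \mathbb{R}^n$:

```latex
\[
a_{ij}(\xi) = \delta_{ij} - \frac{\xi_i \xi_j}{1 + |\xi|^2},
\qquad
\sum_{i,j=1}^{n} a_{ij}(\xi)\, \eta_i \eta_j
= |\eta|^2 - \frac{(\xi \cdot \eta)^2}{1 + |\xi|^2},
\]
\[
\frac{|\eta|^2}{1 + |\xi|^2}
\;\le\; \sum_{i,j=1}^{n} a_{ij}(\xi)\, \eta_i \eta_j
\;\le\; |\eta|^2 .
\]
```

Under this assumption the lower ellipticity bound degenerates only as $|\xi| \to \infty$, so a bound on the gradient that is uniform in $\epsilon$ makes the equation uniformly parabolic with constants independent of $\epsilon$.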

Theorem 8.1 Let $G$ be a Lie group of step two, $\Omega \subset G$ a bounded, open, convex set (in a sense to be defined later) and $\varphi \in C^2(\bar{\Omega})$. There exist unique solutions $u_\epsilon \in C^\infty(\Omega \times (0, \infty)) \cap L^\infty((0, \infty), C^1(\bar{\Omega}))$ of the two initial value problems in (8.6), and for each $k \in \mathbb{N}$ and $K \subset\subset Q$, there exists $C_k = C_k(G, \varphi, k, K, \Omega) > 0$ not depending on $\epsilon$ such that $\|u_\epsilon\|_{C^k(K)} \le C_k$.

The proof of this result rests crucially on the estimates established in this paper. In the following we list the main steps. First of all we note that, in view of the short-time existence result in the Riemannian setting, we can assume that locally the $u_\epsilon$ are smooth both in time and space.

Interior gradient bounds
Denote by $X^r_i$ the right-invariant frame corresponding to the left-invariant fields $X_i$, and observe that these two frames commute. For both flows, consider solutions $u_\epsilon \in C^3(Q)$ and denote $v_0 = \partial_t u_\epsilon$, $v_i = X^r_i u_\epsilon$ for $i = 1, \ldots, n$. Then for every $h = 0, \ldots, n$ one has that $v_h$ is a solution of the equation obtained by differentiating the flow in the corresponding direction, both for the total variation flow and for the mean curvature flow. The weak parabolic maximum principle yields that there exists $C = C(G, \|\varphi\|_{C^2(\Omega)}) > 0$ such that for every compact subset $K \subset\subset \Omega$ one has an estimate, with constant $C$, for $\sup_{K \times [0,T)} |\nabla_1 u_\epsilon|$, where $\nabla_1$ is the full $g_1$-Riemannian gradient. This yields the desired uniform interior gradient bounds. This argument works in any Lie group, with no restrictions on the step.

Global gradient bounds
The proof of the boundary gradient estimates is more delicate and depends crucially on the geometry of the space. In particular the argument we outline here only holds in step two groups $G$ and for domains $\Omega \subset G$ that are locally Euclidean convex when expressed in the Rothschild-Stein preferred coordinates. As usual we note that the function $v_\epsilon = u_\epsilon - \varphi$ solves the homogeneous 'boundary' value problem with $b^\epsilon(x) = a^\epsilon_{ij}(\nabla_\epsilon v_\epsilon(x) + \nabla_\epsilon \varphi(x)) X^\epsilon_i X^\epsilon_j \varphi(x)$.

Proof In the hypothesis that $\Omega$ is convex in the Euclidean sense we have that every $x_0 \in \partial\Omega$ supports a tangent hyperplane defined by an equation of the form $\Pi(x) = \sum_{i=1}^n a_i x_i = 0$ with $\Pi > 0$ in $\Omega$, $\Pi(x_0) = 0$, and normalized as $\sum_{d(i)=1,2} a_i^2 = 1$. Following the standard argument (see for instance [60, Chapter 10]) we prove that there exists a function $\Phi$ such that the barrier at $(x_0, t_0) \in \partial\Omega \times (0, T)$ can be expressed in the form $w = \Phi(\Pi)$. Comparison with the barrier constructed above now yields the following.

Proposition 8.4 Let $G$ be a Carnot group of step two, $\Omega \subset G$ a bounded, open, convex (in the Euclidean sense) set and $\varphi \in C^2(\bar{\Omega})$. For $\epsilon > 0$ denote by $u_\epsilon \in C^2(\Omega \times (0, T)) \cap C^1(\bar{\Omega} \times (0, T))$ the non-negative unique solution of the initial value problem (8.6). There exists $C = C(G, \|\varphi\|_{C^2(\bar{\Omega})}) > 0$ such that the estimate (8.13) holds in $V \cap Q$, with $\mathrm{dist}_{\sigma_1}(x, x_0)$ being the distance between $x$ and $x_0$ in the Riemannian metric $\sigma_1$. This concludes the proof of the boundary gradient estimates.

Harnack inequalities and C 1,α estimates
Let us first recall that $(G, d_\epsilon)$ is a 2-admissible geometry in the sense of Definition 7.1, with doubling and Poincaré constants uniform in $\epsilon \ge 0$, as we proved in Theorems 1.1 and 1.2. The total variation equation is expressed in divergence form, hence Eq. (8.8), satisfied by the right derivatives $v_h = X^r_h u_\epsilon$, is in the same form. The mean curvature flow is not in divergence form. However, arguing as in [60] it is possible to show that there exists a regular, positive and strictly monotone function $\Phi$ such that $\Phi(v_h)$ satisfies a divergence form equation. As a consequence we can apply the Harnack inequalities in Theorem 7.5 and Proposition 7.6 to the bounded solutions $v_h$ (or $\Phi(v_h)$), thus yielding the uniform interior $C^{1,\alpha}$ estimates.

Schauder estimates and higher order estimates
The uniform Gaussian estimates and Schauder estimates in Theorem 1.4 applied to (8.8) yield the higher order estimates and conclude the proof. Once the interior $C^{1,\alpha}$ estimate of the solution, uniform in $\epsilon$, has been obtained, we write the mean curvature flow equation in non-divergence form: $\partial_t u_\epsilon - \sum_{i,j} a^\epsilon_{ij}(x, t) X^\epsilon_i X^\epsilon_j u_\epsilon = 0$.
Applying the Schauder estimates in Proposition 6.2, we immediately deduce Theorem 8.1.

Proof of Theorem 8.1
Since the solution $u_\epsilon$ is of class $C^{1,\alpha}$, and its norm is bounded uniformly in $\epsilon$, it is a solution of a divergence form equation with coefficients $a^\epsilon_{ij}$ of class $C^\alpha$. More precisely, for every compact set $K \subset\subset Q$, with $2\delta = d_0(K, \partial_p Q)$, there exists a positive constant $C_0$ bounding the $C^\alpha$ norm of the coefficients for every $\epsilon \in (0, 1)$. Consequently, by Proposition 6.2 there exists a constant $C_2$ bounding the corresponding $C^{2,\alpha}$ norm of $u_\epsilon$ on $K$.

We now prove by induction that for every $k \in \mathbb{N}$ and for every compact set $K \subset\subset Q$ there exists a positive constant $C$ such that $\|u_\epsilon\|_{C^{k,\alpha}_{\epsilon,X}(K)} \le C$ (8.14) for every $\epsilon > 0$. The assertion is true if $k = 2$, by Proposition 6.5. If (8.14) is true for $k + 1$, then the coefficients $a^\epsilon_{ij}$ belong to $C^{k,\alpha}_{\epsilon,X}$ uniformly as $\epsilon \in (0, 1)$, and (8.14) holds at the level $k + 2$ by virtue of Proposition 6.2.
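Schematically, the induction is a bootstrap in which each application of the Schauder estimates gains one derivative, with constants independent of $\epsilon$. The sketch below assumes that the coefficients depend smoothly on the gradient of the solution, written here as $a^\epsilon_{ij}(\nabla_\epsilon u_\epsilon)$:

```latex
\[
u_\epsilon \in C^{k+1,\alpha}_{\epsilon,X}
\;\Longrightarrow\;
a^\epsilon_{ij}(\nabla_\epsilon u_\epsilon) \in C^{k,\alpha}_{\epsilon,X}
\;\Longrightarrow\;
u_\epsilon \in C^{k+2,\alpha}_{\epsilon,X}
\qquad \text{(Schauder estimates, Proposition 6.2)} .
\]
```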