Hardy-Littlewood-Sobolev inequalities for a class of non-symmetric and non-doubling hypoelliptic semigroups

In his seminal 1934 paper on Brownian motion and the theory of gases Kolmogorov introduced a second order evolution equation which displays some challenging features. In the opening of his 1967 hypoellipticity paper Hörmander discussed a general class of degenerate Ornstein-Uhlenbeck operators that includes Kolmogorov's as a special case. In this note we combine semigroup theory with a nonlocal calculus for these hypoelliptic operators to establish new inequalities of Hardy-Littlewood-Sobolev type in the situation when the drift matrix has nonnegative trace. Our work has been influenced by ideas of E. Stein and Varopoulos in the framework of symmetric semigroups. One of our objectives is to show that such ideas can be pushed to successfully handle the present degenerate non-symmetric setting.

The first author was supported in part by a Progetto SID (Investimento Strategico di Dipartimento) "Non-local operators in geometry and in free boundary problems, and their connection with the applied sciences", University of Padova, 2017.

Introduction
Sobolev inequalities occupy a central position in analysis, geometry and physics. Typically, in such a priori estimates one is able to control a certain $L^q$ norm of a derivative of a function in terms of an $L^p$ norm of derivatives of higher order. One distinctive aspect of these inequalities is that there is a gain in the exponent of integrability, i.e., $q > p$. For instance, the prototypical Sobolev inequality in $\mathbb R^N$ states that for any $1 \le p < N$, there exists a constant $S_{N,p}$ such that for any function $f$ in the Schwartz class $\mathscr S$, one has
(⋆) $\|f\|_q \le S_{N,p}\, \|\nabla f\|_p \iff \frac{1}{p} - \frac{1}{q} = \frac{1}{N}$.
In such framework, (⋆) is referred to as the embedding theorem $W^{1,p}(\mathbb R^N) \hookrightarrow L^q(\mathbb R^N)$. The relation between the exponents $p$ and $q$ in (⋆) is the well-known Hardy-Littlewood-Sobolev condition, and the if and only if character is connected with the interplay between the differential operator $\nabla$ and the homogeneous structure (Euclidean dilations) of the ambient space.
In this paper we are concerned with a scale of global inequalities such as the one above for the following class of second-order partial differential equations in $\mathbb R^{N+1}$,
(1.1) $\mathscr K u = \mathscr A u - \partial_t u \overset{def}{=} \operatorname{tr}(Q \nabla^2 u) + \langle BX, \nabla u\rangle - \partial_t u = 0$,
where the $N \times N$ matrices $Q$ and $B$ have real, constant coefficients, and $Q = Q^\star \ge 0$. We assume throughout that $N \ge 2$, and we indicate with $X$ the generic point in $\mathbb R^N$, with $(X, t)$ the one in $\mathbb R^{N+1}$. It is worth noting here that when $Q = I_N$ and $B = O_N$, then (1.1) becomes the standard heat operator $\Delta - \partial_t$ in $\mathbb R^{N+1}$, and we are back into the framework of (⋆). But in the degenerate case when $Q \ge 0$ and $B \ne O_N$, the evolution of equations such as (1.1) is driven by semigroups $P_t = e^{-t\mathscr A}$ which, in general, are non-symmetric and non-doubling. Furthermore, there is no global homogeneous structure associated with them, and they lack an obvious notion of "gradient". For instance, a tool like the P.A. Meyer carré du champ $\Gamma(f) = \frac{1}{2}[\mathscr A(f^2) - 2 f \mathscr A f]$ is not directly effective here since $\Gamma(f) = \langle Q\nabla f, \nabla f\rangle$. This misses all directions of non-ellipticity in the degenerate case, and also does not provide control on the drift.
The class (1.1) first appeared in the 1967 work of Hörmander [34], in which he proved his celebrated hypoellipticity theorem asserting that if smooth vector fields $Y_0, Y_1, \ldots, Y_m$ in $\mathbb R^{N+1}$ verify the finite rank condition on the Lie algebra, then the operator $\sum_{i=1}^m Y_i^2 + Y_0$ is hypoelliptic. To motivate this result, in the opening of his paper he discussed (1.1) and showed that $\mathscr K$ is hypoelliptic if and only if $\operatorname{Ker} Q$ does not contain any non-trivial subspace which is invariant for $B^\star$. This condition can be equivalently expressed in terms of the strict positivity, hence invertibility, of the covariance matrix
(1.2) $K(t) = \frac{1}{t} \int_0^t e^{sB} Q e^{sB^\star}\, ds$
for every $t > 0$. We note that, in the degenerate case when $Q$ fails to be elliptic, this property becomes void at $t = 0$, since $K(0) = Q$. Also, it is easy to see that $K(t) > 0$ for every $t > 0$ if and only if $K(t_0) > 0$ for one $t_0 > 0$. Under the hypoellipticity assumption Hörmander constructed a fundamental solution $p(X, Y, t) > 0$ for (1.1), and proved that, given $f \in \mathscr S$, the Cauchy problem $\mathscr K u = 0$, $u(X, 0) = f(X)$ admits a unique solution given by $P_t f(X) = \int_{\mathbb R^N} p(X, Y, t) f(Y)\, dY$.
This defines a non-symmetric semigroup {P t } t>0 which is strongly continuous in L p , 1 ≤ p < ∞, satisfies P t 1 = 1, but which, however, is not contractive in general.
Our primary interest in this paper is on the subclass of (1.1) which, besides Hörmander's hypoellipticity condition K(t) > 0, also satisfy the assumption (1.3) tr B ≥ 0.
This serves to guarantee that the semigroup $\{P_t\}_{t>0}$ be contractive in $L^p$ for $1 \le p < \infty$, a fact that plays a pervasive role in our work. A prototypical example to keep in mind is the operator introduced by Kolmogorov in his seminal 1934 note [36] on Brownian motion and the theory of gases,
$\mathscr K_0 u = \Delta_v u + \langle v, \nabla_x u\rangle - \partial_t u = 0$.
Here, we have let $N = 2n$, and $X = (v, x)$, with $v, x \in \mathbb R^n$. Such $\mathscr K_0$ fails to be parabolic since it is missing the diffusive term $\Delta_x u$, but it is easily seen to satisfy Hörmander's finite rank condition for the hypoellipticity. Equivalently, one can verify that
$K(t) = \begin{pmatrix} I_n & \frac{t}{2} I_n \\ \frac{t}{2} I_n & \frac{t^2}{3} I_n \end{pmatrix} > 0$ for every $t > 0$.
Remarkably, Kolmogorov himself had already produced the following explicit fundamental solution where $Y = (w, y)$. Since such function is smooth off the diagonal, it follows that he had proved that $\mathscr K_0$ is hypoelliptic more than thirty years before [34]. We note that the hypothesis (1.3) trivially includes Kolmogorov's operator $\mathscr K_0$ since for the latter we have $\operatorname{tr} B = 0$, but also encompasses several different examples of interest in mathematics and physics. For a short list the reader can see the items in red in the table in fig.1 in Section 3. For the items in black (see [49], [60], [12] and [24]) we have $\operatorname{tr} B < 0$, thus they are not covered by our results. Such subclass of (1.1) will be analysed in a future study.
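The positivity of the covariance matrix for Kolmogorov's operator can be checked by hand. The following is a minimal sketch, assuming the block form of $Q$ and $B$ written in the comments (this explicit choice, with the variables ordered as $X = (v, x)$, is our reading of the operator and is not a formula quoted from the text).

```latex
% Assumed block matrices for Kolmogorov's operator, X = (v, x):
% Q acts only on the v-variables, and <BX, \nabla u> = <v, \nabla_x u>.
\[
Q = \begin{pmatrix} I_n & 0 \\ 0 & 0 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 0 \\ I_n & 0 \end{pmatrix}, \qquad
B^2 = 0, \qquad \operatorname{tr} B = 0 .
\]
% B is nilpotent, so the matrix exponential is a finite sum.
\[
e^{sB} = I + sB = \begin{pmatrix} I_n & 0 \\ s I_n & I_n \end{pmatrix},
\qquad
e^{sB} Q\, e^{sB^\star} = \begin{pmatrix} I_n & s I_n \\ s I_n & s^2 I_n \end{pmatrix}.
\]
% Averaging over (0,t) as in (1.2):
\[
K(t) = \frac{1}{t}\int_0^t e^{sB} Q\, e^{sB^\star}\, ds
= \begin{pmatrix} I_n & \tfrac{t}{2} I_n \\ \tfrac{t}{2} I_n & \tfrac{t^2}{3} I_n \end{pmatrix},
\qquad
\det K(t) = \Big(\frac{t^2}{3} - \frac{t^2}{4}\Big)^{n} = \Big(\frac{t^2}{12}\Big)^{n} > 0 .
\]
```

Since the upper-left block $I_n$ is positive and the determinant is positive for every $t > 0$, the matrix $K(t)$ is positive definite, even though $K(0) = Q$ is singular.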
Our approach combines semigroup theory with the nonlocal calculus for (1.1) recently developed in [28], and it has been influenced by the ideas of E. Stein in [56] and Varopoulos in [63] in the setting of positive symmetric semigroups. In fact, one of the objectives of the present paper is to show that their powerful ideas can be pushed to successfully handle the degenerate non-symmetric setting of (1.1).
A discussion of the main results and techniques seems in order at this point. Section 2 is devoted to collecting the known background results on the semigroup {P t } t>0 . We introduce the intertwined non-symmetric pseudo-distance m t (X, Y ), and the time-dependent pseudo-balls B t (X, r). The volume function V (t) = Vol N (B t (X, √ t)) is defined in (2.5). The relevance of such function is demonstrated by its place in Hörmander's probability transition density (2.6). We also recall for completeness an important result from [39] stating that as t → 0 + the small-time behaviour of V (t) is governed by a suitable infinitesimal homogeneous structure. Using such information one can show that there exists D 0 ≥ N ≥ 2 such that V (t) ∼ = t D 0 /2 as t → 0 + . We call the number D 0 the intrinsic dimension of the semigroup at zero.
As it became evident from the work [63] (see also [61], [62] and [64]), in Varopoulos' semigroup approach to the Hardy-Littlewood theory the evolution is driven by the large time behaviour of the semigroup. It should thus come as no surprise that the functional inequalities in this paper hinge on the behaviour of the volume function V (t) as t → ∞. Section 3 is dedicated to the analysis of this aspect. The first key result is Proposition 3.1 in which we show that, under the hypothesis (1.3), the function V (t) must blow-up at least linearly as t → ∞ (note that for the Ornstein-Uhlenbeck operator ∆ X − < X, ∇ X > −∂ t , for which tr(B) < 0, one has instead V (t) → c N > 0 as t → ∞). Furthermore, if the drift matrix B has at least one eigenvalue with strictly positive real part, then V (t) blows up exponentially and is not doubling. In other words, in such situation the drift induces a negative "curvature" in the ambient space R N . In Definition 3.3 we introduce the key notion of intrinsic dimension at infinity of the semigroup, and we indicate such number with D ∞ . We note that the above mentioned minimal linear growth of V (t) at infinity, provides the basic information that D ∞ ≥ 2. The reader should see the table in fig.1 where the quantities D 0 and D ∞ are compared for several differential operators of interest in mathematics and physics. The second result of the section is Proposition 3.5 which establishes the L p − L ∞ ultracontractivity of the semigroup {P t } t>0 for 1 ≤ p < ∞. As the reader can surmise from the seminal work [63,Theorem 1] in the symmetric case, such property plays a central role in our work as well.
In Section 4 we introduce the relevant Sobolev spaces. One of the difficulties in the analysis of (1.1), already hinted at above, is that a "gradient" is not readily available. This problem is circumvented using the nonlocal operator $(-\mathscr A)^{1/2}$ as a gradient since it intrinsically contains the appropriate fractional order of differentiation along the drift, which is instead missing in the above mentioned carré du champ. By means of Balakrishnan's formula (4.1), we can precisely identify the nonlocal operators $(-\mathscr A)^s$ by means of the semigroup $\{P_t\}_{t>0}$. This allows us to introduce spaces of Sobolev type as follows. Given $0 < s < 1$ and $1 \le p < \infty$, we define the Banach space
$\mathscr L^{2s,p} = \overline{\mathscr S}^{\,\|\cdot\|_{\mathscr L^{2s,p}}}$,
where for a function in the Schwartz class $\mathscr S$ we have denoted $\|f\|_{\mathscr L^{2s,p}} \overset{def}{=} \|f\|_{L^p} + \|(-\mathscr A)^s f\|_{L^p}$. We stress that, when $\mathscr A = \Delta$, $s = 1/2$ and $1 < p < \infty$, the classical Calderón-Zygmund theory guarantees that the space $\mathscr L^{1,p}$ coincides with the standard Sobolev space $W^{1,p} = \{f \in L^p \mid \nabla f \in L^p\}$.
In Section 5, under the hypothesis (1.3), we establish a Littlewood-Paley estimate that has been so far missing in the analysis of the class (1.1). To achieve this we have combined a far-reaching idea of E. Stein in [56] with the kernel associated with the Poisson semigroup $P_z = e^{z(-\mathscr A)^{1/2}}$ in [28]. Combining such tools with the powerful abstract Hopf-Dunford-Schwartz ergodic theorem in [19] we obtain the main weak-$L^1$ estimate in Theorem 5.5.
In Section 6 we introduce, for any $0 < \alpha < D_\infty$, the Riesz potential operators $I_\alpha$. Our central result is Theorem 6.3, which shows that for any $0 < \alpha < 2$ and $f \in \mathscr S$, one has
(1.4) $f = I_\alpha \circ (-\mathscr A)^{\alpha/2} f = (-\mathscr A)^{\alpha/2} \circ I_\alpha f$.
This proves that $I_\alpha = (-\mathscr A)^{-\alpha/2}$. Again, the hypothesis (1.3) is essential. The reader should pay attention here to the fact noted above that, under such assumption, we have $D_\infty \ge 2$, and thus (1.4) covers the whole range $0 < \alpha < 2$. We note that, once again, the semigroup $P_z = e^{z(-\mathscr A)^{1/2}}$, $z > 0$, is in the background here.
In Section 7 we establish our main Hardy-Littlewood-Sobolev embedding, Theorem 7.4. Suppose that there exist $D, \gamma_D > 0$ such that
(1.5) $V(t) \ge \gamma_D\, t^{D/2}$ for every $t > 0$.
Then, for every $0 < \alpha < D$ the operator $I_\alpha$ maps $L^1$ into $L^{\frac{D}{D-\alpha},\infty}$. If instead $1 < p < D/\alpha$, then $I_\alpha$ maps $L^p$ to $L^q$, with $\frac{1}{p} - \frac{1}{q} = \frac{\alpha}{D}$. Combining this result with (1.4) we finally obtain the Sobolev embedding Theorem 7.5. We mention that in the "negative curvature" situation when $D_\infty = \infty$ (see in this respect the operator of Kolmogorov with friction in Ex. 6+ in fig.1), given any $1 \le p < \infty$ we are free to choose $D > \max\{D_0, 2sp\}$ such that (1.5) holds. For such $D$ we thus obtain $\mathscr L^{2s,p} \hookrightarrow L^{pD/(D-2sp)}$. The reader should note that (1.5) implies that $2 \le D_0 \le D \le D_\infty$, and thus Theorems 7.4 and 7.5 do not cover the possibility $D_0 > D_\infty$. In the degenerate setting this case can occur, see Ex. 4 of the Kramers' operator in fig.1. When $D_0 > D_\infty$ the estimate (1.5) must be replaced by (7.9) below and, under such hypothesis, we obtain appropriate versions of the above described results, see Theorems 7.6 and 7.7.
In closing, we compare our results with the available literature. Presently, there exist very few Sobolev-type estimates related to the class of degenerate operators (1.1). In [50,15] the authors prove some interesting local results for nonnegative solutions to equations modelled on (1.1). They use tools from potential theory and representation formulas. The restriction to solutions, however, does not allow one to obtain a priori information for arbitrary functions. For kinetic Fokker-Planck equations (where in particular we have X = (v, x), with v indicating velocity and x position), we mention the recent papers [31] and [3]. In the former the authors prove a local gain of integrability for nonnegative sub-solutions via a non-trivial adaptation of the so-called velocity averaging method. In the latter the authors obtain a Poincaré inequality in a weighted L 2 space by means of an ad hoc variational space. Our results differ from either one of these works since our Sobolev spaces L 2s,p are defined with the aid of the nonlocal operators (−A ) s . Similarly to the classical potential estimate |f (X)| ≤ c N I 1 (|∇f |)(X), our formula (1.4), combined with Theorem 7.5, provides the sharp a priori control of the L q norm of a function in terms of the appropriate fractional order of differentiation, both along the directions of ellipticity and along the drift.
We also mention [10], in which the author obtained L 2 a priori estimates for the homogeneous Kolmogorov operator K 0 discussed above, and the work [11], where the authors prove some Calderón-Zygmund type estimates (both in L p and weak-L 1 ) for the operator A . The interesting analysis in [11] combines local singular integral estimates with suitable coverings that exploit the homogeneous structure discovered in [39] (see also subsection 2.4 below). Our approach, based on the semigroup P z = e z √ −A , is different and allows us to obtain results of a global nature, both in space and time.
1.1. Notation. The notation $\operatorname{tr} A$ indicates the trace of a matrix $A$, $A^\star$ is the transpose of $A$, and $\nabla^2 u$ denotes the Hessian matrix of a function $u$. All the function spaces in this paper are based on $\mathbb R^N$, thus we will routinely avoid reference to the ambient space throughout this work. For instance, the Schwartz space of rapidly decreasing functions in $\mathbb R^N$ will be denoted by $\mathscr S$, and for $1 \le p \le \infty$ we let $L^p = L^p(\mathbb R^N)$. The norm in $L^p$ will be denoted by $\|\cdot\|_p$, instead of $\|\cdot\|_{L^p}$. We will indicate with $L^\infty_0$ the Banach space of the $f \in C(\mathbb R^N)$ such that $\lim_{|X| \to \infty} |f(X)| = 0$ with the norm $\|\cdot\|_\infty$. The reader should keep in mind the following simple facts: (1) P t : for every t > 0; (2) $\mathscr S$ is dense in $L^\infty_0$. The notation $|E|$ will indicate the $N$-dimensional Lebesgue measure of a set $E$. If $T : L^p \to L^q$ is a bounded linear map, we will indicate with $\|T\|_{p \to q}$ its operator norm. If $q = p$, the spectrum of $T$ on $L^p$ will be denoted by $\sigma_p(T)$, the resolvent set by $\rho_p(T)$, the resolvent operator by $R(\lambda, T) = (\lambda I - T)^{-1}$. For $x > 0$ we will indicate with $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt$ Euler's gamma function. For any $N \in \mathbb N$ we will use the standard notation $\sigma_{N-1} = \frac{2\pi^{N/2}}{\Gamma(N/2)}$, $\omega_N = \frac{\sigma_{N-1}}{N}$, respectively for the $(N-1)$-dimensional measure of the unit sphere $S^{N-1} \subset \mathbb R^N$, and the $N$-dimensional measure of the unit ball. We adopt the convention that $a/\infty = 0$ for any $a \in \mathbb R$.

Preliminaries
In this section we collect, mostly without proofs, various properties of the semigroup associated with (1.1) which will be used throughout the rest of the paper. One should see [28,Section 2], where some of the results in this section are discussed in detail.
2.1. One-parameter intertwined pseudo-distances. Given matrices Q and B as in (1.1) we introduce a one-parameter family of intertwined pseudo-distances which plays a key role in the analysis of the relevant operators K . For X, Y ∈ R N we define and call it the time-varying pseudo-ball. We will need the following simple result.
Then, for every X ∈ R N and t > 0 one has In particular, we have . The latter gives The proof of (2.3) is similar and we leave it to the reader. To obtain (2.4) it suffices to apply (2.2) with g = 1 (0,r) .
We stress that the quantity in the right-hand side of (2.4) is independent of $X \in \mathbb R^N$, a reflection of the underlying Lie group structure induced by the matrix $B$, see Remark 2.3. As a consequence, we will hereafter drop the dependence in such variable and indicate (2.5) $V(t) = \mathrm{Vol}_N(B_t(X, \sqrt t))$.

2.2. The Cauchy problem. We next recall the theorem in the opening of [34] which constitutes the starting point of the present work. We warn the unfamiliar reader that our presentation of the fundamental solution (2.6) of (1.1) differs from that in [34]. This is done to emphasise the role of the one-parameter intertwined pseudo-distances (2.1) and of the corresponding volume function V (t) defined by (2.5). In (2.6) below we have let c N = (4π) −N/2 ω N .
Theorem 2.2 (Hörmander). Given Q and B as in (1.1), for every t > 0 consider the covariance matrix (1.2). Then, the operator K is hypoelliptic if and only if det K(t) > 0 for every t > 0. In such case, given f ∈ S , the unique solution to the Cauchy problem K u = 0 in R N +1 For a small list of differential operators of interest in mathematics and physics that are encompassed by Theorem 2.2 the reader should see the table in fig.1 at the end of this section. Remark 2.3. We mention that it was noted in [39] that the class (1.1) is invariant with respect to the following non commutative group law (X, s) • (Y, t) = (Y + e −tB X, s + t). Endowed with the latter, the space (R N +1 , •) becomes a non-Abelian Lie group. This aspect is reflected in the expression (2.6), as well as in the invariance with respect to • of the volume of the intertwined pseudoballs, see (2.4) in Lemma 2.1. Except for this, such Lie group structure will play no role in our work.
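The role of the normalising constant $c_N = (4\pi)^{-N/2}\omega_N$ can be illustrated with a short computation. This is only a sketch under two assumptions that are not displayed in the text above: that the pseudo-distance has the quadratic form written in the first comment, and that the transition density is the Gaussian $c_N V(t)^{-1} \exp(-m_t(X,Y)^2/4t)$. Under these assumptions the pseudo-ball is an ellipsoid and the density has total mass one, in agreement with $P_t 1 = 1$.

```latex
% Assumption (hypothetical form of (2.1)):
%   m_t(X,Y)^2 = < K(t)^{-1} (Y - e^{tB} X), Y - e^{tB} X >.
% Then B_t(X, \sqrt t) is the ellipsoid centred at e^{tB}X with matrix tK(t):
\[
B_t(X,\sqrt t) = \big\{ Y : \langle (tK(t))^{-1}(Y - e^{tB}X),\, Y - e^{tB}X \rangle < 1 \big\},
\qquad
V(t) = \omega_N \big(\det (tK(t))\big)^{1/2},
\]
% which does not depend on X. Gaussian integral, with Z = Y - e^{tB}X:
\[
\int_{\mathbb R^N} \exp\Big(-\frac{m_t(X,Y)^2}{4t}\Big)\, dY
= \int_{\mathbb R^N} \exp\Big(-\frac{\langle (tK(t))^{-1} Z, Z\rangle}{4}\Big)\, dZ
= (4\pi)^{N/2} \big(\det (tK(t))\big)^{1/2}
= \frac{(4\pi)^{N/2}}{\omega_N}\, V(t).
\]
% Hence, with c_N = (4\pi)^{-N/2} \omega_N (assumed Gaussian form of the density):
\[
\int_{\mathbb R^N} \frac{c_N}{V(t)} \exp\Big(-\frac{m_t(X,Y)^2}{4t}\Big)\, dY = 1 .
\]
```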

Semigroup aspects.
In the following lemmas we collect the main (well-known) properties of the semigroup {P t } t>0 defined by Theorem 2.2.
Lemma 2.5. The following properties hold: is a strongly continuous semigroup on L p . The same is true when p = ∞, if we replace L ∞ by the space L ∞ 0 . Remark 2.8. The reader should keep in mind that from this point on when we consider {P t } t>0 as a strongly continuous semigroup in L p , we always intend to use L ∞ 0 when p = ∞. Denote by (A p , D p ) the infinitesimal generator of the semigroup {P t } t>0 on L p with domain One knows that (A p , D p ) is closed and densely defined (see [20,Theorem 1.4]).
Corollary 2.9. We have S ⊂ D p . Furthermore, A p f = A f for any f ∈ S , and S is a core for (A p , D p ).
Remark 2.10. From now on for a given p ∈ [1, ∞] with a slight abuse of notation we write A : D p → L p instead of A p . In so doing, we must keep in mind that A actually indicates the closed operator A p that, thanks to Corollary 2.9, coincides with the differential operator A on S . Using this identification we will henceforth say that Lemma 2.11. Assume that (1.3) be in force, and let 1 ≤ p ≤ ∞. Then: (1) For any λ ∈ C such that ℜλ > 0, we have λ ∈ ρ p (A ); (2) If λ ∈ C such that ℜλ > 0, then R(λ, A ) exists and for any f ∈ L p it is given by the formula ℜλ . 2.4. Small-time behaviour of the volume function. The small-time behaviour of the function V (t) was studied in the paper [39], where it was shown that the class of operators (1.1) possesses an infinitesimal osculating structure. For completeness of presentation we recall it in this subsection. We begin with the following known result, see [34], [39], [44] and [43].
Proposition 2.12. The following are equivalent: (v) in a suitable basis of R N the matrices Q and B assume the following form where Q 0 is a p 0 × p 0 non-singular matrix, and B j is a p j × p j−1 matrix having rank p j , j = 1, ..., r, with p 0 ≥ p 1 ≥ ... ≥ p r ≥ 1, and p 0 + p 1 + ... + p r = N . The ⋆ blocks in the canonical form of B can be arbitrary matrices.
Let us now suppose that in a given basis of $\mathbb R^N$ the matrices $Q$ and $B$ are given as in (v) of Proposition 2.12. Recall that $Q_0$ is a $p_0 \times p_0$ positive matrix. We form a new matrix $\bar B$ by replacing all the elements with a ⋆ in $B$ with a zero matrix of the same dimensions, i.e., We recall that $B_j$ is a $p_j \times p_{j-1}$ matrix having rank $p_j$. If we denote by $X = (x^{(p_0)}, x^{(p_1)}, \ldots, x^{(p_r)})$ the generic point of $\mathbb R^N = \mathbb R^{p_0} \times \mathbb R^{p_1} \times \cdots \times \mathbb R^{p_r}$, then the differential operator associated to the matrices $Q$ and $\bar B$ is given by The fact that the blocks $B_j$ have maximal rank allows one to easily check the condition (iv) in Proposition 2.12, therefore also $\bar{\mathscr K}$ verifies Hörmander's condition (i) in Proposition 2.12, with a matrix $\bar K(t)$ defined as in (1.2) with $\bar B$ in place of $B$. Furthermore, $\bar{\mathscr K}$ is left-invariant with respect to the group law • in Remark 2.3, in which $B$ has been replaced by $\bar B$. We remark that $\operatorname{tr} \bar B = 0$, and that $\bar B$ is nilpotent, therefore $e^{s\bar B}$ is in fact a finite sum. One important aspect of the operator $\bar{\mathscr K}$ is that, unlike $\mathscr K$, it possesses a homogeneous structure: it is invariant of degree 2 with respect to the group of anisotropic dilations $\delta_\lambda$ : We mention that it was proved in [39, Proposition 2.2] that a necessary and sufficient condition for the existence of a family of non-isotropic dilations $\delta_\lambda$ associated with the operator $\mathscr K$ in (1.1) is that $B$ in (v) takes precisely the special form $\bar B$. The homogeneous dimension of $(\mathbb R^{N+1}, \circ, \delta_\lambda)$ is given by Returning to the general discussion, we consider the one-parameter group of anisotropic dilations The fact that $\delta_\lambda$ are group automorphisms with respect to • is a consequence of the following commutation property valid for any $\lambda > 0$ and $\tau \in \mathbb R$, (see [39, eq. (2.20)] and also [37]). From this, and the fact that $\operatorname{tr} \bar B = 0$, one can see that the positive definite matrix $\bar K(t)$, defined in (1.2) with $\bar B$ instead of $B$, satisfies $\det(t\bar K(t)) = t^{D_0} \det(\bar K(1))$.
Denoting with $\bar V(t)$ the volume of the pseudoballs $\bar B_t(X, \sqrt t)$ associated with $\bar{\mathscr K}$, we thus conclude that we must have for every $t > 0$, The result in [39, eq. (3.14) and Remark 3.1] gives us the following asymptotic.
Definition 2.14. We call the number D 0 in (2.9) the intrinsic dimension at zero of the Hörmander semigroup {P t } t>0 . Note that it follows from (2.9) that it must be D 0 ≥ N ≥ 2.
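As an illustration of the intrinsic dimension at zero, the following sketch computes it for Kolmogorov's operator from the scaling identity $\det(tK(t)) = t^{D_0}\det(K(1))$. It assumes the block matrices in the first comment (our reading of the operator, with $N = 2n$ and $X = (v,x)$); for these, $B$ is already nilpotent with no ⋆ blocks. The last line additionally assumes that the pseudo-balls are ellipsoids, so that $V(t) = \omega_N (\det(tK(t)))^{1/2}$.

```latex
% Assumed: Q = [ I_n 0 ; 0 0 ], B = [ 0 0 ; I_n 0 ], hence p_0 = p_1 = n.
\[
tK(t) = \begin{pmatrix} t\, I_n & \tfrac{t^2}{2} I_n \\ \tfrac{t^2}{2} I_n & \tfrac{t^3}{3} I_n \end{pmatrix},
\qquad
\det\big(tK(t)\big) = \Big(\frac{t^4}{3} - \frac{t^4}{4}\Big)^{n} = \frac{t^{4n}}{12^{\,n}} = t^{4n} \det K(1).
\]
% Comparing with det(tK(t)) = t^{D_0} det K(1):
\[
D_0 = 4n = 2N \ \ge\ N .
\]
% Under the ellipsoid assumption V(t) = \omega_N (det(tK(t)))^{1/2}:
\[
V(t) = \omega_{2n}\, 12^{-n/2}\, t^{2n} = \omega_{2n}\, 12^{-n/2}\, t^{D_0/2}, \qquad t > 0 .
\]
```

In this example the power law holds for all times, not only as $t \to 0^+$, which reflects the fact that the operator is homogeneous with respect to the anisotropic dilations.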

Large time behaviour of the volume function and ultracontractivity
The analysis of the semigroup {P t } t>0 revolves on the large time behaviour of the volume function V (t). In this section we analyse this behaviour under the assumption (1.3). Our main result, Proposition 3.1, plays a pervasive role in the rest of the paper since: 1) it shows that V (t) grows at infinity at least linearly; and, 2) it says that when at least one of the eigenvalues of the drift matrix B has a strictly positive real part, then V (t) must blow up exponentially. In what follows we will make use of the equivalence (i) ⇐⇒ (ii) in Proposition 2.12. The notation σ(B) indicates the spectrum of B.
Proof. As will be evident from the proof, we first establish (ii) and then (i). Up to a change of variables in R N , we can assume that B ⋆ is in the following block-diagonal real Jordan canonical form (see, e.g., [35,Theorem 3.4.5]) where σ(B) = σ(B ⋆ ) = {λ 1 , . . . , λ q , a 1 ± ib 1 , . . . , a p ± ib p } with λ k , a ℓ , b ℓ ∈ R (b ℓ ≠ 0), n 1 + . . . + n q + 2m 1 + . . . + 2m p = N with n k , m ℓ ∈ N, and the n k × n k matrix J n k (λ k ) and the 2m ℓ × 2m ℓ matrix C m ℓ (a ℓ , b ℓ ) are respectively in the form Since tr B = q k=1 λ k + 2 p ℓ=1 a ℓ ≥ 0, we have two cases: Suppose L 0 > 0. We are going to show that, for some C 0 > 0, we have To do this, it is enough to show that where λ M (t) is the largest eigenvalue of tK(t). In fact, since t → tK(t) is monotone increasing in the sense of matrices, for t ≥ 1 all the eigenvalues of tK(t) are larger than the minimum eigenvalue of K(1), which is strictly positive by Hörmander's condition: this tells us that (3.2) implies (3.1). To prove (3.2), we notice that at least one of the following two possibilities occurs: . . , p} such that a ℓ 0 = L 0 . Suppose case (a) occurs. It is not restrictive to assume k 0 = 1. Then, v 0 = (1, 0, . . . , 0) ∈ R N is an eigenvector for B ⋆ with relative eigenvalue L 0 . Thus e sB ⋆ v 0 = e L 0 s v 0 , for all s ∈ R. From (ii) in Proposition 2.12 we know that v 0 ∉ Ker Q, i.e. < Qv 0 , v 0 > > 0. Therefore, we have With these notations, we have that span{v 1 , v 2 } is an invariant subspace for B ⋆ . From (ii) in Proposition 2.12 we know that span{v 1 , v 2 } is not contained in Ker Q. Moreover, denoting by J the symplectic matrix restricted to span{v 1 , v 2 } such that Jv 1 = v 2 and Jv 2 = −v 1 , we have for all s ∈ R and for any v ∈ span{v 1 , v 2 }.
Hence, for v ∈ span{v 1 , v 2 }, we have Jv. The fact that v and Jv cannot belong both to Ker Q implies that, for any t > 0, also v t and cos(b 1 t)v+sin(b 1 t)Jv cannot be in Ker Q at the same time. This says, since Q ≥ 0, that Qv t , v t ≥c for some positivec, from which we can deduce This proves (3.2) and concludes case (b). This establishes (ii) in the statement of the proposition. We next turn to proving (i). Suppose that λ k = 0 = a ℓ ∀k ∈ {1, . . . , q}, ℓ ∈ {1, . . . , p}.
Proposition 3.1 has the following basic consequence.
Proof. Recalling that t → tK(t) is monotone increasing in the sense of matrices, we have that t → V (t) is also a monotone function. Then, the conclusion immediately follows from Proposition 3.1.
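The dichotomy in the large-time behaviour of the volume function can be seen on three elementary model cases with $Q = I_N$. The following is a sketch only: it assumes that the pseudo-balls are ellipsoids, so that $V(t) = \omega_N (\det(tK(t)))^{1/2}$, and the three choices of $B$ are ours, picked for illustration.

```latex
% Assumption: V(t) = \omega_N (det(tK(t)))^{1/2}, with tK(t) = \int_0^t e^{sB} Q e^{sB^*} ds.
% Case B = O_N (heat operator, tr B = 0): power growth.
\[
tK(t) = t\, I_N, \qquad V(t) = \omega_N\, t^{N/2}.
\]
% Case B = I_N (tr B = N > 0): exponential growth, and V is not doubling.
\[
tK(t) = \frac{e^{2t} - 1}{2}\, I_N, \qquad
V(t) = \omega_N \Big(\frac{e^{2t}-1}{2}\Big)^{N/2}, \qquad
\frac{V(2t)}{V(t)} = \big(e^{2t} + 1\big)^{N/2} \longrightarrow \infty \quad (t \to \infty).
\]
% Case B = -I_N (Ornstein-Uhlenbeck, tr B = -N < 0): V stays bounded.
\[
tK(t) = \frac{1 - e^{-2t}}{2}\, I_N, \qquad
V(t) = \omega_N \Big(\frac{1 - e^{-2t}}{2}\Big)^{N/2} \longrightarrow \omega_N\, 2^{-N/2} \quad (t \to \infty).
\]
```

Since $N \ge 2$, the first case grows at least linearly for large $t$, while the third case, which violates (1.3), shows that no growth at all can be expected when $\operatorname{tr} B < 0$.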

Intrinsic dimension at infinity.
In dealing with the general class (1.1) the first question that comes to mind is: what number occupies the role of the dimension N in the analysis of the semigroup {P t } t>0 ? This question is central since, as one can see in fig.1, the behaviour for large times of the volume function V (t) = Vol N (B t (X, √ t)) can be quite diverse, depending on the structure of the matrix B, and in fact non-doubling in general. The next definition introduces a notion which allows to successfully handle this matter.
We call the number D ∞ = sup Σ ∞ the intrinsic dimension at infinity of the semigroup {P t } t>0 . In both cases the theory developed in this paper does not apply (we will return to this aspect in a future study); (5) it can happen that D ∞ < D 0 , see Ex.4; (6) finally, one can have D ∞ = ∞, see Ex.6 + .
In the following table we illustrate the different behaviours of the volume function V (t) on a significant sample of operators. The items in red refer to situations in which the drift matrix satisfies tr(B) ≥ 0. This is the situation covered by this paper.

3.2. Ultracontractivity. We next establish a crucial geometric property of the Hörmander semigroup that plays a pervasive role in the remainder of our work. The reader should note that we do not assume (1.3) in Proposition 3.5. As a consequence, such result alone does not imply a decay of the semigroup. In this respect, see Corollary 3.6.
Proposition 3.5 (L p → L ∞ Ultracontractivity). Let 1 ≤ p < ∞ and f ∈ L p . For every X ∈ R N and t > 0 we have , with 1/p + 1/p ′ = 1. Using (2.2) it is now easy to recognise that for any 1 ≤ r < ∞, there exists a universal constant c N,r > 0 such that The desired conclusion now follows taking r = p ′ in (3.5).
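The mechanism behind the ultracontractive bound is Hölder's inequality followed by a Gaussian integral. The following sketch spells this out under assumptions not displayed above: the transition density is taken to be $p(X,Y,t) = c_N V(t)^{-1}\exp(-m_t(X,Y)^2/4t)$ with $m_t^2$ a quadratic form with matrix $K(t)^{-1}$, and $V(t) = \omega_N(\det(tK(t)))^{1/2}$. The explicit constant obtained is a by-product of these assumptions.

```latex
% Step 1 (Hoelder), with 1/p + 1/p' = 1:
\[
|P_t f(X)| \le \| p(X,\cdot,t) \|_{p'}\, \|f\|_p .
\]
% Step 2 (Gaussian integral, for 1 <= r < \infty), using the assumed form of m_t:
\[
\int_{\mathbb R^N} \exp\Big(-\frac{r\, m_t(X,Y)^2}{4t}\Big)\, dY
= \Big(\frac{4\pi}{r}\Big)^{N/2} \big(\det(tK(t))\big)^{1/2}
= \Big(\frac{4\pi}{r}\Big)^{N/2} \frac{V(t)}{\omega_N} .
\]
% Step 3: the L^r norm of the kernel in the Y variable.
\[
\| p(X,\cdot,t) \|_{r}
= \frac{c_N}{V(t)} \Big[ \Big(\frac{4\pi}{r}\Big)^{N/2} \frac{V(t)}{\omega_N} \Big]^{1/r}
= \frac{c_N\, (4\pi/r)^{N/(2r)}\, \omega_N^{-1/r}}{V(t)^{\,1 - 1/r}} .
\]
% Step 4: take r = p' (so 1 - 1/r = 1/p), for 1 < p < \infty:
\[
|P_t f(X)| \le \frac{c_{N,p}}{V(t)^{1/p}}\, \|f\|_p ,
\qquad
c_{N,p} = c_N\, (4\pi/p')^{N/(2p')}\, \omega_N^{-1/p'} .
\]
% For p = 1 one uses instead sup_Y p(X,Y,t) = c_N / V(t).
```

The bound is uniform in $X$, and it yields decay in time only when $V(t) \to \infty$ as $t \to \infty$.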
For later use, we also record the following formula, dual to (3.5), which easily follows by (2.3) Corollary 3.6. Assume (1.3) and let 1 ≤ p < ∞. For every f ∈ L p and X ∈ R N , we have lim t→∞ |P t f (X)| = 0.
Proof. By Proposition 3.5 we have for every X ∈ R N and t > 0 Combining this estimate with Corollary 3.2 we find

Sobolev spaces
In the recent work [28] we developed a fractional calculus for the operators K in (1.1) and solved the so-called extension problem. This is a generalisation of the famous work by Caffarelli and Silvestre for the fractional Laplacian (−∆) s , see [13]. As a by-product of our work, we obtained a nonlocal calculus for the "time-independent" part of the operators K , namely the second order partial differential operator A u = tr(Q∇ 2 u)+ < BX, ∇u > .
It is worth mentioning here that boundary values for these elliptic-parabolic operators were studied by Fichera in his pioneering works [22], [23].
Since the nonlocal operators (−A ) s play a central role in the present work we now recall their definition from [28,Definition 3.1]. Hereafter, when considering the action of the operators A or (−A ) s on a given L p , the reader should keep in mind Remark 2.10.
Definition 4.1. Let $0 < s < 1$. For any $f \in \mathscr S$ we define the nonlocal operator $(-\mathscr A)^s$ by the following pointwise formula
(4.1) $(-\mathscr A)^s f(X) = -\frac{s}{\Gamma(1-s)} \int_0^\infty t^{-s-1} \left[P_t f(X) - f(X)\right] dt$, $X \in \mathbb R^N$.
We mention that it was shown in [28] that the right-hand side of (4.1) is a convergent integral (in the sense of Bochner) in $L^\infty$, and also in $L^p$ for any $p \in [1, \infty]$ when (1.3) holds. We note that, when $\mathscr A = \Delta$, it is easy to see that formula (4.1) allows one to recover M. Riesz' definition in [55] of the fractional powers of the Laplacian. Definition (4.1) comes from Balakrishnan's seminal work [5]. The nonlocal operators (4.1) enjoy the following semigroup property (see [5] for the case $s + s' < 1$ and [30] for $s + s' = 1$).
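A quick sanity check of Balakrishnan's formula is to apply it to a scalar model, in which the semigroup acts as multiplication by $e^{-ta}$ for a fixed number $a > 0$ (for instance, the heat semigroup on the Fourier side, with $a$ playing the role of $|\xi|^2$ up to normalisation). The sketch below assumes the formula in the form $-\frac{s}{\Gamma(1-s)}\int_0^\infty t^{-s-1}[P_t - I]\,dt$ and verifies that it returns $a^s$.

```latex
% Scalar model: P_t = e^{-ta}, a > 0, 0 < s < 1.
% Integrate by parts; the boundary terms vanish since
% 1 - e^{-at} = O(t) as t -> 0 and is bounded as t -> \infty.
\[
\int_0^\infty t^{-s-1}\big(1 - e^{-at}\big)\, dt
= \Big[ -\frac{t^{-s}}{s}\big(1 - e^{-at}\big) \Big]_0^\infty
+ \frac{a}{s} \int_0^\infty t^{-s} e^{-at}\, dt
= \frac{a}{s}\, \Gamma(1-s)\, a^{s-1}
= \frac{\Gamma(1-s)}{s}\, a^{s} .
\]
% Therefore
\[
-\frac{s}{\Gamma(1-s)} \int_0^\infty t^{-s-1}\big(e^{-at} - 1\big)\, dt = a^{s} .
\]
```

This explains the normalising constant $s/\Gamma(1-s)$: it is exactly the one that turns the generator $-a$ of the scalar semigroup into the power $a^s$.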

Proposition 4.2. Let $s, s' \in (0, 1)$ and suppose that $s + s' \in (0, 1]$. Then, for every $f \in \mathscr S$ we have $(-\mathscr A)^{s+s'} f = (-\mathscr A)^s \circ (-\mathscr A)^{s'} f$.
For any given 1 ≤ p < ∞, and any 0 < s < 1, we denote by Proof. In view of (4.1) we have Thanks to Lemma 2.6 we now have for some universal constant C > 0, On the other hand, by (iv) in Lemma 2.5 we know that, under the hypothesis (1.3), P t is a contraction in L p . We thus obtain This proves the desired conclusion.
We now use the nonlocal operators (−A ) s to introduce the functional spaces naturally attached to the operator A . These spaces involve a fractional order of differentiation that is intrinsically calibrated both on the directions of ellipticity of the second order part of (1.1), as well as on the drift.  then for 1 < p < ∞ and s = 1/2 the space L 2s,p coincides with the classical Sobolev space W 1,p = {f ∈ L p | ∇f ∈ L p }, endowed with the usual norm ||f || W 1,p = ||f || L p + ||∇f || L p .
In other words, one has $\mathscr L^{1,p} = W^{1,p}$, for $1 < p < \infty$. This follows from the well-known fact that $W^{1,p} = \overline{\mathscr S}^{\,\|\cdot\|_{W^{1,p}}}$ (Friedrichs' mollifiers, see [26]), combined with the $L^p$ continuity of the singular integrals (Riesz transforms) in the range $1 < p < \infty$, see [57, Ch. 3]. This implies the double inequality $A_p \|(-\Delta)^{1/2} f\|_p \le \|\nabla f\|_p \le B_p \|(-\Delta)^{1/2} f\|_p$ for $f \in \mathscr S$, with constants $A_p, B_p > 0$.
(iii) We mention that such inequality, and therefore the identity $\mathscr L^{1,p} = W^{1,p}$, continue to be valid on any complete Riemannian manifold with Ricci lower bound $\mathrm{Ric} \ge -\kappa$, where $\kappa \ge 0$. This was proved by Bakry in [4]. A generalisation to the larger class of sub-Riemannian manifolds with transverse symmetries was subsequently obtained in [6].
(iv) As a final comment we note that, when $p = 2$, and again $\mathscr A = \Delta$, then the space $\mathscr L^{2s,p}$ coincides with the classical Sobolev space of fractional order $H^{2s}$, see e.g. [42] or [2].
We close this section by recalling the result from [28] that will be needed in the next one. Given 0 < s < 1, let a = 1 − 2s. The extension problem for (−A ) s consists in the following degenerate Dirichlet problem in the variables (X, z) ∈ R N +1 + , where X ∈ R N and z > 0: where f ∈ S . We note that, since s ∈ (0, 1), the relation a = 1 − 2s gives a ∈ (−1, 1), and that, in particular, a = 0 when s = 1/2. For the following Poisson kernel for the problem (4.2), and for the subsequent Theorem 4.6, one should see [28, Def. 5.1 and Theor. 5.5], The next result generalises the famous one by Caffarelli and Silvestre in [13] for the nonlocal operator (−∆) s . (0, ∞)) and solves the extension problem (4.2). By this we mean that A a U = 0 in R N +1 + , and we have in L ∞ Moreover, we also have in L ∞ If furthermore one has tr B ≥ 0, then the convergence in (4.4), (4.5) is also in L p for any 1 ≤ p ≤ ∞.

The key Littlewood-Paley estimate
In the Hardy-Littlewood theory the weak L 1 continuity of the maximal function occupies a central position. It is natural to expect that such result play a similar role for the operators in the general class (1.1), but because of the intertwining of the X and t variables it is not obvious how to select a "good" maximal function. At first it seems natural to consider M f (X) = sup t>0 |P t f (X)|, but such object presents an obstruction connected with the mapping properties of the Littlewood-Paley function that controls it. We have been able to circumvent this difficulty by combining a far-reaching idea of E. Stein in [56] with our work in [28]. In this respect, the case s = 1/2 of Theorem 4.6 provides the main technical tool to bypass the above mentioned difficulties connected with P t . It will lead us to Theorem 5.5, which is the main result of this section.
Since in what follows we are primarily interested in the nonlocal operator $(-\mathscr{A})^{1/2}$ (the case $a = 0$ in Theorem 4.6), we will focus our attention on the corresponding Poisson kernel, which for ease of notation we henceforth denote by $\mathscr{P}(X, Y, z) \overset{\mathrm{def}}{=} \mathscr{P}^{(0)}(X, Y, z)$. In such case, formula (4.3) reads

Definition 5.1. We define the Poisson semigroup as follows

Using (5.1) and exchanging the order of integration in the above definition, we obtain the following useful representation of the Poisson semigroup $\mathscr{P}_z$ in terms of the Hörmander semigroup $P_t$

This is of course an instance of Bochner's subordination, see [7]. We note in passing that, when $\mathscr{A} = \Delta$, from (5.2) we recover the classical Poisson kernel for the half-space $\mathbb{R}^{N+1}_+$, see [57, (15), p. 61],

Some basic facts that we need about $\{\mathscr{P}_z\}_{z>0}$ are contained in the next result.
Lemma 5.2. The following properties hold: (i) For every $X \in \mathbb{R}^N$ and $z > 0$ we have $\mathscr{P}_z 1(X) = 1$; and it satisfies the partial differential equation

Proof. The proof of (i) follows by taking $a = 0$ in [28, Proposition 5.2]. (ii) is a direct consequence of (i). To establish (iii) we use (5.2), which gives

where in the second inequality we have used (iv) in Lemma 2.5, and in the last equality the fact that

The properties (iv) and (v) follow from the case $a = 0$ of Theorem 4.6.
Remark 5.3. We note explicitly that (iv) in Lemma 5.2 says, in particular, that the infinitesimal generator of $\mathscr{P}_z$ is, up to sign, the nonlocal operator $(-\mathscr{A})^{1/2}$, i.e., $\mathscr{P}_z = e^{-z\sqrt{-\mathscr{A}}}$. In the case $\mathscr{A} = \Delta$ one should see the seminal work [58], where extensive use of the Poisson semigroup was made in connection with smoothness properties of functions.
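The identification of the Poisson semigroup with the exponential of a square root can be motivated at the level of scalars. The following is a sketch of the classical subordination identity, under the assumption that (5.2) has the form $\mathscr{P}_z f = \int_0^\infty g(z,t)\, P_t f\, dt$ with the kernel $g(z,t) = \frac{z\,t^{-3/2}}{\sqrt{4\pi}} e^{-z^2/(4t)}$ appearing in the proof of Lemma 5.4. The number $\lambda \ge 0$ is an illustrative parameter of ours, formally playing the role of $-\mathscr{A}$.

```latex
% Scalar subordination identity: replace P_t by e^{-t\lambda}.
\begin{equation*}
\frac{z}{\sqrt{4\pi}} \int_0^\infty t^{-3/2}\, e^{-\frac{z^2}{4t}}\, e^{-t\lambda}\, dt
= e^{-z\sqrt{\lambda}},
\qquad z > 0,\ \lambda \ge 0 .
\end{equation*}
```

For $\lambda = 0$ the identity reduces to $\int_0^\infty g(z,t)\,dt = 1$, which is consistent with property (i) in Lemma 5.2.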
Given a reasonable function $f$ (for instance, $f \in \mathscr{S}$) we now introduce its Poisson radial maximal function as follows

Lemma 5.4. There exists a universal constant $A > 0$ such that

Proof. Adapting an idea in [56, p. 49], we can write (5.2) as

where $g(z, t) = \frac{z\, t^{-3/2}}{\sqrt{4\pi}}\, e^{-\frac{z^2}{4t}}$, and we have let $F(t) = \frac{1}{t} \int_0^t P_s f(X)\,ds$. Notice that by (ii) in Lemma 2.5, we can bound $|F(t)| \le \|f\|_\infty$. Also observe that $t\,g(z, t) \to 0$ as $t \to \infty$, and that $t \mapsto t\,\frac{\partial g}{\partial t}(z, t) \in L^1(0, \infty)$. We can thus integrate by parts in (5.5), obtaining

To complete the proof it suffices to observe that $A(z) \le A = 7/2$ for every $z > 0$. This follows from the fact that $t\,\frac{\partial g}{\partial t}(z, t) = \left(\frac{z^2}{4t} - \frac{3}{2}\right) g(z, t)$, and that $\int_0^\infty g(z, t)\,dt = 1$, and

Theorem 5.5 below is the main result of this section. It provides the key maximal theorem for the class (1.1). As far as we know, such a tool has so far been missing from the existing literature.
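The integral facts invoked at the end of the proof of Lemma 5.4 can be checked by hand. The following sketch uses only the explicit expression of $g$ given in that proof, the elementary identity $t\,\partial_t g = \big(\tfrac{z^2}{4t} - \tfrac32\big) g$, and the substitution $u = z^2/(4t)$; it is meant as a sanity check and not as a reproduction of the omitted displays.

```latex
% Substitution u = z^2/(4t) turns g(z,t) dt into a Gamma density.
\begin{align*}
g(z,t)\,dt \;&\longmapsto\; \frac{1}{\sqrt{\pi}}\, u^{-1/2} e^{-u}\, du,
  \qquad u = \frac{z^2}{4t}, \\
\int_0^\infty g(z,t)\,dt &= \frac{\Gamma(1/2)}{\sqrt{\pi}} = 1, \\
\int_0^\infty \frac{z^2}{4t}\, g(z,t)\,dt
  &= \frac{1}{\sqrt{\pi}} \int_0^\infty u^{1/2} e^{-u}\,du
   = \frac{\Gamma(3/2)}{\sqrt{\pi}} = \frac12, \\
\int_0^\infty \Big| t\,\frac{\partial g}{\partial t}(z,t) \Big|\,dt
  &\le \frac12 + \frac32 = 2 .
\end{align*}
```

In particular the last quantity is bounded independently of $z > 0$, and it does not exceed the constant $7/2$ stated in the proof.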
(b) let $1 < p \le \infty$; then there exists a universal constant $A_p > 0$ such that for any $f \in L^p$ one has

Proof. (a) In view of (iv) in Lemma 2.5, we know that $\{P_t\}_{t>0}$ is contractive in $L^1$ and in $L^\infty$. Furthermore, by Corollary 2.7 it is a strongly continuous semigroup in $L^p$, for every $1 \le p < \infty$. We can thus apply the powerful Hopf-Dunford-Schwartz ergodic theorem, see [19, Lemma 6, p. 153], and infer that, if $f \in L^1$, then for every $\lambda > 0$ one has

where we have let

On the other hand, (5.4) in Lemma 5.4 gives

where in the second inequality we have used (5.6).
(b) We observe that from (ii) in Lemma 5.2 we trivially have

By (a) and the Marcinkiewicz real interpolation theorem (see [57, Chap. 1, Theor. 5]), we conclude that (b) is true for some $A_p > 0$.
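For the reader's convenience we sketch the standard interpolation argument behind part (b) in the range $1 < p < \infty$. Here $T$ denotes a generic sublinear operator (our notation, standing in for the Poisson maximal operator) which satisfies a weak $(1,1)$ bound with constant $C_1$ and the trivial bound $\|Tf\|_\infty \le \|f\|_\infty$; the resulting constant is not optimised and need not coincide with the one in [57].

```latex
% Split f at height lambda/2: f = f^lambda + f_lambda, with |f_lambda| <= lambda/2.
\begin{align*}
f^{\lambda} = f\,\mathbf{1}_{\{|f| > \lambda/2\}}, \quad
f_{\lambda} = f - f^{\lambda}
\ &\Longrightarrow\
|Tf| \le |Tf^{\lambda}| + \tfrac{\lambda}{2}, \\
\big|\{ |Tf| > \lambda \}\big|
 \le \big|\{ |Tf^{\lambda}| > \tfrac{\lambda}{2} \}\big|
 &\le \frac{2 C_1}{\lambda} \int_{\{|f| > \lambda/2\}} |f(X)|\, dX, \\
\|Tf\|_p^p = p \int_0^\infty \lambda^{p-1} \big|\{ |Tf| > \lambda \}\big|\, d\lambda
 &\le 2 C_1\, p \int_{\mathbb{R}^N} |f(X)| \int_0^{2|f(X)|} \lambda^{p-2}\, d\lambda\, dX \\
 &= \frac{2^{p}\, C_1\, p}{p-1}\, \|f\|_p^p .
\end{align*}
```

One may thus take $A_p^p = 2^p C_1\, p/(p-1)$ when $1 < p < \infty$, while for $p = \infty$ the bound is the trivial one.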

The fractional integration operator $I_\alpha$
In the classical theory of Hardy-Littlewood-Sobolev, M. Riesz's operator of fractional integration plays a pivotal role. We recall, see [55] and also [57, Chap. 5], that given a number $0 < \alpha < N$, the latter is defined by the formula

The essential feature of this operator is that it provides the inverse of the fractional powers of the Laplacian, in the sense that for any $f \in \mathscr{S}$ one has $f = I_\alpha \circ (-\Delta)^{\alpha/2} f$. Its role in the Hardy-Littlewood theory is perhaps best highlighted by the following interpolating inequality, which goes back to [57, Chapter 5], see also [32]. Suppose $1 \le p < N/\alpha$ and that $f \in L^p$. Then, one has for any $\varepsilon > 0$,

The usefulness of the inequality (6.2) is multi-faceted. On the one hand, when $p > 1$, combined with the strong $L^p$ continuity of the maximal operator, it shows that $I_\alpha : L^p \to L^q$, provided that $1/p - 1/q = \alpha/N$. On the other hand, (6.2) allows one to establish immediately the geometric weak end-point result $W^{1,1} \hookrightarrow L^{\frac{N}{N-1},\infty}$. This implies, in turn, the isoperimetric inequality $P(E) \ge C_N |E|^{\frac{N-1}{N}}$ and, equivalently, the strong geometric Sobolev embedding $BV \hookrightarrow L^{\frac{N}{N-1}}$, where $P(E)$ denotes De Giorgi's perimeter and $BV$ the space of functions in $L^1$ with bounded variation (for these aspects we refer to [14], where these ideas were developed in the general framework of Carnot-Carathéodory spaces).
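The way in which an inequality such as (6.2) yields the mapping property $I_\alpha : L^p \to L^q$ can be sketched as follows. We assume here that (6.2) has the familiar Stein-Hedberg form displayed in the first line below, where $M$ is the Hardy-Littlewood maximal operator, and $C$ and $A_p$ are generic constants of ours; this is an assumption on the shape of the omitted display, not a quotation of it.

```latex
% Assumed form of (6.2), then optimisation in epsilon.
\begin{align*}
|I_\alpha f(x)| &\le C \Big( \varepsilon^{\alpha}\, M f(x)
   + \varepsilon^{\alpha - \frac{N}{p}}\, \|f\|_p \Big),
   \qquad \varepsilon > 0, \\
\varepsilon = \Big( \frac{\|f\|_p}{M f(x)} \Big)^{p/N}
 \ &\Longrightarrow\
 |I_\alpha f(x)| \le 2C\, M f(x)^{1 - \frac{\alpha p}{N}}\, \|f\|_p^{\frac{\alpha p}{N}}, \\
\frac{1}{p} - \frac{1}{q} = \frac{\alpha}{N}
 \ &\Longrightarrow\
 1 - \frac{\alpha p}{N} = \frac{p}{q}, \qquad \frac{\alpha p}{N}\, q = q - p, \\
\|I_\alpha f\|_q^q &\le (2C)^q\, \|M f\|_p^p\, \|f\|_p^{\,q-p}
 \le (2C)^q A_p^p\, \|f\|_p^{\,q}, \qquad 1 < p < \frac{N}{\alpha} .
\end{align*}
```

The same pointwise bound with $p = 1$, combined with the weak $L^1$ continuity of $M$, gives the weak end-point estimate.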
In this section, we use the Poisson semigroup $\{\mathscr{P}_z\}_{z>0}$ in Definition 5.1 to introduce, in our setting, the counterpart of the potential operators (6.1), see Lemma 6.2. Theorem 6.3 is the first main result of the section. It shows that the operator $I_{2s}$ inverts the nonlocal operator $(-\mathscr{A})^s$. In the next definition the reader needs to keep in mind the number $D_\infty$ in Definition 3.3.

Definition 6.1. Let $0 < \alpha < D_\infty$. Given $f \in \mathscr{S}$, we define the Riesz potential of order $\alpha$ as follows

Let us observe that for every $X \in \mathbb{R}^N$ the integral in Definition 6.1 converges absolutely. To see this we write

The integral on $[0, 1]$ is absolutely convergent for any $\alpha > 0$ since, using (ii) in Lemma 2.5, we can bound $|P_t f(X)| \le \|P_t f\|_\infty \le \|f\|_\infty$. For the integral on $[1, \infty)$ we use the ultracontractivity of $P_t$ in Proposition 3.5, which gives for any $X \in \mathbb{R}^N$ and $t > 0$,

In the next lemma, using Bochner's subordination, we recall a useful alternative expression of the potential operators $I_\alpha$ based on the Poisson semigroup $\{\mathscr{P}_z\}_{z>0}$.
where in the last equality we have used, with $x = \alpha/2$, the well-known duplication formula for the gamma function $2^{2x-1}\,\Gamma(x)\,\Gamma(x + 1/2) = \sqrt{\pi}\,\Gamma(2x)$.

The next basic result plays a central role for the remainder of this paper. It shows that the integral operator $I_\alpha$ is the inverse of the nonlocal operator $(-\mathscr{A})^{\alpha/2}$.

Theorem 6.3. Suppose that (1.3) holds, and let $0 < s < 1$. Then, for any $f \in \mathscr{S}$ we have

Proof. We only prove the first equality; the second is established similarly. It will be useful in what follows to adopt the following alternative expression, see [5], of the nonlocal operator (4.1)

where we have denoted by $R(\lambda, \mathscr{A}) = (\lambda I - \mathscr{A})^{-1}$ the resolvent of $\mathscr{A}$ in $L^\infty_0$ (we are now identifying $\mathscr{A}$ with $\mathscr{A}_\infty$, the infinitesimal generator of $\{P_t\}_{t>0}$ in $L^\infty_0$, see Remarks 2.8, 2.10 and Lemma 2.11). We remark that each of the integrals in the right-hand side of (6.3) converges in $L^\infty$. For instance, in the first integral there is no issue near $\lambda = 0$ since $s > 0$, whereas (3) in Lemma 2.11 gives $\lambda^{s-1} \|R(\lambda, \mathscr{A})(-\mathscr{A}) f\|_\infty \le \lambda^{s-2} \|\mathscr{A} f\|_\infty$, which is integrable near $\infty$. Keeping in mind that by (2) in Lemma 2.11 we have $R(\lambda, \mathscr{A}) f = \int_0^\infty e^{-\lambda t} P_t f\,dt$, we can alternatively express (6.3) as follows

If we now combine Definition 6.1 with (6.4), we find

where in the innermost integral we have made the change of variables $\rho = \tau(1 + u)$. We notice that one can justify the above relations by a standard application of the Fubini and Tonelli theorems once we recognize that, for large $t$, the ultracontractivity and the fact that $D_\infty \ge 2 > 2s$ ensure the right summability properties. We now make the key observation that

An intrinsic embedding theorem of Sobolev type

In this section we prove our main embedding of Sobolev type, Theorem 7.5. Our strategy follows the classical approach to the subject. We first establish the key Hardy-Littlewood-Sobolev type result, Theorem 7.4. With this tool in hand, we are easily able to obtain the Sobolev embedding, Theorem 7.5.
We note that these results do not tell the whole story since, as noted in Remark 7.2, their main assumption (7.1) necessarily implies that $D_0 \le D_\infty$. But we have seen in Ex. 4 in Fig. 1 that there exist operators of interest in physics for which we have instead $D_0 > D_\infty$. These cases are handled by Theorems 7.6 and 7.7. Since we will need to have in place all the results from the previous sections, hereafter we assume without further mention that the assumption (1.3) is in force. Our first result shows a basic property of the Poisson semigroup.
Lemma 7.1 (Ultracontractivity of $\mathscr{P}_z = e^{-z\sqrt{-\mathscr{A}}}$). Suppose that there exist numbers $D, \gamma_D > 0$ such that for every $t > 0$ one has

If $1 \le p < \infty$ one has for $f \in L^p$, $X \in \mathbb{R}^N$ and any $z > 0$,

Proof. From (5.2), Proposition 3.5 and (7.1) we find

Remark 7.2. Keeping Definitions 2.14 and 3.3 in mind, the reader should note that the assumption (7.1) necessarily implies that $D_0 \le D \le D_\infty$. Thus, the case $D_0 > D_\infty$ is left out, but it will be addressed in Theorems 7.6 and 7.7.
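The mechanism behind Lemma 7.1 can be sketched as follows. We work under three assumptions on the omitted displays, all of which are ours: that (7.1) reads $V(t) \ge \gamma_D\, t^{D/2}$, that the ultracontractive estimate of Proposition 3.5 has the form $|P_t f(X)| \le c\, V(t)^{-1/p} \|f\|_p$ for some constant $c > 0$, and that (5.2) reads $\mathscr{P}_z f = \int_0^\infty g(z,t)\, P_t f\, dt$ with $g$ as in the proof of Lemma 5.4.

```latex
% Subordination plus polynomial decay of P_t gives polynomial decay of the Poisson semigroup.
\begin{align*}
|\mathscr{P}_z f(X)|
 &\le \int_0^\infty g(z,t)\, |P_t f(X)|\, dt
 \le c\, \gamma_D^{-1/p}\, \|f\|_p \int_0^\infty g(z,t)\, t^{-\frac{D}{2p}}\, dt, \\
\int_0^\infty g(z,t)\, t^{-\frac{D}{2p}}\, dt
 &= \frac{1}{\sqrt{\pi}} \int_0^\infty u^{-1/2} e^{-u}
    \Big( \frac{4u}{z^2} \Big)^{\frac{D}{2p}} du
 \qquad \Big( u = \frac{z^2}{4t} \Big) \\
 &= \frac{2^{D/p}\, \Gamma\big( \frac{D}{2p} + \frac12 \big)}{\sqrt{\pi}}\; z^{-D/p} .
\end{align*}
```

Under these assumptions one obtains a bound of the type $|\mathscr{P}_z f(X)| \le C\, z^{-D/p}\, \|f\|_p$, in which the time decay $t^{-D/(2p)}$ of $P_t$ is converted into the decay $z^{-D/p}$ in the Poisson variable.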
The next proposition contains an essential interpolation estimate which generalises the one in [63] (see also [64]) to the degenerate non-symmetric setting of (1.1). This tool represents the semigroup replacement of the Stein-Hedberg estimate (6.2).
where $C_2 = C_2(N, D, \alpha, \gamma_D, p) > 0$. Combining this estimate with (7.4) and (7.3), we conclude that (7.2) holds.

With Proposition 7.3 in hand, we can now establish the first main result of this section.
To prove (ii), we suppose now that $1 < p < D/\alpha$. Minimising with respect to $\varepsilon$ in (7.2) we easily find

for some constant $C_3 = C_3(N, D, \alpha, \gamma_D, p) > 0$. The desired conclusion (7.6) now follows from (7.8) and from (b) in Theorem 5.5.

Theorem 7.4 is the keystone on which the second main result of this section rests. Before stating it, we emphasise that in view of (iii) in Remark 3.4 we know that $D_\infty \ge 2$. Therefore, if $0 < s < 1$ then $2s < 2 \le D_\infty$.

Theorem 7.5 (of Sobolev type). Suppose that (7.1) holds. Let $0 < s < 1$. Given $1 \le p < D/2s$ let $q > p$ be such that $\frac{1}{p} - \frac{1}{q} = \frac{2s}{D}$. (a) If $p > 1$ we have $\mathscr{L}^{2s,p} \hookrightarrow L^{\frac{pD}{D-2sp}}$. More precisely, there exists a constant $S_{p,s} > 0$, depending on $N, D, s, \gamma_D, p$, such that for any $f \in \mathscr{S}$ one has $\|f\|_q \le S_{p,s}\, \|(-\mathscr{A})^s f\|_p$.
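The relation between the exponents in Theorem 7.5 amounts to a one-line computation, which we record as a sketch together with a numerical illustration. The values $D = 4$, $s = 1/2$, $p = 2$ are purely illustrative choices of ours and are not tied to any specific operator in the class (1.1).

```latex
% Exponent bookkeeping for the Sobolev-type embedding.
\begin{equation*}
\frac{1}{q} = \frac{1}{p} - \frac{2s}{D} = \frac{D - 2sp}{pD}
\quad\Longleftrightarrow\quad
q = \frac{pD}{D - 2sp}, \qquad 1 \le p < \frac{D}{2s} .
\end{equation*}
% Illustration with the hypothetical values D = 4, s = 1/2, p = 2.
\begin{equation*}
q = \frac{2 \cdot 4}{4 - 2 \cdot \frac12 \cdot 2} = \frac{8}{2} = 4,
\qquad
\frac12 - \frac14 = \frac14 = \frac{2s}{D} .
\end{equation*}
```

The constraint $p < D/2s$ is exactly what keeps the denominator $D - 2sp$ positive, so that $q$ is finite and $q > p$.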
Proof. We observe that (3) in Remark 3.4 guarantees that $D \le D_\infty$, and therefore $I_{2s}$ is well-defined. At this point, the proof is easily obtained by combining Theorem 6.3, which allows us to write for every $X \in \mathbb{R}^N$ $|f(X)| = |I_{2s} (-\mathscr{A})^s f(X)|$, with Theorem 7.4. We leave the routine details to the interested reader.

From Remark 7.2 we know that Theorem 7.4 does not cover situations, such as Kramers' operator in Ex. 4 in Fig. 1, in which $D_0 > D_\infty$. When this happens we have the following substitute result. In the sequel, when we write $L^{q_1} + L^{q_2}$ we mean the Banach space of functions $f$ which can be written as $f = f_1 + f_2$ with $f_1 \in L^{q_1}$ and $f_2 \in L^{q_2}$, endowed with the norm

Theorem 7.6. Suppose there exists $\gamma > 0$ such that for every $t > 0$ one has

(7.9) $V(t) \ge \gamma \min\{t^{D_0/2}, t^{D_\infty/2}\}$.
To prove (ii) we look for the minimum of G which is attained at some ε such that In other words ε min = max{A f (X) p/D 0 , A f (X) p/D∞ }. Going back to (7.14) we conclude αp D∞ +C Γ(α) ||f || p g(ε min ).
In the case 0 < A f (X) < 1, then we have .