A universal variational framework for parabolic equations and systems

We propose a variational approach to solve Cauchy problems for parabolic equations and systems independently of regularity theory for solutions. This produces a universal and conceptually simple construction of fundamental solution operators (also called propagators) for which we prove L 2 off-diagonal estimates, which is new under our assumptions. In the special case of systems for which pointwise local bounds hold for weak solutions, this provides Gaussian upper bounds for the corresponding fundamental solution. In particular, we obtain a new proof of Aronson’s estimates for real equations. The scheme is general enough to allow systems with higher order elliptic parts on full space or second order elliptic parts on Sobolev spaces with boundary conditions. Another new feature is that the control on lower order coefficients is within critical mixed time-space Lebesgue spaces or even mixed Lorentz spaces.


Introduction
The classical treatment of parabolic problems begins with solving the Cauchy problem with or without forcing terms and representing solutions by what is called fundamental solutions. Here, we consider operators of the form ∂ t +L, where L is an elliptic operator in divergence form with possibly complex-valued coefficients. Coefficients depend on all space and time variables. We assume strongly (Gårding) elliptic and bounded higher order coefficients and unbounded lower order coefficients controlled in mixed Lebesgue and even Lorentz norms that are compatible with Sobolev embeddings for solutions. In particular, our treatment includes parabolic Schrödinger operators with Coulomb like potentials.
When the coefficients are regular, several methods are possible to construct the fundamental solution and the most efficient one is via a parametrix, using the so-called freezing point technique, which reduces the situation to space-independent coefficients for which fundamental solutions are explicit kernels Γ(t, x, s, y) with exponential decay in (|x − y| 2m /|t − s|) 1/(2m−1) , where 2m is the order of the elliptic operator [21]. When m = 1, this is the Gaussian decay.
When coefficients become irregular (measurable, unbounded), one goes through the theory of weak solutions that was developed in the 1950's-60's, culminating in the treatise by Ladyženskaja, Solonnikov and Ural'ceva [28]. In parallel, when the coefficients are real-valued and the elliptic operator has order two, Aronson constructed generalized fundamental solutions, using Riesz representation theorems as a consequence of well-posedness of Cauchy problems that generate bounded solutions. He also proved Gaussian upper and lower bounds [3,4]. This supposedly closed the topic but here we shall reveal some new phenomena.
Our starting point is the guiding principle that many results on elliptic problems have counterparts in the parabolic world, once evolution with respect to time is taken into account. However, elliptic problems are tackled using a coercive variational formulation, while parabolic problems are attacked via the Cauchy problem as mentioned above. The presence of the first order time derivative seems to forbid any possibility of coercivity as in the elliptic case.
Nevertheless, the heat kernel can be seen as the kernel of the operator (∂ t −∆) −1 . Here, the inverse can be computed using Fourier transform, but for more general parabolic operators this is not possible. Our question is whether some form of invertibility can still be implemented. We show that indeed there is a variational formulation in the parabolic setting, too. That is, we find a variational space V such that if ∂ t + L : V → V ′ , V ′ being its dual, is invertible, then one can represent the inverse by Green operators that eventually become the fundamental solution operators for the Cauchy problem (and whose kernels, whenever they exist in a pointwise sense, give a generalized fundamental solution). Invertibility and causality are then checked under appropriate coercivity requirements. In other words, we are reversing the order of the usual arguments. Our main conclusion for Cauchy problems in the case of coefficients in mixed Lebesgue spaces is in Theorem 2.54.
One may think this is a matter of cosmetic changes in the theory, but it is not. For instance, in the case of a second order elliptic part, the usual energy space of weak solutions L 2 (I; H 1 ) ∩ L ∞ (I; L 2 ) or the smaller Lions' space L 2 (I; H 1 ) ∩ H 1 (I; H −1 ) cannot play the role of a variational space as above, as the dual is either not handy or too big. Thus, we have to give up a priori boundedness in L 2 . Also, for symmetry reasons it is easier to let I = R as this avoids boundary conditions for the time derivative. If one looks for a variational space candidate, the space L 2 (R; H 1 ) is unavoidable for weak solutions since it is mapped to its dual by the leading terms. Another Hilbert space, H 1/2 (R; L 2 ), is mapped to its dual by the time derivative. This space already appeared in the theory [28,29], but in the regularity theory rather than in an instrumental role. Hence, the space V = L 2 (R; H 1 ) ∩ H 1/2 (R; L 2 ), or its homogeneous version V̇, is a natural candidate and we are going to assume from the start the V̇ → V̇ ′ invertibility of the parabolic operator alluded to above. Even though H 1/2 (R) ⊂ L ∞ (R) fails, the homogeneous versions of these two spaces have the same scale invariance and therefore the homogeneous versions of the spaces V and L 2 (R; H 1 ) ∩ L ∞ (R; L 2 ) have the same embeddings into the mixed spaces L r (R; L q ) except for the endpoint exponents r = ∞, q = 2. Regularity theory based on improvements of Lions' embedding theorem allows us to introduce a class of solutions where one can uniquely solve ∂ t u + Lu = δ s ⊗ ψ for ψ ∈ L 2 and δ s the Dirac mass at s, and show that such solutions are continuously L 2 -valued except at s. This turns out to be precisely what is needed to define Green operators with the expected properties that can be used to represent solutions of the Cauchy problem.
Everything boils down to proving invertibility of the parabolic operator, which uses an idea going back to [24] that has been rediscovered several times since. There, Kaplan showed for the first time, in the absence of lower order coefficients, where the coercivity of the parabolic operator ∂ t + L hides, even though ∂ t alone is not coercive in any sense since Re ∫ R ∂ t u ū dt = 0 holds for any reasonable function u.
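The vanishing of this real part can be checked by hand. The following is a minimal sketch, assuming (as our reading of "reasonable") that u is smooth in t with values in L 2 x and decays as |t| → ∞; the bracket denotes the inner product of L 2 x .

```latex
% Assumption: u is C^1 in t with values in L^2_x and \|u(t)\|_{L^2_x} \to 0 as |t| \to \infty.
\[
  \frac{d}{dt}\,\|u(t)\|_{L^2_x}^2
  = 2\operatorname{Re}\,\langle \partial_t u(t), u(t)\rangle ,
\]
\[
  2\operatorname{Re}\int_{\mathbb{R}} \langle \partial_t u(t), u(t)\rangle \,dt
  = \lim_{T\to\infty}\Big(\|u(T)\|_{L^2_x}^2 - \|u(-T)\|_{L^2_x}^2\Big) = 0 .
\]
```

On the Fourier side the same fact reads: the symbol iτ of ∂ t is purely imaginary, so the quadratic form of ∂ t has no real part to exploit.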
In summary, solutions being in L ∞ (L 2 ) is an a priori requirement in most references to develop the theory and that the solutions belong to C(L 2 ) and H 1/2 (L 2 ) is an a posteriori gain. Here, we use the invertibility on a space involving H 1/2 (L 2 ) to construct (unique) solutions that are proved to be C(L 2 ) ∩ L ∞ (L 2 ) by a regularity argument, hence that are usual weak solutions in the end.
Let us next describe the new findings that emerge from these conceptual changes.
Weaker assumptions on the coefficients. An advantage of using the variational space V, as opposed to classical energy spaces, is that it not only embeds into mixed Lebesgue spaces L r (R; L q ) for pairs (r, q) of exponents that we call admissible, but also into mixed Lorentz spaces L r,2 (R; L q,2 ). As a consequence, as far as invertibility is concerned, we can relax the assumptions on the lower order coefficients from L r̃ (R; L q̃ ) to L r̃,∞ (R; L q̃,∞ ) for pairs (r̃, q̃) that we call compatible. Causality can also be proved under a weaker assumption, namely L r̃ (R; L q̃,∞ ). This is explained in Section 2.15.
Adaptability of the approach. The "hidden coercivity" using the space V discovered in [24] has appeared explicitly in several instances for other questions [5-7, 16, 23] concerning local regularity, maximal regularity or boundary value problems. The heart of the matter is Sections 2.2-2.14. Once the framework is set up correctly, numerous otherwise non-trivial extensions come with little extra effort: lower order coefficients in Lorentz spaces (Section 2.15), unbounded leading coefficients in BMO (Section 2.16), higher order systems on full space with integrability varying over the coefficients (Section 3), second order equations and systems with lateral boundary conditions (Section 4). We provide full details for the first two extensions and only sketch the strategies for the latter two, as the article is already quite long.
A self-contained theory with simpler proof techniques and improvement of Lions' embedding theorem. Many results we prove here could seem "well-known" to experts at first glance but we produce all details of the second order case in full space in order to show that the method we develop is self-contained with no recourse to older literature. Some arguments require new techniques of proof, hopefully simpler and without using Steklov averages. In particular, our approach is a consequence of L 2 continuity in time (up to constant) of solutions of the heat operator ∂ t u − ∆u = f or its adjoint when u a priori belongs to L 2 (R; H 1 ) or its homogeneous version and f belongs to sums of mixed Sobolev spaces of L 2 type with negative indices. This seems new. In the end, this yields an improvement of Lions' embedding theorem (Lemma 2.17).
A universal construction without approximation of coefficients. Lastly, our construction of propagators or fundamental solution operators avoids density arguments from operators with smooth coefficients or Galerkin methods. Uniqueness implies that our construction agrees with others under common hypotheses. In this sense it is universal and also constructive. In particular, we obtain a new proof of the Gaussian upper bound of Aronson as a consequence of L 2 off-diagonal estimates for fundamental solution operators, which hold in full generality (see Theorem 2.61). The latter is a new result in its own right.
Further details and precise assumptions are given in the course of the article.

Second order problems on full space
In what follows, we use L 2 spaces in both R n , n ≥ 1, and R n+1 , equipped with Lebesgue measures. We denote by ⟨ψ, ψ̃⟩ the complex inner product in the variable x ∈ R n and by ⟨⟨φ, φ̃⟩⟩ the complex inner product in the variables (t, x) ∈ R × R n =: R n+1 . (We prefer this order for the variables for practical reasons.) For a function f of the two variables, we set f (t) : x → f (t, x) for any t.
We use the notation D and D ′ for the spaces of C ∞ functions with compact support and of distributions, respectively, and S and S ′ for the spaces of Schwartz functions and tempered distributions, respectively. Variables will be indicated at the time of use. Duality brackets extend inner products on L 2 spaces, hence they are sesquilinear.
2.1. Variational space. As said in the introduction, we need a space V̇, which can be thought of as L 2 t Ḣ 1 x ∩ Ḣ 1/2 t L 2 x . However, some care must be taken because we use homogeneous norms.
Remark 2.1. For ϕ ∈ S(R n+1 ), we have, using Plancherel's formula, $\|\nabla\varphi\|_{L^2_tL^2_x}^2 + \|D_t^{1/2}\varphi\|_{L^2_tL^2_x}^2 = c\iint_{\mathbb{R}^{n+1}} (|\xi|^2 + |\tau|)\,|\hat\varphi(\tau,\xi)|^2\,d\xi\,d\tau$, with c > 0 fixed by the normalization of the Fourier transform. Here, D α t is the Fourier multiplier with symbol |τ| α . In fact, the closure of S(R n+1 ) for the norm defined by the left-hand side is V̇ + C, seen as a subspace of S ′ (R n+1 )/C. For a proof see Lemma 3.11 in [7]; this closure is denoted by Ė(R n+1 ) there. Hence V̇ is nothing but the realization of this closure within S ′ (R n+1 ) that eliminates constants. In particular, whenever u ∈ V̇, then ∇u and D 1/2 t u exist as tempered distributions, belong to L 2 t L 2 x , and the identity above holds. We let V̇ ′ be the dual of V̇ with respect to ⟨⟨·, ·⟩⟩. Thus, it is a subspace of S ′ (R n+1 ) and a distribution w ∈ S ′ (R n+1 ) belongs to V̇ ′ if and only if (|τ| + |ξ| 2 ) −1/2 ŵ ∈ L 2 t L 2 x . It follows from Plancherel's formula that w ∈ V̇ ′ if and only if there exists a decomposition ŵ = |ξ| g 1 + |τ| 1/2 g 2 with g 1 , g 2 ∈ L 2 t L 2 x and that ‖w‖ V̇′ ∼ inf(‖g 1 ‖ L 2 t L 2 x + ‖g 2 ‖ L 2 t L 2 x ) taken over all such decompositions. In this sense we write V̇ ′ = L 2 t Ḣ −1 x + Ḣ −1/2 t L 2 x with equivalent norms.

Embeddings.
Recall that the homogeneous Sobolev space Ḣ 1/2 (R) = Ḣ 1/2 t , that is, the closure of S(R) for the norm ‖D 1/2 t ϕ‖ 2 , has the same scaling properties as L ∞ (R). This results in continuous inclusions into mixed norm Lebesgue spaces for V̇ that, except for endpoints, are the same as for L 2 t Ḣ 1 x ∩ L ∞ t L 2 x . We describe them next. We mention that Ḣ 1/2 t has an equivalent (semi-)norm using difference quotients that is often used in the literature on this topic, see for instance [28]. We do not use it here.
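The scaling claim can be verified by a change of variables. The following is a sketch under two assumptions of ours: the Fourier convention $\hat\varphi(\tau) = \int e^{-it\tau}\varphi(t)\,dt$, and the notation $\varphi_\lambda(t) := \varphi(\lambda t)$, λ > 0, which is introduced here only for illustration.

```latex
% Assumptions: \hat\varphi(\tau) = \int e^{-it\tau}\varphi(t)\,dt and \varphi_\lambda(t) := \varphi(\lambda t).
\[
  \widehat{\varphi_\lambda}(\tau) = \lambda^{-1}\,\hat\varphi(\tau/\lambda),
\]
\[
  \|D_t^{1/2}\varphi_\lambda\|_{L^2(\mathbb{R})}^2
  = \frac{1}{2\pi}\int_{\mathbb{R}} |\tau|\,\lambda^{-2}\,|\hat\varphi(\tau/\lambda)|^2\,d\tau
  = \frac{1}{2\pi}\int_{\mathbb{R}} |\sigma|\,|\hat\varphi(\sigma)|^2\,d\sigma
  = \|D_t^{1/2}\varphi\|_{L^2(\mathbb{R})}^2
  \qquad (\tau = \lambda\sigma),
\]
\[
  \|\varphi_\lambda\|_{L^\infty(\mathbb{R})} = \|\varphi\|_{L^\infty(\mathbb{R})},
  \qquad
  \|\varphi_\lambda\|_{L^2(\mathbb{R})} = \lambda^{-1/2}\,\|\varphi\|_{L^2(\mathbb{R})} .
\]
```

Both the Ḣ 1/2 seminorm and the L ∞ norm are thus invariant under dilations, in contrast to the L 2 norm. Matching scaling is necessary for an embedding between homogeneous spaces but not sufficient, which is why the endpoint still fails.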
We need the following mixed spaces. For pairs (r, q) ∈ [1, ∞] 2 of exponents, intervals I ⊂ R, and open sets Ω ⊂ R n , n ≥ 1, we write L r (I; L q (Ω)) for the mixed norm space of measurable functions u : I × Ω → C with $\|u\|_{L^r(I;L^q(\Omega))} := \big(\int_I \big(\int_\Omega |u(t,x)|^q\,dx\big)^{r/q}\,dt\big)^{1/r}$, with the usual changes if either r = ∞ or q = ∞. We set L r t L q x := L r (R; L q (R n )) with dummy variables in indices when t ∈ R and x ∈ R n .
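We introduce the Banach space of tempered distributions $\dot\Delta^{r,q} := \{u \in \mathcal{S}'(\mathbb{R}^{n+1}) : u \in L^r_tL^q_x \text{ and } \nabla u \in L^2_tL^2_x\}$, where the gradient is taken in the sense of distributions, with norm $\|u\|_{\dot\Delta^{r,q}} := \|u\|_{L^r_tL^q_x} + \|\nabla u\|_{L^2_tL^2_x}$. (Footnote 2: From now on, we choose not to indicate the target vector space in the notation.) Duality theory for ∆̇ r,q is easily understood by identifying ∆̇ r,q with a closed subspace of L r t L q x × L 2 t L 2 x through the map u → (u, ∇u). As usual, the Hölder conjugate of an exponent p is denoted by p ′ . The dual of ∆̇ r,q can be identified with L 2 t Ḣ −1 x + L r ′ t L q ′ x , the space of elements div F + g, with vector field F ∈ L 2 t L 2 x and scalar function g ∈ L r ′ t L q ′ x , equipped with the usual infimum norm.
Lemma 2.3. If (r, q) is an admissible pair, then V̇ ↪ ∆̇ r,q with continuous inclusion. In particular, elements of V̇ are locally integrable functions. By duality, this yields the continuous inclusion (∆̇ r,q ) ′ ↪ V̇ ′ .
Proof. By density it suffices to work with ϕ ∈ S(R n+1 ) and show the first inclusion. The proof relies on two ingredients. First, for θ ∈ [0, 1], using the convexity inequality $|\tau|^{\theta}|\xi|^{2(1-\theta)} \le \theta|\tau| + (1-\theta)|\xi|^2$, Fourier transform in the (t, x)-variable shows that $\|D_t^{\theta/2}(-\Delta)^{(1-\theta)/2}\varphi\|_{L^2_tL^2_x}^2 \le \|\varphi\|_{\dot V}^2$. Next, Sobolev embeddings in R n and R give us $\|\varphi\|_{L^r_tL^q_x} \lesssim \|(-\Delta)^{(1-\theta)/2}\varphi\|_{L^r_tL^2_x} \lesssim \|D_t^{\theta/2}(-\Delta)^{(1-\theta)/2}\varphi\|_{L^2_tL^2_x}$, where the first inequality holds exactly when 1/2 − (1−θ)/n = 1/q and 2 ≤ q < ∞, and the second one exactly when θ/2 − 1/2 = −1/r and 2 ≤ r < ∞. Note that the second embedding is the vector-valued extension of the scalar embedding on R.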
We can solve for θ ∈ [0, 1) if 1/r + n/(2q) = n/4 with 2 ≤ r < ∞ and 2 ≤ q < ∞, which is the definition of an admissible pair. Since the other part ‖∇ϕ‖ L 2 t L 2 x of the ∆̇ r,q norm is controlled by ‖ϕ‖ V̇ , we are done.
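The elimination of θ in the last step can be spelled out as follows. This is only a worked version of the arithmetic in the proof; the two sample pairs at the end are illustrations chosen here, not taken from the text.

```latex
\begin{align*}
  \tfrac12 - \tfrac{1-\theta}{n} = \tfrac1q
    &\iff 1-\theta = n\big(\tfrac12 - \tfrac1q\big), \\
  \tfrac{\theta}{2} - \tfrac12 = -\tfrac1r
    &\iff 1-\theta = \tfrac2r, \\
  \text{hence}\quad \tfrac2r = \tfrac n2 - \tfrac nq
    &\iff \tfrac1r + \tfrac{n}{2q} = \tfrac n4 .
\end{align*}
% Illustrations (our choice of examples):
%   \theta = 0:  (r,q) = (2, 2n/(n-2)) for n >= 3, since 1/2 + (n-2)/4 = n/4.
%   r = q:       (r,q) = (2 + 4/n, 2 + 4/n) for every n >= 1, with \theta = 2/(n+2).
```

Since θ = 1 − 2/r, the range 2 ≤ r < ∞ corresponds exactly to θ ∈ [0, 1), which is why the value θ = 1 (that is, r = ∞) is excluded.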
Remark 2.4. As Ḣ 1/2 (R) does not embed into L ∞ (R), the result fails for (r, q) = (∞, 2) and we excluded this pair from our admissible range although it satisfies the relation 1/r + n/(2q) = n/4. Similarly, the embedding Ḣ 1−θ x ⊂ L q x never holds when q = ∞ and requires 1 − θ < n/2.
2.3. Variational approach. We study parabolic equations on R n+1 = R × R n , namely ∂ t u + Lu = f and its adjoint equation where L and its adjoint are second order elliptic operators in divergence form perturbed with unbounded lower order terms. The equalities are taken in the sense of distributions, provided Lu and L * ũ are well-defined. To be precise, we consider where the coefficients A, a, b, a depend on (t, x). Sometimes, we consider L as an operator acting on functions of the x-variable by freezing t: context will make things clear. The leading coefficient A = (a ij ) is an n × n matrix of bounded, possibly complex-valued, measurable functions on R n+1 . Thus, the sesquilinear form corresponding to the leading part in (2.3) satisfies The lower order coefficients a, b are n-vectors of complex-valued, measurable functions on R n+1 , and a is a complex-valued, measurable function on R n+1 . The formal complex adjoint of L corresponds to We introduce the following quantity.
P r̃ 1 ,q̃ 1 := |a| 2 1/2 and define (r 1 , q 1 ) ∈ [2, ∞] 2 through the relations
Remark 2.6. In principle, we could let (r̃ 1 , q̃ 1 ) be different for each entry of a and b, and a. At this point we do not go into this in order to simplify the exposition of ideas but we shall come back to the general version later on in Section 3.
Next, we introduce the sesquilinear pairings corresponding to the lower order terms in (2.3). We set βu := − div(a u) + b · ∇u + au, so that for appropriate u, v and for (almost) each t ∈ R, we write The relation (2.7) guarantees that the formal pairings above are absolutely convergent Lebesgue integrals, as becomes apparent from the next lemma.
Lemma 2.7. Let (r̃ 1 , q̃ 1 ) ∈ [1, ∞] 2 and let (r 1 , q 1 ) be given by (2.7). Suppose that Proof. Use Hölder inequalities in the x and then t-variables, taking into account the relations 1 2r 1 It is of course natural to relate the choice of pairs to Sobolev embeddings.
This terminology is motivated by the following principle.
Lemma 2.9. A pair (r̃ 1 , q̃ 1 ) is compatible for lower order coefficients if and only if (r 1 , q 1 ) is admissible.
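The equivalence can be checked by a short computation. The sketch below rests on an assumption: the display (2.7) is not legible in this version of the text, and we take it to be the Hölder relation written in the first line, which is consistent with the range (r 1 , q 1 ) ∈ [2, ∞] 2 and with the cases discussed in Remark 2.10. Under that assumption, compatibility becomes the scaling condition in the last line.

```latex
% ASSUMPTION (not legible in the source): relation (2.7) reads
\[
  \frac{1}{r_1} = \frac12 - \frac{1}{2\tilde r_1},
  \qquad
  \frac{1}{q_1} = \frac12 - \frac{1}{2\tilde q_1}.
\]
% Then
\[
  \frac{1}{r_1} + \frac{n}{2q_1}
  = \frac12 + \frac n4 - \frac12\Big(\frac{1}{\tilde r_1} + \frac{n}{2\tilde q_1}\Big),
\]
% so that
\[
  \frac{1}{r_1} + \frac{n}{2q_1} = \frac n4
  \iff
  \frac{1}{\tilde r_1} + \frac{n}{2\tilde q_1} = 1 ,
\]
% while 2 <= r_1 < \infty iff 1 < \tilde r_1 <= \infty, and 2 <= q_1 < \infty iff 1 < \tilde q_1 <= \infty.
```

For instance, the pair r̃ 1 = ∞, q̃ 1 = n/2 with n ≥ 3 gives (r 1 , q 1 ) = (2, 2n/(n−2)), whereas r̃ 1 = 1 would force r 1 = ∞ and is therefore ruled out.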
Proof. We can see that 1
Remark 2.10. The compatibility and admissibility conditions already appear in Chapter 3 of [28]. As in there (see p. 137), we include the case r̃ 1 = ∞, q̃ 1 = n/2, n ≥ 3, but not the case r̃ 1 = 1 as the variational space is not contained in L ∞ t L 2 x .
We now introduce the variational setup. We use the Hilbert space V̇ and its dual V̇ ′ for ⟨⟨·, ·⟩⟩. Since S is dense in V̇, this pairing is consistent with the sesquilinear pairing of tempered distributions and Schwartz functions. We have seen in Section 2.2 that V̇ ⊂ ∆̇ r,q and L r ′ t L q ′ x ⊂ V̇ ′ if (r, q) is admissible. For u ∈ V̇ and v ∈ L r ′ t L q ′ x the pairing ⟨⟨u, v⟩⟩ is therefore the Lebesgue integral $\iint_{\mathbb{R}^{n+1}} u(t,x)\,\overline{v(t,x)}\,dx\,dt$. This observation will be tacitly used throughout the section.
Proposition 2.11. Assume that P r̃ 1 ,q̃ 1 < ∞ for some pair (r̃ 1 , q̃ 1 ) compatible for lower order coefficients. Define the operator for u, v ∈ V̇. In the same fashion, define the dual operator Then H, H * : V̇ → V̇ ′ are well-defined, bounded and adjoint to one another.
It follows that ∂ t , seen as an operator acting on tempered distributions in the two variables (t, x), maps V̇ into V̇ ′ .
Next, for the admissible conjugate pair (r 1 , q 1 ) and u, v ∈ ∆̇ r 1 ,q 1 , the pairing ⟨⟨Lu, v⟩⟩ is defined as so that by Lemma 2.7, By the embedding V̇ ↪ ∆̇ r 1 ,q 1 for the admissible pair (r 1 , q 1 ), we conclude that ⟨⟨Lu, v⟩⟩ is defined on V̇ × V̇ and that with C = C(n, r 1 , q 1 ) we have Finally, for u, v ∈ V̇, it follows using Fourier transform that and by inspection that Hence, H * is the adjoint of H and its boundedness follows.
This remark suggests the following notion of solution. We try to be very explicit in this regard in order not to confuse the reader with the versatile terminology of weak solution, and also because we work on R n+1 .
Definition 2.13. We say that u is a ∆̇ r 1 ,q 1 -solution of ∂ t u + Lu = f in R n+1 if u ∈ ∆̇ r 1 ,q 1 and the equation is satisfied in the sense of distributions on R n+1 , that is for all φ̃ ∈ D(R n+1 ), There is no regularity in time attached to this definition. For appropriate right-hand sides, we shall show that in fact a ∆̇ r 1 ,q 1 -solution is (up to a constant) continuous and bounded in time with values in L 2 x , so that in the end we will be able to identify it with a weak solution when considering the Cauchy problem, see Section 2.11.
2.4. Main regularity estimates. Our approach builds on the results of the next two sections. Note that the assumptions have a homogeneous flavor, which is necessary when the time interval is infinite.
We begin with results providing existence and uniqueness of specific solutions for the heat operator ∂ t −∆ in R n+1 . (We could also use ∂ t +∆ as the choice of forward or backward time is irrelevant when t ∈ R.) This relies on Fourier transform arguments with tempered distributions.
. This implies that u j , ∂ t φ + ∆φ = 0 for all φ ∈ S 0 (R n+1 ), the space of Schwartz functions whose Fourier transforms vanish to infinite order at 0. Now, ∂ t + ∆, whose Fourier symbol is the polynomial iτ − |ξ| 2 , is an automorphism of S 0 (R n+1 ). We obtain u j , φ = 0 for all φ ∈ S 0 (R n+1 ), so that u j is a polynomial. As u j ∈ L 2 t L 2 x , this polynomial vanishes. We have shown that ∇u = 0, hence ∆u = 0, and ∂ t u = 0 follows from the equation. We conclude that u is constant.
Proposition 2.15. Let (r, q) be an admissible pair and w ∈ L 2 Uniqueness is provided by Lemma 2.14. The main work is to produce this solution. Using R as the time interval allows us to use embeddings for homogeneous Sobolev spaces on R. For θ ∈ [0, 1), we introduce the space Ḣ −θ/2 t Ḣ θ−1 x . We recall that D α t is the Fourier multiplier with symbol |τ| α . For θ = 0, we find L 2 t Ḣ −1 x . Elements in Ḣ −θ/2 t Ḣ θ−1 x are tempered distributions with locally square integrable Fourier transforms.
x . By density of G and the Riesz representation theorem there exists a unique w ∈ L 2 t L 2 such that v, M −1 θ g = w, g for all g ∈ G, and we can see that v = M θ w in S ′ (R n+1 ). This concludes the proof.
Armed with this embedding, it suffices to prove the following stronger statement in purely L 2 -based mixed Sobolev spaces.
Proof. Guided by the Duhamel formula, the function v(t) is formally defined by $v(t) = \int_{-\infty}^{t} e^{(t-s)\Delta} w(s)\,ds$ (2.13) and we need to make sense of this integral. To this end, we introduce a smaller dense subspace of Ḣ −θ/2 t Ḣ θ−1 x . We let M θ := D θ/2 t (−∆) (1−θ)/2 as before and set G 0 := {M θ g : g ∈ S 00 (R n+1 )}, where S 00 (R n+1 ) is the space of Schwartz functions whose Fourier transforms are supported away from τ = 0 and ξ = 0. Note that M θ preserves S 00 (R n+1 ) and that G 0 is a dense subspace of Ḣ −θ/2 t Ḣ θ−1 x . Call τ t the translation by t ∈ R: τ t g(s) := g(s + t), not indicating the x-variable as usual. It commutes with M θ .
Step 1. We begin with a preliminary estimate. Let ϕ ∈ L 2 x and define where 1 A denotes the indicator function of A. A classical calculation shows that ĥ(τ, ξ) = (−iτ + |ξ| 2 ) −1 ϕ̂(ξ), where ϕ̂ is the Fourier transform of ϕ on R n . Using Plancherel's formula in R n and R, Fubini's theorem and the change of variables τ = σ|ξ| 2 when ξ ≠ 0, we find 1). It follows that for any Now assume g ∈ S 00 (R n+1 ), set w := M θ g and fix t ∈ R. Then w ∈ S 00 (R n+1 ), so in particular w ∈ L 1 t L 2 x and the integral in (2.13) converges as a Bochner integral in L 2 x thanks to the contractivity of the heat semigroup on L 2 x . To get the appropriate estimate on v(t), we calculate and (2.14) yields In total, we have produced a bounded map By density, it has a bounded extension M : Ḣ x . Due to (2.15) this extension is defined weakly by x .
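The change of variables in the preliminary estimate can be made explicit. The following is a sketch, assuming only the stated formula for ĥ and leaving aside the normalizing constant of Plancherel's formula; the closed form of the σ-integral is a standard Beta-type integral.

```latex
% Uses only \hat h(\tau,\xi) = (-i\tau + |\xi|^2)^{-1}\hat\varphi(\xi); Plancherel constants are omitted.
\begin{align*}
  \iint |\tau|^{\theta}\,|\xi|^{2(1-\theta)}\,|\hat h(\tau,\xi)|^2 \,d\tau\,d\xi
  &= \int_{\mathbb{R}^n} \bigg( \int_{\mathbb{R}}
       \frac{|\tau|^{\theta}\,|\xi|^{2(1-\theta)}}{\tau^2 + |\xi|^4}\,d\tau \bigg)
       |\hat\varphi(\xi)|^2 \,d\xi \\
  &= \bigg( \int_{\mathbb{R}} \frac{|\sigma|^{\theta}}{1+\sigma^2}\,d\sigma \bigg)
       \int_{\mathbb{R}^n} |\hat\varphi(\xi)|^2\,d\xi
     \qquad (\tau = \sigma|\xi|^2,\ \xi \neq 0), \\
  \int_{\mathbb{R}} \frac{|\sigma|^{\theta}}{1+\sigma^2}\,d\sigma
  &= \frac{\pi}{\cos(\pi\theta/2)} < \infty
     \quad\text{for } 0 \le \theta < 1 .
\end{align*}
```

All powers of |ξ| cancel, which reflects the parabolic scaling of the estimate. The σ-integral diverges at θ = 1, in line with the restriction θ ∈ [0, 1); at θ = 0 it equals π, which after division by the factor 2π of Plancherel's formula in τ gives 1/2, in agreement with the value c(0) = 2 −1/2 quoted later in the text.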
Step 2: v ∈ C 0 (L 2 x ). By density of In that case we have seen that w ∈ L 1 t L 2 x . For the continuity in time, write (2.13) as $v(t) = \int_{-\infty}^{0} e^{-s\Delta} w(s+t)\,ds$, so that continuity follows right away from the continuity of time translations in L 1 t L 2 x and contractivity of the heat semigroup on L 2 x . For the limits v(t) → 0 as |t| → ∞, we use dominated convergence in (2.13) as follows. By contractivity of the heat semigroup, the integrand is bounded in L 2 x by ‖w(s)‖ L 2 x and this function is integrable with respect to s. The limit as t → −∞ follows immediately, whereas for t → ∞ we additionally use that the heat semigroup tends to 0 strongly in L 2 x .
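The continuity argument amounts to a one-line estimate. A sketch, for w ∈ L 1 t L 2 x and with the translation notation τ h w(s) = w(s + h) from the beginning of the proof:

```latex
% e^{-s\Delta} with s <= 0 is the heat semigroup at time |s|, hence a contraction on L^2_x.
\begin{align*}
  \|v(t+h) - v(t)\|_{L^2_x}
  &= \Big\| \int_{-\infty}^{0} e^{-s\Delta}\big( w(s+t+h) - w(s+t) \big)\,ds \Big\|_{L^2_x} \\
  &\le \int_{-\infty}^{0} \| w(s+t+h) - w(s+t) \|_{L^2_x}\,ds
   \;\le\; \| \tau_h w - w \|_{L^1_t L^2_x}
   \xrightarrow[h\to 0]{} 0 .
\end{align*}
```

The bound does not depend on t, so v is in fact uniformly continuous with values in L 2 x .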
Step 3: v = Mw is a solution of ∂ t v − ∆v = w in S ′ (R n+1 ). Assume that w = M θ g ∈ G 0 with g ∈ S 00 (R n+1 ). Since w, ∆w ∈ L 1 t L 2 x and t → w(t) is continuous as an L 2 x -valued function, we obtain v ′ (t) = w(t) + ∆v(t) in L 2 x for all t ∈ R. From this we conclude that ⟨⟨v, −∂ t φ − ∆φ⟩⟩ = ⟨⟨w, φ⟩⟩ for all φ ∈ S(R n+1 ). It remains to argue by density, letting g approximate any element in L 2 t L 2 x in this equality for fixed φ.
Step 4. Again, it is enough to proceed by density after proving the claim for w = M θ g when g ∈ S 00 (R n+1 ) with a constant that does not depend on this assumption. By Fourier transform from the equation, (iτ + |ξ| 2 ) v̂ = |τ| θ/2 |ξ| 1−θ ĝ as tempered distributions so that v̂ = (iτ + |ξ| 2 ) −1 |τ| θ/2 |ξ| 1−θ ĝ. Here, the right-hand side is again a tempered distribution of the
We observe that there is a substitute result for the non-admissible pair (∞, 2) that does not involve the variational space V̇.
Proof. Uniqueness is again provided by Lemma 2.14. The case where w = − div F ∈ L 2 t Ḣ −1 x is given by the solution v in Lemma 2.17 with θ = 0, and checking the constants we have v L ∞ t L 2 x . Assuming now that w ∈ L 1 t L 2 x , Steps 1 and 2 of the proof of Lemma 2.17 show that v given by (2.13) x . To show that ∇v ∈ L 2 t L 2 x , we take a vector field Φ ∈ S(R n+1 ) and note that ṽ given by $\tilde v(s) = \int_s^{\infty} e^{(t-s)\Delta}(\operatorname{div}\Phi)(t)\,dt$, s ∈ R, is the solution of Lemma 2.17 in the case θ = 0 for the backward equation x and c(0) = 2 −1/2 . As ⟨⟨v, div Φ⟩⟩ = ⟨⟨w, ṽ⟩⟩, we deduce that ∇v L 2 t L 2
Let us state consequences of these two propositions. The first one shows a set of lower bounds for the heat operator.
provided that (r, q) is admissible or (r, q) = (∞, 2) and that the right hand side is finite.
Proof. Let v be the solution of ∂ t v − ∆v = ∂ t u − ∆u given by Propositions 2.15 or 2.18. By uniqueness, v − u is a constant c so that u − c ∈ C 0 (L 2 x ). Optimizing over all possible decompositions of ∂ t u − ∆u in L 2 t Ḣ −1 x + L r ′ t L q ′ x yields the estimate.
The second one will be our fundamental regularity estimate in the following.
We apply Corollary 2.19 to get the estimate. That u − c ∈ V̇ when (r, q) is admissible is already in Proposition 2.15.
(i) The case θ = 0 of Lemma 2.17 on the half-space (0, ∞) × R n appears in [9]. It can be seen as the homogeneous version of Lions' embedding theorem, which would apply had we assumed in addition u ∈ L 2 t L 2 x . An important point is that the interval is infinite, otherwise the homogeneous version fails. Here, the statement with the time interval being the real line simplifies some matters of the proof in [9] when θ = 0 and we extend it to θ < 1. However, the statement does not hold when θ = 1. When θ > 0, the situation is not as symmetric as in Lions' embedding theorem since the hypothesis cannot be put into the form u ∈ E, ∂ t u ∈ E ′ , where E is a Banach space and E ′ its dual for the L 2 t L 2 x duality. We relax the assumption on u since we do not want to impose more than u ∈ L 2 t Ḣ 1 x for the spatial regularity. In any case, the conditions u ∈ V̇ and ∂ t u ∈ V̇ ′ do not imply that u is continuous into L 2 x . Indeed, we saw that the time derivative maps V̇ into V̇ ′ , but V̇ is not contained in C(L 2 x ).
(ii) In the hypotheses of Corollaries 2.19 and 2.20, one can take a finite sum of spaces of the same types with different pairs of exponents. This is a consequence of our constructive proof: u − c is the sum of solutions to the heat equation with right-hand sides equal to the various different components.
2.5. Integral equalities. Integral identities are well-known in our context when the time interval is finite [28]. When the time interval is infinite, they require additional care in both the assumptions and the proofs, and Lemma 2.17 becomes the essential instrument. Therefore, the next lemma is not a straightforward generalization of known results.
x and (r, q) an admissible pair or (r, q) = (∞, 2). Then, up to a constant, u ∈ C 0 (L 2 x ), t → u(t) 2 L 2 x is absolutely continuous on R and for all σ < τ , Proof. From Corollary 2.20 we know that u − c ∈ C 0 (L 2 x ) ∩ L r t L q x for some constant c. Adjusting the constant to 0, the integrand of (2.17) is well-defined and integrable on R. It remains to prove the integral identity.
We begin by assuming that (r, q) is an admissible pair. Let $\varphi_\varepsilon = \varepsilon^{-1}\varphi(\cdot/\varepsilon)$, ε > 0, be a mollifying sequence on R with ϕ smooth, compactly supported in [−1, 1] and $\int_{\mathbb{R}}\varphi = 1$, and let u ε := ϕ ε ⋆ u, where convolution is in the t-variable. Clearly, t → u ε (t) is of class C 1 as an L 2 x -valued function. Thus, Since u ∈ C 0 (L 2 x ), the left-hand side converges to the one of (2.17) as ε → 0. Next, to justify the convergence of the integral in (2.18) to the one in (2.17), a computation yields and that the pairings above are of this type as (r, q) is admissible. In each factor the convolution with ϕ ε is uniformly bounded and converges strongly to the identity operator. The convergence follows.
If (r, q) = (∞, 2), then we repeat the above argument with (X, The lemma above can be localized to a half-infinite time interval as follows.
and (r, q) an admissible pair or (r, q) = (∞, 2). Then, up to a constant, u ∈ C 0 (I; L 2 x ) and Proof. It is enough to consider the case I = (a, ∞), a ∈ R. The strategy is to extend u by even reflection at t = a and F and g by odd reflection at t = a so that the assumptions of Lemma 2.22 apply to these extensions. The conclusion follows by restricting back to I. However, some care needs to be taken since we do not assume a priori that u is a locally integrable function and we provide details for the convenience of the reader. Take a = 0 for simplicity. We construct a distribution v ∈ D ′ (R n+1 ) (the even extension of u) verifying the hypothesis of Lemma 2.22 with the odd extensions of F and g, and whose restriction to (0, ∞) × R n is u. For any ψ ∈ D(R n ) we define the distribution u(t), ψ on (0, ∞) by where f ⊗ψ is the tensor product. The equation ∂ t u = − div F +g implies that for any . As the right-hand side is locally integrable from the assumptions on F and g, this shows that u(t), ψ can be identified with a continuous function on (0, ∞) that extends continuously to 0. We continue to use the suggestive notation t → u(t), ψ . For ψ ∈ D(R n ) and where F o and g o are the odd extensions of F and g, respectively. Thus, where (∇u) e is the even extension of ∇u, so that ∇v = (∇u) e in D ′ (R n+1 ). We have proved our claim.
The conclusion of Lemma 2.22 can be polarized, given two functions u,ũ that verify the assumptions (with possibly different pairs (r, q)). Due to the extendability that we have seen in the previous proof, the same also works with open, half-infinite intervals and the conclusion reads as follows.
Corollary 2.25. If u, ũ satisfy the same assumptions as in Corollary 2.24 on two open half-infinite intervals I and J, then after eliminating a constant from each of u and ũ, the function t → ⟨u(t), ũ(t)⟩ is absolutely continuous on I ∩ J and whenever σ, τ ∈ I ∩ J, σ < τ .
2.6. Existence and uniqueness results. We come back to the study of parabolic equations. Up to and including Section 2.9, we fix a single compatible pair (r̃ 1 , q̃ 1 ) for lower order coefficients and its corresponding admissible conjugate pair (r 1 , q 1 ). Recall that Proposition 2.11 yields boundedness of an operator H : V̇ → V̇ ′ , which acts as ∂ t + L. We develop the existence and uniqueness theory under the hypothesis (H 0 ) that H : V̇ → V̇ ′ is invertible. This means that the constants in our estimates will also depend on Λ = ‖A‖ ∞ , P r̃ 1 ,q̃ 1 and the norm of the inverse of H. We do not make this dependence explicit until Section 2.9, where we discuss a sufficient condition for invertibility. So, we write for instance C(n, q, r) to mean dependence on n, q, r and possibly the aforementioned quantities.
In the following, we only state results involving the operator ∂ t + L. All results apply mutatis mutandis to −∂ t + L * since both operators are indistinguishable from the assumptions at this stage. We are going to prove uniqueness and existence in the class of ∆̇ r 1 ,q 1 -solutions that we introduced at the end of Section 2.3.
x with bound controlled by u ∆r1,q1 according to Remark 2.12 x . Remark 2.21 (ii) yields the desired conclusion up to a constant for u and the constant must be 0 as u ∈∆ r 1 ,q 1 .
Proof. We may apply Corollary 2.20, so that u − c ∈ V̇ ∩ C 0 (L 2 x ). The constant c vanishes as u ∈ C 0 (L 2 x ) from the above proposition. Thus, we have u ∈ V̇ with Hu = 0 and u = 0 follows by (H 0 ).
Proof. As − div F + g ∈ V̇ ′ by Lemma 2.3, we see that u is well-defined in V̇ and belongs to ∆̇ r 1 ,q 1 thanks to the same lemma. It is thus a ∆̇ r 1 ,q 1 -solution of (2.21). By Theorem 2.27 it is unique and by Proposition 2.26 it belongs to C 0 (L 2 x ). The estimate (2.22) follows from that.
The previous theorem does not apply when g ∈ L 1 t L 2 x since (r, q) = (∞, 2) is not admissible and H −1 g does not make sense. Yet, we can construct a∆ r 1 ,q 1 -solution that falls outside of the variationalV −V ′ setting.
Moreover, this solution belongs to L r t L q x for any admissible pair (r, q) and to C 0 (L 2 x ) with x . Proof. The uniqueness follows from Theorem 2.27. For the existence, let T : is bounded for any admissible pair (r, q), so that (2.27) T g ∆r,q ≤ C(n, q, r) g L 1 We next show that u := T g is a∆ r 1 ,q 1 -solution of (2.23). Observe that u agrees with x , hence for g in a dense subspace, for example D(R n+1 ). For those functions g, we have that u is a∆ r 1 ,q 1 -solution of (2.23). Secondly, let g ∈ L 1 t L 2 x and (g k ) be a sequence in D(R n+1 ) that converges to g in L 1 t L 2 x . Testing the equation for u k := T g k against a functionφ ∈ D(R n+1 ), we have The estimate (2.27) shows that u k converges to u in the norm of∆ r 1 ,q 1 , and this allows us to pass to the limit on the left-hand side, using Remark 2.12 for the second term. This proves (2.23) and (2.24).
Finally, $u \in C_0(L^2_x)$ follows from Proposition 2.26, and we obtain the estimate (2.25) from the estimate in that proposition if we plug in (2.24) with $(r, q) = (r_1, q_1)$.
The next result is central for our constructive approach to fundamental solutions and Green operators.
where δ s is the Dirac mass at t = s and ⊗ denotes the tensor product. Moreover, this solution belongs to∆ r,q for any admissible pair (r, q) with x has an absolutely continuous extension to each of [s, ∞) and (−∞, s] with estimate where C(n, q 1 , r 1 ) does not depend on s and ψ. Eventually, on each of the intervals (s, ∞) and (−∞, s), the solution u is the restriction of a function inV.
Proof. The uniqueness follows from Theorem 2.27. For the existence, given s ∈ R, let T s : x ) is bounded and it follows that (2.33) T s ψ ∆r,q ≤ C(n, q, r) ψ L 2 x . In particular, u := T s ψ satisfies (2.29). It is no loss of generality to assume s = 0 as the general argument is the same (or can be deduced from s = 0 by a translation in time, which preserves all assumptions with uniform constants).
Step 1: u is a∆ r 1 ,q 1 -solution of (2.28). Let (ϕ ε ) be a mollifying sequence as in the proof of Lemma 2.22. We apply Theorem 2.29 to g ε := ϕ ε ⊗ ψ ∈ L 1 t L 2 x . We test the equation for u ε := T g ε against a functionφ ∈ D(R n+1 ) and pass to the limit in this equation as ε → 0. First and similarly Eventually, we have seen that u ε converges to u := T 0 ψ in D ′ (R n+1 ). The estimate (2.27) shows that (u ε ) is uniformly bounded in∆ r 1 ,q 1 , which is a dual space (it is even reflexive). Hence, it converges also weakly-star in∆ r 1 ,q 1 to u and so Remark 2.12 shows that This proves (2.28) and (2.29).
Step 3: Proof of (2.30). Letφ ∈ D(R n+1 ) and θ : R → R even, smooth, 0 on [0, 1] Expanding the first term, we obtain We now pass to the limit as ε → 0. Using again the duality between∆ r 1 ,q 1 and its pre-dual for the Hence, the left-hand side above converges to ψ,φ(0) . The right-hand side rewrites after a change of variable as which, by dominated convergence and the existence of limits from the left and the right at 0 from Corollary 2.24, tends to

This proves (2.30).
Step 4: On the left and right of 0, u is a restriction of an element inV. It remains to see this last point. Consider w the even extension across t = 0 of the restriction of u to (0, ∞) × R n . Using the same functionsφ and θ as above, we have Since w and θ ε are even in t, the only contribution ofφ is through its odd part The first term on the right-hand side we obtain a bound on the order of ε and this term tends to 0.
In conclusion, we proved x , we obtain w ∈V from Corollary 2.20 and since w| (0,∞) = u, we are done.
The same argument applies to the restriction of $u$ to $(-\infty, 0) \times \mathbb R^n$.
2.7. Green operators. Theorem 2.30 can be used to construct Green operators for the parabolic equation and its adjoint. We borrow this terminology from [29].
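Definition 2.31. Assume (H$_0$) and let $s, t \in \mathbb R$ and $\psi, \tilde\psi \in L^2_x$. (i) For $t \neq s$, define $G(t, s)\psi$ as the value at time $t$ in $L^2_x$ of the $\dot\Delta^{r_1,q_1}$-solution $u$ of $\partial_t u + Lu = \delta_s \otimes \psi$ from Theorem 2.30. (ii) For $s \neq t$, define $\tilde G(s, t)\tilde\psi$ as the value at time $s$ in $L^2_x$ of the corresponding solution $\tilde u$ of $-\partial_s \tilde u + L^* \tilde u = \delta_t \otimes \tilde\psi$. The operators $G(t, s)$ and $\tilde G(s, t)$ are called the Green operators for the parabolic operator $\partial_t + L$ and the (adjoint) parabolic operator $-\partial_t + L^*$, respectively.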
We recall that the orbit G(·, s)ψ was defined as T s ψ in (2.32), which reads as the "double duality formula" Indeed, G(·, s)ψ and H * −1φ are solutions of parabolic problems for adjoint operators. Rephrasing parts of Theorem 2.30 in terms of Green operators yields the following result.
x ) and the following limits exist in $L^2_x$: x . The function $s \mapsto \tilde G(s, t)\tilde\psi$ is in $C_0(\mathbb R \setminus \{t\}; L^2_x)$ and the following limits exist in $L^2_x$: The operators $G(t, s)$, $\tilde G(s, t)$ are uniformly bounded on $L^2_x$ with respect to $(s, t)$ with $t \neq s$.
Next, we list a number of properties involving the Green operators and their limits.
(v) For s, r, t distinct reals with r between s and t, G(t, s) = G(t, r)G(r, s).

Remark 2.34.
(i) The reader familiar with parabolic problems expects causality, that is, $G(t, s) = 0$ when $t < s$, and recovery of initial data, that is, $G(t, s)\psi \to \psi$ when $t \to s^+$, meaning that $\Pi^+_s = \mathrm{Id}$. At this stage, however, there is as yet no reason to expect these properties since all assumptions apply equally to the equation and its adjoint, which goes backward in time. In view of this, it is remarkable that we can prove the adjointness property (ii) and the Chapman-Kolmogorov formula (v) under the mere assumption (H$_0$).
(ii) Properties (iii) and (iv) mean that the Green operators $G(t, s)$ propagate the ranges of the limit operators $\Pi^+_t$ when $t > s$ and $\Pi^-_t$ when $t < s$, even though we do not have further information on the range and kernel of these operators at this point.
Proof. We proceed as follows.
Proof of (i). We may assume $s = 0$ as usual. The property $\Pi^+_0 - \Pi^-_0 = \mathrm{Id}$ is a rephrasing of the jump relation (2.30) and similarly $\tilde\Pi^+_0 - \tilde\Pi^-_0 = -\mathrm{Id}$, the negative sign coming from the fact that $-\partial_t + L^*$ is backward in time. Fix $\psi, \tilde\psi \in L^2_x$. We apply the integral identity of Corollary 2.25 to $u := G(\cdot, 0)\psi$ and $\tilde u := \tilde G(\cdot, 0)\tilde\psi$ in the intervals $(0, \infty)$ and $(-\infty, 0)$, knowing that the integrand vanishes almost everywhere in each interval and that $\langle u(t), \tilde u(t)\rangle \to 0$ in the limit as $t \to \pm\infty$. This gives us x . This time we can apply the integral identity of Corollary 2.25 to $u := G(\cdot, s)\psi$ and $\tilde u := \tilde G(\cdot, t)\tilde\psi$ in the intervals $(-\infty, s)$, $(s, t)$ and $(t, \infty)$, knowing that the integrand vanishes almost everywhere in each interval and that $\langle u(\tau), \tilde u(\tau)\rangle \to 0$ in the limit as $\tau \to \pm\infty$. We obtain Subtracting the first and third equalities from the second one and using (i), we obtain which proves (ii) in this case. From the adjoint relations in (i) we also see that $\Pi^-_t G(t, s)\psi = 0$ and $G(t, s)\Pi^-_s \psi = 0$. Again by (i), we obtain the two remaining relations in (iii).
Proof of (iv) and (ii) for s > t. The argument is completely symmetric to the previous case.
Proof of (v). Let us first treat the case $s < r < t$. Fix $\psi \in L^2_x$. Let $u := G(\cdot, s)\psi$, $v := G(\cdot, r)(u(r))$ and define Since the gluing procedure preserves $\dot\Delta^{r_1,q_1}$, the equality $w = u$ follows by uniqueness provided we can show that $w$ is a $\dot\Delta^{r_1,q_1}$-solution of By translation we can assume $r = 0$ in order to simplify the exposition. Let $\tilde\varphi \in \mathcal D(\mathbb R^{n+1})$. The argument repeats the proof of the jump relation in Theorem 2.30.
Using the same cut-off functions θ ε supported outside of [−ε, ε], we have, as before, For the first two terms we see thatφ θ ε is the sum of one test functionφ + supported in (0, ∞) × R n and another oneφ − supported in (−∞, 0) × R n . With the first one we use the equation for v and with the second the equation for u, so that we find where the second step works provided ε is small enough in order to guarantee that θ ε (s) = 1. For the third term we can argue as in Step 3 of the proof of Theorem 2.30 to find lim

By definition and (iii), we have
as desired. The proof when t < r < s is similar, using (iv) instead of (iii).
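As a sanity check of the Chapman-Kolmogorov formula (v), the following sketch works out the model case of the heat operator $\partial_t - \Delta$. It is an illustration only and not part of the argument above: it assumes, as in the classical theory and in line with the causality results of Section 2.9, that the Green operator is the heat semigroup for $t > s$ and vanishes for $t < s$. The multiplier $m(t, s, \xi)$ is notation introduced here.

```latex
% Model case (assumption): L = -\Delta. On the Fourier side in x, the Green
% operator G(t,s) is taken to act as multiplication by m(t,s,\xi).
\[
  m(t,s,\xi) =
  \begin{cases}
    e^{-(t-s)|\xi|^{2}}, & t > s,\\
    0, & t < s.
  \end{cases}
\]
% Case s < r < t: the exponents add up.
\[
  m(t,r,\xi)\, m(r,s,\xi)
  = e^{-(t-r)|\xi|^{2}}\, e^{-(r-s)|\xi|^{2}}
  = e^{-(t-s)|\xi|^{2}}
  = m(t,s,\xi).
\]
% Case t < r < s: both factors on the left vanish, and so does the right-hand side.
\[
  m(t,r,\xi)\, m(r,s,\xi) = 0 = m(t,s,\xi).
\]
% In physical variables, for t > s this is the convolution with the heat kernel:
\[
  G(t,s)\psi(x) = \int_{\mathbb{R}^{n}}
  \bigl(4\pi (t-s)\bigr)^{-n/2}\, e^{-\frac{|x-y|^{2}}{4(t-s)}}\, \psi(y)\, dy .
\]
```

In both configurations with $r$ between $s$ and $t$, the product of multipliers reproduces $m(t, s, \xi)$, which is the identity $G(t, s) = G(t, r)G(r, s)$ in this special case.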
2.8. Representation with Green operators. In this section we detail how the Green operators can be seen as operator-valued Schwartz kernels for the inverse of $H$. This illustrates how objects of the classical theory for smooth coefficients can be rediscovered as part of our 'universal' construction. While Proposition 2.36 is not used in other sections, a key ingredient in its proof implies representations for solutions (Theorem 2.38) that will be useful when we deal with Cauchy problems.
The result on operator-valued Schwartz kernels is as follows.
Before we come to the proof, let us interpret the result in the context of the Schwartz kernel representation [34]. Indeed, there exists a unique . We indicate the dummy variables for notational simplicity and the bracket with parentheses are the bilinear dualities. For example, building H from the heat operator ∂ t − ∆, we see that K t,x,s,y can be identified with the heat kernel Not all such operators may have kernels with pointwise bounds. In any case, we can also proceed by fixing ψ,ψ ∈ D(R n ) and looking at the bilinear map Thus, Theorem 2.36 establishes that K ψ,ψ,t,s can be identified with a locally integrable function and we can set the values on the diagonal being irrelevant. In particular, K ψ,ψ,t,s agrees with a separately continuous function on In order to prove Proposition 2.36, we begin with a pointwise variant for all t ∈ R.
. For any f ∈ D(R), t ∈ R and ψ,ψ ∈ D(R n ), Proof. In order to simplify the exposition, we assume t = 0. The general case follows by translation as usual.
Let (ϕ ε ) be a standard mollifying sequence in the t-variable. Since H * −1 is the adjoint of H −1 and f ⊗ ψ, ϕ ε ⊗ψ belong toV, we have Since f ⊗ ψ and ϕ ε ⊗ψ are in L 1 t L 2 x , we know from Theorem 2.29 (and its proof) that H −1 (f ⊗ ψ) and H * −1 (ϕ ε ⊗ψ) belong to C 0 (L 2 x ). Both duality pairings are given by absolutely convergent Lebesgue integrals and we can apply Fubini's theorem on the right-hand side in order to write For fixed s we use (2.34) in order to arrive at Now, we pass to the limit as ε → 0 as follows. Since as mentioned before, we see that the left-hand side of (2.38) tends to the left-hand side of (2.37) as ε → 0.
On the right-hand side we use the properties of the Green operators from Proposition 2.32. Since G(·, s)ψ is continuous with values in L 2 x except at s, we have pointwise convergence x and the duality pairing on the right-hand side of (2.38) is another convergent Lebesgue integral that obeys the estimate with C independent of s and ε. Since f is integrable, the right-hand side of (2.38) tends to the right-hand side of (2.37) as ε → 0 by dominated convergence.
Proof of Proposition 2.36. We use the continuity of t → H −1 (f ⊗ψ)(t) in L 2 x to rewrite the left-hand side of (2.36) as a Lebesgue integral and use (2.37) for the integral in x in order to obtain The remaining question is therefore whether the above iterated integral can be taken in the sense of Lebesgue on R 2 , so that we can use Fubini's theorem to conclude the proof of (2.36).
t ∈ R} and we can consider it as an almost everywhere defined measurable function on R 2 . Finally, uniform boundedness of the Green operators in L 2 x yields Lemma 2.37 in turn implies a representation formula for solutions to equations with general right-hand side.
Theorem 2.38. Assume (H 0 ). Let (r, q) be an admissible pair or (r, q) = (∞, 2), obtained by combining Theorems 2.28, 2.29 and 2.30, can be represented by the equality when t = s for the first term and where the integrals are defined in the weak sense, that is, x . Proof. We prove (2.42). By uniqueness, it suffices to consider the three terms in the right-hand side of the equation individually, assuming the other two vanish. Fix t ∈ R.
For the term involving $\psi$, this is the definition of $G(t, s)\psi$ if $t \neq s$.
For the term involving $g$, Lemma 2.37 and the adjoint relation $\tilde G(\sigma, t) = G(t, \sigma)^*$ yield the result when $\tilde\psi$ is a test function and $g = f \otimes \psi$ is a tensor product of test functions (or a linear combination of such tensor products). Hence, $\tilde\psi$ and $g$ already describe dense subsets of $L^2_x$ and $L^{r'}_t L^{q'}_x$, respectively. Consider $g \in L^{r'}_t L^{q'}_x$ and $\tilde\psi \in L^2_x$. If $(r, q)$ is admissible, then (2.22) in Theorem 2.28 and Cauchy-Schwarz yield and by (2.29) for the adjoint equation, If $(r, q) = (\infty, 2)$, then Theorem 2.29 yields and by (2.31) for the adjoint equation, Under either assumption one can thus pass to the limit by density in (2.37), and (2.42) is proved in this case.
For the term involving $F$ the proof is analogous. As before, for $F \in L^2_t L^2_x$ and $\tilde\psi \in L^2_x$, x and by construction of $\tilde G$ and Theorem 2.30, $\nabla \tilde G(\cdot, t)\tilde\psi \in L^2_t L^2_x$ and $\iint_{\mathbb R^{n+1}} |F(\sigma, y)| |\nabla \tilde G(\sigma, t)\tilde\psi(y)| \, d\sigma dy \le C(n, q, r) \|F\|_{L^2_t L^2_x} \|\tilde\psi\|_{L^2_x}$. Hence, the integral involving $F$ on the right-hand side of (2.42) is defined and, by density, it remains to check that it agrees with $-\langle H^{-1}(\operatorname{div} F)(t), \tilde\psi\rangle$ when $\tilde\psi$ is a test function and $F$ is a tensor product $f \otimes \psi$ of test functions. But in this case we have $\operatorname{div} F = f \otimes \operatorname{div} \psi$ and Lemma 2.37 along with the adjointness relations for $G$ yields
Remark 2.39. We draw the reader's attention to the following regularity result that is implicit in the equality proved in Theorem 2.38. The integral involving $g$ is continuous in $L^2_x$ as a function of $t$, while its definition merely yields continuity for the weak $L^2_x$ topology. The same comment applies to the integral involving $F$.
2.9. Invertibility, Causality. So far, we have not addressed sufficient conditions for invertibility of $H$ and the question of causality. The true use of the space $\dot V$ will become transparent here. The first two results require a smallness assumption on the lower order terms and are hence perturbative in nature, starting from the pure second order case. The third one is non-perturbative and uses lower bounds. At this stage, we finally impose ellipticity on $A$ in the sense of Gårding: for some $\lambda > 0$ we assume that for all $u \in \dot V$, (2.43) $\operatorname{Re} \langle A\nabla u, \nabla u\rangle \ge \lambda \|\nabla u\|^2_{L^2_t L^2_x}$. Note that this is equivalent to having for almost every $t$ and every $w \in \dot H^1(\mathbb R^n)$ that $\operatorname{Re} \langle A(t)\nabla w, \nabla w\rangle \ge \lambda \|\nabla w\|^2_{L^2_x}$. This lower bound would also be the one to assume for systems. When $A(t)$ is a matrix with real measurable entries with respect to $x$, this is also equivalent to a pointwise lower bound for $A(t, x)$, see [36]. However, this last observation is not valid for complex matrices or systems.
Proof. For the sesquilinear form β corresponding to the lower order coefficients, we have seen in Lemma 2.7 that By the embeddingV ֒→∆ r 1 ,q 1 , we have with C = C(n, r 1 , q 1 ), be of the same type as H but without lower order terms. We use the classical hidden coercivity inequality x , where δ > 0 and H t is the Hilbert transform in the t-variable with symbol iτ /|τ |, see [24]. This inequality follows from the factorization t , so that, using also the skew-adjointness of the Hilbert transform and commutation, As x , we can now set δ := λ/(1 + Λ) and define ε 0 through Cε 0 √ 1 + δ 2 = δ/2 in order to conclude that Pr 1 ,q 1 ≤ ε 0 implies that It follows from the Lax-Milgram lemma that (Id +δ H t ) * H is invertible fromV toV ′ . As Id +δ H t is also invertible onV and its dual, this proves the invertibility of H.
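The choice of parameters in this proof can be made explicit. The following computation is a sketch that uses only facts stated in the proof: the Hilbert transform $H_t$ has symbol $i\tau/|\tau|$, and $\delta$, $\varepsilon_0$ are fixed by $\delta = \lambda/(1 + \Lambda)$ and $C\varepsilon_0\sqrt{1 + \delta^2} = \delta/2$. The constant $C$ is the one from the proof and is not computed here; the remark on the origin of the factor $\sqrt{1 + \delta^2}$ is an interpretation, not a statement from the text.

```latex
% The symbol i\tau/|\tau| squares to -1, hence H_t^2 = -Id and
\[
  (\mathrm{Id} + \delta H_t)(\mathrm{Id} - \delta H_t)
  = \mathrm{Id} - \delta^{2} H_t^{2}
  = (1+\delta^{2})\,\mathrm{Id},
  \qquad
  (\mathrm{Id} + \delta H_t)^{-1} = \frac{\mathrm{Id} - \delta H_t}{1+\delta^{2}} .
\]
% The symbol of Id + \delta H_t has constant modulus, which is consistent with
% the factor \sqrt{1+\delta^2} appearing in the definition of \varepsilon_0:
\[
  \Bigl| 1 + i\delta\, \tfrac{\tau}{|\tau|} \Bigr| = \sqrt{1+\delta^{2}} .
\]
% Solving C \varepsilon_0 \sqrt{1+\delta^2} = \delta/2 with \delta = \lambda/(1+\Lambda):
\[
  \varepsilon_0
  = \frac{\delta}{2C\sqrt{1+\delta^{2}}}
  = \frac{\lambda}{2C\sqrt{(1+\Lambda)^{2}+\lambda^{2}}} .
\]
```

In particular, $\varepsilon_0$ is determined by $\lambda$, $\Lambda$ and the constant $C = C(n, r_1, q_1)$ alone.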
Theorem 2.41 (Causality). Assume that A is elliptic and bounded with parameters λ, Λ as in (2.4), (2.43). There is ε 0 > 0 small enough depending on λ, Λ, n, q 1 , r 1 such that Pr 1 ,q 1 ≤ ε 0 implies that H is causal in the following sense: (i) If u is a∆ r 1 ,q 1 -solution of ∂ t u + Lu = − div F + g as in Theorems 2.28 or 2.29 or combination of both, and if F, g vanish on (−∞, s) × R n for some Proof. We begin with the proof of (i). We know that u ∈ C 0 (L 2 x ). As usual, we may assume s = 0 to simplify the exposition. We let S := sup t≤0 u(t) 2 L 2 x . The integral identities of Lemma 2.22 apply to u and for σ ≤ τ ≤ 0 we obtain since F, g vanish in this range. We send σ → −∞ and take some τ ≤ 0 at which the supremum S is attained. Then, we have where I := Altogether, where the second step is by Young's inequality, keeping in mind that 2 ≤ r 1 < ∞. If Pr 1 ,q 1 ≤ ε 0 is small enough, then we can hide the contribution of S on the left and obtain that S ≤ 0. Hence, u(t) = 0 for all t ≤ 0.
For the proof of (ii), we know that $u \in C_0(-\infty, s; L^2_x)$ with $L^2_x$ limit when $t \to s^-$. In particular, we can argue with $S = \sup_{t \le s} \|u(t)\|^2_{L^2_x}$, where $u(s)$ means $u(s^-)$, and the proof is the same.
We turn to lower bound assumptions. We assume $P_{\tilde r_1, \tilde q_1}$ finite but not necessarily small. Here, we do not explicitly need the lower bound (2.43) on $A$, but it is hidden in checking the assumptions.
(i) Assume that there exists c > 0 such that almost everywhere for all w ∈Ḣ 1 (R n ). Then H is causal in the sense of Theorem 2.41.
Proof. To prove (i), arguing as before and using the assumption, we have When $r_1 = 2$ (hence $n \ge 3$ and $q_1 = \frac{2n}{n-2}$), the Sobolev inequality gives us $\|u\|_{\dot\Delta^{r_1,q_1}} \le c(n, q_1, r_1) \|\nabla u\|_{L^2_t L^2_x}$. Since the Hilbert transform is isometric on $L^2$ and commutes with the gradient, we see that if $\delta > 0$ is so small that $c - c(n, q_1, r_1) C \delta > 0$, then $H$ is invertible.
When r 1 > 2, we refine the first embedding of Lemma 2.3, by replacing (2.1) with x . Using that H t is isometric onV, we obtain Re Hu, (Id +δ H t )u x with C ′ = c ′ (n, q 1 , r 1 )C. One chooses first ε with 0 < C ′ ε 1/θ θ < 1 and then δ with This yields the desired invertibility for H. The proof of (ii) is easy. Let u be a∆ r 1 ,q 1 -solution of ∂ t u + Lu = δ s ⊗ ψ − div F + g. Then we know that for τ < s, We conclude right away that u(τ ) = 0.
Remark 2.43. In practice, the hypothesis in (i) follows from elliptic inequalities of the form Re βw, w ≤ (1 − γ) Re A∇w, ∇w with γ < 1, when there is a lower bound λ > 0 for A as in (2.43). It is not so much the smallness of Pr 1 ,q 1 that matters (although its size can be used in proofs).
We obtain as a corollary the further identities for the Green operators that have been mentioned in Remark 2.34 (i) earlier on.
Corollary 2.44. If the previous results on invertibility and causality under smallness assumptions or lower bounds hold, then G(t, s) = 0 if t < s and G(t, s) → I strongly as t → s + .
2.10. Inhomogeneous assumptions on lower order terms. So far, we have put ourselves in the situation where the lower order terms bring a contribution that is homogeneous to the gradient in L 2 . However, if the size of this contribution is not small enough, then invertibility of H is not clear and we have to consider inhomogeneous assumptions by adding a positive constant.
We define the inhomogeneous versions ofV and∆ r,q . We set V : The continuous inclusion V ֒→ ∆ r,q for admissible pairs follows from Lemma 2.3. For admissible pairs, we still miss the extreme cases r = ∞ that one obtains when L ∞ t L 2 x replaces H 1/2 t L 2 x , and q = ∞. The descriptions of the dual or pre-dual of ∆ r,q are similar.
We may as well enlarge the class of coefficients and assume from now on that (2.45) $|a|^2 + |b|^2 + |a| \in L^{\tilde r_1}_t L^{\tilde q_1}_x + L^\infty_t L^\infty_x$ with $(\tilde r_1, \tilde q_1)$ being a compatible pair for lower order coefficients, as in Section 2.6. Recall that this means $\frac{1}{\tilde r_1} + \frac{n}{2\tilde q_1} = 1$ with $(\tilde r_1, \tilde q_1) \in (1, \infty]^2$.
Remark 2.45 (Subcritical exponents). Let $(\tilde r, \tilde q) \in [1, \infty]^2$ satisfy the subcritical compatibility relation $\frac{1}{\tilde r} + \frac{n}{2\tilde q} < 1$. Coefficients with $|a|^2 + |b|^2 + |a| \in L^{\tilde r}_t L^{\tilde q}_x$ have such a decomposition when, in addition, Indeed, this is immediately seen from visualizing exponents in a $(\frac{1}{r}, \frac{1}{q})$-plane and truncating coefficients at a fixed height. In [4], subcritical compatibility is assumed because the goal is to deal with bounded solutions. The same condition appears for Cauchy problems in [28]. See also [26].
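To see why the compatibility relation is called critical and its strict version subcritical, one can use the usual parabolic scaling heuristic. The computation below is a standard sketch that is not taken from the text: the rescaled objects $u_\mu$, $a_\mu$ and the parameter $\mu > 0$ are notation introduced here, the model equation $\partial_t u - \Delta u + a u = 0$ is an assumption made for illustration, and $1/\infty$ is read as $0$.

```latex
% If u solves \partial_t u - \Delta u + a u = 0, then u_\mu(t,x) := u(\mu^2 t, \mu x)
% solves the same type of equation with the rescaled zero order coefficient
\[
  a_\mu(t,x) := \mu^{2}\, a(\mu^{2} t, \mu x),
  \qquad
  \partial_t u_\mu - \Delta u_\mu + a_\mu u_\mu = 0 .
\]
% Changing variables first in x, then in t, gives
\[
  \|a_\mu\|_{L^{r}_t L^{q}_x}
  = \mu^{2}\, \mu^{-n/q}\, \mu^{-2/r}\, \|a\|_{L^{r}_t L^{q}_x}
  = \mu^{\,2\left(1 - \frac{1}{r} - \frac{n}{2q}\right)}\, \|a\|_{L^{r}_t L^{q}_x} .
\]
% Critical (compatible) case: the norm is scale invariant,
\[
  \frac{1}{r} + \frac{n}{2q} = 1
  \quad\Longrightarrow\quad
  \|a_\mu\|_{L^{r}_t L^{q}_x} = \|a\|_{L^{r}_t L^{q}_x}
  \quad \text{for all } \mu > 0 .
\]
% Examples for n = 3: (r,q) = (\infty, 3/2), (4, 2), (2, 3).
```

In the subcritical case $\frac{1}{r} + \frac{n}{2q} < 1$ the exponent of $\mu$ is positive, so the rescaled norm tends to $0$ as $\mu \to 0$, that is, when zooming in on small scales.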
We can define where V ′ is the dual of V with respect to L 2 t L 2 x duality. We use the same notation H as before, although V now is a smaller space. These operators are bounded and adjoint to one another. Instead of (H 0 ) we now work under the hypothesis that there is κ > 0 such that H + κ and H * + κ are invertible.
Then we naturally work with $H + \kappa$, which means that we add the assumption $u \in L^2_t L^2_x$ in most statements. Again, the estimates will depend on $\Lambda = \|A\|_\infty$, the bound implicit in (2.45), with fixed compatible pair $(\tilde r_1, \tilde q_1)$ for lower order coefficients, and the norm of the inverse of $H + \kappa$. Here is a description of the changes in Sections 2.1-2.9:
(i) In the modification of Corollary 2.20 we assume $u \in L^2_t H^1$ is an admissible pair or just $u \in C_0(L^2_x)$ when $(r, q) = (\infty, 2)$. The proofs of the technical lemmas use estimates for the operator $\partial_t - \Delta + 1$. (Note that we still cannot use the Lions embedding theorem to deduce continuity because we do not, and do not want to, assume that $u \in V$.)
(ii) In Lemma 2.22, we assume $u \in L^2_t H^1_x$ with $(r, q)$ an admissible pair or $(r, q) = (\infty, 2)$ and $h \in L^2_t L^2_x$. Then the conclusion is the same (without a constant) with the extra term $\langle h(t), u(t)\rangle$ in (2.17). The statements that follow in the same section are adapted similarly.
(iii) In all statements of Section 2.6, Assumption (H$_0$) is replaced by (H$_\kappa$) and the equation to solve is for the operator $\partial_t + L + \kappa$ in the sense of $\Delta^{r_1,q_1}$-solutions, which are defined by changing $\dot\Delta^{r,q}$ to $\Delta^{r,q}$ in Definition 2.13. Such solutions belong to $C_0(L^2_x)$. Uniqueness is in that class, and existence theorems through the inverse of $H + \kappa$ or its adjoint are proved for this operator with possible addition of an extra term $h \in L^2_t L^2_x$ in Theorem 2.28.
(iv) In Section 2.7, if we assume (H$_\kappa$) instead of (H$_0$), we obtain exactly the same statements for the Green operators $G_\kappa(t, s)$ and $\tilde G_\kappa(s, t)$ of $\partial_t + L + \kappa$ and $-\partial_t + L^* + \kappa$, respectively. In particular, they are uniformly bounded operators on $L^2_x$ provided $t \neq s$.
(v) Section 2.8 can be adapted mutatis mutandis with the Green operators $G_\kappa$ under (H$_\kappa$). In the statement corresponding to Theorem 2.38, one can again add an extra forcing term $h \in L^2_t L^2_x$.
In order to check invertibility and causality, we introduce the following property on the lower order coefficients, where ε ≥ 0 will be chosen appropriately small later.
Assumption (D ε ). For some compatible pair (r 1 ,q 1 ) for lower order coefficients, one can find a decomposition with Pr 1 ,q 1 : = |a 0 | 2 1/2 Remark 2.46. By truncation at large height one can always do such a decomposition for any ε > 0 starting from |a| 2 + |b| 2 + |a| ∈ Lr 1 t Lq 1 x + L ∞ t L ∞ x , except ifr 1 = ∞. In this case, one needs further assumptions, such as that the part in L ∞ t Lq 1 x is uniformly continuous in time. For example, independence of time is a valid hypothesis. In other words, (D ε ) always holds for all ε > 0 except whenr 1 = ∞.
The quantities $P_{\tilde r_1, \tilde q_1}$ and $P_\infty$ turn out to quantify some estimates nicely.
Remark 2.48. The proof shows that, under the assumptions of Theorem 2.40, which correspond to $P_{\tilde r_1, \tilde q_1} \le \varepsilon_0$ and $P_\infty = 0$, the operator $H + \kappa : V \to V'$ is invertible for all $\kappa > 0$.
For causality, Theorem 2.41 becomes the following result.
Theorem 2.49 (Causality, inhomogeneous case). With the assumptions of Theorem 2.47 there exists κ 0 > 0 such that H + κ is causal for κ ≥ κ 0 in the following sense: (i) If u is a ∆ r 1 ,q 1 -solution of ∂ t u + Lu + κu = − div F + g + h as in the modifications of Theorems 2.28 or 2.29 (here, h ∈ L 2 t L 2 x , while g ∈ L r ′ t L q ′ x ) and F, g, h vanish on (−∞, s) × R n for some s ∈ R, then u = 0 identically on (−∞, s] × R n . (ii) If u is a ∆ r 1 ,q 1 -solution of ∂ t u + Lu + κu = δ s ⊗ ψ as in the modification of Theorem 2.30, then u = 0 identically on (−∞, s) × R n .
Proof. It begins as the proof of Theorem 2.41. In the first case, as $u \in C_0(L^2_x)$, we fix $s = 0$ to simplify the exposition, let $S := \sup_{t \le 0} \|u(t)\|^2_{L^2_x}$, which is attained at some $\tau$, and obtain and Young's inequality allows us to hide the contribution coming from $S$ on the left-hand side, up to losing a little on $-2\lambda I$, by choosing $\varepsilon$ small enough. Next the and Young's inequality yields again a contribution of x dt that is compensated if $\kappa$ is larger than a constant times $P_\infty^2$, up to losing again a little on $-2\lambda I$. We obtain $S \le 0$, and thus $u(t) = 0$ for $t \le 0$.
In the second case, we know that $u \in C_0(-\infty, s; L^2_x)$ with $L^2_x$ limit when $t \to s^-$. In particular, we can argue with $S = \sup_{t \le s} \|u(t)\|^2_{L^2_x}$, where $u(s)$ means $u(s^-)$, and the proof is the same.
Finally, the result with lower bounds (Theorem 2.42) becomes the following; we skip the easy adaptation of the proof. Again, a lower bound on $A$ is implicit in order to check the assumptions.
x almost everywhere for some c ′ > 0 and for all w ∈ H 1 (R n ). Then H + κ is causal in the sense of Theorem 2.49 for any κ ≥ c ′ .
Corollary 2.51. Under either the smallness assumption (Theorem 2.49) or lower bounds (Theorem 2.50 (ii)), it follows for all $\kappa$ large enough that $G_\kappa(t, s) = 0$ if $t < s$ and $G_\kappa(t, s) \to I$ strongly as $t \to s^+$.
2.11. The Cauchy problem and the fundamental solution operator. We finally consider the Cauchy problem on the strip $[0, T] \times \mathbb R^n$ with $0 < T < \infty$. (We shall say a word concerning $T = \infty$ in Remark 2.55.) It is of course sufficient to consider coefficients only on this strip, and the foregoing results will allow us to work under the following set of assumptions.
(A1) $L$ is given as in (2.3) with coefficients $A, a, b, a$ defined almost everywhere in $(0, T) \times \mathbb R^n$.
(A2) $A$ has bounds $\lambda, \Lambda$ as in (2.4), (2.43) for almost every $t \in (0, T)$.
(A3) The lower order coefficients satisfy (D$_\varepsilon$) on $(0, T) \times \mathbb R^n$ for all $\varepsilon$ small enough with compatible pair $(\tilde r_1, \tilde q_1)$ as in Definition 2.8.
Recall that we take coefficients with $|a|^2 + |b|^2 + |a| \in L^{\tilde r_1}_t L^{\tilde q_1}_x + L^\infty_t L^\infty_x$ in $(0, T) \times \mathbb R^n$. As in the case of $\mathbb R^{n+1}$, (A3) automatically holds except if $\tilde r_1 = \infty$ (hence $n \ge 3$ and $\tilde q_1 = n/2$), in which case we can proceed by imposing it or by assuming (uniform) $t$-continuity on $[0, T]$, valued in $L^{n/2}_x$, instead of mere boundedness; compare with Remark 2.46. An alternative in the case $(\tilde r_1, \tilde q_1) = (\infty, n/2)$ is to replace (A3) by the following condition.
(A3)'r 1 = ∞ and there exist c > 0 and c ′ ≥ 0 such that x for almost every t ∈ (0, T ) and all v ∈ V. Let (r 1 , q 1 ) be the admissible conjugate pair to (r 1 ,q 1 ) defined in (2.7). For ψ ∈ L 2 x , F ∈ L 2 (0, T ; L 2 x ), g ∈ L r ′ (0, T ; L q ′ x ), where (r, q) is an arbitrary admissible pair as in Definition 2.2 or (r, q) = (∞, 2), and h ∈ L 2 (0, T ; L 2 x ), the Cauchy problem with initial condition ψ and forcing terms − div F + g + h consists of finding in the sense that the first equation is satisfied weakly against test functionsφ ∈ D((0, T ) × R n ) as in (2.12) and that the second equation means u(t, ·) → ψ in D ′ (R n ) as t → 0. Weak solutions for the Cauchy problem (2.49) on [0, T ] × R n are those solutions in the class L 2 (0, T ; H 1 (R n )) ∩ L ∞ (0, T ; L 2 (R n )). This space embeds into L r 1 (0, T ; L q 1 (R n )), see Proposition 5.1 for a quick proof. So we have defined an a priori larger class of solutions. We next show that in fact a∆ r 1 ,q 1 -solution is continuous in time valued in L 2 x , so that in the end it is a weak solution. This continuity can be obtained as a regularity result in the inhomogeneous variational setting. Lemma 2.52. Let I = (0, T ) and u ∈ L 2 (I; H 1 x ) with ∂ t u = − div F + g + h, where F ∈ L 2 (I; L 2 x ), g ∈ L r ′ (I; L q ′ x ) for (r, q) an admissible pair or (r, q) = (∞, 2), and h ∈ L 2 (I; L 2 x ). Then u ∈ C([0, T ]; L 2 x ) and we have the integral equalities (2.17) for 0 ≤ σ < τ ≤ T .
Proof. According to the discussion (ii) in Section 2.10, the result holds when I = (0, ∞). We proceed as in Corollary 2.24 by constructing an extension v of u to which this applies. The situation here is easier since u is locally integrable; still the language of distributions is convenient.
More precisely, for f ∈ D(0, ∞) and ψ ∈ D(R n ), we first note that t → u(t), ψ is absolutely continuous on [0, T ] with derivative equal to F (t), ∇ψ + g(t), ψ + h(t), ψ almost everywhere on (0, T ). Hence, if χ denotes a smooth function that is 1 for t ≤ T and 0 for t ≥ 2T , the expression where the subscripts o, e denote odd and even extensions at t = T , respectively. Hence, all the required assumptions can be verified to conclude that v ∈ C 0 ([0, ∞), L 2 x ). As usual one can add to g several other terms of the same type with different admissible pairs.  x . (ii) Changing the origin of time, for 0 ≤ s ≤ t ≤ T there exists a unique L 2 xbounded operator Γ(t, s), called fundamental solution operator for the Cauchy problem on (s, T ) × R n with no forcing terms and initial condition in L 2 x , that sends the initial condition at time s to the value of the unique solution at time t. (iii) For t ∈ [0, T ] the solution u above is then given by where the first two integrals are weakly defined in L 2 x , while the last one converges strongly (i.e. in the Bochner sense).
(iv) Define H on extending A by the identity and the lower order coefficients by 0 on (R \ (0, T )) × R n . 3 The fundamental solution operators themselves are given by for all κ ≥ 0 for which H + κ is invertible and causal and G κ (t, s) is the Green operator obtained under this assumption.
Proof. We extend the forcing terms F, g, h by 0, keeping the same notation for the extensions and L: they satisfy the same conditions on full space-time. We fix κ > 0 for which H + κ is invertible and causal (Theorems 2.47 and 2.49 or 2.50). First, we can use the inhomogeneous version of Theorems 2.28, 2.29 and 2.30 to build a (unique) ∆ r 1 ,q 1 -solution v to The assumption of causality implies indeed that u(t) → Γ(t, 0)ψ = ψ in L 2 x as t → 0. Applying the inhomogeneous version of Theorem 2.38 to the Green operators G κ (t, s) of H + κ gives us (2.50) with Γ(t, s) = e κ(t−s) G κ (t, s). We refer in particular to (v) in Section 2.10.
Next, we check uniqueness in the class L 2 (0, T ; H 1 x ) ∩ L r 1 (0, T ; L q 1 x ). Assume that u solves the Cauchy problem with ψ = 0, F = 0, g = 0, h = 0. By Corollary 2.53 we know that u ∈ C([0, T ]; L 2 x ). With κ as above, v := ue −κt solves the Cauchy problem with 0-data for ∂ t v + Lv + κv = 0 in (0, T ) × R n . By restriction, as before, we can use the global parabolic operator also to build a solution u T ∈ L 2 (T, ∞; to the same equation with initial data u T (T ) = u(T ) and in (−∞, 0] × R n we set u 0 (t) := 0. By continuity valued in L 2 x , we can glue u 0 , u, u T together to a ∆ r 1 ,q 1 -solution w of ∂ t w + Lw + κw = 0 in R n+1 , which vanishes identically by the inhomogeneous version of Theorem 2.27. Hence, we have u = 0.
The rest of the statement follows easily. By construction, u is the restriction of a function which belongs toV. Next, uniqueness implies that the formula (2.51) is valid for all κ > 0 for which H + κ is invertible and causal.
It remains to include the case $\kappa = 0$ for (2.51) when $H$ is invertible and causal. To this end, we can apply the homogeneous versions in Sections 2.6, 2.8 and 2.9 with $F, g, h$ all being zero. In that case, we consider a $\dot\Delta^{r_1,q_1}$-solution on $\mathbb R^{n+1}$, which produces a solution by restriction. This solution has a representation using the Green operators $G_0(t, s)$. The uniqueness already established shows that $\Gamma(t, s) = G_0(t, s)$.
Indeed, visualizing the exponents in a $(\frac{1}{\rho}, \frac{1}{\eta})$-plane immediately reveals that they can be decomposed as required.
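The role of the exponential factor in the relation $\Gamma(t, s) = e^{\kappa(t-s)} G_\kappa(t, s)$ can be seen from the following formal computation. It is only a sketch: it treats $u$ as if it were differentiable in $t$, whereas the text works with weak solutions, and it uses that $L$ contains no derivative in $t$, so that it commutes with multiplication by functions of $t$ alone.

```latex
% Formal computation (assumption: u differentiable in t). Let u solve
% \partial_t u + Lu = 0 on (s,T) with u(s) = \psi, and set v(t) := e^{-\kappa t} u(t).
\[
  \partial_t v + Lv + \kappa v
  = -\kappa e^{-\kappa t} u + e^{-\kappa t}\, \partial_t u
    + e^{-\kappa t} L u + \kappa e^{-\kappa t} u
  = e^{-\kappa t}\, (\partial_t u + Lu) = 0 .
\]
% v has data v(s) = e^{-\kappa s}\psi at time s. When H + \kappa is invertible
% and causal, this gives v(t) = G_\kappa(t,s)\, e^{-\kappa s}\psi for t > s.
% Undoing the substitution:
\[
  \Gamma(t,s)\psi = u(t) = e^{\kappa t}\, v(t)
  = e^{\kappa (t-s)}\, G_\kappa(t,s)\psi .
\]
```

The left-hand side does not involve $\kappa$, which is why the formula holds for every admissible value of $\kappa$ at once.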
Remark 2.57. If the coefficients for H are defined on R n+1 , then by (2.51) we have e κ(t−s) G κ (t, s) = e κ 0 (t−s) G κ 0 (t, s) for all κ ≥ κ 0 ≥ 0 and t, s ∈ R, where κ 0 is such that H + κ 0 is invertible and causal (setting also G 0 := G). It is interesting to note that this relation between Green operators cannot be seen directly in R n+1 because the conjugation by the exponentials does not preserve the spaces of solutions. Another interesting consequence is that it implies exponential decay estimates for the operator norm: x , recalling that t − s ≥ 0 for the Green operators to be non-zero.
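The decay statement in this remark can be made concrete as follows. This is a sketch of one way to read it, under the assumptions of the remark: the constant $C_0$, a uniform bound for the operator norms of $G_{\kappa_0}(t, s)$ on $L^2_x$, is notation introduced here, and the heat operator in the second part is a model case chosen for illustration.

```latex
% From e^{\kappa(t-s)} G_\kappa(t,s) = e^{\kappa_0(t-s)} G_{\kappa_0}(t,s), for t > s:
\[
  \|G_\kappa(t,s)\|_{L^2_x \to L^2_x}
  = e^{-(\kappa-\kappa_0)(t-s)}\, \|G_{\kappa_0}(t,s)\|_{L^2_x \to L^2_x}
  \le C_0\, e^{-(\kappa-\kappa_0)(t-s)} .
\]
% Model case (assumption): L = -\Delta and \kappa_0 = 0. On the Fourier side in x,
% G_\kappa(t,s) is multiplication by e^{-(t-s)(|\xi|^2 + \kappa)} for t > s, so
\[
  e^{\kappa(t-s)}\, e^{-(t-s)(|\xi|^{2}+\kappa)} = e^{-(t-s)|\xi|^{2}},
\]
% which does not depend on \kappa, in agreement with the relation in the remark.
```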
Remark 2.58. If (A3) is used, then constants in the implicit estimates for u depend on the choice of ε 0 for invertibility and causality, which was seen to depend on λ, Λ, n, q 1 , r 1 , and P ∞ in the decomposition (D ε 0 ). They do not depend on T unless P ∞ does.
where * is the complex adjoint and Γ(s, t) is the generalized fundamental solution of the adjoint problem.
Proof. We know that the Green operators G κ (t, s) and G κ (s, t) are adjoint operators. If we adapt the proof above to the adjoint backward operator −∂ t + L * , we produce solutions in (0, T ) × R n on restricting the ones in R n+1 for −∂ t + L * + κ multiplied by e κ(T −s) (in the variable s). Changing the initial time T to t, this yields that the fundamental solution operator Γ(s, t) for the adjoint problem agrees with e κ(t−s) G κ (s, t) = Γ(t, s) * , where the last equality follows from (2.51).
Corollary 2.60. The solution u of Theorem 2.54 agrees with the ones built in [28] and [4] under the assumptions of these references.
Proof. By mixed embeddings (Proposition 5.1), the space for uniqueness in Theorem 2.54 contains the standard energy space L 2 (0, T ; H 1 x ) ∩ L ∞ (0, T ; L 2 x ). In Chapter 3 of [28], weak solutions in the latter class are constructed exactly under the same assumptions (A1), (A2) and subcritical (Remark 2.45) or critical (a.k.a compatible) conditions on the coefficients (which are even assumed real there). In [4] this is being done under the more restrictive conditions of subcritical and real coefficients with the structural conditions (A1) and (A2).
2.12. L 2 off-diagonal estimates. Aronson further proved pointwise Gaussian estimates of the generalized fundamental solution when the coefficients are real-valued [3,4]. As already mentioned, assumptions on lower order coefficients in [4] amount to what we called subcritical compatibility (Remark 2.45), used in an essential way together with the fact that the coefficients are real, to obtain local boundedness of weak solutions. Already in the elliptic case with leading term the Laplacian on the unit ball, explicit examples show existence of unbounded weak solutions for some first order coefficients in L n or some zero order coefficients in L n/2 , see [27].
We know from Corollary 2.60 that our solutions agree with the ones of Aronson under his assumptions; in particular his generalized fundamental solution operator and ours are identical. Hence, pointwise bounds under (critical) compatibility assumptions are not to be expected. Still, under this assumption, we will be able to show L 2 off-diagonal estimates (or Gaffney estimates) for the fundamental solution operator, that is, decay of localized L 2 norms.
When there are no lower order terms, the method of Aronson has been streamlined with the exponential trick of Davies [14] for time independent A and this has been adapted by Fabes-Stroock [19] when A is time-dependent, see also Hofmann-Kim [22] for a nice presentation, using the Gronwall lemma as a starting point. The same ideas go through with bounded lower order coefficients but when they are allowed to be unbounded, it is not clear how to set up the arguments properly. In [9], a construction is proposed in absence of lower order terms, starting from the semigroup case. This approach is not possible when using mixed norms on lower order coefficients because there is no semigroup to begin with. Our approach allows us to overcome these difficulties by extending Davies' ideas to the context of variational parabolic forms.
Theorem 2.61. Assume the conditions of Theorem 2.54. Then there are constants 0 < C, c 0 , ω < ∞ such that for all 0 ≤ s < t ≤ T , all closed sets E, F ⊂ R n and all ψ ∈ L 2 x with support in F , we have
\[ \|\Gamma(t,s)\psi\|_{L^2(E)} \le C\, e^{-\frac{d(E,F)^2}{4c_0(t-s)}}\, e^{\omega(t-s)}\, \|\psi\|_{L^2(F)}. \qquad (2.53) \]
Let us comment on the three constants. If (A3) is used, then ω = c 0 P 2 ∞ with P ∞ from the decomposition given by (D ε 0 ) and C, c 0 depend only on λ, Λ, n, q 1 , r 1 , where ε 0 is such that the arguments for invertibility and causality apply. As in Remark 2.58, they may depend on T but only through P ∞ . If (A3)' is used, then C, ω = c 0 depend on Λ, n and c, c ′ in (A3)'.
Proof. We extend the coefficients to full space-time as in the proof of Theorem 2.54 and use the same notation. Henceforth, we work in R n+1 and prove (2.53) for all s < t.
For a function h : R n → [0, ∞[ bounded and Lipschitz, consider the operator obtained in R n+1 on conjugating H (= ∂ t + L) with the multiplication by e h . A calculation (in the weak sense) shows that and with A t being the real transpose of A, (2.56) The coefficients a h and b h are bounded by A ∞ ∇h ∞ . In a h , the first term is bounded by A ∞ ∇h 2 ∞ . To handle the second term, we distinguish the two assumptions.
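As an illustration of the conjugation, here is a sketch of the computation for the principal part only, writing L_0 = −div A∇ and omitting all lower order coefficients (a simplification made for this example). Since h does not depend on t, the time derivative is unaffected: e^h ∂_t (e^{−h} u) = ∂_t u.

```latex
% Sketch for the principal part L_0 = -\operatorname{div} A\nabla only.
% A^{t} is the real transpose of A; h is real, bounded and Lipschitz.
\begin{align*}
  \nabla(e^{-h}u) &= e^{-h}\,(\nabla u - u\,\nabla h), \\
  e^{h} L_0 (e^{-h}u)
    &= -\operatorname{div}(A\nabla u)
       + \operatorname{div}\big((A\nabla h)\,u\big)
       + A^{t}\nabla h\cdot\nabla u
       - (A\nabla h\cdot\nabla h)\,u .
\end{align*}
```

The two first order coefficients A∇h and A^t∇h are bounded by ‖A‖_∞ ‖∇h‖_∞, and the zero order coefficient A∇h · ∇h by ‖A‖_∞ ‖∇h‖_∞^2, in line with the bounds stated above.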
Proof under (A3). The number ε 0 is chosen in particular such that (2.48) for H + κ holds with κ ≥ κ 0 where κ 0 = δ/4 + c δ P 2 ∞ . Our first goal is to check that (2.48) for H + κ + β h holds with δ/4 replaced by, say, δ/8 for large enough κ that will also depend on ∇h 2 ∞ . To this end, it will suffice to revisit the proof of that inequality after adding the contribution of the coefficients in (2.56).
Step 1: Proof of the lower bound for the perturbed H + κ + β h . We decompose a − b = (a 0 − b 0 ) + (a ∞ − b ∞ ) as in the assumption (D ε ) with ε = ε 0 . The term coming from (a ∞ − b ∞ ) · ∇h brings a bounded contribution of size P ∞ ∇h ∞ . For the other term, we observe that a 0 − b 0 belongs to L 2r 1 t L 2q 1 x with norm not exceeding 2ε 0 and (2r 1 , 2q 1 ) is a subcritically compatible pair for coefficients of order 0. We decompose further this term as suggested in Remark 2.45. To this end, call L 0 the elliptic operator with coefficients A, a 0 , b 0 , a 0 and H 0 the corresponding parabolic operator. Through the choice of ε 0 , we can make sure that (2.44) holds for H 0 . We also know that the multiplication by V ∈ Lr 1 t Lq 1 x is a bounded operatorV →V ′ . Thus, we can choose η > 0 (depending on n, q 1 , r 1 , δ) so small that V Lr 1 For m > 0, the truncation V 0 : We choose m so that 4ε 2 0 m −1 ∇h ∞ = η. On the other hand, withβ h + β ∞ having first order coefficients bounded by ∇h ∞ + P ∞ and zero order coefficients bounded by ∇h 2 ∞ + P 2 ∞ up to multiplicative constants that depend only on λ, Λ, n, q 1 , r 1 . This was the key point.
Applying the same simple absorption argument as in Theorem 2.47 to this decomposition reveals that for some constant c 0 with the same dependency and κ = 1 + c 0 ( ∇h 2 ∞ + P 2 ∞ ), the operator in (2.57) is invertible from V onto V ′ with a lower bound δ/8 in (2.48).
Our next goal is to transfer such lower bounds to operator norms for the perturbed fundamental solution operator, keeping track of the dependence on h.
Step 2: Norm bounds for the perturbed fundamental solution operator. With the constraints on κ and h above, the norm of the inverse of H + κ + β h depends on λ, Λ, n, q 1 , r 1 but not on h. Altogether, it follows that the Green operators G h,κ (t, s) associated to e h He −h + κ are uniformly bounded on L 2 x with respect to (t, s) with a bound C 0 depending only on λ, Λ, n, q 1 , r 1 . Now, by construction we have G h,κ (t, s) = e h G κ (t, s)e −h and by Theorem 2.54 we have Γ(t, s) = e κ(t−s) G κ (t, s). Hence, e h Γ(t, s)e −h = e κ(t−s) G h,κ (t, s). This implies that for all t − s = 1 and ψ ∈ L 2 x ,
\[ \|e^{h}\Gamma(t,s)\psi\|_{L^2_x} \le (C_0 e)\, e^{c_0(\|\nabla h\|_\infty^2 + P_\infty^2)}\, \|e^{h}\psi\|_{L^2_x}. \]
A scaling argument will now provide us with the right dependence of ω. Fix s = 0 to simplify matters by time translation invariance of the assumptions. Set u(t, ·) := Γ(t, 0)ψ. Recall that u solves ∂ t u + Lu = δ 0 ⊗ ψ, so that if R > 0, then u R (t, x) := u(R 2 t, Rx) solves ∂ t u R + L R u R = δ 0 ⊗ ψ R , with ψ R (x) = ψ(Rx) and L R has coefficients A(R 2 t, Rx), R a(R 2 t, Rx), R b(R 2 t, Rx), R 2 a(R 2 t, Rx). The quantity Pr 1 ,q 1 is scale invariant and therefore does not depend on R. The same applies to the ellipticity constants λ, Λ, while P ∞ becomes P ∞ R. Applying the above conclusion to the Green operator of ∂ t + L R at t = 1 with h R (x) = h(Rx), and changing variables in space, yields
\[ \|e^{h}\Gamma(R^2,0)\psi\|_{L^2_x} \le (C_0 e)\, e^{c_0 R^2(\|\nabla h\|_\infty^2 + P_\infty^2)}\, \|e^{h}\psi\|_{L^2_x}. \]
Altogether, this shows for all t > s and ψ ∈ L 2 x ,
\[ \|e^{h}\Gamma(t,s)\psi\|_{L^2_x} \le (C_0 e)\, e^{c_0(\|\nabla h\|_\infty^2 + P_\infty^2)(t-s)}\, \|e^{h}\psi\|_{L^2_x}. \qquad (2.58) \]
It remains to optimize h appropriately.
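The bookkeeping behind the scaling argument of Step 2 can be sketched as follows, with R^2 = t − s. This is only an illustration of how the factor (t − s) enters the exponent; the symbols h_R and ψ_R are those of the text.

```latex
% Sketch; h_R(x) := h(Rx), \psi_R(x) := \psi(Rx), R^2 = t - s.
\begin{align*}
  \|\nabla h_R\|_\infty &= R\,\|\nabla h\|_\infty, \qquad
  \|e^{h_R}\psi_R\|_{L^2_x} = R^{-n/2}\,\|e^{h}\psi\|_{L^2_x}, \\
  c_0\big(\|\nabla h_R\|_\infty^2 + (R\,P_\infty)^2\big)
    &= c_0\big(\|\nabla h\|_\infty^2 + P_\infty^2\big)\,(t-s).
\end{align*}
```

The factor R^{−n/2} from the change of variables appears on both sides of the unit-time estimate and cancels, so only the exponent changes.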
Step 3: Choice of h. Fix E, F closed sets, let t > s and assume d(E, F ) > 0. We choose a bounded Lipschitz function h with h = d(E, F ) 2 /(2c 0 (t−s)) on E, h = 0 on F , and ‖∇h‖ ∞ = d(E, F )/(2c 0 (t−s)). Thus, if ψ has support in F , we obtain (2.53) with C = C 0 e and ω = c 0 P 2 ∞ .

Proof under (A3)'. We modify the argument, explaining how to adapt the proof of Theorem 2.50 (or Theorem 2.42 in the inhomogeneous setting, to be precise). As r 1 = ∞, we have n ≥ 3 and q̃ 1 = n/2.
There are two key observations. First, if we add lower order terms with bounded coefficients to L, then we still have the lower bound in (A3)' up to taking c smaller and c ′ larger. Second, if V ∈ L ∞ t L n/2 x , then x , so that in particular if η = c(n) −1 c/2 and V L ∞ t L n/2 x ≤ η, then we preserve the lower bound assumption of Theorem 2.42 on adding V .
In order to make use of these two observations, we recall that x m −1 ∇h ∞ , and we choose m so that this bound equals η. Thus, The decomposition replacing (2.57) is where β̃ h has first order coefficients bounded by C‖∇h‖ ∞ and zeroth order coefficients bounded by C(1 + ‖∇h‖ 2 ∞ ). Applying the two introductory observations and choosing κ = c 0 (1 + ‖∇h‖ 2 ∞ ) for an appropriate constant c 0 , we see that the inverse of H + β̃ h + V 0 + κ has a norm that is bounded by a constant independent of h. The rest of the proof is as in the first case but the scaling argument is not needed: we first obtain
\[ \|e^{h}\Gamma(t,s)\psi\|_{L^2_x} \le C_0\, e^{c_0(1+\|\nabla h\|_\infty^2)(t-s)}\, \|e^{h}\psi\|_{L^2_x} \]
for all t > s and then the same choice of h as before leads to (2.53) with ω = c 0 and C = C 0 .
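The exponent arithmetic behind the choice of h in Step 3 can be checked as follows. This is a sketch under the assumption that the weighted bound has the form stated in the first comment line, with γ := ‖∇h‖_∞, d := d(E, F), h = γd on E and h = 0 on F.

```latex
% Sketch. Assumed weighted bound:
%   \|e^{h}\Gamma(t,s)\psi\|_{L^2_x} \le C e^{(c_0\gamma^2+\omega)(t-s)} \|e^{h}\psi\|_{L^2_x}.
\begin{align*}
  \|\Gamma(t,s)\psi\|_{L^2(E)}
    &\le e^{-\gamma d}\,\|e^{h}\Gamma(t,s)\psi\|_{L^2_x}
     \le C\,e^{-\gamma d + c_0\gamma^2 (t-s)}\,e^{\omega(t-s)}\,\|\psi\|_{L^2(F)}, \\
  \gamma = \frac{d}{2c_0(t-s)}
    &\;\Longrightarrow\;
    -\gamma d + c_0\gamma^2(t-s)
    = -\frac{d^2}{2c_0(t-s)} + \frac{d^2}{4c_0(t-s)}
    = -\frac{d^2}{4c_0(t-s)} .
\end{align*}
```

Here ‖e^h ψ‖ = ‖ψ‖ because ψ is supported in F, where h vanishes. The chosen γ is the minimizer of the quadratic γ ↦ −γd + c_0 γ^2 (t − s).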
2.13. Pointwise Gaussian bounds. We prove that pointwise Gaussian bounds for the fundamental solution operator follow from an assumption of local boundedness on weak solutions of both the parabolic equation and its adjoint. To this end, we extend the argument presented in [22] without lower order coefficients. This argument adapts once we have (2.58) at hand. As said before, we do not know how to modify the argument in [22] for this inequality directly in the presence of lower order coefficients. We recall that a weak solution of ∂ t u + Lu = 0 in an open set I × Ω is a function u that is in the class L ∞ (I; L 2 (Ω)) with ∇u in L 2 (I; L 2 (Ω)) which satisfies the equation weakly against test functions φ ∈ D(I × Ω) as in (2.12). It is well-known that u is continuous in time locally in L 2 , see also Lemma 2.52. The following definition introduces quantitative boundedness in the two variables.
For (t, x) ∈ R n+1 and r > 0, we let Q r (t, x) = (t − r 2 , t] × B(x, r) and Q * r (t, x) = [t, t + r 2 ) × B(x, r) be the usual forward and backward in time parabolic cylinders.

Definition 2.62. We say that ∂ t + L and −∂ t + L * have the local boundedness property if there are ρ ∈ (0, ∞] and 0 < B < ∞ such that for all (t, x) ∈ R n+1 and 0 < r < ρ, any weak solution of ∂ t u + Lu = 0 and −∂ tũ + L * ũ = 0 on neighborhoods of Q 2r (t, x) and Q * 2r (t, x), respectively, has local bounds of the form ess sup

Remark 2.63. If ρ = ∞, the condition is scale invariant; here we will also encounter non-scale invariant situations, in which we need to consider ρ < ∞.
Note that these conditions are usually presented by taking suprema on Q r (t, x), Q * r (t, x) respectively, which means that one needs to know that solutions have pointwise values. Our weaker formulation suffices.
Proof. Under the hypotheses of Theorem 2.61 we have proved (2.58), which we rewrite for all t > s, ψ ∈ L 2 x and real, Lipschitz and bounded h as with Γ h (t, s) := e h Γ(t, s)e −h and ∇h ∞ = γ. By duality this inequality holds also We may apply (2.59) to u h and obtain for 0 < t − s < ρ 2 /2 and x ∈ R n that for almost every z ∈ B(x, x dτ. Note that the right-hand side does not depend on the space variable. As τ −s ≤ t−s, this implies Using (2.60) and (2.62) for the adjoint of Γ h (t, s) and duality, this yields Let us momentarily assume 0 < t − s < ρ 2 , that is k = 0. We shall remove this in the final step. By the Chapman-Kolmogorov identity of Theorem 2.33, which implies Γ(t, s) = Γ(t, r)Γ(r, s) with r = t+s 2 , we obtain x . By the Dunford-Pettis theorem (Theorem 1.3 in [2]), this amounts to the fact that for all t > s, Γ(t, s) = e −h Γ h (t, s)e h is an integral operator with measurable kernel that we denote by Γ(t, x, s, y), having an almost everywhere bound Taking h = 0 already gives us a uniform almost everywhere bound In order to prove (2.61), we fix x, y, t, s and assume |x−y| 16c 0 (t−s) . We pick h(z) = inf(γ|z − y|, N) with γ = |x−y| 4c 0 (t−s) and N > γ|x−y|. Thus, h is bounded and Lipschitz with ∇h ∞ = γ and 16c 0 (t−s) . Hence, This concludes the argument when 0 < t − s < ρ 2 . We are of course done when ρ = ∞. To conclude the proof when ρ < ∞, we iteratively apply the Chapman-Kolmogorov formula for Γ(t, s) together with the upper bound just found and the convolution rule g α ⋆ g β = g α+β , where g α (x) = (4πα) −n/2 e −|x| 2 /4α for α, β > 0. for almost every x, y ∈ R n+1 , where * is the complex adjoint (here the conjugation as the kernels are complex-valued) and Γ(s, y, t, x) is the generalized fundamental solution of the adjoint problem.
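The convolution rule g α ⋆ g β = g α+β used in the final iteration of the proof above can be verified on the Fourier side. The sketch below assumes the normalization of the Fourier transform stated in the comment; other conventions change constants but not the conclusion.

```latex
% Sketch. Convention (an assumption): \hat f(\xi) = \int_{\mathbb{R}^n} f(x)\,e^{-i x\cdot\xi}\,dx.
% g_\alpha(x) = (4\pi\alpha)^{-n/2} e^{-|x|^2/4\alpha}, \alpha > 0.
\begin{align*}
  \widehat{g_\alpha}(\xi) &= e^{-\alpha|\xi|^2}, \\
  \widehat{g_\alpha \star g_\beta}(\xi)
    &= e^{-\alpha|\xi|^2}\,e^{-\beta|\xi|^2}
     = e^{-(\alpha+\beta)|\xi|^2}
     = \widehat{g_{\alpha+\beta}}(\xi).
\end{align*}
```

When [s, t] is split into finitely many subintervals, each of length less than ρ^2, and the kernel bound on each piece is a multiple of g with parameter proportional to the length of that piece, the parameters add up under convolution to a multiple of t − s.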
Remark 2.66. Aronson's prerequisite for obtaining Gaussian upper bounds for his generalized fundamental solution (which we now know agrees with ours) is a condition on the coefficients that ensures the local boundedness property with the supremum, see Theorem B in [4]. Thus, Theorem 2.64 reproves Aronson's upper bound in a constructive way through identification of the general fundamental solution operators with integral kernels.
Remark 2.67. The stability result in Proposition 2.1 of [22] for pure second-order L could be adapted but not with full lower order terms. Although formulated as a perturbation result for local bounds, it proves more, namely: if weak solutions of ∂ t − div A∇ + b · ∇ satisfy local Hölder bounds with proper scaling, one preserves this regularity up to changing the Hölder exponent, on perturbing A in L ∞ and b in the compatible mixed Lebesgue space. It is not clear what happens when adding the other terms with a or a.
2.14. Pure second order elliptic part. When the lower order coefficients are zero, that is, the elliptic part is the pure second order operator L 0 := − div A∇, we see that there is no need to introduce the compatible pair (r 1 ,q 1 ) to define H 0 = ∂ t + L 0 :V → V ′ in Proposition 2.11 and the information that ∇u ∈ L 2 t L 2 x suffices. Thus, we can introduce the (larger) class of L 2 tḢ 1 x -solutions of ∂ t u + L 0 u = f in R n+1 , which we define as the class of distributions u with ∇u ∈ L 2 Inspection of the arguments in Section 2.6 reveals that if H 0 is invertible, then the statements extend by replacing systematically H,∆ r 1 ,q 1 and u ∆r1,q1 by H 0 , L 2 tḢ 1 x and ∇u L 2 t L 2 x , respectively. In particular, uniqueness up to a constant (assuming invertibility) is obtained in the larger class L 2 tḢ 1 x . From there on, the theory develops analogously in this special case. The Cauchy problem for ∂ t +L 0 can be posed and solved uniquely in L 2 (0, T ; H 1 (R n )) when T < ∞ for arbitrary data ψ, F, g, h in appropriate spaces, or in L 2 (0, ∞;Ḣ 1 (R n )) when T = ∞ and h = 0 (recovering and extending the result in [9]). The elimination of the constant comes from the initial data being in L 2 x . The L 2 off-diagonal decay was already known in this case (see the beginning of Section 2.12) but we still offer a different proof.
2.15. Lower order coefficients in Lorentz spaces. We have developed our variational approach under control of mixed Lebesgue norms on the lower order coefficients. We shall now explain why these conditions can be relaxed with hardly any effort, using the Lorentz spaces L p,∞ . Recall that on a measure space (M, µ), a measurable function f belongs to L p,q in the case 1 ≤ p, q < ∞ if Here, f * is the non-increasing rearrangement of f . It is known that L p,p = L p and that f L p,q is non-increasing as a function of q, so that L p,q ⊂ L p,p ⊂ L p,r if q ≤ p ≤ r. Details are found in Chapter 5 of [32]. Mixed Lorentz spaces in (t, x) have been introduced by Fernandez [20], who also proved that they behave in the same way as Lebesgue spaces concerning duality and multiplication (Hölder's inequality). Simple functions are dense in spaces for which all exponents are finite.
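For orientation, the standard quasi-norms on Lorentz spaces can be written as below. This is a sketch of the usual textbook definition; normalizing constants differ between references, and the display may not coincide with the one originally printed here. The example at the end is ours and is included because homogeneous power weights of this kind are the model case for the L p,∞ scale.

```latex
% Standard definitions (normalizations vary between references);
% f^* denotes the non-increasing rearrangement of f on (0,\infty).
\begin{align*}
  \|f\|_{L^{p,q}} &:= \Big(\int_0^\infty \big(t^{1/p} f^*(t)\big)^q \,\frac{dt}{t}\Big)^{1/q},
      & 1 \le p,q < \infty, \\
  \|f\|_{L^{p,\infty}} &:= \sup_{t>0}\, t^{1/p} f^*(t), & 1 \le p < \infty .
\end{align*}
% Example on \mathbb{R}^n with Lebesgue measure: for f(x) = |x|^{-n/p},
%   |\{ f > \lambda \}| = \omega_n \lambda^{-p}, \quad f^*(t) = (\omega_n/t)^{1/p}, \quad \omega_n = |B(0,1)|,
% so t^{1/p} f^*(t) is constant: f \in L^{p,\infty}, but f \notin L^{p,q} for any q < \infty.
```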
The extension mainly relies on the following lemma.
Lemma 2.68. Let (r 1 ,q 1 ) be a compatible pair for lower order coefficients with admissible conjugate (r 1 , q 1 ).
Proof. Sobolev embeddings are equivalent to L p − L q boundedness of Riesz potentials with p < q. However, it was observed by O'Neil [30] that such Riesz potentials also have L p,s − L q,s boundedness for the same p, q and all 1 ≤ s ≤ ∞. In particular, they are L p − L q,p bounded as L p = L p,p . Thus, with the same relations between q, r and θ as in Lemma 2.3 but with different constants, x and the continuous inclusion forV follows from (2.2). Now, we assume (2.69). A modification of Lemma 2.7, using Hölder's inequality in Lorentz spaces to guarantee that a product of three functions in L p i ,s i belongs to L 1 if 1 = 1 p 1 + 1 p 2 + 1 p 3 and 1 = 1 With this at hand, the boundedness of H fromV to its dual follows exactly as in Proposition 2.11. Likewise, if we assume (2.70), then we proceed with the modifications as in Section 2.10.
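The Hölder-type inequality for three factors invoked in the proof can be sketched on a single measure space as follows. This is a hedged statement of the classical O'Neil form for exponents 1 < p_i < ∞ and 1 ≤ s_i ≤ ∞; the mixed (t, x) version needed in the text relies on Fernandez's results and is not reproduced here.

```latex
% Sketch on one measure space; C depends only on the exponents.
% Assumptions: 1 < p_i < \infty, 1 \le s_i \le \infty,
%   1 = \tfrac1{p_1}+\tfrac1{p_2}+\tfrac1{p_3}, \qquad 1 = \tfrac1{s_1}+\tfrac1{s_2}+\tfrac1{s_3}.
\begin{align*}
  \int |f_1 f_2 f_3|\,d\mu
    \;\le\; C\,\|f_1\|_{L^{p_1,s_1}}\,\|f_2\|_{L^{p_2,s_2}}\,\|f_3\|_{L^{p_3,s_3}} .
\end{align*}
% Typical instance: (s_1,s_2,s_3) = (\infty,2,2), i.e. a coefficient in L^{p_1,\infty}
% paired with two functions in L^{p_2,2} and L^{p_3,2}.
```

The instance (∞, 2, 2) is the one matching a coefficient in a weak-type space tested against two functions in L^{·,2}-type spaces.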
Assuming that (2.69) holds for the compatible pair (r 1 ,q 1 ), one can define H and develop the variational theory upon replacing in the definition of the space∆ r 1 ,q 1 , where (r 1 , q 1 ) is the conjugate admissible pair, the mixed Lebesgue space L r 1 t L q 1 x by the mixed Lorentz space L r 1 ,2 t L q 1 ,2 x . With this precaution and these changes, the estimates in Corollary 2.20 and the integral equalities in Lemma 2.22 hold. (When (r, q) = (∞, 2), there is no weakening of assumptions and we keep working with the space L 1 t L 2 x .) We may proceed with the regularity Proposition 2.26, the uniqueness Theorem 2.27, the well-posedness Theorem 2.28 with g ∈ L r ′ ,2 t L q ′ ,2 x on the right-hand side, and so on up until Theorem 2.40. It is only for Theorem 2.41 that we need a stronger assumption on the coefficients to guarantee causality, as we have used inequalities in the spirit of Gagliardo-Nirenberg. It follows from Proposition 5.1 and Hölder inequalities that it is enough to impose One can also develop the corresponding inhomogeneous theory with coefficients as in (2.70), working mainly under the Lorentz-Lorentz analogue of Assumption (D ε ). While this amounts to the same symbolic changes from Lebesgue to Lorentz spaces in (D ε ) itself, the succeeding Remark 2.46 has to be interpreted correctly: it says that by truncation a decomposition as in (D ε ) for arbitrarily small ε > 0 can be achieved starting from |a| 2 + |b| 2 + |a| ∈ Lr 1 ,r 2 t Lq 1 ,q 2 x + L ∞ t L ∞ x with 1 ≤q 2 ,r 2 < ∞, but not when one ofq 2 ,r 2 is infinite. Hence, the lower bounds assumption (A3)' becomes more interesting here. 
In particular there is a statement corresponding to Theorem 2.54 in which mixed Lebesgue norms are replaced with mixed Lebesgue-Lorentz norms on the lower order coefficients with the same pairs (r 1 ,q 1 ) and x , and in the equation the forcing term g can be taken in L r ′ ,2 (0, T ; L q ′ ,2 x ) when (r, q) is admissible (but not when (r, q) = (∞, 2), where we take g ∈ L 1 (0, T ; L 2 x ) as before). All the direct consequences of this result also extend: Corollary 2.59, Theorem 2.61 and Theorem 2.64. In the latter theorem it depends on whether the local boundedness assumption is true for the particular L and its adjoint. Note that neither [28] nor [4] consider coefficients in mixed Lebesgue-Lorentz spaces. Hence this extension is quite a new observation.
Let us give an example in the case (r 1 ,q 1 ) = (∞, n/2), when n ≥ 3. Consider parabolic Schrödinger operators with c a complex-valued measurable and bounded function. One cannot use the assumption (D ε ) here. But the classical Hardy inequality which follows from Hardy's one dimensional inequality [33, Appendix A] using polar coordinates, allows one to apply Theorem 2.42 when ess inf Re c > −( n−2 2 ) 2 =: c n . Thus, H is invertible and causal (for causality, Re c ≥ c n works). One can therefore solve the Cauchy problem as above and obtain L 2 off-diagonal Gaussian decay of its fundamental solution operator. In [10], the slightly different but related question of existence of a distributional non-negative solution to the Cauchy problem for ∂ t −∆+ c|x| −2 with non-negative initial L 1 or measure data and c a constant with c ∈ [c n , 0] is considered.
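The role of the threshold c n can be made explicit with the classical Hardy inequality. The sketch below assumes that the operator in question has the form ∂ t − ∆ + c(t, x)|x| −2 (the display defining it is not recoverable from the text) and uses our own notation m := ess inf Re c and m^- := max(0, −m).

```latex
% Sketch; n \ge 3, c_n = -\big(\tfrac{n-2}{2}\big)^2, m := \operatorname{ess\,inf}\operatorname{Re} c, m^- := \max(0,-m).
\begin{align*}
  \Big(\frac{n-2}{2}\Big)^2 \int_{\mathbb{R}^n} \frac{|u|^2}{|x|^2}\,dx
    &\le \int_{\mathbb{R}^n} |\nabla u|^2\,dx, \\
  \operatorname{Re}\int_{\mathbb{R}^n}\Big(|\nabla u|^2 + c\,\frac{|u|^2}{|x|^2}\Big)\,dx
    &\ge \Big(1 - \frac{m^-}{|c_n|}\Big)\int_{\mathbb{R}^n}|\nabla u|^2\,dx .
\end{align*}
```

The factor on the right is strictly positive exactly when m > c n , which is the condition stated above; at m = c n the form is still non-negative but no longer coercive.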

2.16. Adding a skew-symmetric real BMO matrix to higher order coefficients. Motivated by fluid dynamics, it has become interesting to add to the usual elliptic matrix A a skew-symmetric term with boundedness replaced by a BMO condition. Indeed, formally, pointwise lower ellipticity of the matrix A does not change if one adds to it a real and skew-symmetric matrix D(t, x) as Re⟨D(t)∇u(t), ∇u(t)⟩ = 0 and, if D(t, x) has finite BMO norm in the x-variable, uniformly for each t, then for u, v ∈ V̇,
\[ |\langle D(t)\nabla u(t), \nabla v(t)\rangle| \le C\, \|D(t)\|_{\mathrm{BMO}_x}\, \|\nabla u(t)\|_{L^2_x}\, \|\nabla v(t)\|_{L^2_x} \]
using the BMO x − H 1 x duality and compensated compactness [13]. Integrating this in time guarantees boundedness and ellipticity of the second order term in L if A is changed to A + D with ‖D‖ L ∞ t BMO x < ∞. We shall make this precise below. All the results obtained up to this point extend with A replaced by A + D under this assumption on D. Indeed, the extension only affects the second order term, which has been treated via bounds for the pairing ⟨A∇u, ∇v⟩ at each occurrence rather than concrete bounds on A, with one sole exception that we address next.
The only subtle thing to handle is the proof of the L 2 off-diagonal estimates (2.53) as in Theorem 2.61 (with a less precise control on the constants C, ω, c 0 ), the difficulty being that D re-appears in lower order coefficients when using Davies' exponential trick in (2.54). We first give rigorous definitions of the bracket terms to justify computations.
We would like to set but the inner term is usually not an honest Lebesgue integral for arbitrary u, v ∈V. We introduce the set E of functions inV that are in S(R n+1 ) with bounded support in the x-variable, which is dense inV (resp. V). Indeed, we know that S(R n+1 ) is dense inV and from there, we can use smooth truncations. Consider u, v ∈ E. Let Q be a cube containing their support. Set for i, j ∈ {1, . . . , n} 2 , For each t, this is a bounded function with support in Q and mean value zero. Hence, it is a constant multiple of an atom in H 1 x , the real Hardy space on R n , and the BMO x − H 1 x duality is realized in this case as a Lebesgue integral As we know from [13] that x . Using the skew-symmetry of D, that is, d i,j = −d j,i , we can set and this form extends boundedly toV ×V. We now explain the necessary modifications.
Proof of Theorem 2.61, BMO-case. To check the invertibility, it suffices as before to look for lower bounds of e h (H + κ)e −h u. Thus, we study again e h He −h with h Lipschitz. We do not want to assume (qualitative) boundedness of h this time. Hence, we first restrict the operator to E but it extends to V through the right-hand side of (2.54). This allows us to take h an affine real-valued function given by h(x) = x· ζ + c, with ζ ∈ R n and c ∈ R. It will be important that the gradient of h is constant (as in [17,31]). Thus, we compute e h (H + κ)e −h u, v with u, v ∈ E and h affine.
Step 1: New error estimate. Compared to (2.54), we get an extra term coming from the presence of D. A calculation yields, with g i,j defined in (2.72), Next, we claim that for f, g ∈ H 1 x and each i ∈ {1, . . . , n} the function ∂ x i (f g) belongs to H 1 x with the estimate ). For f = g this is Proposition 3.2 in [31] and the argument applies mutatis mutandis in the general case. Moreover, if f, g are smooth with bounded support, then ∂ x i (f g) is a multiple of an atom in H 1 x , so that for any . Hence, for each fixed t, this applies to f = u(t), g = v(t) and, using again the skew-symmetry of D, we arrive at Using the above estimate and Young's inequality, we see that for any ε > 0, x . This is the required estimate for the additional error term in the presence of D.
Step 2: Off-diagonal estimate with affine perturbation. Now, it follows in the case (A3) that if (D ε 0 ) holds for ε 0 small enough, then e h (H + κ)e −h : V → V ′ is invertible for κ ≥ 1 + c 0 (|ζ| 2 + P 2 ∞ ). In the case of lower bounds assumptions for L, this is for κ ≥ c 0 (1 + |ζ| 2 ). Of course, c 0 now also depends on D L ∞ t BMO x . Moreover, in both cases, the operator norm of the inverse is bounded independently of |ζ|. In conclusion, we obtain an estimate of the form e h Γ(t, s)ψ L 2 x ≤ Ce (ω+c 0 |ζ| 2 )(t−s) e h ψ L 2 x for all t > s with positive constants C, ω, c 0 .
Step 3: Proof of (2.53). Let us first treat the case that E, F are convex and compact sets with d(E, F ) 2 > 4n(t − s). In this case, take e ∈ E, f ∈ F such that |e − f | = d(E, F ) and set Note that e is the orthogonal projection of f onto E and vice-versa. Hence, x ∈ E and h(y) ≤ 0 for y ∈ F , from which we obtain (2.53). For the general situation where E, F are arbitrary closed sets, we can assume d(E, F ) 2 > 8n(t − s); otherwise, we are done with the uniform L 2 x bound for Γ(t, s).
t − s k}, k ∈ Z n . Cover E with the cubes Q k that intersect E, and F with the cubes Q ℓ that intersect F . We have d(Q k , Q ℓ ) 2 > 4n(t−s). We apply the estimate just obtained for each pair Q k , Q ℓ and sum in order to conclude (of course the constants change), using that the cubes form a partition of R n up to a null set and simple discrete convolution inequalities.
Remark 2.69. When A is also a real matrix, pointwise upper and lower bounds were obtained for the fundamental solution of the parabolic operator with pure second order term and matrix coefficient A + D in [31]. Here, we allow complex A and unbounded lower order terms and limit ourselves to an L 2 − L 2 upper bound. Some similar estimates are obtained for time-independent matrix coefficients of the form A + D without lower order terms in [17]. In principle, we could re-discover pointwise upper bounds from (the extension of) Theorem 2.64, were we able to verify the local boundedness property without resorting to [31] itself. This is yet another example that illustrates how the order of classical arguments is reversed in our work.
2.17. Systems. The theory and its previous extensions do not change for systems of N equations, N ≥ 2. The results are the same with pointwise ellipticity in the x-variable replaced by ellipticity in the Gårding sense (uniformly in t): The matrix A(t) has entries being N × N matrices of bounded measurable coefficients in (t, x) and Re x holds for all t. Indeed, we have never used pointwise bounds and ellipticity on A for means other than bounding A∇u, ∇v from above and below. If one wants to add a matrix of BMO-type, it should be block diagonal, that is D = (δ α,β D α ) 1≤α,β≤N , where δ α,β is the Kronecker symbol, with each D α as in the previous section.
If the Gårding inequality comes with a negative L 2 norm on u(t), then one should apply the inhomogeneous theory. We leave details to the reader.

Higher order problems on full space
It is mainly a matter of fixing algebraic notation, as the analysis done for second order parabolic operators goes through almost verbatim for higher order problems on full space. We give details of the setup and sketch the main points, following faithfully what was done for second order problems. Given our omission of proofs, this section should be considered as an announcement of results, the verification of which is left to interested readers. Results in this section also provide the generalization of the theory for second order elliptic parts when the compatible pairs are allowed to vary with the coefficients, as mentioned earlier.

3.2. Variational space. For the homogeneous theory, the space V̇ becomes the space of tempered distributions u having Fourier transforms (|ξ| 2m + |τ |) −1/2 g for some (unique) g ∈ L 2 t L 2 x , equipped with the norm ‖u‖ V̇ := (2π) −(n+1)/2 ‖g‖ L 2 t L 2 x . As in the case of order 2, this space realizes L 2 x defined within tempered distributions modulo polynomials with norm

3.3. Embeddings. For an arbitrary collection (r, q) of pairs of exponents (r α,β , q α,β ) in [1, ∞] 2 indexed by multi-indices (α, β) with 0 ≤ |α|, |β| ≤ m, we set For each α, there could be several mixed spaces involved to which ∂ α u belongs, parametrized by the multi-indices β. If all pairs of exponents belong to [1, ∞) 2 , then the dual space of ∆̇ r,q in the duality extending the L 2 t L 2 x inner product can be identified with with the same interpretation as in the case m = 1 in Section 2.2 and (r ′ , q ′ ) is the collection of pairs of Hölder conjugates obtained from (r, q). When all pairs of exponents in (r, q) belong to (1, ∞] 2 , then the dual space of Σ r ′ ,q ′ can be identified with ∆̇ r,q for the same duality. In particular, ∆̇ r,q is reflexive when all pairs belong to (1, ∞) 2 .
3.5. Main regularity estimates. We can now state the main regularity lemma. We set ∇ m u = (∂ α u) |α|=m for simplicity.
x and ∂ t u ∈Σ r ′ ,q ′ , where (r, q) is a super admissible collection. Then, there is a polynomial P in the x-variable with degree not exceeding m − 1, such that u − P ∈ C 0 (L 2 x ) and sup with some constant C independent of u and P . Moreover, if the collection (r, q) is admissible, then u − P ∈V with the same estimate on u − P V .
The integral equalities of Section 2.5 are also proved similarly.
3.6. The resulting theory. The invertibility of H is again enough to develop the uniqueness and existence of∆r 1 ,q 1 -solutions and to produce Green operators in order to obtain representations. For example, the uniqueness statement corresponding to Theorem 2.27 becomes that whenever H is invertible, then any u ∈∆r 1 ,q 1 such that ∂ t u + Lu = 0 vanishes. The invertibility for H can be checked provided there is a Gårding inequality in the spirit of (2.43) for the leading coefficients, that is, and for the lower order coefficients, smallness of Pr 1 ,q 1 is needed. Alternatively, invertibility can also follow from lower bounds on L as in Theorem 2.42. If (3.7) holds and the leading part of H is a pure 2m-order operator, then one can work with the uniqueness class of L 2 tḢ m x -solutions, which is defined analogously to Section 2.14.
If the Gårding inequality comes with a negative L 2 t L 2 x norm on u, or Pr 1 ,q 1 is not small enough, or bounded coefficients are added to the lower order coefficients while Pr 1 ,q 1 remains small, or again that a lower bound is assumed on L, then one uses inhomogeneous spaces to prove invertibility of H + κ : V → V ′ for large enough κ.
Using the improvement of (3.5) with the mixed Lorentz spaces L r(α,β),∞ t L q(α,β),∞ x replacing the mixed Lebesgue spaces L r(α,β) t L q(α,β) x for the lower order coefficients is possible and Pr 1 ,q 1 is modified accordingly. This covers, for example, power weights c(t, x)|x| −n/q(α,β) with c bounded above and below, when r(α, β) = ∞. For forcing terms and solutions, the mixed Lorentz spaces L r,2 t L q,2 x may replace the mixed Lebesgue spaces L r t L q x with the same collections of pairs. The proof of causality uses a variant of Gagliardo-Nirenberg inequalities and requires mixed Lebesgue-Lorentz norms. A quick proof of this variant can be found in Proposition 5.1.
The Cauchy problem can be stated and proved in a similar fashion. The fundamental solution operator can be identified with exponentially weighted Green operators as before. Under the same assumptions guaranteeing invertibility and causality of H + κ, the fundamental solution operator enjoys L 2 off-diagonal estimates. Lipschitz bounded functions of the x-variable are replaced by the regular functions considered by Davies in [15] for the case of time-independent parabolic operators with bounded lower terms. This is more complicated here, because we take unbounded coefficients. But we can obtain lower bounds of perturbed operators e h (H + κ)e −h using successive and tedious decompositions of the perturbed coefficients as in the condition (D ε ), where κ is chosen on the order of c+c ∇h 2m ∞ and optimization in h gives exponential decay in (d(E, F ) 2m /|t − s|) 1/(2m−1) .
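The origin of the decay rate (d(E, F ) 2m /|t − s|) 1/(2m−1) can be seen from a one-parameter optimization. The sketch below is a model computation under simplifying assumptions of ours: the weight h is taken to satisfy h ≥ γd on E and h = 0 on F, and the cost of the perturbation is taken to be e^{cγ^{2m}τ}, with d := d(E, F), τ := t − s and c > 0 a generic constant.

```latex
% Sketch under the model assumptions stated in the lead-in.
\begin{align*}
  \phi(\gamma) &:= -\gamma d + c\,\gamma^{2m}\tau, \qquad
  \phi'(\gamma_*) = 0 \iff \gamma_* = \Big(\frac{d}{2mc\,\tau}\Big)^{\frac{1}{2m-1}}, \\
  \phi(\gamma_*) &= -\frac{2m-1}{2m}\,\gamma_*\, d
    = -\frac{2m-1}{2m}\,(2mc)^{-\frac{1}{2m-1}}
      \Big(\frac{d^{2m}}{\tau}\Big)^{\frac{1}{2m-1}} .
\end{align*}
```

For m = 1 this reduces to φ(γ_*) = −d^2/(4cτ), the Gaussian exponent of the second order case.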
Extensions to systems work without difficulty.

Second order problems with lateral boundary conditions
In this short section we describe an extension of our theory to second order parabolic problems on cylinders with lateral boundary conditions. As with the previous section, this should be considered an announcement of results. Working out the details along our sketch and extending the results to systems is again left to interested readers. Adaptation to higher order problems is likely to hold but would require further work. The cases $V = W^{1,2}_0(\Omega)$ and $V = W^{1,2}(\Omega)$ correspond to (pure) lateral Dirichlet and Neumann boundary conditions. Spaces in between can be used to model, for instance, a mix of the two.
The only geometric assumptions that we make on $\Omega$ are (fractional) Sobolev embeddings for $V$. We write $[\cdot\,,\cdot]_\theta$ for the complex interpolation bracket, see for example Section 1.9 in [35].
Assumption (V). We assume that there exists an embedding dimension $d \in [1,\infty)$ with the following property: For all $\theta \in [0,1]$ and $2 \le q < \infty$ such that $\frac12 - \frac{1-\theta}{d} = \frac1q$, we have
(4.1) $[L^2(\Omega), V]_{1-\theta} \hookrightarrow L^q(\Omega)$.
Hence, it will be advantageous to take $d$ as small as possible. The primary example we have in mind is when $\theta = 0$ is allowed above (hence $d > 2$) and therefore $V$ itself satisfies the Sobolev embedding $V \hookrightarrow L^{2d/(d-2)}(\Omega)$. In this case, the other embeddings required in (V) follow by complex interpolation. However, already for $\Omega = \mathbb{R}^2$ the optimal choice is $d = 2$, and by fractional Sobolev embeddings we have indeed (V) with $d = 2$ and (4.1) is satisfied when $\theta \in (0,1]$, even though we do not have (4.2). In ambient dimension $n = 1$ and when $\Omega$ is an interval, (V) holds with embedding dimension $d = 1$ no matter what the boundary conditions are, and (4.1) is satisfied in the limited range $\theta \in (\frac12, 1]$ due to the constraint $2 \le q < \infty$.
Remark 4.2. Testing (4.1) with cut-off functions $\psi$ for arbitrarily small balls contained in $\Omega$ reveals that $d$ cannot be smaller than the ambient dimension $n$. In principle, $d$ can be larger than $n$. When $V = W^{1,2}_0(\Omega)$ or when $\Omega$ is sufficiently regular, the value $d = n$ is obtained. For a discussion of irregular sets that satisfy (V), we refer to the introduction of [12] or [1, Ch. 4] for the case $d > n$ and to [18, Sec. 3] for mixed Dirichlet-Neumann boundary conditions.
4.2. Variational space. The variational space is now $\mathcal V := L^2_t V \cap H^{1/2}_t L^2_x$, equipped with the Hilbertian norm $\|u\|_{\mathcal V}$ given by where in this section we use the notation $L^p_x := L^p(\Omega)$. Let $-\Delta_V$ be the positive self-adjoint operator built from the sesquilinear form $(\psi,\tilde\psi) \mapsto \langle\nabla\psi, \nabla\tilde\psi\rangle$ on $V \times V$. We let $S = (1 - \Delta_V)^{1/2}$, so that by Kato's second representation theorem [25] the domain of $S$ is equal to $V$ with $\|S\psi\|_{L^2_x} = \|\psi\|_V$ for all $\psi \in V$. It is also known that the domains of the powers $S^\alpha$, $\alpha \in \mathbb{R}$, interpolate by the complex method [8].
4.3. Embeddings. We begin by developing the theory along the lines of Section 2. As the reader may have already observed, we have used the full strength of distribution theory only in the $t$-variable, whereas in the $x$-variable distributions and test functions have mostly appeared for the sake of simple arguments, but they could have been replaced by spectral theory for the Laplacian and functions in less regular spaces such as $\dot\Delta^{r,q}$. This is our general guideline.
Our first task is to identify the pairs $(r,q)$ for which we have the embedding
(4.3) $\mathcal V \hookrightarrow \Delta^{r,q} := L^2_t V \cap L^r_t L^q_x$,
where $\Delta^{r,q}$ is equipped with the norm $\|u\|_{\Delta^{r,q}} := \|u\|_{L^2_t V} + \|u\|_{L^r_t L^q_x}$. We set $\Sigma^{r',q'} = L^2_t V' + L^{r'}_t L^{q'}_x$ with the usual infimum norm.
Lemma 4.3. Under Assumption (V), the embedding (4.3), and by duality $\Sigma^{r',q'} \hookrightarrow \mathcal V'$, hold if $\frac1r + \frac{d}{2q} = \frac d4$ with $2 \le r, q < \infty$.
Proof. We modify the proof of Lemma 2.3. In order to prove (2.1) we have previously used the Fourier transform on $L^2(\mathbb{R}^n)$ to obtain unitary equivalence of $-\Delta$ to a multiplication operator $m(\xi) = |\xi|^2$. Here, we use the spectral theorem for $(1 - \Delta_V)$ and the same argument applies. As for (2.2), the required Sobolev inequality in the spatial variable is precisely our Assumption (V) and now $d$ instead of $n$ plays the role of the dimension. Hence, (4.3) holds under the given conditions on $(r,q)$.
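A few worked instances of the relation in Lemma 4.3 may be useful. The sample value $d = 3$ is chosen here only for illustration. The last line checks that the choice $\theta = 1 - \frac2r$ turns the relation of Lemma 4.3 into the exponent relation of Assumption (V).

```latex
% Worked instances of 1/r + d/(2q) = d/4; the value d = 3 is an illustrative assumption.
\begin{align*}
  r = q:\quad
    & \frac1r\Bigl(1+\frac d2\Bigr) = \frac d4
      \;\Longrightarrow\; r = q = 2 + \frac4d
      \qquad (d=3:\ r=q=\tfrac{10}{3}),\\
  r = 2:\quad
    & \frac{d}{2q} = \frac{d-2}{4}
      \;\Longrightarrow\; q = \frac{2d}{d-2}
      \qquad (\text{requires } d>2;\ d=3:\ q=6),\\
  \theta = 1-\frac2r:\quad
    & \frac12 - \frac{1-\theta}{d} = \frac12 - \frac{2}{rd}
      = \frac2d\Bigl(\frac d4 - \frac1r\Bigr)
      = \frac2d\cdot\frac{d}{2q} = \frac1q .
\end{align*}
```

The endpoint $r = 2$ corresponds to $\theta = 0$, that is, to the Sobolev embedding of $V$ itself, which explains the restriction $d > 2$ in that case.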

4.4. Variational approach. Pairs that satisfy the relation in Lemma 4.3 will be called admissible pairs (for the boundary value problems under Assumption (V)). Once again, admissible pairs $(r,q)$ are conjugates of pairs $(\tilde r,\tilde q)$, called compatible pairs for lower order coefficients, which are defined by $\frac{1}{\tilde r} + \frac{d}{2\tilde q} = 1$. The conjugation rule is $(r,q) = (2\tilde r', 2\tilde q')$ as in (2.7). Fixing once and for all a compatible pair $(\tilde r_1,\tilde q_1)$ for lower order coefficients, we define the parabolic operator $H$ on $\mathcal V$ by the sesquilinear form where, as before, $\beta$ includes the lower order terms, $\langle\cdot\,,\cdot\rangle$ is now the inner product on $L^2_x = L^2(\Omega)$ and $\langle\!\langle\cdot\,,\cdot\rangle\!\rangle$ the sesquilinear duality extending the $L^2_t L^2_x$ inner product. As $\mathcal D(\mathbb{R}; V)$ is a dense subspace of $\mathcal V$, we have for all $u \in \mathcal V$ and $v \in \mathcal D(\mathbb{R}; V)$, where $L$ is defined by the integral above. Hölder's inequality, which is dimensionless in terms of exponents, yields $\cdots\,\|\nabla v\|_{L^2_t L^2_x} + P_{\tilde r_1,\tilde q_1}\|u\|_{\Delta^{r_1,q_1}}\|v\|_{\Delta^{r_1,q_1}}$ with $P_{\tilde r_1,\tilde q_1}$ as in (2.6), so that Hence, using $H$ gives access to weak solutions in $L^2_t V$ of $\partial_t u + Lu = w$ with lateral boundary conditions prescribed by $V$.
Lemma 4.4. Let $u \in \mathcal D'(\mathbb{R}; V')$ with $u \in L^2_t V$ and $\partial_t u \in \Sigma^{r',q'}$ for $(r,q)$ an admissible pair or $(r,q) = (\infty,2)$ under Assumption (V). Then $u \in C_0(L^2_x)$ and for some constant $C < \infty$ independent of $u$, Moreover, if $(r,q)$ is admissible, then $u \in \mathcal V$ with the same estimate on $\|u\|_{\mathcal V}$.
Proof. We indicate the main changes.
Modification of the uniqueness Lemma 2.14. This is now stated for $u \in \mathcal D'(\mathbb{R}; V')$ such that $\partial_t u + (1 - \Delta_V)u = 0$ in $\mathcal D'(\mathbb{R}; V')$: if $u \in L^2_t V$, then $u = 0$. Indeed, we see that $\partial_t u \in L^2_t V'$. By Lions' embedding theorem we have $u \in C_0(L^2_x)$, and testing the equation against $u$ yields that $\langle\!\langle(1 - \Delta_V)u, u\rangle\!\rangle = 0$, which implies $u = 0$.
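The testing argument can be spelled out as follows. This is a sketch that assumes the standard Lions-type lemma: for $u \in L^2_t V$ with $\partial_t u \in L^2_t V'$, the map $t \mapsto \|u(t)\|_{L^2_x}^2$ is absolutely continuous with derivative $2\,\mathrm{Re}\,\langle \partial_t u(t), u(t)\rangle_{V',V}$. The pairing notation $\langle\cdot\,,\cdot\rangle_{V',V}$ is introduced here for the example.

```latex
% Sketch of the testing argument, using u \in C_0(L^2_x) and \|S\psi\|_{L^2_x} = \|\psi\|_V.
\begin{align*}
  0 &= \mathrm{Re}\int_{\mathbb{R}}
        \bigl\langle \partial_t u(t) + (1-\Delta_V)u(t),\, u(t)\bigr\rangle_{V',V}\,dt\\
    &= \frac12 \lim_{T\to\infty}
        \Bigl(\|u(T)\|_{L^2_x}^2 - \|u(-T)\|_{L^2_x}^2\Bigr)
       + \int_{\mathbb{R}} \|S u(t)\|_{L^2_x}^2\,dt\\
    &= 0 + \|u\|_{L^2_t V}^2 .
\end{align*}
```

The boundary terms vanish because $u \in C_0(L^2_x)$ tends to zero at $\pm\infty$, and the last identity uses $\|S\psi\|_{L^2_x} = \|\psi\|_V$.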
Modification of the embedding in Lemma 2.16. Here, we have to show that with $\theta = 1 - \frac2r$ we have the continuous inclusion $\cdots$. In the definition of this space, $S^{1-\theta}$ is extended by duality to a map from $L^2$ into the dual of $V$ with respect to the $L^2_x$ duality. This uses that $V$ is the domain of $S$ and that $0 \le 1 - \theta \le 1$. Hence, we are working with a subspace of $\mathcal D'(\mathbb{R}; V')$. The embedding itself is a repetition of the proof of Lemma 2.16 except that now we take $G = \mathcal S(\mathbb{R}; V)$ as dense subset. This is where we use Assumption (V).
Modification of the stronger regularity statement in Lemma 2.17. We need a new dense subspace $G_0$, which we can take as $G_0 := \mathcal S_{00}(\mathbb{R}; \operatorname{dom}(\Delta_V^2))$ here.
Step 1 then goes through mutatis mutandis if we understand the Fourier transform in the $x$-variable as a special case of the spectral theorem for the Laplacian, compare with the proof of Lemma 4.3.
Step 2 remains unchanged. For Step 3, we obtain $v'(t) + (1 - \Delta_V)v(t) = w(t)$ in $L^2_x$ for all $t \in \mathbb{R}$, whenever $g \in G_0$. The equation can also be interpreted in $\mathcal D'(\mathbb{R}; V')$ for the test functions $\phi \in \mathcal D(\mathbb{R}; V)$: this interpretation passes to the limit for $g \in L^2_t L^2_x$, thanks to Step 1. Lastly, Step 4, again for $g \in G_0$, was a Fourier transform argument, and its use in the $x$-variable should now be replaced by the spectral theorem. From this perspective, the proof is the same as before.
End of proof. Modifications of Proposition 2.18 and of the corollaries that follow are proved similarly with constant c = 0.
4.6. The resulting theory. From this point on, the theory can be developed similarly to the inhomogeneous setting of Section 2.10, assuming (V). The two exceptional topics are pointwise Gaussian upper bounds (Section 2.13) and BMO-coefficients in the principal part (Section 2.16), the extension of which will require finer geometrical properties of the underlying domain $\Omega$ and should be considered open at this point.
The rest works out smoothly, as long as we assume uniform ellipticity in the sense of Gårding: There should exist $\lambda > 0$ and $c_0 \in \mathbb{R}$ such that for almost every $t$ and every $w \in V$ we have
(4.4) $\operatorname{Re}\langle A(t)\nabla w, \nabla w\rangle \ge \lambda \|\nabla w\|_{L^2_x}^2 - c_0 \|w\|_{L^2_x}^2$.
Then, we can work with lower order coefficients in Lebesgue-Lebesgue mixed spaces with the assumption $(D_\varepsilon)$. This gives access to representation by Green operators for the inverse of $H + \kappa$ for appropriate $\kappa \ge 0$, causality and fundamental solution operators for the Cauchy problem. The proof of $L^2$ off-diagonal estimates can be adapted if $V$ is invariant under multiplication with bounded Lipschitz functions. For example, the variational spaces for mixed Dirichlet-Neumann boundary conditions have this property [18, Lem. 4].
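The role of the shift $\kappa$ can be illustrated on the elliptic part alone. The following is only a sketch under the assumption $\kappa > c_0$; it ignores the lower order coefficients and the time derivative, which the actual invertibility proof for $H + \kappa$ must also handle.

```latex
% Illustration for the principal part only, assuming \kappa > c_0 in (4.4).
\begin{align*}
  \mathrm{Re}\,\langle A(t)\nabla w, \nabla w\rangle + \kappa \|w\|_{L^2_x}^2
    &\ge \lambda \|\nabla w\|_{L^2_x}^2 + (\kappa - c_0)\|w\|_{L^2_x}^2\\
    &\ge \min\{\lambda,\ \kappa - c_0\}\,
         \bigl(\|\nabla w\|_{L^2_x}^2 + \|w\|_{L^2_x}^2\bigr).
\end{align*}
```

Since $\|\nabla w\|_{L^2_x}^2 + \|w\|_{L^2_x}^2 = \|Sw\|_{L^2_x}^2 = \|w\|_V^2$, the shifted form is coercive on $V$ as soon as $\kappa$ exceeds $c_0$.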
If (4.4) comes with $c_0 = 0$ and the leading part of $H$ is a pure second order operator, then one can also develop the theory in the class $L^2_t V$ similarly to Section 2.14. Finally, for the extension of the definition of $H$ when coefficients belong to mixed Lorentz spaces, we can use the following self-improvement property to treat all compatible pairs $(\tilde r_1,\tilde q_1)$ with $\tilde r_1 < \infty$.
Lemma 4.5. If Assumption (V) holds, then $L^q(\Omega)$ can be replaced with $L^{q,2}(\Omega)$ in (4.1) when $\theta > 0$.
With $\sigma := \frac{1-\theta}{1-\vartheta} \in (0,1)$ we obtain the required continuous inclusion $[L^2_x, V]_{1-\theta} = (L^2_x, V)_{1-\theta,2} = (L^2_x, (L^2_x, V)_{1-\vartheta,2})_{\sigma,2} \hookrightarrow (L^2_x, L^r_x)_{\sigma,2} = L^{q,2}_x$. The second equality is the reiteration theorem [35, Sec. 1.10.2] and the final equality follows from the real interpolation property for Lebesgue spaces [35, Sec. 1.18.6] and the relation $\frac{1-\sigma}{2} + \frac{\sigma}{r} = \frac1q$.
With the previous lemma at hand, the extension of the definition of $H$ with coefficients in mixed Lorentz spaces can be carried out as before for compatible pairs for the coefficients with $\tilde r_1 < \infty$. The case $r_1 = 2$ for the conjugate admissible pair $(r_1, q_1)$ is not covered by this statement. Invertibility can be shown under Lorentz-Lorentz mixed norms for the lower order coefficients and causality follows under Lebesgue-Lorentz mixed spaces.
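The exponent relation used at the end of the proof of Lemma 4.5 can be verified directly. The check below assumes that $\vartheta < \theta$ and $r$ are linked as in Assumption (V), namely $\frac12 - \frac{1-\vartheta}{d} = \frac1r$, and that $q$ is the exponent attached to $\theta$ by the same rule.

```latex
% Check of (1-\sigma)/2 + \sigma/r = 1/q with \sigma = (1-\theta)/(1-\vartheta),
% assuming 1/2 - (1-\vartheta)/d = 1/r and 1/2 - (1-\theta)/d = 1/q.
\begin{align*}
  \frac{1-\sigma}{2} + \frac{\sigma}{r}
    = \frac12 - \sigma\Bigl(\frac12 - \frac1r\Bigr)
    = \frac12 - \frac{1-\theta}{1-\vartheta}\cdot\frac{1-\vartheta}{d}
    = \frac12 - \frac{1-\theta}{d}
    = \frac1q .
\end{align*}
```

Note that $\sigma \in (0,1)$ is equivalent to $\vartheta < \theta < 1$, so the argument needs room below $\theta$, in line with the restriction $\theta > 0$ in the lemma.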
In order to include Lorentz spaces for compatible pairs with $\tilde r_1 = \infty$, which is probably the most interesting case in applications, the improvement in Lemma 4.5 for $\theta = 0$ is needed, that is, $d \ge 3$ and the embedding $V \hookrightarrow L^{2d/(d-2),2}_x$ holds. One simple way to guarantee this embedding for $d = n \ge 3$ is to assume that there is a bounded Sobolev extension operator $V \to W^{1,2}(\mathbb{R}^n)$, since then one can use O'Neil's Sobolev embedding on $\mathbb{R}^n$ and restrict back to $\Omega$. Hence, this always works for pure lateral Dirichlet conditions ($V = W^{1,2}_0(\Omega)$), using the extension by zero. For the existence of an extension operator in the case of mixed lateral boundary conditions, the most general geometric assumptions as far as we are aware can be found in [11].