1 Introduction

The question of local well-posedness for nonlinear wave equations with rough initial data is fundamental in the study of nonlinear waves, and has received considerable attention over the years. The result of Smith and Tataru [38], proved almost 20 years ago, provides the sharp regularity threshold for generic nonlinear wave equations, in view of Lindblad's counterexample [30]. On the other hand, it has also been conjectured [45] that for nonlinear wave equations satisfying a suitable nonlinear null condition, the result of [38] can be improved and the well-posedness threshold lowered. In this paper we provide the first result that proves the validity of this conjecture, for a representative equation in this class, namely the hyperbolic minimal surface equation. Further, our improvement turns out to be substantial: precisely, we gain \(3/8\) derivatives in two space dimensions and \(1/4\) derivatives in higher dimensions. At this regularity level, the Lorentzian metric \(g\) in our problem is no better than \(C_{x,t}^{\frac{1}{4}+} \cap L^{2}_{t} C_{x}^{\frac{1}{2}+}\) (\(C_{x,t}^{ \frac{3}{8}+} \cap L^{4}_{t} C_{x}^{\frac{1}{2}+}\) in \(2D\)), far below anything studied before.

Most of the ideas introduced in this paper will likely extend to other nonlinear wave models, and open the way toward further progress in the study of low regularity solutions.

1.1 The minimal surface equation in Minkowski space

Let \(n \geq 2\), and \({\mathfrak {M}}^{n+2}\) be the \(n+2\) dimensional Minkowski space-time. A codimension one time-like submanifold \(\Sigma \subset {\mathfrak {M}}^{n+2}\) is called a minimal surface if it is locally a critical point for the area functional

$$ \mathcal {L} = \int _{\Sigma} \, dA, $$

where the area element is measured relative to the Minkowski metric. A standard way to think of this equation is by representing \(\Sigma \) as a graph over \({\mathfrak {M}}^{n+1}\),

$$ \Sigma = \{ x_{n+1} = u (t,x_{1},\ldots ,x_{n})\}, $$

where \(u\) is a real valued function

$$ u: D \subset {\mathfrak {M}}^{n+1} \rightarrow {\mathbb{R}}, $$

which satisfies the constraint

$$ u_{t}^{2} < 1+ |\nabla _{x} u|^{2}, $$
(1.1)

expressing the condition that its graph is a time-like surface in \({\mathfrak {M}}^{n+2}\).

Then the surface area functional takes the form

$$ \mathcal {L}(u) = \int \sqrt{1-u_{t}^{2} +|\nabla _{x} u|^{2}}\ dx \, dt. $$
(1.2)

Interpreting this as a Lagrangian, the minimal surface equation can be thought of as the associated Euler-Lagrange equation, which takes the form

$$ - \frac{\partial }{\partial t} \left ( \frac{u_{t}}{\sqrt{1 - u_{t}^{2} + |\nabla _{x} u|^{2}}}\right ) + \sum _{i = 1}^{n} \frac{\partial }{\partial x_{i}} \left ( \frac{u_{x_{i}}}{\sqrt{1 - u_{t}^{2} + |\nabla _{x} u|^{2}}}\right ) = 0. $$
(1.3)

Under the condition (1.1), the above equation is a quasilinear wave equation.
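For the reader's convenience, here is a sketch of the standard variational computation behind (1.3). Writing \(F(p) = \sqrt{1+m^{\alpha \beta}p_{\alpha}p_{\beta}}\) for \(p = \partial u\), so that \(1+m^{\alpha \beta}\partial _{\alpha}u\,\partial _{\beta}u = 1 - u_{t}^{2} + |\nabla _{x} u|^{2}\), the first variation of (1.2) is

```latex
\frac{d}{d\varepsilon}\Big|_{\varepsilon=0} \mathcal{L}(u+\varepsilon v)
  = \int \frac{m^{\alpha\beta}\,\partial_\alpha u \,\partial_\beta v}
             {\sqrt{1 - u_t^2 + |\nabla_x u|^2}} \, dx\, dt
  = - \int \partial_\alpha\!\left(
        \frac{m^{\alpha\beta}\,\partial_\beta u}
             {\sqrt{1 - u_t^2 + |\nabla_x u|^2}} \right) v \, dx\, dt ,
```

and requiring this to vanish for all compactly supported \(v\), then separating the \(\alpha = 0\) term (where \(m^{00} = -1\)) from the spatial ones, gives exactly (1.3).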

The left hand side of the last equation can also be interpreted as the mean curvature of the hypersurface \(\Sigma \); as such, the minimal surface equation is alternatively described as the zero mean curvature flow.

In addition to the above geometric interpretation, the minimal surface equation for time-like surfaces in the Minkowski space is also known as the Born-Infeld model in nonlinear electromagnetism [50], as well as a model for evolution of branes in string theory [15].

On the mathematical side, the question of global existence for small, smooth and localized initial data was considered in the work of Lindblad [31], Brendle [8], Stefanov [40] and Wong [48]. The stability of a nonflat steady solution, the catenoid, was studied in [10, 29]. Some blow-up scenarios due to failure of immersivity were investigated by Wong [49]. Minimal surfaces have also been studied as singular limits of certain semilinear wave equations by Jerrard [22]. The local well-posedness question fits into the analogous theory for the broader class of quasilinear wave equations, but there is also one result specific to minimal surfaces, due to Ettinger [11]; this is discussed later in the paper.

In our study of the minimal surface equation, the above way of representing it is less useful; instead, it is better to think of it in geometric terms. In particular, the fact that the Lagrangian (1.2) and the equation (1.3) are formulated relative to a background Minkowski metric is not at all essential; one may instead use any flat Lorentzian metric. This is no surprise, since any two such metrics are equivalent via a linear transformation. Perhaps less obvious is the fact that the equations may actually be written in an identical fashion, independent of the background metric; see Remark 3.1 in Section 3.

For full details on the structure of the equation we refer the reader to Section 3 of the paper, but here we review the most important facts.

The main geometric object is the metric \(g\), namely the restriction (trace) of the Minkowski metric in \({\mathfrak {M}}^{n+2}\) to \(\Sigma \), which, expressed in the \((t=x_{0},x_{1},\ldots , x_{n})\) coordinates, has the form

$$ g_{\alpha \beta} := m_{\alpha \beta} + \partial _{\alpha} u \partial _{ \beta} u, $$
(1.4)

where \(m_{\alpha \beta}\) denotes the Minkowski metric with signature \((-1, 1, \ldots, 1)\) in \({\mathfrak {M}}^{n+1}\). Since \(\Sigma \) is time-like, this is also a Lorentzian metric. This has determinant

$$ g: = |\det (g_{\alpha \beta})| =1+ m^{\alpha \beta} \partial _{ \alpha }u \, \partial _{\beta }u , $$
(1.5)

and the dual metric is

$$ g^{\alpha \beta} := m^{\alpha \beta} - \frac{m^{\alpha \gamma} m^{\beta \delta} \partial _{\gamma} u \, \partial _{\delta }u}{1 + m^{\mu \nu} \partial _{\mu }u \, \partial _{\nu }u}. $$
(1.6)
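One may check directly that (1.6) is indeed the inverse of (1.4). Writing \(p_{\alpha} = \partial _{\alpha} u\) and \(g = 1 + m^{\mu \nu}p_{\mu}p_{\nu}\) as in (1.5), this is an instance of the Sherman–Morrison formula for a rank-one perturbation:

```latex
g^{\alpha\gamma} g_{\gamma\beta}
  = \Big( m^{\alpha\gamma} - \tfrac{1}{g}\, m^{\alpha\mu}p_\mu\, m^{\gamma\nu}p_\nu \Big)
    \big( m_{\gamma\beta} + p_\gamma p_\beta \big)
  = \delta^\alpha_\beta
    + m^{\alpha\mu}p_\mu\, p_\beta
      \Big( 1 - \tfrac{1}{g} - \tfrac{1}{g}\, m^{\gamma\nu}p_\gamma p_\nu \Big)
  = \delta^\alpha_\beta ,
```

since the last bracket vanishes by the definition of \(g\).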

Here, and later in the paper, we carefully avoid raising indices with respect to the Minkowski metric. Instead, all raised indices in this paper will be with respect to the metric \(g\).

Relative to this metric, the equation (1.3) can be expressed in the form

$$ \Box _{g} u = 0, $$
(1.7)

where \(\Box _{g}\) is the covariant d’Alembertian, and which in this problem will be shown to have the simple expression

$$ \Box _{g} = g^{\alpha \beta} \partial _{\alpha }\partial _{\beta}. $$
(1.8)

An important role will also be played by the associated linearized equation, which, as it turns out, may be easily expressed in divergence form as

$$ \partial _{\alpha }{\hat{g}}^{\alpha \beta} \partial _{\beta }v = 0, \qquad {\hat{g}}^{\alpha \beta}:= g^{-\frac{1}{2}} g^{\alpha \beta}. $$
(1.9)
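The coefficient \({\hat{g}}^{\alpha \beta}\) in (1.9) can be read off from the second variation of the Lagrangian. With \(F(p)=\sqrt{1+m^{\mu \nu}p_{\mu}p_{\nu}}=g^{1/2}\) as above, linearizing the Euler–Lagrange equation \(\partial _{\alpha}F_{p_{\alpha}}(\partial u)=0\) in the direction \(v\) yields \(\partial _{\alpha}\big(F_{p_{\alpha}p_{\beta}}(\partial u)\,\partial _{\beta}v\big)=0\), where

```latex
F_{p_\alpha p_\beta}
  = \frac{m^{\alpha\beta}}{g^{1/2}}
    - \frac{m^{\alpha\mu}p_\mu\, m^{\beta\nu}p_\nu}{g^{3/2}}
  = g^{-\frac{1}{2}} \Big( m^{\alpha\beta}
      - \frac{m^{\alpha\mu}p_\mu\, m^{\beta\nu}p_\nu}{g} \Big)
  = g^{-\frac{1}{2}}\, g^{\alpha\beta} = {\hat g}^{\alpha\beta} ,
```

matching the dual metric (1.6) rescaled by \(g^{-1/2}\).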

Our objective in this paper will be to study the local well-posedness of the associated Cauchy problem with initial data at \(t = 0\),

$$ \left \{ \begin{aligned} & \Box _{g} u = 0, \\ & u(t=0) = u_{0}, \\ & u_{t}(t=0) = u_{1}, \end{aligned} \right . $$
(1.10)

where the initial data \((u_{0},u_{1})\) is taken in classical Sobolev spaces,

$$ u[0]:= (u_{0},u_{1}) \in \mathcal {H}^{s} := H^{s} \times H^{s-1}, $$
(1.11)

and is subject to the constraint

$$ u_{1}^{2} - |\nabla _{x} u_{0}|^{2} < 1. $$
(1.12)

Here we use the following notation for the Cauchy data in (1.3) at time \(t\),

$$ u[t]:= (u(t, \cdot ), u_{t}(t, \cdot )). $$

We aim to investigate the range of exponents \(s\) for which local well-posedness holds, and significantly improve the lower bound for this range.

1.2 Nonlinear wave equations

The hyperbolic minimal surface equation (1.3) can be seen as a special case of more general quasilinear wave equations, which have the form

$$ g^{\alpha \beta}(\partial u) \partial _{\alpha }\partial _{\beta }u = N(u, \partial u) , $$
(1.13)

where, again, \(g^{\alpha \beta}\) is assumed to be Lorentzian, but without any further structural properties. The simplest case is when \(u\) is a scalar, real valued function. But one may equally allow \(u\) to be a vector-valued function, in which case we think of the left hand side of the equation as being in diagonal form, with the coupling occurring only via \(g\) and \(N\). This generic equation will serve as a reference.

As a starting point, we note that the equation (1.3) (and also (1.13) if \(N=0\)) admits the scaling law

$$ u(t,x) \to \lambda ^{-1} u(\lambda t,\lambda x). $$

This allows us to identify the critical Sobolev exponent as

$$ s_{c}=\frac{n+2}{2}. $$

Heuristically, \(s_{c}\) serves as a universal threshold for local well-posedness, i.e. one must have \(s > s_{c}\). Taking a naive view, one might think of trying to reach the scaling exponent \(s_{c}\). However, this is a quasilinear wave equation, and reaching \(s_{c}\) has so far proved impossible in any problem of this type.
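The value of \(s_{c}\) is dictated by a one-line computation with homogeneous Sobolev norms: for the rescaled solution \(u_{\lambda}(t,x) = \lambda ^{-1} u(\lambda t,\lambda x)\) one has

```latex
\| u_\lambda(0) \|_{\dot H^{s}} + \| \partial_t u_\lambda(0) \|_{\dot H^{s-1}}
  = \lambda^{\, s - \frac{n+2}{2}}
    \left( \| u(0) \|_{\dot H^{s}} + \| \partial_t u(0) \|_{\dot H^{s-1}} \right),
```

so the data norm is scale invariant precisely when \(s = \frac{n+2}{2}\).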

As a good threshold from above, one might start with the classical well-posedness result, due to Hughes, Kato, and Marsden [18], which asserts that local well-posedness holds for \(s > s_{c}+1\). This applies to all equations of the form (1.13), and can be proved solely by using energy estimates. These have the form

$$ \| u[t]\|_{\mathcal {H}^{s}} \lesssim e^{\int _{0}^{t} \| \partial ^{2} u(s) \|_{L^{\infty}} ds} \| u[0] \|_{\mathcal {H}^{s}}. $$
(1.14)

They may also be restated in terms of quasilinear energy functionals \(E^{s}\) that have the following two properties:

  1. (a)

    Coercivity,

    $$ E^{s}(u[t]) \approx \| u[t]\|_{\mathcal {H}^{s}}^{2}. $$
  2. (b)

    Energy growth,

    $$ \frac{d}{dt} E^{s}(u) \lesssim \| \partial ^{2} u\|_{L^{\infty}} \cdot E^{s}(u). $$
    (1.15)

To close the energy estimates, it then suffices to use Sobolev embeddings, which allow one to bound the above \(L^{\infty}\) norm (which we will refer to as a control parameter) in terms of the \(\mathcal {H}^{s}\) Sobolev norm, provided that \(s > \frac{n}{2}+2\), one derivative above scaling.
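The numerology here is immediate: by Sobolev embedding,

```latex
\| \partial^2 u(t) \|_{L^\infty} \lesssim \| u[t] \|_{\mathcal{H}^{s}}
  \qquad \text{for } s > \frac{n}{2} + 2 = s_c + 1,
```

where the time derivatives in \(\partial ^{2} u\) are recovered from the equation itself; combining this with (1.14) and Gronwall's inequality closes the estimate on a time interval depending only on the size of the data.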

The reason a derivative is lost in the above analysis is that one would only need to bound \(\|\partial ^{2} u\|_{L^{1} L^{\infty}}\), whereas the norm that is actually controlled is \(\|\partial ^{2} u\|_{L^{\infty }L^{\infty}}\); this exactly accounts for the one derivative difference in scaling. It also suggests that the natural way to improve the classical result is to control the \(L^{p} L^{\infty}\) norm directly. This is indeed possible in the context of the Strichartz estimates, which in dimension three and higher give the bound

$$ \|\partial ^{2} u\|_{L^{2} L^{\infty}} \lesssim \| u[0]\|_{\mathcal {H}^{{ \frac{n}{2}+\frac{3}{2}}}}, $$

with an additional \(\epsilon \) derivative loss in three space dimensions. When true, such a bound yields well-posedness for \(s > \frac{n+3}{2}\), which is \(1/2\) derivatives above scaling. The numerology changes slightly in two space dimensions, where the best possible Strichartz estimate has the form

$$ \|\partial ^{2} u\|_{L^{4} L^{\infty}} \lesssim \|u[0]\|_{\mathcal {H}^{ \frac{n}{2}+\frac{7}{4}}}, \text{ where } n= 2, $$

which is \(3/4\) derivatives above scaling.

The difficulty in using Strichartz estimates is that, while these are well known in the constant coefficient case [12, 24] and even for smooth variable coefficients [23, 33], matters are not as simple in the case of rough coefficients. Indeed, as it turned out, the full Strichartz estimates hold for \(C^{2}\) metrics, see [35] (\(n = 2,3\)), [44] (all \(n\)), but not, in general, for \(C^{\sigma}\) metrics when \(\sigma <2\), see the counterexamples of [36, 37]. This difficulty was resolved in two stages:

  1. (i)

    Semiclassical time scales and Strichartz estimates with loss of derivatives. The idea here, which applies even for \(C^{\sigma}\) metrics with \(\sigma <2\), is that, associated to each dyadic frequency scale \(2^{k}\), there is a corresponding “semiclassical” time scale \(T_{k} = 2^{-\alpha k}\), with \(\alpha \) dependent on \(\sigma \), so that full Strichartz estimates hold at frequency \(2^{k}\) on the scale \(T_{k}\). Strichartz estimates with loss of derivatives are then obtained by summing up the short time estimates with respect to the time intervals, separately at each frequency. This idea was independently introduced in [5] and [43], and further refined in [4] and [46].

  2. (ii)

    Wave packet coherence and parametrices. The observation here is that in the study of nonlinear wave equations such as (1.13), in addition to Sobolev-type regularity for the metric, we have an additional piece of information, namely that the metric itself can be seen as a solution to a nonlinear wave equation. This idea was first introduced and partially exploited in [26], but was brought to full fruition in [38], where it was shown that almost loss-less Strichartz estimates hold for the solutions to (1.13) at exactly the correct regularity level.
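To see where the loss of derivatives in step (i) comes from, here is a back-of-the-envelope count, under the schematic assumption that at frequency \(2^{k}\) a lossless frequency-localized Strichartz bound with exponent \(\sigma \) holds on each window \(I_{j}\) of length \(T_{k} = 2^{-\alpha k}\), and that the energy stays comparable across windows:

```latex
\| u_k \|_{L^q([0,1]; L^\infty)}^q
  = \sum_{j=1}^{2^{\alpha k}} \| u_k \|_{L^q(I_j; L^\infty)}^q
  \lesssim 2^{\alpha k} \, \big( 2^{k\sigma} \| u_k[0] \|_{L^2} \big)^q ,
\qquad \text{hence} \quad
\| u_k \|_{L^q([0,1]; L^\infty)}
  \lesssim 2^{k\left(\sigma + \frac{\alpha}{q}\right)} \| u_k[0] \|_{L^2} ,
```

i.e. summation over the \(2^{\alpha k}\) windows costs exactly \(\alpha /q\) derivatives relative to the short-time estimate.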

The result in [38] represents the starting point of the present work, and is concisely stated as follows:

Theorem 1.1

Smith-Tataru [38]

(1.13) is locally well-posed in \(\mathcal {H}^{s}\) provided that

$$ s> s_{c}+\frac{3}{4}, \qquad n = 2 , $$
(1.16)

respectively

$$ s> s_{c}+\frac{1}{2}, \qquad n = 3,4,5. $$
(1.17)

As part of this result, almost loss-less Strichartz estimates were obtained both directly for the solution \(u\), and more generally for the associated linearized evolution. We will return to these estimates in Section 10 for a more detailed statement and an in-depth discussion.

The optimality of this result, at least in dimension three, follows from work of Lindblad [30], see also the more recent two dimensional result in [34]. However, this counterexample should only apply to “generic” models, and the local well-posedness threshold might possibly be improved in problems with additional structure, i.e. some form of null condition.

Moving forward, we recall that in [45], a null condition was formulated for quasilinear equations of the form (1.13).

Definition 1.2

[45]

The nonlinear wave equation (1.13) satisfies the nonlinear null condition if

$$ \frac{\partial g^{\alpha \beta}(u,p)}{\partial p_{\gamma}} \xi _{ \alpha }\xi _{\beta }\xi _{\gamma }= 0 \qquad \text{whenever} \qquad g^{ \alpha \beta}(u,p) \xi _{\alpha }\xi _{\beta }= 0. $$
(1.18)

In this definition the vector \(p\) is a placeholder for the \(\partial u\) variable in (1.13); for added generality, we also allow for the dependence of \(g\) on the undifferentiated \(u\).

Here we use the terminology “nonlinear null condition” in order to distinguish it from the classical null condition, which is relative to the Minkowski metric, and was heavily used in the study of global well-posedness for problems with small localized data, see [25] as well as the books [17, 39]. In geometric terms, this null condition may be seen as a cancellation condition for the self-interactions of wave packets traveling along null geodesics. In Section 3 we verify that the minimal surface equation indeed satisfies the nonlinear null condition.
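As a quick consistency check (the full verification is carried out in Section 3), one can test (1.18) on the dual metric (1.6), where \(g^{\alpha \beta}\) does not depend on the undifferentiated \(u\). Writing \(g = 1 + m^{\mu \nu} p_{\mu} p_{\nu}\), \(q = m^{\alpha \beta} p_{\alpha} \xi _{\beta}\) and \(r = m^{\alpha \beta} \xi _{\alpha} \xi _{\beta}\), a direct computation gives

```latex
\frac{\partial g^{\alpha\beta}(p)}{\partial p_\gamma}\,
    \xi_\alpha \xi_\beta \xi_\gamma
  = - \frac{2 q r}{g} + \frac{2 q^3}{g^2},
\qquad
g^{\alpha\beta}(p)\, \xi_\alpha \xi_\beta = r - \frac{q^2}{g},
```

so on the null cone, where \(r = q^{2}/g\), the first expression vanishes identically, which is precisely the nonlinear null condition.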

Further, it was conjectured in [45] that, for problems satisfying (1.18), the local well-posedness threshold can be lowered below the one in [38]. This conjecture has remained fully open until now, though one should mention two results in [27] and [11] for the Einstein equation, respectively the minimal surface equation, where the endpoint in Theorem 1.1 is reached but not crossed.

The present work provides the first positive result in this direction, specifically for the minimal surface equation. Indeed, not only are we able to lower the local well-posedness threshold in Theorem 1.1, but in effect we obtain a substantial improvement, namely by \(3/8\) derivatives in two space dimensions and by \(1/4\) derivatives in higher dimension.

1.3 The main result

Our main result, stated in a succinct form, is as follows:

Theorem 1.3

The Cauchy problem for the minimal surface equation (1.10) is locally well-posed for initial data \(u[0]\) in \(\mathcal {H}^{s}\) that satisfy the constraint (1.12), where

$$ s> s_{c}+\frac{3}{8}, \qquad n = 2 , $$
(1.19)

respectively

$$ s> s_{c}+\frac{1}{4}, \qquad n = 3,4,5. $$
(1.20)

Remark 1.4

The constraint \(n \leq 5\) in this result is inherited from Theorem 1.1, or more precisely its full formulation provided in Theorem 10.1, which also includes the Strichartz estimates. This result is used as a black box in the present paper, so a proof of Theorem 10.1 in higher dimensions would directly imply that the above result also holds in higher dimensions.

The result is valid regardless of the \(\mathcal {H}^{s}\) size of the initial data. Here we interpret local well-posedness in a strong Hadamard sense, including:

  • existence of solutions in the class \(u[\, \cdot \,] \in C([0,T];\mathcal {H}^{s})\), with \(T\) depending only on the \(\mathcal {H}^{s}\) size of the initial data.

  • uniqueness of solutions, in the sense that they are the unique limits of smooth solutions.

  • higher regularity, i.e. if in addition the initial data \(u[0] \in \mathcal {H}^{m}\) with \(m > s\), then the solution satisfies \(u[\, \cdot \, ] \in C([0,T];\mathcal {H}^{m})\), with a bound depending only on the \(\mathcal {H}^{m}\) size of the data,

    $$ \| u[\, \cdot \, ]\|_{C([0,T];\mathcal {H}^{m})} \lesssim \| u[0]\|_{ \mathcal {H}^{m}}. $$
  • continuous dependence in \(\mathcal {H}^{s}\), i.e. continuity of the data to solution map

    $$ \mathcal {H}^{s} \ni u[0] \to u [ \, \cdot \, ] \in C([0,T];\mathcal {H}^{s}). $$
  • weak Lipschitz dependence, i.e. for two \(\mathcal {H}^{s}\) solutions \(u\) and \(v\) we have the difference bound

    $$ \| u[\, \cdot \, ]-v[\, \cdot \, ]\|_{C([0,T];\mathcal {H}^{\frac{1}{2}})} \lesssim \| u[0]-v[0]\|_{\mathcal {H}^{\frac{1}{2}}} $$

    where the exponent \(\frac{1}{2}\) is replaced by \(\frac{5}{8}\) in two space dimensions.

We remark on the weak Lipschitz dependence, which in more classical results is proved for a much larger range of Sobolev exponents. Here the need for balanced estimates, together with a loss of symmetry in the linearized equation, has the effect of limiting this range to a small neighbourhood of the exponent \(\frac{1}{2}\). For the present results a single exponent suffices.

In addition to the above components of the local well-posedness result, a key intermediate role in the proof of the above theorem is played by the Strichartz estimates, not only for the solution \(u\), but also, more importantly, for the linearized problem

$$ \left \{ \begin{aligned} & \partial _{\alpha }{\hat{g}}^{\alpha \beta} \partial _{\beta }v = 0, \\ & v[0] = (v_{0},v_{1}), \end{aligned} \right . $$
(1.21)

as well as its paradifferential counterpart

$$ \left \{ \begin{aligned} & \partial _{\alpha }T_{{\hat{g}}^{\alpha \beta}} \partial _{\beta }v = 0, \\ & v[0] = (v_{0},v_{1}). \end{aligned} \right . $$
(1.22)

Here the paraproducts are defined using the Weyl quantization, see Section 2.2 for more details. For later reference, we state the Strichartz estimates in a separate theorem:

Theorem 1.5

The following properties hold for every solution \(u\) to the minimal surface equation as in Theorem 1.3, in the corresponding time interval \([0,T]\):

a) There exists some \(\delta _{0} > 0\), depending on \(s\) in (1.19), (1.20) so that the solution \(u\) satisfies the Strichartz estimates

$$ \begin{aligned} &\| \langle D_{x} \rangle ^{\frac{1}{2}+\delta _{0}} \partial u \|_{L^{4} L^{\infty}} \lesssim 1, \qquad n = 2, \\ &\| \langle D_{x} \rangle ^{\frac{1}{2}+\delta _{0}} \partial u \|_{L^{2} L^{\infty}} \lesssim 1, \qquad n = 3, 4, 5. \end{aligned} $$
(1.23)

b) Both the linearized equation (1.21) and its paradifferential version (1.22) are well-posed in \(\mathcal {H}^{\frac{5}{8}}\) for \(n=2\), respectively \(\mathcal {H}^{\frac{1}{2}}\) for \(n = 3, 4, 5\), and the following Strichartz estimates hold for each \(\delta > 0\):

$$ \begin{aligned} \|v[\, \cdot \, ]\|_{L^{\infty }\mathcal {H}^{\frac{5}{8}}}+ \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{4}-\delta} \partial v \|_{L^{4} L^{\infty}} \lesssim & \ \| v[0]\|_{\mathcal {H}^{\frac{5}{8}}} \qquad n =2, \end{aligned} $$
(1.24)

respectively

$$ \begin{aligned} \|v[\, \cdot \,]\|_{L^{\infty }\mathcal {H}^{\frac{1}{2}}} + \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{4}-\delta} \partial v \|_{L^{2} L^{\infty}} \lesssim & \ \| v[0]\|_{\mathcal {H}^{ \frac{1}{2}}}, \qquad n = 3, 4, 5. \end{aligned} $$
(1.25)

We note that the Strichartz estimates in both parts (a) and (b) have derivative losses, namely \(1/8\) derivatives in the \(L^{4} L^{\infty}\) bound in two dimensions, respectively \(1/4\) derivatives in higher dimensions. These estimates only represent the tip of the iceberg: one may also consider the inhomogeneous problem, allow source terms in dual Strichartz spaces, and so on. These and other variations that play a role in this paper are discussed in Section 4.

To understand the new ideas in the proof of our main theorem, we recall the two key elements of the proof of the result in [38], namely (i) the classical energy estimates (1.15) and (ii) the nearly lossless Strichartz estimates; at the time, the chief difficulty was to prove the Strichartz estimates.

In this paper we completely turn the tables, taking part (ii) above for granted, and instead working to improve the energy estimates. Let us begin with a simple observation, which is that the minimal surface equation (1.7) has a cubic nonlinearity, which allows one to replace (1.15) with

$$ \frac{d}{dt} E^{s}(u) \lesssim \| \partial u\|_{L^{\infty}} \| \partial ^{2} u\|_{L^{\infty}} \cdot E^{s}(u). $$
(1.26)

This is what one calls a cubic energy estimate, which is useful in the study of long time solutions but does not yet help with the low regularity well-posedness question. The key to progress lies in developing a much stronger version of this bound, which roughly has the form

$$ \frac{d}{dt} E^{s}(u) \lesssim \| \partial u\|_{B^{\frac{1}{2}}_{ \infty ,2}}^{2} \cdot E^{s}(u), $$
(1.27)

where the two control norms on the right are now balanced, and only require \(1/2\) derivative less than (1.26). This is what we call a balanced energy estimate, which may only hold for a very carefully chosen energy functional \(E^{s}\).
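The point of the balanced form (1.27) is that, by Gronwall's inequality, it propagates the energy under only an \(L^{2}\)-in-time bound on the control norm,

```latex
E^s(u(T)) \lesssim E^s(u(0)) \,
  \exp \left( C \int_0^T \| \partial u(t) \|_{B^{1/2}_{\infty,2}}^2 \, dt \right),
```

which is exactly the kind of control the Strichartz estimates of Theorem 1.5 provide; by contrast, (1.26) would require an \(L^{1}\)-in-time bound on \(\|\partial ^{2} u\|_{L^{\infty}}\), half a derivative more.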

This is an idea that originates in our recent work on 2D water waves (see [1]), where balanced energy estimates are also used in order to substantially lower the low regularity well-posedness threshold. Going back further, it has its roots in earlier work of the last two authors and their collaborators [19, 20], in the context of applying normal form methods to obtain long time well-posedness results in quasilinear problems. There we introduced what we called the modified energy method, which in a nutshell asserts that in quasilinear problems it is far better to modify the energies in a normal form fashion than to transform the equation. It was the cubic energy estimates of [20] that were later refined in [1] to balanced energy estimates. Along the way, we have also borrowed and adapted another idea, from Alazard and Delort [2, 3], which is to prepare the problem with a partial normal form transformation; this is part of their broader concept of paradiagonalization, and the same idea is also used here.

There are several major difficulties in the way of proving energy estimates such as the ones in (1.27):

  • The normal form structure is somewhat weaker in the case of the minimal surface equation, compared to water waves. As a consequence, we have to carefully understand which components of the equation can be improved with a normal form analysis and which cannot, and thus have to be estimated directly.

  • Not only are the energy functionals \(E^{s}\) not explicit, they have to be constructed in a very delicate way, following a procedure that is reminiscent of Tao’s renormalization idea in the context of wave-maps [42], as well as the subsequent work [47] of the third author on the same problem.

Keeping track of symbol regularities in our energy functionals and in the proof of the energy estimates is also a difficult task. To succeed, we adapt and refine a suitable notion of paracontrolled distributions, an idea that has already been used successfully in the realm of stochastic PDEs [13, 14].

  • The balanced energy estimates need to be proved not only for the full equation, but also for the associated linear paradifferential equation, as a key intermediate step, as well as for the full linearized flow. In particular, when linearizing, some of the favourable normal form structure (or null structure, to use the nonlinear wave equations language) is lost, and the proofs become considerably more complex.

Finally, the Strichartz estimates of [38] cannot be used directly here. Instead, we reformulate them in a paradifferential fashion and apply them on appropriate semiclassical time scales. After interval summation, this leads to Strichartz estimates on the unit time scale, but with derivative losses. Precisely, in our main Strichartz estimates, whose aim is to bound the control parameters in (1.27), we end up losing essentially \(1/8\) derivatives in two space dimensions, and \(1/4\) derivatives in higher dimensions. These losses eventually determine the regularity thresholds in our main result in Theorem 1.3.

One consequence of these energy estimates is the following continuation result for the solutions:

Theorem 1.6

The \(\mathcal {H}^{s}\) solution \(u\) given by Theorem 1.3 can be continued for as long as the following integral remains finite:

$$ \int _{0}^{T} \| \partial u(t)\|_{B^{\frac{1}{2}}_{\infty ,2}}^{2} dt < \infty . $$
(1.28)

1.4 An outline of the paper

Paraproducts and paradifferential calculus.

The bulk of the paper is written in the language of paradifferential calculus. The notations and some of the basic product and paracommutator bounds are introduced in Section 2. Importantly, we use the Weyl quantization throughout; this plays a substantial role, as differences between quantizations are not always perturbative in our analysis. Also of note, we emphasize the difference between balanced and unbalanced bounds, so that some of our \(\Psi \)DO product or commutator expansions have the form

$$ \text{commutator} = \text{principal part} + \text{unbalanced lower order} + \text{balanced error}. $$

The geometric form of the minimal surface equation.

While the flat d’Alembertian may naively appear to play a role in the expansion (1.3) of the minimal surface equation, this is not at all useful, and instead we need to adopt a geometric viewpoint. As a starting point, in Section 3 we consider several equivalent formulations of the minimal surface equation, leading to its geometric form in (1.7). This is based on the metric \(g\) associated to the solution \(u\) by (1.4), whose dual we also compute. Two other conformally equivalent metrics will also play a role. In the same section we derive the linearized equation, and also introduce the associated linear paradifferential flow.

Strichartz estimates.

As explained earlier, Strichartz estimates play a major role in our analysis. These are applied to several equations, namely the full evolution, the linear paradifferential evolution and finally the linearized equation; in the present paper, we view the bounds for the paradifferential equation as the core ones, and the other bounds as derived bounds, though not necessarily in a directly perturbative fashion. The Strichartz estimates admit a number of formulations: in direct form for the homogeneous flow, in dual form for the inhomogeneous one, or in the full form. The aim of Section 4 is to introduce all of these forms of the Strichartz estimates, as well as to describe the relations between them, in the context of this paper. A new idea here is to allow source terms that are time derivatives of distributions in appropriate spaces; this is achieved by reinterpreting the wave equation as a system.

Control parameters in energy estimates.

We begin Section 5 by defining the control parameters \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\) and ℬ, which will play a fundamental role in our energy estimates. Here \({\mathcal {A}}\) and \({\mathcal {A}^{\sharp }}\) are scale invariant norms, at the level of \(\| \partial u\|_{L^{\infty}}\), which will remain small uniformly in time. ℬ, on the other hand, is time dependent, at the level of \(\||D_{x}|^{\frac{1}{2}} \partial u\|_{L^{\infty}}\), and will control the energy growth. Typically, our balanced cubic energy estimates will have the form

$$ \frac{\partial E}{\partial t} \lesssim _{{\mathcal {A}^{\sharp }}} { \mathcal {B}}^{2} E. $$

To propagate energy bounds we will need to know that \({\mathcal {B}}\in L^{2}_{t}\). Also in the same section we prove a number of core bounds for our solutions in terms of the control parameters.

The multiplier method and paracontrolled distributions.

Both the construction of our energies and the proof of the energy estimates are based on a paradifferential implementation of the multiplier method, which leads to space-time identities of the form

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\Box _{g} u \cdot X u \, dx dt = \left . E_{X}(u) \right |_{0}^{T} + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}R(u) \, dx dt $$

in a paradifferential format, where the vector field \(X\) is our multiplier and \(E_{X}\) is its associated energy, while \(R(u)\) is the energy flux term which will have to be estimated perturbatively. A fundamental difficulty is that the multiplier \(X\), which should heuristically be at the regularity level of \(\partial u\), cannot be chosen algebraically, and instead has to be constructed in an inductive manner relative to the dyadic frequency scales. In order to accurately quantify the regularity of \(X\), in Section 6 we use and refine the notion of paracontrolled distributions; in a nutshell, while \(X\) may not be chosen to be a function of \(\partial u\), it will still have to be paracontrolled by \(\partial u\), which we denote by .

Energy estimates for the paradifferential equation.

The construction of the energy functionals is carried out in Section 7, primarily at the level of the linear paradifferential equation, first in \(\mathcal {H}^{1}\) and then in \(\mathcal {H}^{s}\). In both cases there are two steps: first the construction of the symbol of the multiplier \(X\), as a paracontrolled distribution, and then the proof of the energy estimates. The difference between the two cases is that \(X\) is a vector field in the first case, but a full pseudodifferential operator in the second case; because of this, we prefer to present the two arguments separately.

Energy estimates for the full equation.

The aim of Section 8 is to prove that balanced cubic energy estimates hold for the full equation in all \(\mathcal {H}^{s}\) spaces with \(s \geq 1\). We do this by thinking about the full equation in a paradifferential form, i.e. as a linear paradifferential equation with a nonlinear source term, and then by applying a normal form transformation to the unbalanced part of the source term.

Well-posedness for the linearized equation.

The goal of Section 9 is to establish both energy and Strichartz estimates for \(\mathcal {H}^{\frac{1}{2}}\) solutions (\(\mathcal {H}^{\frac{5}{8}}\) in dimension two) to the linearized equation. This is achieved under the assumption that both energy and Strichartz estimates for \(\mathcal {H}^{\frac{1}{2}}\) solutions (\(\mathcal {H}^{\frac{5}{8}}\) in dimension two) for the linear paradifferential equation hold. We remark that, while the energy estimates for the linear paradifferential equation have already been established by this point in the paper, the corresponding Strichartz estimates have yet to be proved.

Short time Strichartz estimates for the full equation.

The local well-posedness result of Smith and Tataru [38] yields well-posedness and nearly sharp Strichartz estimates on the unit time scale for initial data that is small in the appropriate Sobolev space. Our objective in Section 10 is to recast this result as a short time result for a corresponding large data problem. This is a somewhat standard argument combining scaling and finite speed of propagation, though with an interesting twist due to the need to use homogeneous Sobolev norms.

Small vs. large \(\mathcal {H}^{s}\) data.

In our main well-posedness proof, in order to avoid more cumbersome notations and estimates, it is convenient to work with initial data that is small in \(\mathcal {H}^{s}\). This entails no real loss of generality: since this is a nonlinear wave equation exhibiting finite speed of propagation, the large data problem can be reduced to the small data problem by appropriate localizations. This argument is carried out at the beginning of Section 11.

Rough solutions as limits of smooth solutions.

The modules discussed so far come together in Section 11, where we finally obtain our rough solutions \(u\) as limits of smooth solutions \(u^{h}\) whose initial data are frequency localized below frequency \(2^{h}\). The bulk of the proof is organized as a bootstrap argument, where the bootstrap quantities are uniform energy type bounds for both \(u^{h}\) and for their increments \(v^{h} = \dfrac{d}{dh} u^{h}\), which solve the corresponding linearized equation. The main steps are as follows:

  • we use the short time Strichartz estimates derived from [38] for \(u^{h}\) and \(v^{h}\) in order to obtain long time Strichartz estimates for \(u^{h}\), which in turn imply energy estimates for both the full equation and the paradifferential equation, closing one half of the bootstrap.

  • we combine the short time Strichartz estimates and the long time energy estimates for the paradifferential equation in \(\mathcal {H}^{\frac{1}{2}}\) (\(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)) to obtain long time Strichartz estimates for the same paradifferential equation.

  • we use the energy and Strichartz estimates for the paradifferential equation to obtain similar bounds for the linearized equation. This in turn implies long time energy estimates for \(v^{h}\), closing the second half of the bootstrap loop.

The well-posedness argument.

Once we have a complete collection of energy estimates and Strichartz estimates for both the full equation and the linearized equation, we are able to use frequency envelopes in order to prove the remaining part of the well-posedness results, namely the strong convergence of the smooth solutions, the continuous dependence, and the associated uniqueness property. In this we follow the strategy outlined in the last two authors’ expository paper [21].

2 Notations, paraproducts and some commutator type bounds

We begin with some standard notations and conventions:

  • The Greek indices \(\alpha \), \(\beta \), \(\gamma \), \(\delta \) etc. in expressions range from 0 to \(n\), where 0 stands for time. Roman indices \(i\), \(j\) are limited to the range from 1 to \(n\), and are associated only with spatial coordinates.

  • The differentiation operators with respect to all coordinates are \(\partial _{\alpha}\), \(\alpha = 0,\ldots, n\). By \(\partial \) without any index we denote the full space-time gradient. To separate only spatial derivatives we use the notation \(\partial _{x}\).

  • We consistently use the Einstein summation convention, where repeated indices are summed over, unless explicitly stated otherwise.

  • The inequality sign \(x \lesssim y\) means \(x \leq Cy\) with a universal implicit constant \(C\). If instead the implicit constant \(C\) depends on some parameter \(A\) then we write \(x \lesssim _{A} y\).

2.1 Littlewood-Paley decompositions and Sobolev spaces

We denote the Fourier variables by \(\xi _{\alpha}\) with \(\alpha = 0,\ldots,n\). To separate the spatial Fourier variables we use the notation \(\xi '\).

2.1.1 Littlewood-Paley decompositions

For distributions in \({\mathbb{R}}^{n}\) we will use the standard inhomogeneous Littlewood-Paley decomposition

$$ u = \sum _{k=0}^{\infty }P_{k} u, $$

where \(P_{k} = P_{k} (D_{x})\) are multipliers with smooth symbols \(p_{k}(\xi ')\), localized in the dyadic frequency region \(\{|\xi | \approx 2^{k}\}\) (unless \(k=0\), where we capture the entire unit ball). We emphasize that no such decompositions are used in the paper with respect to the time variable. We will also use the notations \(P_{< k}\), \(P_{>k}\) with the standard meaning. Often we will use shorthand for the Littlewood-Paley pieces of \(u\), such as \(u_{k} :=P_{k} u\) or \(u_{< k}:= P_{< k} u\). On occasion we will need multipliers with slightly larger support, e.g. \(\tilde{P}_{k}\) will denote a multiplier with similar dyadic frequency localization as \(P_{k}\), but so that \(\tilde{P}_{k} P_{k} = P_{k}\).
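For readers who prefer a concrete picture, the telescoping construction of the multipliers \(P_{k}\) can be illustrated numerically on a periodic grid; the cutoff below is an arbitrary smoothstep, not the specific symbol used in the paper:

```python
import numpy as np

def chi(r):
    # monotone cutoff: 1 for r <= 1, 0 for r >= 2 (cubic smoothstep between)
    t = np.clip(r - 1.0, 0.0, 1.0)
    return 1.0 - t * t * (3.0 - 2.0 * t)

def lp_pieces(u, kmax):
    """Return [P_0 u, ..., P_kmax u] built from Fourier multipliers p_k."""
    n = len(u)
    xi = np.abs(np.fft.fftfreq(n, d=1.0 / n))     # integer frequencies
    uhat = np.fft.fft(u)
    cuts = [chi(xi / 2.0 ** k) for k in range(kmax + 1)]
    # p_0 covers the unit ball; p_k = chi(|xi|/2^k) - chi(|xi|/2^(k-1))
    syms = [cuts[0]] + [cuts[k] - cuts[k - 1] for k in range(1, kmax + 1)]
    return [np.real(np.fft.ifft(s * uhat)) for s in syms]

u = np.random.default_rng(0).standard_normal(64)
pieces = lp_pieces(u, kmax=7)       # 2^7 exceeds the largest grid frequency
assert np.allclose(sum(pieces), u)  # the dyadic pieces sum back to u
```

Each \(p_{k}\) with \(k \geq 1\) is supported in the dyadic annulus \(\{2^{k-1} \lesssim |\xi | \lesssim 2^{k+1}\}\), and the symbols sum to 1 by telescoping, mirroring the decomposition above.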

2.1.2 Function spaces

For our main evolution we will use the inhomogeneous Sobolev spaces \(H^{s}\), often combined as product spaces \(\mathcal {H}^{s} = H^{s} \times H^{s-1}\) for the position/velocity components of our evolution. Only in the next-to-last section of the paper will we have an auxiliary use for the corresponding homogeneous spaces \(\dot{H}^{s}\), in connection with scaling analysis.

For our estimates we will use \(L^{\infty}\) based control norms. In addition to the standard \(L^{\infty}\) norms, in many estimates we will use the standard inhomogeneous \(BMO\) norm, as well as its close relatives \(BMO^{s}\), with norm defined as

$$ \| f\|_{BMO^{s}} = \| \langle D_{x} \rangle ^{s} f\|_{BMO}. $$

We will also need several related \(L^{p}\) based Besov norms \(B^{s}_{p,q}\), defined as

$$ \|u\|_{B^{s}_{p,q}}^{q} = \sum _{k} 2^{qks} \|P_{k} u\|_{L^{p}}^{q} $$

with the obvious changes if \(q = \infty \). In particular we will often use these norms with \(p=\infty \) or \(p=2n\), specifically the spaces \(B^{0}_{\infty ,1}\), \(B^{\frac{1}{2}}_{2n,1}\) and \(B^{\frac{1}{2}}_{\infty ,2}\); these will define our control norms \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\) and ℬ.

2.1.3 Frequency envelopes

Throughout the paper we will use the notion of frequency envelopes, introduced by Tao (see for example [42]), which is a very useful device that tracks the evolution of the energy of solutions between dyadic energy shells.

Definition 2.1

We say that \(\{c_{k}\}_{k\geq 0} \in \ell ^{2}\) is a frequency envelope for a function \(u\) in \(H^{s}\) if we have the following two properties:

a) Energy bound:

$$ \|P_{k} u\|_{H^{s}} \leq c_{k}, $$
(2.1)

b) Slowly varying:

$$ \frac{c_{k}}{c_{j}} \lesssim 2^{\delta |j-k|} , \quad j,k\in \mathbb{N}. $$
(2.2)

Here \(\delta \) is a positive constant, which is taken small enough in order to account for energy leakage between nearby frequencies.

One can also bound the size of a frequency envelope from above, by requiring that

$$ \| u\|_{H^{s}}^{2} \approx \sum c_{k}^{2}. $$

Such frequency envelopes always exist, for instance one can define

$$ c_{k} = \sup _{j} 2^{-\delta |j-k|} \|P_{j} u\|_{H^{s}}. $$
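This explicit envelope satisfies the energy bound (2.1) by construction, and the slowly varying property (2.2) with constant 1, by the triangle inequality \(|j-m| \leq |j-k| + |k-m|\). A minimal numerical illustration (the dyadic energies are arbitrary sample values):

```python
import numpy as np

def envelope(a, delta):
    """c_k = max_j 2^{-delta |j-k|} a_j for dyadic energies a_j."""
    k = np.arange(len(a))
    weights = 2.0 ** (-delta * np.abs(k[:, None] - k[None, :]))
    return (weights * a[None, :]).max(axis=1)

a = np.random.default_rng(1).random(40)     # stand-ins for ||P_j u||_{H^s}
delta = 0.05
c = envelope(a, delta)

assert np.all(c >= a)                       # energy bound (2.1)
jj, kk = np.meshgrid(np.arange(40), np.arange(40), indexing="ij")
assert np.all(c[kk] / c[jj] <= 2.0 ** (delta * np.abs(jj - kk)) + 1e-12)
```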

The same notion applies to any Besov norm. In particular we will use it jointly for the Besov norms that define our control parameters \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\) and ℬ.

2.2 Paraproducts and paradifferential operators

For multilinear analysis, we will consistently use paradifferential calculus, for which we refer the reader to [6, 32].

We begin with the simplest bilinear expressions, namely products, for which we will use the Littlewood-Paley trichotomy

$$ f\cdot g = T_{f} g + \Pi (f,g) + T_{g} f, $$

where the three terms capture the low×high frequency interactions, the high×high frequency interactions and the high×low frequency interactions, respectively. The paraproduct \(T_{f} g\) might be heuristically thought of as the dyadic sum

$$ T_{f} g = \sum _{k} f_{< k-\kappa} g_{k}, $$

where the frequency gap \(\kappa \) can be simply chosen as a universal parameter, say \(\kappa = 4\), or on occasion may be increased and used as a smallness parameter in a large data context. To avoid bulky notations, in this paper we will make a harmless abuse of notation and neglect \(\kappa \) altogether. In other words, our notation \(P_{< k}\) stands in effect for \(P_{< k-\kappa}\) with a fixed universal constant \(\kappa \).

However, in our context a definition such as the above one is too imprecise, and the difference between usually equivalent choices may have nonperturbative effects when considering adjoints in our proof of balanced energy estimates later on.

In particular, the symmetry properties of \(T_{f}\) as an operator in \(L^{2}\) are critical in our energy estimates. For this reason, we choose to work with the Weyl quantization, and we define

$$ \mathcal {F} (T_{f} g)(\zeta ) = \int _{\xi +\eta = \zeta} \hat{f}(\eta ) \chi \left ( \frac{|\eta |}{\langle \xi +\frac{1}{2}\eta \rangle} \right ) \hat{g}(\xi ) \, d\xi . $$

Here \(\chi \) is a smooth function supported in a small ball and equal to 1 near the origin. With this convention, if \(f\) is real then \(T_{f}\) is an \(L^{2}\) self-adjoint operator.
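In a discrete model one can verify this self-adjointness directly: on a periodic grid, the Fourier-space matrix of \(T_{f}\) is \(M(\zeta ,\xi ) = \hat{f}(\zeta -\xi )\, \chi (|\zeta -\xi |/\langle (\zeta +\xi )/2\rangle )\), which is Hermitian whenever \(f\) is real. A short numerical sketch, with an illustrative choice of \(\chi \):

```python
import numpy as np

def chi(r):
    # illustrative smooth cutoff: 1 for r <= 1/2, 0 for r >= 1
    t = np.clip(2.0 * r - 1.0, 0.0, 1.0)
    return 1.0 - t * t * (3.0 - 2.0 * t)

n = 32
freqs = np.fft.fftfreq(n, d=1.0 / n)              # integer frequencies
f = np.random.default_rng(2).standard_normal(n)   # real low-frequency factor
fhat = np.fft.fft(f)

zeta, xi = np.meshgrid(freqs, freqs, indexing="ij")
eta = zeta - xi                                   # frequency of the f input
avg = 0.5 * (zeta + xi)                           # Weyl average xi + eta/2
M = fhat[eta.astype(int) % n] * chi(np.abs(eta) / np.sqrt(1.0 + avg**2))

# for real f the Weyl paraproduct is L^2 self-adjoint: M is Hermitian
assert np.allclose(M, M.conj().T)
```

The key point is that the cutoff depends only on \(|\eta |\) and on the symmetric average \(\xi + \eta /2\), so swapping input and output frequencies leaves it unchanged.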

For paraproducts we have a number of standard bounds which we list below, and we will refer to as Coifman-Meyer estimates:

$$ \| T_{f} g\|_{L^{p}} \lesssim \| f\|_{L^{\infty}} \|g\|_{L^{p}}, $$
(2.3)
$$ \| T_{f} g\|_{L^{p}} \lesssim \| f\|_{L^{p}} \|g\|_{BMO}, $$
(2.4)
$$ \|\Pi (f,g)\|_{L^{p}} \lesssim \| f \|_{L^{p}} \| g\|_{BMO} . $$
(2.5)

These hold for \(1 < p < \infty \), but there are also endpoint results available roughly corresponding to \(p = 1\) and \(p = \infty \).

Paraproducts may also be thought of as belonging to the larger class of translation invariant bilinear operators. Such operators

$$ f,g \to B(f,g) $$

may be described by their symbols \(b(\eta ,\xi )\) in the Fourier space, by

$$ \mathcal {F} B(f,g)(\zeta ) = \int _{\xi +\eta = \zeta} b(\eta ,\xi ) \hat{f}(\eta ) \hat{g}(\xi )\, d \xi . $$

A special class of such operators, which we denote by \(L_{lh}\), will play an important role later in the paper:

Definition 2.2

By \(L_{lh}\) we denote translation invariant bilinear forms whose symbol \(\ell _{lh}(\eta ,\xi )\) is supported in \(\{|\eta | \ll |\xi |+1\}\) and satisfies bounds of the form

$$ |\partial ^{i}_{\eta }\partial ^{j}_{\xi }\ell _{lh}(\eta ,\xi ) | \lesssim \langle \xi \rangle ^{-|i|-|j|}. $$

We remark that in particular the bilinear form \(B(f,g) = T_{f} g\) is an operator of type \(L_{lh}\), with symbol

$$ b(\eta ,\xi ) = \chi \left ( \frac{|\eta |}{\langle \xi +\frac{1}{2}\eta \rangle} \right ). $$

Here the factor \(\xi +\eta /2\) in the denominator is the average of the \(g\) input frequency and the output frequency, and corresponds exactly to our use of the Weyl calculus. The \(L^{p}\) bounds and the commutator estimates for such bilinear forms mirror exactly the similar bounds for paraproducts.

2.3 Commutator and other paraproduct bounds

Here we collect a number of general paraproduct estimates, which are relatively standard. See for instance Appendix B of [20] and Sections 2 and 3 of [1] for proofs of the following estimates as well as further references.

We begin with the following standard commutator estimate:

Lemma 2.3

\(P_{k}\) commutators

We have

$$ \| [T_{f},P_{k}] \|_{\dot{H}^{s} \to \dot{H}^{s}} \lesssim 2^{-k} \| \partial f\|_{L^{\infty}}. $$
(2.6)

A similar bound holds also in \(L^{p}\) for \(1 \leq p \leq \infty \).

The following commutator-type estimates are exact reproductions of statements from Lemmas 2.4 and 2.6 in Section 2 of [1], respectively:

Lemma 2.4

Para-commutators

Assume that \(\gamma _{1}, \gamma _{2} < 1\). Then we have

$$ \| T_{f} T_{g} - T_{g} T_{f} \|_{\dot{H}^{s} \to \dot{H}^{s+\gamma _{1}+ \gamma _{2}}} \lesssim \||D|^{\gamma _{1}}f \|_{BMO}\||D|^{\gamma _{2}}g \|_{BMO}, $$
(2.7)
$$ \| T_{f} T_{g} - T_{g} T_{f} \|_{\dot{B}^{s}_{\infty ,\infty} \to \dot{H}^{s+\gamma _{1}+\gamma _{2}}} \lesssim \||D|^{\gamma _{1}}f \|_{L^{2}}\||D|^{\gamma _{2}}g\|_{BMO}. $$
(2.8)

A bound similar to (2.7) holds in the Besov scale of spaces, namely from \(\dot{B}^{s}_{p, q}\) to \(\dot{B}^{s+\gamma _{1}+\gamma _{2}}_{p, q}\) for real \(s\) and \(1\leq p,q \leq \infty \).

Lemma 2.5

Para-associativity

For \(s + \gamma _{2} \geq 0\), \(s + \gamma _{1} + \gamma _{2} \geq 0\), and \(\gamma _{1} < 1\) we have

$$ \| T_{f} \Pi (v, u) - \Pi (v, T_{f} u)\|_{\dot{H}^{s + \gamma _{1}+ \gamma _{2}}} \lesssim \||D|^{\gamma _{1}}f \|_{BMO}\||D|^{\gamma _{2}}v \|_{BMO} \|u\|_{\dot{H}^{s}}. $$
(2.9)

We also have a Leibniz-type rule with paraproducts, which closely follows Lemma 3.6 of [1]. Here, our setting is slightly cleaner as we have only \(T_{f}\partial _{\alpha}\) in place of \(\partial _{t} + T_{b} \partial _{\alpha}\), and the dependence on \(f\) is captured by the control norm \(A_{\frac{1}{4}}\) in [1].

Lemma 2.6

Para-Leibniz rule

For the balanced Leibniz rule error

$$ E^{\pi}_{L} (u,v) = T_{f}\partial _{\alpha }\Pi (u, v) - \Pi (T_{f} \partial _{\alpha }u, v) - \Pi (u, T_{f}\partial _{\alpha }v) $$

we have the bound

$$ \| E^{\pi}_{L} (u,v) \|_{H^{s}} \lesssim \|f\|_{BMO^{\frac{1}{2}}} \| u \|_{BMO^{-\frac{1}{2} - \sigma}} \| v\|_{H^{s+\sigma}}, \qquad \sigma \in {\mathbb{R}}, \quad s \geq 0. $$
(2.10)

The next paraproduct estimate, see Lemma 2.5 in [1], directly relates multiplication and paramultiplication:

Lemma 2.7

Para-products

Assume that \(\gamma _{1}, \gamma _{2} < 1\), \(\gamma _{1}+\gamma _{2} \geq 0\). Then

$$ \| T_{f} T_{g} - T_{fg} \|_{\dot{H}^{s} \to \dot{H}^{s+\gamma _{1}+ \gamma _{2}}} \lesssim \||D|^{\gamma _{1}}f \|_{BMO}\||D|^{\gamma _{2}}g \|_{BMO}. $$
(2.11)

A similar bound holds in the Besov scale of spaces, namely from \(\dot{B}^{s}_{p, q}\) to \(\dot{B}^{s+\gamma _{1}+\gamma _{2}}_{p, q}\) for real \(s\) and \(1\leq p,q \leq \infty \).

We will also need the following variant, which applies for a different range of indices:

Lemma 2.8

Low-high para-products

Assume that \(\gamma _{1} > 0\), \(\gamma _{1}+\gamma _{2} \leq 0\). Then

$$ \begin{aligned} &\| T_{f} T_{g} - T_{T_{f}g} \|_{\dot{H}^{s} \to \dot{H}^{s+\gamma _{1}+ \gamma _{2}}} \lesssim \ \||D|^{\gamma _{1}}f \|_{BMO}\||D|^{\gamma _{2}}g\|_{BMO}, \qquad \gamma _{1}+\gamma _{2} \neq 0, \end{aligned} $$
(2.12)
$$ \begin{aligned} & \| T_{f} T_{g} - T_{T_{f}g} \|_{{BMO}^{s} \to \dot{H}^{s+\gamma _{1}+\gamma _{2}}} \lesssim \ \||D|^{\gamma _{1}}f \|_{BMO}\||D|^{\gamma _{2}}g\|_{L^{2}}. \end{aligned} $$
(2.13)

The proof of this lemma only requires a straightforward Littlewood-Paley decomposition of both \(f\) and \(g\), where the difference \(T_{f} T_{g} - T_{T_{f}g} \) selects the range where the \(f\) frequency is at least comparable to the \(g\) frequency. The details are left to the reader.

These are stated here in the more elegant homogeneous setting, but there are also obvious modifications that apply in the inhomogeneous case. We end with the following Moser-type result:

Lemma 2.9

Let \(F\) be smooth with \(F(0)=0\), and \(w \in H^{s}\). Set

$$ R(w) = F(w) - T_{F'(w)} w. $$

Then we have the estimate

$$ \|R(w)\|_{H^{s+\frac{1}{2}}} \lesssim C(\|w\|_{L^{\infty}})\| D^{ \frac{1}{2}} w\|_{BMO} \| w\|_{H^{s}}. $$
(2.14)

This should be a classical result, though we were not able to find a sufficiently accurate reference. Instead of providing a proof here, we refer to similar Moser estimates in Lemmas 5.2 and 5.7, which are more relevant to the current paper and which we prove in Section 5. For further reference, the reader may also view this as a variation of Lemma 2.3 in [1].

2.4 Paradifferential operators

As a generalization of paraproducts, we will also work with paradifferential operators. Precisely, given a symbol \(a(x,\xi )\) in \({\mathbb{R}}^{n}\), we define its paradifferential Weyl quantization \(T_{a}\) as the operator

$$ \mathcal {F} (T_{a} g)(\zeta ) = \int _{\xi +\eta = \zeta} \chi \left ( \frac{|\eta |}{\langle \xi +\frac{1}{2}\eta \rangle} \right ) \hat{a}( \eta ,\xi ) \hat{g}(\xi ) \,d\xi , $$

where

$$ \hat{a}(\eta ,\xi ) = \mathcal {F}_{x} a(x,\xi ). $$

The simplest class of symbols one can work with is \(L^{\infty }S^{m}\), which contains symbols \(a\) for which

$$ | \partial _{\xi}^{\alpha }a(x,\xi ) |\leq c_{\alpha }\langle \xi \rangle ^{m-|\alpha |} $$
(2.15)

for all multi-indices \(\alpha \). For such symbols, the Calderón-Vaillancourt theorem ensures appropriate boundedness in Sobolev spaces,

$$ T_{a}: H^{s} \to H^{s-m}. $$

We remark that this class of symbols in the paradifferential quantization is contained in the class denoted by \(\mathcal {B}S^{m}_{1,1}\); see for instance Hörmander [16], but also the earlier work of Bony [7] for further properties.

More generally, given a translation invariant space of distributions \(X\), we can define an associated symbol class \(X S^{m}\) of symbols with the property that

$$ \| \partial _{\xi}^{\alpha }a(x,\xi ) \|_{X} \leq c_{\alpha }\langle \xi \rangle ^{m-|\alpha |} $$
(2.16)

for each \(\xi \in {\mathbb{R}}^{n}\). Later in the paper, we will use several choices of symbols of this type, using function spaces that we will associate to our problem.

3 A complete set of equations

Here we aim to further describe the minimal surface equation and the underlying geometry, and, in particular, its null structure. We also derive the linearized equation, and introduce the paralinearization of both the main equation and its linearization.

3.1 The Lorentzian geometry of the minimal surface

Starting from the expression of the metric \(g\) in (1.4), the dual metric is easily computed to be

$$ g^{\alpha \beta} := m^{\alpha \beta} - \frac{m^{\alpha \gamma} m^{\beta \delta} \partial _{\gamma} u \partial _{\delta }u}{1 + m^{\mu \nu} \partial _{\mu }u \partial _{\nu }u}. $$
(3.1)

Also associated to the metric \(g\) is its determinant

$$ g = \det (g_{\alpha \beta}) = \det (g^{\alpha \beta})^{-1}, $$

and the associated volume form

$$ dV = \sqrt{g}\, dx. $$

This can be easily computed (e.g. using Sylvester’s determinant theorem) as

$$ g = 1+m^{\mu \nu} \partial _{\mu }u \, \partial _{\nu }u. $$
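The computation rests on the rank-one determinant identity \(\det (A + v v^{T}) = \det (A)\,(1 + v^{T} A^{-1} v)\), a special case of Sylvester's determinant theorem, applied with \(A = m\) and \(v = \partial u\). A quick numerical check of the identity itself, with arbitrary sample values:

```python
import numpy as np

# Rank-one determinant identity: det(A + v v^T) = det(A) (1 + v^T A^{-1} v).
# Here A is a sample Minkowski metric (which is its own inverse), and v
# plays the role of the space-time gradient of u; the values are arbitrary.
m = np.diag([-1.0, 1.0, 1.0, 1.0])
v = np.array([0.3, -0.2, 0.5, 0.1])

lhs = np.linalg.det(m + np.outer(v, v))
rhs = np.linalg.det(m) * (1.0 + v @ np.linalg.solve(m, v))
assert np.isclose(lhs, rhs)   # here v @ solve(m, v) = m^{mu nu} v_mu v_nu
```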

In the sequel, we will always raise indices with respect to the metric \(g\), never with respect to Minkowski. In particular we will use the standard notation

$$ \partial ^{\alpha }= g^{\alpha \beta} \partial _{\beta}. $$
(3.2)

We remark that, when applied to the function \(u\), this operator has nearly the same effect as the corresponding Minkowski operator,

$$ \partial ^{\alpha }u = \frac{1}{g} m^{\alpha \beta}\partial _{\beta }u. $$
(3.3)

3.2 The minimal surface equation

Here we rewrite the minimal surface equation in covariant form. Using the \(g\) notation above and the Minkowski metric, we rewrite (1.3) as

$$ m^{\alpha \beta} \partial _{\alpha }( g^{-\frac{1}{2}} \partial _{ \beta }u) = 0, $$

or equivalently

$$ m^{\alpha \beta} (\partial _{\alpha }\partial _{\beta }u - \frac{1}{2g} \partial _{\alpha }g \partial _{\beta }u)= 0 . $$

Expanding the \(g\) derivative, we have

$$ \partial _{\alpha }g = 2 m^{\mu \nu} \partial _{\mu }u \, \partial _{ \alpha }\partial _{\nu }u. $$
(3.4)

Then in the previous equation we recognize the expression for the dual metric, and the minimal surface equation becomes

$$ g^{\alpha \beta} \partial _{\alpha }\partial _{\beta }u = 0. $$
(3.5)

Using the notation (3.2), this is written in an even shorter form,

$$ \partial ^{\alpha }\partial _{\alpha }u = 0. $$
(3.6)

Similarly, using also (3.3), the relation (3.4) becomes

$$ \frac{1}{2g} \partial _{\alpha }g = \partial ^{\nu }u \, \partial _{ \alpha }\partial _{\nu }u. $$
(3.7)

3.3 The covariant d’Alembertian

The covariant d’Alembertian associated to the metric \(g\) has the form

$$ \Box _{g} = \frac{1}{\sqrt{g}} \partial _{\alpha }\sqrt{g} g^{\alpha \beta} \partial _{\beta}, $$

which we can rewrite as

$$ \begin{aligned} \Box _{g} & = \partial _{\alpha }g^{\alpha \beta} \partial _{\beta }+ \frac{1}{2 g} (\partial _{\alpha }g) g^{\alpha \beta} \partial _{ \beta } \\ &= g^{\alpha \beta}\partial _{\alpha }\partial _{\beta}+ \left ( \partial _{\alpha }g^{\alpha \beta}\right ) \partial _{\beta }+ \frac{1}{2 g} (\partial _{\alpha }g) g^{\alpha \beta} \partial _{ \beta}. \end{aligned} $$

Next we need to compute the two coefficients in round brackets. The second coefficient is given by (3.7). For the first one, for later use, we perform a slightly more general computation where we differentiate \(g^{\alpha \beta}(\partial _{\gamma}u)\) as a function of its arguments \(p_{\gamma}:=\partial _{\gamma }u\),

$$ \frac{\partial g^{\alpha \beta}}{\partial p_{\gamma}} =-\partial ^{ \alpha}u\, g^{\beta \gamma} -\partial ^{\beta}u\, g^{\alpha \gamma}. $$
(3.8)

This formula follows by directly differentiating (3.1) and from (3.3),

$$ \frac{\partial g^{\alpha \beta}}{\partial p_{\gamma}} =-m^{\alpha \gamma} \partial ^{\beta} u - m^{\beta \gamma}\partial ^{\alpha}u + 2g\, \partial ^{\alpha}u\, \partial ^{\beta}u\, \partial ^{\gamma} u . $$

We use (3.1) once again to get (3.8):

$$ \begin{aligned} \frac{\partial g^{\alpha \beta}}{\partial p_{\gamma}} &=-[g^{\alpha \gamma }+g\,\partial ^{\alpha}u\, \partial ^{\gamma}u] \partial ^{\beta} u -[g^{\beta \gamma }+g\,\partial ^{\beta}u\, \partial ^{\gamma}u]\partial ^{\alpha}u+2g\,\partial ^{\alpha} u\,\partial ^{\beta}u\, \partial ^{\gamma} u \\ &= -g^{\alpha \gamma}\partial ^{\beta}u - g^{\beta \gamma }\partial ^{\alpha}u. \end{aligned} $$

From (3.8) and the chain rule, we arrive at

$$ \partial _{\gamma }g^{\alpha \beta} = - \partial ^{\alpha }u \, g^{ \beta \delta} \partial _{\gamma }\partial _{\delta }u - \partial ^{ \beta }u\, g^{\alpha \sigma} \partial _{\gamma }\partial _{\sigma }u. $$
(3.9)

Setting \(\gamma =\alpha \) and using the minimal surface equation in the (3.5) formulation, we get

$$ \partial _{\alpha }g^{\alpha \beta} = - \partial ^{\alpha }u \, g^{ \beta \delta} \partial _{\alpha }\partial _{\delta }u. $$
(3.10)

Comparing this with (3.7), we see that the last two terms in the \(\Box _{g}\) expression above cancel, and we obtain the following simplified form for the covariant d’Alembertian:

$$ \Box _{g} = g^{\alpha \beta} \partial _{\alpha }\partial _{\beta}. $$
(3.11)

In particular, we get the covariant form of the minimal surface equation for \(u\):

$$ \Box _{g} u = 0. $$
(3.12)

For later use, we introduce the notation

$$ A^{\alpha }= - \partial _{\beta }g^{\alpha \beta} = \frac{1}{2g} \partial ^{\alpha }g = \partial ^{\beta }u \, g^{\alpha \delta} \partial _{\beta }\partial _{\delta }u. $$
(3.13)

An interesting observation is that from here on, the Minkowski metric plays absolutely no role:

Remark 3.1

In order to introduce the minimal surface equations we have started from the Minkowski metric \(m^{\alpha \beta}\). However, the formulation (3.5) of the equations together with the relations (3.8) provide a complete description of the equations without any reference to the Minkowski metric, and which is in effect valid for any other Lorentzian metric. Indeed, the equation (3.5) together with the fact that the metric components \(g^{\alpha \beta}\) are smooth functions of \(\partial u\) satisfying (3.8) are all that is used for the rest of the paper. Thus, our results apply equally for any other Lorentzian metric in \({\mathbb{R}}^{n+2}\).

3.4 The linearized equations

Our objective now is to derive the linearized minimal surface equations. We will denote by \(v\) the linearized variable. Then, by (3.8), the linearization of the dual metric \(g^{\alpha \beta} = g^{\alpha \beta}(u)\) takes the form

$$ \delta g^{\alpha \beta} = - \partial ^{\alpha }u \, g^{\beta \nu} \partial _{\nu }v - \partial ^{\beta }u\, g^{\alpha \sigma} \partial _{ \sigma }v. $$

Then the linearized equation is directly computed, using the symmetry in \(\alpha \) and \(\beta \), as

$$ g^{\alpha \beta} \partial _{\alpha }\partial _{\beta }v - 2 \partial ^{ \alpha }u \, g^{\beta \gamma} \partial _{\alpha }\partial _{\beta }u \, \partial _{\gamma }v = 0. $$

Using the expression of \(A\) in (3.13), the linearized equations take the form

$$ (g^{\alpha \beta} \partial _{\alpha} \partial _{\beta} - 2 A^{\gamma} \partial _{\gamma}) v = 0. $$
(3.14)

Alternatively this may also be written in a divergence form,

$$ (\partial _{\alpha} g^{\alpha \beta} \partial _{\beta} - A^{\gamma } \partial _{\gamma })v = 0, $$
(3.15)

or in covariant form,

$$ \begin{aligned} \Box _{g} v &= 2 A^{\beta} \partial _{\beta }v. \end{aligned} $$
(3.16)

3.5 Null forms and the nonlinear null condition

The primary null form that plays a role in this article is \(Q_{0}\), defined by

$$ Q_{0} (v,w) :=g^{\alpha \beta} \partial _{\alpha}v \partial _{\beta} w = \partial _{\alpha }v \partial ^{\alpha }w. $$
(3.17)

Now, we verify that the nonlinear null condition (1.18) holds; for this we use (3.8) to compute

$$ \frac{\partial g^{\alpha \beta}}{\partial{p_{\gamma}}} \xi _{\alpha}\xi _{\beta}\xi _{\gamma} =\left ( -g^{\alpha \gamma}\partial ^{\beta}u -g^{\beta \gamma}\partial ^{\alpha}u\right )\xi _{\alpha}\xi _{\beta}\xi _{\gamma} = -2 \left (g^{\alpha \gamma}\xi _{\alpha}\xi _{\gamma}\right ) \partial ^{\beta}u \,\xi _{\beta},
$$

which vanishes on the null cone \(g^{\alpha \beta} \xi _{\alpha }\xi _{\beta }= 0\).

In addition we would like the contribution of \(A\) to the linearized equation to be a null form. We get

$$ A^{\beta }\partial _{\beta }v ={\partial ^{\alpha}u}\, Q_{0}( \partial _{\alpha}u, v). $$

3.6 Two conformally equivalent metrics

While the metric \(g\) is the primary metric used in this paper, for technical reasons we will also introduce two additional, conformally equivalent metrics, as follows:

(i) The metric \({\tilde{g}}\) is defined by

$$ {\tilde{g}}^{\alpha \beta} := (g^{00})^{-1} g^{\alpha \beta}. $$
(3.18)

Then the minimal surface equation can be written as

$$ {\tilde{g}}^{\alpha \beta} \partial _{\alpha }\partial _{\beta }u = 0, $$
(3.19)

while the linearized equation, written in divergence form is

$$ ( \partial _{\alpha }{\tilde{g}}^{\alpha \beta} \partial _{\beta }- { \tilde{A}}^{\alpha }\partial _{\alpha}) v= 0, $$
(3.20)

where, still raising indices only with respect to \(g\),

$$ {\tilde{A}}^{\alpha }= (g^{00})^{-1} A^{\alpha }- {\tilde{g}}^{\alpha \beta} \partial _{\beta }(\ln g^{00}) = \partial ^{\beta }u\, { \tilde{g}}^{\alpha \delta} \partial _{\beta }\partial _{\delta }u + 2 \partial ^{0} u \,{\tilde{g}}^{0\delta} {\tilde{g}}^{\alpha \beta} \partial _{\beta }\partial _{\delta }u . $$
(3.21)

The main feature of \(\tilde{g}\) is that \(\tilde{g}^{00}=1\). Because of this, it will be useful in the study of the linear paradifferential flow, in order to prevent a nontrivial paracoefficient in front of \(\partial _{0}^{2} v\) in the equations.

(ii) The metric \({\hat{g}}\) is defined by

$$ {\hat{g}}^{\alpha \beta} = g^{-\frac{1}{2}} g^{\alpha \beta}. $$
(3.22)

Then the minimal surface equation can be written as

$$ {\hat{g}}^{\alpha \beta} \partial _{\alpha }\partial _{\beta }u = 0, $$
(3.23)

which is not so useful. Instead, the advantage of using this metric is that, using (3.13), the linearized equation can now be written in divergence form,

$$ \partial _{\alpha }{\hat{g}}^{\alpha \beta} \partial _{\beta }v = 0. $$
(3.24)

This will be very useful when we study the linearized equation in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) in two dimensions).

3.7 Paralinearization and the linear paradifferential flow

A key element in our study of the minimal surface equation is the associated linear paradifferential flow, which is derived from the linearized flow (3.15). In inhomogeneous form, this is

$$ (\partial _{\alpha} T_{g^{\alpha \beta}} \partial _{\beta} - T_{A^{ \gamma}}\partial _{\gamma}) w = f. $$
(3.25)

Similarly we can write the paradifferential equations associated to \({\tilde{g}}\), namely

$$ (\partial _{\alpha} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} - T_{{ \tilde{A}}^{\gamma}}\partial _{\gamma}) w = f $$
(3.26)

as well as \({\hat{g}}\), which can be written in divergence form:

$$ \partial _{\alpha} T_{{\hat{g}}^{\alpha \beta}} \partial _{\beta} w = f. $$
(3.27)

These are all equivalent up to perturbative errors. Accordingly, we introduce the notation

$$ T_{P} = \partial _{\alpha }T_{g^{\alpha \beta}} \partial _{\beta } $$
(3.28)

for the paradifferential wave operator as well as its counterparts \(T_{\tilde{P}}\) and \(T_{\hat{P}}\) with the metric \(g\) replaced by \({\tilde{g}}\), respectively \({\hat{g}}\).

We will first use the paradifferential equation in the study of the minimal surface problem (3.5), which we rewrite in the form

$$ (\partial _{\alpha} T_{g^{\alpha \beta}} \partial _{\beta} - T_{A^{ \gamma}}\partial _{\gamma}) u = N(u). $$
(3.29)

Here we carefully base our formula on the linearized flow (3.25), rather than on a direct paradifferential expansion in (3.5). This is in order to ensure that all nonlinear interactions in \(N(u)\) are frequency balanced at leading order.

A key contention of our paper is that the nonlinearity \(N\) plays a perturbative role. However, this has to be interpreted in a more subtle way, in the sense that \(N\) becomes perturbative only after a well-chosen partial, variable coefficient normal form transformation.

Secondly, we will use it in the study of the linearized minimal surface equation, which we can write in the form

$$ \partial _{\alpha}T_{{\hat{g}}^{\alpha \beta}} \partial _{\beta} v = N_{lin}(u) v. $$
(3.30)

Here the nonlinearity \(N_{lin}\) will also play a perturbative role, in the same fashion as above. We caution the reader that this is not the linearization of \(N\).

4 Energy and Strichartz estimates

Both energy and Strichartz estimates play an essential role in this paper, in various forms and combinations. These are primarily applied first to the linear paradifferential flow, and then to the linearized flow associated to solutions to our main equation (1.7). Our goal here is to provide a brief overview of these estimates.

Importantly, in this section we do not prove any energy or Strichartz estimates. Instead, we simply provide definitions and context for what will be proved later in the paper, and prove a good number of equivalences between various well-posedness statements and estimates. We do this under absolutely minimal assumptions (e.g. boundedness) on the metric \(g\), in order to be able to apply these properties easily later on. In particular there are no commutator bounds needed or used in this section. The structure of the minimal surface equations also plays no role here.

4.1 The equations

For context, here we consider a pseudo-Riemannian metric \(g\) in \(I \times {\mathbb{R}}^{n}\), where \(I=[0,T]\) is a time interval of unspecified length. We will make some minimal universal assumptions on the metric \(g\):

  • both \(g\) and its inverse are uniformly bounded,

  • the time slices are uniformly space-like.

Associated to this metric \(g\), we will consider several equations:

The linear paradifferential flow in divergence form:
$$ \partial _{\alpha }T_{g^{\alpha \beta}} \partial _{\beta }v = f, \qquad v[0] = (v_{0},v_{1}) . $$
(4.1)
The linear paradifferential flow in non-divergence form:
$$ T_{g^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }v = f, \qquad v[0] = (v_{0},v_{1}) . $$
(4.2)
The linear flow in divergence form:
$$ \partial _{\alpha }g^{\alpha \beta} \partial _{\beta }v = f, \qquad v[0] = (v_{0},v_{1}) . $$
(4.3)
The linear flow in non-divergence form:
$$ g^{\alpha \beta} \partial _{\alpha }\partial _{\beta }v = f, \qquad v[0] = (v_{0},v_{1}) . $$
(4.4)

Several comments are in order:

  • As written, the above evolutions are inhomogeneous. If \(f = 0\) then we will refer to them as the homogeneous flows.

  • In the context of this paper, we are primarily interested in the metric \({\hat{g}}\), in which case the equation (4.3) represents our main linearized flow, and (4.1) represents our main linear paradifferential flow. The metric \(g\) and the non-divergence form of the equations will serve to connect our results with the result of Smith-Tataru, which is used in our proofs.

  • One may also add a gradient potential in the equations above; with the gradient potential added there is no difference between the divergence and the non-divergence form of the equations. We omit it in this section, as it plays no role.

We will consider these evolutions in the inhomogeneous Sobolev spaces \(\mathcal {H}^{s}\). In order to do this uniformly, we will assume that \(|I| \leq 1\); otherwise homogeneous spaces would be more appropriate. The exponent \(s\) will be an arbitrary real number in the case of the paradifferential flows, but will have a restricted range otherwise.

4.2 Energy estimates and well-posedness for the homogeneous problem

Here we review some relatively standard definitions and facts about local well-posedness.

Definition 4.1

For any of the above flows in the homogeneous form, we say that they are (forward) well-posed in \(\mathcal {H}^{s}\) in the time interval \(I=[0,T]\) if for each initial data \(u[0] \in \mathcal {H}^{s}\) there exists a unique solution \(u\) with the property that

$$ u[\,\cdot \,] \in C(I;\mathcal {H}^{s}). $$

This corresponds to a linear estimate of the form

$$ \| v[\, \cdot \, ] \|_{L^{\infty}(I;\mathcal {H}^{s})} \lesssim \|v[0] \|_{\mathcal {H}^{s}}. $$
(4.5)

Sometimes one establishes additional bounds for the solution (e.g. Strichartz estimates), and these are then added to the class of solutions for which uniqueness is established. We will comment on this where needed. If no such assumption is used, we call this unconditional uniqueness.

For completeness and reference, we now state without proof a classical well-posedness result:

Theorem 4.2

Assume that \(\partial g \in L^{1}(I;L^{\infty})\). Then

a) The paradifferential flows (4.1) and (4.2) are well-posed in \(\mathcal {H}^{s}\) for all real \(s\).

b) The divergence form evolution (4.3) is well-posed in \(\mathcal {H}^{s}\) for \(s \in [0,1]\), and the non-divergence form evolution (4.4) is well-posed in \(\mathcal {H}^{s}\) for \(s \in [1,2]\).

We remark that the metrics \(g\) associated with the solutions of Smith-Tataru satisfy the above hypothesis, but the ones associated to the solutions in our paper do not.

A slightly stronger form of well-posedness is to assert the existence of a suitable (time dependent) energy functional \(E^{s}\) in \(\mathcal {H}^{s}\):

Definition 4.3

An energy functional for either of the above problems in \(\mathcal {H}^{s}\) is a bounded quadratic form in \(\mathcal {H}^{s}\) that has the following two properties:

  a) Coercivity,

    $$ E^{s}(v[t]) \approx \| v[t] \|_{\mathcal {H}^{s}}^{2} . $$
    (4.6)
  b) Bounded growth for solutions \(v\) to the homogeneous equation,

    $$ \frac{d}{dt}E^{s}(v[t]) \lesssim B(t) \| v[t] \|_{\mathcal {H}^{s}}^{2}, $$
    (4.7)

    where \(B \in L^{1}\) depends only on \(g\).

Later we will also interpret \(E^{s}\) as a symmetric bilinear form in \(\mathcal {H}^{s}\); by polarization, such an interpretation is unique.

We remark that, in the context of Theorem 4.2, where \(\partial g \in L^{1} L^{\infty}\), an energy functional \(E^{1}\) corresponding to \(s = 1\) is classically obtained by multiplying the equation with a suitable smooth time-like vector field and integrating by parts; we refer the reader to Section 7.2.1 where this procedure is described in greater detail. Then for \(s \neq 1\) one simply defines

$$ E^{s}(v[0]) = E^{1}(\langle D_{x} \rangle ^{s-1} v[0]), $$

and the corresponding control parameter \(B\) may be taken as

$$ B(t) = \| \partial g(t)\|_{L^{\infty}}. $$

4.3 The wave equation as a system and the inhomogeneous problem

Switching now to the associated inhomogeneous flows, the classical set-up is to take a source term \(f \in L^{1} H^{s-1}\), and then look for solutions \(v\) in \(C(I;\mathcal {H}^{s})\) as above. This is commonly done using the Duhamel principle, which is most readily applied by rewriting the wave equation as a system. We next describe this process.

A common choice is to write the system for the pair of variables \((v,\partial _{t} v)\). However, for us it will be more convenient to make a slightly different linear transformation, and use instead the pair

$$ \mathbf{v}(t) := \begin{pmatrix}v(t)\\g^{0\alpha }\partial _{\alpha }v(t)\end{pmatrix}:= Q\begin{pmatrix}v \\ \partial _{t} v\end{pmatrix}, \qquad Q = \begin{pmatrix}1 & 0 \\ g^{0j} \partial _{j} & g^{00} \end{pmatrix}$$
(4.8)

for (4.3) and (4.4), with products replaced by paraproducts in the case of the equation (4.1) or (4.2). For later use, we record the inverse of \(Q\); this is either

$$ Q^{-1} = \begin{pmatrix} 1 & 0 \\ -(g^{00})^{-1} g^{0j} \partial _{j} & (g^{00})^{-1}\end{pmatrix}, $$
(4.9)

or its version with products replaced by paraproducts, as needed.
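As a quick sanity check on the formula (4.9), one may verify the \(2 \times 2\) matrix algebra with exact arithmetic, treating \(g^{0j} \partial _{j}\) and \(g^{00}\) as single symbols \(a\) and \(b\); the cancellations involve only compositions of identical operators, so no commutation assumptions are used. The following minimal sketch (all names are ours, chosen for illustration) substitutes sample rational values:

```python
from fractions import Fraction

def matmul(A, B):
    # 2x2 matrix multiplication
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# sample exact values standing in for a = g^{0j} d_j and b = g^{00}
a, b = Fraction(3, 2), Fraction(-7, 5)

Q    = [[Fraction(1), Fraction(0)], [a, b]]           # Q as in (4.8)
Qinv = [[Fraction(1), Fraction(0)], [-a / b, 1 / b]]  # claimed inverse (4.9)

assert matmul(Qinv, Q) == [[1, 0], [0, 1]]
assert matmul(Q, Qinv) == [[1, 0], [0, 1]]
```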

The system for \(\mathbf{v}\) will have the form

$$ \frac{d}{dt} \mathbf{v}(t) = {\mathcal {L}}\mathbf{v}(t), $$
(4.10)

with the appropriate choice for the matrix operator ℒ. For instance in the case of the homogeneous equation (4.3) we have

$$ {\mathcal {L}}= \begin{pmatrix} - (g^{00})^{-1} g^{0j} \partial _{j} & (g^{00})^{-1} \\ - \partial _{i} g^{ij} \partial _{j} + \partial _{i} g^{i0} (g^{00})^{-1} g^{0j} \partial _{j} & -\partial _{j} (g^{00})^{-1} g^{0j}\end{pmatrix}, $$
(4.11)

which has the antisymmetry property

$$ {\mathcal {L}}^{*} = - J {\mathcal {L}}J^{-1}, \qquad J = \begin{pmatrix}0 & 1 \\ -1 & 0\end{pmatrix}. $$
(4.12)

A similar property holds in the non-divergence case, but only for the principal part.
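For the reader's convenience, with \(J\) as above, (4.12) amounts to the entrywise conditions \({\mathcal {L}}_{12}^{*} = {\mathcal {L}}_{12}\), \({\mathcal {L}}_{21}^{*} = {\mathcal {L}}_{21}\) and \({\mathcal {L}}_{11}^{*} = - {\mathcal {L}}_{22}\), which may be read off from (4.11): \({\mathcal {L}}_{12} = (g^{00})^{-1}\) is multiplication by a real function, hence self-adjoint; \({\mathcal {L}}_{21}\) is in divergence form with coefficients symmetric in the indices, hence also self-adjoint; and integrating by parts,

$$ {\mathcal {L}}_{11}^{*} = \left ( - (g^{00})^{-1} g^{0j} \partial _{j} \right )^{*} = \partial _{j} (g^{00})^{-1} g^{0j} = - {\mathcal {L}}_{22}. $$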

We will always work in settings where \(Q\) is bounded and invertible in \(\mathcal {H}^{s}\). This is nearly automatic in the paradifferential case; there we only need to make sure that the operator \(T_{g^{00}}\) is invertible. In the differential case we will have to ask that multiplication by \(g\) and by \((g^{00})^{-1}\) are bounded in \(H^{s-1}\). In such settings, \(\mathcal {H}^{s}\) well-posedness for our original wave equation and for the associated system are equivalent. If a good energy functional \(E^{s}\) exists for the wave equation, then we may define an associated energy functional for the system by setting

$$ \mathbf{E}^{s}(\mathbf{v}(t)) := E^{s}(Q^{-1} \mathbf{v}(t)). $$
(4.13)

Then the properties (4.6) and (4.7) directly transfer to the homogeneous system (4.10).

If our system is (forward) well-posed in \(\mathcal {H}^{s}\), then solving it generates a (forward) evolution operator \(S(t,s)\) which is bounded in \(\mathcal {H}^{s}\) and maps the data at time \(s\) to the solution at time \(t\),

$$ S(t,s) \mathbf{v}(s) = \mathbf{v}(t). $$

For the system it is easy to consider the inhomogeneous version

$$ \frac{d}{dt} \mathbf{v}(t) = {\mathcal {L}}\mathbf{v}(t) + \mathbf{f}(t). $$
(4.14)

If \(\mathbf{f}\in L^{1} \mathcal {H}^{s}\) then the solution to (4.14) is given by Duhamel’s formula,

$$ \mathbf{v}(t) = S(t,0) \mathbf{v}(0) + \int _{0}^{t} S(t,s) \mathbf{f}(s) \, ds, $$
(4.15)

and satisfies the bound

$$ \| \mathbf{v}\|_{L^{\infty }\mathcal {H}^{s}} \lesssim \| \mathbf{v}(0) \|_{\mathcal {H}^{s}} + \| \mathbf{f}\|_{L^{1} \mathcal {H}^{s}}. $$
(4.16)
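The formula (4.15) and the bound (4.16) can be illustrated in the simplest scalar, constant-coefficient model, where the evolution operator is \(S(t,s) = e^{L(t-s)}\); this is only a toy check, and all names below are ours:

```python
import math

# scalar model of (4.14): v'(t) = L v(t) + f(t), with propagator S(t,s) = exp(L(t-s))
L_coef, v0 = -1.0, 2.0
f = lambda s: 1.0

def duhamel(t, n=10_000):
    # v(t) = S(t,0) v(0) + \int_0^t S(t,s) f(s) ds  (midpoint rule for the integral)
    h = t / n
    integral = h * sum(math.exp(L_coef * (t - (i + 0.5) * h)) * f((i + 0.5) * h)
                       for i in range(n))
    return math.exp(L_coef * t) * v0 + integral

# the exact solution of v' = -v + 1, v(0) = 2, is v(t) = 1 + e^{-t}
t = 1.5
assert abs(duhamel(t) - (1 + math.exp(-t))) < 1e-6
```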

If we have a good energy \(\mathbf{E}^{s}\) for the homogeneous system, then Duhamel’s formula easily allows us to obtain the corresponding energy estimate for the inhomogeneous one, namely

$$ \frac{d}{dt}\mathbf{E}^{s}(\mathbf{v}(t)) \lesssim \mathbf{E}^{s}( \mathbf{v}(t), \mathbf{f}(t) )+ B(t) \| \mathbf{v}(t) \|_{\mathcal {H}^{s}}^{2} , $$
(4.17)

where the first term on the right arises due to the fact that the energy is quadratic in \(\mathbf{v}(t)\).

Now we are ready to return to our original set of equations, add the source term \(f\) and reinterpret the above consequences of Duhamel’s formula there. As in the homogeneous case, we define \(\mathbf{v}(t) := Q v[t]\). Then adding the source term \(f\) in the original equation is equivalent to adding a source term \(\mathbf{f}\) in the above system. Indeed, it is readily seen that for all our four equations, \(\mathbf{f}\) is given by

$$ \mathbf{f}(t) = \begin{pmatrix}0 \\ f(t)\end{pmatrix}. $$
(4.18)

To complete the correspondence, we note that for such \(\mathbf{f}\) we have

$$ Q^{-1} \mathbf{f}= ({g^{00}})^{-1} \mathbf{f}(t). $$

Then we immediately arrive at the following result:

Theorem 4.4

a) Assume that either of the homogeneous paradifferential flows (4.1) or (4.2) is well-posed in \(\mathcal {H}^{s}\). Then the associated inhomogeneous flow is well-posed in \(\mathcal {H}^{s}\) for \(f \in L^{1} H^{s-1}\), and the following estimate holds:

$$ \| v[\cdot ] \|_{L^{\infty}(I;\mathcal {H}^{s})} \lesssim \|v[0]\|_{ \mathcal {H}^{s}} + \| f \|_{L^{1} H^{s-1}}. $$
(4.19)

In addition, if an energy functional \(E^{s}\) in \(\mathcal {H}^{s}\) exists, then

$$ \frac{d}{dt}E^{s}(v[t]) \lesssim E^{s}(v[t], (T_{g^{00}})^{-1} \mathbf{f}(t) )+ B(t) \| v[t] \|_{\mathcal {H}^{s}}^{2}. $$
(4.20)

b) The same holds for the flows (4.3) or (4.4) under the additional assumption that multiplication by \(g\) and \(({g^{00}})^{-1}\) is bounded in \(H^{s-1}\), with the paraproduct replaced by the corresponding product.

For our purposes in this paper, we will also need to allow for a larger class of source terms of the form

$$ f = \partial _{t} f_{1} +f_{2}. $$
(4.21)

To understand why this is natural, it is instructive to start from the inhomogeneous system (4.14) and argue backward.

Above, we have used the inhomogeneous system in the case where the first component of \(\mathbf{f}\) was zero. Now we will allow for both terms in \(\mathbf{f}\) to be nonzero, and derive the corresponding wave equation. For clarity we do this in the context of the equation (4.3), for which we have computed the corresponding operator ℒ in (4.11); however, a similar computation will apply in all four cases.

Starting with a solution \(\mathbf{v}= (\mathbf{v}_{1},\mathbf{v}_{2})^{\top}\) to the inhomogeneous problem (4.14), we begin by defining

$$ v(t) : = \mathbf{v}_{1}(t) $$
(4.22)

as our candidate for the wave equation solution. Then the first equation of the system reads

$$ \partial _{t} v = - (g^{00})^{-1} g^{0j} \partial _{j} v + (g^{00})^{-1} \mathbf{v}_{2} + \mathbf{f}_{1}, $$
(4.23)

or equivalently

$$ g^{0\alpha} \partial _{\alpha }v = \mathbf{v}_{2} + g^{00} \mathbf{f}_{1}. $$

Differentiating this with respect to time we obtain

$$ \begin{aligned} \partial _{\beta }g^{\beta \alpha} \partial _{\alpha }v = & \ \partial _{j} g^{j\alpha} \partial _{\alpha }v + \partial _{t} \mathbf{v}_{2} + \partial _{t} g^{00} \mathbf{f}_{1} \\ = & \ \partial _{j} g^{jk} \partial _{k} v + \partial _{j} g^{j0} \partial _{t} v + \partial _{t} \mathbf{v}_{2} + \partial _{t} g^{00} \mathbf{f}_{1}. \end{aligned} $$

Finally we substitute \(\partial _{t} v\) from (4.23) and \(\partial _{t} \mathbf{v}_{2}\) from the second equation of the system. We already know the right hand side should vanish if \(\mathbf{f}=0\), so it suffices to track the \(\mathbf{f}\) terms. Then we easily obtain the desired equation for \(v\):

$$ \partial _{\beta }g^{\beta \alpha} \partial _{\alpha }v = \partial _{ \alpha }g^{\alpha 0} \mathbf{f}_{1} + \mathbf{f}_{2}. $$
(4.24)

Comparing this with (4.21), we obtain the correspondence between the source terms for the wave equation and the system:

$$ f_{1} = g^{00} \mathbf{f}_{1}, \qquad f_{2} = \partial _{k} g^{k0} \mathbf{f}_{1} + \mathbf{f}_{2}. $$
(4.25)

We also record here the correspondence between the solutions, in the form

$$ \mathbf{v}_{1} = v, \qquad \mathbf{v}_{2} = g^{0\alpha} \partial _{ \alpha }v - g^{00}\mathbf{f}_{1}, $$
(4.26)

noting that, unlike (4.8), this correspondence is no longer homogeneous.

The last step in our analysis is to reinterpret the bounds (4.16) and (4.17) in terms of \(v\) and \(f\). To do this we make the assumption that multiplication by \(g\) and \((g^{00})^{-1}\) is bounded in both \(H^{s}\) and \(H^{s-1}\). Then from (4.16) we get

$$ \begin{aligned} \|v[\cdot ]\|_{L^{\infty }\mathcal {H}^{s}} \lesssim & \ \| \mathbf{v} \|_{L^{\infty }\mathcal {H}^{s}} + \| \mathbf{f}_{1}\|_{L^{\infty }H^{s-1}} \\ \lesssim & \ \|\mathbf{v}(0)\|_{\mathcal {H}^{s}} + \| \mathbf{f}\|_{L^{1} \mathcal {H}^{s}} + \| \mathbf{f}_{1}\|_{L^{\infty }H^{s-1}} \\ \lesssim & \ \|v[0]\|_{\mathcal {H}^{s}} + \|f_{1}\|_{L^{1} H^{s} \cap L^{ \infty }H^{s-1}} + \| f_{2}\|_{L^{1} H^{s-1}}. \end{aligned} $$

Similarly, from (4.17) and (4.13) we obtain the energy bound

$$ \frac{d}{dt}E^{s}(Q^{-1}\mathbf{v}(t)) \lesssim E^{s}(Q^{-1} \mathbf{v}(t), Q^{-1}\mathbf{f}(t) )+ B(t) \| \mathbf{v}(t) \|_{ \mathcal {H}^{s}}^{2}. $$
(4.27)

Here we use (4.9) and (4.26) to compute

$$ Q^{-1}\mathbf{v}(t) = v[t] - \begin{pmatrix}0 \\ \mathbf{f}_{1} \end{pmatrix} = v[t] - \begin{pmatrix}0 \\ (g^{00})^{-1} f_{1} \end{pmatrix} := v[t] - \tilde{v}[t], $$

respectively, using also (4.25),

$$ \begin{aligned} \tilde{f}[t]: = Q^{-1}\mathbf{f}(t) = & \ Q^{-1} \begin{pmatrix} (g^{00})^{-1} f_{1} \\ f_{2} - \partial _{k} g^{k0} (g^{00})^{-1} f_{1}\end{pmatrix}\\ = & \ \begin{pmatrix} (g^{00})^{-1} f_{1} \\ (g^{00})^{-1}( f_{2} - \partial _{k} g^{k0} (g^{00})^{-1} f_{1} - g^{0k} \partial _{k} (g^{00})^{-1} f_{1}) \end{pmatrix}. \end{aligned} $$
(4.28)

Thus we obtain the following natural extension of Theorem 4.4 above:

Theorem 4.5

a) Assume that the homogeneous evolution (4.4) or (4.3) is well-posed in \(\mathcal {H}^{s}\), and that multiplication by \(g\) and \((g^{00})^{-1}\) is bounded in \(H^{s}\) and in \(H^{s-1}\). Consider either of the two evolutions with a source term \(f\) of the form

$$ f = \partial _{t} f_{1}+ f_{2}, \qquad f_{1} \in L^{1} H^{s} \cap C H^{s-1}, \qquad f_{2} \in L^{1} H^{s-1}. $$

Then a unique solution \(v \in C(I, \mathcal {H}^{s})\) exists. If in addition the homogeneous problem admits an energy functional \(E^{s}\) as in Definition 4.3, then we have the energy estimate

$$ \frac{d}{dt}E^{s}(v[t]-\tilde{v}[t]) \lesssim E^{s}(v[t]-\tilde{v}[t], \tilde{f}[t] )+ B(t) \| v[t]-\tilde{v}[t] \|_{\mathcal {H}^{s}}^{2} $$
(4.29)

with \(\tilde{v}\) and \(\tilde{f}\) defined above and \(B\) as in (4.7).

b) The same result applies for the paradifferential equations (4.1), respectively (4.2), where all instances of \(g\) above are replaced by the corresponding paraproduct operators \(T_{g}\).

We emphasize here the somewhat unusual function space for \(f_{1}\), in an intersection of two spaces. This reflects the fact that \(f_{1}\) has a dual role, both as a source term and as a velocity correction.

We remark that in the situations where we apply this result, the mapping properties for \(g\) and \((g^{00})^{-1}\) will be fairly straightforward to verify. In the paradifferential case, for instance, the continuity of \(g\) will suffice.

4.4 A duality argument

Duality plays an important role in many estimates for evolution equations, and we will use duality considerations for several arguments in this paper. We restrict the discussion below to the problems written in divergence form, as this is what we will use later in the paper. However, similar versions may be formulated in the nondivergence case.

At heart, this is based on the following identity, written here for the operator \(\partial _{\alpha }g^{\alpha \beta} \partial _{\beta}\):

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{\alpha }g^{ \alpha \beta}\partial _{\beta }v \cdot w - v \cdot \partial _{\alpha }g^{ \alpha \beta} \partial _{\beta }w\, dx dt = \left .\int g^{0\alpha} \partial _{\alpha }v \cdot w - v \cdot g^{0\alpha} \partial _{\alpha }w \, dx \right |_{0}^{T}. $$
(4.30)

This holds for any test functions \(v\) and \(w\). The integral on the right can be viewed as a duality relation between \(v[t]\) and \(w[t]\),

$$ B(v[t],w[t]) = \int g^{0\alpha} \partial _{\alpha }v \cdot w - v \cdot g^{0\alpha} \partial _{\alpha }w\, dx. $$

Precisely, assuming that \(g:H^{s-1} \to H^{s-1}\) as a multiplication operator, and that \(g^{00}\) is invertible, this expression has the following two properties

  (1) Boundedness,

    $$ B: \mathcal {H}^{s} \times \mathcal {H}^{1-s} \to {\mathbb{R}}. $$

  (2) Coercivity,

    $$ \sup _{\| w\|_{\mathcal {H}^{1-s}} \leq 1} B(v,w) \approx \| v \|_{ \mathcal {H}^{s}} . $$
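We note for the reader's convenience that (4.30) is the space-time integral of a pointwise divergence identity: expanding the divergence,

$$ \partial _{\alpha }\left ( g^{\alpha \beta} ( \partial _{\beta }v \cdot w - v \cdot \partial _{\beta }w ) \right ) = \partial _{\alpha }g^{ \alpha \beta} \partial _{\beta }v \cdot w - v \cdot \partial _{\alpha }g^{ \alpha \beta} \partial _{\beta }w + g^{\alpha \beta} \left ( \partial _{ \beta }v \, \partial _{\alpha }w - \partial _{\alpha }v \, \partial _{ \beta }w \right ), $$

where the last term vanishes by the symmetry of \(g^{\alpha \beta}\). Integrating over \([0,T] \times {\mathbb{R}}^{n}\), the spatial components of the divergence contribute nothing for Schwartz functions \(v\), \(w\), while the \(\alpha = 0\) component produces exactly the boundary terms in (4.30).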

A standard consequence of this relation is the following property:

Proposition 4.6

The evolutions (4.3), respectively (4.1), are forward well-posed in \(\mathcal {H}^{s}\) iff they are backward well-posed in \(\mathcal {H}^{1-s}\).

We remark that in the context of this paper forward and backward well-posedness are almost identical, so for us this property says that well-posedness in \(\mathcal {H}^{s}\) and \(\mathcal {H}^{1-s}\) are equivalent.

The above proposition may be equivalently reformulated as the corresponding result for the system (4.10). It will be more convenient to view it in this context. To do this, we reinterpret the above duality in terms of the associated system (4.14). In view of the antisymmetry property (4.12), we have the relation

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}(\partial _{t} - { \mathcal {L}}) \mathbf{v}\cdot J \mathbf{w}- J \mathbf{v}\cdot ( \partial _{t} - {\mathcal {L}}) \mathbf{w}\, dx dt = \left .\int \mathbf{v}\cdot J \mathbf{w}\, dx\ \right |_{0}^{T}, $$
(4.31)

where the corresponding duality relation is

$$ {\mathbf {B}}(\mathbf{v},\mathbf{w}) = \int \mathbf{v}\cdot J \mathbf{w}\, dx, $$
(4.32)

which provides the duality between \(\mathcal {H}^{s}\) and \(\mathcal {H}^{1-s}\). Incidentally, a consequence of (4.31) is the duality relation

$$ S(t,s) = S(s,t)^{\ast }, $$

where the duality between \(\mathcal {H}^{s}\) and \(\mathcal {H}^{1-s}\) is the one given by the bilinear form \({\mathbf {B}}\) above. This can be used to construct the backward evolution in \(\mathcal {H}^{1-s}\) given the forward evolution in \(\mathcal {H}^{s}\), and vice-versa. The full equivalence argument is standard, and is omitted.
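The key computation is short: for homogeneous solutions, the left hand side of (4.31) vanishes, so \(t \mapsto {\mathbf {B}}(\mathbf{v}(t),\mathbf{w}(t))\) is conserved; writing \(\mathbf{v}(t) = S(t,s) \mathbf{v}(s)\) and \(\mathbf{w}(s) = S(s,t) \mathbf{w}(t)\), the conservation identity reads

$$ {\mathbf {B}}(S(t,s) \mathbf{v}(s), \mathbf{w}(t)) = {\mathbf {B}}( \mathbf{v}(s), S(s,t) \mathbf{w}(t)), $$

which is precisely the assertion \(S(t,s) = S(s,t)^{\ast}\) relative to the pairing \({\mathbf {B}}\).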

4.5 Strichartz estimates

Here we discuss several versions of Strichartz estimates, as well as the connection between them.

4.5.1 Estimates for homogeneous equations

In the context of this paper, these have the form

$$ \| v \|_{S^{r}} + \| \partial _{t} v\|_{S^{r-1}} \lesssim \| v[0]\|_{\mathcal {H}^{r}}, $$
(4.33)

where for the Strichartz space \(S\) we will consider two different choices:

  1. i)

    Almost lossless estimates, akin to those established in Smith-Tataru [38]. The corresponding Strichartz norms, denoted by \(S=S_{ST}\) are defined as

    $$ \begin{aligned} &\| v\|_{S_{ST}^{r}} = \| v \|_{L^{\infty }H^{r}} + \|\langle D_{x} \rangle ^{r-\frac{3}{4}-\delta} v \|_{L^{4} L^{\infty}}, \qquad n = 2, \\ &\| v\|_{S_{ST}^{r}} =\| v \|_{L^{\infty }H^{r}} + \|\langle D_{x} \rangle ^{r-\frac{n-1}{2}-\delta} v \|_{L^{2} L^{\infty}}, \quad \, n \geq 3. \end{aligned} $$
    (4.34)

    Here the loss of derivatives is measured by \(\delta > 0\), which is an arbitrarily small parameter.

  2. ii)

    Estimates with derivative losses, precisely the type that will be established in this paper. The corresponding Strichartz norms, denoted by \(S=S_{AIT}\) are defined as

    $$ \begin{aligned} &\| v\|_{S_{AIT}^{r}} = \| v \|_{L^{\infty }H^{r}} + \|\langle D_{x} \rangle ^{r-\frac{3}{4} - \frac{1}{8}-\delta} v \|_{L^{4} L^{\infty}}, \qquad n = 2, \\ &\| v\|_{S_{AIT}^{r}} =\| v \|_{L^{\infty }H^{r}} + \|\langle D_{x} \rangle ^{r-\frac{n-1}{2}-\frac{1}{4}-\delta} v \|_{L^{2} L^{\infty}}, \quad \, n \geq 3. \end{aligned} $$
    (4.35)

    Here \(\delta > 0\) is again an arbitrarily small parameter, but we allow for an additional loss of derivatives in the endpoint (Pecher) estimate, namely \(1/8\) derivatives in two space dimensions and \(1/4\) in higher dimensions.

These estimates can be applied to any of the four equations discussed in this section. There are also appropriate counterparts for the corresponding system (4.10), which have the form

$$ \| \mathbf{v}_{1} \|_{S^{r}} + \| \mathbf{v}_{2}\|_{S^{r-1}} \lesssim \| \mathbf{v}[0]\|_{\mathcal {H}^{r}}, \qquad S \in \{ S_{ST},S_{AIT} \} . $$
(4.36)

Under very mild assumptions on \(g\), these are equivalent to the ones for the corresponding wave equation:

Proposition 4.7

The Strichartz estimates (4.33) for the homogeneous wave equation are equivalent to the Strichartz estimates (4.36) for the associated system.

We also remark on a very mild extension of the estimate (4.33) to the inhomogeneous case. Precisely, if (4.33) holds then we also have the inhomogeneous bound

$$ \| v \|_{S^{r}} + \| \partial _{t} v\|_{S^{r-1}} \lesssim \| v[0]\|_{ \mathcal {H}^{r}} + \|f\|_{L^{1} H^{r-1}}. $$
(4.37)

This follows in a straightforward manner by the Duhamel formula, see the discussion in Section 4.3.

We conclude the discussion of the Strichartz estimates for the homogeneous equation with a simple but important case, which will be useful for us in the sequel, and applies in particular to the solutions in [38].

Proposition 4.8

Assume that \(\partial g \in L^{1}L^{\infty}\) and that the Strichartz estimates for the homogeneous equation (4.4) hold in \(\mathcal {H}^{1}\). Then the Strichartz estimates for the homogeneous equation hold in \(\mathcal {H}^{r}\) for all \(r \in {\mathbb{R}}\) for both paradifferential flows (4.1) and (4.2).

We remark that the implicit constant in these Strichartz estimates depends on the implicit constant in the Strichartz estimate in the hypothesis and on the bound for \(\|\partial g\|_{L^{1} L^{\infty}}\). Later when we apply this result we will have uniform control over both, so we obtain uniform control over the \(\mathcal {H}^{r}\) Strichartz norm.

Proof

It will be easier to work with the inhomogeneous bound (4.37), as it is more stable with respect to perturbations. We divide the proof into several steps, all of which are relatively standard.

Step 1: We start with the case \(r=1\) with the additional assumption \(g^{00}= -1\). Then the paradifferential equation (4.2) can be seen as a perturbation of (4.4) with an \(L^{1} L^{2}\) source term. Hence the bound (4.37) for (4.4) implies the same bound for (4.2).

Step 2: Next, assuming still that \(g^{00}= -1\), we extend the bound (4.37) for (4.2) to all real Sobolev exponents \(r\) by conjugating by \(\langle D_{x} \rangle ^{\sigma}\) with \(\sigma = r-1\), where we can estimate perturbatively the commutator

$$ \| [T_{g^{\alpha \beta}},\langle D_{x} \rangle ^{\sigma}] \partial _{ \alpha }\partial _{\beta }\langle D_{x} \rangle ^{-\sigma} v\|_{L^{1} L^{2}} \lesssim \| \partial g\|_{L^{1} L^{\infty}} \| \partial v\|_{L^{ \infty }L^{2}}. $$
(4.38)

This is valid for all real \(\sigma \) and, since it involves paraproducts, can be thought of as a frequency-localized bound; it is a version of Lemma 2.3.

Step 3: Using a multiplication by \(T_{g^{00}}\), we reduce the problem with nonconstant \(g^{00}\) to the case when \(g^{00}= -1\). Here we apply Lemma 2.7 with \(\gamma _{1}+\gamma _{2} = 1\) for the composition of paraproducts, and then interpolation; this applies equally for all real \(s\). At the conclusion of this step, we have the bound (4.37) for (4.2) for all \(r\).

Step 4: We commute the paracoefficients \(T_{g^{\alpha \beta}}\) inside \(\partial _{\alpha}\) perturbatively, in order to obtain the bound (4.37) for (4.1) for all \(r\). □

4.5.2 Dual Strichartz estimates

Here one considers the corresponding inhomogeneous problems, with source terms in dual Strichartz spaces. The estimates have the form

$$ \| v [\cdot ]\|_{L^{\infty }\mathcal {H}^{r}} \lesssim \| v[0]\|_{ \mathcal {H}^{r}} + \| f \|_{(S^{1-r})'}, \qquad S \in \{ S_{ST},S_{AIT} \} . $$
(4.39)

Classically, these are obtained by duality from the homogeneous estimates, as follows:

Proposition 4.9

If the homogeneous estimates (4.33) hold in \(\mathcal {H}^{r}\) for the forward (backward) evolution then the dual estimates (4.39) hold in \(\mathcal {H}^{1-r}\) for the backward (forward) evolution.

However, one can do better than this by going instead through the system form of the equations (4.14). The dual estimates for (4.14) have the form

$$ \| \mathbf{v}\|_{L^{\infty }\mathcal {H}^{r}} \lesssim \| \mathbf{v}(0) \|_{\mathcal {H}^{r}} + \| \mathbf{f}_{1} \|_{(S^{-r})'} + \|\mathbf{f}_{2} \|_{(S^{-r+1})'} , \qquad S \in \{ S_{ST},S_{AIT}\} . $$
(4.40)

These are directly obtained from the homogeneous estimates for the system (4.10) via the duality (4.31):

Proposition 4.10

If the homogeneous estimates hold in \(\mathcal {H}^{r}\) for the forward (backward) evolution (4.10) then the dual estimates hold in \(\mathcal {H}^{1-r}\) for the backward (forward) evolution (4.14).

One can now further return to the original inhomogeneous equation with a source term as in (4.21), and use the correspondence (4.25) and (4.26), in order to transfer the dual bounds back. These dual estimates, which represent a generalization of (4.39), have the form

$$ \| v[\cdot ] \|_{L^{\infty }\mathcal {H}^{r}} \lesssim \| v[0]\|_{ \mathcal {H}^{r}} + \| f_{1} \|_{L^{\infty }H^{r-1} \cap (S^{-r})'} + \|f_{2}\|_{(S^{1-r})'}, \qquad S \in \{ S_{ST},S_{AIT}\} . $$
(4.41)

We obtain the following strengthening of Proposition 4.9:

Proposition 4.11

If the homogeneous estimates (4.33) hold in \(\mathcal {H}^{r}\) for the forward (backward) evolution then the dual estimates (4.41) hold in \(\mathcal {H}^{1-r}\) for the backward (forward) evolution.

4.5.3 Full (retarded) Strichartz estimates

Here we combine the homogeneous and dual Strichartz estimates in a single bound for the inhomogeneous problem. The classical form is

$$ \| v \|_{S^{r}} + \| \partial _{t} v\|_{S^{r-1}} \lesssim \| v[0]\|_{ \mathcal {H}^{r}} + \|f\|_{(S^{1-r})'}, \qquad S \in \{ S_{ST},S_{AIT} \} . $$
(4.42)

However, here we need to take the extra step where we allow source terms of the form \(f = \partial _{t} f_{1} + f_{2}\), and then the estimates have the form

$$ \| v \|_{S^{r}} + \| \partial _{t} v\|_{S^{r-1}} \lesssim \| v[0]\|_{ \mathcal {H}^{r}} + \|f_{1}\|_{S^{r-1}\cap (S^{-r})'} + \|f_{2}\|_{(S^{1-r})'}, \qquad S \in \{ S_{ST},S_{AIT}\} . $$
(4.43)

As we will see, this is closely related to the corresponding bound for the associated inhomogeneous system (4.14):

$$ \| \mathbf{v}_{1} \|_{S^{r}} + \|\mathbf{v}_{2}\|_{S^{r-1}} \lesssim \| \mathbf{v}(0)\|_{\mathcal {H}^{r}} + \| \mathbf{f}_{1} \|_{(S^{-r})'} + \|\mathbf{f}_{2} \|_{(S^{-r+1})'} , \qquad S \in \{ S_{ST},S_{AIT} \} . $$
(4.44)

Our main result here is as follows:

Theorem 4.12

Consider either of the equations (4.3) or (4.1). If the homogeneous problem is well-posed forward in \(\mathcal {H}^{r}\) and backward in \(\mathcal {H}^{1-r}\) and satisfies the homogeneous Strichartz estimates (4.33) in both cases, then the solutions to the associated forward inhomogeneous problem with source term \(f = \partial _{t} f_{1} + f_{2}\) satisfy the bounds in (4.43).

Proof

The proof consists of four steps:

Step 1: If the homogeneous problem is well-posed forward in \(\mathcal {H}^{r}\) and satisfies the homogeneous Strichartz estimates (4.36), then so does the corresponding system, see Proposition 4.7.

Step 2: If the homogeneous problem is well-posed backward in \(\mathcal {H}^{1-r}\) and satisfies the homogeneous Strichartz estimates, then so does the corresponding system. By duality, the inhomogeneous system is well-posed forward in \(\mathcal {H}^{r}\) and satisfies the dual Strichartz bounds (4.40).

Step 3: We represent the forward \(\mathcal {H}^{r}\) solution by the Duhamel formula

$$ \mathbf{v}(t) = S(t,0) \mathbf{v}(0) + \int _{0}^{t} S(t,s) \mathbf{f}(s) \, ds. $$

The first term represents the solution to the homogeneous equation, and is estimated by (4.36). For the second term we have two bounds at our disposal: the dual bound where we fix \(t\) and estimate the output in \(\mathcal {H}^{r}\) in terms of the input in the dual Strichartz space, and the homogeneous bound where we fix \(s\), set \(\mathbf{f}(s) \in \mathcal {H}^{r}\) and estimate the output as a function of \(t\) in the Strichartz space. Concatenating the two, we get the restricted bound

$$ \| \mathbf{v}_{1} \|_{S^{r}(J)} + \|\mathbf{v}_{2}\|_{S^{r-1}(J)} \lesssim \| \mathbf{f}_{1} \|_{(S^{-r})'(I)} + \|\mathbf{f}_{2} \|_{(S^{-r+1})'(I)} , \qquad S \in \{ S_{ST},S_{AIT}\}, $$
(4.45)

where the source \(\mathbf{f}\) is supported in an interval \(I\) and the output \(\mathbf{v}\) is measured in an interval \(J\) so that \(I\) precedes \(J\). In two dimensions we can now apply the Christ-Kiselev lemma [9] (or the \(U^{p}\)-\(V^{p}\) spaces, see [28]) to get the full estimate. In three and higher dimensions we have a slight problem, namely that neither method applies for bounds from \(L^{2}_{t}\) to \(L^{2}_{t}\). However, in our case this is not an issue, because our estimates allow for at least a loss of \(\delta \) derivatives. Then we can afford to interpolate between the two endpoints and use the Christ-Kiselev lemma for bounds from \(L^{2-}_{t}\) to \(L^{2+}_{t}\), and then return to the endpoint setting by Bernstein's inequality in space and Hölder's inequality in time, all at the expense of an arbitrarily small increase in the size \(\delta \) of the loss.

Step 4: We transfer the estimate (4.44) back to the original system via the correspondence (4.25), (4.26), in order to obtain (4.43). □

We conclude with a corollary of Theorem 4.12, which will be used later in the paper and follows by combining this result with Proposition 4.8:

Corollary 4.13

Assume that \(\partial g \in L^{1}L^{\infty}\) and that the Strichartz estimates for the homogeneous equation (4.4) hold in \(\mathcal {H}^{1}\). Then the full Strichartz estimates (4.43) hold in \(\mathcal {H}^{r}\) for all \(r \in {\mathbb{R}}\) for both paradifferential flows (4.1) and (4.2).

5 Control parameters and related bounds

5.1 Control parameters

Here we introduce our main control parameters associated to a solution \(u\) of the minimal surface equation, which serve to bound the growth of energy both for solutions to the minimal surface flow and for its linearization. We will use three such primary quantities, \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\) and ℬ, which are defined as Besov norms of the solution \(u\). Our notations here mirror similar notations in our earlier water wave paper [1].

We begin with \({\mathcal {A}}\), which is \(L^{\infty}\) based,

$$ {\mathcal {A}}= \sup _{t \in [0,T]} \sum _{k} \| P_{k} \partial u\|_{L^{ \infty}}. $$
(5.1)

We next define the slightly stronger variant \({\mathcal {A}^{\sharp }}\gtrsim {\mathcal {A}}\), still at the same scaling but \(L^{2n}\) based,

$$ {\mathcal {A}^{\sharp }}= \sup _{t \in [0,T]} \sum _{k} 2^{\frac{k}{2}} \| P_{k} \partial u\|_{L^{2n}}. $$
(5.2)

Here the choice of the exponent \(2n\) is in no way essential, though it does provide some minor simplifications in one or two places.

Finally we define the time dependent ℬ control parameter which is again \(L^{\infty}\) based:

$$ {\mathcal {B}}(t) = \left ( \sum _{k} 2^{k} \| P_{k} \partial u\|_{L^{ \infty}}^{2} \right )^{\frac{1}{2}}. $$
(5.3)
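As a concrete illustration of these definitions, the norms (5.1) and (5.3) at a fixed time can be computed for a sampled, periodic function, with the smooth projections \(P_{k}\) replaced by sharp dyadic Fourier cutoffs. This is only a numerical sketch, not the paper's construction, and all function and variable names below are ours:

```python
import numpy as np

def dyadic_pieces(f):
    # sharp Littlewood-Paley pieces P_k f on a periodic grid: P_0 collects
    # frequencies |xi| <= 1, and P_k (k >= 1) the annulus 2^{k-1} < |xi| <= 2^k
    N = len(f)
    fhat = np.fft.fft(f)
    xi = np.fft.fftfreq(N, d=1.0 / N)  # integer frequencies
    pieces, k = [], 0
    while 2 ** (k - 1) < N // 2:
        if k == 0:
            mask = np.abs(xi) <= 1
        else:
            mask = (np.abs(xi) > 2 ** (k - 1)) & (np.abs(xi) <= 2 ** k)
        pieces.append((k, np.real(np.fft.ifft(fhat * mask))))
        k += 1
    return pieces

def control_norms(du):
    # discrete analogues of (5.1) and (5.3) at a fixed time, du ~ samples of du(t, .)
    pieces = dyadic_pieces(du)
    A = sum(np.max(np.abs(Pk)) for _, Pk in pieces)
    B = np.sqrt(sum(2.0 ** k * np.max(np.abs(Pk)) ** 2 for k, Pk in pieces))
    return A, B

# a single frequency-3 wave lands in the k = 2 annulus, so A ~ 1 and B ~ 2
du = np.cos(2 * np.pi * 3 * np.arange(64) / 64)
A, B = control_norms(du)
assert abs(A - 1.0) < 1e-8 and abs(B - 2.0) < 1e-8
```

The sharp cutoffs make the pieces sum back to the sampled function exactly, which is all that is needed for this illustration; the smooth projections of the paper differ from these only by harmless Schwartz tails.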

In a nutshell, the energy functionals we construct later in the paper will be shown to satisfy cubic balanced bounds of the form

$$ \frac{dE}{dt} \lesssim _{{\mathcal {A}^{\sharp }}} {\mathcal {B}}^{2} E, $$
(5.4)

which guarantee that energy bounds can be propagated for as long as \({\mathcal {A}^{\sharp }}\) remains finite and ℬ remains in \(L^{2}_{t}\). One should compare these bounds with the classical energy estimates, which have the form

$$ \frac{dE}{dt} \lesssim _{{\mathcal {A}}} \| \partial ^{2} u\|_{L^{ \infty}} E, $$
(5.5)

and which require an extra half derivative in the control parameter.
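To make the propagation claim explicit: by Gronwall's inequality, a bound of the form (5.4) yields

$$ E(t) \lesssim _{{\mathcal {A}^{\sharp }}} E(0) \exp \left ( C({ \mathcal {A}^{\sharp }}) \int _{0}^{t} {\mathcal {B}}^{2}(s) \, ds \right ), $$

so the energy remains comparable to its initial size exactly as long as \({\mathcal {B}}\) remains in \(L^{2}_{t}\), with no reference to \(\| \partial ^{2} u \|_{L^{\infty}}\).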

We continue with a few comments concerning our choice of control parameters:

  • Here \({\mathcal {A}}\) and \({\mathcal {A}^{\sharp }}\) are critical norms for \(u\), which may be described using the Besov notation as capturing the uniform norms in time

    $$ {\mathcal {A}}= \| \partial u\|_{L^{\infty}_{t} B^{0}_{\infty ,1}}, \qquad {\mathcal {A}^{\sharp }}= \| \partial u\|_{L^{\infty}_{t} B^{ \frac{1}{2}}_{2n,1}}. $$

    In a first approximation, the reader should think of \({\mathcal {A}}\) as simply capturing the \(L^{\infty}\) norm of \(\partial u\); the slightly stronger Besov norm above is needed for minor technical reasons, and allows us to work with scale invariant bounds. Often we will simply rely on the simpler \(L^{\infty}\)-bound, since

    $$ \| \partial u\|_{L^{\infty}} \lesssim {\mathcal {A}}\lesssim { \mathcal {A}^{\sharp }}. $$
    (5.6)
  • The control norm ℬ, taken at fixed time, is \(1/2\) derivative above scaling, and may also be described using the Besov notation as

    $$ {\mathcal {B}}(t) = \| \partial u(t)\|_{B^{\frac{1}{2}}_{\infty ,2}}. $$

    Again, in a first approximation one should simply think of it as \(\|\partial u\|_{BMO^{\frac{1}{2}}}\), which in effect suffices for most of the analysis. Indeed, we have

    $$ \| \partial u\|_{BMO^{\frac{1}{2}}} \lesssim {\mathcal {B}}. $$
    (5.7)
  • Given the choice of these control parameters, it is not difficult to see that our energy estimates of the form (5.4) are invariant with respect to scaling. This by itself does not mean much; even the classical energy estimates, of the form (5.5), are scale invariant, but much less useful for low regularity well-posedness. What is important here is that our energy estimates are cubic and balanced.

  • The fact that our control norms are based on uniform, rather than \(L^{2}\)-bounds, particularly at the level of ℬ, is also critical. This is what allows us to use Strichartz estimates to further improve the low regularity well-posedness threshold in our results.

  • Concerning the dependence of constants in our estimates on \({\mathcal {A}}\), \({\mathcal {A}^{\sharp}}\) and ℬ, we adopt a two-track system:

    • The dependence on ℬ is either linear or quadratic, and will always be explicitly stated in all estimates.

    • The dependence on \({\mathcal {A}}\) and \({\mathcal {A}^{\sharp}}\) is often nonlinear, in which case we use the notations \(\lesssim _{{\mathcal {A}}}\), \(\lesssim _{{\mathcal {A}^{\sharp}}}\). This dependence is less important, as beginning with Section 7 we will assume that \({\mathcal {A}^{\sharp}}\ll 1\), and drop it altogether except where the smallness is essential. But for clarity and also for later use, we do track this dependence in this and the next section.

  • In terms of using \({\mathcal {A}}\) versus \({\mathcal {A}^{\sharp}}\), we first note that ideally we would like to avoid \({\mathcal {A}^{\sharp}}\) altogether, and just use the weaker control norm \({\mathcal {A}}\). But this appears not to be possible, which is why \({\mathcal {A}^{\sharp}}\) was introduced. To streamline the analysis, in what follows we will simply think of the implicit dependence as being on \({\mathcal {A}^{\sharp}}\), which suffices for our final result. One may even take the more radical step of dropping \({\mathcal {A}}\) altogether; we decided against that, both for historical reasons and for ease of reference.

For bookkeeping reasons we will use a joint frequency envelope \(\{c_{k}\}_{k}\) for the dyadic components of each of \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\), and ℬ, so that

(i) \(\{c_{k}\}_{k}\) is normalized in \(\ell ^{2}\) and slowly varying,

$$ \sum c_{k}^{2} = 1; $$

(ii) we control the dyadic Littlewood-Paley pieces of \(\partial u\) as follows:

$$ \| P_{k} \partial u\|_{L^{\infty}} \lesssim {\mathcal {A}}c_{k}^{2}, \qquad 2^{\frac{k}{2}} \| P_{k} \partial u\|_{L^{2n}} \lesssim {\mathcal {A}^{ \sharp }}c_{k}^{2}, \qquad 2^{\frac{k}{2}} \| P_{k} \partial u\|_{L^{ \infty}} \lesssim {\mathcal {B}}c_{k}. $$
(5.8)

A priori, these frequency envelopes depend on time. However, at the conclusion of the paper, we will see that for our rough solutions they can be taken to be independent of time, essentially equal to appropriate \(L^{2}\)-type frequency envelopes for the initial data.
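As a concrete illustration (a standard example, not one singled out in the paper), a typical admissible envelope concentrated at a frequency \(k_{0}\) is

```latex
c_{k} = c_{\delta} \, 2^{-\delta |k - k_{0}|},
\qquad \text{with } c_{\delta} \text{ chosen so that } \sum _{k} c_{k}^{2} = 1,
```

which is slowly varying with rate \(\delta \), i.e. \(c_{j} \leq 2^{\delta |j-k|} c_{k}\); the dyadic summations in this section only use this property with \(\delta \) small.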

5.2 Related bounds

We will frequently need to use bounds that are similar to (5.8) in nonlinear expressions, so it is convenient to have a notation for the corresponding space:

Definition 5.1

The space \(\mathfrak{C}_{0}\) is the Banach space of all distributions \(v\) that satisfy the bounds

$$ \|v\|_{L^{\infty}} \leq C, \qquad 2^{\frac{k}{2}} \| P_{k} v\|_{L^{2n}} \leq C {\mathcal {A}^{\sharp }}c_{k}^{2}, \qquad 2^{\frac{k}{2}} \| P_{k} v\|_{L^{\infty}} \leq C {\mathcal {B}}c_{k}, $$
(5.9)

with the norm given by the best constant \(C\) in the above inequalities.

For this space we have the following algebra and Moser-type result:

Lemma 5.2

a) The space \(\mathfrak{C}_{0}\) is closed with respect to multiplication and para-multiplication. In particular \(\mathfrak{C}_{0}\) is an algebra.

b) Let \(F\) be a smooth function with \(F(0)=0\), and \(v \in \mathfrak{C}_{0}\). Then \(F(v) \in \mathfrak{C}_{0}\). In particular if \(\|v\|_{\mathfrak{C}_{0}} \lesssim 1\) then \(F(v)\) satisfies

$$ \| F(v) \|_{\mathfrak{C}_{0}} \lesssim _{{\mathcal {A}^{\sharp }}} \|v \|_{\mathfrak{C}_{0}}. $$
(5.10)

In particular the above result applies to the metrics \(g\), \({\tilde{g}}\) and \({\hat{g}}\), all of which are smooth functions of \(\partial u\), and thus belong to \(\mathfrak{C}_{0}\) modulo constants (which are simply the Minkowski metric).

Proof

a) We first estimate the \(\mathfrak{C}_{0}\) norm for the paraproduct \(T_{f} g\) for \(f,g \in \mathfrak{C}_{0}\). This is straightforward, using the \(L^{\infty}\) bound for \(f\), for all but the uniform bound in (5.9). For the uniform bound, we change the summation order in the Littlewood-Paley expansion to obtain

$$ \| T_{f} g\|_{L^{\infty}} \lesssim \sum _{k} \|P_{k} f \|_{L^{\infty}} \| P_{> k} g \|_{L^{\infty}} \lesssim \| f\|_{\mathfrak{C}_{0}} \|g\|_{L^{ \infty}}. $$

It now remains to estimate \(\Pi (f,g)\) in \(\mathfrak{C}_{0}\). The uniform bound is almost identical to the one above. For the \({\mathcal {A}^{\sharp }}\) norm we use Bernstein’s inequality

$$\begin{aligned} 2^{\frac{k}{2}} \| P_{k} \Pi (f,g) \|_{L^{2n}} \lesssim& \sum _{j \geq k} 2^{k} \| f_{j} g_{j} \|_{L^{n}} \\ \lesssim& \sum _{j \geq k} 2^{k} \| f_{j} \|_{L^{2n}} \|g_{j} \|_{L^{2n}} \\ \lesssim& \sum _{j \geq k} 2^{k-j} c_{j}^{2} {\mathcal {A}^{\sharp }}^{2} \|f\|_{\mathfrak{C}_{0}} \|g\|_{ \mathfrak{C}_{0}} , \end{aligned}$$

and now the \(j\) summation is straightforward.
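For completeness, this \(j\) summation uses only the slowly varying property of the envelope; assuming, as is standard, that this means \(c_{j} \leq 2^{\delta |j-k|} c_{k}\) for some small \(\delta < \frac{1}{2}\), one has

```latex
\sum _{j \geq k} 2^{k-j} c_{j}^{2}
\lesssim \sum _{j \geq k} 2^{k-j} \, 2^{2\delta (j-k)} c_{k}^{2}
= c_{k}^{2} \sum _{m \geq 0} 2^{-(1-2\delta) m}
\lesssim c_{k}^{2},
```

which is exactly the dyadic gain needed for the \({\mathcal {A}^{\sharp }}\) bound.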

For the ℬ norm, on the other hand, we estimate

$$\begin{aligned} \| P_{k} \Pi (f,g) \|_{L^{\infty}} \lesssim& \sum _{j \geq k} \| f_{j} g_{j} \|_{L^{\infty}} \lesssim \sum _{j \geq k} \| f_{j} \|_{L^{\infty}} \|g_{j} \|_{L^{\infty}} \\ \lesssim& \sum _{j \geq k} 2^{-\frac{j}{2}} c_{j} { \mathcal {A}^{\sharp }}{\mathcal {B}}\|f\|_{\mathfrak{C}_{0}} \|g\|_{ \mathfrak{C}_{0}} , \end{aligned}$$

and again the \(j\) summation is straightforward.

b) To prove the Moser inequality we use a continuous Littlewood-Paley decomposition, which leads to the expansion

$$ F(v) = F(v_{0}) + \int _{0}^{\infty }F'(v_{< j}) v_{j} \, dj. $$

To estimate \(P_{k} F(v)\) we consider several cases:

i) \(j = k+ O(1)\). Then \(c_{j} \approx c_{k}\), \(F'(v_{< j})\) is directly bounded in \(L^{\infty}\) and our bounds are straightforward.

ii) \(j < k-4\). Then we can insert an additional localization,

$$ P_{k}(F'(v_{< j}) v_{j}) = P_{k}(\tilde{P}_{k} F'(v_{< j}) v_{j}), $$

where we gain from the frequency difference

$$ \| P_{k} F'(v_{< j}) \|_{L^{\infty}} \lesssim 2^{-N(k-j)}, $$
(5.11)

which more than compensates for the difference (ratio) between \(c_{j}\) and \(c_{k}\).

iii) \(j > k+4\). In this case we reexpand \(F'(v_{< j})\) and write

$$ F'(v_{< j}) v_{j} = F'(v_{0}) v_{j} + \int _{0}^{\infty }F''(v_{< l}) v_{l} v_{j} \, dl . $$

We further separate into two cases:

(iii.1) \(l = j + O(1)\). Then we simply bound \(F''(v_{< l})\) in \(L^{\infty}\), and estimate first for the \({\mathcal {A}^{\sharp }}\) bound using Bernstein’s inequality

$$ 2^{\frac{k}{2}} \| P_{k} (F''(v_{< l}) v_{l} v_{j}) \|_{L^{2n}} \lesssim 2^{k} \| v_{l} v_{j} \|_{L^{n}} \lesssim 2^{k} \| v_{l} \|_{L^{2n}} \|v_{j} \|_{L^{2n}} \lesssim 2^{k-j} c_{j}^{2} {\mathcal {A}^{\sharp }}^{2}, $$

where the \(j\) and \(l\) integrations are trivial. Next we estimate for the ℬ-bound

$$ \| P_{k} (F''(v_{< l}) v_{l} v_{j}) \|_{L^{\infty}} \lesssim \|v_{l}\|_{L^{ \infty}} \| v_{j}\|_{L^{\infty}} \lesssim 2^{-\frac{j}{2}} { \mathcal {A}^{\sharp }}{\mathcal {B}}c_{j}, $$

again with easy \(j\) and \(l\) integrations.

(iii.2) \(l < j-4\). Then we can insert another frequency localization,

$$ P_{k} (F''(v_{< l}) v_{l} v_{j}) = P_{k} (\tilde{P}_{j} F''(v_{< l}) v_{l} v_{j}), $$

and repeat the computation in (b.ii) but using (5.11) to account for the difference between \(l\) and \(j\). □

In order to avoid tampering with causality, the Littlewood-Paley projections we use in this paper are purely spatial. This is more of a choice between different evils than a necessity; see for instance the alternate choice made in [38]. A substantial but worthwhile price to pay is that on occasion we will need to separately estimate double time derivatives, in a somewhat imperfect but sufficient fashion.

A good starting point in this direction is to think of bounds for second derivatives of our solution \(u\). If at least one of the derivatives is spatial, then this is straightforward:

$$ \| P_{< k} \partial _{x} \partial u \|_{L^{\infty}} \lesssim 2^{ \frac{k}{2}} {\mathcal {B}}c_{k}. $$
(5.12)
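For the reader's convenience, here is a sketch of why (5.12) holds, using the third bound in (5.8) together with the slowly varying property of the envelope (so that \(c_{j} \lesssim 2^{\delta (k-j)} c_{k}\) for \(j < k\), with \(\delta < \frac{1}{2}\)):

```latex
\| P_{<k} \partial _{x} \partial u \|_{L^{\infty}}
\le \sum _{j < k} \| P_{j} \partial _{x} \partial u \|_{L^{\infty}}
\lesssim \sum _{j < k} 2^{j} \cdot 2^{-\frac{j}{2}} {\mathcal {B}} c_{j}
\lesssim {\mathcal {B}} \sum _{j < k} 2^{\frac{j}{2}} \, 2^{\delta (k-j)} c_{k}
\lesssim 2^{\frac{k}{2}} {\mathcal {B}} c_{k} .
```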

However, matters become more complex if instead we look at the second time derivative of \(u\). The natural idea is to use the main equation (3.5) to estimate \(\partial _{t}^{2} u\), by writing it in terms of spatial derivatives,

$$ \partial _{t}^{2} u = - \sum _{(\alpha ,\beta ) \neq (0,0)} {\tilde{g}}^{ \alpha \beta} \partial _{\alpha }\partial _{\beta }u. $$

If one takes this view, the main difficulties we face are with the high-high interactions in this expression. But these high-high interactions have the redeeming feature that they are balanced, so they will often play a perturbative role. This leads us to define a corrected expression as follows:

Definition 5.3

“Good” second order derivatives

We denote by \(\widehat{\partial _{0} \partial _{0}} u\) or, for short, \(\hat{\partial}_{t}^{2} u\) the expression

$$ \hat{\partial}_{t}^{2} u = \partial _{t}^{2} u + \sum _{(\alpha ,\beta ) \neq (0,0)} \Pi ({\tilde{g}}^{\alpha \beta}, \partial _{\alpha }\partial _{\beta }u). $$
(5.13)

On the other hand if \((\alpha ,\beta ) \neq (0,0)\) then we define

$$ \widehat{\partial _{\alpha }\partial _{\beta}} u := {\partial _{\alpha }\partial _{\beta}} u. $$

With this notation, we have

Lemma 5.4

Assume that \(u\) solves the equation (3.5). Then for its second time derivative we have the decomposition

$$ \partial _{t}^{2} u = \hat{\partial}_{t}^{2} u + \pi _{2}(u), $$
(5.14)

where the two components satisfy the uniform bounds

$$ \| P_{< k} \hat{\partial}_{t}^{2} u \|_{L^{\infty}} \lesssim 2^{ \frac{k}{2}} {\mathcal {B}}c_{k}, \qquad \| P_{< k} \hat{\partial}_{t}^{2} u \|_{L^{\infty}} \lesssim 2^{k} {\mathcal {A}}c_{k}^{2},$$
(5.15)

respectively

$$ \| \pi _{2}(u) \|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}} { \mathcal {B}}^{2}, \qquad \| \pi _{2}(u) \|_{L^{n}} \lesssim _{{ \mathcal {A}^{\sharp }}} {\mathcal {A}^{\sharp }}^{2}. $$
(5.16)

One should compare this with the easier direct bound (5.12) for spatial derivatives; the good part \(\hat {\partial }_{t}^{2} u\) satisfies a similar bound, but the error \(\pi _{2}(u)\) does not. Later, when such expressions are involved, we will systematically peel off the error perturbatively, and always avoid differentiating it further.

Proof

The main ingredient here is the Littlewood-Paley decomposition. For expository simplicity we prove (5.15) at fixed frequency \(k\). Using the notation in (5.13) we can rewrite equation (3.19) as

$$ \hat{\partial}_{t}^{2} u + \sum _{(\alpha ,\beta ) \neq (0,0)} T_{{ \tilde{g}}^{\alpha \beta}}\partial _{\alpha}\partial _{\beta}u+T_{ \partial _{\alpha}\partial _{\beta}u}{\tilde{g}}^{\alpha \beta} =0. $$
(5.17)

To finish the proof we consider the expression above localized at frequency \(2^{k}\), and evaluated in the \(L^{\infty}\)-norm

$$ \begin{aligned} \Vert P_{k}\hat{\partial}_{t}^{2} u \Vert _{L^{\infty}} &\leq \sum _{(\alpha ,\beta ) \neq (0,0)} \Vert P_{< k} ({\tilde{g}}^{ \alpha \beta}) P_{k}( \partial _{\alpha}\partial _{\beta}u )\Vert _{L^{ \infty}}+ \Vert P_{< k}(\partial _{\alpha}\partial _{\beta}u) P_{k}({ \tilde{g}}^{\alpha \beta})\Vert _{L^{\infty}}. \end{aligned} $$
(5.18)

We bound each of the terms separately. For the second we use the fact that \({\tilde{g}}\) is bounded in \(L^{\infty}\), together with the third bound in (5.8), in order to get

$$ \Vert P_{< k}(\partial _{\alpha}\partial _{\beta}u) P_{k}({\tilde{g}}^{ \alpha \beta})\Vert _{L^{\infty}} \lesssim 2^{\frac{k}{2}}{\mathcal {B}}c_{k}, \qquad (\alpha ,\beta ) \neq (0,0). $$

For the first term we rely on the same procedure, which finishes the proof of the first bound in (5.15). The second bound in (5.15) has as its starting point the same decomposition (5.18), except that this time we want to bound the RHS terms using the control norm \({\mathcal {A}}\). Here we use the first bound in (5.8) and the algebra property of \(L^{\infty}\) to obtain

$$\begin{aligned} \Vert P_{< k}(\partial _{\alpha}\partial _{\beta}u) P_{k}({\tilde{g}}^{ \alpha \beta})\Vert _{L^{\infty}} \leq& \Vert P_{< k}(\partial _{\alpha} \partial _{\beta}u) \Vert _{L^{\infty}} \Vert P_{k}({\tilde{g}}^{ \alpha \beta})\Vert _{L^{\infty}} \\ \lesssim& 2^{k}{\mathcal {A}}c_{k}^{2}, \qquad (\alpha ,\beta ) \neq (0,0). \end{aligned}$$

The last bound to prove is (5.16), where, because the frequencies are balanced, we can easily even out the balance of derivatives and estimate each factor using the ℬ norm. Explicitly, \({\tilde{g}}^{\alpha \beta}\) is in \(\mathfrak{C}_{0} \) by Lemma 5.2, and hence we get that for indices \((\alpha , \beta )\neq (0, 0)\):

$$ \begin{aligned} \Vert \Pi ({\tilde{g}}^{\alpha \beta}, \partial _{\alpha }\partial _{ \beta }u)\Vert _{L^{\infty}} &\leq \sum _{k} \Vert P_{k}({\tilde{g}}^{ \alpha \beta}) P_{k}(\partial _{\alpha }\partial _{\beta }u) \Vert _{L^{ \infty}} \\ &\leq \sum _{k} 2^{-\frac{k}{2}}\Vert 2^{\frac{k}{2}}P_{k}({\tilde{g}}^{ \alpha \beta})\Vert _{L^{\infty}}2^{\frac{k}{2}}\Vert 2^{-\frac{k}{2}}P_{k}( \partial _{\alpha }\partial _{\beta }u)\Vert _{L^{\infty}} \\ &\lesssim _{{\mathcal {A}^{\sharp }}} {\mathcal {B}}^{2}, \end{aligned} $$

which is the first bound in (5.16). The second bound in (5.16) is similar, but replacing the \(L^{\infty}\) norms with \(L^{2n}\) norms. □

The above lemma motivates narrowing the space \(\mathfrak{C}_{0}\), in order to also include information about \(\partial _{t} v\). For later use, we also define two additional closely related spaces.

Definition 5.5

a) The space ℭ is the space of distributions \(v\) that satisfy (5.9) and, in addition, \(\partial _{t} v\) admits a decomposition \(\partial _{t} v = w_{1}+w_{2}\) so that

$$ \| P_{k} w_{1} \|_{L^{\infty}} \leq C 2^{\frac{k}{2}} {\mathcal {B}}c_{k}, \qquad \| w_{2} \|_{L^{\infty}} \leq C{\mathcal {B}}^{2}, $$
(5.19)

endowed with the norm defined as the best possible constant \(C\) in (5.9) and in the above inequality relative to all such possible decompositions.

b) The space \(\mathfrak{DC}\) consists of all functions \(f\) that admit a decomposition \(f = f_{1}+f_{2}\) so that

$$ \| P_{k} f_{1}\|_{L^{\infty}} \leq C 2^{\frac{k}{2}} {\mathcal {B}}c_{k}, \qquad \| f_{2}\|_{L^{\infty}} \leq C {\mathcal {B}}^{2}, $$
(5.20)

endowed with the norm defined as the best possible constant \(C\) in the above inequality relative to all such possible decompositions.

c) The space \(\partial _{x} \mathfrak{DC}\) consists of functions \(f\) that admit a decomposition \(f = f_{1}+f_{2}\) so that

$$ \| P_{k} f_{1}\|_{L^{\infty}} \leq C 2^{\frac{3k}{2}} {\mathcal {B}}c_{k}, \qquad \| P_{k} f_{2}\|_{L^{\infty}} \leq C 2^{k} {\mathcal {B}}^{2}, $$
(5.21)

endowed also with the corresponding norm.

We remark that, by definition, we have the simple inclusions

$$ \mathfrak{C}\subset \mathfrak{C}_{0}, \qquad \partial :\mathfrak{C} \to \mathfrak{DC}, \qquad \partial _{x}: \mathfrak{DC}\to \partial _{x} \mathfrak{DC}. $$
(5.22)

Based on what we have so far, we begin by identifying some elements of these spaces:

Lemma 5.6

We have

$$ \|\partial u\|_{\mathfrak{C}} \lesssim 1, \qquad \| \partial ^{2} u \|_{\mathfrak{DC}} \lesssim 1. $$
(5.23)

Proof

The bounds in (5.23) are trivial unless both derivatives are time derivatives, in which case they follow directly from the previous Lemma 5.4. □

The Moser estimates of Lemma 5.2 may be extended to this setting to include all smooth functions of \(\partial u\):

Lemma 5.7

a) We have the bilinear multiplicative relations

$$ \mathfrak{C}_{0} \cdot \mathfrak{DC}\to \mathfrak{DC}, \qquad T_{ \mathfrak{C}_{0}} \cdot \mathfrak{DC}\to \mathfrak{DC}, \qquad T_{ \mathfrak{DC}} \mathfrak{C}_{0} \to {\mathcal {A}^{\sharp }} \mathfrak{DC}, $$
(5.24)

as well as

$$ T_{\mathfrak{DC}} \mathfrak{C}_{0} \to {\mathcal {B}}^{2} L^{\infty}, \qquad \Pi (\mathfrak{DC},\mathfrak{C}_{0}) \to {\mathcal {B}}^{2} L^{ \infty}. $$
(5.25)

b) The space ℭ is closed under multiplication and para-multiplication; in particular it is an algebra.

c) Let \(F\) be a smooth function, and \(v \in \mathfrak{C}\). Then \(F(v) \in \mathfrak{C}\). In particular if \(\|v\|_{\mathfrak{C}} \lesssim 1\) and \(F(0) = 0\) then \(F(v)\) satisfies

$$ \| F(v) \|_{\mathfrak{C}} \lesssim _{{\mathcal {A}^{\sharp }}} \|v\|_{ \mathfrak{C}}. $$
(5.26)

d) In addition we also have the paralinearization error bounds

$$ \| P_{k} R(v)\|_{L^{n}} \lesssim _{{\mathcal {A}^{\sharp}}} 2^{-k} c_{k}^{2} {\mathcal {A}^{\sharp}}^{2}, \qquad \| \partial R(v)\|_{L^{\infty}} \lesssim _{{\mathcal {A}^{ \sharp }}} {\mathcal {B}}^{2}, $$
(5.27)

where \(R\) is as in Lemma 2.9, namely \(R(v) = F(v) - T_{F'(v)} v\).

Here part (a) is the main part, after which parts (b) and (c) become immediate improvements of Lemma 5.2. But the interesting new bound is the one in part (d), where, notably, we also bound the time derivative of \(R(v)\).

Proof

a) Let \(z \in \mathfrak{C}_{0}\) and \(w \in \mathfrak{DC}\) with the decomposition \(w=w_{1}+w_{2}\) as in (5.20). We skip the first bound in (5.24), as it is a consequence of the rest of the estimates in (5.24) and (5.25), and first consider the paraproduct \(T_{z} w\). We will bound the contributions of \(w_{1}\) and \(w_{2}\) in the same norms as \(w_{1}\), respectively \(w_{2}\). Precisely, we have

$$ \| P_{k} T_{z} w_{1} \|_{L^{\infty}} \lesssim \| z\|_{L^{\infty}} \| \tilde{P}_{k} w_{1}\|_{L^{\infty}} \lesssim \| z\|_{\mathfrak{C}_{0}} 2^{ \frac{k}{2}} {\mathcal {B}}c_{k} \| w \|_{\mathfrak{DC}} , $$

respectively

$$ \| T_{z} w_{2} \|_{L^{\infty}} \lesssim \sum _{k} \|z_{k}\|_{L^{ \infty}} \| P_{>k} w_{2}\|_{L^{\infty}} \lesssim {\mathcal {A}^{\sharp }} \| z\|_{\mathfrak{C}_{0}} \|w_{2}\|_{L^{\infty}}. $$

Next we consider \(T_{w} z\), where we have two choices. The first choice is to use only the \({\mathcal {A}^{\sharp }}\) component of the \(\mathfrak{C}_{0}\) norm of \(z\), and prove the last bound in (5.24). Precisely, we have

$$ \| P_{k} T_{w_{1}} z \|_{L^{\infty}} \lesssim \| w_{1,< k}\|_{L^{ \infty}} \| \tilde{P}_{k} z\|_{L^{\infty}} \lesssim 2^{\frac{k}{2}} { \mathcal {B}}c_{k} \| w \|_{\mathfrak{DC}} \cdot {\mathcal {A}^{\sharp }} \|z\|_{\mathfrak{C}_{0}}, $$

respectively

$$ \| T_{w_{2}} z \|_{L^{\infty}} \lesssim \sum _{k} \| w_{2,< k}\|_{L^{ \infty}} \| P_{k} z\|_{L^{\infty}} \lesssim \| w_{2}\|_{L^{\infty}} \cdot {\mathcal {A}^{\sharp }}\|z\|_{\mathfrak{C}_{0}}. $$

Alternatively, we can use the ℬ component of the \(\mathfrak{C}_{0}\) norm of \(z\) in the bound for the \(w_{1}\) component,

$$\begin{aligned} \| T_{w_{1}} z \|_{L^{\infty}} \lesssim& \sum _{k} \| w_{1,< k}\|_{L^{ \infty}} \| z_{k}\|_{L^{\infty}}\\ \lesssim& \sum _{k} 2^{\frac{k}{2}} { \mathcal {B}}c_{k} \| w \|_{\mathfrak{DC}} \cdot 2^{-\frac{k}{2}} { \mathcal {B}}c_{k}\|z\|_{\mathfrak{C}_{0}} \\ \lesssim& {\mathcal {B}}^{2} \| w \|_{\mathfrak{DC}}\|z\|_{\mathfrak{C}_{0}}, \end{aligned}$$

and the \({\mathcal {A}^{\sharp}}\) component for the \(w_{2}\) term,

$$ \| T_{w_{2}} z \|_{L^{\infty}} \lesssim \sum _{k} \| w_{2,< k}\|_{L^{\infty}} \| z_{k}\|_{L^{\infty}} \lesssim {\mathcal {B}}^{2} \sum _{k} {\mathcal {A}^{\sharp}}c_{k}^{2} \|z\|_{\mathfrak {C}_{0}} \lesssim _{{\mathcal {A}^{\sharp}}} {\mathcal {B}}^{2} \| w \|_{\mathfrak {DC}}\|z\|_{\mathfrak {C}_{0}}, $$

which leads to the first bound in (5.25).

It remains to consider the second bound in (5.25), where we have

$$\begin{aligned} \| \Pi (w_{1},z) \|_{L^{\infty}} \lesssim& \sum _{k} \| w_{1,k}\|_{L^{ \infty}} \| z_{k} \|_{L^{\infty}} \\ \lesssim& \sum _{k} 2^{\frac{k}{2}} { \mathcal {B}}c_{k} \| w \|_{\mathfrak{DC}} \cdot 2^{-\frac{k}{2}} { \mathcal {B}}c_{k}\|z\|_{\mathfrak{C}_{0}} \\ \lesssim& {\mathcal {B}}^{2} \| w \|_{\mathfrak{DC}}\|z\|_{\mathfrak{C}_{0}}, \end{aligned}$$

respectively

$$\begin{aligned} \| \Pi (w_{2},z) \|_{L^{\infty}} \lesssim& \sum _{k} \| w_{2,k}\|_{L^{ \infty}} \| z_{k}\|_{L^{\infty}} \\ \lesssim& \| w_{2}\|_{L^{\infty}} \cdot {\mathcal {A}^{\sharp }}\|z\|_{\mathfrak{C}_{0}} \\ \lesssim& { \mathcal {B}}^{2}\| w\|_{\mathfrak{DC}} \cdot {\mathcal {A}^{\sharp }}\|z \|_{\mathfrak{C}_{0}}. \end{aligned}$$

b) Compared with part (a) of Lemma 5.2, it remains to estimate the time derivative of products and paraproducts. Using Leibniz’s rule, this reduces directly to the multiplicative bounds in (a).

c) Compared with part (b) of Lemma 5.2 it remains to estimate

$$ \partial _{0} F(v) = F'(v) \partial _{0} v $$

as in (5.19), which is the same as placing it in \(\mathfrak{DC}\). By Lemma 5.2 we have \(F'(v) \in \mathfrak{C}_{0}\), while \(\partial _{0} v \in \mathfrak{DC}\). Then we can bound the product in \(\mathfrak{DC}\) by part (a) of this Lemma.

d) Subtracting the harmless linear part of \(F\), without any loss of generality we can assume that \(F'(0)=0\). We have

$$ \partial R = (F'(v)-T_{F'(v)}) \partial v - T_{\partial F'(v)} v = \Pi (F'(v), \partial v) + T_{\partial v} F'(v) - T_{\partial F'(v)} v. $$

By (5.26) we can place \(F'(v) \in \mathfrak {C}\). Then the \({\mathcal {B}}^{2}\) bound follows directly from (5.25). For the \({\mathcal {A}^{\sharp}}^{2}\) bound we can replace \(\partial \) by \(\partial _{x}\) above, and use only \(\mathfrak {C}_{0}\) bounds. Then we can repeat the argument in Lemma 5.2 (a). □

Applying the above lemma shows that for smooth functions \(F\) with \(F(0)=0\) we have \(F(\partial u) \in \mathfrak{C}\), and in particular all components of the metrics \(g\), \({\tilde{g}}\) and \({\hat{g}}\) are in ℭ modulo constants. We also have \(F(\partial u) \partial ^{2} u \in \mathfrak{DC}\), which in particular shows that the gradient potentials \(A\) and \({\tilde{A}}\) belong to \(\mathfrak{DC}\).

We will use part (d) with \(v=\partial u\) and \(F=g\), in which case (5.27) reads

$$ \|\partial R(\partial u)\|_{L^{\infty}} \lesssim _{{\mathcal {A}^{ \sharp }}}{\mathcal {B}}^{2}, \qquad R = g^{\alpha \beta} + T_{ \partial ^{\alpha} u g^{\beta \gamma}} \partial _{\gamma }u + T_{ \partial ^{\beta} u g^{\alpha \gamma}} \partial _{\gamma }u . $$
(5.28)

We remark that a similar \(H^{s}\) type bound for the same \(R\) is provided by (2.14), namely

$$ \|R(\partial u)\|_{H^{s-\frac{1}{2}}} \lesssim _{{\mathcal {A}}}{ \mathcal {B}}\| \partial u\|_{H^{s-1}}. $$
(5.29)

The next lemma provides us with the primary example of elements of the space \(\partial _{x} \mathfrak{DC}\):

Lemma 5.8

We have

$$ \| \partial _{\alpha }\widehat{\partial _{\beta }\partial _{\gamma}} u \|_{\partial _{x} \mathfrak{DC}} \lesssim _{{\mathcal {A}^{\sharp }}} 1. $$
(5.30)

Proof

The bound in (5.30) is trivial if at least two derivatives are spatial, and follows from (5.23) unless all indices are zero. It remains to consider the case \(\alpha = \beta = \gamma = 0\). Here we rely on the earlier decomposition (5.17) to which we further apply a \(\partial _{t}\):

$$ \partial _{t} \hat{\partial}_{t}^{2}u =- \sum _{(\alpha , \beta ) \neq (0, 0)}\left ( T_{\partial _{t} \tilde{g}^{\alpha \beta}} \partial _{\alpha}\partial _{\beta}u + T_{ \tilde{g}^{\alpha \beta}} \partial _{t}\partial _{\alpha}\partial _{\beta}u+ T_{\partial _{t} \partial _{\alpha }\partial _{\beta }u} {\tilde{g}}^{\alpha \beta}+T_{ \partial _{\alpha}\partial _{ \beta} u} \partial _{t}{\tilde{g}}^{ \alpha \beta}\right ) . $$
(5.31)

We now investigate each term separately, for fixed \((\alpha ,\beta ) \neq (0,0)\). We begin with the first term, which needs to be bounded in the \(\partial _{x} \mathfrak{DC}\) norm given in (5.21). We have

$$ \Vert P_{k} T_{\partial _{t} \tilde{g}^{\alpha \beta}} \partial _{ \alpha}\partial _{\beta}u \Vert _{L^{\infty}} \lesssim \Vert P_{< k} ( \partial _{t} \tilde{g}^{\alpha \beta})\Vert _{L^{\infty}} \Vert P_{k} (\partial _{\alpha}\partial _{\beta}u) \Vert _{L^{\infty}}. $$

The factor containing the time derivative of the metric will be bounded using the Moser estimates of Lemma 5.7. Explicitly, we know that \({\tilde{g}}^{\alpha \beta}\) is in ℭ modulo constants, and due to Lemma 5.7, part (c), we get \(\partial _{t}{\tilde{g}}^{\alpha \beta}\in \mathfrak{DC}\), which allows us to decompose it as in (5.20), \(\partial _{t}{\tilde{g}}^{\alpha \beta}={\tilde{g}}_{1}^{\alpha \beta} + {\tilde{g}}_{2}^{\alpha \beta}\), where

$$ \Vert P_{< k}\partial _{t} {\tilde{g}}^{\alpha \beta}\Vert _{L^{\infty}} \lesssim \Vert P_{< k} {\tilde{g}}_{1}^{\alpha \beta}\Vert _{L^{\infty}} + \Vert {\tilde{g}}_{2}^{\alpha \beta}\Vert _{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}} 2^{\frac{k}{2}}{ \mathcal {B}}c_{k} + {\mathcal {B}}^{2}. $$

We now turn to the second factor, which we can estimate in two ways: using the last part of (5.8),

$$ \Vert P_{k} (\partial _{\alpha}\partial _{\beta}u) \Vert _{L^{\infty}} \lesssim 2^{\frac{k}{2}} \Vert 2^{-\frac{k}{2}}P_{k}(\partial _{ \alpha }\partial _{\beta }u)\Vert _{L^{\infty}}\lesssim 2^{ \frac{k}{2}}{\mathcal {B}}c_{k}, $$

respectively using the first part of (5.8),

$$ \Vert P_{k} (\partial _{\alpha}\partial _{\beta}u) \Vert _{L^{\infty}} \lesssim 2^{k }\Vert 2^{-k}P_{k}(\partial _{\alpha }\partial _{\beta }u) \Vert _{L^{\infty}}\lesssim 2^{k}{\mathcal {A}}c_{k}^{2}. $$

Putting together the bounds above leads to

$$ \Vert P_{k}T_{\partial _{t} {\tilde{g}}^{\alpha \beta}} \partial _{ \alpha}\partial _{\beta}u\Vert _{L^{\infty}}\lesssim _{{\mathcal {A}^{ \sharp }}} 2^{k}{\mathcal {A}}{\mathcal {B}}^{2} c_{k}^{2} + 2^{k} { \mathcal {B}}^{2} c_{k}^{2}. $$

We now bound the second term in (5.31) as follows:

$$ \Vert P_{k} T_{{\tilde{g}}^{\alpha \beta}}\partial _{t} \partial _{\alpha}\partial _{\beta} u\Vert _{L^{\infty}} \lesssim \Vert {\tilde{g}}^{\alpha \beta}\Vert _{L^{\infty}}\Vert \tilde{P}_{k} \partial _{t}\partial _{\alpha}\partial _{\beta} u \Vert _{L^{\infty}}. $$

Here we know that \((\alpha , \beta ) \neq (0,0)\), hence there are two cases to consider: (i) either \(\alpha =0\) or \(\beta =0\), but not both, which overall means we need to bound \(\partial _{t} ^{2} \partial _{x} u\), or (ii) both \(\alpha , \beta \neq 0\), in which case we need a pointwise bound for \(\partial _{t}\partial ^{2}_{x} u\). However, both cases can be handled in the same way if we observe that \(\partial _{x} (\partial _{x}\partial _{t} u)\) and \(\partial _{x} (\partial ^{2}_{t} u)\) are elements of \(\partial _{x}\mathfrak{DC}\); this is a direct consequence of \(\partial ^{2} u \in \mathfrak{DC}\) as shown in (5.23), followed by the inclusion in (5.22).

Finally, the third and fourth terms in (5.31) can be treated in the same way the first term in (5.31) was shown to be bounded. □

We continue with another, slightly more subtle balanced bound:

Lemma 5.9

For \(g, h \in \mathfrak{C}\) define

$$ r =( T_{g} T_{h} - T_{gh} ) \partial ^{2} u. $$

Then we have the balanced bound

$$ \| r\|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp}}} {\mathcal {B}}^{2} \|g\|_{ \mathfrak{C}} \|h\|_{\mathfrak{C}}. $$
(5.32)

Proof

For \(\partial ^{2} u\) we use the \(\mathfrak{DC}\) decomposition as in (5.20),

$$ \partial ^{2} u = f_{1} + f_{2}. $$

We begin with the contribution \(r_{1}\) of \(f_{1}\), which we expand as

$$ r_{1} = \sum _{k} ( T_{g} T_{h} - T_{gh} ) f_{1,k}. $$

This vanishes unless the frequencies \(k_{1}\), \(k_{2}\) of \(g\) and \(h\) are either

(i) \(k_{1}, k_{2} \leq k\) and \(\max \{k_{1},k_{2}\} = k+O(1)\), or

(ii) \(k_{1} = k_{2} > k + O(1) \).

Then we use the \({\mathcal {A}^{\sharp }}\) component of the \(\mathfrak{C}_{0}\) norm for the lower frequency and the ℬ component for the higher frequency to estimate

$$ \begin{aligned} \|r_{1}\|_{L^{\infty}} \lesssim & \ \sum _{k} \| f_{1,k}\|_{L^{\infty}} \left ( \sum _{k_{1} < k+O(1)} \| g_{k_{1}}\|_{L^{\infty}} \sum _{k_{2}=k + O(1)} \| h_{k_{2}}\|_{L^{\infty}} \right . \\ &\, + \left . \sum _{k_{1} = k+O(1)} \| g_{k_{1}}\|_{L^{\infty}} \sum _{k_{2}< k + O(1)} \| h_{k_{2}}\|_{L^{\infty}}\right . \\ &\, + \left . \sum _{k_{1} = k_{2} \geq k+O(1)} \| g_{k_{1}}\|_{L^{\infty}} \| h_{k_{2}}\|_{L^{\infty}} \right ) \\ \lesssim & \ \|g\|_{\mathfrak{C}} \|h\|_{\mathfrak{C}} \sum _{k} 2^{\frac{k}{2}} {\mathcal {B}}c_{k} ( {\mathcal {A}^{ \sharp }}\cdot 2^{-\frac{k}{2}} {\mathcal {B}}c_{k} + 2^{-\frac{k}{2}} { \mathcal {B}}c_{k} \cdot {\mathcal {A}^{\sharp }}+ {\mathcal {A}^{\sharp }} \cdot 2^{-k} {\mathcal {B}}c_{k}) \\ \lesssim & \ {\mathcal {A}^{\sharp }}{\mathcal {B}}^{2}\|g\|_{ \mathfrak{C}} \|h\|_{\mathfrak{C}}, \end{aligned} $$

as needed.

For the contribution \(r_{2}\) of \(f_{2}\) we use a similar expansion, and the first two lines of the estimate above are largely unchanged, except for the use of Bernstein’s inequality in case (ii). But now we only use the \({\mathcal {A}^{\sharp}}\) component of the \(\mathfrak {C}_{0}\) norm for both the \(k_{1}\) and \(k_{2}\) frequencies. This leads to

$$\begin{aligned} \|r_{2}\|_{L^{\infty}} \lesssim & \ \sum _{k} \| f_{2,k}\|_{L^{\infty}} \left ( \sum _{k_{1} < k+O(1)} \| g_{k_{1}}\|_{L^{\infty}} \sum _{k_{2}=k + O(1)} \| h_{k_{2}}\|_{L^{\infty}} \right .\\ &\, + \left . \sum _{k_{1} = k+O(1)} \| g_{k_{1}}\|_{L^{\infty}} \sum _{k_{2}< k + O(1)} \| h_{k_{2}}\|_{L^{\infty}} \right . \\ &\, \left . + \sum _{k_{1} = k_{2} \geq k+O(1)} 2^{k} \| g_{k_{1}}\|_{L^{2n}} \| h_{k_{2}}\|_{L^{2n}} \right ) \\ \lesssim & \ {\mathcal {B}}^{2} \|g\|_{\mathfrak {C}} \|h\|_{\mathfrak {C}} \sum _{k} {\mathcal {A}^{\sharp}}^{2} \left (c_{k}^{2} + \sum _{j \geq k} {\mathcal {A}^{\sharp}}^{2} 2^{k-j} c_{j}^{2}\right ) \\ \lesssim & \ {\mathcal {A}^{\sharp}}^{2} {\mathcal {B}}^{2}\|g\|_{\mathfrak {C}} \|h\|_{\mathfrak {C}}, \end{aligned}$$

which again suffices. □

As already discussed in the introduction, the paradifferential wave operator

$$ T_{P} = \partial _{\alpha }T_{g^{\alpha \beta}} \partial _{\beta}, $$
(5.33)

as well as its counterparts \(T_{\tilde{P}}\) and \(T_{\hat{P}}\) with the metric \(g\) replaced by \({\tilde{g}}\), respectively \({\hat{g}}\), play an important role in our context.

Throughout the paper, we will interpret various objects related to \(u\) as approximate solutions for the \(T_{P}\) equation. We provide several results of this type, where we use our control parameters \({\mathcal {A}}\), ℬ in order to estimate the source term in the paradifferential equation for both \(u\) and for its derivatives.

Lemma 5.10

We have

$$ \| T_{P} u \|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}}{ \mathcal {B}}^{2}, $$
(5.34)

as well as the similar bounds for \(T_{\tilde{P}}\) and \(T_{\hat{P}}\).

Proof

We first prove the bound (5.34), and for this we begin with the paradifferential equation associated to the minimal surface equation (3.5)

$$ T_{g^{\alpha \beta}}\partial _{\alpha}\partial _{\beta}u +T_{ \partial _{\alpha}\partial _{\beta}u}g^{\alpha \beta}+\Pi (g^{\alpha \beta}, \partial _{\alpha}\partial _{\beta}u)=0, $$

and further isolate the part we are interested in estimating

$$ \partial _{\beta}T_{g^{\alpha \beta}}\partial _{\alpha}u - T_{ \partial _{\beta} g^{\alpha \beta}}\partial _{\alpha}u +T_{\partial _{ \alpha}\partial _{\beta}u}g^{\alpha \beta}+\Pi (g^{\alpha \beta}, \partial _{\alpha}\partial _{\beta}u)=0. $$

The estimate we want relies on getting bounds for the following terms

$$\begin{aligned} \Vert T_{P}u\Vert _{L^{\infty}} =&\Vert \partial _{\beta}T_{g^{\alpha \beta}}\partial _{\alpha}u \Vert _{L^{\infty}} \\ \lesssim& \Vert T_{ \partial _{\beta} g^{\alpha \beta}}\partial _{\alpha}u \Vert _{L^{ \infty}} + \Vert T_{\partial _{\alpha}\partial _{\beta}u}g^{\alpha \beta} \Vert _{L^{\infty}}+ \Vert \Pi (g^{\alpha \beta}, \partial _{ \alpha}\partial _{\beta}u) \Vert _{L^{\infty}}. \end{aligned}$$

The bounds for all of these terms rely on the fact that \(g^{\alpha \beta}\) is in ℭ modulo constants, \(\partial g^{\alpha \beta}, \partial _{\alpha}\partial _{\beta}u \in \mathfrak{DC}\) (a consequence of Lemma 5.7), as well as on the bound given by Lemma 5.4. Precisely, the estimate (5.25) implies that

$$ \Vert T_{\partial _{\beta} g^{\alpha \beta}}\partial _{\alpha}u \Vert _{L^{\infty}} + \Vert T_{\partial _{\alpha}\partial _{\beta}u}g^{ \alpha \beta} \Vert _{L^{\infty}}+ \Vert \Pi (g^{\alpha \beta}, \partial _{\alpha}\partial _{\beta}u) \Vert _{L^{\infty}}\lesssim _{{ \mathcal {A}^{\sharp }}} {\mathcal {B}}^{2}. $$

The analogous bounds for \(T_{\hat{P}}\) and \(T_{\tilde{P}}\) follow by the same argument, using the same results as in the proof of (5.34) above. □

We next consider similar bounds for derivatives of \(u\). Here we will differentiate between space and time derivatives. We begin with spatial derivatives:

Lemma 5.11

We have

$$ \| P_{< k} T_{P} \partial _{x} u \|_{L^{\infty}} \lesssim _{{ \mathcal {A}^{\sharp }}} 2^{k} {\mathcal {B}}^{2}, $$
(5.35)

as well as the similar bounds for \(T_{\tilde{P}}\) and \(T_{\hat{P}}\).

Proof

For this proof we rely on the previous Lemma 5.10 and on Lemma 5.4. Since the bound (5.35) for \(P_{< k}\) follows by summing the corresponding bounds over the dyadic pieces \(P_{j}\) with \(j < k\), it suffices to estimate \(P_{k} T_{P} \partial _{x} u\) for each \(k\). This becomes transparent once we commute \(\partial _{x}\) across the \(T_{P}\) operator

$$ P_{k} T_{P} \partial _{x} u = \partial _{x} P_{k} \partial _{\alpha }T_{g^{ \alpha \beta}} \partial _{\beta }u- P_{k} \partial _{\alpha }T_{ \partial _{x}g^{\alpha \beta}} \partial _{\beta }u . $$

The first term on the RHS of the identity above is bounded using (5.34) as follows:

$$ \Vert \partial _{x} P_{k} \partial _{\alpha }T_{g^{\alpha \beta}} \partial _{\beta }u\Vert _{L^{\infty}}\lesssim 2^{k} \Vert \partial _{ \alpha }T_{g^{\alpha \beta}} \partial _{\beta }u\Vert _{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}}2^{k}{\mathcal {B}}^{2}. $$

Here we took advantage of the fact that the operator \(\partial _{x}\) is accompanied by the frequency projector \(P_{k}\), so it only costs a factor of \(2^{k}\). No such gain is available for the last term, where we instead need to distribute the \(\alpha \) derivative

$$ P_{k} \partial _{\alpha }T_{ \partial _{x}g^{\alpha \beta}} \partial _{ \beta }u =P_{k} T_{ \partial _{\alpha}\partial _{x}g^{\alpha \beta}} \partial _{\beta }u + P_{k} T_{ \partial _{x}g^{\alpha \beta}} \partial _{\alpha} \partial _{\beta }u :=e_{1}+e_{2}. $$

We bound \(e_{1}\) using Lemma 5.7, by placing \(\partial _{x} \partial g^{\alpha \beta} \in \partial _{x} \mathfrak{DC}\), which means it admits a decomposition of the form

$$ \partial _{x} \partial g^{\alpha \beta} =f_{1}+f_{2}, $$

where

$$ \Vert P_{k}f_{1}\Vert _{L^{\infty}}\lesssim _{{\mathcal {A}^{\sharp }}}2^{ \frac{3k}{2}}{\mathcal {B}}c_{k}, \quad \Vert P_{k}f_{2}\Vert _{L^{ \infty}}\lesssim _{{\mathcal {A}^{\sharp }}}2^{k}{\mathcal {B}}^{2}. $$

Thus, we get

$$\begin{aligned} \Vert e_{1}\Vert _{L^{\infty}} &\lesssim \left ( \Vert P_{< k} f_{1} \Vert _{L^{\infty}}+\Vert P_{< k} f_{2}\Vert _{L^{\infty}} \right ) \Vert \tilde{P}_{k} \partial _{\beta}u\Vert _{L^{\infty}} \\ &\lesssim _{{ \mathcal {A}^{\sharp }}}\left ( 2^{k}{\mathcal {B}}^{2} + 2^{\frac{3k}{2}}{ \mathcal {B}}c_{k}\right ) \Vert \tilde{P}_{k} \partial _{\beta}u\Vert _{L^{ \infty}}, \end{aligned}$$

which leads to the desired bound once we estimate the last factor accordingly. For this we may use one of the two bounds

$$ \Vert \tilde{P}_{k} \partial _{\beta }u \Vert _{L^{\infty}} \lesssim 2^{- \frac{k}{2}}\Vert 2^{\frac{k}{2}}\tilde{P}_{k} \partial _{\beta}u \Vert _{L^{\infty}}\lesssim 2^{-\frac{k}{2}} c_{k} {\mathcal {B}} \qquad \text{ or } \qquad \Vert \tilde{P}_{k} \partial _{\beta }u \Vert _{L^{\infty}} \lesssim c_{k}^{2} {\mathcal {A}}.
$$

For the first term in the bracket we use the \({\mathcal {A}}\) norm bound, and for the second term we use the ℬ norm bound.
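Concretely, pairing the \(f_{2}\) contribution with the \({\mathcal {A}}\) bound and the \(f_{1}\) contribution with the ℬ bound yields

$$ \Vert e_{1}\Vert _{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}} 2^{k}{\mathcal {B}}^{2} \cdot c_{k}^{2} {\mathcal {A}}+ 2^{\frac{3k}{2}}{\mathcal {B}}c_{k} \cdot 2^{-\frac{k}{2}} c_{k} {\mathcal {B}}\lesssim _{{\mathcal {A}^{\sharp }}} 2^{k} {\mathcal {B}}^{2}, $$

using that \(c_{k} \lesssim 1\) and absorbing the \({\mathcal {A}}\) factor into the implicit \({\mathcal {A}^{\sharp }}\) dependence.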

For \(e_{2}\) we use the decomposition in Lemma 5.4 for \(\partial _{\alpha}\partial _{\beta}u\) and for \(g\) we use the fact that \(g\) is in \(\mathfrak{C}_{0}\) modulo constants, where we can use either the \({\mathcal {A}}\) bound or the ℬ bound. The computations are similar to the case of \(e_{1}\).

The bounds for \(T_{\tilde{P}}\) and \(T_{\hat{P}}\) follow by exactly the same argument as the one used above in the \(T_{P}\) case. □

Lemma 5.12

a) We have

$$ T_{P} \partial u \in \partial ({\mathcal {B}}^{2} L^{\infty}), $$
(5.36)

i.e. there exists a representation

$$ T_{P} \partial u = \partial _{\alpha }f^{\alpha}, \qquad |f| \lesssim _{{\mathcal {A}^{\sharp }}} {\mathcal {B}}^{2}. $$
(5.37)

b) We also have

$$ \| P_{k} \partial _{\alpha }T_{g^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\gamma}} u\|_{L^{\infty}} + \| P_{k} \partial _{\gamma }T_{g^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\alpha}} u\|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}} 2^{k} {\mathcal {B}}^{2}. $$
(5.38)

Similar results hold with \(g\) replaced by \({\tilde{g}}\) or \({\hat{g}}\).

Proof

a) We write

$$ \begin{aligned} \partial _{\alpha }T_{g^{\alpha \beta}} \partial _{\beta }\partial u = & \ \partial \partial _{\alpha }T_{g^{\alpha \beta}} \partial _{\beta }u - \partial _{\alpha }T_{\partial g^{\alpha \beta}} \partial _{\beta }u . \end{aligned} $$

Here for the first term we use Lemma 5.10, while for the second, by (5.25), we have

$$ \| T_{\partial g} \partial u \|_{L^{\infty}} \lesssim _{{\mathcal {A}^{ \sharp }}} {\mathcal {B}}^{2}. $$
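Explicitly, with \(\partial = \partial _{\gamma}\) fixed, this yields the representation (5.37) with

$$ f^{\alpha }= \delta ^{\alpha}_{\gamma}\, T_{P} u - T_{\partial _{\gamma }g^{\alpha \beta}} \partial _{\beta }u, $$

where the first term is controlled by (5.34) and the second by the display above.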

b) The first step here is to reduce to the case of the metric \({\tilde{g}}\). Each of the other two metrics may be written in the form \(h {\tilde{g}}\), with \(h= h(\partial u)\). Then we can write

$$ \partial _{\alpha }T_{h {\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\gamma}} u = T_{h} \partial _{ \alpha }T_{{\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\gamma}} u - T_{\partial _{ \alpha }h} T_{{\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\gamma}} u + \partial _{ \alpha }(T_{h {\tilde{g}}^{\alpha \beta}} - T_{h} T_{{\tilde{g}}^{ \alpha \beta}}) \widehat{ \partial _{\beta }\partial _{\gamma}} u. $$
(5.39)

The first term corresponds to our reduction, and the remaining terms need to be estimated perturbatively. This is straightforward unless \(\alpha = 0\), so we focus now on this case. Discarding constants, we can assume here that \(h(0)=0\).

For the middle term in (5.39) we can use the bound (5.26) in Lemma 5.7 to place \(h\) in ℭ. Then \(\partial _{0} h\) is in \(\mathfrak {DC}\). Using a \(\mathfrak {DC}\) decomposition for it, \(\partial _{0} h = h_{1}+h_{2}\), we can match the two terms with the two pointwise bounds for \(\widehat{ \partial _{\beta }\partial _{\gamma}} u\), namely

$$ \|P_{k} \widehat{ \partial _{\beta }\partial _{\gamma}} u\|_{L^{ \infty}} \lesssim 2^{k} {\mathcal {A}}c_{k}^{2}, \qquad \| P_{k} \widehat{ \partial _{\beta }\partial _{\gamma}} u\|_{L^{\infty}} \lesssim 2^{\frac{k}{2}} {\mathcal {B}}c_{k}, $$

which follow from (5.8) and (5.15). This yields

$$ \begin{aligned} \| P_{k} T_{\partial _{\alpha }h} T_{{\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\gamma}} u\|_{L^{\infty}} & \ \lesssim \ \| h_{1}\|_{L^{\infty}} \cdot \| \tilde{P}_{k} \widehat{ \partial _{\beta }\partial _{\gamma}} u\|_{L^{\infty}} + \| h_{2, < k} \|_{L^{\infty}} \cdot \| \tilde{P}_{k} \widehat{ \partial _{\beta }\partial _{\gamma}} u\|_{L^{\infty}} \\ & \ \lesssim _{{\mathcal {A}^{\sharp }}} {\mathcal {B}}^{2} \cdot 2^{k} { \mathcal {A}}c_{k}^{2} + 2^{\frac{k}{2}} {\mathcal {B}}c_{k} \cdot 2^{ \frac{k}{2}} {\mathcal {B}}c_{k} , \end{aligned} $$

which suffices.

For the last expression in (5.39) we distribute \(\partial _{0}\),

$$ \begin{aligned} \partial _{0} (T_{h {\tilde{g}}^{0\beta}} - T_{h} T_{{\tilde{g}}^{0 \beta}}) \widehat{ \partial _{\beta }\partial _{\gamma}} u = & \ (T_{h {\tilde{g}}^{0\beta}} - T_{h} T_{{\tilde{g}}^{0\beta}}) \partial _{0} \widehat{ \partial _{\beta }\partial _{\gamma}} u + (T_{\partial _{0} h \, {\tilde{g}}^{0\beta}} - T_{\partial _{0} h} T_{{\tilde{g}}^{0\beta}}) \widehat{ \partial _{\beta }\partial _{\gamma}} u \\ & + (T_{h \partial _{0} {\tilde{g}}^{0\beta}} - T_{h} T_{\partial _{0} { \tilde{g}}^{0\beta}}) \widehat{ \partial _{\beta }\partial _{\gamma}} u . \end{aligned} $$

For the first term we can combine the bound (5.30) with the para-composition bound in Lemma 2.4 exactly as in the proof of Lemma 5.9. For the second term we use the same \(\mathfrak{DC}\) decomposition as above for \(\partial _{0} h\). For the \(h_{1}\) contribution we have a direct bound without using any cancellations, while for \(h_{2}\) we use again Lemma 2.4. The third term is similar to the second, with the roles of \(h\) and \({\tilde{g}}^{0\beta}\) interchanged. This concludes our reduction to the case of the metric \({\tilde{g}}\).

We continue with the second reduction, which is to switch \(\partial _{\alpha}\) and \(\partial _{\gamma}\) in the expression \(\partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\gamma}} u\); this allows us to replace the first term in (5.38) with the second. For fixed \(\alpha \) and \(\gamma \), we write

$$ \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\gamma}} u = \partial _{ \gamma }T_{{\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\alpha}} u + f, $$
(5.40)

where we claim \(f\) satisfies

$$ \| P_{< k} f \|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}} 2^{k} { \mathcal {B}}^{2}. $$
(5.41)

This is trivial if \(\alpha = \gamma = 0\). If both are nonzero, or if one of them is zero but \(\beta \neq 0\), then there is no hat correction and this is a straightforward commutator bound. It remains to discuss the case when \(\beta = 0\) and exactly one of \(\alpha \) and \(\gamma \) is zero, say \(\gamma = 0\). Then we need to consider the difference

$$ \begin{aligned} f = & \ \partial _{\alpha }T_{{\tilde{g}}^{\alpha 0 }} \widehat{\partial _{0} \partial _{0}} u - \partial _{0} T_{{\tilde{g}}^{ \alpha 0 }} \partial _{0} \partial _{\alpha }u \\ = & \ \partial _{\alpha }T_{{\tilde{g}}^{\alpha 0 }} \Pi (u, \partial _{x} \partial u) + T_{\partial _{\alpha}{\tilde{g}}^{\alpha 0 }} \partial _{0} \partial _{0} u - T_{\partial _{0} {\tilde{g}}^{ \alpha 0 }} \partial _{0} \partial _{\alpha }u, \end{aligned} $$

which can be estimated as in (5.41) using the fact that \(\alpha \neq 0\) as well as the bound (5.15) for the second time derivative of \(u\), respectively the similar bound (5.26) (third estimate) for \(\partial _{0} {\tilde{g}}\). Hence (5.41) follows.
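For the record, in the cases where no hat correction is present, the claim (5.41) follows from the commutator identity

$$ \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta } \partial _{\gamma} u - \partial _{\gamma }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }\partial _{\alpha} u = T_{\partial _{\alpha} {\tilde{g}}^{\alpha \beta}} \partial _{\beta }\partial _{\gamma} u - T_{\partial _{\gamma} {\tilde{g}}^{\alpha \beta}} \partial _{\beta }\partial _{\alpha} u, $$

where both terms on the right are estimated using the \(\mathfrak{DC}\) bounds for \(\partial {\tilde{g}}\) together with Lemma 5.4, as before.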

Finally, it remains to examine the expression

$$ g_{\gamma }= \partial _{\gamma }T_{{\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\alpha}} u, $$

where, unlike above, we take advantage of the summation with respect to \(\alpha \) and \(\beta \). Then, using the \(u\) equation, we have

$$ g_{\gamma }= \partial _{\gamma }T_{\partial _{\alpha }\partial _{ \beta }u} {\tilde{g}}^{\alpha \beta}. $$

The term where both \(\alpha \) and \(\beta \) are zero vanishes since \({\tilde{g}}^{00}\) is constant; this was a motivation for the first reduction above. So this can be estimated directly as in (5.41) if \(\gamma \neq 0\), and using either (5.15) or (5.26) (third estimate) for \(\partial _{0} {\tilde{g}}\), otherwise. □

6 Paracontrolled distributions

To motivate this section, we start from the classical energy estimates for the wave equation, which are obtained using the multiplier method. Precisely, one multiplies the equation \(\Box _{g} u = f\) by \(Xu\) and simply integrates by parts. Here \(X\) is any regular time-like vector field. In the next section, we prove energy estimates for the paradifferential equation (3.25), by emulating this strategy at the paradifferential level. The challenge is then to uncover a suitable vector field \(X\). Unlike the classical case, here not every time-like vector field \(X\) will suffice. Instead \(X\) must be carefully chosen, and in particular it will inherently have a limited regularity.

Since the metric \(g\) is a function of \(\partial u\), scaling considerations indicate that the vector field \(X\) should be at the same regularity level. Naively, one might hope to have an explicit expression \(X = X(\partial u)\) for our vector field. Unfortunately, seeking such an \(X\) eventually leads to an overdetermined system. At the other extreme, one might enlarge the class of \(X\) to all distributions that satisfy the same \(H^{s}\) and Besov norms as \(\partial u\), which is essentially the class of functions that satisfy (5.26). While this class will turn out to contain the correct choice for \(X\), it is nevertheless too large to allow for a clean implementation of the multiplier method.

Instead, there is a more subtle alternative, namely to have the vector \(X\) to be paracontrolled by \(\partial u\). This terminology was originally introduced by Gubinelli, Imkeller and Perkowski [13] in connection to Bony’s calculus, in order to study stochastic pde problems, see also [14]. However, similar constructions have been carried out earlier in the renormalization arguments e.g. for wave maps, in work of Tao [42], Tataru [47] and Sterbenz-Tataru [41]; the last reference used the name renormalizable for the corresponding class of distributions.

In the standard usage, this is more of a principle than an exact notion, which needs to be properly adapted to one’s purposes. For our own objective here, we provide a very precise definition of this notion, which is exactly tailored to the problem at hand.

6.1 Definitions and key properties

Definition 6.1

We say that a function \(z\) is paracontrolled by \(\partial u\) in a time interval \(I\) if it admits a representation of the form

$$ z = T_{a} \partial u + r , $$
(6.1)

where the vector field \(a\) and the error \(r\) have the following properties:

(i) bounded para-coefficients \(a\):

$$ \| a \|_{\mathfrak{C}} \leq C. $$
(6.2)

(ii) balanced error \(r\):

$$ \|P_{k} r\|_{L^{n}} \leq C 2^{-k} c_{k}^{2} {\mathcal {A}^{\sharp}}^{2}, \qquad \|\partial r\|_{L^{\infty}} \leq C {\mathcal {B}}^{2} . $$
(6.3)

It is convenient to think of the space of distributions \(z\) paracontrolled by \(\partial u\) as a Banach space, which we denote by \({\mathfrak {P}}(\partial u)\), or simply \({\mathfrak {P}}\). The norm in this Banach space is defined to be the largest implicit constant in (6.2) and (6.3), minimized over all representations of the form (6.1). If \(\|z\|_{{\mathfrak {P}}} \lesssim _{{\mathcal {A}^{\sharp }}} 1\) then we will simply write \(z \in {\mathfrak {P}}\).

While for the most part this definition can be applied separately at each time \(t\), in our context we will think of both \(u\) and \(z\) as functions of time, and think of these bounds as uniform in \(t\). Precisely, above we think of \({\mathcal {A}^{\sharp }}\) as a global, time independent parameter, whereas ℬ is allowed to be a possibly unbounded function of \(t\).

To better understand the space \({\mathfrak {P}}\) of paracontrolled distributions, it is useful to relate it to the objects we have already discussed in the previous section:

Lemma 6.2

a) We have the inclusion \({\mathfrak {P}}\subset \mathfrak{C}\).

b) If \(F\) is a smooth function with \(F(0) = 0\), then \(F(\partial u) \in {\mathfrak {P}}\).

Proof

a) Clearly \(\partial u \in {\mathfrak {P}}\). Then the first term in (6.1) can be placed in ℭ by part (b) of Lemma 5.7. The error term \(r\) also belongs to \(\mathfrak {C}_{0}\) by Bernstein’s inequality and interpolation. This can be upgraded to ℭ using the \(\partial _{0} r\) bound in the second inequality in (6.3).
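For the record, the Bernstein step here reads (in spatial dimension \(n\))

$$ \|P_{k} r\|_{L^{\infty}} \lesssim 2^{k} \|P_{k} r\|_{L^{n}} \lesssim c_{k}^{2} \, {\mathcal {A}^{\sharp}}^{2}, $$

by the first bound in (6.3), which gives the dyadic bounds needed for membership in \(\mathfrak {C}_{0}\).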

b) This is a direct consequence of parts (c), (d) of Lemma 5.7. □

Thus one may think of the class \({\mathfrak {P}}\) of paracontrolled distributions as an intermediate stage between the class of smooth functions of \(\partial u\), which is too narrow for our purposes, and the larger class ℭ, which does not carry sufficient structure.

Next we consider nonlinear properties:

Lemma 6.3

a) [Algebra property] The space \({\mathfrak {P}}(\partial u)\) is an algebra. Further, if \(z_{1}, z_{2} \in {\mathfrak {P}}\) have paracoefficients \(a_{1}\), respectively \(a_{2}\), then the paracoefficients of \(z_{1} z_{2}\) can be taken to be \(z_{1} a_{2}+z_{2} a_{1}\).

b) [Moser inequality] If \(F\) is a smooth function with \(F(0)=0\) and \(z \in {\mathfrak {P}}\), then \(F(z) \in {\mathfrak {P}}\) and satisfies

$$ \| F(z) \|_{{\mathfrak {P}}} \lesssim _{{\mathcal {A}^{\sharp }},\|z\|_{L^{ \infty}}} \|z\|_{{\mathfrak {P}}} . $$
(6.4)

Further, if \(z \in {\mathfrak {P}}\) has paracoefficients \(a\), then the paracoefficients of \(F(z)\) can be taken to be \(F'(z) a\).

Proof

a) We consider the algebra property. Let

$$ z_{1} = T_{a_{1}} \partial u + r_{1}, \qquad z_{2} = T_{a_{2}} \partial u + r_{2} $$

and expand \(z_{1}z_{2}\).
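Here we use the Bony decomposition of the product,

$$ z_{1} z_{2} = T_{z_{1}} z_{2} + T_{z_{2}} z_{1} + \Pi (z_{1},z_{2}), $$

and estimate each component separately; the two paraproducts are symmetric, so it suffices to consider \(T_{z_{1}} z_{2}\).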

We first observe that we can place \(\Pi (z_{1},z_{2})\) into the error term. For this it suffices to use the ℭ norm for \(z_{1}\), \(z_{2}\) and apply the second bound in (5.25).

We next consider \(T_{z_{1}} z_{2}\), where for \(z_{1}\) we again use only the ℭ norm. We begin with \(T_{z_{1}} r_{2}\), which we also estimate as an error term. Here we again estimate the more difficult time derivative. If it falls on the first factor then we can bound the output exactly as in the balanced case above, see (5.25). Otherwise, it suffices to use the uniform bound on \(z_{1}\).

Finally, we consider the expression

$$ T_{z_{1}} T_{a_{2}^{\gamma}} \partial _{\gamma }u = T_{z_{1} a_{2}^{ \gamma}} \partial _{\gamma }u + (T_{z_{1}} T_{a_{2}^{\gamma}}- T_{z_{1} a_{2}^{\gamma}}) \partial _{\gamma }u , $$

where the first term has a ℭ coefficient by the ℭ algebra property, and the second may be estimated perturbatively. Here if the time derivative goes on the first factor then we are back to the previous case and no cancellation is needed. Else for \(\partial _{t} \partial _{\gamma }u\) we use the decomposition in Definition 5.1(a) (or simply Lemma 5.4), combined with Lemma 2.7.

b) To prove the Moser inequality, our starting point is Lemma 5.7(d), which allows us to reduce the problem to estimating \(T_{F'(z)} z\), using only the ℭ norm of \(z\). But here we can bound \(F'(z)\) in ℭ using the Moser bound in ℭ, which allows us to conclude as in part (a). □

In addition to the above lemmas, functions in \({\mathfrak {P}}\) essentially solve a paradifferential \(\Box _{\tilde{P}}\) equation. This will be used later to estimate lower order terms in the proof of Theorem 7.1, and for (7.108):

Lemma 6.4

Let \(h \in {\mathfrak {P}}\). Then there exist functions \(f^{\alpha}\) so that we have the representation

$$ \partial _{\alpha }T_{g^{\alpha \beta}}\partial _{\beta }h = \partial _{\alpha }f^{\alpha }, $$

which satisfy the following bounds:

$$ |f^{\alpha}| \lesssim _{{\mathcal {A}^{\sharp }}}{\mathcal {B}}^{2} \|h\|_{{ \mathfrak {P}}} , $$
(6.5)

respectively

$$ \| P_{< k} (T_{g^{00}} \partial _{0} h - f^{0}) \|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}}2^{k} \|h\|_{{\mathfrak {P}}}. $$
(6.6)

The same result holds for the metrics \({\tilde{g}}\), \({\hat{g}}\).

Proof

We use the representation (6.1) for \(h\). The property in the lemma holds trivially for the \(r\) component of \(h\), with

$$ f^{\alpha }= T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }r. $$

Precisely, the bound (6.5) holds due to the second part of (6.3), while for the bound (6.6), the \(\partial _{0} r\) component cancels and then we can use the first part of (6.3).

It remains to consider \(h\) of the form \(h = T_{a^{\gamma}} \partial _{\gamma }u\). We write

$$ \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}}\partial _{\beta }h := \partial _{\alpha }h^{\alpha}, $$

noting that the expression on the left hand side of (6.6) is exactly \(h^{0}-f^{0}\). We begin by refining the expression for \(h^{\alpha}\), noting that corrections of size \({\mathcal {B}}^{2}\) may be directly included into \(f^{\alpha}\) without harming (6.6). For this we write

$$\begin{aligned} h^{\alpha } = & \partial _{\gamma }T_{a^{\gamma}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }u + T_{{\tilde{g}}^{\alpha \beta}} T_{\partial _{ \beta }a^{\gamma}} \partial _{\gamma }u + [T_{{\tilde{g}}^{\alpha \beta}}, T_{a^{\gamma}}] \partial _{\gamma }\partial _{\beta }u \\ &{} - T_{a^{ \gamma}} T_{\partial _{\gamma }{\tilde{g}}^{\alpha \beta}} \partial _{ \beta }u - T_{\partial _{\gamma }a^{\gamma}} T_{ { \tilde{g}}^{\alpha \beta}} \partial _{\beta }u, \end{aligned}$$

where the first term on the right is the leading term, while the remaining terms can be estimated by \({\mathcal {B}}^{2}\) as follows:

  • The second, fourth and fifth terms are estimated directly using (6.2) for \(\partial _{\beta }a^{\gamma}\), \(\partial _{\gamma }{\tilde{g}}^{\alpha \beta}\), respectively \(\partial _{\gamma }a^{\gamma}\).

  • The third term is estimated using the commutator bound in Lemma 2.4, as well as Lemma 5.4 if both \(\beta \) and \(\gamma \) are zero.

We have reduced the problem to the case when

$$ h^{\alpha }= \partial _{\gamma }T_{a^{\gamma}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }u. $$

At this point we rewrite

$$ \partial _{\alpha }h^{\alpha }= \partial _{\gamma }\tilde{h}^{\gamma}, \qquad \tilde{h}^{\gamma }= \partial _{\alpha }T_{a^{\gamma}} T_{{\tilde{g}}^{ \alpha \beta}} \partial _{\beta }u, $$

noting that

$$ \| P_{< k} (h^{0} - \tilde{h}^{0}) \|_{L^{\infty}} \lesssim 2^{k}, $$

which allows us to switch \(h\) and \(\tilde{h}\) also in (6.6). Then we are allowed to correct \(\tilde{h}^{\gamma}\), by writing

$$ \tilde{h}^{\gamma }= T_{\partial _{\alpha }a^{\gamma}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }u + T_{a^{\gamma}} \partial _{\alpha }T_{{\tilde{g}}^{ \alpha \beta}} \partial _{\beta }u. $$

Now both terms on the right can be estimated by \({\mathcal {B}}^{2}\) as follows:

  • The first term is estimated directly using (6.2) for \(\partial _{\alpha }a^{\gamma}\).

  • The second term is estimated using Lemma 5.10.

Hence the proof of the lemma is concluded. □

In addition to the class of paracontrolled distributions \({\mathfrak {P}}\) we also define a secondary class of distributions, which roughly speaking corresponds to derivatives of \({\mathfrak {P}}\) functions.

Definition 6.5

The space \({\mathfrak{DP}}\) of distributions consists of functions \(y\) that admit a representation

$$ y = \partial _{\alpha }z^{\alpha }+ r , $$
(6.7)

where

$$ \| z^{\alpha }\|_{{\mathfrak {P}}} \leq C, \qquad \| r \|_{L^{\infty}} \leq C{\mathcal {B}}^{2}, $$
(6.8)

with the natural associated norm.

Due to the inclusion \({\mathfrak {P}}\subset \mathfrak{C}\), we can directly relate it to the class \(\mathfrak{DC}\) introduced earlier.

Lemma 6.6

We have \({\mathfrak{DP}}\subset \mathfrak{DC}\).

Next we verify that \({\mathfrak{DP}}\) is stable under multiplication by \({\mathfrak {P}}\) functions.

Lemma 6.7

We have the bilinear bound

$$ {\mathfrak {P}}\times {\mathfrak{DP}}\to {\mathfrak{DP}}. $$
(6.9)

As a corollary of this lemma, it follows that our gradient potentials \(A^{\gamma}\) and \({\tilde{A}}^{\gamma}\) are in \({\mathfrak{DP}}\).

Proof

For \(h,z \in {\mathfrak {P}}\) we consider the expansion

$$ \begin{aligned} q =& \ h \partial _{\alpha }z \\ = & \ \partial _{\alpha }T_{h} z - T_{\partial _{\alpha }h} z + \Pi (h, \partial _{\alpha }z) + T_{\partial _{\alpha }z} h . \end{aligned} $$

The first term is in \({\mathfrak{DP}}\) by Lemma 6.3(a). The three remaining terms can be perturbatively estimated by \({\mathcal {B}}^{2}\), using the bounds in (5.25). □

Finally, we consider decompositions for \({\mathfrak{DP}}\) functions which are akin to Lemma 5.4. We will do this in two different ways, one which is shorter but loses some structure, and another which is more involved but retains full structure.

Lemma 6.8

Let \(w \in {\mathfrak{DP}}\). Then \(w\) admits a representation of the form

$$ w = \partial _{x} w_{1} + r, $$

where

$$ \| w_{1}\|_{{\mathfrak {P}}} \lesssim _{{\mathcal {A}^{\sharp }}} \|w\|_{{ \mathfrak{DP}}}, \qquad \|r\|_{L^{\infty}} \lesssim _{{\mathcal {A}^{ \sharp }}} {\mathcal {B}}^{2} \|w\|_{{\mathfrak{DP}}}. $$
(6.10)

Proof

It suffices to consider \(w\) of the form \(w = \partial _{0} z\) where \(z \in {\mathfrak {P}}\), with a representation as in (6.1),

$$ z = T_{a^{\gamma}} \partial _{\gamma }u + r , $$

with \(a^{\gamma}\), \(r\) as in (6.2), (6.3). The bound (6.3) allows us to discard the contribution of \(r\) to (6.10). It remains to produce an appropriate modification \(\partial _{x} z_{1}\), with \(z_{1} \in {\mathfrak {P}}\), for the expression

$$ q = \partial _{0} T_{a^{\gamma}} \partial _{\gamma }u. $$

We successively peel off perturbative \(O({\mathcal {B}}^{2})\) layers from \(q\). First we use (6.2) to write

$$ q = T_{a^{\gamma}} \partial _{0}\partial _{\gamma }u + T_{\partial _{0} a^{\gamma}} \partial _{\gamma }u = T_{a^{\gamma}} \partial _{0} \partial _{\gamma }u + O({\mathcal {B}}^{2}). $$

At this point we have two cases to consider:

(i) \(\gamma \neq 0\). Then we write

$$ q = \partial _{\gamma }T_{ a^{\gamma}}\partial _{0} u + O({\mathcal {B}}^{2}), $$

and the remaining expression is in \(\partial _{x} {\mathfrak {P}}\).

(ii) \(\gamma = 0\). Here we use the equation for \(u\) to write

$$ \partial _{t}^{2} u = - \sum _{(\alpha ,\beta ) \neq (0,0)} T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }u + \Pi ( {\tilde{g}}^{\alpha \beta}, \partial _{\alpha }\partial _{\beta }u) + T_{\partial _{\alpha }\partial _{\beta }u} {\tilde{g}}^{\alpha \beta}. $$

Here the first term on the right involves at least one spatial derivative and is treated as before, in the case \(\gamma \neq 0\), while the contributions of the last two terms are perturbative, and can be bounded by \({\mathcal {B}}^{2}\). □

Our second representation provides a more explicit recipe to obtain the corrected version not only of \({\mathfrak{DP}}\) functions, but also of \({\mathfrak {P}}\times {\mathfrak{DP}}\) functions:

Lemma 6.9

Let \(w = z_{1}^{\alpha }\partial _{\alpha }z_{2}\), where \(z_{1}, z_{2} \in {\mathfrak {P}}\), and \(z_{2}\) has the \({\mathfrak {P}}\) representation

$$ z_{2} = T_{a^{\gamma}} \partial _{\gamma }u + r. $$

Define

$$ \mathring{w} := T_{z_{1}^{\alpha }a^{\gamma}} \widehat{\partial _{\alpha}\partial _{\gamma}} u. $$

Then we have

$$ \|w-\mathring{w}\|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}} { \mathcal {B}}^{2}, $$
(6.11)

while

$$ \| P_{k} \mathring{w}\|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}} 2^{\frac{k}{2}} {\mathcal {B}}c_{k}. $$
(6.12)

Proof

The contribution of \(r\) is directly perturbative so we discard it. Furthermore, the bounds in (5.25) allow us to replace perturbatively \(w\) by

$$ w = T_{z_{1}^{\alpha}} \partial _{\alpha }z_{2} + O({\mathcal {B}}^{2}) = T_{z_{1}^{\alpha}} T_{a^{\gamma}} \partial _{\alpha }\partial _{ \gamma }u + O({\mathcal {B}}^{2}). $$

Using also Lemma 5.4 we obtain

$$ w = T_{z_{1}^{\alpha}} T_{a^{\gamma}} \widehat{\partial _{\alpha }\partial _{\gamma}} u + O({\mathcal {B}}^{2}). $$

Finally, we use Lemma 2.7 to combine the two paraproducts, arriving at

$$ w = \mathring{w} + O({\mathcal {B}}^{2}), $$

as needed. The bound (6.12) is also a consequence of Lemma 5.4. □

The last lemma helps us uncover a more subtle, hidden \(\Box _{g}\) structure which appears if we compute the double divergence of the metric \({\tilde{g}}\).

Lemma 6.10

We have

$$ \| P_{< k} \partial _{\alpha }\partial _{\beta }{\tilde{g}}^{\alpha \beta} \|_{L^{\infty}} \lesssim _{{\mathcal {A}^{\sharp }}}2^{k} { \mathcal {B}}^{2} . $$
(6.13)

Proof

For fixed \(\beta \) we expand \(\partial _{\beta }{\tilde{g}}^{\alpha \beta}\) using the relation (3.9) to obtain

$$ \partial _{\beta }{\tilde{g}}^{\alpha \beta} = - \partial ^{\beta }u \, {\tilde{g}}^{\alpha \delta} \partial _{\beta }\partial _{\delta }u - \partial ^{\alpha }u \, {\tilde{g}}^{\beta \delta} \partial _{\beta } \partial _{\delta }u + 2\partial ^{0} u \, {\tilde{g}}^{0\delta} { \tilde{g}}^{\alpha \beta} \partial _{\beta }\partial _{\delta }u. $$

For this expression we define a corresponding ring correction

$$ \partial _{\beta }\mathring{{\tilde{g}}}^{\alpha \beta} := - T_{ \partial ^{\beta }u} \, T_{{\tilde{g}}^{\alpha \delta}} \widehat{\partial _{\beta }\partial _{\delta}} u - T_{\partial ^{ \alpha }u} \, T_{{\tilde{g}}^{\beta \delta}} \widehat{\partial _{\beta }\partial _{\delta}} u + 2T_{\partial ^{0} u} \, T_{{\tilde{g}}^{0\gamma}} T_{{\tilde{g}}^{\alpha \beta}} \widehat{\partial _{\gamma }\partial _{\beta}} u, $$

which is also chosen to vanish if \((\alpha ,\beta )=(0,0)\). We claim that the difference is perturbative for fixed \(\alpha \) and \(\beta \),

$$ |P_{< k} \partial _{\alpha}( \partial _{\beta }{\tilde{g}}^{\alpha \beta} - \partial _{\beta }\mathring{{\tilde{g}}}^{\alpha \beta}) | \lesssim _{{\mathcal {A}^{\sharp }}}2^{k} {\mathcal {B}}^{2}. $$

Indeed, if \(\alpha \neq 0\) then this follows directly from Lemma 6.9. On the other hand if \(\alpha =0\) then \(\beta \neq 0\) in which case the hat correction can be discarded and we may distribute the time derivative, using the fact that \(\partial u, {\tilde{g}}\in \mathfrak{C}\) modulo constants, see Lemma 5.7.

It remains to estimate the expression \(\partial _{\alpha}(\partial _{\beta }\mathring{{\tilde{g}}}^{\alpha \beta}) \), where we return to the standard summation convention and take the sum with respect to all \((\alpha ,\beta )\). Here we separate the three terms in \(\partial _{\beta }\mathring{{\tilde{g}}}^{\alpha \beta}\), in particular forfeiting the cancellation when \((\alpha ,\beta )=(0,0)\). By Lemma 5.7 all paracoefficients are in ℭ, which allows us to perturbatively commute \(\partial _{\alpha}\) with them as needed. Then it suffices to estimate the expression

$$ - T_{\partial ^{\beta }u} \, \partial _{\alpha }T_{{\tilde{g}}^{ \alpha \delta}} \widehat{\partial _{\beta }\partial _{\delta}} u - T_{ \partial ^{\alpha }u} \, \partial _{\alpha }T_{{\tilde{g}}^{\beta \delta}} \widehat{\partial _{\beta }\partial _{\delta}} u + 2T_{ \partial ^{0} u} \, T_{{\tilde{g}}^{0\gamma}} \partial _{\alpha }T_{{ \tilde{g}}^{\alpha \beta}} \widehat{\partial _{\gamma }\partial _{\beta}} u. $$

For all terms here we may use Lemma 5.12(b) directly. Hence the proof of the lemma is concluded. □

6.2 Symbol classes and the \(\Psi \)DO calculus

In a similar fashion to the \(L^{\infty }S^{m}\) classes of symbols, our analysis will involve paradifferential operators with symbols that on the physical side are at either the \({\mathfrak {P}}\) or the \({\mathfrak{DP}}\) level. Precisely, we will work with both the symbol classes \({\mathfrak {P}}S^{m}\) and with the classes \({\mathfrak{DP}}S^{m}\).

For comparison purposes, we recall that for just paraproducts with \({\mathfrak {P}}\) functions \(f\), we have the uniform in time product bounds

$$ \|T_{f} T_{g} - T_{fg}\|_{H^{s} \to H^{s}} \lesssim {\mathcal {A}^{ \sharp }}^{2}, $$
(6.14)

as well as the time dependent bounds

$$ \|T_{f} T_{g} - T_{fg}\|_{H^{s} \to H^{s-1}} \lesssim {\mathcal {B}}^{2}, $$
(6.15)

and the corresponding commutator estimates. We also have, for \(h \in {\mathfrak{DP}}\),

$$ \|T_{f} T_{h} - T_{T_{f} h}\|_{H^{s} \to H^{s}} \lesssim {\mathcal {B}}^{2}. $$
(6.16)

Our objective in what follows is to expand these kinds of bounds to the \(\Psi \)DO setting. We will see that things become more complex there. Fortunately, in the present paper we will primarily need such results when one of the operators is a paraproduct, so we only prove our results in this case, and merely comment on the general case.

We begin with the uniform in time bounds, i.e. the counterpart of (6.14), where not much changes:

Lemma 6.11

Let \(f \in {\mathfrak {P}}S^{j}\), \(g \in {\mathfrak {P}}S^{k}\). Then

$$ \|T_{f} T_{g} - T_{fg}\|_{H^{s} \to H^{s-j-k}} \lesssim {\mathcal {A}^{ \sharp }}^{2}. $$
(6.17)

Proof

By definition we have a decomposition \(f = f_{1} + f_{2}\) where \(f_{1}\) is an \(S^{j}\) multiplier and \(f_{2} \in {\mathcal {A}^{\sharp }}L^{\infty }S^{j}\), and similarly for \(g\). Since \(T_{f_{1}} = f_{1}(D)\) and \(T_{g_{1}} = g_{1}(D)\), the leading parts cancel and we are left only with \(O({\mathcal {A}^{\sharp }})\) terms, which can be estimated directly without using any cancellation. □

Our next result is concerned with the counterpart of (6.16), where again the result is similar:

Lemma 6.12

For \(g \in {\mathfrak {P}}\) and \(h \in {\mathfrak{DP}}S^{m}\) we have

$$ \|T_{g} T_{h} - T_{T_{g} h}\|_{H^{s} \to H^{s-m}} \lesssim { \mathcal {B}}^{2}. $$

Here, by a slight abuse of notation, \(T_{g} h\) denotes the symbol paraproduct, where the Fourier variable is viewed as a parameter.

Proof

All operators in the lemma preserve dyadic frequency localization, so it suffices to fix a dyadic frequency size \(k\) and then show that we have

$$ \| (T_{g} T_{h} - T_{T_{g} h}) P_{k} u \|_{L^{2}} \lesssim 2^{mk} { \mathcal {B}}^{2} \|u\|_{L^{2}}. $$

Here we can include the \(2^{mk}\) factor in \(h\) and reduce the problem to the case when \(m=0\).

In the first term we can also harmlessly replace \(g\) by \(g_{< k}\) and \(T_{g}\) by multiplication by \(g_{< k}\), as

$$ \| (T_{g} - g_{< k}) P_{k} u \|_{L^{2} \to L^{2}} \lesssim 2^{- \frac{k}{2}} {\mathcal {B}}, $$
(6.18)

while \(h_{< k} \in 2^{\frac{k}{2}} {\mathcal {B}}L^{\infty }S^{0}\), and therefore

$$ \| T_{h} P_{k} \|_{L^{2} \to L^{2}} \lesssim 2^{\frac{k}{2}}{ \mathcal {B}}. $$

Similarly, in the second term we can replace \(T_{g} h\) by \(g_{< k} h\), as

$$ \| P_{< k} (T_{g} h - g_{< k} h)\|_{L^{\infty }S^{0}} \lesssim { \mathcal {B}}^{2}, $$
(6.19)

akin to Lemma 6.7.

Thus it remains to bound in \(L^{2}\) the simpler operator

$$ R = (g_{< k} T_{h} - T_{g_{< k} h}) P_{k}. $$

Our last simplification here is to separate variables in \(h\), and reduce to the case where \(h\) has a product form at frequency \(2^{k}\), namely

$$ h(x,\xi ) = f(x) a(\xi ), \qquad |\xi | \approx 2^{k}, $$

where \(f \in {\mathfrak{DP}}\) and \(a \in S^{0}\). This can be done for instance by thinking of \(h\) as a function of \(\xi \) in a dyadic frequency cube, smooth on the corresponding scale, and by taking a Fourier series in \(\xi \), with coefficients depending on \(x\). The coefficients will inherit the spatial regularity from \(h\), and will be rapidly decreasing since we are taking the Fourier series of a smooth function of \(\xi \).
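Schematically, with \(Q\) a cube of side length \(\ell (Q) \approx 2^{k}\) covering the region \(|\xi | \approx 2^{k}\), this expansion takes the form

$$ h(x,\xi ) = \sum _{j \in {\mathbb{Z}}^{n}} f_{j}(x) \, e^{2\pi i j \cdot \xi / \ell (Q)}, \qquad \xi \in Q, $$

where the \(f_{j}\) are the Fourier coefficients of \(h(x,\cdot )\) on \(Q\). Since \(h\) is smooth in \(\xi \) on the \(2^{k}\) scale, repeated integration by parts in \(\xi \) shows that the \(f_{j}\) decay rapidly, \(\|f_{j}\| \lesssim _{N} (1+|j|)^{-N}\) for every \(N\), so the bounds for the individual rank one pieces \(f_{j}(x)\, e^{2\pi i j \cdot \xi /\ell (Q)}\) may be summed in \(j\).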

After this simplification we may represent the operator \(T_{h}\) in the form

$$ T_{h} P_{k} u = L_{lh}(f, P_{k} u), $$

where the symbol of the bilinear form \(L_{lh}\) depends linearly (and explicitly) on \(a\). In this case we may rewrite the operator \(R\) in the form

$$ R u = g_{< k} L_{lh}(f, P_{k} u) - L_{lh}(g_{< k} f, P_{k} u). $$

At this point we can apply one last time the method of separation of variables to the symbol of \(L_{lh}\) to reduce the problem to the case when the bilinear form \(L_{lh}\) is of product type,

$$ L_{lh}(f, P_{k} u) = b_{< k}(D) f c(D) P_{k} u, $$

where both symbols \(b_{< k}\) and \(c P_{k}\) are bounded and smooth on the \(2^{k}\) scale. After this final reduction the operator \(R\) has a commutator structure,

$$ R u = [g_{< k}, b_{< k}(D)] f c(D) P_{k} u. $$

Here \(|P_{< k} f| \lesssim 2^{\frac{k}{2}} {\mathcal {B}}\), while the commutator can be bounded by

$$ \| [g_{< k}, b_{< k}(D)]\|_{L^{2} \to L^{2}} \lesssim 2^{-k} \| \partial _{x} g_{< k} \|_{L^{\infty}} \lesssim 2^{-\frac{k}{2}} { \mathcal {B}}. $$
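The first inequality here is the standard commutator kernel bound: writing \(K\) for the kernel of \(b_{< k}(D)\), we have

$$ [g_{< k}, b_{< k}(D)] w(x) = \int K(y) \left ( g_{< k}(x) - g_{< k}(x-y) \right ) w(x-y) \, dy, $$

and since \(|g_{< k}(x) - g_{< k}(x-y)| \leq |y| \, \|\partial _{x} g_{< k}\|_{L^{\infty}}\) while \(\int |y|\, |K(y)| \, dy \lesssim 2^{-k}\) (the symbol \(b_{< k}\) being bounded and smooth on the \(2^{k}\) scale), Schur's test yields the stated \(L^{2}\) bound.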

Hence we obtain

$$ \|R\|_{L^{2} \to L^{2}} \lesssim {\mathcal {B}}^{2}, $$

and the proof of the lemma is concluded. □

In very limited circumstances, we will also need a more precise commutator expansion, which arises in the context where we commute one paradifferential operator with symbol \(h \in {\mathfrak {P}}S^{m}\) with a function \(g \in {\mathfrak {P}}\). This will be applied when \(g = {\tilde{g}}^{\alpha \beta}\), but the result holds more generally. The novelty in the commutator expansion below is that we do not simply expand

$$ \text{commutator} = \text{principal part} + \text{error}, $$

but instead we seek to better understand the structure of the error,

$$ \text{commutator} = \text{principal part} + \text{unbalanced subprincipal part} + \text{balanced error.} $$

The principal part corresponds exactly to the Lie bracket of the two symbols, interpreted paradifferentially. For possible use later, we define this more generally for two symbols:

Definition 6.13

The para-Lie bracket of two symbols \(f \in {\mathfrak {P}}S^{j}\), \(g \in {\mathfrak {P}}S^{k}\) is defined as

$$ \{ f,g\}_{p} = T_{\partial _{\xi }f} \partial _{x} g - T_{\partial _{ \xi }g} \partial _{x} f . $$
(6.20)

This belongs to \({\mathfrak{DP}}S^{j+k-1}\).

We remark that if \(f\) is merely a function, then the first term on the right drops.

While the principal part of the commutator can be described using a paradifferential operator with an appropriate symbol, the unbalanced subprincipal part has a more complex structure which would be described best using a variable coefficient bilinear form. In order to be able to describe this structure, we need a slight expansion of the class \(L_{lh}\) of bilinear operators in Definition 2.2:

Definition 6.14

By \({\mathfrak {P}}S^{m} L_{lh}\) we denote any bilinear operator which is a linear combination of operators of the form

$$ T_{h} L_{lh}, \qquad h \in {\mathfrak {P}}S^{m}, $$

where the linear combination is either finite, or infinite but rapidly convergent.

With this notation, we have the following commutator result:

Proposition 6.15

For \(g \in {\mathfrak {P}}\) and \(h \in {\mathfrak {P}}S^{m}\) we have the commutator expansion

$$ [T_{g}, T_{h}] = -i T_{\{g, h\}_{p}} + OP{\mathfrak {P}}S^{m-2} L_{lh}( \partial _{x}^{2} g, \cdot ) + R, $$
(6.21)

where

$$ \| R \|_{H^{s} \to H^{s-m+1}} \lesssim {\mathcal {B}}^{2}. $$
(6.22)

Proof

As in the proof of Lemma 6.12, we first localize in frequency to a dyadic scale \(2^{k}\) for the input/output, and reduce to the case \(s = 0\) and \(m = 2\).

We consider first the special case when \(h\) is a multiplier, \(h(x,\xi )=h(\xi )\). Then

$$ \{h, g\}_{p} = h_{\xi }g_{x}. $$

In this case we claim that we have an exact formula,

$$ [T_{g}, T_{h}] u = -i T_{\{g, h\}_{p}} u + C, \qquad Cu = L_{lh}( \partial _{x}^{2} g, u). $$
(6.23)

A priori the last term on the right, \(C\), is a \(lh\) type translation invariant bilinear form in \(g\), \(u\); all we need to do is to compute its symbol, and verify that it has symbol type regularity and vanishes to second order at \(\eta =0\). The symbol for \(T_{g} u\) as a bilinear form in \(g\) and \(u\) is

$$ \ell (\eta ,\xi ) = \chi (\frac{|\eta |}{|\xi +\frac{1}{2} \eta |}) . $$

Then the symbol for the commutator is

$$ \chi (\frac{|\eta |}{|\xi +\frac{1}{2} \eta |}) (h(\xi ) - h(\xi + \eta )) . $$

We expand the last difference as a Taylor series around the middle as

$$ h(\xi ) - h(\xi +\eta ) = - \eta \nabla h(\xi +\frac{1}{2} \eta ) + \eta ^{2} r(\xi ,\eta ) $$

with \(r\) a smooth symbol in both \(\eta \) and \(\xi \) on the \(2^{k}\) scale for \(|\eta | \ll |\xi | \approx 2^{k}\). The middle term gives the symbol of the Weyl quantization for the Lie bracket \(\{h, g\}_{p}\). The last term yields the error term \(C\), which has the \(\eta ^{2}\) factor corresponding to the two derivatives of \(g\).
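For instance, writing the remainder in integral form around the midpoint \(m = \xi + \frac{1}{2}\eta \), one has

$$ \eta ^{2} r(\xi ,\eta ) = \frac{1}{4} \int _{0}^{1} (1-s) \, \eta \cdot \left ( \nabla ^{2} h(m - \tfrac{s}{2}\eta ) - \nabla ^{2} h(m + \tfrac{s}{2}\eta ) \right ) \eta \, ds, $$

so each \(\xi \) or \(\eta \) derivative of \(r\) costs at most a factor of \(2^{-k}\) in the region \(|\eta | \ll |\xi | \approx 2^{k}\), as claimed.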

Next we turn our attention to the general case, which we seek to reduce to the special case above. This is achieved by separating variables in the symbol \(h\), which allows us to assume without any restriction in generality that the symbol \(h\) has the form

$$ h(x,\xi ) = a(x) b(\xi ). $$

Then we have a corresponding decomposition at the operator level,

$$ T_{h} u = T_{a} B(D) u + C_{0} u, \qquad C_{0} u = L_{lh}(a_{x},u). $$
(6.24)

Here we can estimate the commutator with \(T_{g}\) as an error term,

$$ [C_{0}, T_{g}] = R. $$

This is most readily seen using another separation of variables, which allows us to reduce the problem to the case when

$$ C_{0} u = C^{1}_{0}(D) a_{x} C_{0}^{2}(D) u, $$

after which we may apply Lemma 2.4. The same lemma also shows that the commutator \([T_{a},T_{g}]\) yields an error term, so we arrive at

$$ [T_{g}, T_{h}] = T_{a} [B(D), T_{g}] + R. $$

For the commutator on the right we apply the formula (6.23), which yields

$$ [T_{g}, T_{h}] u = -i T_{a} ( T_{b_{\xi }g_{x}} u + L_{lh}(g_{xx}, u)) + R u. $$

It remains to refine the first product,

$$ T_{a} T_{b_{\xi }g_{x}} = T_{T_{a b_{\xi}} g_{x} } + R $$

for which we use Lemma 6.12. □

Our final result here is a product formula where we also need an expansion akin to (6.21). One should contrast this with Lemma 6.12, where such expansion was not necessary.

Proposition 6.16

For \(g \in {\mathfrak {P}}S^{m}\) and \(h \in {\mathfrak{DP}}\) we have the commutator expansion

$$ T_{g} T_{h} = T_{T_{g} h} + OP{\mathfrak {P}}S^{m-1} L_{lh}(\partial _{x} h, \cdot ) + R, $$
(6.25)

where

$$ \| R \|_{H^{s} \to H^{s-m}} \lesssim {\mathcal {B}}^{2}. $$
(6.26)

Proof

The proof follows the same outline as that of the previous proposition, so we only sketch the main points.

We first localize in frequency to a dyadic region at scale \(2^{k}\), and then separate variables in the first factor. If \(g\) is simply a multiplier then (6.25) is an exact identity akin to (6.23) above. If instead

$$ g = a(x) b(\xi ), \qquad a \in {\mathfrak {P}}, $$

then we expand \(T_{g}\) as in (6.24), and then replace \(T_{a}\) by multiplication by \(a_{< k}\), using (6.18), (6.19). After these simplifications, we are left with estimating the difference

$$ R_{0} = (g_{< k} T_{bh} - T_{g_{< k} bh})P_{k} u = g_{< k} L_{lh}(b,P_{k} u) - L_{lh}(g_{< k} b,P_{k} u). $$

This difference is easily turned into another commutator and estimated as in (6.26); this is achieved by separating again variables in the symbol of \(L_{lh}\) as in the analysis after (6.24). □

7 Energy estimates for the paradifferential equation

Our objective in this section is to prove that the linear paradifferential flow

$$ ( \partial _{\alpha} T_{g^{\alpha \beta}} \partial _{\beta} - T_{A^{ \gamma}}\partial _{\gamma}) v = f $$
(7.1)

is locally well-posed in a range of Sobolev spaces. Precisely, we will show that

Theorem 7.1

Let \(u\) be a smooth solution for the minimal surface equation (3.5) in a time interval \(I=[0,T]\), with associated control parameters \({\mathcal {A}^{\sharp }}\) and \({\mathcal {B}}\), so that

$$ {\mathcal {A}^{\sharp }}\ll 1, \qquad {\mathcal {B}}\in L^{2}_{t}. $$
(7.2)

Let \(s \in {\mathbb{R}}\). Then the linear paradifferential flow (7.1) is locally well-posed in \(\mathcal {H}^{s}\) in the time interval \(I\). Furthermore, there exists an energy functional \(E^{s}(v) = E^{s}(v[t])\), depending on \(u\), which is smooth in \(\mathcal {H}^{s+1}\), with the following two properties:

a) Energy equivalence:

$$ E^{s}(v[t]) \approx \| v[t]\|_{\mathcal {H}^{s}}^{2} . $$
(7.3)

b) Energy estimate:

$$ \frac{d}{dt} E^{s}(v[t]) \lesssim {\mathcal {B}}^{2} E^{s}(v[t]) + \|f \|_{H^{s-1}} E^{s}(v[t])^{\frac{1}{2}} . $$
(7.4)

The same result is also valid for the paradifferential equations (3.26), respectively (3.27) associated to the metrics \({\tilde{g}}\) and \({\hat{g}}\).

We remark on the modular structure of our arguments: precisely, from this section only the conclusion of this theorem is used later in the paper. We also remark on the smallness condition for \({\mathcal {A}^{\sharp }}\):

Remark 7.2

The condition that \({\mathcal {A}^{\sharp }}\ll 1\) in the theorem is a technical convenience rather than a necessity. It is only used in the reduction in Proposition 7.3 in order to ensure that the operator \(T_{g^{00}}\) is invertible, and then in Lemma 7.4 in order to ensure that our vector field \(X\) is forward time-like. Since \(|g^{00}| \gtrsim 1\), this may alternatively be guaranteed by a more careful choice of the quantization, respectively construction of \(X\). Another minor advantage is that with this assumption we no longer need to track the dependence of implicit constants on \({\mathcal {A}^{\sharp }}\) in all the estimates.

It will be easier to prove the result for the paradifferential flow associated to the metric \({\tilde{g}}\). Because of this, our first step will be to reduce the problem to this case. Then we will prove the result for \({\tilde{g}}\) in two steps. First, we show that the desired result holds for \(s = 0\). Then, we use a paraconjugation argument to show that the same result holds for all real \(s\).

7.1 Equivalent metrics

The idea here is that we can replace the metric \(g\) with the conformally equivalent metric \({\tilde{g}}\) given by (3.18) in order to simplify the subsequent analysis. A similar equivalence holds for the metric \({\hat{g}}\); the argument is completely identical.

Then we have the following equivalence:

Proposition 7.3

Assume that \(v\) solves (7.1). Then it also satisfies an equation of the form

$$ (\partial _{\alpha} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} - T_{{ \tilde{A}}^{\gamma}}\partial _{\gamma}) v = E f + {\tilde{R}}v, $$
(7.5)

where \(E\) is invertible and elliptic,

$$ \| Ef \|_{H^{s}} \approx \|f\|_{H^{s}}, $$
(7.6)

and \({\tilde{R}}\) is balanced,

$$ \| {\tilde{R}}v\|_{H^{s}} \lesssim {\mathcal {B}}^{2} \| \partial v\|_{H^{s}}. $$
(7.7)

Proof

We first observe that, since \(g^{00}\) is a small, \(O({\mathcal {A}})\) perturbation of a nonzero constant, it follows that \(T_{(g^{00})^{-1}}\) is an invertible elliptic operator, with elliptic inverse \(E = (T_{(g^{00})^{-1}})^{-1}\), which satisfies (7.6) for all real \(s\).

Then \(v\) solves (7.5) with \({\tilde{R}}\) of the form

$$ {\tilde{R}}= E\partial _{\alpha} ( T_{{\tilde{g}}^{\alpha \beta}} - T_{(g^{00})^{-1}} T_{g^{\alpha \beta}}) \partial _{\beta }- E ( T_{(g^{00})^{-1}} T_{A^{ \gamma}} - T_{{\tilde{A}}^{\gamma}} - T_{\partial _{\alpha }(g^{00})^{-1}} T_{g^{\alpha \gamma}}) \partial _{\gamma} . $$

Here we have the algebraic relations

$$ {\tilde{g}}^{\alpha \beta} = (g^{00})^{-1} g^{\alpha \beta}, \qquad (g^{00})^{-1} A^{\gamma} = {\tilde{A}}^{\gamma }+ \partial _{\alpha }(g^{00})^{-1} g^{ \alpha \gamma}. $$

This allows us to estimate \({\tilde{R}}\) in a balanced fashion using Lemma 5.7 and Lemma 2.7, as desired. □

As a consequence of this result, we see that it suffices now to prove the result in Theorem 7.1 but with the equation (7.1) replaced by

$$ (\partial _{\alpha} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} - 2 T_{{\tilde{A}}^{\gamma}}\partial _{\gamma}) v = f . $$
(7.8)

7.2 The \(H^{1} \times L^{2}\) bound

For expository purposes, we first review the multiplier method for proving energy estimates for the wave equation in a simplified setting. Guided by this, we construct a suitable vector field, to be used as our multiplier. Finally, we reinterpret the energy estimates at the paradifferential level, and prove Theorem 7.1 with \(s = 1\).

7.2.1 Energy estimates via the multiplier method

Suppose that we have a function \(v\) that solves a divergence form wave equation,

$$ P v = f, \qquad P = \partial _{\alpha }g^{\alpha \beta} \partial _{ \beta }- A^{\alpha }\partial _{\alpha }. $$
(7.9)

Given a vector field \(X = X^{\alpha }\partial _{\alpha}\), the standard strategy is to multiply the equation by \(X v\) and integrate by parts. For expository purposes we will follow this path here, noting that another alternative would be to interpret the vector field in the Weyl calculus, and work instead with the skew-adjoint operator

$$ X^{w} = X^{\alpha }\partial _{\alpha }+ \frac{1}{2} \partial _{ \alpha }X^{\alpha }. $$
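Indeed, \(X^{w}\) is skew-adjoint, as one checks directly from the formula for the adjoint of a first order operator:

$$ (X^{w})^{*} = - \partial _{\alpha }\circ X^{\alpha }+ \frac{1}{2} \partial _{\alpha }X^{\alpha }= - X^{\alpha }\partial _{\alpha }- \partial _{\alpha }X^{\alpha }+ \frac{1}{2} \partial _{\alpha }X^{ \alpha }= - X^{w}. $$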

At this point we only seek to identify the principal part of the energy estimates, which will lead us to the choice of the vector field \(X\), so we do not follow this second path. However, later on, once \(X\) is chosen and we have switched to the paradifferential setting, we will need to also carefully track the lower order terms, and we will add lower order corrections to our vector field.

To further place the following computations into context, we remark that vector field energy identities for the wave equation are often employed in their covariant form, which is derived by contracting the divergence free relation for the energy momentum tensor with the vector field \(X\), and integrating with respect to the measure associated with the metric \(g\). Such a strategy would work, but would be counterproductive in our setting, where we will reinterpret all these identities in a paradifferential fashion.

Assuming at first that the function \(v\) is compactly supported, and integrating by parts several times in order to essentially commute the second order part of \(P\) with \(X\), one arrives at the identity

$$ 2\iint Pv \cdot X v \, dx dt = \iint c_{X}(v,v) \, dx dt, $$
(7.10)

where \(c_{X}\) is a quadratic expression in \(\partial v\) of the form

$$ c_{X} (v,v) = c_{X}^{\alpha \beta} \partial _{\alpha }v \, \partial _{ \beta }v $$
(7.11)

with coefficients given by the relation

$$ c_{X}(x,\xi ):= c_{X}^{\alpha \beta} \xi _{\alpha }\xi _{\beta }= \{ p,X^{ \gamma }\xi _{\gamma}\}(x,\xi ) - \partial _{\gamma }X^{\gamma }p(x, \xi ) + 2 A^{\gamma }\xi _{\gamma }X^{\delta }\xi _{\delta}, $$
(7.12)

where we recall that \(p(x,\xi ) = g^{\alpha \beta}\xi _{\alpha }\xi _{\beta}\). Removing the compact support assumption on \(v\) and introducing boundaries at times \(t = 0\) and \(t = T\), the identity above with the integral taken over \([0,T]\times {\mathbb{R}}^{n}\) still holds but with added contributions at these times,

$$ 2\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}Pv \cdot X v\, dx dt = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}c_{X}(v,v) \ dx dt + \left . \int _{{\mathbb{R}}^{n}} e_{X}(v,v)\, dx \right |_{0}^{T} , $$
(7.13)

where the contributions at the initial and final time can be thought of as energies. Here the energy density \(e_{X}\) is a bilinear expression of the form

$$ e_{X}(v,v) = e_{X}^{\alpha \beta} \partial _{\alpha }v \cdot \partial _{\beta }v . $$
(7.14)

This can be written in terms of the energy momentum tensor associated to the \(\Box _{g}\) operator,

$$ T_{\alpha \beta}[v] = \partial _{\alpha }v \cdot \partial _{\beta }v - \frac{1}{2}g_{\alpha \beta} g^{\gamma \delta}\partial _{\delta} v \cdot \partial _{\gamma }v . $$
(7.15)

Then we have

$$ e_{X}^{\alpha \beta} \partial _{\alpha }v \cdot \partial _{\beta }v = g^{0 \alpha} T_{\alpha \beta} X^{\beta }= T(\partial _{t},X). $$
(7.16)

Thus we can define the energy functional associated to the vector field \(X\) as

$$ E_{X}[v] = \int _{{\mathbb{R}}^{n}} e_{X}(v,v)\, dx . $$
(7.17)

The key property of the energy density \(e_{X}\) is that it is classically known to be positive definite in a pointwise sense,

$$ e_{X}(v,v) = T(\partial _{t},X) \gtrsim |\partial v|^{2}, $$
(7.18)

provided that the vector fields \(\partial _{t}\) and \(X\) are uniformly forward time-like. Then we obtain the energy coercivity property

$$ E_{X}[v(t)] \approx \| \partial v(t)\|_{L^{2}}^{2}. $$
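As a consistency check, consider the flat case \(g = m\) (with signature convention \((+,-,\dots ,-)\), so that \(m^{00} = 1\)), \(A = 0\) and \(X = \partial _{t}\). Then (7.15), (7.16) give

$$ e_{X}(v,v) = T_{00}[v] = (\partial _{t} v)^{2} - \frac{1}{2} m_{00} \, m^{\gamma \delta} \partial _{\gamma }v \, \partial _{\delta }v = \frac{1}{2} \left ( (\partial _{t} v)^{2} + |\nabla _{x} v|^{2} \right ), $$

the standard energy density for the wave equation. Moreover \(c_{X} = 0\) in (7.12), since \(p\) is \(x\)-independent and \(\partial _{\gamma }X^{\gamma }= 0\), so (7.19) reduces to the classical energy identity \(\frac{d}{dt} E_{X}(v) = 2\int P v \cdot \partial _{t} v \, dx\).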

With these notations, we can rewrite the integral identity (7.13) as a differential identity

$$ \frac{d}{dt} E_{X}(v) = 2\int _{{\mathbb{R}}^{n}} P v \cdot X v \, dx - \int _{{\mathbb{R}}^{n}} c_{X}(v,v)\, dx. $$
(7.19)

In a nutshell, this computation, interpreted paradifferentially, is at the heart of our proof of the energy estimates. In this context, the choice of the vector field \(X\) should naively be governed by the requirement that the energy flux form \(c_{X}\) is balanced. We note that one cannot ask for \(c_{X}\) to be zero, as this would produce an overdetermined system for \(X\), which in particular implies the condition that \(X\) is a conformal Killing field for the metric \(g\). Even the requirement that \(c_{X}\) is balanced turns out to be a bit too much, which is why we will need a second step to the above computation.

Precisely, the second step is based on another interesting observation, namely that the contribution of terms in \(c_{X}^{\alpha \beta}\) of the form

$$ I = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}q g^{\alpha \beta} \partial _{\alpha }v \cdot \partial _{\beta }v \, dxdt $$

has a favourable structure and can be eliminated using a suitable Lagrangian type energy correction.

Indeed, for compactly supported \(v\), this contribution can be rewritten, integrating by parts, as

$$ \begin{aligned} I = & \ - \iint q \partial _{\alpha }g^{\alpha \beta} \partial _{\beta }v \cdot v \, dx dt + \frac{1}{2} \iint \partial _{\alpha }g^{\alpha \beta} \partial _{\beta }q\, v^{2} dx dt\\ = & \ - \iint P v \cdot q v \, dxdt + \frac{1}{2} \iint \left (P q + q \partial _{\gamma }A^{\gamma}\right ) v^{2} dxdt + \iint A^{\gamma }\partial _{\gamma }q \, v^{2} dxdt . \end{aligned} $$

The first term can be interpreted as a correction to \(X\) in (7.10). Introducing the notation

$$ {\mathfrak {M}}= 2X+q, $$
(7.20)

it now takes the form

$$ \iint P v \cdot {\mathfrak {M}}v\, dx dt = \iint c_{X}(v,v)- q g^{ \alpha \beta} \partial _{\alpha }v \cdot \partial _{\beta }v + d v^{2} \, dx dt, $$
(7.21)

where the coefficient \(d\) of the additional zero order term is

$$ d = \frac{1}{2}(P q + q \partial _{\gamma }A^{\gamma}) + A^{\gamma }\partial _{\gamma }q. $$
(7.22)

Finally, adding in boundaries at \(t=0,T\) we obtain the integral relation

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}P v \cdot {\mathfrak {M}}v \, dx dt = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}c_{X,q} (v,v)+ d v^{2} \ dx dt + \left . \int _{{\mathbb{R}}^{n}} e_{X,q}(v,v)\, dx \right |_{0}^{T}, $$
(7.23)

where the leading flux density is now

$$ c_{X,q}(v,v) = c_{X} (v,v) - q g^{\alpha \beta} \partial _{\alpha }v \cdot \partial _{\beta }v , $$

while the new energy density \(e_{X,q}\) has the form

$$ e_{X,q}(v,v) = e_{X}(v,v) + q g^{0\beta} \partial _{\beta }v \cdot v - \frac{1}{2} (g^{0\beta} \partial _{\beta }q - q A^{0}) v^{2} . $$

We can also convert this into a differential relation akin to (7.19), namely

$$ \frac{d}{dt} E_{X,q}(v) = \int _{{\mathbb{R}}^{n}} P v \cdot { \mathfrak {M}}v \, dx - \int _{{\mathbb{R}}^{n}} c_{X,q}(v,v) + d v^{2} \, dx. $$
(7.24)

The identity (7.24) will heuristically provide the intuition for the proof of the desired energy estimate. However, to make this rigorous we will have to re-implement the above computation at the paradifferential level. There, the treatment of the lower order terms will differ slightly, in part in order to avoid a need for direct bounds on higher order time derivatives of \(u\).

Based on the relation (7.24), the vector field \(X\) will have to be chosen so that the symbol for the bilinear form \(c_{X,q}\) is balanced, or equivalently so that \(c_{X}\) is balanced modulo a Lagrangian contribution. In turn, the Lagrangian correction weight \(q\) will have to be chosen carefully, so that it satisfies multiple requirements:

  1. (1)

    Comparing the form of \(c_{X,q}\) with the earlier expression for \(c_{X}\), a natural choice would seem to be

    $$ q = \partial _{\gamma }X^{\gamma}. $$
    (7.25)
  2. (2)

    Examining the lower order coefficient \(d\) above, we will need to have good control over the function \(P q\). Here it is the second order part of \(P\) that matters, as the effect of the magnetic term will turn out to be directly perturbative.

Reconciling these two requirements will play an important role later on in this section.

To complete our discussion here, we need to carry out an additional step, namely to investigate what happens if we replace \(g\), \(A\) by \({\tilde{g}}\), \({\tilde{A}}\). Observing that

$$ P v = g^{00} \tilde{P}v $$

it becomes natural to replace the vector field \(X\), the Lagrangian weight \(q\) and the multiplier by

$$ \tilde{X}= g^{00} X, \qquad {\tilde{q}}= g^{00} q, \qquad \tilde{\mathfrak {M}}= 2 \tilde{X}+ {\tilde{q}}. $$

Then the relation (7.23) remains essentially unchanged,

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\tilde{P}v \cdot \tilde{\mathfrak {M}}v\, dx dt = \int _{0}^{T} \!\!\!\! \int _{{ \mathbb{R}}^{n}}c_{X,q} (v,v)+ d v^{2} \ dx dt + \left . \int _{{ \mathbb{R}}^{n}} e_{X,q}(v,v)\, dx \right |_{0}^{T}, $$
(7.26)

and the same applies to the differential form (7.24) of the same relation. Here the principal flux symbol can be equivalently expressed in the form

$$ c_{X,q}(x,\xi ):= c_{X,q}^{\alpha \beta} \xi _{\alpha }\xi _{\beta }= \{ \tilde{p},\tilde{X}\}(x,\xi ) - (\partial _{\gamma }\tilde{X}^{\gamma}+{ \tilde{q}}) p(x,\xi ) + 2{\tilde{A}}^{\gamma }\xi _{\gamma }\tilde{X}^{ \delta }\xi _{\delta}. $$
(7.27)

Our task is now twofold:

  • To identify a suitable time-like vector field \(X\) so that the energy flux above satisfies a balanced energy estimate, and

  • To recast the above computation in the paradifferential setting without losing the energy balance; this will also require a careful choice for \(q\).

7.2.2 The construction of the vector field \(X\)

Our objective here is to construct a forward time-like vector field \(X\) so that the flux coefficients in \(c_{X,q}\) are balanced for \(q\) as in (7.25). In essence, at this stage we disregard any paradifferential frequency localizations, and work as if \(v\) has infinite frequency. We also do not distinguish between \(g\) and \({\tilde{g}}\), as this does not play a role in the choice of \(X\). Our main result governing the choice of the vector field \(X\) is where our notion of paracontrolled distributions is first needed, and reads as follows:

Lemma 7.4

There exists a forward time-like vector field \(X\) that is paracontrolled by \(\partial u\), and so that we have the balanced bound

$$ \|c_{X}^{\alpha \beta} + \partial _{\gamma }X^{\gamma }g^{\alpha \beta}\|_{L^{\infty}} \lesssim {\mathcal {B}}^{2}. $$
(7.28)

We remark that the fact that such a vector field exists is closely connected to the fact that our equation satisfies the nonlinear null condition in a strong sense. One should think of our vector field \(X\) as the next best thing to a Killing or conformal Killing vector field. Perhaps a good terminology would be a para-Killing vector field, i.e. one whose deformation tensor is balanced, rather than equal to zero or a multiple of the metric.

Proof

Starting from (7.12), we compute the expression in (7.28) as follows:

$$\begin{aligned} (c_{X}^{\alpha \beta}+ \partial _{\gamma }\tilde{X}^{\gamma }g^{ \alpha \beta}) \xi _{\alpha }\xi _{\beta } = & \ 2 \xi _{\gamma }\xi _{ \alpha }g^{\gamma \beta} \partial _{\beta }X^{\alpha }- X^{\gamma } \xi _{\alpha }\xi _{\beta }\partial _{\gamma} g^{\alpha \beta} + 2A^{ \delta }\xi _{\delta }X^{\gamma }\xi _{\gamma } \\ = & \ 2 \xi _{\gamma }\xi _{\alpha }g^{\gamma \beta} \partial _{ \beta }X^{\alpha }+ 2 X^{\gamma }\partial ^{\beta }u \xi _{\beta }g^{ \alpha \delta} \partial _{\delta }\partial _{\gamma }u \xi _{\alpha} \\ &{}+ 2 \partial ^{\alpha }u \partial _{\alpha }\partial _{\beta }u g^{\beta \delta} \xi _{\delta }X^{\gamma }\xi _{\gamma } \\ = & \ 2 \xi _{\gamma }\xi _{\alpha }g^{\gamma \beta} \partial _{ \beta }X^{\alpha }+ 2 \xi _{\gamma }\xi _{\alpha }X^{\beta }\partial ^{ \gamma }u g^{\alpha \delta} \partial _{\delta }\partial _{\beta }u \\ &{}+ 2 \xi _{\alpha}\xi _{\gamma }\partial ^{\delta }u \partial _{\delta } \partial _{\beta }u g^{\beta \alpha} X^{\gamma } \\ = & \ 2 \xi _{\gamma }\xi _{\alpha }g^{\gamma \beta}( \partial _{ \beta }X^{\alpha }+ X^{\delta }\partial ^{\alpha }u \partial _{ \delta }\partial _{\beta }u + X^{\alpha }\partial ^{\delta }u \partial _{\delta }\partial _{\beta }u ). \end{aligned}$$

Here one could freely symmetrize the coefficients relative to the pair of indices \((\alpha , \gamma )\). We have chosen not to symmetrize, and instead have made favourable choices. The above expression would cancel for instance if

$$ \partial _{\beta }X^{\alpha }= - X^{\delta }\partial ^{\alpha }u \partial _{\beta }\partial _{\delta }u - \partial ^{\delta }u X^{ \alpha }\partial _{\delta }\partial _{\beta }u . $$
(7.29)

This is an overdetermined system, so we cannot hope for an exact cancellation. Even if we symmetrize (raising the \(\beta \) index first) and equate the symmetric part of the two sides, it still remains overdetermined.

But we do not need exact cancellation, we only need the difference of the two sides to be balanced. Assume for the moment that \(X\) is at the same regularity level as \(\partial u\). Then, examining the right hand side, the expressions there are unbalanced only in the paraproduct case, where the \(\partial ^{2} u\) factor carries the high frequency, i.e. for terms of the form \(T_{h(X,\nabla u)} \partial ^{2} u\). Hence we heuristically arrive at the equivalent requirement

$$ \partial _{\beta }X^{\alpha }\overset{bal}{\approx }- T_{X^{\delta } \partial ^{\alpha }u} \partial _{\beta }\partial _{\delta }u - T_{ \partial ^{\delta }u X^{\alpha}} \partial _{\delta }\partial _{\beta }u , $$

where we introduce the notation “\(\overset{bal}{\approx }\)” to indicate that the difference between the two expressions is balanced, i.e. can be estimated as in (7.28). Then, at leading order we may cancel the \(\beta \) derivative to obtain a single paradifferential relation at one regularity level higher, namely

$$ X^{\alpha }\overset{bal}{\approx }- T_{X^{\delta }\partial ^{\alpha }u} \partial _{\delta }u - T_{\partial ^{\delta }u X^{\alpha}} \partial _{ \delta }u . $$

Modulo balanced terms we may break the paraproducts above in two. This allows us to devise an inductive scheme to construct \(X\) as a dyadic sum of frequency localized pieces, by setting

$$ X = X_{0} + \sum _{k = 1}^{\infty }X_{k} $$
(7.30)

starting with the forward time-like initialization

$$ X_{0} = \partial _{t}, $$

and where the functions \(X_{k}\), localized at frequency \(2^{k}\), are defined inductively by

$$ X^{\alpha}_{k} = -( T_{X^{\delta}} T_{ \partial ^{\alpha }u} + T_{ X^{ \alpha}} T_{\partial ^{\delta }u}) \partial _{\delta }u_{k}. $$
(7.31)

It remains to show that, as defined above, the vector field \(X\) has all the properties in the Lemma. We will achieve this in three stages:

  • We show that \(X\) satisfies the same bounds as \(\partial u\) (see (5.8) and (5.15)),

    $$ \| X - X_{0}\|_{\mathfrak{C}} \lesssim 1 . $$
    (7.32)

    Since \({\mathcal {A}^{\sharp}}\ll 1\), this in particular guarantees that \(X\) is forward time-like.

  • We show that \(X- X_{0}\) is paracontrolled by \(\partial u\).

  • Finally, we establish the balanced bound (7.28).

To simplify the notations, we will write schematically that

$$ X_{k} = T_{X} T_{h} \partial u_{k}, $$

with coefficients \(h\) of the form \(h = F(\partial u)\), which belong to ℭ modulo constants.

I. Dyadic bounds for \(X\). These are proved at each dyadic frequency \(k\) by induction on \(k\). We do this in two steps, where we first estimate the \(\mathfrak{C}_{0}\) norm of \(X-X_{0}\). Precisely, the first set of statements to be proved by induction for \(k > 0\) is as follows:

$$ \| X_{k}\|_{L^{2n}} \leq C 2^{-\frac{k}{2}} {\mathcal {A}^{\sharp}}c_{k}^{2} , $$
(7.33)
$$ \| X_{k} \|_{L^{\infty}} \leq C 2^{-\frac{k}{2}} {\mathcal {B}}c_{k}, $$
(7.34)

with a fixed large universal constant \(C\). This implies that \(\|X-X_{0}\|_{\mathfrak{C}_{0}} \lesssim 1\). The induction hypothesis combined with Bernstein’s inequality yields the bound

$$ \| X_{< k} -X_{0} \|_{L^{\infty}} \lesssim C {\mathcal {A}^{\sharp }}. $$
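Here, by Bernstein's inequality in dimension \(n\) and the induction hypothesis (7.33),

$$ \| X_{j} \|_{L^{\infty}} \lesssim 2^{\frac{jn}{2n}} \| X_{j} \|_{L^{2n}} \leq C 2^{\frac{j}{2}} \cdot 2^{-\frac{j}{2}} {\mathcal {A}^{\sharp}}c_{j}^{2} = C {\mathcal {A}^{\sharp}}c_{j}^{2}, \qquad 0 < j < k, $$

and the bound follows after summation in \(j\), using the square summability of the frequency envelope \((c_{j})\).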

Then we write

$$ X_{k} = T_{X_{0}} T_{h} \partial u_{k} + T_{X_{< k} - X_{0}} T_{h} \partial u_{k}, $$

which yields

$$ \| X_{k}\|_{L^{2n}} \lesssim (1 + C{\mathcal {A}^{\sharp}}) 2^{-\frac{k}{2}}{\mathcal {A}^{\sharp}}c_{k}^{2} , $$

respectively

$$ \| X_{k} \|_{L^{\infty}} \lesssim (1+ C{\mathcal {A}^{\sharp }}) 2^{- \frac{k}{2}} {\mathcal {B}}c_{k}. $$

Thus the induction argument closes if \(C\) is a large constant and \({\mathcal {A}^{\sharp }}\ll 1\).

The second step is to prove that

$$ \|X-X_{0}\|_{\mathfrak {C}} \lesssim 1. $$

To achieve this we will prove by induction that

$$ \|\partial _{t} X_{\leq k}\|_{\mathfrak{DC}} \leq C, $$
(7.35)

i.e. that \(\partial _{t} X_{\leq k}\) admits a decomposition \(\partial _{t} X_{\leq k} = f_{1,\leq k}+f_{2,\leq k}\), where

$$ \| f_{1,\leq k}\|_{L^{\infty}} \leq C {\mathcal {B}}^{2} c_{k}^{2}, \qquad \| P_{j} f_{2,\leq k}\|_{L^{\infty}} \leq C 2^{\frac{j}{2}} { \mathcal {B}}c_{j} . $$

Here again \(C\) is a fixed large constant, unrelated to the earlier \(C\).

For this we write

$$ \begin{aligned} \partial _{t} X_{\leq k} = & \ \partial _{t} (T_{X} T_{h} \partial u_{ \leq k} ) = (T_{\partial _{t} X} T_{h} \partial u_{\leq k} + T_{X} T_{ \partial _{t} h} \partial u_{\leq k}) + T_{X} T_{h} \partial _{t} \partial u_{\leq k} . \end{aligned} $$

Here the \(X\) coefficients involve only frequencies below \(2^{k}\), so we may use the induction hypothesis in the first term. For the second and third terms it suffices to use the \(\mathfrak{C}_{0}\) bound for \(X\), which we already have from the first induction. Hence, repeatedly applying the bounds in (5.24) we obtain

$$\begin{aligned} \| \partial _{t} X_{\leq k} \|_{\mathfrak{DC}} \lesssim & \ {\mathcal {A}^{\sharp }}\|\partial _{t} X_{< k}\|_{\mathfrak{DC}} \| h \|_{L^{\infty}} \|\partial u\|_{\mathfrak{C}_{0}} + {\mathcal {A}^{ \sharp }}\| X \|_{\mathfrak{C}_{0}} \|\partial _{t} h\|_{ \mathfrak{DC}} \|\partial u\|_{\mathfrak{C}_{0}} \\ &{} + \|X\|_{ \mathfrak{C}_{0}} \|h\|_{\mathfrak{C}_{0}} \|\partial _{t} \partial u \|_{\mathfrak{DC}} \\ \lesssim & \ C {\mathcal {A}^{\sharp }}+ 1, \end{aligned}$$

which closes the inductive proof of (7.35) if \(C \gg 1\) and \({\mathcal {A}^{\sharp }}\ll 1\).

II. \(X\) is paracontrolled by \(\partial u\). To prove this, we will establish the representation

$$ X^{\alpha }= -( T_{X^{\delta }\partial ^{\alpha }u} + T_{ X^{\alpha } \partial ^{\delta }u}) \partial _{\delta }u + r^{\alpha }. $$
(7.36)

This will play the role of (6.1). The Moser estimates in Lemma 5.7 show that the paracoefficients above satisfy the bounds required of \(a\) in (6.2), so it remains to establish that the errors \(r^{\alpha}\) satisfy the bounds (6.3). For this, we write

$$ \begin{aligned} r^{\alpha }= & \ \sum _{k} X^{\alpha}_{k} + ( T_{X^{\delta }\partial ^{ \alpha }u} +T_{ X^{\alpha }\partial ^{\delta }u}) \partial _{\delta }u_{k} \\ = & \ \sum _{k} [ (T_{X^{\delta }\partial ^{\alpha }u} - T_{X^{ \delta}} T_{\partial ^{\alpha }u}) +(T_{X^{\alpha }\partial ^{\delta }u }- T_{X^{\alpha}} T_{\partial ^{\delta }u})] \partial _{\delta }u_{k} \\ := & \ \sum _{k} r^{\alpha}_{k} . \end{aligned} $$

Now we apply Lemma 2.7 to estimate

$$ \|r^{\alpha}_{k}\|_{L^{n}} \lesssim 2^{-k} c_{k}^{2} {\mathcal {A}^{\sharp}}^{2}, \qquad \|r^{\alpha}_{k}\|_{L^{\infty}} \lesssim 2^{-k} c_{k}^{2} { \mathcal {B}}^{2} $$
(7.37)

as needed. It remains to bound the time derivative of \(r^{\alpha}\) in \(L^{\infty}\). For this we distribute the time derivative. If it falls on any of the para-coefficients, then we can directly use the bound (5.25); otherwise, we use Lemma 5.9.

III. The bound for \(c_{X}^{\alpha \gamma} + \partial _{\beta }X^{\beta }g^{\alpha \gamma}\). Here we recall that

$$ c_{X}^{\alpha \gamma} + \partial _{\beta }X^{\beta }g^{\alpha \gamma} = 2 g^{\gamma \beta}( \partial _{\beta }X^{\alpha }+ X^{\delta } \partial ^{\alpha }u \partial _{\delta }\partial _{\beta }u + X^{ \alpha }\partial ^{\delta }u \partial _{\delta }\partial _{\beta }u ). $$

To estimate this, our starting point is the relation (7.36), together with the bounds (7.37) for \(r^{\alpha}\). Denoting

$$ h^{\alpha \delta} := X^{\delta }\partial ^{\alpha }u + X^{\alpha } \partial ^{\delta }u $$

we write

$$ \begin{aligned} c_{X}^{\alpha \gamma} + \partial _{\beta }X^{\beta }g^{\alpha \gamma} = & \ g^{\gamma \beta}( \partial _{\beta }X^{\alpha }+ h^{\alpha \delta} \partial _{\delta }\partial _{\beta }u) \\ = & \ T_{g^{\gamma \beta}} \partial _{\beta }r^{\alpha }+ ( T_{g^{ \gamma \beta} h^{\alpha \delta}} - T_{g^{\gamma \beta}} T_{h^{\alpha \delta}}) \partial _{\delta }\partial _{\beta }u \\ & \ + T_{\partial _{\beta }X^{\alpha}} g^{\gamma \beta} + T_{ \partial _{\delta }\partial _{\beta }u} [g^{\gamma \beta}h^{\alpha \delta}] + \Pi (\partial _{\beta }X^{\alpha}, g^{\gamma \beta}) \\ &\ + \Pi ({\partial _{\delta }\partial _{\beta }u},g^{\gamma \beta}h^{ \alpha \delta}). \end{aligned} $$

For the \(r^{\alpha}\) term we use (7.37); for the next term we use the earlier bound (5.32); and the remaining terms are estimated directly using the algebra property for \(\mathfrak{C}_{0}\) and the bilinear estimate (5.25). □

7.2.3 Paradifferential energy estimates associated to \(X\)

Now we use our vector field \(X\) to prove the balanced energy estimates for \(v\). To do this, we repeat the computations leading to the key energy relations (7.23) and (7.24) at the paradifferential level.

To fix notation, we denote by \(T_{\tilde{P}}\) the operator in (7.8),

$$ T_{\tilde{P}} = \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }- T_{{\tilde{A}}^{\gamma}} \partial _{\gamma}. $$

By a slight abuse of notation, this is not exactly the Weyl quantized operator with the corresponding symbol; however, the difference between the two can be seen to be balanced, and thus perturbative in our analysis.

For our multiplier, inspired by the energy relation (7.26), we will use the paradifferential operator

$$ T_{\tilde{\mathfrak {M}}} := 2 T_{\tilde{X}^{\alpha}} \partial _{\alpha }+ T_{{\tilde{q}}}. $$
(7.38)

Here ideally we would like to have

$$ {\tilde{q}}= - g^{00} \partial _{\alpha }X^{\alpha}. $$

However, such a choice causes some technical difficulties, due to the lack of sufficient time regularity of this expression. To avoid this, we will forgo the above explicit formula for \({\tilde{q}}\), and instead only ask that \({\tilde{q}}\) satisfy the following two properties:

  • it is close to the ideal choice,

    $$ | {\tilde{q}}+ g^{00} \partial _{\alpha }X^{\alpha }| \lesssim { \mathcal {B}}^{2} . $$
    (7.39)
  • it has the form \({\tilde{q}}= \partial _{x} q_{1}\), where \(q_{1} \in {\mathfrak {P}}\).

We remark that the obvious choice \({\tilde{q}}_{0} := - g^{00} \partial _{\alpha }X^{\alpha}\), which trivially satisfies the first criterion, does not satisfy the second one, as it contains expressions involving \(\partial _{t}^{2} u\). However, by definition we have \({\tilde{q}}_{0} \in {\mathfrak{DP}}\); therefore, a good approximation \({\tilde{q}}\) for \({\tilde{q}}_{0}\) as above exists by Lemma 6.8. Note that for this it suffices to use the fact that \(X^{\alpha }\in {\mathfrak {P}}\) separately for each \(\alpha \), rather than the more precise representation in (7.36).

Now we implement the multiplier method to prove energy estimates in the paradifferential setting. We recall our objective, which is to establish an integral energy identity of the form

$$ \iint T_{\tilde{P}} v \cdot T_{\tilde{\mathfrak {M}}} v\, dx dt = \left . E_{X}(v(t)) \right |_{0}^{T} + \int _{0}^{T} O({\mathcal {B}}^{2}) \| v(t) \|_{\mathcal {H}^{1}}^{2} \, dt $$
(7.40)

for a suitable positive definite energy functional \(E_{X}\) in ℋ,

$$ E_{X}(v(t)) \approx \|v[t]\|_{\mathcal {H}^{1}}^{2} . $$
(7.41)

This may also be interpreted as a differential energy identity,

$$ \frac{d}{dt}E_{X}(v(t)) = \int T_{\tilde{P}} v \cdot T_{ \tilde{\mathfrak {M}}} v\, dx + O({\mathcal {B}}^{2}) \| v(t)\|_{ \mathcal {H}^{1}}^{2} . $$
(7.42)

Notation for errors: There are two types of error/correction terms that appear in our computations:

  • Corrections in the energy functional. Here we will denote by \(Err({\mathcal {A}^{\sharp }})\) any fixed-time expression of size \(O({\mathcal {A}^{\sharp }}) \| v[t]\|_{\mathcal {H}^{1}}^{2}\). A typical example here is a lower order term of the form

    $$ \int _{{\mathbb{R}}^{n}} \partial v \cdot T_{q} v \, dx, \qquad q \in \partial _{x} {\mathfrak {P}}, $$

    where

    $$ \| P_{< k} q \|_{L^{\infty}} \lesssim 2^{k} {\mathcal {A}}. $$
  • Corrections in the energy flux term. These are like the last term on the right in (7.40), respectively (7.42). For brevity we will denote the admissible errors in the two identities by \(Err({\mathcal {B}}^{2})\).

To establish (7.40), we consider the contributions of the two terms in \(T_{\tilde{\mathfrak {M}}}\).

I. The contribution of \(T_{\tilde{X}^{\alpha}} \partial _{\alpha}\). Integrating by parts and commuting, this is given by

$$\begin{aligned} I_{X} = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \tilde{P}_{A}} v \cdot T_{\tilde{X}^{\gamma}} \partial _{\gamma }v\, dxdt \\ = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}( \partial _{ \alpha }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} v - T_{{ \tilde{A}}^{\alpha}} \partial _{\alpha }v) \cdot T_{\tilde{X}^{\gamma}} \partial _{\gamma }v \, dx dt \\ = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{ \alpha }T_{\tilde{X}^{\gamma}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{ \beta} v \cdot \partial _{\gamma }v - T_{\partial _{\alpha }\tilde{X}^{ \gamma}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} v \cdot \partial _{\gamma }v - T_{{\tilde{A}}^{\alpha}} \partial _{\alpha }v \cdot T_{\tilde{X}^{\gamma}} \partial _{\gamma }v \, dx dt \\ = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{ \gamma }T_{\tilde{X}^{\gamma}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{ \beta} v \cdot \partial _{\alpha }v - T_{\partial _{\alpha }\tilde{X}^{ \gamma}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} v \cdot \partial _{\gamma }v - T_{{\tilde{A}}^{\alpha}} \partial _{\alpha }v \cdot T_{\tilde{X}^{\gamma}} \partial _{\gamma }v \, dx dt \\ & + \left . \int T_{\tilde{X}^{\gamma}} T_{{\tilde{g}}^{0 \beta}} \partial _{\beta} v \cdot \partial _{\gamma }v - T_{\tilde{X}^{0}} T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\beta} v \cdot \partial _{ \alpha }v \, dx \right |_{0}^{T} \end{aligned}$$

so we obtain

$$ \begin{aligned} I_{X} = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \frac{1}{2} (\partial _{\gamma }T_{\tilde{X}^{\gamma}} T_{{\tilde{g}}^{ \alpha \beta}}\! - \! T_{{\tilde{g}}^{\alpha \beta}} T_{\tilde{X}^{ \gamma}} \partial _{\gamma}) \partial _{\beta} v \cdot \partial _{ \alpha }v \\ & \quad \!-\! T_{\partial _{\alpha }\tilde{X}^{\gamma}} T_{{\tilde{g}}^{ \alpha \beta}} \partial _{\beta} v \cdot \partial _{\gamma }v\!-\! T_{{ \tilde{A}}^{\alpha}} \partial _{\alpha }v \cdot T_{\tilde{X}^{\gamma}} \partial _{\gamma }v \, dx dt \\ & \quad + \left . \int T_{\tilde{X}^{\gamma}} T_{{\tilde{g}}^{0 \beta}} \partial _{\beta} v \cdot \partial _{\gamma }v - T_{\tilde{X}^{0}} T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\beta} v \cdot \partial _{ \alpha }v + \frac{1}{2} T_{\tilde{X}^{0}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} v \cdot \partial _{\alpha }v \, dx \right |_{0}^{T}. \end{aligned} $$

For the double integral we peel off some perturbative contributions. The first term has a commutator structure, and we distinguish several cases. If \((\alpha , \beta ) = (0,0)\), then we simply write

$$ \partial _{\gamma }T_{\tilde{X}^{\gamma}} T_{{\tilde{g}}^{0 0}} - T_{{ \tilde{g}}^{00}} T_{\tilde{X}^{\gamma}} \partial _{\gamma }= {\tilde{g}}^{00} T_{\partial _{\gamma }\tilde{X}^{\gamma}}. $$
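Indeed, if \({\tilde{g}}^{00}\) is constant (cf. the normalization \(g^{00} = -1\) used in the proof of Lemma 7.6 below), this is simply the Leibniz rule for paraproducts:

$$ \partial _{\gamma }T_{\tilde{X}^{\gamma}} T_{{\tilde{g}}^{00}} v - T_{{\tilde{g}}^{00}} T_{\tilde{X}^{\gamma}} \partial _{\gamma }v = {\tilde{g}}^{00} \left ( \partial _{\gamma }( T_{\tilde{X}^{\gamma}} v) - T_{\tilde{X}^{\gamma}} \partial _{\gamma }v \right ) = {\tilde{g}}^{00} T_{\partial _{\gamma }\tilde{X}^{\gamma}} v. $$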

If \((\alpha ,\beta )= (0,j)\) then we commute the derivative first,

$$ \partial _{\gamma }T_{ \tilde{X}^{\gamma}} T_{{\tilde{g}}^{0 j}} - T_{{ \tilde{g}}^{0j}} T_{ \tilde{X}^{\gamma}} \partial _{\gamma }= T_{ \partial _{\gamma }\tilde{X}^{\gamma}} T_{{\tilde{g}}^{0j}} + T_{ \tilde{X}^{\gamma}} T_{\partial _{\gamma}{\tilde{g}}^{0j}} + [T_{ \tilde{X}^{\gamma}},T_{{\tilde{g}}^{0j}}] \partial _{\gamma}, $$

where the contribution of the commutator term is estimated using Lemma 2.4,

$$ \| [T_{\tilde{X}^{\gamma}},T_{{\tilde{g}}^{0j}}] \partial _{j}\|_{L^{2} \to L^{2}} \lesssim {\mathcal {B}}^{2}. $$

If \((\alpha ,\beta )= (j,0)\) then we commute the paraproducts first,

$$ \partial _{\gamma }T_{ \tilde{X}^{\gamma}} T_{{\tilde{g}}^{j0}} - T_{{ \tilde{g}}^{j0}} T_{ \tilde{X}^{\gamma}} \partial _{\gamma }= T_{ \partial _{\gamma}{\tilde{g}}^{0j}} T_{\tilde{X}^{\gamma}} + T_{{ \tilde{g}}^{j0}} T_{\partial _{\gamma }\tilde{X}^{\gamma}} + \partial _{ \gamma }[T_{\tilde{X}^{\gamma}},T_{{\tilde{g}}^{j0}}] , $$

where the contribution of the commutator is again perturbative once we integrate by parts with respect to \(x^{\gamma}\). If \(\gamma =0\) then this integration by parts contributes to the energy with the expression

$$ \int [T_{\tilde{X}^{0}},T_{{\tilde{g}}^{j0}}] \partial _{0} v \cdot \partial _{j} v \, dx, $$

which also plays a perturbative role. For the double paraproducts we use Lemma 2.7 to compound them, as in

$$ \| T_{g} T_{\partial h} - T_{g \partial h}\|_{L^{2} \to L^{2}} \lesssim {\mathcal {B}}^{2}. $$

We arrive at the relation

$$ I_{X} = \iint T_{\tilde{P}_{A}} v \cdot T_{\tilde{X}^{\gamma}} \partial _{\gamma }v\, dxdt = \iint T_{c^{\alpha \beta}_{X}} \partial _{\alpha }v \cdot \partial _{\beta }v \, dx dt+ \left . E_{X}(v) \right |_{0}^{T} + Err({\mathcal {B}}^{2}), $$
(7.43)

where we recall that \(c^{\alpha \beta}_{X}\) is given by the relation (7.27), and the energy functional \(E_{X}\) is given by

$$\begin{aligned} E_{X}(v) =& \int T_{\tilde{X}^{\gamma}} T_{{\tilde{g}}^{0 \beta}} \partial _{\beta} v \cdot \partial _{\gamma }v - T_{\tilde{X}^{0}} T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\beta} v \cdot \partial _{ \alpha }v + \frac{1}{2} T_{\tilde{X}^{0}} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} v \cdot \partial _{\alpha }v \\ &{}+ [T_{\tilde{X}^{0}},T_{{ \tilde{g}}^{j0}}] \partial _{0} v \cdot \partial _{j} v \, dx . \end{aligned}$$

Here we may compound all double paraproducts and discard the commutator term, at the expense of \(Err({\mathcal {A}^{\sharp }})\) errors. We arrive at

$$ E_{X}(v) = \int T_{e^{\alpha \beta}_{X}} \partial _{\alpha }v \partial _{\beta }v \, dx + Err({\mathcal {A}^{\sharp }}), $$
(7.44)

with \(e^{\alpha \beta}_{X}\) given by (7.16), which therefore belongs to \({\mathfrak {P}}\) modulo constants. Since \(X = \partial _{t} +O({\mathcal {A}^{\sharp }})\) is uniformly time-like, this matrix is positive definite, which implies the positivity property in (7.41).

II. The contribution of \(T_{{\tilde{q}}}\). Here we need to consider the integral

$$ I_{q} = \iint (\partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }- T_{{\tilde{A}}^{\gamma}} \partial _{\gamma}) v \cdot T_{{\tilde{q}}}v\, dx dt, $$

where we recall that \({\tilde{q}}= \partial _{x} q_{1}\) with \(q_{1} \in {\mathfrak {P}}\). The contribution of \({\tilde{A}}\) is directly perturbative, as \({\tilde{A}}\in {\mathfrak{DP}}\subset \mathfrak{DC}\); one can use the \(\mathfrak {DC}\) decomposition \({\tilde{A}}= {\tilde{A}}_{1} + {\tilde{A}}_{2}\) as in Definition 5.5 (b), pairing each of the two associated bounds in (5.20) with the ℬ bound, respectively the \({\mathcal {A}^{\sharp}}\) bound, in the \(\mathfrak {C}_{0}\) norm of \(q_{1}\):

$$ \|T_{{\tilde{A}}^{\gamma}_{1}} T_{{\tilde{q}}} \|_{H^{1} \rightarrow L^{2}} \lesssim {\mathcal {B}}^{2}, \qquad \|T_{{\tilde{A}}^{\gamma}_{2}} T_{{\tilde{q}}} \|_{H^{1} \rightarrow L^{2}} \lesssim {\mathcal {A}^{\sharp}} {\mathcal {B}}^{2}. $$

Integrating by parts and using Lemmas 2.7, 2.8, we compute

$$ \begin{aligned} I_{q} = & \ - \iint T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }v \cdot ( T_{{\tilde{q}}} \partial _{\alpha }+ T_{\partial _{\alpha }{ \tilde{q}}}) v \, dx dt + \left . \int T_{{\tilde{g}}^{0\beta}} \partial _{\beta }v \cdot T_{{\tilde{q}}}v\, dx \right |_{0}^{T} + Err({ \mathcal {B}}^{2}) \\ = & \ - \iint T_{{\tilde{g}}^{\alpha \beta} {\tilde{q}}} \partial _{ \beta }v \cdot \partial _{\alpha }v \, dx dt - \iint \partial _{ \beta }v \cdot T_{T_{{\tilde{g}}^{\alpha \beta}} \partial _{\alpha }{ \tilde{q}}} v \, dx dt \\ & \ + \left . \int T_{{\tilde{g}}^{0\beta}} \partial _{\beta }v \cdot T_{{\tilde{q}}}v\, dx \right |_{0}^{T} + Err({ \mathcal {B}}^{2}). \end{aligned} $$

Here the first term on the right is the one we want, while the last term on the right yields a perturbative energy correction, i.e. of size \(Err({\mathcal {A}^{\sharp }})\). It remains to show that the second term, which we shall denote by \(I_{q}^{2}\), also yields only perturbative contributions. Heuristically, this should be relatively simple, in that we can integrate by parts once more, to obtain

$$\begin{aligned} I_{q}^{2} :=& - \iint \partial _{\beta }v \cdot T_{T_{{\tilde{g}}^{ \alpha \beta}} \partial _{\alpha }{\tilde{q}}} v \, dx dt \\ =& \frac{1}{2} \iint v \cdot T_{ \partial _{\beta }T_{{\tilde{g}}^{ \alpha \beta}} \partial _{\alpha }{\tilde{q}}} v \, dx dt - \frac{1}{2} \left . \int v \cdot T_{T_{{\tilde{g}}^{\alpha 0}} \partial _{\alpha }{\tilde{q}}} v \, dx \right |_{0}^{T} . \end{aligned}$$

Here we could estimate both integrals perturbatively and conclude directly if we knew that

$$ \| P_{< k} \partial _{\beta }T_{{\tilde{g}}^{\alpha \beta}} \partial _{ \alpha }{\tilde{q}}\|_{L^{\infty}} \lesssim 2^{2k} {\mathcal {B}}^{2}, \qquad \| P_{< k} \partial _{\alpha }{\tilde{q}}\|_{L^{\infty}} \lesssim 2^{2k} {\mathcal {A}^{\sharp }}. $$

Both of these bounds would be true if \({\tilde{q}}\) contained no time derivatives of \(u\) in its expression. However, this is too much to hope for, so a more careful argument is needed. The first step in this argument has already been carried out earlier, where we saw that we may take \({\tilde{q}}\) of the form \({\tilde{q}}= \partial _{x} q_{1}\) with \(q_{1} \in {\mathfrak {P}}\). This removes one of the two potential time derivatives in \({\tilde{q}}\), but not the second. We can use this property to write

$$ I_{q}^{2} = \iint \partial _{\beta }v \cdot T_{\partial _{x} T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }q_{1}} v \, dx dt - \iint \partial _{\beta }v \cdot T_{T_{\partial _{x} {\tilde{g}}^{ \alpha \beta}} \partial _{\alpha }q_{1}} v \, dx dt , $$

where the uniform bound

$$ \| P_{< k} T_{\partial _{x} {\tilde{g}}^{\alpha \beta}} \partial _{ \alpha }q_{1} \|_{L^{\infty}} \lesssim 2^{k} {\mathcal {B}}^{2} $$

shows that we can treat the second term perturbatively, to get

$$ I_{q}^{2} = \iint \partial _{\beta }v \cdot T_{\partial _{x} T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }q_{1}} v \, dx dt + Err({ \mathcal {B}}^{2}). $$

At this point, we use the fact that \(q_{1} \in {\mathfrak {P}}\), which implies that \(q_{1}\) solves an approximate paradifferential wave equation. The precise statement we use is the one in Lemma 6.4, which yields the representation

$$ \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }q_{1} = \partial _{\alpha }f^{\alpha } $$

with

$$ \| f^{\alpha}\|_{L^{\infty}} \lesssim {\mathcal {B}}^{2}, \qquad \|P_{< k} (\partial _{0} q_{1} -f^{0})\|_{L^{\infty}} \lesssim {\mathcal {A}^{ \sharp }}. $$
(7.45)

We use this representation to refine the outcome of the naive integration by parts above,

$$\begin{aligned} I_{q}^{2} = & \ -\frac{1}{2} \iint v \cdot T_{ \partial _{x} \partial _{\beta }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\alpha }q_{1}} v \, dx dt + \frac{1}{2} \left . \int v \cdot T_{\partial _{x} T_{{ \tilde{g}}^{\alpha 0}} \partial _{\alpha }q_{1}} v \, dx \right |_{0}^{T} + Err({\mathcal {B}}^{2}) \\ = & \ -\frac{1}{2} \iint v \cdot T_{ \partial _{x} \partial _{ \alpha }f^{\alpha}} v \, dx dt + \frac{1}{2} \left . \int v \cdot T_{ \partial _{x} T_{{\tilde{g}}^{\alpha 0}} \partial _{\alpha }q_{1}} v \, dx \right |_{0}^{T} + Err({\mathcal {B}}^{2}) \\ = & \ \iint \partial _{\alpha }v \cdot T_{ \partial _{x} f^{\alpha}} v \, dx dt + \frac{1}{2} \left . \int v \cdot T_{\partial _{x}( T_{{ \tilde{g}}^{\alpha 0}} \partial _{\alpha }q_{1}- f^{0})} v \, dx \right |_{0}^{T} + Err({\mathcal {B}}^{2}). \end{aligned}$$

By the pointwise bound on \(f^{\alpha}\) in (7.45), the first term is perturbative, i.e. \(Err({\mathcal {B}}^{2})\). By the second bound in (7.45), the second term can be seen as a perturbative \(Err({\mathcal {A}^{\sharp }})\) energy correction.

We conclude that for \(I_{q}\) we have

$$\begin{aligned} I_{q} =& \iint T_{{\tilde{g}}^{\alpha \beta} {\tilde{q}}} \partial _{\beta }v \cdot \partial _{\alpha }v \, dx dt + \left . \int T_{{\tilde{g}}^{0 \beta}} \partial _{\beta }v \cdot T_{{\tilde{q}}} v +\frac{1}{2} v \cdot T_{ \partial _{x}( T_{{\tilde{g}}^{\alpha 0}} \partial _{\alpha }q_{1}- f^{0})} v \, dx \right |_{0}^{T} \\ &{}+ Err({\mathcal {B}}^{2}). \end{aligned}$$
(7.46)

III. Conclusion. To finish the proof of (7.40), and thus of Theorem 7.1 for \(s=0\), we combine the relations (7.43) and (7.46) to obtain

$$ 2\iint T_{\tilde{P}} v \cdot T_{\tilde{\mathfrak {M}}} v\, dx dt = \iint T_{c^{\alpha \beta}_{X} + {\tilde{g}}^{\alpha \beta} {\tilde{q}}} \partial _{\alpha }v \cdot \partial _{\beta }v \, dx dt + \left . E_{X}(v(t)) \right |_{0}^{T} + Err({\mathcal {B}}^{2}), $$
(7.47)

where \(E_{X}\) is redefined as the sum of the two contributions in (7.43) and (7.46), which still has the leading order term as in (7.44) plus an \(Err({\mathcal {A}^{\sharp }})\) correction.

It remains to examine the paracoefficient in the integral on the right, and to show that it has size \(O({\mathcal {B}}^{2})\). At this point, we simply invoke the choice of our para-Killing vector field \(X\) in Lemma 7.4 (which we have not used so far) for the first term, and the choice of \({\tilde{q}}\) in (7.39) for the second term, thereby completing the proof of (7.40).

7.3 The \(H^{s+1} \times H^{s}\) bound for the linear paradifferential flow

Here we prove Theorem 7.1 in the general case, where \(s \neq 0\). The argument will be a more complex variation of the argument in the case \(s=0\), where paraproduct based multipliers have to be replaced by paradifferential multipliers.

7.3.1 The conjugated equation

For simplicity of notation, we will consider the linear paradifferential equation in \(\mathcal {H}^{s+1}\) with \(s \neq 0\). We begin by setting \(w = \langle D_{x} \rangle ^{s} v\), which solves a perturbed linear paradifferential equation of the form

$$ ( \partial _{\alpha} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} - T_{{\tilde{A}}^{\gamma}}\partial _{\gamma})w = \langle D_{x} \rangle ^{s} f + {\tilde{B}}w , $$
(7.48)

where the conjugation error \({\tilde{B}}\) in the new source term is given by

$$ {\tilde{B}}= \langle D_{x} \rangle ^{s} [ \partial _{\alpha} T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\beta} - T_{{\tilde{A}}^{\gamma}} \partial _{\gamma}, \langle D_{x} \rangle ^{-s}] . $$
(7.49)

Then we need to construct an \(H^{1} \times L^{2}\) balanced energy for the solution \(w\) to (7.48).

We note that \(\tilde{B}\) is a paradifferential operator whose principal symbol \({\tilde{b}}_{0}\) is homogeneous of order one and a first degree polynomial in the time frequency \(\xi _{0}\), and is given by

$$ {\tilde{b}}_{0}(x,\xi )= - i |\xi '|^{s} \{ {\tilde{g}}^{\alpha \beta} \xi _{\alpha} \xi _{\beta},|\xi '|^{-s}\}. $$

Using the expression (3.9) for the derivatives of the metric \(g\), this can be further written in the form

$$ \begin{aligned} {\tilde{b}}_{0}(x,\xi ) = & \ 2is (\partial ^{\beta }u {\tilde{g}}^{ \alpha \nu} \partial _{j} \partial _{\nu }u\ - \partial ^{0} u { \tilde{g}}^{0\nu} \partial _{j} \partial _{\nu }u {\tilde{g}}^{\alpha \beta}) \xi _{\alpha }\xi _{\beta }\xi _{j} |\xi '|^{-2} \\ := & \ 2 i s {\tilde{b}}_{0}^{\alpha }\xi _{\alpha }, \end{aligned} $$
(7.50)

where

$$ {\tilde{b}}_{0}^{\alpha}= (\partial ^{\beta }u \, {\tilde{g}}^{\alpha \nu} - \partial ^{0} u {\tilde{g}}^{0\nu} {\tilde{g}}^{\alpha \beta}) \partial _{j} \partial _{\nu }u\ \xi _{\beta }\xi _{j} |\xi '|^{-2} . $$
(7.51)

Here the unbalanced part of the coefficients corresponds to the case when the factor \(\partial ^{2} u\) has higher frequency than the \(\partial ^{\beta }u\) and \({\tilde{g}}^{\alpha \nu}\) factors. The important feature is that, at the operator level, \(T_{{\tilde{b}}_{0}^{\gamma}} \partial _{\gamma }w\) exhibits a null form structure of the type \(Q_{0}(\partial u, w)\), with additional, more regular paradifferential coefficients in \({\mathfrak {P}}\).
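Here \(Q_{0}\) denotes, as usual, the null form taken with respect to the metric,

$$ Q_{0}(\phi ,\psi ) = g^{\alpha \beta} \partial _{\alpha }\phi \, \partial _{\beta }\psi . $$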

We switch the leading term \(2sT_{{\tilde{b}}_{0}^{\gamma}} \partial _{\gamma}\) to the left hand side of the equation; there it will play a role similar to the gradient term \({\tilde{A}}^{\gamma }\partial _{\gamma}\). The remainder \({\tilde{B}}- 2sT_{{\tilde{b}}_{0}^{\gamma}} \partial _{\gamma}\) will play a secondary role; one should think of it as renormalizable, though we will achieve this at the level of the energy, via an energy correction, rather than through an actual normal form transformation. Our equation (7.48) becomes

$$ (\partial _{\alpha} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} - T_{{ \tilde{A}}^{\gamma}}\partial _{\gamma }- 2s T_{{\tilde{b}}_{0}^{\gamma}} \partial _{\gamma})w = \langle D_{x} \rangle ^{s} f + ({\tilde{B}}-2s T_{{ \tilde{b}}_{0}^{\gamma}}\partial _{\gamma}) w , $$
(7.52)

where the leading operator is denoted by

$$ T_{\tilde{P}_{B}} = \partial _{\alpha} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta} - T_{{\tilde{A}}^{\gamma}}\partial _{\gamma }- 2s T_{{ \tilde{b}}_{0}^{\gamma}}\partial _{\gamma}. $$
(7.53)

As in the previous case of the \(H^{1} \times L^{2}\) bounds, our strategy will be to construct a suitable vector field, or multiplier, denoted \(\tilde{X}_{s}\), which depends only on the principal symbol \(\tilde{b}_{0}\) above, and which formally generates a balanced energy estimate at the leading order. Then, reinterpreting all the analysis at the paradifferential level, we will rigorously prove that the generated energy satisfies favourable, balanced bounds.

7.3.2 The multiplier \(\tilde{X}_{s}\)

In the previous section, the multiplier \(\tilde{X}\) was a well-chosen vector field belonging to our space \({\mathfrak {P}}\) of paracontrolled distributions. Here, this can no longer work due to the presence of the operator \(T_{{\tilde{b}}_{0}}\), which is a pseudodifferential rather than a differential operator. For this reason we will instead use a pseudodifferential “vector field” \(i \tilde{X}_{s}\), where \(\tilde{X}_{s}\) has a real, odd symbol of the form

$$ \tilde{X}_{s}(x,\xi ) = \tilde{X}_{s1}(x,\xi ') + \tilde{X}_{s0}(x,\xi ') \xi _{0}, \qquad \tilde{X}_{sj} \in {\mathfrak {P}}S^{j}, $$

which will be homogeneous away from frequency zero. We carefully note that we want the symbol \(\tilde{X}_{s}\) to be a first degree polynomial in \(\xi _{0}\); this is important so that we can still integrate by parts in time and have a well defined fixed-time energy. The symbol \(\tilde{X}_{s}\) may be interpreted as a pseudodifferential operator using the Weyl paradifferential quantization,

$$ T_{i\tilde{X}_{s}} = i T_{\tilde{X}_{s1}} + T_{\tilde{X}_{s0}} \partial _{0} +\frac{1}{2} T_{\partial _{0} \tilde{X}_{s0}}. $$
(7.54)
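As a quick check of (7.54): writing \(D_{0} = \frac{1}{i}\partial _{0}\) and \(T^{w}\) for the Weyl paradifferential quantization, the Weyl rule symmetrizes the coefficient and the derivative in the first degree part of the symbol, which is the source of the zero order correction term:

$$ i\, T^{w}_{\tilde{X}_{s0} \xi _{0}} = \frac{i}{2} \left ( T_{\tilde{X}_{s0}} D_{0} + D_{0} T_{\tilde{X}_{s0}} \right ) = T_{\tilde{X}_{s0}} \partial _{0} + \frac{1}{2} T_{\partial _{0} \tilde{X}_{s0}} . $$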

However, as in the \(s=0\) case, we will allow a more general choice for the zero order component, and work instead with the modified multiplier

$$ T_{i\tilde{\mathfrak {M}}_{s}} := i T_{\tilde{X}_{s1}} + T_{\tilde{X}_{s0}} \partial _{0} +\frac{1}{2} T_{\tilde{Y}_{s0}}, $$
(7.55)

where the real, even zero order symbol \(\tilde{Y}_{s0} \in \partial _{x}{\mathfrak {P}}S^{0}\) will be carefully chosen later on in order to provide an appropriate Lagrangian correction in our energy estimates.

Repeating the heuristic computation in the previous subsection, in the absence of time boundaries we have an identity of the form

$$ 2\iint T_{\tilde{P}_{A,B}} v \cdot (i T_{\tilde{X}_{s1}} + T_{\tilde{X}_{s0}} \partial _{0}) v \, dx dt = \iint c_{\tilde{X}_{s}, B} (v,v) \, dx dt, $$
(7.56)

where \(c_{\tilde{X}_{s},B}(v,v)\) is a bilinear form whose principal symbol \(c_{\tilde{X}_{s},B}(x,\xi )\) is of order two,

$$ c_{\tilde{X}_{s},B}(x,\xi ) = \{ \tilde{p},\tilde{X}_{s}\}(x,\xi ) + 2 \tilde{X}_{s}(x,\xi )( {\tilde{A}}^{\gamma }+ 2s {\tilde{b}}^{\gamma}_{0}(x, \xi )) \xi _{\gamma }- \partial _{0} \tilde{X}_{s0} \tilde{p}(x,\xi ). $$
(7.57)

The objective would now be to choose the symbols \(\tilde{X}_{sj} \in {\mathfrak {P}}S^{j}\) so that we cancel the unbalanced part of the symbol \(c_{\tilde{X}_{s},B}\). However, this is a bit too much to ask, as it conflicts with the requirement that \(\tilde{X}_{s}\) be a first degree polynomial in \(\xi _{0}\). Hence, as a substitute, we will seek to achieve this cancellation on the characteristic set \(p(x,\xi ) = 0\). Then, instead of asking for

$$ c_{\tilde{X}_{s},B} \overset{bal}{\approx }0, $$

we will settle for the slightly weaker property

$$ c_{\tilde{X}_{s},B}(x,\xi ) \overset{bal}{\approx }\tilde{Y}_{s0}(x, \xi ') \cdot \tilde{p}(x,\xi ), $$

where \(\tilde{Y}_{s0} \in {\mathfrak{DP}}S^{0}\) is a purely spatial, zero homogeneous symbol, with the spatial dependence at the level of \(\partial ^{2} u\). This term will be harmless, as we will also be able to remove it in our energy estimates with a Lagrangian correction, by making a good choice for \(\tilde{Y}_{s0}\).

An additional requirement on our paradifferential “vector field” \(\tilde{X}_{s}\) will be that, in the energy estimate generated by \(\tilde{X}_{s}\), the associated energy functional \(E_{\tilde{X}_{s}}\) should be positive definite at the level of its principal part. Earlier, in the case when \(X\) was a vector field, this requirement was identified, via the energy momentum tensor, with the property that \(X\) is forward time-like. Here we will generalize this notion to symbols:

Definition 7.5

We say that the (real) symbol \(X= \xi _{0} X_{0}+ X_{1} \in C^{0} S^{1}\) is forward time-like if the following two properties hold:

a) \(X_{0}(x,\xi ') > 0\).

b) \(X(x,\xi _{0}^{1},\xi ') X(x,\xi _{0}^{2}, \xi ') < 0\), where \(\xi _{0}^{1}(x,\xi ') < \xi _{0}^{2}(x,\xi ')\) are the two real zeros of \(p(x,\xi )\) as a polynomial of \(\xi _{0}\).
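As a model example, for the flat symbol \(p = -\xi _{0}^{2} + |\xi '|^{2}\), the multiplier \(X = \xi _{0}\) (i.e. \(X_{0} = 1\), \(X_{1} = 0\), corresponding to the vector field \(\partial _{t}\)) is forward time-like: the two roots are \(\xi _{0}^{1} = -|\xi '| < \xi _{0}^{2} = |\xi '|\), so that

$$ X_{0} > 0, \qquad X(x,\xi _{0}^{1},\xi ')\, X(x,\xi _{0}^{2},\xi ') = -|\xi '|^{2} < 0 . $$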

We remark that, using \(X\) as a multiplier, relative to the metric \(g\), we will obtain an energy functional which at leading order can be described via the symbol

$$ \begin{aligned} e_{X}(x,\xi ) = & \ g^{\alpha \beta} \xi _{\alpha }\xi _{\beta }X_{0} - 2 g^{0\alpha} \xi _{\alpha }(X_{1} + X_{0} \xi _{0}) \\ := & \ e_{X}^{0}(x,\xi ') \xi _{0}^{2} + e_{X}^{1}(x,\xi ') \xi _{0} + e_{X}^{2}(x,\xi '), \end{aligned} $$
(7.58)

which should be compared with the expression (7.16) defined earlier in terms of the energy momentum tensor in the case when \(X\) is a vector field. Correspondingly, we define the energy functional

$$ E_{X}[w] = \int - T_{e_{X}^{0}} \partial _{t} w \cdot \partial _{t} w + T_{ie_{X}^{1}}w \cdot \partial _{t} w + T_{e_{X}^{2}} w \cdot w \, dx. $$
(7.59)

The main property of forward time-like symbols is as follows:

Lemma 7.6

The symbol \(e_{X}\) is positive definite iff \(X\) is forward time-like.

Proof

Assuming \(X_{0}\) is nonzero, we represent \(X\) in the form

$$ X = X_{0} ( a_{1} (\xi _{0} -\xi _{0}^{1}) + a_{2}(\xi _{0}-\xi _{0}^{2})) , $$

where \(a_{1}+a_{2} = 1\). Then \(e_{X}\) has the form

$$ \begin{aligned} e_{X} = & \ g^{00}(\xi _{0} -\xi _{0}^{1})(\xi _{0}-\xi _{0}^{2}) X_{0} -g^{00}(2 \xi _{0} - \xi _{0}^{1}-\xi _{0}^{2}) X_{0} ( a_{1} (\xi _{0} -\xi _{0}^{1}) + a_{2}(\xi _{0}-\xi _{0}^{2})) \\ = & - g^{00} X_{0} [a_{1}(\xi _{0} - \xi _{0}^{1})^{2} + a_{2}(\xi _{0} - \xi _{0}^{2})^{2}]. \end{aligned} $$

Here \(g^{00} = -1\), and at least one of \(a_{1}\) and \(a_{2}\) is positive. Then \(e_{X}\) is positive definite iff \(X_{0} > 0\) and \(a_{1}, a_{2} > 0\). This is equivalent to the forward time-like condition in the above definition: evaluating at the roots yields \(X(x,\xi _{0}^{1},\xi ')\, X(x,\xi _{0}^{2},\xi ') = - X_{0}^{2}\, a_{1} a_{2}\, (\xi _{0}^{2} - \xi _{0}^{1})^{2}\), which is negative exactly when \(a_{1}\) and \(a_{2}\) have the same sign. □

7.3.3 The construction of \(\tilde{X}_{s}\)

Here we return to the matter of choosing \(\tilde{X}_{s}\), whose properties almost exactly mirror those of the vector field \(\tilde{X}\) in the previous subsection:

Proposition 7.7

There exists a real, odd homogeneous symbol of order one \(\tilde{X}_{s} \in \xi _{0}+ {\mathfrak {P}}S^{1}\), which is a first degree polynomial in \(\xi _{0}\), so that:

i) \(\tilde{X}_{s}\) is forward time-like.

ii) The principal symbol \(c_{\tilde{X}_{s},B}\) of the \(\tilde{X}_{s}\) energy flux admits a representation of the form

$$ c_{\tilde{X}_{s},B}(x,\xi ) = {\tilde{q}}_{2}(x,\xi ) + {\tilde{q}}_{0}(x, \xi ')\tilde{p}(x,\xi ), $$
(7.60)

where \({\tilde{q}}_{2}\) is balanced,

$$ \|{\tilde{q}}_{2}\|_{L^{\infty }S^{2}} \lesssim {\mathcal {B}}^{2} , $$
(7.61)

and \({\tilde{q}}_{0}\) has \({\mathfrak{DP}}\) type regularity,

$$ \| {\tilde{q}}_{0}\|_{ {\mathfrak{DP}}S^{0}} \lesssim {\mathcal {A}^{ \sharp }}. $$
(7.62)

iii) The symbol \(\tilde{X}_{s}\) admits the \({\mathfrak {P}}S^{1}\) representation

$$ \tilde{X}_{s} = \xi _{0} + T_{a^{\gamma}} \partial _{\gamma }u + r_{s}, $$
(7.63)

where the para-coefficients \(a^{\gamma}(x,\xi ) = a^{\gamma}_{1}(x,\xi ') + a^{\gamma}_{0}(x,\xi ')\) with \(a^{\gamma}_{j} \in {\mathfrak {P}}S^{j}\) have the form

$$ a^{\gamma }= - \xi _{\delta }\partial ^{\delta }u \partial _{\xi _{ \gamma}} \tilde{X}_{s} - \partial ^{\gamma }u \tilde{X}_{s} -2 \tilde{X}_{s} \partial ^{0} u {\tilde{g}}^{0\gamma} - s \xi _{\delta }\partial _{ \xi _{\gamma}} \log |\xi '|^{2} \partial ^{\delta }u \tilde{X}_{s} + p q_{0}^{ \gamma }$$
(7.64)

with \(q_{0}^{\gamma }\in {\mathfrak {P}}S^{0}\), independent of \(\xi _{0}\).

From the perspective of energy estimates, it might seem that parts (i) and (ii) are the important ones. However, part (ii) will be seen as an immediate consequence of the representation in part (iii), which thus can be thought of as the more fundamental property. Also, in the proof of the energy estimates it will on occasion be more convenient to directly use (7.64). In the sequel we will refer to \(a^{\gamma}\) as the para-coefficients of \(\tilde{X}_{s}\). We note that the choice of \(q_{0}^{\gamma}\) is uniquely determined by the requirement that \(a^{\gamma}\) are first degree polynomials in \(\xi _{0}\).

Proof

It will be somewhat easier to construct the corresponding symbol \(X_{s}\) as associated to \(p\), rather than to \(\tilde{p}\); this avoids the slight symmetry breaking in the transition from \(p\) to \(\tilde{p}\). Precisely, we will choose \(\tilde{X}_{s}\) of the form

$$ \tilde{X}_{s} = g^{00} X_{s}, $$

and then express \(c_{\tilde{X}_{s},B}\) in terms of \(X_{s}\) as follows:

$$ c_{\tilde{X}_{s},B} = \{ p, X_{s}\} + 2 X_{s} (A^{\gamma }+ 2sb^{ \gamma}_{0})\xi _{\gamma }+ q_{00} p, $$

where \(q_{00}\) is given by

$$ q_{00} = \{ X_{s1}, \log g^{00}\} + \xi _{0} \{X_{s0},\log g^{00}\} - \partial _{0} X_{s0} - 2 s X_{s} \partial ^{0} u \, {\tilde{g}}^{0\nu} \partial _{j} \partial _{\nu }u \xi _{j} |\xi '|^{-2} , $$

and \(b_{0}^{\gamma}\) has the form

$$ b^{\gamma}_{0} = \partial ^{\beta }u \, g^{\gamma \nu} \partial _{j} \partial _{\nu }u\ \xi _{\beta }\xi _{j} |\xi '|^{-2}. $$
(7.65)

Here we have separated the two terms in \({\tilde{b}}_{0}^{\gamma}\); the first has contributed to \(b_{0}^{\gamma}\), while the second has contributed the last term in the Lagrangian coefficient \(q_{00}\).

For clarity, we note that the exact expression of \(q_{00}\) is not important, we will only use the fact that \(q_{00} \in {\mathfrak{DP}}S^{0}\). On the other hand, for \(b_{0}^{\gamma}\) we will need the fact that it has a null structure.

Now we restate the proposition in terms of the new symbol \(X_{s}\). Our goal will be to find \(X_{s}\) in the same class as \(\tilde{X}_{s}\), so that the reduced symbol

$$ c^{red}_{X_{s},B} = \{ p, X_{s}\} + 2X_{s} (A^{\gamma }+ 2s b^{\gamma}_{0}) \xi _{\gamma }$$
(7.66)

can be represented in the form

$$ c^{red}_{X_{s},B} =q_{2}(x,\xi ) + q_{0}(x,\xi )p(x,\xi ). $$
(7.67)

Here there is a small twist in the argument. While \(c_{X_{s},B}\) is a second degree polynomial in \(\xi _{0}\), this is no longer the case for \(c^{red}_{X_{s},B}\), which contains the term \(\xi _{0}^{3} \{ g^{00},X_{s0}\}\). For this reason, in (7.67) we a priori have to allow for symbols \(q_{2}\), respectively \(q_{0}\), which are third, respectively first, degree polynomials in \(\xi _{0}\). However, we can eliminate the \(\xi _{0}^{3}\) term in \(q_{2}\) with a \(\xi _{0}\) correction in \(q_{0}\). Then, returning to \(c_{X_{s},B}\), we obtain the representation (7.60) with \({\tilde{q}}_{2}\) of second degree and \({\tilde{q}}_{0}\) of first degree. But \(c_{X_{s},B}\) is a second degree polynomial in \(\xi _{0}\), so we finally conclude that \({\tilde{q}}_{0}\) must be independent of \(\xi _{0}\).
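To make the elimination step explicit (a sketch, with \(\alpha \) denoting the coefficient of the cubic term): writing \(p(x,\xi ) = g^{00}\xi _{0}^{2} + 2g^{0j}\xi _{0}\xi _{j} + g^{jk}\xi _{j}\xi _{k}\), we have

$$ \alpha \xi _{0}^{3} = \frac{\alpha \xi _{0}}{g^{00}}\, p(x,\xi ) - \frac{\alpha \xi _{0}}{g^{00}} \left ( 2 g^{0j} \xi _{0}\xi _{j} + g^{jk}\xi _{j}\xi _{k} \right ), $$

so the cubic term is absorbed into \(q_{0} p\) via the \(\xi _{0}\)-linear correction \(\alpha \xi _{0}/g^{00}\) to \(q_{0}\), at the price of a correction to \(q_{2}\) which is only of second degree in \(\xi _{0}\).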

We now proceed to construct the symbol \(X_{s}\). As a first step in the proof, we seek to obtain a variant \(X^{0}_{s}\) of the symbol \(X_{s}\) where we drop the requirement that \(X^{0}_{s}\) is a first order polynomial in \(\xi _{0}\) but we ask for the stronger property that the associated symbol \(c^{red}_{X^{0}_{s},B}\) is fully balanced, which corresponds to \(q_{0}=0\). Then, at the end, we choose \(X_{s}\) to be the first degree polynomial in \(\xi _{0}\) that matches \(X^{0}_{s}\) at the two roots of \(p(x,\xi ) = 0\) viewed as a polynomial in \(\xi _{0}\).

The relation we seek for \(X^{0}_{s}\) to satisfy on the characteristic set of \(p\) is

$$ c^{red}_{X^{0}_{s},B} \overset{bal}{\approx }0, $$
(7.68)

where, using the expressions (3.13) and (7.65) for \(A^{\gamma}\) and \(b_{0}^{\gamma}\),

$$ c^{red}_{X^{0}_{s},B} = \{ g^{\alpha \beta} \xi _{\alpha }\xi _{\beta}, X^{0}_{s}\} + 2\partial ^{\alpha }u \xi _{\delta }g^{\beta \delta} \partial _{\alpha }\partial _{\beta }u \cdot X^{0}_{s} +4s X^{0}_{s} \partial ^{\beta }u g^{\alpha \nu} \partial _{j} \partial _{\nu }u\ \xi _{\alpha }\xi _{\beta }\xi _{j} |\xi '|^{-2}. $$

Here we recall the expression for the derivatives of \(g\), see (3.9),

$$ \partial _{\gamma }g^{\alpha \beta} \xi _{\alpha }\xi _{\beta }= -2 \partial ^{\beta }u g^{\alpha \delta} \partial _{\delta }\partial _{ \gamma }u\, \xi _{\beta }\xi _{\alpha}. $$
(7.69)

Substituting this in the previous expression for \(c^{red}_{X^{0}_{s},B}\), we need the following relation to hold modulo balanced terms:

$$ \begin{aligned} 2 \xi _{\alpha }g^{\alpha \beta} \partial _{\beta }X^{0}_{s} \overset{bal}{\approx }& \ -2 \partial ^{\delta }u \xi _{\delta } \partial _{\xi _{\gamma}} X^{0}_{s} \cdot \xi _{\alpha }g^{\alpha \beta} \partial _{\beta }\partial _{\gamma }u - 2X^{0}_{s} \partial ^{ \gamma }u \cdot \xi _{\alpha }g^{\beta \alpha} \partial _{\gamma } \partial _{\beta }u \\ & - 4s X^{0}_{s} \partial ^{\delta }u \xi _{\delta }\xi _{j} |\xi '|^{-2} \cdot \xi _{\alpha }g^{\alpha \beta} \partial _{j} \partial _{\beta }u .\ \end{aligned} $$

We can rewrite this using the following operator

$$ L = \xi _{\alpha }g^{\alpha \beta} \partial _{\beta } $$

in the form

$$ L X^{0}_{s} \overset{bal}{\approx }- \xi _{\delta }\partial ^{\delta }u \partial _{\xi _{\gamma}} X^{0}_{s} \cdot L \partial _{\gamma }u - \partial ^{\gamma }u X^{0}_{s} \cdot L \partial _{\gamma }u - s \partial ^{\delta }u X^{0}_{s} \xi _{\delta }\partial _{\xi _{j}} \log |\xi '|^{2} \cdot L \partial _{j} u . $$
(7.70)

By Lemma 7.4, we already have a solution \(X\) for \(s=0\). Thinking of this multiplicatively, it is then natural to look for \(X^{0}_{s}\) of the form

$$ X^{0}_{s} = Z_{s} X, $$

where \(Z_{s}\) should be zero homogeneous in \(\xi \) and must satisfy

$$ L Z_{s} \overset{bal}{\approx }- \xi _{\delta }\partial ^{\delta }u( \partial _{\xi _{\gamma}}Z_{s} L \partial _{\gamma }u + s Z_{s} \partial _{\xi _{j}} \log |\xi '|^{2} L \partial _{j} u ). $$
(7.71)
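To see how (7.71) arises (a sketch, modulo balanced terms): by the Leibniz rule,

$$ L X^{0}_{s} = Z_{s}\, L X + X \, L Z_{s}, \qquad \partial _{\xi _{\gamma}} X^{0}_{s} = Z_{s}\, \partial _{\xi _{\gamma}} X + X\, \partial _{\xi _{\gamma}} Z_{s}. $$

Substituting into (7.70) and using the \(s=0\) relation satisfied by \(X\), the terms reproducing \(Z_{s}\) times that relation cancel, and we are left with

$$ X\, L Z_{s} \overset{bal}{\approx } - \xi _{\delta }\partial ^{\delta }u \left ( X \, \partial _{\xi _{\gamma}} Z_{s} \cdot L \partial _{\gamma }u + s\, Z_{s}\, X\, \partial _{\xi _{j}} \log |\xi '|^{2} \cdot L \partial _{j} u \right ). $$

Dividing by \(X\), which is nonvanishing on the characteristic set, yields (7.71).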

We will also assume that \(Z_{s}\) is a positive symbol; this will help later with the time-like condition. Then we can rewrite the above relation as a condition for \(\log Z_{s}\), namely

$$ L \log Z_{s} \overset{bal}{\approx }- \xi _{\delta }\partial ^{ \delta }u( \partial _{\xi _{\gamma}}( \log Z_{s} ) L \partial _{ \gamma }u + s \partial _{\xi _{j}} \log |\xi '|^{2} L \partial _{j} u ). $$
(7.72)

Here the inhomogeneous term is linear in \(s\), so we will also look for a solution \(\log Z_{s}\) which is linear in \(s\).

There is one last algebraic simplification, which is to replace \(Z_{s}\) by \(\tilde{Z}_{s} = Z_{s} |\xi '|^{2s}\), which is \(2s\)-homogeneous, even, and inherits the property that \(\log \tilde{Z}_{s}\) is linear in \(s\). Then \(\log \tilde{Z}_{s}\) must solve

$$ L \log \tilde{Z}_{s} \overset{bal}{\approx }- \xi _{\delta }\partial ^{ \delta }u \partial _{\xi _{\gamma}} \log \tilde{Z}_{s} L \partial _{ \gamma }u . $$
(7.73)
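This substitution can be checked directly: since \(\log \tilde{Z}_{s} = \log Z_{s} + s \log |\xi '|^{2}\) and \(\log |\xi '|^{2}\) does not depend on \(x\), we have \(L \log \tilde{Z}_{s} = L \log Z_{s}\), while

$$ \partial _{\xi _{\gamma}} \log \tilde{Z}_{s} = \partial _{\xi _{\gamma}} \log Z_{s} + s\, \partial _{\xi _{\gamma}} \log |\xi '|^{2}, $$

where the last term vanishes for \(\gamma = 0\). Hence the inhomogeneous \(s\)-term in (7.72) is exactly absorbed into the para-coefficient, giving (7.73).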

Dispensing with the log, we replace this by

$$ L \tilde{Z}_{s} \overset{bal}{\approx }- \xi _{\delta }\partial ^{ \delta }u \partial _{\xi _{\gamma}}\tilde{Z}_{s} L \partial _{\gamma }u . $$
(7.74)

Now we interpret the last relation paradifferentially, formally cancelling the \(L\)’s. This suggests the following scheme to construct the dyadic parts of \(\tilde{Z}_{s}\) inductively by setting

$$ \begin{aligned} &\tilde{Z}_{s0} = \ |\xi |^{2s} \\ &\tilde{Z}_{sk} = \ \xi _{\delta }T_{\partial ^{\delta }u} T_{ \partial _{\xi _{\gamma}} \tilde{Z}_{s}} \partial _{\gamma }u_{k}, \qquad k \geq 1. \end{aligned} $$
(7.75)

A priori these dyadic parts have a nontrivial dependence on \(\xi \), which would have to be tracked when considering the convergence in the \(k\) summation. However, since \(\log \tilde{Z}_{s}\) is linear in \(s\), it suffices to solve this for some nonzero \(s\). The advantage here is that, if \(s\) is a positive integer (say \(s=1\)), then all our iterates are polynomials of degree \(2s\) in \(\xi \). Hence the convergence issue disappears, due to our smallness condition for \(u\), \({\mathcal {A}^{\sharp }}\ll 1\); this is exactly as in the construction of \(X\) in Section 7.2.2. This defines \(\tilde{Z}_{1}\) as a positive definite polynomial in \(\xi \) of degree 2, so that \(\tilde{Z}_{1} = \xi ^{2} (1+O({\mathcal {A}^{\sharp }}))\). Further, by the same argument as in the proof of Lemma 7.4, it follows that the coefficients of \({\tilde{Z}}_{1} -{\tilde{Z}}_{10} \) are paracontrolled by \(\partial u\); in other words, \({\tilde{Z}}_{1} - \xi ^{2} \in {\mathfrak {P}}S^{2}\). In addition, by (7.75), it also follows that (a choice for) the para-coefficients of \(\tilde{Z}_{1}\), as in Definition 6.1, are given by

$$ \tilde{Z}_{1} = \xi ^{2}+ T_{ \xi _{\delta }\partial ^{\delta }u \partial _{\xi _{\gamma}} \tilde{Z}_{1}} \partial _{\gamma }u + r. $$
(7.76)
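To illustrate the scheme (7.75), consider the first iterate in the case \(s = 1\). At the first step the para-coefficient reduces to \(\partial _{\xi _{\gamma}} \tilde{Z}_{10} = 2 \xi _{\gamma}\), and, since \(2\xi _{\gamma}\) carries no \(x\)-dependence, the inner paraproduct acts as multiplication, so

$$ \tilde{Z}_{11} = \xi _{\delta }T_{\partial ^{\delta }u} T_{2\xi _{\gamma}} \partial _{\gamma }u_{1} = 2 \xi _{\delta }\xi _{\gamma }T_{\partial ^{\delta }u} \partial _{\gamma }u_{1}, $$

a polynomial of degree 2 in \(\xi \) with coefficients of size \(O({\mathcal {A}^{\sharp }})\). The same pattern persists for all \(k\), which is why the iteration sums geometrically under the smallness condition \({\mathcal {A}^{\sharp }}\ll 1\).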

Remark 7.8

We remark on the symbol \(\tilde{Z}_{1}\), which is quadratic in \(\xi \) and para-commutes with \(p\), in the sense that their Poisson bracket is balanced and thus bounded by \({\mathcal {B}}^{2}\). This symbol plays a role that is similar to that of the first order symbol \(X\) constructed earlier.

Now that we have \(\tilde{Z}_{1}\), for all real \(s\) we may define

$$ \tilde{Z}_{s} = (\tilde{Z}_{1})^{s},\qquad X^{0}_{s} = X (\tilde{Z}_{1} /| \xi '|^{2})^{s}. $$

By Lemma 7.4 in the previous subsection we have \(X \in \xi _{0} + {\mathfrak {P}}S^{1}\). Combining this with the similar property of \(\tilde{Z}_{1}\), by the algebra and Moser properties of the space \({\mathfrak {P}}\) of paracontrolled distributions it follows that \(X^{0}_{s} \in \xi _{0} + {\mathfrak {P}}S^{1}\). Finally, combining the representations of \(X\) and of \(\tilde{Z}_{1}\) as paracontrolled distributions, as in (7.36) and (7.76), we obtain the corresponding \({\mathfrak {P}}\) representation for \(X^{0}_{s}\) as in Definition 6.1 (see the relation (7.70))

$$ X^{0}_{s} =\xi _{0} - \xi _{\delta }T_{\partial ^{\delta }u \partial _{ \xi _{\gamma}} X^{0}_{s}} \partial _{\gamma }u - T_{\partial ^{ \gamma }u X^{0}_{s}} \partial _{\gamma }u - s \xi _{\delta }\partial _{ \xi _{j}} \log |\xi '|^{2} T_{\partial ^{\delta }u X^{0}_{s}} \partial _{j} u + r_{s}. $$
(7.77)

This in turn yields the desired conclusion that \(c^{red}_{X^{0}_{s},B}\) is balanced,

$$ \|c^{red}_{X^{0}_{s},B}\|_{L^{\infty }S^{2}} \lesssim {\mathcal {B}}^{2}. $$
(7.78)

Indeed, the equivalent form (7.70) can be obtained by directly applying the operator \(L\) in the relation (7.77); this is because the terms where the paracoefficients get differentiated are balanced, so we are left with the terms where \(L\) is applied to the main factors \(\partial u\).

Now we carry out the last step of the proof, and define the symbol \(X_{s}\) as the unique first degree polynomial in \(\xi _{0}\) with the property that

$$ X_{s}(x,\xi ) = X^{0}_{s}(x,\xi ) \qquad \text{on} \quad \{g^{\alpha \beta}\xi _{\alpha }\xi _{\beta }= 0\}. $$

We now show that this choice for \(X_{s}\) has the desired properties.

Recall that \(\xi _{0}^{1}(x,\xi ') < \xi _{0}^{2}(x,\xi ')\) are the two real zeros of \(p(x,\xi )\) as a polynomial in \(\xi _{0}\), which are 1-homogeneous and smooth in \(\xi '\), and are also smooth functions of \(\partial u\). Thus,

$$ \xi _{0}^{1}, \xi _{0}^{2} \in {\mathfrak {P}}S^{1}. $$

The coefficients \(X_{s0}\) and \(X_{s1}\) in \(X_{s}\) are obtained by solving a linear system,

$$ X_{s0} = \frac{ X^{0}_{s}(x,\xi _{0}^{2},\xi ') - X^{0}_{s}(x,\xi _{0}^{1},\xi ')}{ \xi _{0}^{2} - \xi _{0}^{1}}, \qquad X_{s1} = \frac{ X^{0}_{s}(x,\xi _{0}^{1},\xi ')\xi _{0}^{2} - X^{0}_{s}(x,\xi _{0}^{2},\xi ')\xi _{0}^{1}}{ \xi _{0}^{2} - \xi _{0}^{1}}. $$

By the algebra and Moser properties of the space \({\mathfrak {P}}\) of paracontrolled distributions, it immediately follows that we have the symbol regularity properties \(X_{s0} - 1 \in {\mathfrak {P}}S^{0}\) and \(X_{s1} \in {\mathfrak {P}}S^{1}\). By construction we also have a smooth division,

$$ X_{s} = X^{0}_{s} + d p, $$
(7.79)

where we easily see that the quotient \(d\) has regularity \(d \in {\mathfrak {P}}S^{-1}\) by computing directly

$$ \begin{aligned} d(x,\xi ) = & \ \frac{X_{s}(x,\xi ) - X^{0}_{s}(x,\xi )}{ p(x,\xi )} \\ = & \ \frac{1}{g^{00}(\xi _{0}^{2}-\xi _{0}^{1})}\left ( \frac{ X^{0}_{s}(x,\xi ) - X^{0}_{s}(x,\xi _{0}^{1},\xi ')}{\xi _{0}-\xi _{0}^{1}} - \frac{X^{0}_{s}(x,\xi ) - X^{0}_{s}(x,\xi _{0}^{2},\xi ')}{\xi _{0}-\xi _{0}^{2}} \right ). \end{aligned} $$

One may also interpret this as a form of the Malgrange preparation theorem in an easier case where the roots are separated.

We can now use (7.79) to relate \(c^{red}_{X_{s},B}\) with \(c^{red}_{X^{0}_{s},B}\):

$$ c^{red}_{X_{s},B}= c^{red}_{X^{0}_{s},B} + p(\{ p,d\} + 2d(A^{\gamma }+ 2s b^{\gamma}_{0}) \xi _{\gamma}), $$

which is exactly the desired representation (7.67).
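Here we have used the Leibniz rule for the Poisson bracket together with \(\{p,p\} = 0\),

$$ \{ p, X^{0}_{s} + d\, p \} = \{ p, X^{0}_{s}\} + \{p, d\}\, p + d\, \{p,p\} = \{ p, X^{0}_{s}\} + \{p, d\}\, p, $$

while the remaining terms in (7.66) are simply linear in the multiplier symbol.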

We can also use the relation (7.77) for part (iii) of the proposition. For this we first transition from \(X^{0}_{s}\) to \(X_{s}\). Using (7.79) and peeling off balanced terms, this gives the \({\mathfrak {P}}\) representation

$$ \begin{aligned} X_{s} = & \ \xi _{0} - \xi _{\delta }T_{\partial ^{\delta }u \partial _{\xi _{\gamma}} X_{s}} \partial _{\gamma }u - T_{\partial ^{ \gamma }u X_{s}} \partial _{\gamma }u - s \xi _{\delta }\partial _{ \xi _{j}} \log |\xi '|^{2} T_{\partial ^{\delta }u X_{s}} \partial _{j} u \\ & \ \, - T_{p} \left ( \xi _{\delta }T_{\partial ^{\delta }u \partial _{\xi _{\gamma}} d} \partial _{\gamma }u + T_{\partial ^{ \gamma }u d} \partial _{\gamma }u + s \xi _{\delta }\partial _{\xi _{j}} \log |\xi '|^{2} T_{\partial ^{\delta }u d} \partial _{j} u + d \right ) \\ & \ \, + \left (T_{d} p + \xi _{\delta }T_{\partial ^{\delta }u d \partial _{\xi _{\gamma}} p } \partial _{\gamma }u\right )+ r_{s}. \end{aligned} $$

In view of the paradifferential expansion (5.28) for \(g^{\alpha \beta}\), in the last bracket there is a leading order cancellation,

$$ T_{d} p + \xi _{\delta }T_{\partial ^{\delta }u d \partial _{\xi _{ \gamma}} p } \partial _{\gamma }u = r_{s}. $$

This implies that \(X_{s}\) admits a \({\mathfrak {P}}S^{1}\) representation of the form

$$ X_{s} = \xi _{0} - \xi _{\delta }T_{\partial ^{\delta }u \partial _{\xi _{\gamma}} X_{s}} \partial _{\gamma }u - T_{\partial ^{ \gamma }u X_{s}} \partial _{\gamma }u - s \xi _{\delta } \partial _{\xi _{j}} \log |\xi '|^{2} T_{\partial ^{\delta }u X_{s}} \partial _{j} u + T_{p} z+ r_{s}, $$
(7.80)

where \(z \in {\mathfrak {P}}S^{-1}\). At this stage we only know that \(z\) and \(r_{s}\) are smooth as functions of \(\xi _{0}\). On the other hand, the remaining terms are at most second degree polynomials in \(\xi _{0}\). We claim that, without any restriction in generality, we may take \(z\) independent of \(\xi _{0}\), and then \(r_{s}\) has to be at most a second degree polynomial in \(\xi _{0}\).

Subtracting a multiple of \(p\) from all the paracoefficients above and discarding balanced contributions, we may reduce to the case of a first degree polynomial, i.e. to a relation of the form

$$ T_{p} z(x,\xi ) + r_{s}(x,\xi ) = Z_{1}(x,\xi ') + Z_{0}(x,\xi ') \xi _{0}, $$

where \(z \in {\mathfrak {P}}S^{-1}\) and \(Z_{j} \in {\mathfrak {P}}S^{j}\), while \(\partial r_{s} = O({\mathcal {B}}^{2})\), with full symbol regularity in \(\xi \). We will show that in this case we must have \(\partial Z_{j}= O({\mathcal {B}}^{2})\), again with full symbol regularity. This would imply that we may include \(T_{p} z\) into \(r_{s}\), and thus take \(z=0\) in the last relation.

We begin by differentiating this relation in \(x\) and \(t\), noting that \(T_{\partial p} z\) may be placed in \(\partial r_{s}\):

$$ T_{p} \partial z(x,\xi ) + \partial r_{s}(x,\xi ) = \partial Z_{1}(x, \xi ') + \partial Z_{0}(x,\xi ') \xi _{0}. $$

We may also perturbatively replace \(T_{p}\) with \(p\), arriving at

$$ p \partial z(x,\xi ) + r^{1}_{s}(x,\xi ) = \partial Z_{1}(x,\xi ') + \partial Z_{0}(x,\xi ') \xi _{0}, $$
(7.81)

where \(r^{1}_{s}\) has size \({\mathcal {B}}^{2}\) and symbol regularity,

$$ | \partial _{\xi}^{\alpha }r^{1}_{s}| \lesssim {\mathcal {B}}^{2} |\xi |^{1-| \alpha |}. $$

For fixed \(x\), we examine this relation on the characteristic cone \(C = \{p(x,\xi )= 0\}\). There we have

$$ | \partial Z_{1}(x,\xi ') + \partial Z_{0}(x,\xi ') \xi _{0}| \lesssim {\mathcal {B}}^{2} |\xi | , $$

so we may directly conclude that \(\partial Z_{1}(x,\xi '), \partial Z_{0}(x,\xi ') = O({\mathcal {B}}^{2})\). Next we need a similar bound for their derivatives \(\partial _{\xi '}^{\alpha }\partial Z_{1}(x,\xi ')\), \(\partial _{\xi '}^{ \alpha }\partial Z_{0}(x,\xi ')\) with respect to \(\xi '\). We fix \(x\) and argue by induction in \(|\alpha |\). Then it suffices to use derivatives which are tangent to the cone at that \(x\), which on one hand kill \(p\) but on the other hand give a full range of \(\xi '\) derivatives for \(Z_{j}\). Hence, we may indeed assume that \(z\) is independent of \(\xi _{0}\) and \(r_{s}\) is a second degree polynomial in \(\xi _{0}\).
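In more detail, the first step above is a \(2\times 2\) linear solve. Evaluating (7.81) at the two roots \(\xi _{0} = \xi _{0}^{i}(x,\xi ')\), where \(p\) vanishes, gives

$$ \partial Z_{1}(x,\xi ') + \xi _{0}^{i}\, \partial Z_{0}(x,\xi ') = r^{1}_{s}(x,\xi _{0}^{i},\xi ') = O({\mathcal {B}}^{2} |\xi '|), \qquad i = 1,2. $$

Since the roots are uniformly separated, \(|\xi _{0}^{2} - \xi _{0}^{1}| \gtrsim |\xi '|\) (by the smallness condition on \(\partial u\)), subtracting the two equations yields \(\partial Z_{0} = O({\mathcal {B}}^{2})\), and then \(\partial Z_{1} = O({\mathcal {B}}^{2}|\xi '|)\), as claimed.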

Lastly we switch from \(X_{s}\) to \(\tilde{X}_{s}\). Again peeling off \(r_{s}\) type contributions, we have

$$\begin{aligned} \tilde{X}_{s} = & \ g^{00} X_{s} \\ = & \ \xi _{0} + T_{g^{00}} X_{s} + T_{X_{s}} g^{00} + r_{s} \\ = & \ \xi _{0}- \xi _{\delta }T_{\partial ^{\delta }u \partial _{\xi _{\gamma}} \tilde{X}_{s}} \partial _{\gamma }u - T_{ \partial ^{\gamma }u \tilde{X}_{s}} \partial _{\gamma }u - s \xi _{ \delta }\partial _{\xi _{j}} \log |\xi '|^{2} T_{\partial ^{\delta }u \tilde{X}_{s}} \partial _{j} u \\ &\ + T_{\tilde{X}_{s}} \log g^{00} + T_{p} T_{g^{00}} z + r_{s}. \end{aligned}$$

It remains to expand the fourth term, using Lemma 8.2:

$$ T_{\tilde{X}_{s}} \log g^{00} = -2 T_{\tilde{X}_{s} \partial ^{0} u { \tilde{g}}^{0\alpha}} \partial _{\alpha }u + r_{s} . $$

This finally yields the representation (7.63) with the paracoefficients in (7.64), thereby concluding the proof of part (iii) of Proposition 7.7.

The final property of \(\tilde{X}_{s}\) to be verified is that \(\tilde{X}_{s}\) is forward time-like. This property is easily seen to depend only on the sign of the symbol \(\tilde{X}_{s}\) on the characteristic set \(\{p=0\}\). But by construction, \(\tilde{X}_{s}\) has the same sign as \(X_{s}\) there, which in turn has the same sign as the vector field \(X\) in Section 7.2.2. Then the forward time-like property for \(\tilde{X}_{s}\) follows from the similar property of \(X\).  □

While it is more streamlined to state Proposition 7.7 and its proof directly in terms of the symbol \(c_{\tilde{X}_{s},B}\), in order to prove energy estimates it is more efficient to peel off balanced components of \(c_{\tilde{X}_{s},B}\), so that we are left with less debris to contend with.

To start with, let us assume that \(\tilde{X}_{s} \in {\mathfrak {P}}S^{1}\) admits the representation (7.63) with \(a^{\gamma }\in {\mathfrak {P}}\) but without requiring that \(a^{\gamma}\) satisfy the relation (7.64). For such \(\tilde{X}_{s}\), we peel off balanced components of \(c_{\tilde{X}_{s},B}\) following the two steps in Lemma 6.9. These steps are briefly reviewed in the sequel.

In a first stage, we note that all expressions in \(c_{\tilde{X}_{s},B}\) can be seen as linear combinations of the form \({\mathfrak {P}}\cdot \partial {\mathfrak {P}}\), where the output is balanced unless the second factor has higher frequency, see Lemma 5.7(a). This allows us to replace such products by paraproducts of the form \(T_{{\mathfrak {P}}} \partial {\mathfrak {P}}\). Further, using the definition of paracontrolled distributions for the second factor, we can discard the error term as balanced and arrive at more precise paraproducts of the form \(T_{{\mathfrak {P}}} \partial ^{2} u\), namely

$$ c_{\tilde{X}_{s},B} \overset{bal}{\approx }T_{a^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }u, $$
(7.82)

where the coefficients \(a^{\alpha \beta}\) and \(\tilde{a}^{\alpha \beta}\) are explicitly computable as algebraic expressions in terms of \(u\), \(\tilde{X}_{s}\) and the paracoefficients \(a^{\gamma}\) of \(\tilde{X}_{s}\). Precisely,

$$ \begin{aligned} c_{\tilde{X}_{s},B}(x,\xi ) \overset{bal}{\approx }& \ T_{\partial _{ \xi _{\gamma}} \tilde{p}(x,\xi )} \partial _{\gamma }\tilde{X}_{s} - T_{ \partial _{\xi _{\gamma}} \tilde{X}_{s}} \partial _{\gamma }\tilde{p}(x, \xi ) + 2T_{\tilde{X}_{s}}( {\tilde{A}}^{\gamma }+ 2s{\tilde{b}}^{\gamma}_{0}(x, \xi )) \xi _{\gamma } \\ & \ + T_{p} \partial _{0} \tilde{X}_{s0} \\ \overset{bal}{\approx }& \ 2 T_{\xi _{\alpha }{\tilde{g}}^{\alpha \beta} a^{\gamma}} \partial _{\beta }\partial _{\gamma }u + 2 T_{\xi _{ \alpha }\xi _{\beta }\partial _{\xi _{\gamma}} \tilde{X}_{s} \partial ^{ \beta }u {\tilde{g}}^{\alpha \delta}} \partial _{\delta }\partial _{ \gamma }u \\ & \ + 2 T_{\tilde{X}_{s} (\partial ^{\beta }u\xi _{\gamma }{ \tilde{g}}^{\gamma \alpha} + \partial ^{0} u {\tilde{g}}^{0\beta} \xi _{ \gamma }{\tilde{g}}^{\gamma \alpha})} \partial _{\alpha} \partial _{ \beta} u \\ & \ + 2 s T_{ \tilde{X}_{s}\partial ^{\beta }u \xi _{\gamma }g^{ \gamma \alpha} \xi _{\beta }\partial _{\xi _{\delta}} (\log |\xi '|^{2})} \partial _{\alpha }\partial _{\delta }u \\ & \ - 2s T_{ \tilde{X}_{s} \partial ^{0} u {\tilde{g}}^{0\nu} \xi _{\alpha }{\tilde{g}}^{\alpha \beta} \xi _{\beta }\partial _{\xi _{\delta}} (\log |\xi '|^{2})} \partial _{\nu }\partial _{\delta }u \\ & \ + T_{\tilde{p}a^{\gamma}_{0}} \partial _{0} \partial _{\gamma }u \end{aligned} $$

so we obtain, in unsymmetrized form, the relation (7.82) with

$$ \begin{aligned} a^{\alpha \gamma} = & \ 2\xi _{\beta }{\tilde{g}}^{\alpha \beta} ( a^{ \gamma }+ \xi _{\delta }\partial ^{\delta }u \partial _{\xi _{\gamma}} \tilde{X}_{s} + \tilde{X}_{s} \partial ^{\gamma }u + \tilde{X}_{s} \partial ^{0} u {\tilde{g}}^{0\beta} + \frac{s}{2} \tilde{X}_{s} \xi _{ \delta }\partial ^{\delta }u \partial _{\xi _{\gamma}}(\log |\xi '|^{2}) ) \\ & \ - \tilde{p}(2s \tilde{X}_{s}\partial ^{0} u {\tilde{g}}^{0\alpha} \partial _{\xi _{\gamma}} (\log |\xi '|^{2})- \delta ^{\alpha}_{0} a^{ \gamma}_{0}). \end{aligned} $$
(7.83)

Finally, the last difficulty we face is that we do not have good enough estimates for \(\partial _{t}^{2} u\). This is rectified by using instead the corrected expression \(\hat{\partial}_{t}^{2} u \) introduced in (5.13). This yields a corresponding correction of \(c\), namely

$$ {\mathring{c}}_{\tilde{X}_{s},B} = T_{a^{\alpha \beta}} \widehat{\partial _{\alpha }\partial _{\beta}} u. $$
(7.84)

With these notations, we can now state a more refined version of Proposition 7.7:

Proposition 7.9

Let \(\tilde{X}_{s}\) be the symbol constructed in Proposition 7.7. Then the conclusion of Proposition 7.7 holds equally for \({\mathring{c}}_{\tilde{X}_{s},B}\), with the corresponding expressions \({\mathring{q}}_{2}\) and \({\mathring{q}}_{0}\) satisfying a stronger version of (7.61),

$$ \|{\mathring{q}}_{2}\|_{L^{\infty }S^{2}} \lesssim {\mathcal {B}}^{2}, \qquad \|P_{< k} \partial _{0} {\mathring{q}}_{2}\|_{ L^{\infty }S^{2}} \lesssim 2^{k} {\mathcal {B}}^{2}, $$
(7.85)

and with \({\mathring{q}}_{0}\) having \(\partial _{x} {\mathfrak {P}}\) type regularity,

$$ \| {\mathring{q}}_{0}\|_{\partial _{x} {\mathfrak {P}}S^{0}} \lesssim { \mathcal {A}}. $$
(7.86)

Proof

A direct computation using (7.83) and (7.64) shows that the coefficients \(a^{\alpha \beta}\) have the form

$$ a^{\alpha \beta} = \tilde{p}q^{\alpha \beta}, \qquad q^{\alpha \beta} \in {\mathfrak {P}}S^{0}, $$

and thus

$$ {\mathring{c}}_{\tilde{X}_{s},B} = T_{ \tilde{p}q^{\alpha \beta}} \widehat{\partial _{\alpha }\partial _{\beta}} u. $$

We need to express this in the form \(T_{\tilde{p}} \partial _{x} {\mathfrak {P}}\) plus a balanced component. For this we consider two cases:

a) If \((\alpha ,\beta ) \neq (0,0)\) then the above component of \({\mathring{c}}\) has the form

$$ T_{\tilde{p}q} \partial _{x} \partial u = T_{\tilde{p}} \partial _{x} (T_{q} \partial u) - T_{\tilde{p}} T_{\partial _{x} q} \partial u + (T_{ \tilde{p}q} - T_{\tilde{p}} T_{q}) \partial _{x} \partial u. $$

Here the first term on the right is as needed, so we set

$$ {\mathring{q}}_{2} = - T_{\tilde{p}} T_{\partial _{x} q} \partial u + (T_{ \tilde{p}q} - T_{\tilde{p}} T_{q}) \partial _{x} \partial u. $$

The first term is balanced by Lemma 5.7(a) and the second is balanced by Lemma 5.9. We still need to estimate \(\partial _{0} {\mathring{q}}_{2}\) as in (7.85), which is immediate using Lemma 5.4, Lemma 5.7(a) and Lemma 5.9.

b) If \((\alpha ,\beta ) = (0,0)\) then the above component has the form

$$ T_{\tilde{p}q} \hat{\partial}_{t}^{2} u = \sum _{(\alpha ,\beta ) \neq (0,0)} T_{\tilde{p}q} (T_{{\tilde{g}}^{\alpha \beta}} \partial _{\alpha } \partial _{\beta }u + T_{\partial _{\alpha }\partial _{\beta }u} { \tilde{g}}^{\alpha \beta}). $$

The first term on the right is treated exactly as in case (a), by pulling out one spatial derivative, while the second is directly placed in \({\mathring{q}}_{2}\) using again Lemma 5.4 and (a minor variation of) Lemma 5.7(a). □

7.3.4 Paradifferential energy estimates associated to \(\tilde{X}_{s}\)

We now use the symbol \(\tilde{X}_{s}\) given by Proposition 7.7 in order to construct an \(H^{1} \times L^{2}\) balanced energy functional for the conjugated problem (7.48). This in turn gives an \(H^{s+1} \times H^{s}\) balanced energy functional for the original linear paradifferential flow (3.25), thus completing the proof of Theorem 7.1.

Broadly speaking, we will be following the analysis in the \(s=0\) case, but with more care since we are replacing the vector field \(X\) with the pseudodifferential multiplier \(\tilde{X}_{s}\). In particular, here, instead of paraproducts we will have to commute paraproducts with paradifferential operators. The difficulty is that we will no longer be able to estimate the commutator contributions in a direct, perturbative fashion; instead, we will need to take into account unbalanced subprincipal commutator terms, and devise an additional zero order correction to \(\tilde{X}_{s}\) in order to deal with them.

We begin by considering the conjugation operator \(B\), for which we provide a favourable decomposition:

Lemma 7.10

The operator \(B\) given by (7.49) admits a decomposition

$$ B = B_{0} + B_{1} + B_{2}, $$
(7.87)

where the three components are as follows:

(i) \(B_{0} = T_{b_{0}^{\gamma}} \partial _{\gamma}\) is the leading part, with symbol

$$ b_{0}(x,\xi )= i |\xi |^{s} \{ {\tilde{g}}^{\alpha \beta} \xi _{\alpha} \xi _{\beta},|\xi |^{-s}\}. $$
(7.88)

(ii) \(B_{1}\) is unbalanced but with a favourable null structure,

$$ B_{1} w = T_{h(\partial u)} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}( \partial _{\alpha }\partial u, \partial _{\beta}|D_{x}|^{-1}w) $$
(7.89)

with \(h\) depending smoothly on \(\partial u\).

(iii) \(B_{2}\) is balanced,

$$ \| B_{2} w \|_{L^{2}} \lesssim {\mathcal {B}}^{2} \|\partial w\|_{L^{2}} . $$
(7.90)

This result is a direct consequence of Proposition 6.15; we have stated it here separately only for quick reference in this section.

At this point, we can repeat the multiplier computation in the previous section, using as multiplier the operator \(T_{\tilde{\mathfrak {M}}_{s}}\) defined in (7.55). Here \(\tilde{X}_{s}\) will be the symbol constructed in the previous subsection, so it remains to choose \(\tilde{Y}_{s0}\), which we take to be

$$ \tilde{Y}_{s0} = -{\mathring{q}}_{0} $$
(7.91)

with \({\mathring{q}}_{0}\) as in Proposition 7.9.

Using the \(T_{\tilde{\mathfrak {M}}_{s}}\) operator as a multiplier, we seek to derive an associated energy identity. Here, at leading order, we would like the energy functional \(E_{\tilde{X}_{s}}\) to be as in (7.59), described by the symbol \(e_{\tilde{X}_{s}}\) defined as in (7.58). On the other hand the energy flux is to be described at leading order by the symbol \({\mathring{c}}_{\tilde{X}_{s},B}\) in (7.84) where we add the contribution of \(\tilde{Y}_{s0}\).

To have a modular argument, at first we simply assume that

  • \(\tilde{X}_{s} \in {\mathfrak {P}}S^{1}\), with the representation (7.63) with \(a^{\gamma }\in {\mathfrak {P}}S^{1}\), but without assuming that \(a^{\gamma}\) are given by (7.64).

  • \(\tilde{Y}_{s0} \in \partial _{x} {\mathfrak {P}}\), but without assuming that \(\tilde{Y}_{s0}\) is as in Proposition 7.9.

Given such \(\tilde{X}_{s}\) and \(\tilde{Y}_{s0}\), we will describe the leading part of the energy flux using the symbol

$$ {\tilde{c}}_{s} = {\mathring{c}}_{\tilde{X}_{s},B} + T_{p} \tilde{Y}_{s0}. $$
(7.92)

This is a second degree polynomial in \(\xi _{0}\), which we expand as

$$ {\tilde{c}}_{s}(x,\xi ) = {\tilde{c}}_{s}^{0}(x,\xi ') \xi _{0}^{2} + { \tilde{c}}_{s}^{1}(x,\xi ') \xi _{0} + {\tilde{c}}_{s}^{2}(x,\xi '). $$
(7.93)

To this expansion we associate the bilinear form

$$ \tilde{C}_{s}(w,w) = \int - T_{{\tilde{c}}_{s}^{0}} \partial _{t} w \cdot \partial _{t} w + \frac{1}{2} T_{\partial _{t} {\tilde{c}}_{s}^{0}} w \cdot \partial _{t} w + T_{i{\tilde{c}}_{s}^{1}}w \cdot \partial _{t} w + T_{{\tilde{c}}_{s}^{2}} w \cdot w \, dx, $$
(7.94)

which, integrated also over time, would yield exactly the quadratic form generated by the symbol \({\tilde{c}}_{s}\) in Weyl calculus.

Now we can state our main multiplier energy identity, which is as follows:

Proposition 7.11

Let \(\tilde{X}_{s} \in {\mathfrak {P}}S^{1}\) and \(\tilde{Y}_{s0} \in \partial _{x} {\mathfrak {P}}\) be as above, and let the multiplier \(T_{\tilde{\mathfrak {M}}_{s}}\) be as in (7.55). Then there exists an energy functional \(E_{\tilde{X}_{s},B}\) with the following properties:

i) Leading order expression:

$$ E_{\tilde{X}_{s},B}[w] = E_{\tilde{X}_{s}}[w]+ Err({\mathcal {A}^{\sharp }}) . $$
(7.95)

ii) Energy identity:

$$ \frac{d}{dt} E_{\tilde{X}_{s},B}[w] = \int T_{\tilde{P}_{B}} w \cdot T_{ \tilde{\mathfrak {M}}_{s}} w \, dx + \tilde{C}_{s}(w,w) + Err({ \mathcal {B}}^{2}). $$
(7.96)

We recall again that here we assume neither that \(\tilde{X}_{s}\) is the “vector field” constructed in the previous subsection nor that \(\tilde{X}_{s}\) is forward time-like. Instead we will add these two assumptions later on when we apply the Proposition, in order to guarantee that \(\tilde{C}_{s}(w,w)\) is controlled by \(Err({\mathcal {B}}^{2})\), respectively that \(E_{\tilde{X}_{s}}\) is positive definite.

Proof

As stated, the result in the Proposition is linear with respect to both \(\tilde{X}_{s}\) and \(\tilde{Y}_{s0}\), and also separately in \({\tilde{A}}\) and \({\tilde{b}}_{0}\). This allows us to divide the proof into several cases, which turn out to be easier to manage separately.

I. The contribution of \(\tilde{X}_{s1}\) with \({\tilde{A}}=0\) and \({\tilde{b}}_{0}=0\). Our starting point here is the integral

$$ I_{X}^{1} = 2\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \tilde{P}} w \cdot T_{i \tilde{X}_{s1}} w \, dx dt . $$

The operator \(T_{i \tilde{X}_{s1}}\) is purely spatial and antisymmetric, so we can integrate by parts three times in \([0,T] \times {\mathbb{R}}^{n}\) to rewrite \(I_{X}^{1}\) in the form

$$ \begin{aligned} I_{X}^{1} = & \ 2\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \tilde{g}}^{\alpha \beta}} T_{i\partial _{\beta }\tilde{X}_{s1}} w \cdot \partial _{\alpha }w \, dx dt + \int _{0}^{T} \!\!\!\! \int _{{ \mathbb{R}}^{n}}[T_{{\tilde{g}}^{\alpha \beta}}, T_{i\tilde{X}_{s1}}] \partial _{\alpha }w \cdot \partial _{\beta }w \, dx dt \\ & \ + \left . 2 \int T_{{\tilde{g}}^{\alpha 0}} T_{i \tilde{X}_{s1}} w \cdot \partial _{\alpha }w \, dx \right |_{0}^{T}. \end{aligned} $$

Here the expression on the second line should be thought of as the energy and the expression on the first line represents the energy flux. We remark that if there were no boundaries at times \(t=0,T\) then this would be akin to computing the commutator of \(T_{\tilde{P}}\) and \(T_{i\tilde{X}_{s1}}\).
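Many of the manipulations below, in which derivatives are "commuted in and out" of para-coefficients, rest on the Leibniz rule for paraproducts. Since the frequency localizations defining \(T_{a}\) commute with constant coefficient derivatives, one has the exact identities

$$ \partial _{\beta }(T_{a} w) = T_{\partial _{\beta }a} w + T_{a} \partial _{\beta }w, \qquad [\partial _{\beta}, T_{a}] = T_{\partial _{\beta }a}, $$

which are used freely in what follows, with the terms where the derivative falls on the para-coefficient either retained or estimated as errors.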

The above expression needs some further processing to put it in the desired form. We begin with the energy component, where we need to compound the paraproducts and separate the cases \(\alpha =0\) and \(\alpha \neq 0\). This is done using Lemma 6.11,

$$\begin{aligned} \int T_{{\tilde{g}}^{\alpha 0}} T_{i \tilde{X}_{s1}} w \cdot \partial _{ \alpha }w \, dx =& \int T_{{\tilde{g}}^{00} \tilde{X}_{s1}} w \cdot \partial _{0} w \, dx + \int T_{{\tilde{g}}^{j 0} \tilde{X}_{s1} } w \cdot \partial _{j} w \, dx \\ &{}+ O({\mathcal {A}^{\sharp }}) \| w[t]\|_{ \mathcal {H}^{1}}^{2}, \end{aligned}$$

as needed.

We now successively consider the space-time integrals on the first line in \(I_{X}^{1}\). In the first integral, the components where the \({\tilde{g}}^{\alpha \beta}\) frequency is at least comparable to the \(\tilde{X}_{s1}\) frequency are balanced, and we can use Lemma 6.12 to compose the paraproducts as

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{\tilde{g}}^{\alpha \beta}} T_{i\partial _{\beta }\tilde{X}_{s1}} w \cdot \partial _{ \alpha }w \, dx dt = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{i T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }\tilde{X}_{s1}} w \cdot \partial _{\alpha }w \, dx dt + Err({\mathcal {B}}^{2}) , $$

where the integral on the right can be freely switched to the Weyl calculus if \(\alpha \neq 0\), and represents one of the desired components of our energy flux.

For the second space-time integral in \(I_{X}^{1}\) we use the commutator expansion in Proposition 6.15 to get a principal part, an unbalanced subprincipal part and a balanced term,

$$\begin{aligned} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}[T_{{\tilde{g}}^{\alpha \beta}}, T_{i\tilde{X}_{s1}}] \partial _{\alpha }w \cdot \partial _{ \beta }w \, dx dt =& \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \{ {\tilde{g}}^{\alpha \beta},\tilde{X}_{s1}\}_{p}} \partial _{\alpha }w \cdot \partial _{\beta }w \, dx dt + I_{X,sub}^{1} \\ &{}+ Err({\mathcal {B}}^{2}) , \end{aligned}$$

where the unbalanced subprincipal part \(I_{X,sub}^{1}\) has the form

$$ I_{X,sub}^{1} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \gamma}} L_{lh}(\partial _{ \gamma }\partial _{x}^{2} u,\ \partial _{\alpha }w) \cdot \partial _{ \beta }w \, dx dt . $$
(7.97)

We postpone the analysis of \(I_{X,sub}^{1}\) for later, and focus now on the principal part, which has symbol

$$ \partial _{\xi _{j}} \tilde{X}_{s1} \partial _{j} {\tilde{g}}^{\alpha \beta}. $$

As in Lemma 6.7, we may perturbatively (with \(O({\mathcal {B}}^{2} L^{\infty }S^{0})\) errors) replace this by

$$ h^{\alpha \beta} = T_{\partial _{\xi _{j}} \tilde{X}_{s1}} \partial _{j} {\tilde{g}}^{\alpha \beta}. $$

This is almost in the desired form, except that we need to switch it to Weyl calculus. We observe that we have no contribution if both \(\alpha \) and \(\beta \) are zero. We separate the remaining cases, where switching to the Weyl calculus yields errors as follows,

$$ \begin{aligned} Err = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\partial _{j} h^{j0}} w \cdot \partial _{0} w \, dxdt + \frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\partial _{j} h^{jm}} w \cdot \partial _{m} w \, dx dt \\ = & \ \frac{1}{4} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{\alpha }\partial _{\beta }h^{\alpha \beta}} w \cdot w \, dx dt + \left . \int T_{ \partial _{j} h^{j0}} w \cdot w \,dx \right |_{0}^{T}. \end{aligned} $$

The last integral is an acceptable energy correction. For the first integral to be an acceptable energy flux error, it suffices to show that

$$ \| P_{< k} \partial _{\alpha }\partial _{\beta }h^{\alpha \beta} \|_{L^{ \infty }S^{0}} \lesssim 2^{2k} {\mathcal {B}}^{2}. $$

It is easily seen that this is indeed the case if any of the derivatives apply to \(\tilde{X}_{s1}\), by using the time derivative component of the \({\mathfrak {C}}\) bound for either \(\tilde{X}_{s1}\) or \({\tilde{g}}^{\alpha \beta}\) to bound time derivatives (of which we can have at most one). So we are left with showing that

$$ \| P_{< k} \partial _{\alpha }\partial _{\beta }{\tilde{g}}^{\alpha \beta} \|_{L^{\infty }S^{0}} \lesssim 2^{k} {\mathcal {B}}^{2}. $$

But for this we use Lemma 6.10.
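For the reader's convenience we record the elementary operator identity behind the Weyl-switching corrections above: since \(\partial _{j} T_{h} = T_{h} \partial _{j} + T_{\partial _{j} h}\), we have the exact identity

$$ T_{h} \partial _{j} = \frac{1}{2}\left ( T_{h} \partial _{j} + \partial _{j} T_{h} \right ) - \frac{1}{2} T_{\partial _{j} h}, $$

so replacing an operator of the form \(T_{h} \partial _{j}\) by its symmetrized (Weyl) version produces exactly corrections with coefficient \(\frac{1}{2} T_{\partial _{j} h}\); iterating this for expressions carrying two derivatives produces the factors \(\frac{1}{4}\) seen above.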

II. The contribution of \(\tilde{X}_{s0}\) with \({\tilde{A}}=0\) and \({\tilde{b}}_{0}=0\). Here we will follow the same road map as in the case of \(\tilde{X}_{s1}\), but additional care will be needed to handle the extra time derivatives. The integral we need to consider is

$$ I_{X}^{0} = 2\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \tilde{P}} w \cdot T_{\tilde{X}_{s0}} \partial _{0} w \, dx dt . $$

We can integrate by parts once in \([0,T] \times {\mathbb{R}}^{n}\) to rewrite \(I_{X}^{0}\) in the form

$$ \begin{aligned} I_{X}^{0} = & \ - 2\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \tilde{g}}^{\alpha \beta}} T_{\partial _{\beta }\tilde{X}_{s0}} \partial _{0} w \cdot \partial _{\alpha }w \, dx dt - 2 \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{\tilde{g}}^{\alpha \beta}} T_{ \tilde{X}_{s0}} \partial _{0} \partial _{\beta }w \cdot \partial _{ \alpha }w \, dx dt \\ & \ \ + \left . 2 \int T_{{\tilde{g}}^{\alpha 0}} T_{\tilde{X}_{s0}} \partial _{0} w \cdot \partial _{\alpha }w \, dx \right |_{0}^{T}. \end{aligned} $$

In the middle term we switch the operator \(T_{{\tilde{g}}^{\alpha \beta}} T_{\tilde{X}_{s0}} \partial _{0}\) to the right, while integrating by parts once in time, in order to put it in the more symmetric form

$$\begin{aligned} I^{0}_{X} = & \ - 2\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \tilde{g}}^{\alpha \beta}} T_{\partial _{\beta }\tilde{X}_{s0}} \partial _{0} w \cdot \partial _{\alpha }w \, dx dt \\ &{} + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}( \partial _{0} T_{\tilde{X}_{s0}} T_{{ \tilde{g}}^{\alpha \beta}} - T_{{\tilde{g}}^{\alpha \beta}} T_{\tilde{X}_{s0}} \partial _{0}) \partial _{\beta }w \cdot \partial _{\alpha }w \, dx dt \\ &{} + \left . \int 2 T_{{\tilde{g}}^{\alpha 0}} T_{\tilde{X}_{s0}} \partial _{0} w \cdot \partial _{\alpha }w - T_{\tilde{X}_{s0}} T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }w \cdot \partial _{ \beta }w \, dx \right |_{0}^{T}. \end{aligned}$$

For the energy term there is nothing new: as before, we use the paraproduct rules to rewrite it as the desired leading part plus an acceptable error. We now consider the second space-time integral, where more care is needed. The operator

$$ C = \partial _{0} T_{\tilde{X}_{s0}} T_{{\tilde{g}}^{\alpha \beta}} - T_{{ \tilde{g}}^{\alpha \beta}} T_{\tilde{X}_{s0}} \partial _{0} $$

has a commutator structure, which is good. However, we have to choose carefully the order in which we commute because, depending on whether \(\alpha =0\) or \(\beta = 0\), we might otherwise end up with a double time derivative. The positive feature, arising from the fact that we work with the metric \({\tilde{g}}\) rather than \(g\), is that if \((\alpha ,\beta )= (0,0)\) then there is a single commutator, which does not involve time derivatives. For clarity we consider the four cases separately:

  1. i)

    The case \(\alpha \neq 0\), \(\beta \neq 0\). This is the simplest case, where, commuting and peeling off operators of size \(O_{L^{2}}({\mathcal {B}}^{2})\), we write

    $$ \begin{aligned} C = & \ T_{\partial _{0} \tilde{X}_{s0}} T_{{\tilde{g}}^{\alpha \beta}}+ T_{\tilde{X}_{s0}} T_{\partial _{0} {\tilde{g}}^{\alpha \beta}} - [T_{{ \tilde{g}}^{\alpha \beta}}, T_{\tilde{X}_{s0}}] \partial _{0}. \end{aligned} $$
  2. ii)

    The case \(\alpha = 0\), \(\beta \neq 0\). Here we use the same order as before.

  3. iii)

    The case \(\alpha \neq 0\), \(\beta = 0\). Here we reverse the order, to write

    $$ \begin{aligned} C = & \ T_{{\tilde{g}}^{\alpha \beta}} T_{\partial _{0} \tilde{X}_{s0}} + T_{\partial _{0} {\tilde{g}}^{\alpha \beta}} T_{\tilde{X}_{s0}} - \partial _{0} [T_{{\tilde{g}}^{\alpha \beta}}, T_{\tilde{X}_{s0}}] , \end{aligned} $$

    where the middle term is integrated by parts once more to move \(\partial _{0}\) together with \(\partial _{\alpha}\), at the expense of another negligible energy correction

    $$ - \left . \int [T_{{\tilde{g}}^{\alpha 0}}, T_{\tilde{X}_{s0}}] \partial _{0} w \cdot \partial _{\alpha }w \, dx \right |_{0}^{T}. $$
  4. iv)

    The case \(\alpha = 0\), \(\beta = 0\). Here we simply have

    $$ C = T_{\partial _{0} \tilde{X}_{s0}}. $$

Now we put together the terms in the four cases.

a) In the \(\partial _{0} \tilde{X}_{s0}\) term the multiplication order does not matter, and we can further replace it by \(T_{ T_{{\tilde{g}}^{\alpha \beta}} \partial _{0} \tilde{X}_{s0}}\) modulo \(O_{L^{2}}({\mathcal {B}}^{2})\) errors. Thus we retain the integral

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ T_{{\tilde{g}}^{ \alpha \beta}} \partial _{0} \tilde{X}_{s0}} \partial _{\beta }w \cdot \partial _{\alpha }w \, dx dt. $$

b) In the \(\partial _{0} {\tilde{g}}^{\alpha \beta}\) term, however, the commutator is not negligible, so in addition to \(T_{\tilde{X}_{s0}} T_{\partial _{0} {\tilde{g}}^{\alpha \beta}}\) we also need the commutator \([T_{\partial _{0} {\tilde{g}}^{\alpha 0}}, T_{\tilde{X}_{s0}}]\). Hence we get two contributions,

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\tilde{X}_{s0}} T_{ \partial _{0} {\tilde{g}}^{\alpha \beta}} \partial _{\beta }w \cdot \partial _{\alpha }w \, dx dt + \int _{0}^{T} \!\!\!\! \int _{{ \mathbb{R}}^{n}}[ T_{\partial _{0} {\tilde{g}}^{\alpha 0}},T_{\tilde{X}_{s0}}] \partial _{0} w \cdot \partial _{\alpha }w \, dx dt. $$

c) In the \([T_{{\tilde{g}}^{\alpha \beta}}, T_{\tilde{X}_{s0}}]\) term, distinguishing between \(\beta = j \neq 0\) and \(\beta = 0\), we get

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}- [T_{{\tilde{g}}^{ \alpha j}}, T_{\tilde{X}_{s0}}] \partial _{0} \partial _{j} w \cdot \partial _{\alpha }w \, dx dt - \int _{0}^{T} \!\!\!\! \int _{{ \mathbb{R}}^{n}}\partial _{0} w \cdot [T_{{\tilde{g}}^{\alpha 0}},T_{ \tilde{X}_{s0}}] \partial _{0} \partial _{\alpha }w \, dx dt. $$

In the first integral we move \(\partial _{j}\) and the commutator term to the right, also commuting them, so the above expression is rewritten as

$$\begin{aligned} &\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{0} w \cdot [T_{ \tilde{X}_{s0}},T_{{\tilde{g}}^{\alpha \beta}}] \partial _{\beta } \partial _{\alpha }w \, dx dt + \int _{0}^{T} \!\!\!\! \int _{{ \mathbb{R}}^{n}}[T_{\partial _{j} {\tilde{g}}^{\alpha j}}, T_{\tilde{X}_{s0}}] \partial _{0} w \cdot \partial _{\alpha }w \\ &\quad {} + [T_{{\tilde{g}}^{\alpha j}}, \partial _{j} T_{\tilde{X}_{s0}}] \partial _{0} w \cdot \partial _{ \alpha }w \, dx dt. \end{aligned}$$

We retain the first term as it is, combine the second one with the second term in part (b) and discard the last one as perturbative, \(Err({\mathcal {B}}^{2})\).

Putting all terms together, we have rewritten \(I^{0}_{X}\), modulo perturbative terms, as

$$ \begin{aligned} I_{X}^{0} = & \ - 2\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\beta }\tilde{X}_{s0}} \partial _{0} w \cdot \partial _{\alpha }w \, dx dt + \int _{0}^{T} \!\!\!\! \int _{{ \mathbb{R}}^{n}}T_{ T_{{\tilde{g}}^{\alpha \beta}} \partial _{0} \tilde{X}_{s0}} \partial _{\beta }w \cdot \partial _{\alpha }w \, dx dt \\ & \ + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\tilde{X}_{s0}} T_{ \partial _{0} {\tilde{g}}^{\alpha \beta}} \partial _{\beta }w \cdot \partial _{\alpha }w \, dx dt + \int _{0}^{T} \!\!\!\! \int _{{ \mathbb{R}}^{n}}\partial _{0} w \cdot \partial _{\beta }[T_{\tilde{X}_{s0}},T_{{ \tilde{g}}^{\alpha \beta}}] \partial _{\alpha }w \, dx dt \\ & \ - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{0} w \cdot [\partial _{\beta }T_{\tilde{X}_{s0}}, T_{ {\tilde{g}}^{\alpha \beta}}] \partial _{\alpha }w \, dx dt + Err({\mathcal {B}}^{2}) \\ := & \ I_{X}^{01} + I_{X}^{02} + I_{X}^{03} + I_{X}^{04} + I_{X}^{05} + Err({\mathcal {B}}^{2}) . \end{aligned} $$

This can be simplified further by observing that, in view of Lemma 2.3, the term \(I_{X}^{05}\) is also perturbative. Thus we arrive at

$$ I_{X}^{0} = I_{X}^{01} + I_{X}^{02} + I_{X}^{03} + I_{X}^{04} + Err({ \mathcal {B}}^{2}). $$

We successively consider these terms:

IIa. The contribution of \(I_{X}^{01}\). This corresponds to the symbol

$$ 2 T_{T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }\tilde{X}_{s0}} \xi _{0} \xi _{\alpha}, $$

which is akin to one of the components of \(c_{X_{s},B}\) in (7.57). We can turn this into the corresponding component of \({\mathring{c}}_{\tilde{X}_{s},B}\). Precisely, given \(\tilde{X}_{s0}\) as in the representation (7.63), that component is

$$ 2 T_{T_{{\tilde{g}}^{\alpha \beta} a^{\gamma}_{0}} \widehat {\partial _{\beta }\partial _{\gamma}} u} \xi _{0} \xi _{ \alpha}, $$

where we recall that the hat above is understood as nonexistent unless \(\beta = \gamma = 0\), in which case it is interpreted as the corrected expression (5.13). The difference between the two coefficients is easily seen to have size \({\mathcal {B}}^{2}\), so it is perturbative, as in Lemma 6.9. It remains to switch this modification of \(I_{X}^{01}\) to the Weyl calculus, which requires estimating the integral

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\partial _{\alpha }T_{{ \tilde{g}}^{\alpha \beta} a^{\gamma}_{0}} \widehat{\partial _{\beta }\partial _{\gamma}} u} \partial _{0} w \cdot w \, dx dt. $$

This follows from the bound

$$ \| P_{< k} \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta} a^{\gamma}_{0}} \widehat{\partial _{\beta }\partial _{\gamma}} u\|_{L^{\infty}} \lesssim 2^{k} {\mathcal {B}}^{2}, $$

which in turn follows from Lemma 5.12(b) after commuting \(a^{\gamma}_{0}\) out.

IIb. The contribution of \(I_{X}^{02}\). Exactly as above, this integral also corresponds to a term in \(c_{X_{s},B}\). Again, after a perturbative \(Err({\mathcal {B}}^{2})\) correction we can turn this into the corresponding term in \({\mathring{c}}_{\tilde{X}_{s},B}\), which has the para-coefficient

$$ T_{{\tilde{g}}^{\alpha \beta} a^{\gamma}} \widehat{\partial _{0} \partial _{\gamma}} u. $$

The associated integral is

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ T_{{\tilde{g}}^{ \alpha \beta} a^{\gamma}} \widehat{\partial _{0} \partial _{\gamma}} u} \partial _{\beta }w \cdot \partial _{\alpha }w \, dx dt. $$

We would like to switch this to Weyl calculus, but we need to be careful here because the convention for the Weyl form differs depending on whether \(\beta \) is zero or not.

If \(\beta \neq 0\) then the error corresponds to switching the operator on the left to Weyl calculus, and has the form

$$ \frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{\beta }T_{{\tilde{g}}^{\alpha \beta} a^{\gamma}} \widehat{\partial _{0} \partial _{\gamma}} u} w \cdot \partial _{ \alpha }w \, dx dt. $$
(7.98)

The same applies if \(\alpha = \beta = 0\). But if \(\alpha \neq 0\) and \(\beta = 0\) then we have to switch the paraproduct to the right, and then the Weyl correction is

$$ \frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{ \beta }w \cdot T_{ \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta} a^{ \gamma}} \widehat{\partial _{0} \partial _{\gamma}} u} w \, dx dt. $$

We can rectify this discrepancy and switch this correction to the form in (7.98) by integrating twice by parts, first in \(x_{\beta}\) and then in \(x_{\alpha}\). Since we are in the case \(\beta = 0\), the first step yields an energy correction, namely

$$ \frac{1}{4} \left . \int w \cdot T_{ \partial _{\alpha }T_{{\tilde{g}}^{ \alpha 0} a^{\gamma}} \widehat{\partial _{0} \partial _{\gamma}} u} w \, dx \right |_{0}^{T} . $$

As \(\alpha \neq 0\), for this to be an acceptable \(Err({\mathcal {A}^{\sharp }})\) error we need the bound

$$ \| P_{< k} \widehat{\partial _{0} \partial _{\gamma}} u \|_{L^{\infty}} \lesssim 2^{k} {\mathcal {A}^{\sharp }}. $$

This is obvious if \(\gamma \neq 0\), and follows from (5.15) otherwise.

Thus we are left with considering the correction in (7.98), summed over all \(\alpha \) and \(\beta \), which we would like to estimate perturbatively.

Here there is no structure in the \(\gamma \) summation, so we can fix \(\gamma \). The easier case is when \(\gamma \neq 0\). Then we can commute \(\partial _{\gamma}\) out, as well as \(a^{\gamma}\), and \(\partial _{\beta}\) in, writing

$$ \partial _{\beta }T_{{\tilde{g}}^{\alpha \beta} a^{\gamma}} {\partial _{0} \partial _{\gamma}} u = T_{a^{\gamma}} \partial _{\gamma }T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\beta }\partial _{0} u + f , $$

where the error term \(f\) satisfies

$$ \| P_{< k} f\|_{L^{\infty}} \lesssim 2^{k} {\mathcal {B}}^{2}. $$
(7.99)

We may also correct the second order time derivative, arriving at

$$ \partial _{\beta }T_{{\tilde{g}}^{\alpha \beta} a^{\gamma}} {\partial _{0} \partial _{\gamma}} u = T_{a^{\gamma}} \partial _{\gamma }T_{{ \tilde{g}}^{\alpha \beta}} \widehat{\partial _{\beta }\partial _{0}} u + f. $$

The remaining term is no longer directly perturbative, but its contribution may instead be estimated by integrating by parts,

$$ \begin{aligned} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ T_{a^{\gamma}} \partial _{\gamma }T_{{\tilde{g}}^{\alpha \beta}} \widehat{\partial _{\beta }\partial _{0}} u} w \cdot \partial _{ \alpha }w \, dx dt = & \ - \frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{\alpha }T_{a^{\gamma}} \partial _{\gamma }T_{{\tilde{g}}^{ \alpha \beta}} \widehat{\partial _{\beta }\partial _{0}} u} w \cdot w \, dx dt \\ & + \left . \int T_{ T_{a^{\gamma}} \partial _{\gamma }T_{{\tilde{g}}^{0 \beta}} \widehat{\partial _{\beta }\partial _{0}} u} w \cdot w \, dx \right |_{0}^{T}. \end{aligned} $$

The last term is a bounded energy correction, as

$$ \| P_{k} T_{a^{\gamma}} \partial _{\gamma }T_{{\tilde{g}}^{0\beta}} \widehat{\partial _{\beta }\partial _{0}} u\|_{L^{\infty}} \lesssim 2^{2k} {\mathcal {A}^{\sharp }}. $$

It remains to show that the first term is also perturbative,

$$ \| P_{< k} T_{a^{\gamma}} \partial _{\gamma }T_{{\tilde{g}}^{\alpha \beta}} \widehat{\partial _{\beta }\partial _{0}} u\|_{L^{\infty}} \lesssim 2^{2k} {\mathcal {B}}^{2}. $$

Commuting \(\partial _{\alpha} \) inside and discarding \(a^{\gamma }\partial _{\gamma}\), this reduces to

$$ \| P_{< k} \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \widehat{\partial _{\beta }\partial _{0}} u\|_{L^{\infty}} \lesssim 2^{k} {\mathcal {B}}^{2}, $$

which is again a consequence of Lemma 5.12(b).

It remains to consider the case \(\gamma =0\), where we take advantage of the hat correction. Precisely, using the \(u\) equation, we write

$$ \widehat{\partial _{0} \partial _{0}} u = - \sum _{(\mu ,\nu ) \neq (0,0)} T_{{\tilde{g}}^{\mu \nu}} \partial _{\mu }\partial _{\nu }u + T_{ \partial _{\mu }\partial _{\nu }u} {\tilde{g}}^{\mu \nu}. $$

We substitute this into the paracoefficient in (7.98), peeling off perturbative contributions. Fixing \(\mu \) and \(\nu \) we may assume \(\mu \neq 0\) and arrive at

$$ \partial _{\beta }T_{{\tilde{g}}^{\alpha \beta} a^{\gamma}} \widehat{\partial _{0} \partial _{0}} u = - \sum _{\mu \neq 0} T_{a^{ \gamma }{\tilde{g}}^{\mu \nu}} \partial _{\mu }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }\partial _{\nu }u + f , $$

with \(f\) as in (7.99). At this point we can repeat the argument in the case \(\gamma \neq 0\).

IIc. The contribution of \(I_{X}^{03}\). We recall that this is

$$ I_{X}^{03} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \tilde{X}_{s0}} T_{\partial _{0} {\tilde{g}}^{\alpha \beta}} \partial _{ \beta }w \cdot \partial _{\alpha }w \, dx dt. $$

This term is easily seen to be perturbative unless the spatial frequency of \(\tilde{X}_{s0}\) is smaller than that of \(\partial _{0} {\tilde{g}}^{\alpha \beta}\), see Lemma 6.12. Thus we can think of the principal symbol of the product \(T_{\tilde{X}_{s0}} T_{\partial _{0} {\tilde{g}}^{\alpha \beta}}\) as being \(T_{\tilde{X}_{s0}} \partial _{0} {\tilde{g}}^{\alpha \beta}\). However some care is needed here with the error, which is lower order but not necessarily balanced. Precisely, using Proposition 6.16, we can expand this product into a leading part, an unbalanced subprincipal part and a perturbative term,

$$ T_{\tilde{X}_{s0}} T_{\partial _{0} {\tilde{g}}^{\alpha \beta}} = T_{T_{ \tilde{X}_{s0}} \partial _{0} {\tilde{g}}^{\alpha \beta}} + OP{ \mathfrak {P}}S^{-2} L_{lh}(\partial _{x}^{2} \partial _{0} {\tilde{g}}^{ \alpha \beta}, \cdot ) + O_{L^{2}}({\mathcal {B}}^{2}). $$
(7.100)

This yields a corresponding decomposition of \(I_{X}^{03}\) into

$$ I_{X}^{03} = I_{X,main}^{03}+ I_{X,sub}^{03} + Err({\mathcal {B}}^{2}). $$

To better describe the first two terms we take a closer look at the coefficient \(\partial _{0} {\tilde{g}}^{\alpha \beta}\), for which we compute

$$ \partial _{0} {\tilde{g}}^{\alpha \beta} = - \left (\partial ^{\beta} u {\tilde{g}}^{\alpha \delta } \partial _{\delta }\partial _{0} u + \partial ^{\alpha} u {\tilde{g}}^{\beta \delta } \partial _{\delta } \partial _{0} u \right ) + 2 {\tilde{g}}^{\alpha \beta} \partial ^{0} u {\tilde{g}}^{0 \delta } \partial _{\delta }\partial _{0} u. $$
(7.101)

Here we have a double time derivative \(\partial _{t}^{2} u\) when \(\delta = 0\), which we replace as before by \(\hat{\partial}_{t}^{2} u\) with perturbative errors. Once this is done, we may also replace all products by paraproducts, arriving at the modified expression

$$ \begin{aligned} \mathring{\partial}_{0} {\tilde{g}}^{\alpha \beta} := & \ - \left ( T_{\partial ^{\beta} u {\tilde{g}}^{\alpha \delta }} \widehat{\partial _{\delta }\partial _{0}} u + T_{\partial ^{\alpha} u {\tilde{g}}^{\beta \delta }} \widehat{\partial _{\delta }\partial _{0}} u \right ) + 2 T_{ \partial ^{0} u {\tilde{g}}^{0 \delta }{\tilde{g}}^{\alpha \beta}} \widehat{\partial _{\delta }\partial _{0}} u \end{aligned} $$

so that the difference is perturbative in the sense that

$$ \partial _{0} {\tilde{g}}^{\alpha \beta} = \mathring{\partial}_{0} { \tilde{g}}^{\alpha \beta} + O({\mathcal {B}}^{2}). $$

Finally we return to the operator setting, where we make the above substitution. In the principal part we can compound the outer paracoefficients at the expense of further negligible errors, writing it in paradifferential form

$$\begin{aligned} T_{T_{\tilde{X}_{s0}} \partial _{0} {\tilde{g}}^{\alpha \beta}} =& T_{q^{ \alpha \beta}}+ O_{L^{2}}({\mathcal {B}}^{2}), \\ I_{X,main}^{03} =& \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{q^{\alpha \beta}} \partial _{\alpha }w \cdot \partial _{\beta }w \, dx dt+ Err({ \mathcal {B}}^{2}), \end{aligned}$$

where the order zero symbols \(q^{\alpha \beta}\) are given by

$$ q^{\alpha \beta} = - \left ( T_{\tilde{X}_{s0} \partial ^{\beta} u { \tilde{g}}^{\alpha \delta }} \widehat{\partial _{\delta }\partial _{0}} u + T_{\tilde{X}_{s0} \partial ^{\alpha} u {\tilde{g}}^{\beta \delta }} \widehat{\partial _{\delta }\partial _{0}} u \right ) + 2 T_{\tilde{X}_{s0} \partial ^{0} u {\tilde{g}}^{0 \delta }{\tilde{g}}^{\alpha \beta}} \widehat{\partial _{\delta }\partial _{0}} u. $$

Here the symbol \(q^{\alpha \beta} \xi _{\alpha }\xi _{\beta}\) is a component of \({\mathring{c}}_{\tilde{X}_{s},B}\), as desired. All we need now is to convert the last expression for \(I_{X,main}^{03}\) to Weyl form. This conversion yields the additional error

$$ \frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{\alpha }q^{\alpha \beta}} w \cdot \partial _{\beta }w \, dx dt, $$

which we need to estimate. Here we separate the three terms in \(q^{\alpha \beta}\). For the first term, after one commutation it remains to show that

$$ \| P_{< k} \partial _{\alpha }T_{{\tilde{g}}^{\alpha \delta}} \widehat{\partial _{\delta }\partial _{0}} u \|_{L^{\infty}} \lesssim 2^{k} {\mathcal {B}}^{2} , $$

which we get from Lemma 5.12. The second term is similar if we integrate by parts to switch \(\alpha \) and \(\beta \), at the expense of a bounded energy correction. Finally, the third term is exactly as in the case of \(I_{X}^{02}\).

Similarly, in the subprincipal term in (7.100) we may peel off perturbative errors to write it as a linear combination of expressions of the form

$$ T_{{\mathfrak {P}}S^{-1}} \tilde{T}_{T_{{\tilde{g}}^{\alpha \delta }} \partial _{x} \widehat{\partial _{\delta }\partial _{0}} u} +T_{{ \mathfrak {P}}S^{-1}} \tilde{T}_{T_{{\tilde{g}}^{\beta \delta }} \partial _{x} \widehat{\partial _{\delta }\partial _{0}} u} + T_{{ \mathfrak {P}}S^{-1}}\tilde{T}_{T_{{\tilde{g}}^{\alpha \beta }}\partial _{x} \widehat{\partial _{\delta }\partial _{0}} u}. $$

We postpone their analysis until later; for now we simply list the two types of contributions:

$$ I_{X,sub}^{031} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \delta }} L_{lh}(\partial _{x} \widehat{\partial _{\delta }\partial _{0}} u, \partial _{\alpha }w) \cdot \partial w \, dx dt , $$
(7.102)
$$ I_{X,sub}^{032} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta }} L_{lh}(\partial _{x} \widehat{\partial _{\delta }\partial _{0}} u, \partial _{\alpha }w) \cdot \partial _{\beta }w \, dx dt . $$
(7.103)


IId. The contribution of \(I_{X}^{04}\). We recall that this is

$$ I_{X}^{04} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{0} w \cdot \partial _{\alpha }[T_{\tilde{X}_{s0}}, T_{{\tilde{g}}^{\alpha \beta}}] \partial _{\beta }w \, dx dt. $$

This has a similar treatment to \(I_{X}^{03}\). For the commutator above we must again have a smaller frequency on \(\tilde{X}_{s0}\); otherwise the contribution is perturbative. Using Proposition 6.15 we expand the commutator into a leading part, an unbalanced subprincipal part and a perturbative term,

$$ [T_{\tilde{X}_{s0}},T_{{\tilde{g}}^{\alpha \beta}}] = T_{T_{\partial _{ \xi _{j}}\tilde{X}_{s0}} \partial _{j} {\tilde{g}}^{\alpha \beta}} + OP{ \mathfrak {P}}S^{-2} L_{lh}(\partial _{x}^{2} {\tilde{g}}^{\alpha \beta}, \cdot ) + R^{\alpha \beta}, $$
(7.104)

where the remainder \(R\) satisfies perturbative bounds of the form

$$ \| R\|_{L^{2} \to H^{1}} \lesssim {\mathcal {B}}^{2}, \qquad \| R\|_{H^{-1} \to L^{2}} \lesssim {\mathcal {B}}^{2}, \qquad \| \partial _{0} R\|_{L^{2} \to L^{2}} \lesssim {\mathcal {B}}^{2}. $$

We first consider the contribution of the leading part \(I_{X,main}^{04}\). For \(\partial _{j}{\tilde{g}}^{\alpha \beta}\) we use the expansion in (7.101) with the subscript 0 replaced by \(j \neq 0\), arriving at

$$ T_{T_{\partial _{\xi _{j}}\tilde{X}_{s0}} \partial _{j} {\tilde{g}}^{ \alpha \beta}} = T_{q^{\alpha \beta}}+ R^{\alpha \beta}, $$

where the order zero symbols \(q^{\alpha \beta}\) are given by

$$ q^{\alpha \beta} = - \left ( T_{\partial _{\xi _{j}} \tilde{X}_{s0} \partial ^{\beta} u {\tilde{g}}^{\alpha \delta }} {\partial _{\delta } \partial _{j}} u + T_{\partial _{\xi _{j}}\tilde{X}_{s0} \partial ^{ \alpha} u {\tilde{g}}^{\beta \delta }} {\partial _{\delta }\partial _{j}} u \right ) + 2 T_{\partial _{\xi _{j}} \tilde{X}_{s0} \partial ^{0} u { \tilde{g}}^{0 \delta }{\tilde{g}}^{\alpha \beta}} {\partial _{\delta } \partial _{j}} u, $$

and the remainder \(R\) is as above. Then the leading part can be written as

$$ I_{X,main}^{04} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{q^{ \alpha \beta}} \partial _{\alpha }\partial _{\beta }w \cdot \partial _{0} w \, dx dt+ Err({\mathcal {B}}^{2}). $$

Now the symbol \(q^{\alpha \beta} \xi _{0}\xi _{\alpha }\xi _{\beta}\) is a component of \({\mathring{c}}_{\tilde{X}_{s},B}\), as desired. It remains to convert the last expression for \(I_{X,main}^{04}\) to Weyl form. The error in doing that is

$$ \frac{1}{4} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{\alpha }\partial _{\beta }q^{\alpha \beta}} w \cdot \partial _{0} w \, dx dt. $$

Estimating this expression requires the bound

$$ \| P_{< k} \partial _{\alpha }\partial _{\beta }q^{\alpha \beta} \|_{L^{ \infty}} \lesssim 2^{k} {\mathcal {B}}^{2}. $$

Here \(q^{00}=0\) so we avoid the case of two time derivatives. This allows us to commute \(\partial _{\alpha }\partial _{\beta}\) inside and take \(\partial _{j}\) outside modulo perturbative terms. Then \(\partial _{j}\) yields the \(2^{k}\) factor, and we have reduced the problem to proving that

$$ \| P_{k} \left ( T_{\partial _{\xi _{j}} \tilde{X}_{s0} \partial ^{ \beta} u {\tilde{g}}^{\alpha \delta }} + T_{\partial _{\xi _{j}} \tilde{X}_{s0} \partial ^{\alpha} u {\tilde{g}}^{\beta \delta }} - 2 T_{ \partial _{\xi _{j}} \tilde{X}_{s0} \partial ^{0} u {\tilde{g}}^{0 \delta }{\tilde{g}}^{\alpha \beta}}\right ) {\partial _{\delta } \partial _{\alpha }\partial _{\beta}} u\|_{L^{\infty}} \lesssim { \mathcal {B}}^{2}. $$

Relabeling the indices, this becomes

$$ \| P_{k} T_{\partial _{\xi _{j}} \tilde{X}_{s0}( \partial ^{\delta} u - \partial ^{0} u {\tilde{g}}^{0 \delta }){\tilde{g}}^{\alpha \beta}} { \partial _{\delta }\partial _{\alpha }\partial _{\beta}} u\|_{L^{ \infty}} \lesssim {\mathcal {B}}^{2}. $$

The expression on the left vanishes if \(\delta =0\). This allows us to break the para-coefficient in two using Lemma 2.7 and replace this by

$$ \| P_{k} T_{\partial _{\xi _{j}} \tilde{X}_{s0}( \partial ^{\delta} u - \partial ^{0} u {\tilde{g}}^{0 \delta })} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\delta }\partial _{\alpha }\partial _{\beta }u\|_{L^{ \infty}} \lesssim {\mathcal {B}}^{2}, $$

which is finally a consequence of Lemma 5.12.
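The factor \(2^{k}\) extracted from the spatial derivative \(\partial _{j}\) in arguments of this type is the standard Littlewood–Paley summation: if \(\| P_{l} f \|_{L^{\infty}} \lesssim M\) uniformly in \(l\), then

$$ \| P_{< k} \partial _{j} f \|_{L^{\infty}} \lesssim \sum _{l < k} 2^{l} \| P_{l} f \|_{L^{\infty}} \lesssim 2^{k} M, $$

using the Bernstein-type bound \(\|\partial _{j} P_{l} f\|_{L^{\infty}} \lesssim 2^{l} \|P_{l} f\|_{L^{\infty}}\) at each dyadic frequency and summing the geometric series.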

Next we consider the subprincipal term. Here we use again the expansion in (7.101) and recombine paracoefficients to rewrite it as a linear combination of terms of the form

$$ I_{X,sub}^{041} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \partial _{0} w \cdot T_{h} \partial _{\alpha }T_{{\tilde{g}}^{\alpha \delta}} L_{lh} (\partial _{x}^{2} \partial _{\delta }u, \partial _{ \beta }w) \,dx dt , $$
(7.105)
$$ I_{X,sub}^{042} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \partial _{0} w \cdot T_{h} \partial _{\alpha }T_{{\tilde{g}}^{\beta \delta}} L_{lh} (\partial _{x}^{2} \partial _{\delta }u, \partial _{ \beta }w) \,dx dt , $$
(7.106)

respectively

$$ I_{X,sub}^{043} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \partial _{0} w \cdot T_{h} \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} L_{lh} (\partial _{x}^{2} \partial _{\delta }u, \partial _{ \beta }w) \,dx dt, $$
(7.107)

where \(h \in {\mathfrak {P}}S^{-2}\) roughly corresponds to \(\partial _{\xi}^{2} \tilde{X}_{s0}\). Here we can freely separate variables and reduce to the case when \(h\) is a function, including the multiplier part in \(L_{lh}\).

We remark that until now we were able to exclude the case when \(\alpha = \beta = 0\). However, at this point we need to separate the three types of contributions in order to take advantage of their structures. Because of this, from here on we have to also allow for the case \(\alpha =\beta = 0\), forfeiting the cancellation that would otherwise occur in this case between the different terms. We postpone the estimate for the subprincipal terms for the end of the proof.

III. The contribution of \(\tilde{Y}_{s0}\) with \({\tilde{A}}=0\) and \({\tilde{b}}=0\). Here we consider the integral

$$ I_{Y} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\tilde{P}} w \cdot T_{\tilde{Y}_{s0}} w \, dx dt, $$

where we recall that \(\tilde{Y}_{s0} \in \partial _{x} {\mathfrak {P}}S^{0}\). We integrate by parts once to write

$$\begin{aligned} I_{Y} = &\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{\tilde{g}}^{ \alpha \beta}} \partial _{\alpha }w \cdot T_{\tilde{Y}_{s0}}\partial _{ \beta }w \, dx dt - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }w \cdot T_{\partial _{ \beta }\tilde{Y}_{s0}} w \, dx dt \\ &{} + \left . \int T_{{\tilde{g}}^{ \alpha 0}} \partial _{\alpha }w \cdot T_{\tilde{Y}_{s0}} w \, dx \right |_{0}^{T}. \end{aligned}$$

The last integral is an admissible energy correction. In both space-time integrals we move \(T_{\tilde{Y}_{s0}}\) to the left, and combine the two paraproducts as in Lemma 2.7, peeling off perturbative contributions, to get

$$\begin{aligned} I_{Y} =& \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{T_{{\tilde{g}}^{ \alpha \beta}}\tilde{Y}_{s0}} \partial _{\alpha }w \cdot \partial _{ \beta }w \, dx dt - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{\beta }T_{{\tilde{g}}^{\alpha \beta}} \tilde{Y}_{s0}} \partial _{\alpha }w \cdot w \, dx dt \\ &{} + Err({\mathcal {B}}^{2})+ Err({ \mathcal {A}^{\sharp }})|_{0}^{T}. \end{aligned}$$

The symbol of the bilinear form in the first integral is the desired component of \({\mathring{c}}_{s}\), but we need to convert it to Weyl calculus. This conversion yields an error equal to half of the second integral, which in turn needs to be estimated perturbatively. Commuting \(\partial _{\beta}\) inside, we are left with

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{T_{{\tilde{g}}^{ \alpha \beta}}\partial _{\beta }\tilde{Y}_{s0} } w \cdot \partial _{ \alpha }w \, dx dt . $$
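Indeed, for a first order symbol \(a = h(x) \xi _{\beta}\) the left and Weyl quantizations are related, schematically and modulo balanced errors, by

$$ T_{h} \partial _{\beta} = \frac{1}{2} \left ( T_{h} \partial _{ \beta} + \partial _{\beta} T_{h} \right ) - \frac{1}{2} T_{\partial _{ \beta} h}, $$

where the symmetrized expression on the right corresponds to the Weyl quantization; this is the source of the factor \(\frac{1}{2}\) above.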

Here \(\tilde{Y}_{s0}\) is of the form \(\tilde{Y}_{s0}= \partial _{x} h\), with \(h \in {\mathfrak {P}}S^{0}\). We can harmlessly commute \(\partial _{x}\) out, to arrive at

$$ J = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\partial _{x} T_{{ \tilde{g}}^{\alpha \beta}}\partial _{\beta }h } w \cdot \partial _{ \alpha }w \, dx dt. $$
(7.108)

In the absence of boundaries at \(t = 0,T\), we could integrate by parts once more to rewrite this as

$$ -\frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{x} \partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }h } w \cdot w \, dx dt, $$

and then use Lemma 6.4. The same argument applies if we add in the boundaries, by carefully tracking the boundary contributions. Precisely, we use the lemma to rewrite the expression \(J\) in (7.108) as follows:

$$ \begin{aligned} J = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\partial _{x} (T_{{\tilde{g}}^{\alpha \beta}}\partial _{\beta }h - \delta _{0}^{ \alpha }f^{0}) } w \cdot \partial _{\alpha }w \, dx dt + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\partial _{x} f^{0}} w \cdot \partial _{0} w \, dx dt \\ = & \ -\frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{x} (\partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \partial _{\beta }h - \partial _{0} f^{0}) } w \cdot w \, dx dt + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\partial _{x} f^{0}} w \cdot \partial _{0} w \, dx dt \\ & \ + \frac{1}{2} \left . \int T_{\partial _{x} (T_{{\tilde{g}}^{0 \beta}}\partial _{\beta }h - f^{0}) }w \cdot w \, dx \right |_{0}^{T} \\ = & \ -\frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{x} \partial _{j} f^{j}} w \cdot w \, dx dt + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\partial _{x} f^{0}} w \cdot \partial _{0} w \, dx dt \\ & \ + \frac{1}{2} \left . \int T_{\partial _{x} (T_{{ \tilde{g}}^{0\beta}}\partial _{\beta }h - f^{0}) }w \cdot w \, dx \right |_{0}^{T}. \end{aligned} $$

Now, in view of Lemma 6.4, both the energy and the flux terms are perturbative.

IV. The contribution of the gradient potential \({\tilde{A}}\) and of \({\tilde{b}}_{0}\). We discuss the two together, as their contributions are similar. This has the form

$$ I_{X}^{2} = \iint T_{\tilde{\mathfrak {M}}_{s}} w \cdot (T_{{\tilde{A}}^{ \gamma}} + T_{{\tilde{b}}_{0}^{\gamma}})\partial _{\gamma }w\, dx dt, $$

which we need to shift to Weyl calculus after peeling off a perturbative contribution. For instance the contribution of \(\tilde{Y}_{s0}\) is directly perturbative. On the other hand, \({\tilde{A}}^{\gamma}\) contains \(\partial _{0}^{2} u\) terms which need to be corrected, while \({\tilde{b}}_{0}^{\gamma}\) does not. In any case, the correction can be freely added as its contribution has size \(Err({\mathcal {B}}^{2})\). Below we denote by \({\mathring{\tilde{A}}}\) the corrected version of \({\tilde{A}}\).

Next we consider the contribution of \(\tilde{X}_{s1}\), where we need to shift the operator product \(T_{\tilde{X}_{s1}} T_{{\tilde{A}}^{\gamma}}\) to the Weyl calculus via Lemma 6.12:

$$ T_{\tilde{X}_{s1}} T_{{\tilde{A}}^{\gamma}} = T_{T_{\tilde{X}_{s1}} { \mathring{\tilde{A}}}^{\gamma}} + T_{{\mathfrak {P}}S^{0}} T_{{\tilde{g}}^{ \alpha \gamma}} L_{lh}(\partial _{x} \widehat{\partial _{\alpha }\partial} u, \cdot ) + O_{H^{1} \to L^{2}}({ \mathcal {B}}^{2}), $$

and similarly for \(b_{0}\), i.e. the desired term plus a null unbalanced lower order term plus a perturbative contribution. We note here that the contribution of the null unbalanced lower order term has the form

$$ I_{X,sub}^{31} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{0}} T_{{\tilde{g}}^{\alpha \gamma}} L_{lh}(\partial _{x} \widehat{\partial _{\alpha }\partial} u, \partial _{\gamma} w) \cdot w \, dx dt. $$

Finally we consider the contribution of \(\tilde{X}_{s0}\),

$$\begin{aligned} I_{X}^{30} =&\int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\tilde{X}_{s0}} \partial _{0} w \cdot (T_{{\mathring{\tilde{A}}}^{\gamma}} + T_{b_{0}^{ \gamma}})\partial _{\gamma }w\, dx dt \\ =& \int _{0}^{T} \!\!\!\! \int _{{ \mathbb{R}}^{n}}\partial _{0} w \cdot T_{\tilde{X}_{s0}} (T_{{{ \mathring{\tilde{A}}}}^{\gamma}} + T_{b_{0}^{\gamma}})\partial _{ \gamma }w\, dx dt. \end{aligned}$$

We use again the product formula for paraproducts to write

$$ T_{\tilde{X}_{s0}} (T_{{{\mathring{\tilde{A}}}}^{\gamma}} + T_{b_{0}^{ \gamma}}) = T_{T_{\tilde{X}_{s0}}({\mathring{\tilde{A}}}^{\gamma }+ b_{0}^{ \gamma})} + T_{{\mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \gamma}} L_{lh}( \partial _{x} \widehat{\partial \partial _{\alpha}} u,\cdot ) +O_{L^{2}}({ \mathcal {B}}^{2}), $$

which generates a leading term and a subprincipal term.

The leading term is

$$ I_{X,main}^{30}= \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \partial _{0} w \cdot T_{T_{\tilde{X}_{s0}} ({{\mathring{\tilde{A}}}}^{ \gamma }+ b_{0}^{\gamma})}\partial _{\gamma }w\, dx dt. $$

Its symbol is as needed, but we still have to switch it to Weyl calculus. This switch introduces an error

$$ \frac{1}{2} \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \partial _{\gamma }[T_{\tilde{X}_{s0}}({{\mathring{\tilde{A}}}}^{ \gamma }+ b_{0}^{\gamma})]} w \cdot \partial _{0} w \, dx dt. $$

To bound its contribution, we would like to have the symbol bound

$$ |P_{< k} \partial _{\gamma }T_{\tilde{X}_{s0}}({{\mathring{\tilde{A}}}}^{ \gamma }+ b_{0}^{\gamma})| \lesssim 2^{k} {\mathcal {B}}^{2} . $$
(7.109)

Here we use the expressions for \({\tilde{A}}^{\gamma}\) and \(b_{0}^{\gamma}\), take out bounded paracoefficients, and we are left with

$$ \|P_{< k} \partial _{\gamma }T_{{\tilde{g}}^{\gamma \alpha}} \widehat{\partial _{\alpha }\partial _{\beta}} u\|_{L^{\infty}} \lesssim 2^{k} {\mathcal {B}}^{2} . $$

But this is in turn a consequence of Lemma 5.12.

To conclude, we record the form of the subprincipal term,

$$ I_{X,sub}^{30} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \gamma}} L_{lh}(\partial _{x} \widehat{\partial _{\alpha }\partial} u, \partial _{\gamma} w) \cdot \partial _{0} w \, dx dt. $$
(7.110)

V. The unbalanced lower order terms. These are the expressions identified earlier, which we recall here:

$$ I_{X,sub}^{1} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \gamma}} L_{lh}(\partial _{ \gamma }\partial _{x}^{2} u,\ \partial _{\alpha }w) \cdot \partial _{ \beta }w \, dx dt , $$
(7.111)
$$ I_{X,sub}^{031} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \delta }} L_{lh}(\partial _{x} \widehat{\partial _{\delta }\partial _{0}} u, \partial _{\alpha }w) \cdot \partial w \, dx dt , $$
(7.112)
$$ I_{X,sub}^{032} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta }} L_{lh}(\partial _{x} \widehat{\partial _{\delta }\partial _{0}} u, \partial _{\alpha }w) \cdot \partial _{\beta }w \, dx dt , $$
(7.113)
$$ I_{X,sub}^{041} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \partial _{0} w \cdot T_{{\mathfrak {P}}S^{-2}} \partial _{\alpha }T_{{ \tilde{g}}^{\alpha \delta}} L_{lh} (\partial _{x}^{2} \partial _{ \delta }u, \partial _{\beta }w) \,dx dt , $$
(7.114)
$$ I_{X,sub}^{042} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \partial _{0} w \cdot T_{{\mathfrak {P}}S^{-2}} \partial _{\alpha }T_{{ \tilde{g}}^{\beta \delta}} L_{lh} (\partial _{x}^{2} \partial _{ \delta }u, \partial _{\beta }w) \,dx dt , $$
(7.115)
$$ I_{X,sub}^{043} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \partial _{0} w \cdot T_{{\mathfrak {P}}S^{-2}} \partial _{\alpha }T_{{ \tilde{g}}^{\alpha \beta}} L_{lh} (\partial _{x}^{2} \partial _{ \delta }u, \partial _{\beta }w) \,dx dt, $$
(7.116)
$$ I_{X,sub}^{30} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \gamma}} L_{lh}(\partial _{x} \widehat{\partial _{\alpha }\partial} u, \partial _{\gamma} w) \cdot \partial _{0} w \, dx dt. $$
(7.117)

All of these exhibit a null structure.

We directly compress four of these into the expression

$$ I_{sub,\gamma \delta} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{h} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{x} \widehat{\partial _{\alpha }\partial _{\gamma}} u,\ \partial _{ \beta }w) \cdot \partial _{\delta }w \, dx dt, \qquad h \in { \mathfrak {P}}S^{-1}, $$
(7.118)

where the analysis will be slightly different depending on whether \(\gamma \) and \(\delta \) are zero or not.

In \(I_{X,sub}^{042}\) the case \(\alpha \neq 0\) is included above. If instead \(\alpha = 0\) then we integrate by parts \(\partial _{\alpha}\) to the left, so that, after a perturbative energy correction, we arrive at

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{0}^{2} w \cdot T_{{\mathfrak {P}}S^{-2}} T_{{\tilde{g}}^{\beta \delta}} L_{lh} ( \partial _{x}^{2} \partial _{\delta }u, \partial _{\beta }w) \,dx dt . $$

Now we use the paradifferential \(T_{\tilde{P}}\) equation for \(w\), which after more perturbative errors allows us to replace the leading \(\partial _{0}^{2}\) operator by \(\partial \partial _{x}\), with a \({\mathfrak {P}}\) paracoefficient. Then \(\partial _{x}\) combines with \(T_{{\mathfrak {P}}S^{-2}}\) to give \(T_{{\mathfrak {P}}S^{-1}}\), thereby reducing the problem to the case of (7.118).
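Schematically, modulo lower order and perturbative terms, this substitution reads

$$ T_{{\tilde{g}}^{00}} \partial _{0}^{2} w = T_{\tilde{P}} w - 2 T_{{ \tilde{g}}^{0j}} \partial _{0} \partial _{j} w - T_{{\tilde{g}}^{jk}} \partial _{j} \partial _{k} w + \cdots , $$

so that, after inverting the nondegenerate coefficient \(T_{{\tilde{g}}^{00}}\), \(\partial _{0}^{2} w\) is expressed in terms of \(T_{\tilde{P}} w\) and of \(\partial \partial _{x} w\) terms with bounded paracoefficients.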

Finally in \(I_{X,sub}^{041}\) we commute inside and distribute the \(\partial _{\alpha}\) derivative, peeling off perturbative errors. We arrive at

$$ \begin{aligned} I_{X,sub}^{041} = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}} \partial _{0} w \cdot T_{{\mathfrak {P}}S^{-2}} T_{{\tilde{g}}^{\alpha \delta}} L_{lh} (\partial _{x}^{2} \partial _{\alpha }\partial _{ \delta }u, \partial _{\beta }w) \,dx dt \\ & + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{0} w \cdot T_{{\mathfrak {P}}S^{-2}} T_{{\tilde{g}}^{\alpha \delta}} L_{lh} ( \partial _{x}^{2} \partial _{\delta }u, \partial _{\alpha }\partial _{ \beta }w) \,dx dt . \end{aligned} $$

The first term is estimated by commuting \(T_{{\tilde{g}}^{\alpha \delta}}\) inside \(L_{lh}\) and onto the first argument, after which we use Lemma 5.12. In the second term we pull \(\partial _{\beta}\) out, reducing the problem either to \(I_{X,sub}^{042}\), which was discussed earlier, or to

$$ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}\partial _{0} w \cdot T_{{ \mathfrak {P}}S^{-2}} T_{{\tilde{g}}^{\alpha \delta}} L_{lh} (\partial _{x}^{2} \partial _{\delta }\partial _{\beta }u, \partial _{\alpha }w) \,dx dt . $$

But here we can pull out one of the \(\partial _{x}\) operators to reduce to the case of (7.118).

After this discussion we have reduced the problem to the estimate for \(I_{sub,\gamma \delta}\). Here, from easiest to hardest, we need to consider the case when neither of \(\gamma \) and \(\delta \) is zero, then when exactly one of them is zero, and finally when both of them are zero. We will first illustrate the principle in the easiest case, and then describe the additional complications in the most difficult case. We leave the intermediate case to the reader.

A. The case \(\gamma ,\delta \neq 0\). The argument in this case consists of three integrations by parts, performed in a circular manner. Here we have \(h \in {\mathfrak {P}}S^{-1}\). We may include \(\partial _{\delta}\) in \(h\), in which case \(h \in {\mathfrak {P}}S^{0}\). Separating variables, the problem can be further reduced to the case \(h \in {\mathfrak {P}}\). In the computations below we omit \(h\) altogether, as it plays no role.

$$ I_{sub} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{\tilde{g}}^{ \alpha \beta}} L_{lh}(\partial _{x}^{2} {\partial _{\alpha}} u,\ \partial _{\beta }w) \cdot w \, dx dt . $$
(7.119)

As with \(h\), derivatives applied to \(g\) yield perturbative contributions, of size \(O({\mathcal {B}}^{2})\), and will not be written explicitly in order to avoid cluttering the formulas. In the absence of boundary terms, we compute as follows, integrating by parts in order to convert the null form into three \(T_{\tilde{P}}\) operators modulo admissible errors:

$$ \begin{aligned} I_{sub} = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}w \cdot T_{{ \tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{\alpha }\partial _{x}^{2} u, \partial _{\beta }w) \, dx dt \\ = & \ - \iint \partial _{\beta }w \cdot T_{{\tilde{g}}^{\alpha \beta}} L_{lh}( \partial _{\alpha }\partial _{x}^{2} u, w) \, dx dt \\ & - \iint w \cdot T_{{ \tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{\beta }\partial _{\alpha } \partial _{x}^{2} u, w) \, dx dt + Err({\mathcal {B}}^{2}) \\ = & \ \iint \partial _{\alpha }\partial _{\beta }w \cdot T_{{\tilde{g}}^{ \alpha \beta}} L_{lh}( \partial _{x}^{2} u, w) \, dx dt \ +\iint \partial _{\beta }w \cdot T_{{\tilde{g}}^{\alpha \beta}} L_{lh}( \partial _{x}^{2} u, \partial _{\alpha }w) \, dx dt \\ & - \iint w \cdot T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{ \beta }\partial _{\alpha }\partial _{x}^{2} u, w) \, dx dt + Err({ \mathcal {B}}^{2}) \\ = & \ \iint \partial _{\alpha }\partial _{\beta }w \cdot T_{{\tilde{g}}^{ \alpha \beta}} L_{lh}( \partial _{x}^{2} u, w) \, dx dt \ - I_{sub} \\ & - \iint \ w \cdot T_{{\tilde{g}}^{\alpha \beta}} L_{lh}( \partial _{x}^{2} u, \partial _{\alpha }\partial _{\beta }w) \, dx dt \\ & - \iint w \cdot T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{ \beta }\partial _{\alpha }\partial _{x}^{2} u, w) \, dx dt + Err({ \mathcal {B}}^{2}). \end{aligned} $$
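In the last step, the identification of the middle term with \(-I_{sub}\) follows from one more integration by parts in \(\partial _{\beta}\), together with the symmetry of \({\tilde{g}}^{\alpha \beta}\), which allows us to relabel \(\alpha \leftrightarrow \beta \): schematically, ignoring boundary terms and paracoefficient commutators,

$$ \iint \partial _{\beta }w \cdot T_{{\tilde{g}}^{\alpha \beta}} L_{lh}( \partial _{x}^{2} u, \partial _{\alpha }w) \, dx dt = - I_{sub} - \iint w \cdot T_{{\tilde{g}}^{\alpha \beta}} L_{lh}( \partial _{x}^{2} u, \partial _{\alpha }\partial _{\beta }w) \, dx dt + Err({ \mathcal {B}}^{2}). $$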

We now distribute \(T_{{\tilde{g}}^{\alpha \beta}}\), noting that any commutator errors involve derivatives of \({\tilde{g}}\) and thus are perturbative. We arrive at

$$ \begin{aligned} 2I_{sub} = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }w \cdot L_{lh}( \partial _{x}^{2} u, w) \, dx dt \\ & - \int _{0}^{T} \!\!\! \! \int _{{\mathbb{R}}^{n}}\ w \cdot L_{lh}( \partial _{x}^{2} u, T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }w) \, dx dt \\ & - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}w \cdot L_{lh}( T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\beta }\partial _{\alpha } \partial _{x}^{2} u, w) \, dx dt + Err({\mathcal {B}}^{2}). \end{aligned} $$
(7.120)

It remains to add the boundary terms at times \(t=0,T\) into the above computation. Such boundary terms arise from the integration by parts with respect to \(x_{0}\). We obtain the following enhanced version of (7.120):

$$ \begin{aligned} 2I_{sub} = & \ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }w \cdot L_{lh}( \partial _{x}^{2} u, w) \, dx dt \\ & - \int _{0}^{T} \!\!\! \! \int _{{\mathbb{R}}^{n}}\ w \cdot L_{lh}( \partial _{x}^{2} u, T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }w) \, dx dt \\ & - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}w \cdot L_{lh}( T_{{ \tilde{g}}^{\alpha \beta}} \partial _{\beta }\partial _{\alpha } \partial _{x}^{2} u, w) \, dx dt + Err({\mathcal {B}}^{2}) \\ & + \left . \!\!\! \int \! w \cdot T_{{\tilde{g}}^{\alpha 0}} L_{lh}( \partial _{\alpha }\partial _{x}^{2} u, w) -\partial _{\beta }w \cdot T_{{\tilde{g}}^{0\beta}} L_{lh}( \partial _{x}^{2} u, w)\right. \\ & \left. + w \cdot T_{{\tilde{g}}^{\alpha 0}} L_{lh}( \partial _{x}^{2} u, \partial _{\alpha }w) \, dx \right |_{0}^{T}. \!\!\!\!\!\!\!\! \end{aligned} $$
(7.121)

The boundary terms are easily seen to be lower order energy corrections, so it remains to estimate the interior contributions. For the first one we can use the \(w\) equation to get the fixed time bounds

$$ \| T_{{\tilde{g}}^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }w \|_{L^{2}} \lesssim \| \tilde{P}_{B} w\|_{L^{2}} + ({\mathcal {B}}^{2} + 2^{ \frac{k}{2}} {\mathcal {B}}c_{k}) \| \partial w\|_{L^{2}} , $$
(7.122)

which suffices by combining the two components of the last term with either the \({\mathcal {A}}\) or the ℬ bound for \(u\) in \(L_{lh}\). The other two interior contributions reduce to the bound

$$ \|P_{< k} T_{{\tilde{g}}^{\alpha \beta}} \partial _{\alpha }\partial _{ \beta }\partial _{x}^{2} u\|_{L^{\infty}} \lesssim 2^{2k} { \mathcal {B}}^{2} , $$
(7.123)

which is a consequence of Lemma 5.11 and which suffices to estimate the expressions in (7.121).

B. The case \(\gamma =\delta = 0\). In this case we seek to estimate the integral

$$ I_{sub,00} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{x} \widehat{\partial _{\alpha }\partial _{0}} u,\ \partial _{\beta }w) \cdot \partial _{0} w \, dx dt . $$
(7.124)

Here the hat correction plays a perturbative role and could be omitted. However, in the computations below we need to keep it in order to be able to estimate energy corrections. Our computations emulate the simpler case considered above, but with some care in order to avoid iterated time derivatives. Integrating by parts the \(\beta \) derivative we get

$$ \begin{aligned} I_{sub,00} = & \ - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{x} \partial _{\beta }\widehat{\partial _{\alpha }\partial _{0}} u,\ w) \cdot \partial _{0} w \, dx dt \\ & - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{\mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{x} \widehat{\partial _{\alpha }\partial _{0}} u,\ w) \cdot \partial _{0} \partial _{\beta }w \, dx dt \\ & + Err({\mathcal {B}}^{2}) + Err({\mathcal {A}^{ \sharp }})|_{0}^{T}, \end{aligned} $$

where the last term accounts for the boundary contributions obtained when \(\beta =0\). In the first integral we perturbatively move \(T_{{\tilde{g}}^{\alpha \beta}}\) on the first \(L_{lh}\) argument and then use Lemma 5.11; this allows us to move the entire first integral into the error, leading us to

$$ \begin{aligned} I_{sub,00} ={}& - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{x} \widehat{\partial _{\alpha }\partial _{0}} u,\ w) \cdot \partial _{0} \partial _{\beta }w \, dx dt \\ &{}+ Err({\mathcal {B}}^{2}) + Err({\mathcal {A}^{\sharp }})|_{0}^{T}. \end{aligned}$$
(7.125)

On the other hand we can perturbatively drop the hat and integrate by parts the \(\alpha \) derivative. This gives

$$ \begin{aligned} I_{sub,00} = & \ - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{x} \partial _{0} u,\ \partial _{\alpha }\partial _{\beta }w) \cdot \partial _{0} w \, dx dt \\ & - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{\mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{x} \partial _{0} u, \ \partial _{\beta }w) \cdot \partial _{0} \partial _{\alpha }w \, dx dt + Err({ \mathcal {B}}^{2}) + Err({\mathcal {A}^{\sharp }})|_{0}^{T}. \end{aligned} $$

In the first integral we perturbatively move \(T_{{\tilde{g}}^{\alpha \beta}}\) on the second \(L_{lh}\) argument; using the paradifferential \(w\) equation where the \({\tilde{A}}\) and \({\tilde{b}}\) terms are perturbative, this yields a lower order correction to our multiplier. Switching the \(\alpha \) and \(\beta \) indices we obtain

$$ \begin{aligned} I_{sub,00} = & \ - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta}} L_{lh}(\partial _{x} \partial _{0} u,\ \partial _{\alpha }w) \cdot \partial _{0} \partial _{\beta }w \, dx dt \\ &+ \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \tilde{P}} w \cdot Q \partial _{0} w \, dx dt \\ & + Err({\mathcal {B}}^{2}) + Err({\mathcal {A}^{\sharp }})|_{0}^{T}, \end{aligned} $$
(7.126)

where \(\|Q\|_{L^{2} \to L^{2}} \lesssim {\mathcal {A}^{\sharp }}\).

The next step is to add the relations (7.125) and (7.126). Here we separate the cases \(\alpha = j \neq 0\) and \(\alpha =0\), where in the first case we can drop the hat and pull out the \(\partial _{\alpha}\) in \(L_{lh}\),

$$ \begin{aligned} 2 I_{sub,00} = & \ - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{j\beta}} \partial _{j} L_{lh}( \partial _{x} \partial _{0} u,\ w) \cdot \partial _{0} \partial _{ \beta }w \, dx dt \\ & \ - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{\mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{0\beta}} [ L_{lh}(\partial _{x} \widehat{\partial _{0} \partial _{0}} u,\ w) + L_{lh}(\partial _{x} \partial _{0} u, \partial _{0} w)] \cdot \partial _{0} \partial _{ \beta }w \, dx dt \\ & + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\tilde{P}} w \cdot Q \partial _{0} w \, dx dt + Err({\mathcal {B}}^{2}) + Err({ \mathcal {A}^{\sharp }})|_{0}^{T}. \end{aligned} $$
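Here, the recombination for \(\alpha = j \neq 0\) rests on the exact Leibniz rule for the translation invariant bilinear form \(L_{lh}\),

$$ L_{lh}(\partial _{x} \partial _{j} \partial _{0} u,\ w) + L_{lh}( \partial _{x} \partial _{0} u,\ \partial _{j} w) = \partial _{j} L_{lh}( \partial _{x} \partial _{0} u,\ w), $$

which allows us to pull \(\partial _{j}\) outside at no cost.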

In the first integral we integrate by parts to switch \(\partial _{0}\) to the left and then \(\partial _{j}\) to the right. Then we distribute the \(\partial _{0}\) on the left. This yields

$$\begin{aligned} 2 I_{sub,00} = & - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{j\beta}} [ L_{lh}( \partial _{x} \partial _{0} \partial _{0} u, w)+ L_{lh}(\partial _{x} \partial _{0} u, \partial _{0} w)] \cdot \partial _{j} \partial _{ \beta }w \, dxdt \\ & \ - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{\mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{0\beta}} [ L_{lh}(\partial _{x} \widehat{\partial _{0} \partial _{0}} u, w) + L_{lh}(\partial _{x} \partial _{0} u, \partial _{0} w)] \cdot \partial _{0} \partial _{ \beta }w \, dxdt \\ & + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\tilde{P}} w \cdot Q \partial _{0} w \, dx dt + Err({\mathcal {B}}^{2}) + Err({ \mathcal {A}^{\sharp }})|_{0}^{T}. \end{aligned}$$

In the first integral we may perturbatively correct \(\partial _{0}^{2} u\); this allows us to put back together the cases when \(\alpha \) is zero and nonzero,

$$ \begin{aligned} 2 I_{sub,00} = & \ - \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{{ \mathfrak {P}}S^{-1}} T_{{\tilde{g}}^{\alpha \beta}} [ L_{lh}( \partial _{x} \widehat{\partial _{0} \partial _{0}} u,\ w) \\ &+ L_{lh}( \partial _{x} \partial _{0} u,\ \partial _{0} w)] \cdot \partial _{ \alpha }\partial _{\beta }w \, dxdt \\ & + \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{\tilde{P}} w \cdot Q \partial _{0} w \, dx dt + Err({\mathcal {B}}^{2}) + Err({ \mathcal {A}^{\sharp }})|_{0}^{T}. \end{aligned} $$

Finally, we commute \(T_{{\tilde{g}}^{\alpha \beta}} \) to the right factor, and use the \(w\) equation to add another perturbative factor to our multiplier. This gives

$$ 2 I_{sub,00} = \int _{0}^{T} \!\!\!\! \int _{{\mathbb{R}}^{n}}T_{ \tilde{P}} w \cdot Q \partial _{0} w \, dx dt + Err({\mathcal {B}}^{2}) + Err({\mathcal {A}^{\sharp }})|_{0}^{T}, $$

with a modified \(Q\), as desired.

The proof of Proposition 7.11 is now concluded. □

We now conclude the proof of Theorem 7.1 using Proposition 7.11, with the vector field \(\tilde{X}_{s}\) chosen as in Proposition 7.7 and \(\tilde{Y}_{s0}\) defined as in (7.91). For these we have at our disposal not only the conclusion of Proposition 7.7, but also the refined version in Proposition 7.9. This guarantees that the symbol \({\tilde{c}}_{s}\) in (7.92) has size \({\mathcal {B}}^{2}\), in the sense that its coefficients in (7.93) satisfy

$$ {\tilde{c}}_{s}^{j} \in {\mathcal {B}}^{2} L^{\infty }S^{j}, \qquad 2^{-k} P_{< k} \partial _{0} {\tilde{c}}_{s}^{j} \in {\mathcal {B}}^{2} L^{ \infty }S^{j}. $$

These conditions, in turn, guarantee that the flux term \(C_{s}\) in our energy estimate (7.96) satisfies

$$ |C_{s}(w,w)| \lesssim {\mathcal {B}}^{2} \| \partial w\|_{L^{2}}^{2}, $$

and thus the conclusion of Theorem 7.1 follows.

8 Energy estimates for the full equation

Our objective here is to prove energy estimates for the solution \(u\) to the minimal surface equation (1.7) in \(\mathcal {H}^{s} = H^{s} \times H^{s-1}\), in terms of our control parameters \({\mathcal {A}^{\sharp }}\) and ℬ.

Theorem 8.1

For each \(s \geq 1\) there exists an energy functional \(E^{s}_{NL}\) for the minimal surface equation (1.7) in \(H^{s} \times H^{s-1}\) with the property that for all \(\mathcal {H}^{s}\) solutions \(u \) to (1.7) with \({\mathcal {A}^{\sharp }}\ll 1\) and \({\mathcal {B}}\in L^{2}\) we have:

a) coercivity,

$$ E^{s}_{NL} (u[t]) \approx \| u[t]\|_{\mathcal {H}^{s}}^{2} . $$
(8.1)

b) energy bound,

$$ \frac{d}{dt} E^{s}_{NL}(u[t]) \lesssim {\mathcal {B}}^{2} E^{s}_{NL}(u[t]). $$
(8.2)

Because of the assumption \({\mathcal {A}^{\sharp}}\ll 1\), in this section we no longer need to track the dependence of implicit constants on \({\mathcal {A}^{\sharp}}\). The exception to this is in the proof of Lemma 8.4, where the smallness of \({\mathcal {A}^{\sharp}}\) is used in order to guarantee the invertibility of our partial normal form transformation; even there, we only need to use linear and quadratic \({\mathcal {A}^{\sharp}}\) factors.

The rest of this section is devoted to the proof of the theorem. This has two main ingredients:

  1. Reduction to the paradifferential equation, using normal form analysis.

  2. Energy estimates for the paradifferential equation, which have already been proved in Theorem 7.1.

Hence, our primary objective here will be to carry out the above reduction. We recall the minimal surface equation,

$$ g^{\alpha \beta} \partial _{\alpha} \partial _{\beta} u = 0. $$

In order to use the energy estimates obtained in the previous section, we write this in paradifferential form:

$$ (T_{g^{\alpha \beta}} \partial _{\alpha} \partial _{\beta} - 2 T_{A^{ \gamma}}\partial _{\gamma}) u = N(u), $$
(8.3)

where the source term \(N(u)\) is given by

$$ N(u) = -\Pi (\partial _{\alpha} \partial _{\beta}u, g^{\alpha \beta})-T_{ \partial _{\alpha} \partial _{\beta}u}g^{\alpha \beta}-2 T_{A^{\gamma}} \partial _{\gamma }u. $$
(8.4)
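The paradifferential form above is obtained from Bony's decomposition of the nonlinearity: schematically,

$$ 0 = g^{\alpha \beta} \partial _{\alpha} \partial _{\beta} u = T_{g^{ \alpha \beta}} \partial _{\alpha} \partial _{\beta} u + T_{\partial _{ \alpha} \partial _{\beta} u} g^{\alpha \beta} + \Pi (\partial _{ \alpha} \partial _{\beta} u, g^{\alpha \beta}), $$

so that moving the last two terms to the right-hand side and subtracting \(2 T_{A^{\gamma}} \partial _{\gamma} u\) from both sides yields (8.3) with \(N(u)\) as in (8.4).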

Here we cannot treat \(N\) perturbatively; precisely, we do not have an estimate of the form

$$ \| N(u)(t) \|_{H^{s-1}} \lesssim {\mathcal {B}}^{2} \|u[t]\|_{ \mathcal {H}^{s}}, $$

even though \(N(u)\) is cubic in \(u\), and the above inequality is dimensionally correct. This is because \(N\) contains some unbalanced contributions.

To address this issue, our strategy will be to correct \(u\) via a well chosen normal form transformation, in order to eliminate the unbalanced part of \(N(u)\). But in order to do this, we have to first identify the unbalanced part of \(N(u)\), and reveal its null structure. A first step in this direction is to better describe the contributions of the metric coefficients \(g^{\alpha \beta}\) in \(N\); explicitly we want to extract the renormalizable terms (i.e. the terms to which we can apply a normal form correction). For this we express \(g^{\alpha \beta}\) paradifferentially as follows:

Lemma 8.2

The metric \(g^{\alpha \beta}\) can be expressed paradifferentially as follows

$$ g^{\alpha \beta}(\partial u) = g^{\alpha \beta}(0) -T_{g^{ \alpha \gamma}\partial ^{\beta }u } \partial _{\gamma}u-T_{g^{ \gamma \beta}\partial ^{\alpha }u } \partial _{\gamma}u +R(u), $$
(8.5)

where \(R (u)\) satisfies the following balanced bounds for \(s \geq 1\):

$$ \Vert R(u)\Vert _{H^{s-\frac{1}{2}}}\lesssim {\mathcal {B}}\| \partial u \|_{H^{s-1}}, $$
(8.6)

as well as

$$ \Vert R(u)\Vert _{H^{s-1}}\lesssim \| \partial u\|_{H^{s-1}}. $$
(8.7)

Proof

The representation in (8.5) and the bound (8.6) for \(R\) follow from (3.8) and Lemma 2.9. To get (8.7) one estimates each term in \(R\) separately, using no cancellations. □

This suggests that the nonlinear contribution \(N(u)\) should be seen as the sum of two terms

$$ N(u) =N_{1}(u)+N_{2}(u), $$

where \(N_{1}\) has null structure and \(N_{2}\) is balanced,

$$ \begin{aligned} N_{1} (u) :={}& -2 \Pi (\partial _{\alpha} \partial _{\beta}u, T_{g^{ \alpha \gamma}\partial ^{\beta }u } \partial _{\gamma}u), \\ N_{2} (u):={}& - 2\left ( T_{\partial _{\alpha}\partial _{\beta} u}T_{g^{ \alpha \gamma}\partial ^{\beta}u}\partial _{\gamma}u -T_{\partial _{ \alpha}\partial _{\beta} ug^{\alpha \gamma}\partial ^{\beta}u} \partial _{\gamma}u \right ) \\ &{}+T_{\partial _{\alpha}\partial _{\beta} u}R(u)+ \Pi (\partial _{\alpha} \partial _{\beta}u, R(u)). \end{aligned} $$

We will first prove that \(N_{2}(u)\) is a perturbative term:

Lemma 8.3

The expression given by \(N_{2}(u)\) satisfies the bound

$$ \Vert N_{2}(u)\Vert _{H^{s-1}}\lesssim {\mathcal {B}}^{2} \Vert \partial u\Vert _{H^{s-1}}, \qquad s\geq 1. $$
(8.8)

Proof

We begin with the first difference in \(N_{2}\), and look separately at each \(\alpha \), \(\beta \) and \(\gamma \). If \((\alpha , \beta )\neq (0,0)\) then we apply Lemma 2.7 to obtain

$$ \begin{aligned} &\Vert \! \left ( T_{\partial _{\alpha}\partial _{\beta} u}T_{g^{ \alpha \gamma}\partial ^{\beta}u} -T_{\partial _{\alpha}\partial _{ \beta} ug^{\alpha \gamma}\partial ^{\beta}u} \right )\! \partial _{ \gamma}u \Vert _{H^{s-1}} \\ &\quad \lesssim \Vert \vert D\vert ^{-\frac{1}{2}} \partial _{\alpha }\partial _{\beta} u\Vert _{BMO} \Vert \vert D \vert ^{\frac{1}{2}}\!\left ( g^{\alpha \gamma}\partial ^{\beta}u \right )\!\Vert _{BMO}\Vert \partial u\Vert _{H^{s-1}} \\ &\quad \lesssim {\mathcal {B}}^{2} \Vert \partial u\Vert _{H^{s-1}}. \end{aligned} $$

If \((\alpha , \beta )= (0,0)\) then we use the wave equation for \(u\) with the \(\tilde{g}\) metric (3.19) to write

$$ \partial _{t}^{2} u=\left ( T_{\tilde{g}}\partial _{x}\partial u+ T_{ \partial _{x}\partial u}\tilde{g}\right ) +\Pi (\tilde{g},\partial _{x} \partial u)=:\hat{\partial}_{0}^{2} u +\pi _{2}(u), $$
(8.9)

exactly as in Lemma 5.4. Then for the first term we have the estimate

$$ \Vert \hat{\partial}_{0}^{2} u \Vert _{BMO^{-\frac{1}{2}}} \lesssim { \mathcal {B}}$$
(8.10)

which suffices to apply Lemma 2.7 as above and estimate its contribution.

On the other hand, the bound for the contribution of \(\pi _{2}(u)\) is easier because by Lemma 5.4 we have the direct uniform bound

$$ \Vert \pi _{2}(u)\Vert _{L^{\infty}}\lesssim {\mathcal {B}}^{2}. $$
(8.11)

Now, we turn our attention to the second term in \(N_{2}(u)\), where we again discuss separately the \((\alpha , \beta )\neq (0,0)\) and \((\alpha , \beta ) = (0,0)\) cases.

For the \((\alpha , \beta )\neq (0,0)\) case we use the bound in (8.6) to obtain

$$ \Vert T_{\partial _{\alpha}\partial _{\beta}u}R(u)\Vert _{H^{s-1}} \lesssim \Vert \partial _{\alpha}\partial _{\beta}u \Vert _{BMO^{- \frac{1}{2}}} \Vert R(u)\Vert _{H^{s-\frac{1}{2}}} \lesssim { \mathcal {B}}^{2} \Vert \partial u\Vert _{H^{s-1}}. $$

Next we consider the case \((\alpha , \beta )= (0,0)\), and observe that we again need to use the decomposition (8.9). The contribution of \(\hat{\partial}_{0}^{2} u\) is estimated using (8.10) and the bound (8.6) for \(R\), exactly as above:

$$ \Vert T_{\hat{\partial}_{0}^{2} u}R(u)\Vert _{H^{s-1}}\lesssim \Vert \vert D\vert ^{-\frac{1}{2}}\hat{\partial}_{0}^{2} u \Vert _{BMO} \Vert R(u)\Vert _{H^{s-\frac{1}{2}}}\lesssim {\mathcal {B}}^{2}\Vert \partial u\Vert _{H^{s-1}}. $$

For the \(\pi _{2}(u)\) contribution we use the pointwise bound in (8.11) and the \(H^{s-1}\) bound in (8.7) for \(R\),

$$ \Vert T_{\pi _{2}(u)}R(u)\Vert _{H^{s-1}}\lesssim \Vert \pi _{2}(u) \Vert _{L^{\infty}} \Vert R(u)\Vert _{H^{s-1}}\lesssim {\mathcal {B}}^{2} \Vert \partial u\Vert _{H^{s-1}}. $$

Finally, a similar analysis leads to the bound for the balanced term \(\Pi (\partial _{\alpha} \partial _{\beta}u, R(u))\). □
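For completeness, here is how the omitted estimate may be sketched, assuming that the balanced bound of Lemma 2.7 applies to \(\Pi \) in the same way as to the paraproduct difference above. For \((\alpha ,\beta )\neq (0,0)\),

$$ \Vert \Pi (\partial _{\alpha}\partial _{\beta}u, R(u))\Vert _{H^{s-1}} \lesssim \Vert \vert D\vert ^{-\frac{1}{2}} \partial _{\alpha}\partial _{\beta}u \Vert _{BMO} \Vert R(u)\Vert _{H^{s-\frac{1}{2}}} \lesssim {\mathcal {B}}^{2} \Vert \partial u\Vert _{H^{s-1}}, $$

while for \((\alpha ,\beta )=(0,0)\) one again substitutes the decomposition (8.9), using (8.10) for the \(\hat{\partial}_{0}^{2} u\) contribution and the pointwise bound (8.11) together with (8.7) for the \(\pi _{2}(u)\) contribution.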

To account for the unbalanced part \(N_{1} (u)\) of \(N\) we introduce a normal form correction

$$ \tilde{u}=u- \Pi (\partial _{\beta} u, T_{\partial ^{\beta}u } u)=:u-u_{2}. $$
(8.12)

Our goal will be to show that the normal form variable solves a linear inhomogeneous paradifferential equation with a balanced source term.

Lemma 8.4

The normal form correction above has the following properties:

  1. a)

    It is bounded from above and below,Footnote 7

    $$ \Vert \tilde{u}[t] \Vert _{\mathcal {H}^{s}}\approx \Vert u[t] \Vert _{ \mathcal {H}^{s}}. $$
  2. b)

    It solves an equation of the form

    $$ ( \partial _{\alpha} T_{g^{\alpha \beta}}\partial _{\beta} - T_{A^{ \gamma}}\partial _{\gamma}) \tilde{u}=N_{2}(u) + \partial _{t} R_{1}(u) +R_{2}(u), $$
    (8.13)

    where

    $$ \Vert R_{1}(u)\Vert _{H^{s}}\lesssim {\mathcal {B}}^{2} \Vert u[t] \Vert _{\mathcal {H}^{s}}, \qquad \Vert R_{2}(u)\Vert _{H^{s-1}} \lesssim {\mathcal {B}}^{2} \Vert u[t]\Vert _{\mathcal {H}^{s}}, $$
    (8.14)

    and

    $$ \Vert R_{1} (u)\Vert _{H^{s-1}}\lesssim {\mathcal {A}^{\sharp }}^{2} \Vert u[t]\Vert _{\mathcal {H}^{s}}. $$
    (8.15)

We remark that here we expand the meaning of “balanced source terms” to include expressions of the form \(\partial _{t} R_{1}\) with \(R_{1}\) as above. This is required because time derivatives are often more difficult to estimate in our context, and it is allowed in view of the result in Theorem 4.5.

Proof

a) In view of the smallness of \({\mathcal {A}^{\sharp }}\), for the boundedness of the normal form it suffices to show that we have the fixed time bound

$$ \begin{aligned} \qquad \Vert u_{2}\Vert _{H^{s}} \lesssim {\mathcal {A}^{\sharp }}^{2} \Vert u[t]\Vert _{{\mathcal {H}^{s}}}, \end{aligned} $$
(8.16)

as well as

$$ \begin{aligned} \qquad \Vert \partial _{t} u_{2}\Vert _{H^{s-1}} \lesssim {\mathcal {A}^{ \sharp }}^{2} \Vert u[t]\Vert _{{\mathcal {H}^{s}}} . \end{aligned} $$
(8.17)

For the first bound we directly have

$$ \Vert \Pi (\partial _{\beta}u, T_{\partial ^{\beta}u}u)\Vert _{H^{s}} \lesssim \Vert \partial _{\beta}u\Vert _{H^{s-1}} \Vert T_{\partial ^{ \beta}u}u\Vert _{BMO^{1}}\lesssim {\mathcal {A}^{\sharp }}^{2} \Vert u[t] \Vert _{\mathcal {H}^{s}} . $$
(8.18)

We prove the second bound in a similar manner, but we first apply the time derivative and analyze each term separately:

$$ \begin{aligned}\partial _{t}\Pi (\partial _{\beta}u, T_{\partial ^{\beta}u}u)&= \Pi ( \partial _{t}\partial _{\beta}u, T_{\partial ^{\beta}u}u) + \Pi ( \partial _{\beta}u, T_{\partial ^{\beta}u}\partial _{t} u)+ \Pi ( \partial _{\beta}u, T_{\partial _{t}\partial ^{\beta}u}u) \\ &:=r_{1}+r_{2}+r_{3}. \end{aligned}$$

There are multiple cases, arising both from the strategy we will implement for terms involving two time derivatives and from the particular structure of each of the terms.

We begin with \(r_{1}\), where we need to separate the \(\beta \neq 0\) and \(\beta = 0\) cases. The easiest case is when \(\beta \neq 0\), where we have

$$ \Vert \Pi (\partial _{t}\partial _{\beta}u, T_{\partial ^{\beta}u}u) \Vert _{H^{s-1}} \lesssim \Vert T_{\partial ^{\beta}u}u\Vert _{H^{s}} \sup _{k} 2^{-k} \Vert P_{< k} \partial _{t}\partial _{\beta}u\Vert _{L^{ \infty}} \lesssim {\mathcal {A}^{\sharp }}^{2} \Vert u[t]\Vert _{{ \mathcal {H}^{s}}} . $$
(8.19)

Here we have used the energy control we have for \(\partial _{t} u\), which in turn gives control of all spatial derivatives of \(\partial _{t} u\). For the case \(\beta =0\) we use the decomposition for \(\partial _{t}^{2} u\) as in Lemma 5.4. For the first component we use the second bound in (5.15),

$$ \Vert \Pi (\hat{\partial}_{t}^{2} u, T_{\partial ^{\beta}u}u)\Vert _{H^{s-1}} \lesssim \Vert T_{\partial ^{\beta}u}u\Vert _{H^{s}} \sup _{k} 2^{-k} \Vert P_{< k} \hat{\partial}_{t}^{2} u\Vert _{L^{\infty}} \lesssim { \mathcal {A}^{\sharp }}^{2} \Vert u[t]\Vert _{{\mathcal {H}^{s}}}. $$

For the second component we argue similarly but using the second bound in (5.16) together with Bernstein’s inequality,

$$ \Vert \Pi (\pi _{2}(u), T_{\partial ^{\beta}u}u)\Vert _{H^{s-1}} \lesssim \Vert T_{\partial ^{\beta}u}u\Vert _{H^{s}} \sup _{k} 2^{-k} \Vert P_{< k} \pi _{2}(u) \Vert _{L^{\infty}} \lesssim {\mathcal {A}^{ \sharp }}^{2} \Vert u[t]\Vert _{{\mathcal {H}^{s}}}. $$

We continue with the bound for \(r_{2}\), where we do not need to distinguish between the time and space derivatives,

$$ \begin{aligned}\Vert \Pi (\partial _{\beta}u, T_{\partial ^{\beta}u}\partial _{t} u) \Vert _{H^{s-1}}&\lesssim \Vert T_{\partial ^{\beta}u}\partial _{t} u \Vert _{H^{s-1}} \Vert \partial _{\beta} u\Vert _{L^{\infty}} \\ &\lesssim {\mathcal {A}}\Vert \partial ^{\beta} u\Vert _{L^{\infty}} \Vert \partial _{t} u\Vert _{H^{s-1}} \\ &\lesssim {\mathcal {A}^{\sharp }}^{2} \Vert u[t]\Vert _{\mathcal {H}^{s}}. \end{aligned}$$

Lastly, we need to bound \(r_{3}\). Here we argue as in (8.19),

$$ \Vert r_{3}\Vert _{H^{s-1}} \lesssim \Vert \partial _{\beta }u \Vert _{L^{ \infty}} \Vert T_{\partial _{t} \partial ^{\beta}u} u\Vert _{H^{s-1}} \lesssim {\mathcal {A}}\Vert u\Vert _{H^{s}} \sup _{k} 2^{-k} \Vert P_{< k} \partial _{t}\partial ^{\beta}u\Vert _{L^{\infty}}, $$

so it remains to prove the following bound for \(\partial _{t} \partial ^{\beta }u\),

$$ \Vert P_{< k} \partial _{t}\partial ^{\beta}u\Vert _{L^{\infty}} \lesssim 2^{k} {\mathcal {A}^{\sharp }}. $$
(8.20)

Here we observe that \(\partial ^{\beta }u = h(\partial u)\) for a smooth function \(h\), so the chain rule allows us to write \(\partial _{t} \partial ^{\beta }u\) as a linear combination of \(\partial ^{2}_{t} u\) and \(\partial _{t}\partial _{x} u\),

$$ \partial _{t}\partial ^{\beta} u = \partial _{t} h( \partial u ) = h_{1}( \partial u ) \partial _{t}^{2} u+ h_{2}( \partial u ) \partial _{t} \partial _{x} u, $$

where \(h_{1}\), \(h_{2}\) are smooth functions of \(\partial u\). For the \(\partial ^{2}_{t}\) term we use the minimal surface equation (3.5), arriving at the representation

$$ \partial _{t}\partial ^{\beta} u = \tilde{h} (\partial u) \partial _{x} \partial u , $$

where \(\tilde{h}\) incorporates the corresponding metric coefficients. As before, we need to use a Littlewood-Paley decomposition

$$ \tilde{h} (\partial u) \partial _{x}\partial u =T_{\tilde{h} ( \partial u) } \partial _{x}\partial u +T_{ \partial _{x}\partial u } \tilde{h} (\partial u) + \Pi (\tilde{h} (\partial u) , \partial _{x} \partial u ). $$

The first two terms are easy to estimate using only \(L^{\infty}\) bounds for \(\partial u\) and \(\tilde{h}(\partial u)\),

$$ \Vert P_{< k} T_{\tilde{h} (\partial u) } \partial _{x}\partial u \Vert _{L^{\infty}}+ \Vert P_{< k}T_{ \partial _{x}\partial u } \tilde{h} (\partial u) \Vert _{L^{\infty}} \lesssim 2^{k} {\mathcal {A}}. $$

Finally for the third we use instead the \({\mathcal {A}^{\sharp }}\) component of the \(\mathfrak{C}_{0}\) norm for both \(\partial u\) and \(\tilde{h}(\partial u)\),

$$ \Vert P_{< k}\Pi (\tilde{h} (\partial u), \partial _{x}\partial u) \Vert _{L^{\infty}} \lesssim 2^{k} \Vert \Pi (\tilde{h} (\partial u), \partial _{x}\partial u)\Vert _{L^{n}} \lesssim 2^{k} {\mathcal {A}^{ \sharp }}^{2}. $$

Hence (8.20) follows. This finishes the proof of (8.17), and thus of the boundedness from above and below of the normal form transformation in our desired Sobolev space \(\mathcal {H}^{s}\).

b) We begin with the ostensibly easier contribution, namely the term \(T_{A^{\gamma}}\partial _{\gamma }u_{2}\). To bound this term we would like to commute the \(\partial _{\gamma}\) and place it in front of the product,

$$ T_{A^{\gamma}}\partial _{\gamma }u_{2}=\partial _{\gamma }T_{A^{ \gamma}} u_{2} - T_{\partial _{\gamma}A^{\gamma}}u_{2}. $$

This would look good for the first term on the RHS. However the last term would be problematic, as it may contain three derivatives with respect to time. To avoid this issue we first substitute \(A^{\gamma}\), which by (3.13) is given by

$$ {A}^{\gamma}:= g^{\alpha \gamma} \partial ^{\beta}u \, \partial _{ \alpha}\partial _{\beta}u, $$

with the more manageable leading part \(\mathring{A}^{\gamma}\) given by

$$ \mathring{A}^{\gamma}:= T_{\partial ^{\beta}u}T_{g^{\alpha \gamma}} \widehat{\partial _{\alpha}\partial _{\beta}u}. $$

Here the hat correction is as in Definition 5.3. Then

$$ \begin{aligned} T_{A^{\gamma}}\partial _{\gamma }u_{2}&=\left ( T_{A^{\gamma}}-T_{ \mathring{A}^{\gamma}} \right ) \partial _{\gamma }u_{2}+T_{ \mathring{A}^{\gamma}} \partial _{\gamma }u_{2} \\ &=\left ( T_{A^{\gamma}}-T_{\mathring{A}^{\gamma}} \right ) \partial _{ \gamma }u_{2}+ \partial _{\gamma }T_{\mathring{A}^{\gamma}} u_{2} -T_{ \partial _{\gamma}\mathring{A}^{\gamma}}u_{2}. \end{aligned} $$

We will successively place each of these terms in \(\partial _{t} R_{1}\) or \(R_{2}\). We place the first term in \(R_{2}\). To prove (8.14) for this term, we use the bounds (8.16) and (8.17) for \(u_{2}\), and Lemma 6.9 to bound the coefficient

$$ \Vert A^{\gamma}-\mathring{A}^{\gamma }\Vert _{L^{\infty}}\lesssim { \mathcal {B}}^{2}. $$
(8.21)

We will place the second term in \(\partial _{t} R_{1}\) if \(\gamma =0\) and in \(R_{2}\) if \(\gamma \neq 0\).

In order to prove (8.14) we measure \(u_{2}\) in \(H^{s+\frac{1}{2}}\),

$$ \Vert u_{2}\Vert _{H^{s+\frac{1}{2}}} \lesssim \| \partial u\|_{BMO^{\frac{1}{2}}} \| T_{\partial u} u\|_{H^{s}} \lesssim \| \partial u\|_{BMO^{\frac{1}{2}}} \| \partial u\|_{L^{\infty}} \|u\|_{H^{s}} \lesssim {\mathcal {A}}{\mathcal {B}}\Vert u[t]\Vert _{\mathcal {H}^{s}}. $$

Then (8.14) follows if we can bound the coefficient \(\mathring {A}\) by

$$ \Vert P_{< k} \mathring {A}^{\gamma} \Vert _{L^{\infty}} \lesssim 2^{\frac{k}{2}} {\mathcal {B}}. $$

But this is also a consequence of Lemma 6.9, see (6.12). On the other hand for the bound (8.15) we estimate \(u_{2}\) in \(H^{s}\) as in (8.16), and then it suffices to show that

$$ \Vert P_{< k} \mathring {A}^{\gamma} \Vert _{L^{\infty}}\lesssim 2^{k}{\mathcal {A}^{\sharp}}, $$

which is similar.

The last term is placed in \(R_{2}\) using again the bounds (8.16) and (8.17) for \(u_{2}\) on one hand and Lemma 5.12 on the other hand, to obtain

$$ \Vert P_{< k} \partial _{\gamma} \mathring{A}^{\gamma} \Vert _{L^{ \infty}}\lesssim 2^{k}{\mathcal {B}}^{2}. $$

Now we consider the main term \(\partial _{\alpha} T_{g^{\alpha \beta}}\partial _{\beta}u_{2}\), which can be expanded as

$$ \begin{aligned} \partial _{\alpha} T_{g^{\alpha \beta}}\partial _{\beta}u_{2} &= \partial _{\alpha} T_{g^{\alpha \beta}} \partial _{\beta} \left [ \Pi (\partial _{\gamma} u, T_{\partial ^{\gamma}u } u)\right ] \\ &= \partial _{\alpha} T_{g^{\alpha \beta}} \left [ \Pi (\partial _{ \beta}\partial _{\gamma} u, T_{\partial ^{\gamma}u } u) + \Pi ( \partial _{\gamma} u, T_{\partial _{\beta}\partial ^{\gamma}u } u) + \Pi (\partial _{\gamma} u, T_{\partial ^{\gamma}u } \partial _{\beta} u) \right ]. \end{aligned} $$

Depending on whether \(\alpha =0\) or \(\alpha \neq 0\), we place the middle term into \(\partial _{t}R_{1}\) or \(R_{2}\), respectively:

$$ \Vert \Pi (\partial _{\gamma} u, T_{\partial _{\beta}\partial ^{ \gamma}u } u)\Vert _{H^{s}}\lesssim {\mathcal {B}}^{2} \Vert u[t]\Vert _{ \mathcal {H}^{s}}, $$
$$ \Vert \Pi (\partial _{\gamma} u, T_{\partial _{\beta}\partial ^{ \gamma}u } u)\Vert _{H^{s-1}}\lesssim {\mathcal {A}^{\sharp }}^{2} \Vert u[t]\Vert _{\mathcal {H}^{s}}. $$

Here we use the property \(\partial _{\beta}\partial ^{\gamma}u \in \mathfrak{DC}\) to handle the case when \(\beta =0\) for the first bound, and (8.20) for the second.

The first term can be rewritten in the form

$$ \partial _{\alpha} T_{g^{\alpha \beta}} \Pi (\partial _{\beta} \partial _{\gamma} u, T_{\partial ^{\gamma}u } u) =\partial _{\alpha} \Pi ( T_{g^{\alpha \beta}} \widehat{\partial _{\beta}\partial _{\gamma}} u, T_{\partial ^{\gamma}u } u) +\partial _{t}R_{1}+R_{2}, $$
(8.22)

by using Lemma 2.5, as well as Lemma 5.4 for the case \((\beta , \gamma )=(0,0)\). Similarly the last term can be rewritten in the analogous form

$$ \partial _{\alpha} T_{g^{\alpha \beta}} \Pi (\partial _{\gamma} u, T_{ \partial ^{\gamma}u }\partial _{\beta} u) =\partial _{\alpha} \Pi ( \partial _{\gamma} u , T_{\partial ^{\gamma}u} T_{g^{\alpha \beta}} \partial _{\beta} u) +\partial _{t}R_{1}+R_{2}. $$
(8.23)

Finally we distribute the \(\alpha \) derivative in both (8.22) and (8.23). For the first term on the right in (8.22) we get

$$ \begin{aligned} \partial _{\alpha} \Pi ( T_{g^{\alpha \beta}} \widehat{\partial _{\beta}\partial _{\gamma}} u, T_{\partial ^{\gamma}u } u) &= \Pi (\partial _{\alpha }T_{g^{\alpha \beta}} \widehat{\partial _{\beta}\partial _{\gamma}} u, T_{\partial ^{\gamma}u } u) + \Pi ( T_{g^{\alpha \beta}} \widehat{\partial _{\beta}\partial _{\gamma}} u, T_{\partial _{\alpha} \partial ^{\gamma}u } u) \\ & \quad + \Pi ( T_{g^{\alpha \beta}} \widehat{\partial _{\beta}\partial _{\gamma}} u, T_{\partial ^{\gamma}u } \partial _{\alpha}u) \\ &:=s_{1}+s_{2}+s_{3}. \end{aligned} $$

We place \(s_{1}\) in \(R_{2}\) using Lemma 5.12,

$$ \Vert s_{1}\Vert _{H^{s-1}} \lesssim \sup _{k} 2^{-k} \Vert P_{< k} \partial _{\alpha }T_{g^{\alpha \beta}} \widehat{\partial _{\beta}\partial _{\gamma}} u\Vert _{L^{\infty}} \Vert u[t]\Vert _{\mathcal {H}^{s}}\lesssim {\mathcal {B}}^{2} \Vert u[t] \Vert _{\mathcal {H}^{s}}. $$

The term \(s_{2}\) is also estimated perturbatively using the fact that \(\partial _{\alpha}\partial ^{\gamma}u \in \mathfrak{DC}\), which allows us to decompose it as a sum \(f_{1}+f_{2}\) as in (5.20). Then we estimate

$$ \begin{aligned} \Vert s_{2}\Vert _{H^{s-1}} \lesssim & \ \sup _{k} 2^{-\frac{k}{2}} \Vert P_{< k} T_{g^{\alpha \beta}} \widehat{\partial _{\beta}\partial _{\gamma}} u\Vert _{L^{\infty}} \sup _{j} 2^{-\frac{j}{2}}\Vert P_{< j} f_{1}\Vert _{L^{\infty}} \Vert u[t] \Vert _{\mathcal {H}^{s}} \\ & \ + \sup _{k} 2^{-k} \Vert P_{< k} T_{g^{\alpha \beta}} \widehat{\partial _{\beta}\partial _{\gamma}} u\Vert _{L^{\infty}} \Vert f_{2}\Vert _{L^{\infty}} \Vert u[t]\Vert _{\mathcal {H}^{s}} \\ \lesssim & \ {\mathcal {B}}^{2} \Vert u[t]\Vert _{\mathcal {H}^{s}}, \end{aligned} $$

using Lemma 5.4 for \(\widehat{\partial _{\beta}\partial _{\gamma}} u\). In \(s_{3}\) we can switch \(T_{g}\) onto the other argument of \(\Pi \) using Lemma 2.5 and remove the hat correction, so that it becomes half of \(N_{1}\).

The last remaining term to bound is the one on the RHS of (8.23). Here we distribute again the \(\alpha \)-derivative

$$ \begin{aligned} \partial _{\alpha} \Pi ( \partial _{\gamma} u , T_{\partial ^{\gamma}u} T_{g^{\alpha \beta}} \partial _{\beta} u)&= \Pi ( \partial _{\alpha} \partial _{\gamma} u , T_{\partial ^{\gamma}u} T_{g^{\alpha \beta}} \partial _{\beta} u) + \Pi ( \partial _{\gamma} u , T_{\partial _{ \alpha}\partial ^{\gamma}u} T_{g^{\alpha \beta}} \partial _{\beta} u) \\ &\quad + \Pi ( \partial _{\gamma} u , T_{\partial ^{\gamma}u} T_{ \partial _{\alpha}g^{\alpha \beta}} \partial _{\beta} u) + \Pi ( \partial _{\gamma} u , T_{\partial ^{\gamma}u} T_{g^{\alpha \beta}} \partial _{\beta} \partial _{\alpha}u). \end{aligned} $$

By inspection we observe that the first term in the equality above is the second half of \(N_{1}\). The remaining three terms can be estimated perturbatively using exactly the same approach as in the case of (8.22). □

In view of Lemma 8.3 we can include \(N_{2}(u)\) into \(R_{2}(u)\) in (8.13), obtaining the shorter representation of the source term

$$ ( \partial _{\alpha} T_{g^{\alpha \beta}}\partial _{\beta} - T_{A^{ \gamma}}\partial _{\gamma}) \tilde{u}= \partial _{t} R_{1}(u) +R_{2}(u), $$
(8.24)

where \(R_{1}\) and \(R_{2}\) satisfy the bounds (8.14) and (8.15).

For the homogeneous paradifferential problem we have the \(\mathcal {H}^{s}\) energy \(E^{s}\) given by Theorem 7.1. We will use this to construct our desired nonlinear energy \(E^{s}_{NL}\) in Theorem 8.1. Because we have the source term \(\partial _{t}R_{1}\), the associated nonlinear energy will not be simply given by \(E^{s}(\tilde{u}[t])\). Instead, the correct energy is the one provided by Theorem 4.5, namely

$$ E^{s}_{NL}(u[t]):=E^{s}(\tilde{u}[t]-r[t]), $$
(8.25)

where the correction \(r[t]\) is given by

$$ r[t]= \begin{pmatrix}0\\ (T_{g^{00}})^{-1}R_{1}(u)\end{pmatrix}. $$
(8.26)

Then by the estimate in (4.29) we obtain

$$ \dfrac{d}{dt}E^{s}_{NL} (u[t])\lesssim E^{s}(\tilde{u}[t]-r[t],r_{1}[t] )+ {\mathcal {B}}^{2}E^{s}_{NL}(u[t]), $$
(8.27)

where \(r_{1}[t]\) is as in (4.28),

$$ r_{1}[t]:= \ \begin{pmatrix} (T_{g^{00}})^{-1} R_{1} \\ (T_{g^{00}})^{-1}( R_{2} - \partial _{k} T_{g^{k0}} (T_{g^{00}})^{-1} R_{1} - T_{g^{0k}} \partial _{k} (T_{g^{00}})^{-1} R_{1}) \end{pmatrix}. $$

Our nonlinear energy \(E^{s}_{NL}\) is coercive because \(r[t]\) is small,

$$ \Vert r[t]\Vert _{\mathcal {H}^{s}} \lesssim {\mathcal {A}^{\sharp }}^{2} \Vert u[t]\Vert _{\mathcal {H}^{s}}, $$

due to the bound (8.15). Finally, we control the time derivative of the energy, because

$$ \Vert r_{1}[t]\Vert _{\mathcal {H}^{s}}\lesssim {\mathcal {B}}^{2} \Vert u[t] \Vert _{\mathcal {H}^{s}}. $$

This is due to the bound in (8.14).
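Assuming the coercivity of the paradifferential energy \(E^{s}\) from Theorem 7.1, i.e. \(E^{s}(w[t])\approx \Vert w[t]\Vert ^{2}_{\mathcal {H}^{s}}\), the last two bounds can be combined into the norm equivalence

$$ E^{s}_{NL}(u[t]) \approx \Vert \tilde{u}[t]-r[t]\Vert ^{2}_{\mathcal {H}^{s}} \approx \Vert u[t]\Vert ^{2}_{\mathcal {H}^{s}}, $$

where the second step uses the smallness of \({\mathcal {A}^{\sharp }}\) together with the equivalence \(\Vert \tilde{u}[t]\Vert _{\mathcal {H}^{s}}\approx \Vert u[t]\Vert _{\mathcal {H}^{s}}\) of Lemma 8.4(a).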

9 Energy and Strichartz estimates for the linearized equation

Our objective here is to prove that the homogeneous linearized equation is well-posed in the space \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) in two dimensions) and satisfies Strichartz estimates with an appropriate loss of derivatives, namely (4.33) with \(S=S_{AIT}\), under the assumption that the associated linear paradifferential equation has similar properties. The main result of this section is as follows:

Theorem 9.1

Let \(s\) be as in (1.20) (respectively (1.19) in dimension \(n = 2\)). Let \(u\) be a smooth solution for the minimal surface equation in a unit time interval, which satisfies the energy and Strichartz bounds

$$ \| u \|_{S^{s}_{AIT}} + \| \partial _{t} u\|_{S^{s-1}_{AIT}} \ll 1. $$
(9.1)

Assume also that the associated linear paradifferential equation

$$ \begin{aligned} \partial _{\alpha} T_{{\hat{g}}^{\alpha \beta}} \partial _{\beta} v = f \end{aligned} $$
(9.2)

is well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) in dimension \(n = 2\)) in a unit time interval, and satisfies the full Strichartz estimates (4.43), with \(S= S_{AIT}\), in the same interval.

Then the homogeneous linearized equation

$$ \begin{aligned} \partial _{\alpha} {\hat{g}}^{\alpha \beta} \partial _{\beta} v = 0 \end{aligned} $$
(9.3)

is also well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) in dimension \(n = 2\)), and satisfies the homogeneous form of the Strichartz estimates in (4.33) with \(S = S_{AIT}\).

We continue with several comments on the result in the theorem, in order to better place it into context.

  • Up to this point we only know that both the full equation and the associated linear paradifferential equation satisfy good energy estimates, but we do not yet know that they also satisfy the corresponding Strichartz estimates. This is not a problem, however: the main result of this section, Theorem 9.1 above, will only be used as a module within our main bootstrap argument in the last section of the paper, by which point we will have already established the energy and Strichartz estimates for both the full equation and the linear paradifferential equation.

  • The exponent \(s\) in the above result need not be the same as the one in our main result in Theorem 1.3; it can be taken to be smaller, as long as it still satisfies the constraints in (1.19), (1.20).

  • While we can no longer control the linearized evolution purely in terms of the control parameters \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\), and ℬ, these still play a role in the analysis. The hypothesis of the theorem guarantees that

    $$ {\mathcal {A}^{\sharp }}\ll 1, \qquad \| {\mathcal {B}}\|_{L^{2}} \ll 1. $$
  • The exponent \(\delta \) in the bound (4.43) with \(S= S_{AIT}\) should be thought of as being sufficiently small, compared with the distance between \(s\) and the threshold in (1.19), (1.20).

  • The smoothness assumption on \(u\) is not used in any quantitative way. Its role is only to ensure that we already have solutions for the linearized problem, so we can skip an existence argument. Thus, by a density argument, the result of the theorem may be seen as an a priori estimate for smooth solutions \(v\) of the linearized equation. As our rough solutions will be obtained in the last section as limits of smooth solutions, this assumption may be discarded a posteriori.

  • The reason we only consider the homogeneous case in the linearized equation (9.3) is to shorten the proof, as this is all that is used later in the last section. However, the result also extends to the inhomogeneous case. In particular in dimension \(n \geq 3\) this is an immediate consequence of Theorem 4.12, but in dimension \(n=2\) an extra argument would be needed.

One major simplification in this section, compared with the previous two sections, is that we no longer have the earlier difficulties in estimating the second order time derivatives and even some third order time derivatives of \(u\). In particular, we have the following relatively straightforward lemma:

Lemma 9.2

For solutions \(u\) to the minimal surface equation as in (9.1) we have the bounds

$$ \| \partial ^{2} u\|_{S^{s-2}_{AIT}}+ \| \partial {\hat{g}}\|_{S^{s-2}_{AIT}} \ll 1, $$
(9.4)

as well as

$$ \|\langle D_{x} \rangle ^{\sigma +\delta _{0}} \partial _{\alpha }T_{{ \hat{g}}^{\alpha \beta}} \partial _{\beta }\partial _{\gamma }u\|_{L^{p} L^{q}} \ll 1 , $$
(9.5)

where \(\delta _{0} >0\) depends only on \(s\) and

$$ \begin{aligned}\frac{1}{p} + \frac{1}{q} &= 1, \qquad \frac{2}{n-\frac{1}{2}} \leq \frac{1}{q} \leq \frac{1}{2} + \frac{2}{n-\frac{1}{2}}, \\ \sigma &= \left (n-\frac{1}{2}\right )\left (\frac{1}{q} - \frac{2}{n-\frac{1}{2}}\right ) \geq 0, \qquad n \geq 3, \end{aligned}$$

respectively

$$ \frac{2}{p} + \frac{1}{q} = 1, \qquad \frac{4}{7} \leq \frac{1}{q} \leq \frac{11}{14}, \qquad \sigma = \frac{7}{4} \left (\frac{1}{q} - \frac{4}{7}\right ) \geq 0, \qquad n =2. $$

Proof

For (9.4) we only need to consider the second order time derivatives, which we can write using the minimal surface equation as

$$ \partial _{t}^{2} u= h(\partial u) \partial _{x} \partial u. $$

By Moser inequalities we have \(\|h(\partial u)\|_{H^{s-1}} \ll 1\). Since \(s -1 > n/2\), it is easily verified that the space \(S^{s-2}_{AIT}\) is left unchanged by multiplication by \(h(\partial u)\). The same argument applies to derivatives of the metric \(\partial {\hat{g}}\).
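The multiplication claim rests on the standard Sobolev product estimate, sketched here for a fixed Sobolev component of the \(S^{s-2}_{AIT}\) norm:

$$ \Vert f g\Vert _{H^{\sigma}} \lesssim \Vert f\Vert _{H^{s-1}} \Vert g \Vert _{H^{\sigma}}, \qquad \vert \sigma \vert \leq s-1, \quad s-1 > \frac{n}{2}, $$

applied with \(f = h(\partial u)\), whose \(H^{s-1}\) norm is small by the Moser bound above.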

For (9.5) we can use again the minimal surface equation to obtain the representation

$$ \partial _{\alpha }T_{{\hat{g}}^{\alpha \beta}} \partial _{\beta } \partial _{\gamma }u = T_{\partial _{x} \partial {\hat{g}}} \partial u + T_{\partial {\hat{g}}} \partial ^{2} u + \Pi (\partial {\hat{g}}, \partial ^{2} u) + \Pi ({\hat{g}},\partial ^{3} u), $$

where we can further write

$$ \partial ^{3} u = \partial _{x} ({\hat{g}}\partial ^{2} u) + \partial { \hat{g}}\partial ^{2} u. $$

Hence we need to multiply two functions in \(S^{s-2}_{AIT}\), which contains a range of mixed \(L^{p}\) norms at varying spatial Sobolev regularities. We can do this optimally if both mixed norms can be chosen to have non-negative Sobolev index. In order to avoid using Sobolev embeddings we further limit the range of exponents to the case when one of the Sobolev indices may be taken to be zero. This gives the range of exponents in the lemma. □

Next we discuss the strategy of the proof. The first potential strategy here would be to try to view the equation (9.3) as a perturbation of (9.2). Unfortunately such a strategy does not seem to work in our context, because this would require a balanced estimate for the difference between the two operators, whereas this difference contains some terms which are clearly unbalanced.

To address the above difficulty, the key observation is that the aforementioned difference exhibits a null structure, at least in its unbalanced part. This opens the door to a partial normal form analysis, in order to develop a better reduction of (9.3) to (9.2). Because of this, the proof of the theorem will be done in two steps:

  1. (1)

    The normal form analysis, where a suitable normal form transformation is constructed.

  2. (2)

    Reduction to the paradifferential equation, using the above normal form.

9.1 Preliminary bounds for the linearized variable

The starting point for the proof of the theorem is to rewrite the divergence form of the linearized equation (9.3) as an inhomogeneous paradifferential evolution (9.2) with a perturbative source \(f\), as follows:

$$ \begin{aligned} T_{{\hat{P}}}v = -\partial _{\alpha} T_{\partial _{\beta} v} {\hat{g}}^{ \alpha \beta} - \partial _{\alpha} \Pi (\partial _{\beta} v, {\hat{g}}^{ \alpha \beta}) =: f. \end{aligned} $$
(9.6)

While we cannot directly prove a balanced cubic estimate for \(f\), a useful initial step is to establish a quadratic estimate for it. The expression for \(f\) involves \(v\) and \(\partial _{t} v\), which we already control, but also \(\partial _{t}^{2} v\), which we do not. So we estimate it first:

Lemma 9.3

For solutions \(v\) to (9.3) we have

$$ \begin{aligned} &\| \partial _{t}^{2} v(t) \|_{H^{-\frac{3}{2}}} \lesssim \|v[t]\|_{ \mathcal {H}^{\frac{1}{2}}} \qquad \, \, n \geq 3, \\ &\| \partial _{t}^{2} v(t) \|_{H^{-\frac{11}{8}}} \lesssim \|v[t]\|_{ \mathcal {H}^{\frac{5}{8}}} \qquad n = 2. \end{aligned} $$
(9.7)

Proof

We consider the case \(n \geq 3\), and comment on the case \(n=2\) at the end. Using the equation (9.3) for \(v\), we may write

$$ \partial _{t}^{2} v = h(\partial u) \partial _{x} \partial v + h( \partial u) \partial ^{2} u \partial v . $$

Here by Moser inequalities we have

$$ \|h(\partial u)\|_{H^{s-1}} \ll 1. $$

Then, using also (9.4) the conclusion of the Lemma follows from the straightforward multiplicative estimates

$$ H^{s-1} \cdot H^{-\frac{3}{2}} \to H^{-\frac{3}{2}}, \qquad H^{s-1} \cdot H^{s-2} \cdot H^{-\frac{1}{2}} \to H^{-\frac{3}{2}}, $$

where it is important that \(s > \frac{n}{2}+1\) and \(s > \frac{5}{2}\). This last condition is not valid in dimension \(n=2\), where we only ask that \(s > 2+\frac{3}{8}\). This is why the Sobolev exponents in this case need to be increased by \(\frac{1}{8}\). □
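Concretely, raising each Sobolev exponent by \(\frac{1}{8}\) is what produces the two dimensional indices in (9.7):

$$ -\frac{3}{2}+\frac{1}{8} = -\frac{11}{8}, \qquad \frac{1}{2}+\frac{1}{8} = \frac{5}{8}. $$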

We now return to the quadratic estimate for the source term in (9.6):

Lemma 9.4

Let \(v \in S^{\frac{1}{2}}_{AIT}\) satisfy (9.3). Then \(v\) also solves the inhomogeneous paradifferential equation

$$ \begin{aligned} T_{{\hat{P}}} v &= f, \end{aligned} $$
(9.8)

with a source term \(f\) that satisfies the following bounds:

a) For \(n \geq 3\) we have the uniform bound

$$ \| f\|_{H^{-\frac{5}{4}}} \ll \|v[\cdot ]\|_{L^{\infty }\mathcal {H}^{ \frac{1}{2}}}, $$
(9.9)

and the space-time bound

$$ \|f\|_{L^{p} L^{q}} \ll \|v[\cdot ]\|_{L^{\infty }\mathcal {H}^{\frac{1}{2}}} $$
(9.10)

with

$$ \frac{1}{q} = \frac{1}{n - \frac{1}{2}} + \frac{1}{2}, \qquad \frac{1}{p} + \frac{1}{q} = 1. $$

b) For \(n =2 \) we have the uniform bound

$$ \| f\|_{H^{-1}} \ll \|v[\cdot ]\|_{L^{\infty }\mathcal {H}^{\frac{5}{8}}}, $$
(9.11)

and the space-time bound

$$ \begin{aligned} \|f\|_{L^{4} L^{2}} \ll \|v[\cdot ]\|_{L^{\infty }\mathcal {H}^{\frac{5}{8}}}. \end{aligned} $$
(9.12)

Proof

To avoid cluttering the notation, we prove the result in the case \(n \geq 3\). The two dimensional case is identical up to appropriate adjustments of the \(L^{p}\) exponents.

We write

$$ -f = T_{ \partial _{\alpha}\partial _{\beta} v} {\hat{g}}^{\alpha \beta} + T_{\partial _{\beta} v} \partial _{\alpha} {\hat{g}}^{\alpha \beta} + \Pi (\partial _{\alpha}\partial _{\beta} v, {\hat{g}}^{ \alpha \beta}) + \Pi (\partial _{\beta} v, \partial _{\alpha}{\hat{g}}^{ \alpha \beta}). $$

For the two terms where \(\partial _{\alpha}\) has fallen on \(g\), we have

$$ \begin{aligned}\|T_{\partial _{\beta} v} \partial _{\alpha} {\hat{g}}^{\alpha \beta} + \Pi (\partial _{\beta} v, \partial _{\alpha}{\hat{g}}^{\alpha \beta}) \|_{L^{q}} &\lesssim \|v[t]\|_{\mathcal {H}^{\frac{1}{2}}} \| \langle D_{x} \rangle ^{ \frac{1}{2}}\partial _{\alpha}{\hat{g}}^{\alpha \beta}\|_{L^{n - \frac{1}{2}}} \\ &\lesssim _{{\mathcal {A}}}\|v[t]\|_{\mathcal {H}^{ \frac{1}{2}}}\|\partial u\|_{W^{\frac{3}{2}, n - \frac{1}{2}}}. \end{aligned}$$

Finally, in the cases where the \(\partial _{\alpha}\) has fallen on \(v\), we easily obtain the same estimate due to a good balance of derivatives. Here we use Lemma 9.3 to bound second derivatives of \(v\). □

9.2 The normal form analysis

The estimate in Lemma 9.4 is suboptimal as it does not recognize the cubic structure of the source. This is due to components of the source in which the linearized variable \(v\) is the second highest frequency, and which are not efficiently balanced with respect to derivatives. In fact, these cubic terms may heuristically be viewed as quadratic with a low frequency coefficient.

To better understand the source terms, we begin with a better description of the metric coefficients. By applying Lemma 2.9 to \(g^{- \frac{1}{2}}g^{\alpha \beta}\) (see also (3.8)) and rearranging, we may write the paradifferential expansion

$$ {\hat{g}}^{\alpha \beta}(\partial u) = {\hat{g}}^{\alpha \beta}(0) - T_{{ \hat{g}}^{\alpha \beta} \partial ^{\gamma} u + {\hat{g}}^{\alpha \gamma} \partial ^{\beta} u + {\hat{g}}^{\beta \gamma} \partial ^{\alpha} u} \partial _{\gamma}u + R(\partial u) , $$
(9.13)

where \(R\) satisfies favourable balanced bounds, as in Lemma 5.7,

$$ \| \partial R \|_{L^{n}} \lesssim {\mathcal {A}^{\sharp }}^{2}, \qquad \| \partial R \|_{L^{\infty}} \lesssim {\mathcal {B}}^{2}. $$
(9.14)

To obtain a cubic estimate for (9.6), we substitute (9.13) in (9.6) and write

$$ \begin{aligned} T_{{\hat{P}}} v &= N_{1}(u) + N_{2}(u), \end{aligned} $$
(9.15)

where

$$ \begin{aligned} N_{1}(u) ={}& \partial _{\alpha }T_{\partial _{\beta }v} T_{{\hat{g}}^{ \alpha \beta} \partial ^{\gamma} u + {\hat{g}}^{\alpha \gamma} \partial ^{\beta} u + {\hat{g}}^{\beta \gamma} \partial ^{\alpha} u} \partial _{\gamma }u \\ &{}+ \partial _{\alpha }\Pi (\partial _{\beta }v, T_{{ \hat{g}}^{\alpha \beta} \partial ^{\gamma} u + {\hat{g}}^{\alpha \gamma} \partial ^{\beta} u + {\hat{g}}^{\beta \gamma} \partial ^{\alpha} u} \partial _{\gamma }u) \end{aligned} $$

consists of the essentially quadratic, nonperturbative components, while

$$ \begin{aligned} N_{2}(u) &= - \partial _{\alpha }T_{\partial _{\beta}v}R(\partial u) - \partial _{\alpha }\Pi (\partial _{\beta}v, R(\partial u)) \end{aligned} $$

consists of the balanced, directly perturbative components. We address the essentially quadratic components in \(N_{1}(u)\) by passing to a renormalization \(\tilde{v}\) of \(v\),

$$ \begin{aligned} \tilde{v} &= v - T_{T_{\partial ^{\gamma }u} \partial _{\gamma }v} u - T_{T_{ \partial ^{\gamma }u} v}\partial _{\gamma }u - \Pi (T_{\partial ^{ \gamma }u} \partial _{\gamma }v, u) - \Pi (T_{\partial ^{\gamma }u} v, \partial _{\gamma }u) \\ &:= v+ v_{2}. \end{aligned} $$
(9.16)

This renormalization eliminates the components of the source where the linearized variable \(v\) is the second highest frequency. We thus replace \(N_{1}(u)\) by a source consisting of terms with \(v\) only at the third highest frequency, which may hence be viewed as authentically cubic.

Proposition 9.5

Let \(v \in S^{\frac{1}{2}}_{AIT}\) be a solution for (9.3). Then the following two properties hold in dimension \(n \geq 3\):

(i) Equivalent norms:

$$ \| \tilde{v}[t] \|_{\mathcal {H}^{\frac{1}{2}}} \approx \| v[t] \|_{ \mathcal {H}^{\frac{1}{2}}} . $$
(9.17)

(ii) \(\tilde{v}\) solves a good linear paradifferential equation of the form

$$ \begin{aligned} T_{{\hat{P}}}\tilde{v} &= \partial _{t} f_{1} + f_{2}, \end{aligned} $$
(9.18)

where the source terms are perturbative:

$$ \|f_{2}\|_{(S^{\frac{1}{2}}_{AIT})'} \ll \|v\|_{S^{\frac{1}{2}}_{AIT}}, \qquad \|f_{1}\|_{(S^{-\frac{1}{2}}_{AIT})'} \ll \|v[\cdot ]\|_{L^{ \infty }\mathcal {H}^{\frac{1}{2}}}, $$
(9.19)

as well as

$$ \|f_{1} (t)\|_{S^{-\frac{1}{2}}_{AIT}} \lesssim {\mathcal {A}^{\sharp }} \|v[t]\|_{\mathcal {H}^{\frac{1}{2}}} . $$
(9.20)

The same result holds in two space dimensions at the level of \(v \in \mathcal {H}^{\frac{5}{8}}\).

Proof

(i) For the bound (9.17), it suffices to estimate \(v_{2}\) as follows:

$$ \| v_{2}(t) \|_{H^{\frac{1}{2}}} \lesssim {\mathcal {A}^{\sharp }}\| v[t] \|_{\mathcal {H}^{\frac{1}{2}}}, $$
(9.21)
$$ \| \partial _{t} v_{2}(t) \|_{H^{-\frac{1}{2}}} \lesssim {\mathcal {A}^{ \sharp }}\| v[t]\|_{\mathcal {H}^{\frac{1}{2}}}. $$
(9.22)

The first term of \(v_{2}\) can be directly estimated using scale invariant \({\mathcal {A}}\) bounds,

$$ \begin{aligned} \| T_{T_{\partial ^{\gamma }u} \partial _{\gamma }v} u\|_{H^{ \frac{1}{2}}} &\lesssim \| T_{\partial ^{\gamma }u} \partial _{\gamma }v \|_{H^{-\frac{1}{2}}} \| \partial u \|_{L^{\infty}} \\ &\lesssim \| \partial u \|_{L^{\infty}} \| \partial _{\gamma }v \|_{H^{- \frac{1}{2}}} \| \partial u \|_{L^{\infty}} \\ &\lesssim {\mathcal {A}}^{2} \| \partial _{\gamma }v \|_{H^{-\frac{1}{2}}}. \end{aligned}$$

The third and the fourth terms are similar. However, for the second term we need to use the \({\mathcal {A}^{\sharp }}\) control norm combined with Bernstein’s inequality:

$$\begin{aligned} \| T_{T_{\partial ^{\gamma }u} v} \partial _{\gamma }u\|_{H^{ \frac{1}{2}}} \lesssim& \| T_{\partial ^{\gamma }u} v \|_{H^{ \frac{1}{2}}} \| \partial u \|_{W^{\frac{1}{2},2n}} \\ \lesssim& \| \partial u \|_{L^{\infty}} \| v \|_{H^{\frac{1}{2}}} \| \partial u \|_{W^{ \frac{1}{2},2n}} \lesssim {\mathcal {A}}{\mathcal {A}^{\sharp }}\| v \|_{H^{ \frac{1}{2}}}. \end{aligned}$$

We next consider (9.22), where we distribute the time derivative, obtaining several types of terms:

a) Terms with distributed derivatives, namely \(T_{T_{\partial u} \partial v} \partial u\) and \(\Pi (T_{\partial u} \partial v , \partial u)\). We estimate the first by

$$ \| T_{T_{\partial u} \partial v} \partial u\|_{H^{-\frac{1}{2}}} \lesssim \|T_{\partial u} \partial v\|_{H^{-\frac{1}{2}}} \| \partial u\|_{L^{\infty}} \lesssim {\mathcal {A}}^{2} \| \partial v\|_{H^{- \frac{1}{2}}}, $$

and the second, using Sobolev embeddings, by

$$ \begin{aligned}\| \Pi (T_{\partial u} \partial v , \partial u)\|_{H^{-\frac{1}{2}}} &\lesssim \| \Pi (T_{\partial u} \partial v , \partial u)\|_{L^{ \frac{2n}{n+1}}} \\ &\lesssim \| T_{\partial u} \partial v \|_{H^{- \frac{1}{2}}} \| \partial u\|_{W^{\frac{1}{2},2n}} \lesssim { \mathcal {A}}{\mathcal {A}^{\sharp }}\| \partial v \|_{H^{-\frac{1}{2}}}. \end{aligned}$$
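The first step above rests on the dual Sobolev embedding, which we record for the reader's convenience: since \(H^{\frac{1}{2}}(\mathbb{R}^{n}) \hookrightarrow L^{\frac{2n}{n-1}}(\mathbb{R}^{n})\) and \(\frac{2n}{n+1}\) is the exponent dual to \(\frac{2n}{n-1}\), duality gives

$$ L^{\frac{2n}{n+1}}(\mathbb{R}^{n}) \hookrightarrow H^{-\frac{1}{2}}(\mathbb{R}^{n}), \qquad \frac{n+1}{2n} = \frac{1}{2} + \frac{1}{2n}, $$

which is used repeatedly below whenever an \(H^{-\frac{1}{2}}\) norm is estimated by an \(L^{\frac{2n}{n+1}}\) norm.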

b) Terms with two derivatives on the high frequency \(u\), namely \(T_{T_{\partial u} v} \partial ^{2} u\) and \(\Pi (T_{\partial u} v , \partial ^{2} u)\). In view of the bound (9.4), the corresponding estimate is nearly identical to case (a) above.

c) Terms with \(\partial _{t} \partial ^{\gamma }u\). Here we know that \(\partial _{t} \partial ^{\gamma }u \in H^{s-2}\), so we arrive at estimates which are also similar to case (a).

d) Terms with two derivatives on \(v\). If one of them is spatial (i.e. \(\gamma \neq 0\)) then this is similar to or better than case (a). So we are left with the expressions \(T_{T_{\partial u} \partial _{t}^{2} v} u\) and \(\Pi (T_{\partial u} \partial _{t}^{2} v , u)\). But there we can use the bound (9.7) and complete the analysis again as in case (a).

(ii) The proof of (9.18) along with the estimates (9.19), (9.20) will be accomplished in four steps.

1) We first estimate the balanced source term component \(N_{2}(u)\) from (9.15). We consider below the paraproduct \(T\) term, but the \(\Pi \) term is similar. We first consider the cases where the outer derivative \(\partial _{\alpha }= \partial _{i}\) is a spatial derivative, which we will place in \(f_{2}\). We have by Lemma 5.7 (see (9.14) above)

$$ \| \partial _{i} T_{\partial _{\beta}v} R(\partial u)\|_{H^{-1/2}} \lesssim \|v[t]\|_{\mathcal {H}^{1/2}}\|\partial R(\partial u)\|_{L^{ \infty}} \lesssim _{{\mathcal {A}}} {\mathcal {B}}^{2} \|v[t]\|_{ \mathcal {H}^{1/2}}, $$

where the \({\mathcal {B}}^{2}\) factor is integrable in time. We place the case where \(\partial _{\alpha }= \partial _{0}\) in \(\partial _{t} f_{1}\), estimating

$$ \| T_{\partial _{\beta}v} R(\partial u)\|_{H^{1/2}} \lesssim \|v[t]\|_{ \mathcal {H}^{1/2}}\|\partial R(\partial u)\|_{L^{\infty}} \lesssim _{{ \mathcal {A}}} {\mathcal {B}}^{2} \|v[t]\|_{\mathcal {H}^{1/2}}, $$

and

$$ \begin{aligned} \|T_{\partial _{\beta}v} R(\partial u)\|_{H^{-1/2}} &\lesssim \|T_{ \partial _{\beta}v} R(\partial u)\|_{L^{\frac{2n}{1 + n}}} \\ &\lesssim \|v[t] \|_{\mathcal {H}^{1/2}}\|\partial R(\partial u)\|_{L^{n}} \lesssim { \mathcal {A}^{\sharp }}^{2} \|v[t]\|_{\mathcal {H}^{1/2}}, \end{aligned} $$

as well as

$$ \begin{aligned} \|\langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{4}} T_{\partial _{ \beta}v} R(\partial u)\|_{L^{\infty}} &\lesssim \|T_{\partial _{\beta}v} R(\partial u)\|_{L^{2}} \\ &\lesssim \|\partial v[t]\|_{\mathcal {H}^{-1/2}} \|\partial R(\partial u)\|_{L^{2n}} \lesssim {\mathcal {A}^{\sharp }}{ \mathcal {B}}\|v[t]\|_{\mathcal {H}^{1/2}}. \end{aligned} $$

Here we have a single \({\mathcal {B}}\) factor, which is \(L^{2}\) in time, as needed for the \(L^{2} L^{\infty}\) Strichartz norm in (9.20).

2) Next, we apply product and commutator lemmas to exchange \(N_{1}(u)\) for an equivalent expression up to perturbative errors, in preparation for comparison with the contribution from the normal form corrections. Here, we discuss the first term of \(N_{1}(u)\),

$$ \partial _{\alpha }T_{\partial _{\beta }v} T_{{\hat{g}}^{\alpha \beta} \partial ^{\gamma} u} \partial _{\gamma }u, $$
(9.23)

but the remaining terms, including the balanced \(\Pi \) terms, are similar, using the analogous product and commutator lemmas. We first consider the cases where the outer derivative \(\partial _{\alpha }= \partial _{i}\) is a spatial derivative, and place all perturbative errors in \(f_{2}\). By an application of product and commutator Lemmas 2.7 and 2.4, we may replace (9.23) with

$$ \partial _{\alpha }T_{\partial ^{\gamma} u} T_{\partial _{\beta }v} T_{{ \hat{g}}^{\alpha \beta}} \partial _{\gamma }u. $$

Then applying Lemma 2.4 and the estimate (2.13) in Lemma 2.8, it suffices to consider

$$ \partial _{\alpha }T_{{\hat{g}}^{\alpha \beta}} T_{T_{\partial ^{ \gamma }u}\partial _{\beta }v} \partial _{\gamma }u. $$

In the case where \(\partial _{\alpha }= \partial _{0}\), we place all perturbative errors in \(\partial _{t} f_{1}\). The bound for \(f_{1}\) in (9.19) is similar to the one for \(f_{2}\), but there is a price to pay, namely that we also need to prove (9.20). Fortunately for (9.20) we may disregard all commutator structure and discard all the para-coefficients, as they are bounded and gain an \({\mathcal {A}^{\sharp }}\) factor, so we are left with proving a bound of the form

$$ \| T_{\partial v} \partial u\|_{S_{AIT}^{-\frac{1}{2}}} \lesssim \| \partial v\|_{L^{\infty }H^{-\frac{1}{2}}}. $$

Here for the uniform bound we simply write at fixed time

$$ \| T_{\partial v} \partial u\|_{H^{-\frac{1}{2}}} \lesssim \| \partial v\|_{ H^{-\frac{1}{2}} } \| \partial u\|_{L^{\infty}} \lesssim {\mathcal {A}}\|\partial v\|_{ H^{-\frac{1}{2}} }, $$

and for the \(L^{2} L^{\infty}\) bound we have

$$ \| T_{\partial v} \partial u\|_{L^{2}} \lesssim {\mathcal {B}}\| \partial v\|_{ H^{-\frac{1}{2}} } $$

using \({\mathcal {B}}\) for the square integrability in time and then applying Bernstein's inequality in space to convert the \(L^{2}\) bound into \(L^{\infty}\).
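The last conversion is the standard Bernstein inequality: with \(P_{k}\) denoting a Littlewood–Paley projection to frequencies \(\approx 2^{k}\), one has

$$ \| P_{k} f \|_{L^{\infty}} \lesssim 2^{\frac{nk}{2}} \| P_{k} f \|_{L^{2}}, $$

so that, after summation over \(k\), a weight \(\langle D_{x} \rangle ^{-\frac{n}{2}-\delta}\) with \(\delta > 0\) converts a spatial \(L^{2}\) bound into an \(L^{\infty}\) bound.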

Applying the same analysis to the other terms of \(N_{1}(u)\), we have reduced the problem to

$$ \begin{aligned} N_{1}'(u) &= \partial _{\alpha }(T_{{\hat{g}}^{\alpha \beta}} T_{T_{ \partial ^{\gamma }u}\partial _{\beta }v}\partial _{\gamma }u + T_{{ \hat{g}}^{\alpha \gamma}} T_{T_{\partial ^{\beta }u}\partial _{\beta }v} \partial _{\gamma }u + T_{{\hat{g}}^{\beta \gamma}} T_{T_{\partial ^{ \alpha }u}\partial _{\beta }v}\partial _{\gamma }u) \\ &\quad + \partial _{\alpha }(\Pi (T_{\partial ^{\gamma }u} \partial _{ \beta }v, T_{{\hat{g}}^{\alpha \beta}}\partial _{\gamma }u) + \Pi ( T_{ \partial ^{\beta }u}\partial _{\beta }v, T_{{\hat{g}}^{\alpha \gamma}} \partial _{\gamma }u) \\ &\quad + \Pi (T_{\partial ^{\alpha }u}\partial _{ \beta }v, T_{{\hat{g}}^{\beta \gamma}}\partial _{\gamma }u)). \end{aligned} $$

3) We next establish the cancellation between the normal form correction and \(N_{1}'(u)\). In this step, we discuss only the low-high \(T\) paraproduct contributions, and return to the \(\Pi \) contributions in Step 4. Applying \(T_{{\hat{P}}}\) to the \(T\) term of \(v_{2}\) in (9.16), we have the contribution

$$ \begin{aligned} - \partial _{\alpha}T_{{\hat{g}}^{\alpha \beta}} \partial _{\beta} (T_{T_{ \partial ^{\gamma }u}\partial _{\gamma }v} u + T_{T_{\partial ^{ \gamma }u} v}\partial _{\gamma }u ). \end{aligned} $$
(9.24)

a) We first observe that the cases where the derivatives \(\partial _{\beta}\) and \(\partial _{\gamma}\) are split between \(v\) and the high frequency \(u\) cancel with the first two terms of \(N_{1}'(u)\). The main point to verify before doing so is that the cases where \(\partial _{\beta}\) falls on the lowest frequency para-coefficient \(\partial ^{\gamma }u\) are perturbative, due to an efficient balance of derivatives, and may be absorbed into \(f_{2}\) or \(f_{1}\). To see this, we analyze separately the cases involving spatial versus time derivatives. In the case of spatial derivatives \(\partial _{\alpha }= \partial _{i}\) and \(\partial _{\beta }= \partial _{j}\), we directly estimate

$$ \|\partial _{i}T_{{\hat{g}}^{ij}} T_{T_{\partial _{j} \partial ^{ \gamma }u}\partial _{\gamma }v} u\|_{H^{-1/2}} \lesssim {\mathcal {B}}^{2} \|v[t]\|_{\mathcal {H}^{1/2}}. $$

In the case where \(\partial _{\beta }= \partial _{0}\), we obtain the same estimate in the same manner, except when \(\partial _{\gamma }= \partial _{0}\). In this case, we may use Lemma 5.4 to estimate the lowest frequency \(\partial _{0}^{2} u\).

It remains to consider the case \(\partial _{\alpha }= \partial _{0}\), which we place in \(\partial _{t} f_{1}\). We have

$$ \|T_{{\hat{g}}^{0\beta}} T_{T_{\partial _{\beta} \partial ^{\gamma }u} \partial _{\gamma }v} u\|_{H^{1/2}} \lesssim {\mathcal {B}}^{2} \|v[t] \|_{\mathcal {H}^{1/2}} $$

as before. For \(f_{1}\) however, we also require an estimate for the full Strichartz norm in (9.20). We separate \(\partial _{\beta}\) again into spatial and time derivatives. For the spatial case, we have by Sobolev embeddings,

$$ \begin{aligned} \|T_{{\hat{g}}^{0j}} T_{T_{\partial _{j} \partial ^{\gamma }u} \partial _{\gamma }v} u\|_{H^{-1/2}} \lesssim & \ \| T_{T_{\partial _{j} \partial ^{\gamma }u}\partial _{\gamma }v} u\|_{L^{\frac{2n}{1 + n}}} \\ \lesssim&\ \|\langle D_{x} \rangle ^{1/2}\partial ^{\gamma }u\|_{L^{2n}} \| v[t]\|_{\mathcal {H}^{1/2}}\|\partial u\|_{L^{\infty}} \\ \lesssim & \ {\mathcal {A}}{\mathcal {A}^{\sharp }}\| v[t]\|_{\mathcal {H}^{1/2}} \end{aligned} $$

for the uniform bound, as well as

$$ \begin{aligned} \|T_{{\hat{g}}^{0j}} T_{T_{\partial _{j} \partial ^{\gamma }u} \partial _{\gamma }v} u\|_{L^{2}} &\lesssim \|\langle D_{x} \rangle ^{1/2} \partial ^{\gamma }u\|_{L^{2n}} \| v[t]\|_{\mathcal {H}^{1/2}}\| \partial u\|_{BMO^{\frac{1}{2}}} \lesssim {\mathcal {B}}{\mathcal {A}^{ \sharp }}\| v[t]\|_{\mathcal {H}^{1/2}} \end{aligned} $$

for the \(L^{2} L^{\infty}\) bound.

For the case \(\partial _{\beta }= \partial _{0}\), the lowest frequency includes an instance of \(\partial _{0}^{2} u\), where we apply Lemma 5.4. This contributes a spatial component \(\hat{\partial}^{2}_{t} u\) which is estimated as before, as well as a balanced \(\Pi \) interaction, namely \(\pi _{2}(u)\). This case is estimated by

$$ \begin{aligned} \| T_{{\hat{g}}^{00}} T_{T_{\pi _{2}(u)}\partial _{0} v} u\|_{H^{-1/2}} &\lesssim \| T_{T_{\pi _{2}(u)}\partial _{0} v} u\|_{L^{ \frac{2n}{1 + n}}} \\ &\lesssim \|\pi _{2}(u)\|_{L^{n}} \| v[t]\|_{ \mathcal {H}^{1/2}}\|\partial u\|_{L^{\infty}} \lesssim {\mathcal {A}}{ \mathcal {A}^{\sharp }}^{2} \| v[t]\|_{\mathcal {H}^{1/2}} \end{aligned} $$

for the energy norm, and

$$ \begin{aligned} \| T_{{\hat{g}}^{00}} T_{T_{\pi _{2}(u)}\partial _{0} v} u\|_{L^{2}} & \lesssim \| T_{T_{\pi _{2}(u)}\partial _{0} v} u\|_{L^{2}} \\ &\lesssim \|\pi _{2}(u)\|_{L^{2n}} \| v[t]\|_{\mathcal {H}^{1/2}}\|\partial u\|_{L^{ \infty}} \lesssim {\mathcal {A}}{\mathcal {A}^{\sharp }}{\mathcal {B}}\| v[t] \|_{\mathcal {H}^{1/2}} \end{aligned} $$

for the \(L^{2} L^{\infty}\) bound.

Having dismissed the perturbative cases via this analysis, we observe an exact cancellation with the first two terms of \(N_{1}'(u)\). Collecting the remaining paraproduct terms from \(N_{1}'(u)\) and (9.24), we are left with the expression

$$ \begin{aligned} \partial _{\alpha }T_{{\hat{g}}^{\beta \gamma}} T_{T_{\partial ^{ \alpha }u}\partial _{\beta }v}\partial _{\gamma }u - \partial _{ \alpha}T_{{\hat{g}}^{\alpha \beta}} T_{T_{\partial ^{\gamma }u} \partial _{\beta }\partial _{\gamma }v} u - \partial _{\alpha}T_{{ \hat{g}}^{\alpha \beta}} T_{T_{\partial ^{\gamma }u}v} \partial _{ \beta }\partial _{\gamma }u. \end{aligned} $$
(9.25)

b) Before proceeding, we further process the first term in (9.25), with the key step being an integration by parts which reveals an instance of \(T_{{\hat{P}}}v\). Reindexing, we rewrite this term as

$$ \partial _{\gamma }T_{{\hat{g}}^{\beta \alpha}} T_{T_{\partial ^{ \gamma }u}\partial _{\beta }v}\partial _{\alpha }u. $$

Then applying Lemma 2.4 and the estimate (2.13) in Lemma 2.8 to commute \(T_{{\hat{g}}^{\alpha \beta}}\), similar to step 2), we replace this by

$$ \partial _{\gamma }T_{T_{\partial ^{\gamma }u} T_{{\hat{g}}^{\beta \alpha}} \partial _{\beta }v}\partial _{\alpha }u. $$

Simulating an integration by parts with respect to \(\partial _{\alpha}\), we write this as

$$ \partial _{\alpha}\partial _{\gamma }T_{T_{\partial ^{\gamma }u} T_{{ \hat{g}}^{\beta \alpha}} \partial _{\beta }v} u - \partial _{\gamma }T_{ \partial _{\alpha }T_{\partial ^{\gamma }u} T_{{\hat{g}}^{\beta \alpha}} \partial _{\beta }v} u. $$

We will carry the first of these terms forward to 3c), while the latter term is perturbative. To see this, we observe that \(\partial _{\alpha}\) may commute through \(T_{\partial ^{\gamma }u}\), similar to the analysis in 3a). Thus we arrive at the expression

$$ \partial _{\gamma }T_{ T_{\partial ^{\gamma }u} \partial _{\alpha }T_{{ \hat{g}}^{\beta \alpha}} \partial _{\beta }v} u = \partial _{\gamma }T_{ T_{\partial ^{\gamma }u} T_{{\hat{P}}}v} u. $$

We consider separately via \(f_{2}\) and \(\partial _{t} f_{1}\) the contributions corresponding to \(\partial _{\gamma }= \partial _{i}\) and \(\partial _{\gamma }= \partial _{0}\) respectively. For the bound (9.19), using Lemma 9.4 and the Strichartz exponents \((p_{1},q_{1})\) given by

$$ \frac{1}{p_{1}} = \frac{1}{n - \frac{1}{2}}, \qquad \frac{1}{p_{1}} + \frac{1}{q_{1}} = \frac{1}{2}, $$
(9.26)

we estimate, in both cases,

$$ \begin{aligned} \|T_{ T_{\partial ^{\gamma }u} T_{{\hat{P}}}v} u\|_{(S_{AIT}^{- \frac{1}{2}})'} & \lesssim \|\langle D_{x} \rangle ^{\frac{3}{2}+ \delta} T_{ T_{\partial ^{\gamma }u} T_{{\hat{P}}}v} u\|_{L^{p'_{1}} L^{q'_{1}}} \\ &\lesssim {\mathcal {A}}\|T_{{\hat{P}}}v\|_{L^{p}L^{q}} \|u\|_{L^{2} W^{ \frac{3}{2}+\delta , \infty}}. \end{aligned} $$
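For concreteness, the exponents \((p_{1},q_{1})\) defined in (9.26) can be solved for explicitly,

$$ \frac{1}{p_{1}} = \frac{2}{2n-1}, \qquad \frac{1}{q_{1}} = \frac{1}{2} - \frac{2}{2n-1} = \frac{2n-5}{2(2n-1)}, $$

so that, for instance, in dimension \(n = 3\) we have \((p_{1},q_{1}) = (\frac{5}{2},10)\), with dual exponents \((p'_{1},q'_{1}) = (\frac{5}{3},\frac{10}{9})\).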

It remains to prove the bound (9.20), but this is again a simpler bound where we have a considerable gain. Indeed, using only \(H^{s}\) Sobolev bounds but including (9.4) and (9.7) we obtain at fixed time

$$ \| T_{ T_{\partial ^{\gamma }u} T_{{\hat{P}}}v} u\|_{L^{2}} \lesssim { \mathcal {A}}\| T_{{\hat{P}}}v\|_{H^{-\frac{5}{4}}} \| u \|_{H^{s}} , $$

which suffices for all the Strichartz bounds.

c) Returning to (9.25) and replacing the first term via the analysis in 3b), we are now left with

$$ \begin{aligned} \partial _{\alpha}(\partial _{\gamma }T_{T_{\partial ^{\gamma }u} T_{{ \hat{g}}^{\beta \alpha}} \partial _{\beta }v} u - T_{{\hat{g}}^{\alpha \beta}} T_{T_{\partial ^{\gamma }u} \partial _{\beta }\partial _{ \gamma }v} u - T_{{\hat{g}}^{\alpha \beta}} T_{T_{\partial ^{\gamma }u}v} \partial _{\beta }\partial _{\gamma }u). \end{aligned} $$

We observe a cancellation between the first two terms. To see this, we apply the Leibniz rule for the \(\partial _{\gamma}\) derivative on the first term. Similar to 3a), cases where the derivative falls on the lowest frequency \(\partial ^{\gamma }u\) or \({\hat{g}}^{\beta \alpha}\) are perturbative. We also have a term which cancels the second term, leaving us with

$$ \begin{aligned} \partial _{\alpha }( T_{T_{\partial ^{\gamma }u} T_{{\hat{g}}^{\beta \alpha}} \partial _{\beta }v}\partial _{\gamma }u - T_{{\hat{g}}^{ \alpha \beta}} T_{T_{\partial ^{\gamma }u}v} \partial _{\beta } \partial _{\gamma }u). \end{aligned} $$

Applying also the commutator Lemma 2.4 and the bound (2.13) in Lemma 2.8 as in 2), we rewrite this as

$$ \begin{aligned} \partial _{\alpha }T_{\partial ^{\gamma }u} T_{T_{{\hat{g}}^{\beta \alpha}} \partial _{\beta }v}\partial _{\gamma }u - \partial _{\alpha }T_{T_{\partial ^{\gamma }u}v} T_{{\hat{g}}^{\alpha \beta}} {\partial _{\beta }\partial _{\gamma}} u. \end{aligned} $$

d) We apply the Leibniz rule with respect to \(\partial _{\alpha}\). Here we observe that cases where \(\partial _{\alpha}\) falls on lower frequency instances of \(u\) or \(g\) are perturbative. Note that in contrast to the previous substeps, we no longer have the \(\partial _{\alpha}\) divergence and so we must put all terms in \(f_{2}\).

We consider for instance the term

$$ T_{T_{\partial _{\alpha} \partial ^{\gamma }u} v} T_{{\hat{g}}^{\alpha \beta}} {\partial _{\beta }\partial _{\gamma}} u. $$

Excluding the case of two time derivatives in \(\partial _{\alpha}\partial ^{\gamma }u\), this is easily estimated due to a favorable balance of derivatives. In the case of two time derivatives, we have \(\partial _{\alpha}\partial ^{\gamma }u \in \mathfrak{DC}\) so we can use the decomposition in Definition 5.1, say \(\partial _{\alpha}\partial ^{\gamma }u = h_{1}+h_{2}\). The first component can be thought of as a spatial derivative and is again easily estimated. It remains to consider the contribution of the second term \(h_{2} \in {\mathcal {B}}^{2} L^{\infty}\):

$$ \begin{aligned} \| T_{T_{h_{2}} v} T_{{\hat{g}}^{\alpha \beta}} {\partial _{\beta } \partial _{\gamma}} u \|_{H^{-1/2}} &\lesssim {\mathcal {A}}\| h_{2}\|_{L^{\infty}} \| v\|_{H^{\frac{1}{2}}} \| \partial ^{2} u \|_{H^{s-2}} \\ &\lesssim {\mathcal {B}}^{2} \|v[t]\|_{\mathcal {H}^{1/2}}. \end{aligned} $$

A similar analysis applies in the cases where \(\partial _{\alpha}\) falls on a low frequency metric coefficient \(g\).

e) We record the remaining terms after applying the Leibniz rule; in them we will observe instances of \(T_{{\hat{P}}}\), for which we use the equation (9.6), as well as a cancellation. We arrive at

$$ \begin{aligned} T_{\partial ^{\gamma }u} (T_{\partial _{\alpha }T_{{\hat{g}}^{\beta \alpha}} \partial _{\beta }v}\partial _{\gamma }u + T_{T_{{\hat{g}}^{\beta \alpha}} \partial _{\beta }v} {\partial _{\alpha} \partial _{\gamma}} u) - T_{ T_{\partial ^{\gamma }u} \partial _{\alpha }v}T_{{\hat{g}}^{\alpha \beta}} {\partial _{\beta }\partial _{\gamma}} u - T_{T_{\partial ^{\gamma }u} v}\partial _{\alpha }T_{{\hat{g}}^{\alpha \beta}} {\partial _{\beta }\partial _{\gamma}} u. \end{aligned} $$

Reindexing the second term, and applying the bound (2.13) in Lemma 2.8 in the second and the third term, the above expression may be reduced to the form

$$ \begin{aligned} T_{\partial ^{\gamma }u} T_{T_{{\hat{P}}}v} \partial _{\gamma }u + T_{\partial ^{\gamma }u}T_{{\hat{g}}^{\alpha \beta}} T_{\partial _{\alpha }v} {\partial _{\beta} \partial _{\gamma}} u - T_{\partial ^{\gamma }u} T_{ \partial _{\alpha }v}T_{{\hat{g}}^{\alpha \beta}} {\partial _{\beta }\partial _{\gamma}} u - T_{T_{\partial ^{\gamma }u} v}\partial _{\alpha }T_{{\hat{g}}^{\alpha \beta}} {\partial _{\beta }\partial _{\gamma}} u. \end{aligned} $$
(9.27)

Now in the two middle terms we have a commutator structure, which can be estimated directly by Lemma 2.4. It remains to consider the first and the last term. We apply (9.10) to the first term, and estimate in a dual Strichartz norm with exponents \((p_{1},q_{1})\) as in (9.26),

$$ \begin{aligned} \| T_{T_{{\hat{P}}}v} \partial _{\gamma }u \|_{(S_{AIT}^{\frac{1}{2}})'} & \lesssim \|\langle D_{x} \rangle ^{\frac{1}{2}+\delta} T_{T_{{ \hat{P}}}v} \partial _{\gamma }u\|_{L^{p'_{1}} L^{q'_{1}}} \lesssim { \mathcal {A}}\|T_{{\hat{P}}}v\|_{L^{p}L^{q}} \|\partial u\|_{L^{2} W^{ \frac{1}{2}+\delta , \infty}} . \end{aligned} $$

For the last term, on the other hand, we use a Strichartz bound for \(v\) and match it with the bound (9.5) in Lemma 9.2,

$$ \begin{aligned}\| \langle D_{x} \rangle ^{\delta} T_{T_{\partial ^{\gamma }u} v} \partial _{\alpha }T_{{\hat{g}}^{\alpha \beta}} {\partial _{\beta } \partial _{\gamma}} u\|_{L^{p'_{3}}L^{q'_{3}}} &\lesssim \| \langle D_{x} \rangle ^{-\delta} v \|_{L^{p'_{3}}L^{q'_{3}}} \| \langle D_{x} \rangle ^{\delta _{0}} \partial _{\alpha }T_{{\hat{g}}^{\alpha \beta}} { \partial _{\beta }\partial _{\gamma}} u\|_{L^{p_{4}} L^{q_{4}}} \\ &\ll \| v\|_{S^{\frac{1}{2}}_{AIT}}, \end{aligned}$$

where

$$ \frac{1}{p_{3}} = \frac{1}{n-\frac{1}{2}}, \qquad \frac{1}{p_{4}} = 1 - \frac{2}{p_{3}}, \qquad \frac{1}{p_{3}} + \frac{1}{q_{3}} = \frac{1}{2}, \qquad \frac{1}{p_{4}}+ \frac{1}{q_{4}}=1. $$

Here the Strichartz exponents \(p_{3}\) and \(q_{3}\) are chosen so that the first factor on the right is controlled by \(\| v\|_{S^{\frac{1}{2}}_{AIT}}\) and \(\delta \) is arbitrarily small. On the other hand \(\delta _{0}\) is a fixed positive parameter which depends only on the distance between \(s\) and its lower bound.
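For orientation, in dimension \(n=3\) these relations give

$$ p_{3} = \frac{5}{2}, \qquad q_{3} = 10, \qquad p_{4} = 5, \qquad q_{4} = \frac{5}{4}, $$

since \(\frac{1}{p_{3}} = \frac{2}{5}\), \(\frac{1}{q_{3}} = \frac{1}{2} - \frac{2}{5} = \frac{1}{10}\), \(\frac{1}{p_{4}} = 1 - \frac{4}{5} = \frac{1}{5}\), and \(\frac{1}{q_{4}} = 1 - \frac{1}{5} = \frac{4}{5}\).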

4) It remains to consider the cancellation between the balanced \(\Pi \) terms in the normal form correction and in \(N_{1}'(u)\). Here the analysis is identical to the analysis for the low-high \(T\) contributions in Step 3, due to the analogous structure for the \(T\) and \(\Pi \) terms in both \(v_{2}\) and \(N_{1}'(u)\). The main care that is needed is to observe that all negative Sobolev exponent norms have been addressed in Step 3 by either using a divergence structure, or by Sobolev embeddings, which apply equally well to the balanced \(\Pi \) case. □

9.3 Reduction to the paradifferential equation

Here we first use the well-posedness result for the linear paradifferential equation in order to obtain a good bound for \(\tilde{v}\). The source terms are perturbative by (9.19) and Theorem 4.12, so the solution \(\tilde{v}\) must satisfy the bound

$$ \| \tilde{v}\|_{S_{AIT}^{\frac{1}{2}}} + \| \partial _{t} \tilde{v}\|_{S_{AIT}^{-\frac{1}{2}}} \lesssim \| \tilde{v}[0]\|_{\mathcal {H}^{\frac{1}{2}}} + c \left ( \| v \|_{S_{AIT}^{\frac{1}{2}}} + \| \partial _{t} v \|_{S_{AIT}^{-\frac{1}{2}}} \right ), \qquad c \ll 1. $$
(9.28)

It remains to show that the Strichartz estimates carry over to \(v\). For this, it suffices to show that

$$ \| v_{2} \|_{S_{AIT}^{\frac{1}{2}}} + \| \partial _{t} v_{2} \|_{S_{AIT}^{- \frac{1}{2}}} \ll \| v \|_{L^{\infty }\mathcal {H}^{\frac{1}{2}}}. $$
(9.29)

If this is true, then combining the last two bounds with the norm equivalence (9.17) we obtain the desired bound for the linearized evolution (9.3), namely

$$ \| v \|_{S_{AIT}^{\frac{1}{2}}} + \| \partial _{t} v \|_{S_{AIT}^{-\frac{1}{2}}} \lesssim \| v[0] \|_{\mathcal {H}^{\frac{1}{2}}} $$
(9.30)

with a universal implicit constant. This concludes the proof of Theorem 9.1 in dimension \(n \geq 3\). The case \(n=2\) is virtually identical.
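Schematically, writing \(v = \tilde{v} - v_{2}\) from (9.16), the absorption argument combines the bounds above as

$$ \| v \|_{S_{AIT}^{\frac{1}{2}}} + \| \partial _{t} v \|_{S_{AIT}^{-\frac{1}{2}}} \lesssim \| \tilde{v}\|_{S_{AIT}^{\frac{1}{2}}} + \| \partial _{t} \tilde{v}\|_{S_{AIT}^{-\frac{1}{2}}} + \| v_{2} \|_{S_{AIT}^{\frac{1}{2}}} + \| \partial _{t} v_{2} \|_{S_{AIT}^{-\frac{1}{2}}} \lesssim \| v[0] \|_{\mathcal {H}^{\frac{1}{2}}} + c \left ( \| v \|_{S_{AIT}^{\frac{1}{2}}} + \| \partial _{t} v \|_{S_{AIT}^{-\frac{1}{2}}} \right ), $$

where (9.28), (9.29) and the norm equivalence (9.17) at \(t=0\) have been used, and the last term may be absorbed into the left-hand side since \(c \ll 1\).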

It remains to prove (9.29). The energy norm for \(v_{2}\) has already been estimated in part (i) of Proposition 9.5, so it remains to consider the \(L^{2} L^{\infty}\) norm in three and higher dimensions. This is a soft bound, where we only need to use the energy bound for \(v\) on the right rather than the full Strichartz norm, though the latter would also have been allowed. There are eight norms to estimate; most of them are similar, so we consider a representative sample, leaving the rest for the reader.

For a streamlined unbalanced bound we consider the term

$$ \begin{aligned}\| \langle D_{x} \rangle ^{-\frac{n-2}{2}-\frac{1}{4} - \delta} T_{T_{ \partial ^{\gamma }u} v} \partial _{\gamma }u \|_{L^{2} L^{\infty}} &\lesssim \| T_{\partial ^{\gamma }u} v\|_{L^{\infty }L^{ \frac{2n}{n-1}}} \| \langle D_{x} \rangle ^{\frac{1}{4} -\delta} \partial u \|_{L^{\infty}} \\ &\lesssim {\mathcal {A}}\| v\|_{L^{\infty } \mathcal {H}^{\frac{1}{2}}}, \end{aligned}$$

where we have used Bernstein’s inequality twice and the Strichartz bound for \(u\). This pattern is followed for all unbalanced terms.

For the worst balanced case, we apply the time derivative to \(v\) in the next to last term in \(v_{2}\). Then we have to estimate

$$ \begin{aligned} \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{4} - \delta} \Pi (T_{ \partial ^{\gamma }u} \partial ^{2} v, u) \|_{L^{2} L^{\infty}}& \lesssim \| \Pi (T_{\partial ^{\gamma }u} \partial ^{2} v, u) \|_{L^{2} L^{\frac{4n}{2n-1}} } \\ &\lesssim {\mathcal {A}}\| \partial ^{2} v\|_{L^{\infty }H^{- \frac{3}{2}}} \| \langle D_{x} \rangle ^{\frac{3}{2}+\frac{1}{4}} u\|_{L^{2} L^{\infty}} \\ &\lesssim {\mathcal {A}}\| v\|_{L^{\infty }\mathcal {H}^{ \frac{1}{2}}}, \end{aligned} $$

where we have used Bernstein’s inequality twice, Lemma 9.3 and the Strichartz bound for \(u\).

10 Short time Strichartz estimates

The aim of this section is to give a more detailed overview of the local well-posedness result in [38], and at the same time to provide a formulation of that result which applies in a large data setting, but only for a short time. Instead of working with the equation (1.13), here it is easier to work with the problem

$$ g^{\alpha \beta}({\mathbf {u}}) \partial _{\alpha }\partial _{\beta }{ \mathbf {u}}= h^{\alpha \beta}({\mathbf {u}}) \partial _{\alpha }{ \mathbf {u}}\, \partial _{\beta }{\mathbf {u}}$$
(10.1)

for a possibly vector-valued function \({\mathbf {u}}\). This is exactly the set-up of [38], and has the advantage that it is scale invariant. We recall that the scaling exponent for this problem is \(s_{c} = \frac{n}{2}\). In our problem, we will apply the results in this section to the function \({\mathbf {u}}= \partial u\).
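To recall the scaling: if \({\mathbf {u}}\) solves (10.1), then so does \({\mathbf {u}}_{\lambda}(t,x) = {\mathbf {u}}(\lambda t, \lambda x)\) for \(\lambda > 0\), and the homogeneous Sobolev norms of the data rescale as

$$ \| {\mathbf {u}}_{\lambda}(0) \|_{\dot{H}^{s}} = \lambda ^{s - \frac{n}{2}} \| {\mathbf {u}}(0) \|_{\dot{H}^{s}}, $$

which is invariant precisely when \(s = s_{c} = \frac{n}{2}\).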

We begin with a review of the local well-posedness result in [38], but where we describe also the structure of the Strichartz estimates:

Theorem 10.1

Smith-Tataru [38]

Consider the problem (10.1) with initial data satisfying

$$ \| {\mathbf {u}}[0]\|_{\mathcal {H}^{s_{1}}} \ll 1, $$
(10.2)

where

$$ s_{1}> s_{c}+\frac{3}{4}, \qquad n = 2, $$
(10.3)

respectively

$$ s_{1}> s_{c}+\frac{1}{2}, \qquad n = 3, 4, 5. $$
(10.4)

Then the solution exists on the time interval \([0,1]\), and satisfies the following Strichartz estimates

$$ \| \langle D_{x} \rangle ^{\delta _{0}}\partial {\mathbf {u}}\|_{L^{4} L^{ \infty}} \lesssim 1 , \qquad n=2 , $$
(10.5)

respectively

$$ \| \langle D_{x} \rangle ^{\delta _{0}}\partial {\mathbf {u}}\|_{L^{2} L^{ \infty}} \lesssim 1 , \qquad n = 3,4,5, $$
(10.6)

with a small \(\delta _{0} > 0\).

In addition, another conclusion of [38], which is used as an intermediate step in the proof of the theorem above, is that the linearized problem around the solutions in Theorem 10.1 is well-posed in a range of Sobolev spaces, and that almost lossless Strichartz estimates hold for it. Precisely, we have the following:

Theorem 10.2

[38]

Let \({\mathbf {u}}\) be a solution for (10.1) in the time interval \([0,1]\) as in Theorem 10.1. Then the linear equation

$$ \left \{ \begin{aligned} & g^{\alpha \beta}({\mathbf {u}}) \partial _{\alpha }\partial _{\beta }v = 0 \\ & v[0] = (v_{0},v_{1}) \end{aligned} \right . $$
(10.7)

is well-posed in \(\mathcal {H}^{r}\) in the same time interval for \(1 \leq r \leq s_{1}+1\), and the solutions satisfy the uniform and Strichartz estimates (4.33) for the same range of \(r\).

We note that in [38] it is also assumed that \(g^{00}= -1\), akin to our metric \({\tilde{g}}\); but it is clear that such an assumption is not needed in the above theorems, as one can simply divide the equation by \(g^{00}\).

We also remark that the equation (10.7) is not the same as the linearized equation. The reason (10.7) is preferred in [38] is the extended upper bound for \(r\). It is also noted in [38] that, for a range of \(r\) with a smaller upper bound, the conclusion of the last theorem remains valid for the full linearized equation; this is a straightforward perturbative argument. At the lower end, the Sobolev exponent \(r = 1\) suffices in dimension \(n\geq 3\) in [38], though it is also clear that this is not optimal. Indeed, in dimension \(n = 2\) the above result is extended in [38] to the range \(\frac{3}{4} \leq r \leq s_{1}+1\), and the linearized equation is shown to be well-posed in \(\mathcal {H}^{\frac{3}{4}}\); see [38, Lemma A4]. The same method also works in higher dimensions.

We also remark that if the linearized equation is in divergence form (which can be arranged in the present paper, see (3.24)), then, by duality, (forward/backward) well-posedness in \(\mathcal {H}^{r}\) implies (backward/forward) well-posedness in \(\mathcal {H}^{1-r}\), with center point \(r = \frac{1}{2}\). This explains why, in the context of the present paper, it is easiest to study the linearized equation exactly in \(\mathcal {H}^{\frac{1}{2}}\). Unfortunately our argument runs into a technical obstruction in dimension \(n=2\), which is why we make a slight adjustment there and work instead in \(\mathcal {H}^{\frac{5}{8}}\).

To summarize, in the present paper we will not need directly the conclusion of Theorem 10.2, but rather a minor variation of it where we also consider the divergence form equation and its associated paradifferential flow, and we lower the range for \(r\) in order to include the space \(\mathcal {H}^{\frac{1}{2}}\) (\(\mathcal {H}^{\frac{5}{8}}\) in dimension two).

In the proof of the main result of this paper, we will need to use this result for solutions that are not small in \(\mathcal {H}^{s}\), so we cannot apply it directly. Instead, we will seek to rephrase it and use it in a large data setting via a scaling argument.

The difficulty we face is that rescaling interacts well with the homogeneous Sobolev norms, rather than with the inhomogeneous ones. A first step in this direction is to consider solutions that are smooth, but which may be large at low frequency:

Theorem 10.3

Consider the problem (10.1) with initial data satisfying

$$ \| {\mathbf {u}}[0]\|_{\dot {\mathcal {H}}^{N} \cap \dot {\mathcal {H}}^{s_{c}}} + \| {\mathbf {u}}(0)\|_{L^{\infty}} \ll 1. $$
(10.8)

Then the solution exists up to time 1, and satisfies the uniform bound

$$ \| {\mathbf {u}}\|_{L^{\infty}} \ll 1, $$
(10.9)

and the Sobolev bound

$$ \| {\mathbf {u}}\|_{L^{\infty}([0,1];\dot {\mathcal {H}}^{N} \cap \dot {\mathcal {H}}^{s_{c}})} \lesssim \| {\mathbf {u}}[0]\|_{ \dot {\mathcal {H}}^{N} \cap \dot {\mathcal {H}}^{s_{c}}}. $$
(10.10)

In addition,

$$ \| {\mathbf {u}}[\cdot ] \|_{L^{\infty}([0,1]; \mathcal {H}^{1})} \lesssim \|{\mathbf {u}}[0]\|_{\mathcal {H}^{1}} $$
(10.11)

whenever the right hand side is finite.

Proof

Locally, after subtracting a constant, the data is small in \(\mathcal {H}^{N}\) so the existence of regular solutions is classical. It remains to establish energy estimates in homogeneous Sobolev norms. The problem reduces to the case of the paradifferential flow, and, by conjugation with a power of \(\langle D_{x} \rangle \), to bounds in \(\mathcal {H}^{1}\) that are straightforward. □

A second step is the following variation of Theorem 10.1, where we consider a small \(\mathcal {H}^{s_{1}}\) perturbation of a small and smooth data:

Theorem 10.4

Consider the problem (10.1) with initial data \({\mathbf {u}}[0]\) of the form

$$ {\mathbf {u}}[0] = {\mathbf {u}}^{lo}[0] + {\mathbf {u}}^{hi}[0] , $$
(10.12)

where the two components satisfy

$$ \| {\mathbf {u}}^{lo}[0]\|_{\mathcal {H}^{N}} \ll 1, \qquad \| {\mathbf {u}}^{hi}[0] \|_{\mathcal {H}^{s_{1}}} \leq \epsilon \ll 1. $$
(10.13)

Then the solution \(u\) exists on the time interval \([0,1]\), and satisfies the following Strichartz estimates

$$ \| \langle D_{x} \rangle ^{\delta _{0}}\partial ({\mathbf {u}}-{ \mathbf {u}}^{lo})\|_{L^{4} L^{\infty}} \lesssim \epsilon , \qquad n=2, $$
(10.14)

respectively

$$ \| \langle D_{x} \rangle ^{\delta _{0}}\partial ({\mathbf {u}}-{ \mathbf {u}}^{lo})\|_{L^{2} L^{\infty}} \lesssim \epsilon , \qquad n \geq 3, $$
(10.15)

with a small \(\delta _{0} > 0\).

We remark that the solutions in this second theorem are still covered by Theorem 10.1. The only difference is that here the constant in the Strichartz bound depends only on the size of \({\mathbf {u}}^{hi}[0]\).

Proof

This follows by a direct application of the results in Theorem 10.1 and Theorem 10.2. We write an equation for \({\mathbf {u}}^{hi}={\mathbf {u}}-{\mathbf {u}}^{lo}\),

$$ g^{\alpha \beta}({\mathbf {u}}) \partial _{\alpha }\partial _{\beta }{ \mathbf {u}}^{hi} = - (g^{\alpha \beta}({\mathbf {u}}) - g^{\alpha \beta}({ \mathbf {u}}^{lo})) \partial _{\alpha }\partial _{\beta }{\mathbf {u}}^{lo} := \mathbf{f}^{hi}, $$

where the source term \(\mathbf{f}^{hi}\) can be estimated at fixed time by

$$ \| \mathbf{f}^{hi}\|_{H^{s_{1}-1}} \lesssim \|{\mathbf {u}}^{hi} \|_{H^{s_{1}}}, $$

and thus it is perturbative. Then we apply the Strichartz estimates in Theorem 10.2 to \({\mathbf {u}}^{hi}\), and the desired conclusion follows. □

Now we consider the large data problem, where we prove local well-posedness by a scaling argument. The price to pay is that the existence time of the solutions becomes shorter. Precisely, we will show that

Theorem 10.5

For any \(s_{1}\) as in (10.3), (10.4) there exists \(\delta _{0} > 0\) so that the following holds: For any \(M > 0\) and any solution \({\mathbf {u}}\) to the problem (1.13) with initial data satisfying

$$ \| {\mathbf {u}}[0]\|_{\dot {\mathcal {H}}^{s_{1}}} \ll M, \qquad \|{ \mathbf {u}}[0]\|_{\dot {\mathcal {H}}^{s_{c}}} \ll 1. $$
(10.16)

We have:

a) The solution exists up to time \(T_{M}\) given by

$$ T_{M}^{\sigma }= M^{-1}, \qquad \sigma = s_{1}-s_{c}, $$
(10.17)

with uniform bounds

$$ \| {\mathbf {u}}[\cdot ]\|_{C([0,T_{M}];\mathcal {H}^{s_{1}})} \lesssim M, \qquad \|{\mathbf {u}}[\cdot ]\|_{C([0,T_{M}];\dot {\mathcal {H}}^{s_{c}})} \lesssim 1, $$
(10.18)

as well as

$$ \| {\mathbf {u}}[\cdot ]\|_{C([0,T_{M}];\mathcal {H}^{1})} \lesssim \|{ \mathbf {u}}[0]\|_{\mathcal {H}^{1}}, $$
(10.19)

whenever the right hand side is finite.

b) The solution \({\mathbf {u}}\) satisfies the following Strichartz estimates in \([0,T_{M}]\):

$$ \|\langle T_{M} D'\rangle ^{\delta _{0}}\partial {\mathbf {u}}\|_{L^{4} L^{ \infty}} \lesssim T_{M}^{-\frac{3}{4}}, \qquad n=2 , $$
(10.20)

respectively

$$ \| \langle T_{M} D'\rangle ^{\delta _{0}}\partial {\mathbf {u}}\|_{L^{2} L^{\infty}} \lesssim T_{M}^{-\frac{1}{2}}, \qquad n \geq 3. $$
(10.21)

c) Furthermore, the homogeneous Strichartz estimates (4.33) also hold in \(\mathcal {H}^{r}\) for the associated linear equations (10.7), on the same time intervals for \(r \in [1,s_{1}]\).

Proof

As stated, the result is invariant with respect to scaling. Precisely, \(M\) plays the role of a scaling parameter, and by rescaling we can set it to 1.
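Schematically, the scaling in question acts as \({\mathbf {u}}_{\lambda}(t,x) = {\mathbf {u}}(\lambda t, \lambda x)\) (a sketch in our notation; here \(s_{c} = \frac{n}{2}\) for \({\mathbf {u}}\), so the \(\dot {\mathcal {H}}^{s_{c}}\) norm of the data is scale invariant):

```latex
% Action of the rescaling on the homogeneous data norms:
\| {\mathbf{u}}_{\lambda}[0] \|_{\dot{\mathcal{H}}^{s_{1}}}
   = \lambda^{\, s_{1}-s_{c}} \| {\mathbf{u}}[0] \|_{\dot{\mathcal{H}}^{s_{1}}},
\qquad
\| {\mathbf{u}}_{\lambda}[0] \|_{\dot{\mathcal{H}}^{s_{c}}}
   = \| {\mathbf{u}}[0] \|_{\dot{\mathcal{H}}^{s_{c}}} .
```

Choosing \(\lambda = T_{M} = M^{-\frac{1}{\sigma}}\) with \(\sigma = s_{1}-s_{c}\) normalizes the \(\dot {\mathcal {H}}^{s_{1}}\) size to \(O(1)\), and existence up to time 1 for \({\mathbf {u}}_{\lambda}\) translates into existence up to time \(T_{M}\) for \({\mathbf {u}}\).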

It remains to prove the result for \(M=1\), in which case \(T_{M}=1\). In a nutshell, the idea of the proof is to use the finite speed of propagation to localize the problem and, by scaling, to reduce it to the case when Theorems 10.3 and 10.4 can be applied. To fix the notations, we will consider the case \(n \geq 3\) in what follows; the two dimensional case is identical after obvious changes in notation.

On the Fourier side we split the initial data into two components,

$$ {\mathbf {u}}[0] = {\mathbf {u}}^{lo}[0] + {\mathbf {u}}^{hi}[0], \qquad { \mathbf {u}}^{lo}[0] = P_{< 0} {\mathbf {u}}[0], $$

and we denote by \({\mathbf {u}}\) and \({\mathbf {u}}^{lo}\) the corresponding solutions.

On the other hand on the physical side we partition the initial time slice \(t=0\) into cubes \(Q\) of size 1, and consider a partition of unity associated to the covering by \(8Q\),

$$ 1 = \sum \chi _{Q}, $$

and define the localized initial data

$$ {\mathbf {u}}_{Q}[0]=\bigl( \chi _{Q} ({\mathbf {u}}_{0} - \bar {{\mathbf {u}}}^{lo}_{0,Q}), \chi _{Q} {\mathbf {u}}_{1}\bigr), \qquad \bar {{\mathbf {u}}}^{lo}_{0,Q} = \frac{1}{|Q|}\int _{Q} {\mathbf {u}}^{lo}_{0} \, dx, $$

which agrees with \({\mathbf {u}}[0]\) in \(6Q\) up to a constant. The speed of propagation for solutions \({\mathbf {u}}\) with \(|{\mathbf {u}}| \ll 1\) is close to 1, therefore the corresponding solutions \({\mathbf {u}}_{Q}\) agree with \({\mathbf {u}}\) in \(4Q\) (again, up to a constant) in \([0,1]\), assuming both exist up to this time.

Next we consider the existence and properties of the solutions \({\mathbf {u}}_{Q}\) in the time interval \([0,T_{M}]\). For \({\mathbf {u}}_{Q}[0]\) we have a low-high decomposition,

$$ {\mathbf {u}}_{Q}[0] = (\chi _{Q}({\mathbf {u}}^{lo}_{0} - \bar {{\mathbf {u}}}^{lo}_{0,Q}), \chi _{Q} {\mathbf {u}}^{lo}_{1}) + (\chi _{Q} {\mathbf {u}}^{hi}_{0}, \chi _{Q} {\mathbf {u}}^{hi}_{1}) :={ \mathbf {u}}^{lo}_{Q}[0] + {\mathbf {u}}^{hi}_{Q}[0]. $$

Now we consider energy bounds for the initial data. For \({\mathbf {u}}^{lo}\) we have

$$ \| {\mathbf {u}}^{lo} [0]\|_{\dot {\mathcal {H}}^{s_{c}} \cap \dot {\mathcal {H}}^{N}} \ll 1. $$
(10.22)

Since \(s_{1}-1 < n/2\), after localization this also implies that the low frequency components satisfy

$$ \| {\mathbf {u}}^{lo}_{Q} [0]\|_{\mathcal {H}^{N}} \ll 1, $$
(10.23)

which is exactly as in Theorem 10.3, respectively Theorem 10.4.

On the other hand, for the high frequency bounds we have the almost orthogonality relation

$$ \sum _{Q} \|{\mathbf {u}}^{hi}_{Q}[0]\|_{\mathcal {H}^{s_{1}}}^{2} \ll 1. $$
(10.24)

By Theorem 10.1, it follows that the solutions \({\mathbf {u}}_{Q}\) exist up to time 1, and satisfy the Strichartz bounds

$$ \| \langle D_{x} \rangle ^{\delta _{0}}\partial {\mathbf {u}}_{Q}\|_{L^{2} L^{\infty}} \lesssim 1. $$
(10.25)

Theorem 10.4 allows us to improve this to

$$ \| {\mathbf {u}}_{Q} - {\mathbf {u}}_{Q}^{lo} \|_{L^{\infty }\mathcal {H}^{s_{1}}}+ \| \langle D_{x} \rangle ^{\delta _{0}}\partial ({\mathbf {u}}_{Q}-{ \mathbf {u}}_{Q}^{lo})\|_{L^{2} L^{\infty}} \lesssim \|{\mathbf {u}}^{hi}_{Q}[0] \|_{\mathcal {H}^{s_{1}}}. $$
(10.26)

The solutions \({\mathbf {u}}_{Q}\), respectively \({\mathbf {u}}^{lo}_{Q}\), agree with \({\mathbf {u}}\), respectively \({\mathbf {u}}^{lo}\), in \([0,1]\times 4Q\). Then we can recombine the \({\mathbf {u}}_{Q}\) bounds using a partition of unity on the unit spatial scale. We obtain a \({\mathbf {u}}\) bound, namely

$$ \| {\mathbf {u}}- {\mathbf {u}}^{lo} \|_{L^{\infty }\mathcal {H}^{s_{1}}}+ \| \langle D_{x} \rangle ^{\delta _{0}}\partial ({\mathbf {u}}-{ \mathbf {u}}^{lo})\|_{L^{2} L^{\infty}} \lesssim \Big( \sum _{Q} \| {\mathbf {u}}^{hi}_{Q}[0] \|_{\mathcal {H}^{s_{1}}}^{2} \Big)^{\frac{1}{2}} \ll 1. $$
(10.27)

On the other hand for \({\mathbf {u}}^{lo}\) we have the bounds given by Theorem 10.3.

The energy bounds for \({\mathbf {u}}-{\mathbf {u}}^{lo}\) and \({\mathbf {u}}^{lo}\) combined yield the desired energy bound (10.18) in the theorem. As for the Strichartz bounds (10.20), we already have them for \({\mathbf {u}}- {\mathbf {u}}^{lo}\), so it remains to prove them for \({\mathbf {u}}^{lo}\). But there they follow trivially from Sobolev embeddings and Hölder's inequality in time.

It remains to consider the Strichartz estimates for \(\mathcal {H}^{1}\) solutions to the linearized equation. By the same finite speed of propagation argument as above, it suffices to prove them for the linearization around the localized solutions \({\mathbf {u}}_{Q}\). But this follows by Theorem 10.2. □

To conclude this section, we reinterpret the above result in the context of the minimal surface equation, exactly in the form in which it will be used in the last section. We keep the same notations, with the only change that now \(s_{c} = \frac{n}{2}+1\):

Theorem 10.6

For any \(s_{1}\) as in (10.3), (10.4) there exists \(\delta _{0} > 0\) so that the following holds: For any \(M > 0\) and any solution \(u\) to the problem (1.7) with initial data satisfying

$$ \| u[0]\|_{\dot {\mathcal {H}}^{s_{1}}} \ll M, \qquad \|u[0]\|_{ \dot {\mathcal {H}}^{s_{c}}} \ll 1. $$
(10.28)

We have:

a) The solution exists up to time \(T_{M}\) given by

$$ T_{M}^{\sigma }= M^{-1}, \qquad \sigma = s_{1}-s_{c}, $$
(10.29)

with uniform bounds

$$ \| u[\cdot ]\|_{C([0,T_{M}];\mathcal {H}^{s_{1}})} \lesssim M, \qquad \|u[\cdot ]\|_{C([0,T_{M}];\dot {\mathcal {H}}^{s_{c}})} \lesssim 1, $$
(10.30)

as well as

$$ \| u[\cdot ]\|_{C([0,T_{M}];\mathcal {H}^{1})} \lesssim \|u[0]\|_{ \mathcal {H}^{1}}, $$
(10.31)

whenever the right hand side is finite.

b) The solution \(u\) satisfies the following Strichartz estimates in \([0,T_{M}]\):

$$ \|\langle T_{M} D'\rangle ^{\delta _{0}}\partial ^{2} u\|_{L^{4} L^{ \infty}} \lesssim T_{M}^{-\frac{3}{4}}, \qquad n=2 , $$
(10.32)

respectively

$$ \| \langle T_{M} D'\rangle ^{\delta _{0}}\partial ^{2}u\|_{L^{2} L^{ \infty}} \lesssim T_{M}^{-\frac{1}{2}}, \qquad n \geq 3. $$
(10.33)

c) Furthermore, the homogeneous Strichartz estimates (4.33) also hold in \(\mathcal {H}^{r}\) for the associated linear equations (10.7), on the same time intervals for \(r \in [1,s_{1}]\). Also, the full Strichartz estimates (4.42) with \(S= S_{ST}\) hold for the linear paradifferential equation in \(\mathcal {H}^{r}\), on the same time intervals, for all real \(r\).

The theorem is obtained by applying the previous theorem to \({\mathbf {u}}= \partial u\). For the Strichartz estimates for the linear paradifferential equation we observe in addition that we have the bound

$$ \| \partial ^{2} g\|_{L^{1}(0,T_{M}; L^{\infty})} \lesssim 1. $$

Then the \(r=1\) case of the Strichartz estimates for the linear equations (10.7) together with Proposition 4.8 imply the desired conclusion.

11 Conclusion: proof of the main result

After using the finite speed of propagation to reduce to the small data problem, here we combine our balanced energy estimates with the short time Strichartz bounds in order to complete the proof of our main result in Theorem 1.3. Our rough solutions are constructed as limits of smooth solutions obtained by regularizing the initial data, so the emphasis is on obtaining favourable estimates for these smooth solutions.

11.1 Reduction to small data

By Sobolev embeddings, the initial data satisfies

$$ \| u_{0}\|_{C^{1,\sigma}} + \| u_{1}\|_{C^{\sigma}} \lesssim 1, \qquad \sigma = s -\frac{n}{2}-1. $$

Then given \(x_{0} \in {\mathbb{R}}^{n}\), within a small ball \(B(x_{0},4r)\) we have

$$ |u_{0}(x) - (u_{0}(x_{0})+(x-x_{0})\partial u(x_{0}))| + | u_{1}(x) - u_{1}(x_{0})| \lesssim r^{\sigma}. $$

This allows us to truncate the above differences near \(x_{0}\) to obtain the localized data

$$ \begin{aligned} u_{0}^{r,x_{0}}(x) = & \ u_{0}(x_{0})+(x-x_{0})\partial u(x_{0}) \\ &+ \chi (r^{-1}(x-x_{0})) \bigl( u_{0}(x) - (u_{0}(x_{0}) +(x-x_{0})\partial u(x_{0}))\bigr), \\ u_{1}^{r,x_{0}}(x) = & \ u_{1}(x_{0}) + \chi (r^{-1}(x-x_{0})) (u_{1}(x) - u_{1}(x_{0})) , \end{aligned} $$

where \(\chi \in \mathcal {D}({\mathbb{R}}^{n})\) is equal to 1 in \(B(0,2)\) and 0 outside \(B(0,4)\).

Let \(\epsilon > 0\). Then for small enough \(r\), depending on \(\epsilon \), these initial data are close to the initial data for the linear solution to the minimal surface equation given by

$$ \tilde{u}^{x_{0},r}(t,x) = (u_{0}(x_{0})+(x-x_{0})\partial u(x_{0})) + t u_{1}(x_{0}), $$

in the sense that

$$ \| u^{x_{0},r}[0] - \tilde{u}^{x_{0},r}[0] \|_{\mathcal {H}^{s}} \leq \epsilon \ll 1. $$
(11.1)

This will be our smallness condition for the initial data, with \(\partial {\mathbf {u}}^{x_{0},r}\) in a compact subset of the set described in (1.12).

To reduce the problem to the case when the initial data satisfies instead the simpler smallness condition

$$ \| u^{x_{0},r}[0] \|_{\mathcal {H}^{s}} \leq \epsilon \ll 1 $$
(11.2)

it suffices to apply a linear transformation in the Minkowski space \({\mathbb{R}}^{n+2}\) that preserves the time slices but maps our linear solution \(\tilde{u}^{x_{0},r}\) to the zero solution. The price we pay for this is that the background Minkowski metric is then changed to another Lorentzian metric. But the new metric belongs to a compact set in the space of flat Lorentzian metrics for which the time slices are uniformly space-like and the graph of the zero function is uniformly time-like. Hence our small data result applies uniformly to these localized solutions, see Remark 3.1. Then, due to the finite speed of propagation, we also obtain solutions up to time \(O(r)\) for the original problem.

11.2 Uniform bounds for regularized solutions

Let \(s\) be as in Theorem 1.3. Given an initial data \(u[0] \in \mathcal {H}^{s}\) that is small,

$$ \| u[0] \|_{\mathcal {H}^{s}} \leq \epsilon \ll 1, $$
(11.3)

we consider a continuous family of frequency localizations

$$ u^{h}[0] = P_{< h}u[0] $$

to frequencies \(\leq 2^{h}\). For fixed \(h\), and on a short time interval which may depend on \(h\), these solutions exist by Theorem 10.5. Further, they are smooth and also depend smoothly on \(h\). Finally, we consider the functions

$$ v^{h} = \frac{d}{dh} u^{h}. $$

These functions solve the linearized equation around \(u^{h}\), with initial data

$$ v^{h}[0]= P_{h} u[0], $$
(11.4)

which is localized at frequency \(2^{h}\). The functions \(v^{h}\) will be measured in \(\mathcal {H}^{\frac{1}{2}}\) in dimension \(n\geq 3\) and in \(\mathcal {H}^{\frac{5}{8}}\) in dimension \(n=2\). Thus the initial data for \(v^{h}\) satisfies the bound

$$ \begin{aligned} \|v^{h}[0]\|_{\mathcal {H}^{\frac{1}{2}}} \lesssim 2^{-(s-\frac{1}{2})h} \epsilon \qquad n \geq 3, \\ \|v^{h}[0]\|_{\mathcal {H}^{\frac{5}{8}}} \lesssim 2^{-(s-\frac{5}{8})h} \epsilon \qquad n = 2. \end{aligned} $$
(11.5)
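For instance in dimension \(n \geq 3\), the first bound in (11.5) is a routine consequence of the frequency localization of \(v^{h}[0]\) at frequency \(2^{h}\) (sketched here for \(h \geq 0\)):

```latex
\| v^{h}[0] \|_{\mathcal{H}^{\frac{1}{2}}}
  \approx 2^{\frac{h}{2}} \| P_{h} u_{0} \|_{L^{2}}
        + 2^{-\frac{h}{2}} \| P_{h} u_{1} \|_{L^{2}}
  = 2^{-(s-\frac{1}{2})h}
    \bigl( 2^{sh} \| P_{h} u_{0} \|_{L^{2}}
         + 2^{(s-1)h} \| P_{h} u_{1} \|_{L^{2}} \bigr)
  \lesssim 2^{-(s-\frac{1}{2})h} \, \epsilon .
```

The two dimensional bound is identical with \(\frac{1}{2}\) replaced by \(\frac{5}{8}\).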

Our first objective will be to show that these solutions exist on a time interval that does not depend on \(h\), and satisfy uniform bounds:

Theorem 11.1

The above solutions \(u^{h}\) have the following properties:

  1. a) Uniform lifespan and uniform bounds

    The solutions \(u^{h}\) exist up to time 1, with uniform bounds

    $$ \| u^{h}[\cdot ] \|_{C([0,1];\mathcal {H}^{s})} \lesssim \epsilon , $$
    (11.6)

    and higher regularity bounds

    $$ \| u^{h}[\cdot ] \|_{C([0,1];\mathcal {H}^{s+1})} \lesssim 2^{h} \epsilon . $$
    (11.7)
  2. b) Bounds for the linearized flow

    The linearized equation around \(u^{h}\) is well-posed in \(\mathcal {H}^{\frac{1}{2}}\), with estimates in \([0,1]\) that are uniform in \(h\),

    $$ \begin{aligned} \| v\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \lesssim \|v[0]\|_{ \mathcal {H}^{\frac{1}{2}}}, \qquad n \geq 3, \\ \| v\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{5}{8}})} \lesssim \|v[0]\|_{ \mathcal {H}^{\frac{5}{8}}}, \qquad n =2, \end{aligned} $$
    (11.8)

    and uniform Strichartz estimates with loss of derivatives,

    $$ \begin{aligned} \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{4}-\delta} \partial v \|_{L^{2} L^{\infty}} \lesssim \|v[0]\|_{\mathcal {H}^{\frac{1}{2}}}, \qquad n \geq 3, \\ \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{4}-\delta} \partial v \|_{L^{4} L^{\infty}} \lesssim \|v[0]\|_{\mathcal {H}^{\frac{5}{8}}}, \qquad n = 2, \end{aligned} $$
    (11.9)

    for any \(\delta > 0\).

The exponent \(s+1\) in (11.7) is chosen so that it falls into the range of the existing theory, where we already have well-posedness and continuous dependence. We remark that, as a corollary of part (b), we also obtain uniform bounds for the functions \(v^{h}\), namely

$$ \begin{aligned} \| v^{h}\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \lesssim \epsilon 2^{-(s-\frac{1}{2})h}, \qquad n \geq 3, \\ \| v^{h}\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{5}{8}})} \lesssim \epsilon 2^{-(s-\frac{5}{8})h}, \qquad n = 2. \end{aligned} $$
(11.10)

11.3 The bootstrap assumptions

Our proof of the main result in Theorem 11.1 will be structured as a bootstrap argument. The question is then what constitutes a good bootstrap assumption. Having the bounds for the linearized equation as part of the bootstrap assumption would be technically complicated. On the other hand, having no assumption at all related to the linearized equation would make it too difficult to get the argument started. As it turns out, there is a good middle ground, which is to include the uniform energy bounds for both \(u^{h}\) and \(v^{h}\) in the bootstrap assumptions, which are then set as follows:

  1. i)

    Uniform \(\mathcal {H}^{s}\) bounds:

    $$ \| u^{h}[\cdot ] \|_{C([0,1];\mathcal {H}^{s})} \leq 1, $$
    (11.11)
  2. ii)

    Higher regularity bounds:

    $$ \| u^{h}[\cdot ] \|_{C([0,1];\mathcal {H}^{s+1})} \leq 2^{h}, $$
    (11.12)
  3. iii)

    Difference bounds,

    $$ \begin{aligned} \| v^{h}\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \leq 2^{-(s- \frac{1}{2})h}, \qquad n\geq 3, \\ \| v^{h}\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{5}{8}})} \leq 2^{-(s- \frac{5}{8})h}, \qquad n=2. \end{aligned} $$
    (11.13)

The \(v^{h}\) bootstrap bound will be useful in particular in order to obtain good low frequency bounds for differences of the \(u^{h}\) functions,

$$ \| u^{h} - u^{k}\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \lesssim 2^{-(s-\frac{1}{2})h}, \qquad h \leq k, $$
(11.14)

with the obvious change in two dimensions.
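Indeed, (11.14) follows by integrating the bootstrap bound (11.13) in \(h\):

```latex
u^{h} - u^{k} = - \int_{h}^{k} v^{h'} \, dh',
\qquad
\| u^{h} - u^{k} \|_{L^{\infty}([0,1];\mathcal{H}^{\frac{1}{2}})}
  \lesssim \int_{h}^{k} 2^{-(s-\frac{1}{2})h'} \, dh'
  \lesssim 2^{-(s-\frac{1}{2})h},
```

where the last step uses the geometric decay of the integrand, as \(s > \frac{1}{2}\).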

To avoid having a bootstrap assumption on a noncompact set of functions, we may freely restrict the range of \(h\). Precisely, given an arbitrary threshold \(h_{0}\), we assume the bootstrap assumption to hold for all \(h \leq h_{0}\) and show that the desired bounds hold in the same range. Since \(h_{0}\) plays no role in the analysis, we will simply drop it in the proofs.

11.4 Short time Strichartz estimates for \(u^{h}\) and \(v^{h}\)

Our goal here is to use the results in Theorem 10.6 together with our bootstrap assumption in order to obtain short time Strichartz estimates for both \(u^{h}\) and \(v^{h}\).

By the bootstrap assumptions (11.11) and (11.12), we may bound the local well-posedness norm \(\mathcal {H}^{s_{1}}\) of the solution \(u^{h}\) by

$$ \| u^{h}[\cdot ]\|_{L^{\infty }\mathcal {H}^{s_{1}}} \lesssim M_{h}:= 2^{h(s_{1}-s)}. $$
(11.15)
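Indeed, (11.15) follows by interpolating the bootstrap bounds (11.11) and (11.12), since \(s_{1} \in [s, s+1]\):

```latex
\| u^{h}[\cdot] \|_{\mathcal{H}^{s_{1}}}
  \lesssim \| u^{h}[\cdot] \|_{\mathcal{H}^{s}}^{1-\theta}
           \, \| u^{h}[\cdot] \|_{\mathcal{H}^{s+1}}^{\theta}
  \lesssim \bigl( 2^{h} \bigr)^{\theta}
  = 2^{h(s_{1}-s)} = M_{h},
\qquad \theta = s_{1}-s \in [0,1] .
```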

Then the result of Theorem 10.5 is valid on time intervals \(I_{h}\) of length

$$ |I_{h}| = T_{h} := M_{h}^{-\frac{1}{\sigma}} = 2^{- \frac{s_{1}- s}{s_{1}-s_{c}}h}. $$

In practice, \(s_{1}\) will be chosen as close as possible to the threshold in (10.3), (10.4). This will ensure that in all dimensions we have

$$ \frac{s_{1}-s}{s_{1}-s_{c}} < \frac{1}{2}. $$
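This inequality is elementary to track; as a sketch:

```latex
\frac{s_{1}-s}{s_{1}-s_{c}} < \frac{1}{2}
\;\Longleftrightarrow\;
s_{1} < 2s - s_{c}
\;\Longleftrightarrow\;
s - s_{c} > s_{1} - s .
```

For instance, with the choice \(s_{1} = s + \frac{1}{4}\) made below, this reduces to \(s - s_{c} > \frac{1}{4}\), i.e. \(s_{1} - s_{c} > \frac{1}{2}\), which is also used later on.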

In particular, by Theorem 10.5 it follows that the solution \(u^{h}\) satisfies full Strichartz estimates on such intervals,

$$ \| \langle D_{x} \rangle ^{1+\delta _{0}} \partial u^{h}\|_{L^{2}(I_{h}; L^{\infty})} \lesssim T_{h}^{-\frac{1}{2}}, \qquad n \geq 3, $$
(11.16)

respectively

$$ \| \langle D_{x} \rangle ^{1+\delta _{0}} \partial u^{h}\|_{L^{4}(I_{h}; L^{\infty})} \lesssim T_{h}^{-\frac{3}{4}}, \qquad n =2. $$
(11.17)

Also the linearized problem and the linear paradifferential flow will be well-posed in \(\mathcal {H}^{\frac{1}{2}}\) and will satisfy Strichartz estimates on similar time intervals,

$$ \| \langle D_{x} \rangle ^{-\frac{n}{2}-\delta} \partial v \|_{L^{2}(I_{h}; L^{\infty})} \lesssim \| v\|_{L^{\infty}(I_{h};\mathcal {H}^{ \frac{1}{2}})}, \qquad n \geq 3, $$
(11.18)

respectively

$$ \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{8}-\delta} \partial v \|_{L^{4}(I_{h}; L^{\infty})} \lesssim \| v\|_{L^{\infty}(I_{h}; \mathcal {H}^{\frac{5}{8}})}, \qquad n =2, $$
(11.19)

where the \(L^{\infty}\) norm on the right may be replaced by the same \(\mathcal {H}^{\frac{1}{2}}\) norm evaluated at some fixed time within \(I_{h}\). The last set of bounds may be in particular applied to \(v^{h}\), which, in view of our bootstrap assumption, yields

$$ \| \langle D_{x} \rangle ^{-\frac{n}{2}-\delta} \partial v^{h} \|_{L^{2}(I_{h}; L^{\infty})} \lesssim 2^{-(s-\frac{1}{2}) h}, \qquad n \geq 3, $$
(11.20)

respectively

$$ \| \langle D_{x} \rangle ^{-\frac{n}{2} -\frac{1}{8}-\delta} \partial v^{h} \|_{L^{4}(I_{h}; L^{\infty})} \lesssim 2^{-(s- \frac{5}{8}) h} , \qquad n =2. $$
(11.21)

11.5 Long time Strichartz estimates for \(u^{h}\) and \(v^{h}\)

Our objective now is to obtain long time Strichartz bounds by simply adding up the short time bounds. Some care is needed when using (11.21) and (11.20) because, as \(h\) increases, we gain in the bound on the right but we lose in the size of the interval \(I_{h}\). However, the gain outweighs the loss, so integrating in \(h\) we arrive at the difference bound

$$ \| \langle D_{x} \rangle ^{-\frac{n}{2}-\delta} \partial (u^{h}-u^{k}) \|_{L^{2}(I_{h}; L^{\infty})} \lesssim 2^{-(s-\frac{1}{2}) h}, \qquad n \geq 3, \quad h < k, $$
(11.22)

respectively

$$ \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{8}-\delta} \partial (u^{h}-u^{k}) \|_{L^{4}(I_{h}; L^{\infty})} \lesssim 2^{-(s-\frac{5}{8}) h} , \qquad n =2, \quad h < k. $$
(11.23)
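To quantify why the gain outweighs the loss in dimension \(n \geq 3\) (a sketch; the two dimensional case is analogous): writing \(u^{h}-u^{k} = -\int_{h}^{k} v^{h'}\,dh'\), on a fixed interval \(I_{h}\) we may sum the short time bound (11.20) for \(v^{h'}\) over the \(|I_{h}|/|I_{h'}|\) subintervals of length \(|I_{h'}|\), and then integrate in \(h'\):

```latex
\| \langle D_{x} \rangle^{-\frac{n}{2}-\delta} \partial v^{h'} \|_{L^{2}(I_{h}; L^{\infty})}
  \lesssim \Bigl( \tfrac{|I_{h}|}{|I_{h'}|} \Bigr)^{\frac{1}{2}}
           \, 2^{-(s-\frac{1}{2})h'},
\qquad
|I_{h}|^{\frac{1}{2}} \int_{h}^{k}
   2^{\frac{s_{1}-s}{2\sigma} h'} \, 2^{-(s-\frac{1}{2})h'} \, dh'
  \lesssim 2^{-(s-\frac{1}{2})h},
```

using \(|I_{h'}|^{-\frac{1}{2}} = 2^{\frac{s_{1}-s}{2\sigma}h'}\) and the fact that \(s - \frac{1}{2} > \frac{s_{1}-s}{2\sigma}\), so the exponential gain in \(h'\) dominates the loss from the interval subdivision.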

Now we are able to obtain Strichartz bounds for \(u^{h}\) on the full time interval \([0,1]\), simply by adding the short time bounds. Precisely, we claim that for some small universal \(\delta _{1} > 0\) we have

$$ \| \langle D_{x} \rangle ^{\frac{1}{2}+\delta _{1}} P_{k} \partial u^{h} \|_{L^{2}([0,1]; L^{\infty})} \lesssim 1, \qquad n \geq 3, $$
(11.24)

respectively

$$ \| \langle D_{x} \rangle ^{\frac{1}{2}+\delta _{1}} P_{k} \partial u^{h} \|_{L^{4}([0,1]; L^{\infty})} \lesssim 1, \qquad n =2. $$
(11.25)

To see this, we distinguish cases depending on how \(k\) and \(h\) compare. We fix the dimension to \(n \geq 3\) for clarity.

a) If \(k \geq h\), then we simply apply (11.16) or (11.17), taking the loss from the number of intervals. For instance in three and higher dimensions we get for \(\delta _{1} \leq \delta _{0}\)

$$ \begin{aligned} \| \langle D_{x} \rangle ^{\frac{1}{2}+\delta _{1}} P_{k} \partial u^{h} \|_{L^{2}([0,1]; L^{\infty})} \lesssim & \ T_{h}^{-\frac{1}{2}} \sup _{I_{h}} \| \langle D_{x} \rangle ^{\frac{1}{2}+\delta _{1}} P_{k} \partial u^{h}\|_{L^{2}(I_{h}; L^{\infty})} \\ \lesssim & \ T_{h}^{-\frac{1}{2}} 2^{-\frac{k}{2}} \sup _{I_{h}} \| \langle D_{x} \rangle ^{1+\delta _{0}} P_{k} \partial u^{h}\|_{L^{2}(I_{h}; L^{\infty})} \\ \lesssim & \ T_{h}^{-\frac{1}{2}} 2^{-\frac{h}{2}} T_{h}^{- \frac{1}{2}} = 2^{(\frac{s_{1}-s}{s_{1}-s_{c}} -\frac{1}{2} )h} \leq 1, \end{aligned} $$

for a favourable choice of \(s_{1}\); for instance \(s_{1} = s+\frac{1}{4}\) suffices, as then \(\frac{s_{1}-s}{s_{1}-s_{c}} < \frac{1}{2}\). The two dimensional argument is similar.

b) If \(k< h\) instead, then we first write

$$ P_{k} u^{h} = P_{k} u^{k} + P_{k}(u^{h}-u^{k}). $$

Here the first term was already estimated before, while for the second we use (11.22) or (11.23), where the loss from the interval size is only in terms of \(k\) and not \(h\). In dimension three and higher this yields

$$ \begin{aligned} \|\langle D_{x} \rangle ^{\frac{1}{2}+\delta _{1}} P_{k}\partial (u^{h}-u^{k}) \|_{L^{2}([0,1]; L^{\infty})} \lesssim & \ T_{k}^{-\frac{1}{2}} \sup _{I_{k}} \| P_{k}\partial (u^{h}-u^{k})\|_{L^{2}(I_{k}; L^{\infty})} \\ \lesssim & \ T_{k}^{-\frac{1}{2}} 2^{-(s-\frac{1}{2}) k} 2^{( \frac{n+1}{2}+\delta )k} \\ =&\ 2^{( \frac{s_{1}-s}{2\sigma} -(s- \frac{1}{2})+\frac{n+1}{2}+\delta )k} \\ \lesssim & \ 2^{ (\frac{s_{1}-s}{2\sigma} - (s-s_{c})+ \delta )k} \leq 1, \end{aligned} $$

again for a good choice of \(s_{1}\) (same as above) and a small enough \(\delta \).

In particular, the estimates (11.24), respectively (11.25) allow us to estimate our control parameter ℬ as follows:

$$ \| {\mathcal {B}}\|_{L^{2}[0,1]} \lesssim 1, \qquad n \geq 3, $$
(11.26)

respectively

$$ \| {\mathcal {B}}\|_{L^{4}[0,1]} \lesssim 1, \qquad n =2 . $$
(11.27)

This in turn allows us to use Theorem 8.1 to control the energy growth for the full equation, and in particular to prove the bounds (11.6) and (11.7), thus closing part of the bootstrap loop, namely for the bounds (11.11) and for (11.12).

11.6 Strichartz estimates for the paradifferential flow

Our objective here is to establish Strichartz estimates with loss of derivatives for the linear paradifferential flow around \(u^{h}\). Thus, we consider an \(\mathcal {H}^{\frac{1}{2}}\) solution \(v\) of the paradifferential flow around \(u^{h}\), and we seek to estimate its dyadic pieces in the Strichartz norm, with frequency losses:

Proposition 11.2

Under the bootstrap assumptions (11.11), (11.12) and (11.13), \(\mathcal {H}^{r}\) solutions \(v\) for the linear paradifferential equation

$$ \partial _{\alpha }T_{g^{\alpha \beta}(\partial u^{h})} \partial _{ \beta }v = f $$
(11.28)

satisfy the Strichartz estimates (4.43) with \(S = S_{AIT}\) for all \(r \in {\mathbb{R}}\).

Compared with the full Strichartz bounds, here we have a loss of \(1/4\) derivative in dimension 3 and higher, respectively \(1/8\) derivative in dimension 2.

Proof

Our starting point is Theorem 4.12, which allows us to reduce the problem to proving the homogeneous Strichartz estimates (4.33) for the corresponding homogeneous equation, again for all real \(r\). To prove the proposition in this case, we have two tools at our disposal:

  1. (i)

    The energy estimates of Theorem 7.1. In view of the bounds (11.27) and (11.26), these give uniform \(\mathcal {H}^{r}\) bounds for \(v\),

    $$ \| v[\cdot ] \|_{L^{\infty }\mathcal {H}^{r}} \lesssim \|v[0]\|_{ \mathcal {H}^{r}}. $$
  2. (ii)

    The short time Strichartz estimates (4.33) with \(S=S_{ST}\) on the \(T_{h}\) time scale, provided by Theorem 10.6. Adding these with respect to the time intervals, we arrive at

    $$ \| |D|^{-\frac{n}{2}-\delta} \partial v \|_{L^{2}([0,1]; L^{\infty})} \lesssim T_{h}^{-\frac{1}{2}} \| v[0]\|_{\mathcal {H}^{\frac{1}{2}}}, \qquad n \geq 3, $$
    (11.29)

    respectively

    $$ \| |D|^{-\frac{9}{8}-\delta} \partial v \|_{L^{4}([0,1]; L^{\infty})} \lesssim T_{h}^{-\frac{1}{4}} \| v[0]\|_{\mathcal {H}^{\frac{5}{8}}}, \qquad n =2. $$
    (11.30)

Now we want to use these tools in order to prove the long term bounds (4.33) with \(S=S_{AIT}\) on the unit time scale. Given the expression for \(T_{h}\), our first observation is that the estimates (11.29), respectively (11.30), suffice for our bounds at dyadic frequencies \(2^{k}\) with \(k \geq h\), but not below that.

Thus, consider a lower frequency \(k < h\), and seek to estimate \(P_{k} v\). At this frequency, we have the correct estimate for the solution \(\tilde{v}\) to the linear paradifferential equation around \(u^{k}\). It remains to compare \(v\) and \(\tilde{v}\). For this we use the \(T_{P(u^{k})}\) flow, and we think of \(P_{k} v\) as an approximate solution for this flow,

$$ T_{P(u^{k})} P_{k} v = [T_{P(u^{k})},P_{k}] v + P_{k} \partial _{ \alpha }T_{g^{\alpha \beta}_{h} - g^{\alpha \beta}_{k}} \partial _{ \beta }v. $$

We can bound the source terms as follows, fixing the dimension to \(n \geq 3\):

$$ \begin{aligned}&\| T_{P(u^{k})} P_{k} v \|_{L^{1}(I_{k},H^{-\frac{1}{2}})} \\ &\quad \lesssim \left (\| \partial ^{2} u^{k} \|_{L^{1}(I_{k}, L^{\infty})} + 2^{k} \| P_{< k} (g(\partial u^{h}) - g(\partial u^{k})) \|_{L^{1}(I_{k},L^{ \infty})}\right ) \| v\|_{L^{\infty }\mathcal {H}^{\frac{1}{2}}}. \end{aligned}$$

To conclude it suffices to estimate

$$ \| \partial ^{2} u^{k} \|_{L^{1}(I_{k}, L^{\infty})} \lesssim 1 , $$
(11.31)
$$ \| P_{< k} (g(\partial u^{h}) - g(\partial u^{k})) \|_{L^{1}(I_{k},L^{ \infty})} \lesssim 2^{-k}. $$
(11.32)

The first bound follows from our earlier Strichartz estimates for \(u^{k}\), see (11.17), (11.16). For the second bound, we expand the difference over intermediate frequencies \(j\), and then it suffices to have

$$ \| P_{< k} (g'(\partial u^{j}) \partial v^{j}) \|_{L^{1}(I_{k},L^{ \infty})} \lesssim 2^{-k} 2^{-c(j-k)}, \qquad j > k, $$
(11.33)

with a positive constant \(c\) in order to allow for integration in \(j\). We expand paradifferentially, depending on the frequencies of the two factors above. It suffices to consider the following two cases:

a) \(v^{j}\) has frequency below \(2^{k}\). Then we use the Strichartz bounds for \(v^{j}\) over intervals \(I_{j}\), and then sum over such intervals. For instance in dimension \(n \geq 3\) we get

$$ \begin{aligned} \| P_{< k} \partial v^{j}\|_{L^{1}(I_{k};L^{\infty})} \lesssim & \ |I_{k}| |I_{j}|^{-\frac{1}{2}} \sup _{I_{j}} \| P_{< k} \partial v^{j}\|_{L^{2}(I_{j};L^{ \infty})} \\ \lesssim & \ 2^{(\frac{n}{2}+\delta ) k} 2^{-(s-\frac{1}{2}) j} \frac{|I_{k}|}{|I_{j}|} |I_{j}|^{\frac{1}{2}} \\ = & \ 2^{[\frac{n}{2}+\delta - \frac{s_{1}-s}{s_{1}-s_{c}}] k} 2^{-(s- \frac{1}{2} - \frac{s_{1}-s}{2(s_{1}-s_{c})}) j} \\ = & \ 2^{-k} 2^{[(s_{c}-s)(1- \frac{1}{2(s_{1}-s_{c})}) +\delta ]k } 2^{-(s- \frac{1}{2} - \frac{s_{1}-s}{2(s_{1}-s_{c})}) (j-k)}. \end{aligned} $$

Here the coefficient of \(j-k\) is negative by a large margin, while the coefficient of \(k\) in the middle factor is also negative since \(s_{1} - s_{c} > \frac{1}{2}\) and \(\delta \) is arbitrarily small. Hence we obtain a bound as desired in (11.33).
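The regrouping in the last step of the above display rests on the elementary identity (with \(\sigma = s_{1}-s_{c}\) and \(s_{c} = \frac{n}{2}+1\); a routine verification):

```latex
\frac{1}{2} - \frac{s_{1}-s}{2\sigma} = \frac{s-s_{c}}{2\sigma},
\qquad\text{whence}\qquad
\frac{n}{2}+\delta - \frac{s_{1}-s}{\sigma}
  = -1 + (s_{c}-s)\Bigl(1-\frac{1}{2\sigma}\Bigr)
    + \Bigl(s-\frac{1}{2}-\frac{s_{1}-s}{2\sigma}\Bigr) + \delta .
```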

b) The balanced case, where both frequencies have size \(2^{l}\) with \(l \geq k\). This is easier, as we have a better energy bound for the first factor. Hence in this case it is more efficient to estimate the output by applying Bernstein’s inequality first,

$$ \begin{aligned} &\| P_{< k} [P_{l}g'(\partial u^{j}) P_{l} \partial v^{j}] \|_{L^{1}(I_{k},L^{ \infty})} \\ &\quad \lesssim \ |I_{k}|^{\frac{1}{2}} 2^{\frac{nk}{2}} \| P_{l}g'(\partial u^{j}) \|_{L^{ \infty}(I_{k},L^{2})} \|P_{l} \partial v^{j} \|_{L^{2}(I_{k},L^{ \infty})} \\ &\quad \lesssim \ |I_{k}| |I_{j}|^{-\frac{1}{2}} 2^{\frac{nk}{2}} \| P_{l}g'(\partial u^{j}) \|_{L^{\infty }L^{2}} \sup _{I_{j}}\|P_{l} \partial v^{j} \|_{L^{2}(I_{j},L^{ \infty})} \\ &\quad \lesssim \ |I_{k}| |I_{j}|^{-\frac{1}{2}} 2^{\frac{nk}{2}} 2^{-(s-1) l} 2^{(\frac{n}{2}+\delta ) l} 2^{-(s-\frac{1}{2}) j} \\ &\quad = \ 2^{[\frac{n}{2} - \frac{s_{1}-s}{s_{1}-s_{c}}] k} 2^{( \frac{n}{2}-s+1+\delta ) l} 2^{-(s-\frac{1}{2} - \frac{s_{1}-s}{2(s_{1}-s_{c})}) j} \\ &\quad= \ 2^{-k} 2^{[(s_{c}-s)(2 - \frac{1}{2(s_{1}-s_{c})})+\delta ] k} 2^{( \frac{n}{2}-s+1+\delta ) (l-k)} 2^{-(s-\frac{1}{2} - \frac{s_{1}-s}{2(s_{1}-s_{c})}) (j-k)}, \end{aligned} $$

which is better than in case (a). □
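To make the comparison of the two cases explicit (this is a direct check of the exponents, not part of the argument above): since \(s > s_{c}\) and \(s_{1}-s_{c} > \frac{1}{2}\), so that \(\frac{1}{2(s_{1}-s_{c})} < 1\), we have

$$ (s_{c}-s)\Big(2 - \frac{1}{2(s_{1}-s_{c})}\Big) < (s_{c}-s)\Big(1 - \frac{1}{2(s_{1}-s_{c})}\Big) < 0, $$

so the middle factor in the balanced case carries a strictly larger negative power of \(2^{k}\) than in case (a).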

11.7 Strichartz estimates for the linearized flow

Our aim here is to show that the linearized flow around \(u^{h}\) is well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)), and satisfies Strichartz estimates with loss of derivatives:

Proposition 11.3

Under the bootstrap assumptions (11.11), (11.12) and (11.13), the linearized equation around \(u^{h}\) is well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)), and its solutions satisfy the full Strichartz estimates (4.43) with \(S = S_{AIT}\).

Here we use the analysis in Section 9. Precisely, Theorem 9.1 there shows that the above proposition follows directly from the corresponding result in Proposition 11.2 for the linear paradifferential equation.

11.8 Closing the bootstrap argument

Combining the Strichartz estimates for the linear paradifferential equation in Proposition 11.2 with the result of Theorem 9.1, it follows that the linearized flow around \(u^{h}\) is well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)), with the same Strichartz estimates as in Proposition 11.2, which is exactly part (b) of Theorem 11.1. As a consequence, the initial data bound (11.5) for \(v\) implies the uniform bound (11.10), which in turn closes the bootstrap assumption (11.13).

11.9 The well-posedness result

In order to obtain a complete well-posedness argument, we follow the outline in [21], and measure the size of the functions \(u^{h}\) and \(v^{h}\) in terms of frequency envelopes. Precisely, we consider a normalized frequency envelope \(\epsilon c_{h}\) for \(u[0]\) in \(\mathcal {H}^{s}\). Then for the localized initial data we have the bounds

$$ \| u^{h}[0]\|_{\mathcal {H}^{s}} \lesssim \epsilon , $$
(11.34)
$$ \| u^{h}[0]\|_{\mathcal {H}^{s+1}} \lesssim 2^{h} \epsilon c_{h}. $$
(11.35)

On the other hand, we now fix the dimension to \(n \geq 3\) and measure \(v^{h}\) in \(\mathcal {H}^{\frac{1}{2}}\); for the initial data we have

$$ \begin{aligned} \|v^{h}[0]\|_{\mathcal {H}^{\frac{1}{2}}} \lesssim 2^{-(s-\frac{1}{2})h} c_{h}, \qquad n\geq 3. \end{aligned} $$
(11.36)

Then by Theorem 11.1, we obtain corresponding uniform bounds for the solutions on the time interval \([0,1]\),

$$ \| u^{h}[\cdot ]\|_{L^{\infty}([0,1];\mathcal {H}^{s})} \lesssim \epsilon , $$
(11.37)
$$ \| u^{h}[\cdot ]\|_{L^{\infty}([0,1];\mathcal {H}^{s+1})} \lesssim 2^{h} \epsilon c_{h}. $$
(11.38)

Similarly, the linearized increments \(v^{h}\) satisfy the uniform bounds

$$ \| v^{h}[\cdot ]\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \lesssim \epsilon 2^{-(s-\frac{1}{2})h} c_{h}. $$
(11.39)

Integrating the last bound with respect to \(h\), we obtain the difference bounds

$$ \| (u^{h} - u^{k})[\cdot ]\|_{L^{\infty}([0,1];\mathcal {H}^{ \frac{1}{2}})} \lesssim \epsilon 2^{-(s-\frac{1}{2})h} c_{h}, \qquad h < k. $$
(11.40)
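To spell out the integration step (assuming, as the terminology "linearized increments" suggests, that \(v^{h} = \frac{d}{dh} u^{h}\) along the continuous family of regularized solutions), one may write

$$ \|(u^{h} - u^{k})[\cdot ]\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \leq \int _{h}^{k} \| v^{h'}[\cdot ]\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \, dh' \lesssim \epsilon \int _{h}^{k} 2^{-(s-\frac{1}{2})h'} c_{h'} \, dh', $$

and the last integral is \(\lesssim 2^{-(s-\frac{1}{2})h} c_{h}\) by the slow variation of the frequency envelope \(c_{h}\).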

This implies that the limit

$$ u = \lim _{h \to \infty} u^{h} $$

exists in \(C([0,1];\mathcal {H}^{\frac{1}{2}})\). In view of (11.37), the limit \(u\) will also satisfy

$$ \| u[\cdot ]\|_{L^{\infty}([0,1];\mathcal {H}^{s})} \lesssim \epsilon . $$

We can also upgrade this convergence to the stronger \(\mathcal {H}^{s}\) topology. To see this, we consider unit increments in \(h\), and compare \(u^{h}\) with \(u^{h+1}\), using (11.38) on one hand, and (11.40) on the other hand. This yields

$$ \| u^{h} - u^{h+1}\|_{C([0,1];\mathcal {H}^{s+1})} \lesssim 2^{h} \epsilon c_{h}, $$
(11.41)

respectively

$$ \| u^{h} - u^{h+1}\|_{C([0,1];\mathcal {H}^{\frac{1}{2}})} \lesssim \epsilon 2^{-(s-\frac{1}{2})h} c_{h}. $$
(11.42)

These two bounds balance exactly at frequency \(2^{h}\), and together control the \(\mathcal {H}^{s}\) norm with decay away from frequency \(2^{h}\). Hence the differences are almost orthogonal in \(\mathcal {H}^{s}\), and, summing them up, we obtain
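One way to make the almost orthogonality explicit is a standard frequency envelope computation: applying a Littlewood-Paley projection \(P_{j}\) and using the elementary bounds \(\| P_{j} w \|_{\mathcal {H}^{s}} \lesssim 2^{-j} \| w \|_{\mathcal {H}^{s+1}}\) and \(\| P_{j} w \|_{\mathcal {H}^{s}} \lesssim 2^{(s-\frac{1}{2})j} \| w \|_{\mathcal {H}^{\frac{1}{2}}}\), the two difference bounds combine to

$$ \| P_{j} (u^{h} - u^{h+1})\|_{C([0,1];\mathcal {H}^{s})} \lesssim \epsilon \, \min \big\{ 2^{-(j-h)}, \, 2^{-(s-\frac{1}{2})(h-j)} \big\} \, c_{h}, $$

with exponential off-diagonal decay in \(|j-h|\), which is what allows the summation over \(h\).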

$$ \| u^{h} - u^{k}\|_{C([0,1];\mathcal {H}^{s})} \lesssim \epsilon c_{[h,k]}. $$
(11.43)

This implies uniform convergence in \(\mathcal {H}^{s}\). Thus our solution \(u\) is uniquely identified as the strong \(\mathcal {H}^{s}\) uniform limit of \(u^{h}\).

The continuous dependence and the weak-Lipschitz dependence follow exactly as in [21].