Abstract
It has long been conjectured that for nonlinear wave equations that satisfy a nonlinear form of the null condition, the low regularity well-posedness theory can be significantly improved compared to the sharp results of Smith-Tataru for the generic case. The aim of this article is to prove the first result in this direction, namely for the time-like minimal surface equation in the Minkowski space-time. Further, our improvement is substantial, namely by \(3/8\) derivatives in two space dimensions and by \(1/4\) derivatives in higher dimensions.
1 Introduction
The question of local well-posedness for nonlinear wave equations with rough initial data is a fundamental one in the study of nonlinear waves, and has received a lot of attention over the years. The result of Smith and Tataru [38], proved almost 20 years ago, provides the sharp regularity threshold for generic nonlinear wave equations, in view of Lindblad’s counterexample [30]. On the other hand, it has also been conjectured [45] that for nonlinear wave equations satisfying a suitable nonlinear null condition, the result of [38] can be improved and the well-posedness threshold lowered. In this paper we provide the first result that proves the validity of this conjecture, for a representative equation in this class, namely the hyperbolic minimal surface equation. Further, our improvement turns out to be substantial; precisely, we gain \(3/8\) derivatives in two space dimensions and \(1/4\) derivatives in higher dimensions. At this regularity level, the Lorentzian metric \(g\) in our problem is no better than \(C_{x,t}^{\frac{1}{4}+} \cap L^{2}_{t} C_{x}^{\frac{1}{2}+}\) (\(C_{x,t}^{ \frac{3}{8}+} \cap L^{4}_{t} C_{x}^{\frac{1}{2}+}\) in \(2D\)), far below anything studied before.
Most of the ideas introduced in this paper will likely extend to other nonlinear wave models, and open the way toward further progress in the study of low regularity solutions.
1.1 The minimal surface equation in Minkowski space
Let \(n \geq 2\), and \({\mathfrak {M}}^{n+2}\) be the \(n+2\) dimensional Minkowski space-time. A codimension one time-like submanifold \(\Sigma \subset {\mathfrak {M}}^{n+2}\) is called a minimal surface if it is locally a critical point for the area functional
where the area element is measured relative to the Minkowski metric. A standard way to think of this equation is by representing \(\Sigma \) as a graph over \({\mathfrak {M}}^{n+1}\),
where \(u\) is a real valued function
which satisfies the constraint
expressing the condition that its graph is a time-like surface in \({\mathfrak {M}}^{n+2}\).
Then the surface area functional takes the form
Interpreting this as a Lagrangian, the minimal surface equation can be thought of as the associated Euler-Lagrange equation, which takes the form
Under the condition (1.1), the above equation is a quasilinear wave equation.
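For the reader's convenience, we record the standard form of these objects; the tags (1.1)–(1.3) follow the references in the text, and the formulas below are reconstructions in the standard conventions rather than verbatim quotations. The graph representation and the time-like constraint are

$$ \Sigma = \{ (x, u(x)) : x \in {\mathfrak {M}}^{n+1} \}, \qquad u: {\mathfrak {M}}^{n+1} \to {\mathbb{R}}, $$

$$ 1 + m^{\alpha \beta} \partial _{\alpha} u \, \partial _{\beta} u > 0, $$ (1.1)

the area (Lagrangian) functional is

$$ \mathcal {L}(u) = \int \sqrt{1 + m^{\alpha \beta} \partial _{\alpha} u \, \partial _{\beta} u} \; dx \, dt, $$ (1.2)

and the associated Euler-Lagrange equation reads

$$ \partial _{\alpha} \left( \frac{m^{\alpha \beta} \partial _{\beta} u}{\sqrt{1 + m^{\mu \nu} \partial _{\mu} u \, \partial _{\nu} u}} \right) = 0. $$ (1.3)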
The left hand side of the last equation can also be interpreted as the mean curvature of the hypersurface \(\Sigma \), and as such the minimal surface equation is alternatively described as the zero mean curvature flow.
In addition to the above geometric interpretation, the minimal surface equation for time-like surfaces in the Minkowski space is also known as the Born-Infeld model in nonlinear electromagnetism [50], as well as a model for evolution of branes in string theory [15].
On the mathematical side, the question of global existence for small, smooth and localized initial data was considered in work of Lindblad [31], Brendle [8], Stefanov [40] and Wong [48]. The stability of a nonflat steady solution, called the catenoid, was studied in [10, 29]. Some blow-up scenarios due to failure of immersivity were investigated by Wong [49]. Minimal surfaces have also been studied as singular limits of certain semilinear wave equations by Jerrard [22]. The local well-posedness question fits into the corresponding theory for the broader class of quasilinear wave equations, but there is also one result that is specific to minimal surfaces, due to Ettinger [11]; this is discussed later in the paper.
In our study of the minimal surface equation, the above way of representing it is less useful, and instead it is better to think of it in geometric terms. In particular the fact that the above Lagrangian (1.2) and the equation (1.3) are formulated relative to a background Minkowski metric is absolutely non-essential; one may instead use any flat Lorentzian metric. This is no surprise, since any two such metrics are equivalent via a linear transformation. Perhaps less obvious is the fact that the equations may actually be written in an identical fashion, independent of the background metric; see Remark 3.1 in Section 3.
For full details on the structure of the equation we refer the reader to Section 3 of the paper, but here we review the most important facts.
The main geometric object is the metric \(g\) that is the trace of the Minkowski metric in \({\mathfrak {M}}^{n+2}\) on \(\Sigma \), and which, expressed in the \((t=x_{0},x_{1},\ldots , x_{n})\) coordinates, has the form
where \(m_{\alpha \beta}\) denotes the Minkowski metric with signature \((-1, 1, \ldots, 1)\) in \({\mathfrak {M}}^{n+1}\). Since \(\Sigma \) is time-like, this is also a Lorentzian metric. This has determinant
and the dual metric is
Here, and later in the paper, we carefully avoid raising indices with respect to the Minkowski metric. Instead, all raised indices in this paper will be with respect to the metric \(g\).
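Consistent with the tag (1.4) used later in the text, the standard formulas here (again reconstructions rather than quotations; indices on \(m\) are raised with \(m\) itself only inside these explicit expressions) are

$$ g_{\alpha \beta} = m_{\alpha \beta} + \partial _{\alpha} u \, \partial _{\beta} u, $$ (1.4)

$$ \det g = - \left( 1 + m^{\mu \nu} \partial _{\mu} u \, \partial _{\nu} u \right), $$

$$ g^{\alpha \beta} = m^{\alpha \beta} - \frac{ m^{\alpha \mu} \partial _{\mu} u \; m^{\beta \nu} \partial _{\nu} u}{1 + m^{\mu \nu} \partial _{\mu} u \, \partial _{\nu} u}, $$

the last identity being an instance of the Sherman-Morrison formula for the inverse of a rank one perturbation.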
Relative to this metric, the equation (1.3) can be expressed in the form
where \(\Box _{g}\) is the covariant d’Alembertian, and which in this problem will be shown to have the simple expression
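The geometric form of the equation, referenced later as (1.7), together with the simple expression of \(\Box _{g}\) specific to this problem, should read (a reconstruction; for the metric (1.4) the first order terms in the covariant d’Alembertian cancel):

$$ \Box _{g} u = 0, $$ (1.7)

$$ \Box _{g} = g^{\alpha \beta} \partial _{\alpha} \partial _{\beta}. $$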
An important role will also be played by the associated linearized equation, which, as it turns out, may be easily expressed in divergence form as
Our objective in this paper will be to study the local well-posedness of the associated Cauchy problem with initial data at \(t = 0\),
where the initial data \((u_{0},u_{1})\) is taken in classical Sobolev spaces,
and is subject to the constraint
Here we use the following notation for the Cauchy data in (1.3) at time \(t\),
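In standard notation (a reconstruction, with the constraint obtained by restricting (1.1) to \(t = 0\)), the Cauchy data and the constraint are

$$ u[t] = (u(t), \partial _{t} u(t)), \qquad u[0] = (u_{0}, u_{1}) \in \mathcal {H}^{s} := H^{s} \times H^{s-1}, $$

$$ u_{1}^{2} < 1 + |\nabla _{x} u_{0}|^{2}. $$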
We aim to investigate the range of exponents \(s\) for which local well-posedness holds, and significantly improve the lower bound for this range.
1.2 Nonlinear wave equations
The hyperbolic minimal surface equation (1.3) can be seen as a special case of more general quasilinear wave equations, which have the form
where, again, \(g^{\alpha \beta}\) is assumed to be Lorentzian, but without any further structural properties. The simplest case is when \(u\) is a scalar, real valued function. But one may equally allow \(u\) to be a vector-valued function, in which case we think of the left hand side of the equation as being in diagonal form, with the coupling occurring only via \(g\) and \(N\). This generic equation will serve as a reference.
As a starting point, we note that the equation (1.3) (and also (1.13) if \(N=0\)) admits the scaling law
This allows us to identify the critical Sobolev exponent as
Heuristically, \(s_{c}\) serves as a universal threshold for local well-posedness, i.e. we have to have \(s > s_{c}\). Taking a naive view, one might think of trying to reach the scaling exponent \(s_{c}\). However, this is a quasilinear wave equation, and getting to \(s_{c}\) has so far proved impossible in any problem of this type.
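For definiteness, we record our reconstruction of the scaling law and of the critical exponent, cross-checked against the statement below that the classical threshold \(s > \frac{n}{2}+2\) lies one derivative above scaling:

$$ u(t,x) \to u_{\lambda}(t,x) = \lambda ^{-1} u(\lambda t, \lambda x), \qquad \lambda > 0, $$

$$ s_{c} = \frac{n}{2} + 1. $$

Indeed, since the equation depends on \(u\) only through \(\partial u\), the rescaling leaves \(\partial u\) invariant as a function of the rescaled variables, and \(\|\partial u_{\lambda}(0)\|_{\dot{H}^{s-1}}\) is scale invariant exactly when \(s - 1 = \frac{n}{2}\).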
As a good threshold from above, one might start with the classical well-posedness result, due to Hughes, Kato, and Marsden [18], which asserts that local well-posedness holds for \(s > s_{c}+1\). This applies to all equations of the form (1.13), and can be proved solely by using energy estimates. These have the form
They may also be restated in terms of quasilinear energy functionals \(E^{s}\) that have the following two properties:
(a) Coercivity,
$$ E^{s}(u[t]) \approx \| u[t]\|_{\mathcal {H}^{s}}^{2}. $$
(b) Energy growth,
$$ \frac{d}{dt} E^{s}(u) \lesssim \| \partial ^{2} u\|_{L^{\infty}} \cdot E^{s}(u). $$ (1.15)
To close the energy estimates, it then suffices to use Sobolev embeddings, which allow one to bound the above \(L^{\infty}\) norm, which we will refer to as a control parameter, in terms of the \(\mathcal {H}^{s}\) Sobolev norm provided that \(s > \frac{n}{2}+2\), which is one derivative above scaling.
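To spell out the last step (a standard argument, sketched here in our notation): integrating (1.15) via Gronwall's inequality and using the coercivity (a) gives

$$ \| u[t] \|_{\mathcal {H}^{s}}^{2} \lesssim \| u[0] \|_{\mathcal {H}^{s}}^{2} \exp \left( C \int _{0}^{t} \| \partial ^{2} u (\tau ) \|_{L^{\infty}} \, d\tau \right), $$

while the Sobolev embedding \(\| \partial ^{2} u \|_{L^{\infty}} \lesssim \| u[t] \|_{\mathcal {H}^{s}}\), valid for \(s > \frac{n}{2}+2\), allows one to close the bound on a time interval depending only on the size of the data.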
The reason a derivative is lost in the above analysis is that one would only need to bound \(\|\partial ^{2} u\|_{L^{1} L^{\infty}}\), whereas the norm that is actually controlled is \(\|\partial ^{2} u\|_{L^{\infty }L^{\infty}}\); this exactly accounts for the one derivative difference in scaling. It also suggests that the natural way to improve the classical result is to control the \(L^{p} L^{\infty}\) norm directly. This is indeed possible in the context of the Strichartz estimates, which in dimension three and higher give the bound
with another \(\epsilon \) derivatives loss in three space dimensions. When true, such a bound yields well-posedness for \(s > \frac{n+3}{2}\), which is \(1/2\) derivatives above scaling. The numerology changes slightly in two space dimensions, where the best possible Strichartz estimate has the form
which is \(3/4\) derivatives above scaling.
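In our reading, the estimates referenced in this paragraph (reconstructed from the surrounding numerology, not quoted) are, in dimension three and higher,

$$ \| \partial ^{2} u \|_{L^{2}_{t} L^{\infty}_{x}} \lesssim \| u[0] \|_{\mathcal {H}^{s}}, \qquad s > \frac{n+3}{2}, $$

and, in two space dimensions,

$$ \| \partial ^{2} u \|_{L^{4}_{t} L^{\infty}_{x}} \lesssim \| u[0] \|_{\mathcal {H}^{s}}, \qquad s > \frac{n}{2} + \frac{7}{4}, $$

which sit \(1/2\), respectively \(3/4\), derivatives above the scaling exponent \(s_{c} = \frac{n}{2}+1\).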
The difficulty in using Strichartz estimates is that, while these are well known in the constant coefficient case [12, 24] and even for smooth variable coefficients [23, 33], the situation is considerably more delicate for rough coefficients. Indeed, as it turned out, the full Strichartz estimates are true for \(C^{2}\) metrics, see [35] (\(n = 2,3\)), [44] (all \(n\)), but not, in general, for \(C^{\sigma}\) metrics when \(\sigma <2\), see the counterexamples of [36, 37]. This difficulty was resolved in two stages:
(i) Semiclassical time scales and Strichartz estimates with loss of derivatives. The idea here, which applies even for \(C^{\sigma}\) metrics with \(\sigma <2\), is that, associated to each dyadic frequency scale \(2^{k}\), there is a corresponding “semiclassical” time scale \(T_{k} = 2^{-\alpha k}\), with \(\alpha \) dependent on \(\sigma \), so that full Strichartz estimates hold at frequency \(2^{k}\) on the scale \(T_{k}\). Strichartz estimates with loss of derivatives are then obtained by summing up the short time estimates with respect to the time intervals, separately at each frequency. This idea was independently introduced in [5] and [43], and further refined in [4] and [46].
(ii) Wave packet coherence and parametrices. The observation here is that in the study of nonlinear wave equations such as (1.13), in addition to Sobolev-type regularity for the metric, we have an additional piece of information, namely that the metric itself can be seen as a solution to a nonlinear wave equation. This idea was first introduced and partially exploited in [26], but was brought to full fruition in [38], where it was shown that almost loss-less Strichartz estimates hold for the solutions to (1.13) at exactly the correct regularity level.
The result in [38] represents the starting point of the present work, and is concisely stated as follows:
Theorem 1.1
Smith-Tataru [38]
(1.13) is locally well-posed in \(\mathcal {H}^{s}\) provided that
respectively
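The thresholds in the theorem are not displayed above; based on the surrounding discussion (the Strichartz numerology and the gains announced in the introduction), they should read

$$ s > \frac{n}{2} + \frac{7}{4} \quad \text{for } n = 2, $$

respectively

$$ s > \frac{n+3}{2} \quad \text{for } n = 3, 4, 5. $$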
As part of this result, almost loss-less Strichartz estimates were obtained both directly for the solution \(u\), and more generally for the associated linearized evolution. We will return to these estimates in Section 10 for a more detailed statement and an in-depth discussion.
The optimality of this result, at least in dimension three, follows from work of Lindblad [30], see also the more recent two dimensional result in [34]. However, this counterexample should only apply to “generic” models, and the local well-posedness threshold might possibly be improved in problems with additional structure, i.e. some form of null condition.
Moving forward, we recall that in [45], a null condition was formulated for quasilinear equations of the form (1.13).
Definition 1.2
[45]
The nonlinear wave equation (1.13) satisfies the nonlinear null condition if
In this definition the vector \(p\) is a placeholder for the \(\partial u\) variable in (1.13); for added generality, we also allow for the dependence of \(g\) on the undifferentiated \(u\).
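A standard formulation of the condition (1.18), reconstructed here (compare the original definition in [45]), is: for all \(u\), \(p\) and all covectors \(\xi \),

$$ \frac{\partial g^{\alpha \beta}}{\partial p_{\gamma}} (u,p) \, \xi _{\alpha} \xi _{\beta} \xi _{\gamma} = 0 \qquad \text{whenever } g^{\alpha \beta}(u,p) \, \xi _{\alpha} \xi _{\beta} = 0. $$ (1.18)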
Here we use the terminology “nonlinear null condition” in order to distinguish it from the classical null condition, which is relative to the Minkowski metric, and was heavily used in the study of global well-posedness for problems with small localized data, see [25] as well as the books [17, 39]. In geometric terms, this null condition may be seen as a cancellation condition for the self-interactions of wave packets traveling along null geodesics. In Section 3 we verify that the minimal surface equation indeed satisfies the nonlinear null condition.
Further, it was conjectured in [45] that, for problems satisfying (1.18), the local well-posedness threshold can be lowered below the one in [38]. This conjecture has remained fully open until now, though one should mention two results, in [27] for the Einstein equations and in [11] for the minimal surface equation, where the endpoint in Theorem 1.1 is reached but not crossed.
The present work provides the first positive result in this direction, specifically for the minimal surface equation. Indeed, not only are we able to lower the local well-posedness threshold of Theorem 1.1, but in effect we obtain a substantial improvement, namely by \(3/8\) derivatives in two space dimensions and by \(1/4\) derivatives in higher dimensions.
1.3 The main result
Our main result, stated in a succinct form, is as follows:
Theorem 1.3
The Cauchy problem for the minimal surface equation (1.10) is locally well-posed for initial data \(u[0]\) in \(\mathcal {H}^{s}\) that satisfy the constraint (1.12), where
respectively
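The thresholds (1.19), (1.20) are not displayed above; inferring them from the Theorem 1.1 thresholds minus the stated gains of \(3/8\), respectively \(1/4\), derivatives (and cross-checking with the metric regularity \(C^{\frac{3}{8}+}\), respectively \(C^{\frac{1}{4}+}\), claimed in the abstract), they should read

$$ s > \frac{n}{2} + \frac{11}{8} \quad \text{for } n = 2, $$

respectively

$$ s > \frac{n}{2} + \frac{5}{4} \quad \text{for } n = 3, 4, 5. $$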
Remark 1.4
The constraint \(n \leq 5\) in this result is inherited from Theorem 1.1, or more precisely its full formulation provided in Theorem 10.1, which also includes the Strichartz estimates. That result is used as a black box in the present paper, so a proof of Theorem 10.1 in higher dimensions would directly imply that the above result also holds in higher dimensions.
The result is valid regardless of the \(\mathcal {H}^{s}\) size of the initial data. Here we interpret local well-posedness in a strong Hadamard sense, including:
- existence of solutions in the class \(u[\, \cdot \,] \in C([0,T];\mathcal {H}^{s})\), with \(T\) depending only on the \(\mathcal {H}^{s}\) size of the initial data;
- uniqueness of solutions, in the sense that they are the unique limits of smooth solutions;
- higher regularity, i.e. if in addition the initial data \(u[0] \in \mathcal {H}^{m}\) with \(m > s\), then the solution satisfies \(u[\, \cdot \, ] \in C([0,T];\mathcal {H}^{m})\), with a bound depending only on the \(\mathcal {H}^{m}\) size of the data,
$$ \| u[\, \cdot \, ]\|_{C([0,T];\mathcal {H}^{m})} \lesssim \| u[0]\|_{ \mathcal {H}^{m}}; $$
- continuous dependence in \(\mathcal {H}^{s}\), i.e. continuity of the data to solution map
$$ \mathcal {H}^{s} \ni u[0] \to u [ \, \cdot \, ] \in C([0,T];\mathcal {H}^{s}); $$
- weak Lipschitz dependence, i.e. for two \(\mathcal {H}^{s}\) solutions \(u\) and \(v\) we have the difference bound
$$ \| u[\, \cdot \, ]-v[\, \cdot \, ]\|_{C([0,T];\mathcal {H}^{\frac{1}{2}})} \lesssim \| u[0]-v[0]\|_{\mathcal {H}^{\frac{1}{2}}}, $$
where the exponent \(\frac{1}{2}\) is replaced by \(\frac{5}{8}\) in two space dimensions.
We remark on the weak Lipschitz dependence, which in more classical results is proved for a much larger range of Sobolev exponents. Here the need for balanced estimates, together with a loss of symmetry in the linearized equation, has the effect of limiting this range to a small neighbourhood of the exponent \(\frac{1}{2}\). For the present purposes a single exponent suffices.
In addition to the above components of the local well-posedness result, a key intermediate role in the proof of the above theorem is played by the Strichartz estimates, not only for the solution \(u\), but also, more importantly, for the linearized problem
as well as its paradifferential counterpart
Here the paraproducts are defined using the Weyl quantization, see Section 2.2 for more details. For later reference, we state the Strichartz estimates in a separate theorem:
Theorem 1.5
The following properties hold for every solution \(u\) to the minimal surface equation as in Theorem 1.3, in the corresponding time interval \([0,T]\):
a) There exists some \(\delta _{0} > 0\), depending on \(s\) in (1.19), (1.20), such that the solution \(u\) satisfies the Strichartz estimates
b) Both the linearized equation (1.21) and its paradifferential version (1.22) are well-posed in \(\mathcal {H}^{\frac{5}{8}}\) for \(n=2\), respectively \(\mathcal {H}^{\frac{1}{2}}\) for \(n = 3, 4, 5\), and the following Strichartz estimates hold for each \(\delta > 0\):
respectively
We note that the Strichartz estimates in both parts (a) and (b) have derivative losses, namely \(1/8\) derivatives in the \(L^{4} L^{\infty}\) bound in two dimensions, respectively \(1/4\) derivatives in higher dimensions. These estimates only represent the tip of the iceberg. One may also consider the inhomogeneous problem, allow source terms in dual Strichartz spaces, etc. These and other variations that play a role in this paper are discussed in Section 4.
To understand the new ideas in the proof of our main theorem, we recall the two key elements of the proof of the result in [38], namely (i) the classical energy estimates (1.15) and (ii) the nearly lossless Strichartz estimates; at the time, the chief difficulty was to prove the Strichartz estimates.
In this paper we completely turn the tables, taking part (ii) above for granted, and instead working to improve the energy estimates. Let us begin with a simple observation: the minimal surface equation (1.7) has a cubic nonlinearity, which allows one to replace (1.15) with
This is what one calls a cubic energy estimate, which is useful in the study of long time solutions but does not yet help with the low regularity well-posedness question. The key to progress lies in developing a much stronger form of this bound, which roughly has the form
where the two control norms on the right are now balanced, and only require \(1/2\) derivative less than (1.26). This is what we call a balanced energy estimate, which may only hold for a very carefully chosen energy functional \(E^{s}\).
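In schematic form (our reconstruction of the displays tagged (1.26) and (1.27) in the text), the cubic bound and its balanced strengthening read

$$ \frac{d}{dt} E^{s}(u) \lesssim \| \partial u \|_{L^{\infty}} \| \partial ^{2} u \|_{L^{\infty}} \, E^{s}(u), $$ (1.26)

$$ \frac{d}{dt} E^{s}(u) \lesssim _{{\mathcal {A}}} \| |D_{x}|^{\frac{1}{2}} \partial u \|_{L^{\infty}}^{2} \, E^{s}(u), $$ (1.27)

where each of the two factors in (1.27) sits at the level of \(3/2\) derivatives of \(u\), half a derivative below the \(\partial ^{2} u\) factor in (1.26).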
This is an idea that originates in our recent work on 2D water waves (see [1]), where balanced energy estimates are also used in order to substantially lower the low regularity well-posedness threshold. Going back further, this has its roots in earlier work of the last two authors and their collaborators [19, 20], in the context of trying to apply normal form methods in order to obtain long time well-posedness results in quasilinear problems. There we have introduced what we called the modified energy method, which in a nutshell asserts that in quasilinear problems it is far better to modify the energies in a normal form fashion, rather than to transform the equation. It was the cubic energy estimates of [20] that were later refined in [1] to balanced energy estimates. Along the way, we have also borrowed and adapted another idea from Alazard and Delort [2, 3], which is to prepare the problem with a partial normal form transformation, and is a part of their broader concept of paradiagonalization; that same idea is also used here.
There are several major difficulties in the way of proving energy estimates such as the ones in (1.27):
- The normal form structure is somewhat weaker in the case of the minimal surface equation, compared to water waves. As a consequence, we have to carefully understand which components of the equation can be improved with a normal form analysis and which cannot, and thus have to be estimated directly.
- Not only are the energy functionals \(E^{s}\) not explicit, but they also have to be constructed in a very delicate way, following a procedure that is reminiscent of Tao’s renormalization idea in the context of wave maps [42], as well as the subsequent work [47] of the third author on the same problem.
- Keeping track of symbol regularities in our energy functionals and in the proof of the energy estimates is also a difficult task. To succeed, we adapt and refine a suitable notion of paracontrolled distributions, an idea that has already been used successfully in the realm of stochastic PDEs [13, 14].
- The balanced energy estimates need to be proved not only for the full equation, but also for the associated linear paradifferential equation, as a key intermediate step, as well as for the full linearized flow. In particular, when linearizing, some of the favourable normal form structure (or null structure, in the language of nonlinear wave equations) is lost, and the proofs become considerably more complex.
Finally, the Strichartz estimates of [38] cannot be used directly here. Instead, we are able to reformulate them in a paradifferential fashion, and to apply them on appropriate semiclassical time scales. After interval summation, this leads to Strichartz estimates on the unit time scale but with derivative losses. Precisely, in our main Strichartz estimates, whose aim is to bound the control parameters in (1.27), we end up losing essentially \(1/8\) derivatives in two space dimensions, and \(1/4\) derivatives in higher dimensions. These losses eventually determine the regularity thresholds in our main result in Theorem 1.3.
One consequence of these energy estimates is the following continuation result for the solutions:
Theorem 1.6
The \(\mathcal {H}^{s}\) solution \(u\) given by Theorem 1.3 can be continued for as long as the following integral remains finite:
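While the integral is not displayed above, in view of the role of the control parameter ℬ in the energy estimates (ℬ is defined in Section 5, at the level of \(\| |D_{x}|^{\frac{1}{2}} \partial u \|_{L^{\infty}}\)), the continuation criterion should be read, roughly, as

$$ \int _{0}^{T} {\mathcal {B}}^{2}(t) \, dt < \infty . $$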
1.4 An outline of the paper
Paraproducts and paradifferential calculus.
The bulk of the paper is written in the language of paradifferential calculus. The notations and some of the basic product and paracommutator bounds are introduced in Section 2. Importantly, we use the Weyl quantization throughout; this plays a substantial role as differences between quantizations are not always perturbative in our analysis. Also of note, we emphasize the difference between balanced and unbalanced bounds, so some of our \(\Psi \)DO product or commutator expansions have the form
The geometric form of the minimal surface equation.
While the flat d’Alembertian may naively appear to play a role in the expansion (1.3) of the minimal surface equation, this is not at all useful, and instead we need to adopt a geometric viewpoint. As a starting point, in Section 3 we consider several equivalent formulations of the minimal surface equation, leading to its geometric form in (1.7). This is based on the metric \(g\) associated to the solution \(u\) by (1.4), whose dual we also compute. Two other conformally equivalent metrics will also play a role. In the same section we derive the linearized equation, and also introduce the associated linear paradifferential flow.
Strichartz estimates.
As explained earlier, Strichartz estimates play a major role in our analysis. These are applied to several equations, namely the full evolution, the linear paradifferential evolution and finally the linearized equation; in the present paper, we view the bounds for the paradifferential equation as the core ones, and the other bounds as derived bounds, though not necessarily in a directly perturbative fashion. The Strichartz estimates admit a number of formulations: in direct form for the homogeneous flow, in dual form for the inhomogeneous one, or in the full form. The aim of Section 4 is to introduce all of these forms of the Strichartz estimates, as well as to describe the relations between them, in the context of this paper. A new idea here is to allow source terms that are time derivatives of distributions in appropriate spaces; this is achieved by reinterpreting the wave equation as a system.
Control parameters in energy estimates.
We begin Section 5 by defining the control parameters \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\) and ℬ, which will play a fundamental role in our energy estimates. Here \({\mathcal {A}}\) and \({\mathcal {A}^{\sharp }}\) are scale invariant norms, at the level of \(\| \partial u\|_{L^{\infty}}\), which will remain small uniformly in time. ℬ, on the other hand, is time dependent, at the level of \(\||D_{x}|^{\frac{1}{2}} \partial u\|_{L^{\infty}}\), and will control the energy growth. Typically, our balanced cubic energy estimates will have the form
To propagate energy bounds we will need to know that \({\mathcal {B}}\in L^{2}_{t}\). Also in the same section we prove a number of core bounds for our solutions in terms of the control parameters.
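In the shape suggested by this discussion (our schematic reconstruction), the balanced cubic energy estimates read

$$ \frac{d}{dt} E^{s}(u) \lesssim _{{\mathcal {A}}} {\mathcal {B}}^{2} \, E^{s}(u), $$

so that by Gronwall's inequality the energy remains under control precisely when \({\mathcal {B}} \in L^{2}_{t}\).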
The multiplier method and paracontrolled distributions.
Both the construction of our energies and the proof of the energy estimates are based on a paradifferential implementation of the multiplier method, which leads to space-time identities of the form
in a paradifferential format, where the vector field \(X\) is our multiplier and \(E_{X}\) is its associated energy, while \(R(u)\) is the energy flux term which will have to be estimated perturbatively. A fundamental difficulty is that the multiplier \(X\), which should heuristically be at the regularity level of \(\partial u\), cannot be chosen algebraically, and instead has to be constructed in an inductive manner relative to the dyadic frequency scales. In order to accurately quantify the regularity of \(X\), in Section 6 we use and refine the notion of paracontrolled distributions; in a nutshell, while \(X\) may not be chosen to be a function of \(\partial u\), it will still have to be paracontrolled by \(\partial u\), in the sense made precise there.
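In its classical, non-paradifferential shape (a schematic version in our own notation), the multiplier identity reads

$$ \int _{0}^{T} \!\! \int _{{\mathbb{R}}^{n}} \Box _{g} u \cdot X u \; dx \, dt = E_{X}(u(T)) - E_{X}(u(0)) + \int _{0}^{T} \!\! \int _{{\mathbb{R}}^{n}} R(u) \; dx \, dt, $$

where, for a vector field multiplier \(X\), the energy \(E_{X}\) is obtained by contracting the stress-energy tensor of \(u\) with \(X\) and integrating over the time slices.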
Energy estimates for the paradifferential equation.
The construction of the energy functionals is carried out in Section 7, primarily at the level of the linear paradifferential equation, first in \(\mathcal {H}^{1}\) and then in \(\mathcal {H}^{s}\). In both cases there are two steps: first the construction of the symbol of the multiplier \(X\), as a paracontrolled distribution, and then the proof of the energy estimates. The difference between the two cases is that \(X\) is a vector field in the first case, but a full pseudodifferential operator in the second case; because of this, we prefer to present the two arguments separately.
Energy estimates for the full equation.
The aim of Section 8 is to prove that balanced cubic energy estimates hold for the full equation in all \(\mathcal {H}^{s}\) spaces with \(s \geq 1\). We do this by thinking about the full equation in a paradifferential form, i.e. as a linear paradifferential equation with a nonlinear source term, and then by applying a normal form transformation to the unbalanced part of the source term.
Well-posedness for the linearized equation.
The goal of Section 9 is to establish both energy and Strichartz estimates for \(\mathcal {H}^{\frac{1}{2}}\) solutions (\(\mathcal {H}^{\frac{5}{8}}\) in dimension two) to the linearized equation. This is achieved under the assumption that both energy and Strichartz estimates for \(\mathcal {H}^{\frac{1}{2}}\) solutions (\(\mathcal {H}^{\frac{5}{8}}\) in dimension two) for the linear paradifferential equation hold. We remark that, while the energy estimates for the linear paradifferential equation have already been established by this point in the paper, the corresponding Strichartz estimates have yet to be proved.
Short time Strichartz estimates for the full equation.
The local well-posedness result of Smith and Tataru [38] yields well-posedness and nearly sharp Strichartz estimates on the unit time scale for initial data that is small in the appropriate Sobolev space. Our objective in Section 10 is to recast this result as a short time result for a corresponding large data problem. This is a somewhat standard argument combining scaling and finite speed of propagation, though with an interesting twist due to the need to use homogeneous Sobolev norms.
Small vs. large \(\mathcal {H}^{s}\) data.
In our main well-posedness proof, in order to avoid more cumbersome notations and estimates, it is convenient to work with initial data that is small in \(\mathcal {H}^{s}\). This is not a major problem, as this is a nonlinear wave equation which exhibits finite speed of propagation. This allows us to reduce the large data problem to the small data problem by appropriate localizations. This argument is carried out at the beginning of Section 11.
Rough solutions as limits of smooth solutions.
The sequence of modules discussed so far comes together in Section 11, where we finally obtain our rough solutions \(u\) as a limit of smooth solutions \(u^{h}\) whose initial data are frequency localized below \(2^{h}\). The bulk of the proof is organized as a bootstrap argument, where the bootstrap quantities are uniform energy type bounds for both \(u^{h}\) and for their increments \(v^{h} = \dfrac{d}{dh} u^{h}\), which solve the corresponding linearized equation. The main steps are as follows:
- we use the short time Strichartz estimates derived from [38] for \(u^{h}\) and \(v^{h}\) in order to obtain long time Strichartz estimates for \(u^{h}\), which in turn imply energy estimates for both the full equation and the paradifferential equation, closing one half of the bootstrap;
- we combine the short time Strichartz estimates and the long time energy estimates for the paradifferential equation in \(\mathcal {H}^{\frac{1}{2}}\) (\(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)) to obtain long time Strichartz estimates for the same paradifferential equation;
- we use the energy and Strichartz estimates for the paradifferential equation to obtain similar bounds for the linearized equation. This in turn implies long time energy estimates for \(v^{h}\), closing the second half of the bootstrap loop.
The well-posedness argument.
Once we have a complete collection of energy estimates and Strichartz estimates for both the full equation and the linearized equation, we are able to use frequency envelopes in order to prove the remaining part of the well-posedness results, namely the strong convergence of the smooth solutions, the continuous dependence, and the associated uniqueness property. In this we follow the strategy outlined in the last two authors’ expository paper [21].
2 Notations, paraproducts and some commutator type bounds
We begin with some standard notations and conventions:
- The Greek indices \(\alpha \), \(\beta \), \(\gamma \), \(\delta \) etc. in expressions range from 0 to \(n\), where 0 stands for time. Roman indices \(i\), \(j\) are limited to the range from 1 to \(n\), and are associated only to spatial coordinates.
- The differentiation operators with respect to all coordinates are \(\partial _{\alpha}\), \(\alpha = 0,\ldots, n\). By \(\partial \) without any index we denote the full space-time gradient. To separate only spatial derivatives we use the notation \(\partial _{x}\).
- We consistently use the Einstein summation convention, where repeated indices are summed over, unless explicitly stated otherwise.
- The inequality sign \(x \lesssim y\) means \(x \leq Cy\) with a universal implicit constant \(C\). If instead the implicit constant \(C\) depends on some parameter \(A\) then we write \(x \lesssim _{A} y\).
2.1 Littlewood-Paley decompositions and Sobolev spaces
We denote the Fourier variables by \(\xi _{\alpha}\) with \(\alpha = 0,\ldots,n\). To separate the spatial Fourier variables we use the notation \(\xi '\).
2.1.1 Littlewood-Paley decompositions
For distributions in \({\mathbb{R}}^{n}\) we will use the standard inhomogeneous Littlewood-Paley decomposition
where \(P_{k} = P_{k} (D_{x})\) are multipliers with smooth symbols \(p_{k}(\xi ')\), localized in the dyadic frequency region \(\{|\xi '| \approx 2^{k}\}\) (unless \(k=0\), in which case we capture the entire unit ball). We emphasize that no such decompositions are used in the paper with respect to the time variable. We will also use the notations \(P_{< k}\), \(P_{>k}\) with the standard meaning. Often we will use shorthand for the Littlewood-Paley pieces of \(u\), such as \(u_{k} :=P_{k} u\) or \(u_{< k}:= P_{< k} u\). On occasion we will need multipliers with slightly larger support, e.g. \(\tilde{P}_{k}\) will denote a multiplier with a similar dyadic frequency localization as \(P_{k}\), but such that \(\tilde{P}_{k} P_{k} = P_{k}\).
2.1.2 Function spaces
For our main evolution we will use the inhomogeneous Sobolev spaces \(H^{s}\), often combined into the product spaces \(\mathcal {H}^{s} = H^{s} \times H^{s-1}\) for the position/velocity components of our evolution. Only in the next-to-last section of the paper will we have an auxiliary use for the corresponding homogeneous spaces \(\dot{H}^{s}\), in connection with scaling analysis.
For our estimates we will use \(L^{\infty}\) based control norms. In addition to the standard \(L^{\infty}\) norms, in many estimates we will use the standard inhomogeneous \(BMO\) norm, as well as its close relatives \(BMO^{s}\), with norm defined as
We will also need several related \(L^{p}\) based Besov norms \(B^{s}_{p,q}\), defined as
with the obvious changes if \(q = \infty \). In particular we will often use these norms with \(p=\infty \) or \(p=2n\), which correspond to the spaces \(B^{0}_{\infty ,1}\), \(B^{\frac{1}{2}}_{2n,1}\) and \(B^{\frac{1}{2}}_{\infty ,2}\); these will be used for our control norms \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\) and ℬ.
2.1.3 Frequency envelopes
Throughout the paper we will use the notion of frequency envelopes, introduced by Tao (see for example [42]), which is a very useful device that tracks the evolution of the energy of solutions between dyadic energy shells.
Definition 2.1
We say that \(\{c_{k}\}_{k\geq 0} \in \ell ^{2}\) is a frequency envelope for a function \(u\) in \(H^{s}\) if we have the following two properties:
a) Energy bound:
b) Slowly varying
Here \(\delta \) is a positive constant, which is taken small enough in order to account for energy leakage between nearby frequencies.
One can also limit from above the size of a frequency envelope, by requiring that
Such frequency envelopes always exist, for instance one can define
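One standard construction, presumably the one intended here (up to normalization), is the maximal envelope

$$ c_{k} := \max _{j \geq 0} 2^{-\delta |j-k|} \| P_{j} u \|_{H^{s}} , $$

which satisfies the energy bound by construction and is slowly varying with the constant \(2^{\delta |j-k|}\).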
The same notion can be applied to any Besov norms. In particular we will use it jointly for the Besov norms that define our control parameters \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\) and ℬ.
2.2 Paraproducts and paradifferential operators
For multilinear analysis, we will consistently use paradifferential calculus, for which we refer the reader to [6, 32].
We begin with the simplest bilinear expressions, namely products, for which we will use the Littlewood-Paley trichotomy
where the three terms capture the low×high frequency interactions, the high×high frequency interactions and the high×low frequency interactions. The paraproduct \(T_{f} g\) might be heuristically thought of as the dyadic sum
where the frequency gap \(\kappa \) can be simply chosen as a universal parameter, say \(\kappa = 4\), or on occasion may be increased and used as a smallness parameter in a large data context. To avoid bulky notations, in this paper we will make a harmless abuse of notation and neglect \(\kappa \) altogether. In other words, our notation \(P_{< k}\) stands in effect for \(P_{< k-\kappa}\) with a fixed universal constant \(\kappa \).
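In this shorthand, the heuristic dyadic sum above reads

$$ T_{f} g \approx \sum _{k} P_{< k} f \, P_{k} g , $$

where, per our convention, \(P_{<k}\) hides the fixed frequency gap \(\kappa \).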
However, in our context a definition such as the above one is too imprecise, and the difference between usually equivalent choices may have nonperturbative effects when considering adjoints in our proof of balanced energy estimates later on.
In particular, the symmetry properties of \(T_{f}\) as an operator in \(L^{2}\) are critical in our energy estimates. For this reason, we choose to work with the Weyl quantization, and we define
Here \(\chi \) is a smooth function supported in a small ball and that equals 1 near the origin. With this convention, if \(f\) is real then \(T_{f}\) is an \(L^{2}\) self-adjoint operator.
For paraproducts we have a number of standard bounds which we list below, and we will refer to as Coifman-Meyer estimates:
These hold for \(1 < p < \infty \), but there are also endpoint results available roughly corresponding to \(p = 1\) and \(p = \infty \).
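Representative instances of these bounds, in their standard form (the paper's full list may differ), are

$$ \| T_{f} g \|_{L^{p}} \lesssim \| f \|_{L^{\infty}} \| g \|_{L^{p}}, \qquad \| T_{f} g \|_{L^{r}} \lesssim \| f \|_{L^{p}} \| g \|_{L^{q}}, \quad \frac{1}{r} = \frac{1}{p} + \frac{1}{q} . $$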
Paraproducts may also be thought of as belonging to the larger class of translation invariant bilinear operators. Such operators
may be described by their symbols \(b(\eta ,\xi )\) in the Fourier space, by
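For concreteness, this presumably takes the standard form

$$ B(f,g)(x) = \frac{1}{(2\pi )^{2n}} \int e^{i x \cdot (\eta + \xi )} \, b(\eta ,\xi ) \, \hat{f}(\eta ) \, \hat{g}(\xi ) \, d\eta \, d\xi . $$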
A special class of such operators, which we denote by \(L_{lh}\), will play an important role later in the paper:
Definition 2.2
By \(L_{lh}\) we denote translation invariant bilinear forms whose symbol \(\ell _{lh}(\eta ,\xi )\) is supported in \(\{|\eta | \ll |\xi |+1\}\) and satisfies bounds of the form
We remark that in particular the bilinear form \(B(f,g) = T_{f} g\) is an operator of type \(L_{lh}\), with symbol
Here the factor \(\xi +\eta /2\) in the denominator is the average of the \(g\) input frequency and the output frequency, and corresponds exactly to our use of the Weyl calculus. The \(L^{p}\) bounds and the commutator estimates for such bilinear forms mirror exactly the corresponding bounds for paraproducts.
2.3 Commutator and other paraproduct bounds
Here we collect a number of general paraproduct estimates, which are relatively standard. See for instance Appendix B of [20] and Sections 2 and 3 of [1] for proofs of the following estimates as well as further references.
We begin with the following standard commutator estimate:
Lemma 2.3
\(P_{k}\) commutators
We have
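A standard form of this estimate is

$$ \| [P_{k}, f] g \|_{L^{2}} \lesssim 2^{-k} \| \partial _{x} f \|_{L^{\infty}} \| g \|_{L^{2}} . $$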
A similar bound holds also in \(L^{p}\) for \(1 \leq p \leq \infty \).
The following commutator-type estimates are exact reproductions of statements from Lemmas 2.4 and 2.6 in Section 2 of [1], respectively:
Lemma 2.4
Para-commutators
Assume that \(\gamma _{1}, \gamma _{2} < 1\). Then we have
A bound similar to (2.7) holds in the Besov scale of spaces, namely from \(\dot{B}^{s}_{p, q}\) to \(\dot{B}^{s+\gamma _{1}+\gamma _{2}}_{p, q}\) for real \(s\) and \(1\leq p,q \leq \infty \).
Lemma 2.5
Para-associativity
For \(s + \gamma _{2} \geq 0\), \(s + \gamma _{1} + \gamma _{2} \geq 0\), and \(\gamma _{1} < 1\) we have
We also have a Leibniz-type rule with paraproducts, which closely follows Lemma 3.6 of [1]. Here, our setting is slightly cleaner as we have only \(T_{f}\partial _{\alpha}\) in place of \(\partial _{t} + T_{b} \partial _{\alpha}\), and the dependence on \(f\) is captured by the control norm \(A_{\frac{1}{4}}\) in [1].
Lemma 2.6
Para-Leibniz rule
For the balanced Leibniz rule error
we have the bound
The next paraproduct estimate, see Lemma 2.5 in [1], directly relates multiplication and paramultiplication:
Lemma 2.7
Para-products
Assume that \(\gamma _{1}, \gamma _{2} < 1\), \(\gamma _{1}+\gamma _{2} \geq 0\). Then
A similar bound holds in the Besov scale of spaces, namely from \(\dot{B}^{s}_{p, q}\) to \(\dot{B}^{s+\gamma _{1}+\gamma _{2}}_{p, q}\) for real \(s\) and \(1\leq p,q \leq \infty \).
We will also need the following variant, which applies for a different range of indices:
Lemma 2.8
Low-high para-products
Assume that \(\gamma _{1} > 0\), \(\gamma _{1}+\gamma _{2} \leq 0\). Then
The proof of this lemma only requires a straightforward Littlewood-Paley decomposition of both \(f\) and \(g\), where the difference \(T_{f} T_{g} - T_{T_{f}g} \) selects the range where the \(f\) frequency is at least comparable to the \(g\) frequency. The details are left to the reader.
These are stated here in the more elegant homogeneous setting, but there are also obvious modifications that apply in the inhomogeneous case. We end with the following Moser-type result:
Lemma 2.9
Let \(F\) be smooth with \(F(0)=0\), and \(w \in H^{s}\). Set
Then we have the estimate
This should be a classical result, though we were not able to find a sufficiently accurate reference. Instead of providing a proof here, we refer to similar Moser estimates in Lemmas 5.2 and 5.7, which are more relevant to the current paper and which we prove in Section 5. For further reference, the reader may also view this as a variation of Lemma 2.3 in [1].
2.4 Paradifferential operators
As a generalization of paraproducts, we will also work with paradifferential operators. Precisely, given a symbol \(a(x,\xi )\) in \({\mathbb{R}}^{n}\), we define its paradifferential Weyl quantization \(T_{a}\) as the operator
where
The simplest class of symbols one can work with is \(L^{\infty }S^{m}\), which contains symbols \(a\) for which
for all multi-indices \(\alpha \). For such symbols, the Calderón-Vaillancourt theorem ensures appropriate boundedness in Sobolev spaces,
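For such symbols, the resulting mapping property presumably reads

$$ T_{a} : H^{s} \to H^{s-m}, \qquad \| T_{a} u \|_{H^{s-m}} \lesssim \| u \|_{H^{s}}, \quad s \in {\mathbb{R}} . $$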
We remark that this class of symbols in the paradifferential quantization is contained in the class denoted by \(\mathcal {B}S^{m}_{1,1}\); see for instance Hörmander [16], but also the earlier work of Bony [7], for further properties.
More generally, given a translation invariant space of distributions \(X\), we can define an associated symbol class \(X S^{m}\) of symbols with the property that
for each \(\xi \in {\mathbb{R}}^{n}\). Later in the paper, we will use several choices of symbols of this type, using function spaces that we will associate to our problem.
3 A complete set of equations
Here we aim to further describe the minimal surface equation and the underlying geometry, and, in particular, its null structure. We also derive the linearized equation, and introduce the paralinearization of both the main equation and its linearization.
3.1 The Lorentzian geometry of the minimal surface
Starting from the expression of the metric \(g\) in (1.4), the dual metric is easily computed to be
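Assuming that (1.4) is the graph metric \(g_{\alpha \beta} = m_{\alpha \beta} + \partial _{\alpha }u \, \partial _{\beta }u\) over the Minkowski metric \(m\), the Sherman-Morrison formula gives the likely form

$$ g^{\alpha \beta} = m^{\alpha \beta} - \frac{m^{\alpha \gamma} \partial _{\gamma }u \; m^{\beta \delta} \partial _{\delta }u}{1 + m^{\gamma \delta} \partial _{\gamma }u \, \partial _{\delta }u} . $$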
Also associated to the metric \(g\) is its determinant
and the associated volume form
This can be easily computed (e.g. using Sylvester’s determinant theorem) as
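Indeed, with the graph metric as in (1.4), Sylvester's determinant theorem \(\det (A + v v^{\top}) = \det A \, (1 + v^{\top} A^{-1} v)\), applied with \(v_{\alpha} = \partial _{\alpha }u\), presumably yields

$$ g := \det (g_{\alpha \beta}) = \det (m) \left( 1 + m^{\alpha \beta} \partial _{\alpha }u \, \partial _{\beta }u \right) . $$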
In the sequel, we will always raise indices with respect to the metric \(g\), never with respect to Minkowski. In particular we will use the standard notation
We remark that, when applied to the function \(u\), this operator has nearly the same effect as the corresponding Minkowski operator,
3.2 The minimal surface equation
Here we rewrite the minimal surface equation in covariant form. Using the \(g\) notation above and the Minkowski metric, we rewrite (1.3) as
or equivalently
Expanding the \(g\) derivative, we have
Then in the previous equation we recognize the expression for the dual metric, and the minimal surface equation becomes
Using the notation (3.2), this is written in an even shorter form,
Similarly, using also (3.3), the relation (3.4) becomes
3.3 The covariant d’Alembertian
The covariant d’Alembertian associated to the metric \(g\) has the form
which we can rewrite as
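For reference, the standard formula for the covariant d'Alembertian, expanded so as to display the two coefficients discussed next, is

$$ \Box _{g} v = \frac{1}{\sqrt{|g|}} \partial _{\alpha }\left( \sqrt{|g|} \, g^{\alpha \beta} \partial _{\beta }v \right) = g^{\alpha \beta} \partial _{\alpha }\partial _{\beta }v + \left( \partial _{\alpha }g^{\alpha \beta} \right) \partial _{\beta }v + \left( g^{\alpha \beta} \partial _{\alpha }\log \sqrt{|g|} \right) \partial _{\beta }v . $$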
Next we need to compute the two coefficients in round brackets. The second coefficient is given by (3.7). For the first one, for later use, we perform a slightly more general computation where we differentiate \(g^{\alpha \beta}(\partial _{\gamma}u)\) as a function of its arguments \(p_{\gamma}:=\partial _{\gamma }u\),
This formula follows by directly differentiating (3.1) and from (3.3),
We use (3.1) once again to get (3.8)
From (3.8) and chain rule, we arrive at
Setting \(\gamma =\alpha \) and using the minimal surface equation in the (3.5) formulation, we get
Comparing this with (3.7), we see that the last two terms in the \(\Box _{g}\) expression above cancel, and we obtain the following simplified form for the covariant d’Alembertian:
In particular, we get the covariant form of the minimal surface equation for \(u\):
For later use, we introduce the notation
An interesting observation is that from here on, the Minkowski metric plays absolutely no role:
Remark 3.1
In order to introduce the minimal surface equations we have started from the Minkowski metric \(m^{\alpha \beta}\). However, the formulation (3.5) of the equations together with the relations (3.8) provide a complete description of the equations without any reference to the Minkowski metric, and which is in effect valid for any other Lorentzian metric. Indeed, the equation (3.5) together with the fact that the metric components \(g^{\alpha \beta}\) are smooth functions of \(\partial u\) satisfying (3.8) are all that is used for the rest of the paper. Thus, our results apply equally for any other Lorentzian metric in \({\mathbb{R}}^{n+2}\).
3.4 The linearized equations
Our objective now is to derive the linearized minimal surface equations. We will denote by \(v\) the linearized variable. Then, by (3.8), the linearization of the dual metric \(g^{\alpha \beta} = g^{\alpha \beta}(u)\) takes the form
Then the linearized equation is directly computed, using the symmetry in \(\alpha \) and \(\beta \), as
Using the expression of \(A\) in (3.13), the linearized equations take the form
Alternatively this may also be written in a divergence form,
or in covariant form,
3.5 Null forms and the nonlinear null condition
The primary null form that plays a role in this article is \(Q_{0}\), defined by
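In the standard convention, presumably taken here with respect to the metric \(g\) (consistent with the null cone condition below), this is

$$ Q_{0}(v,w) = g^{\alpha \beta} \partial _{\alpha }v \, \partial _{\beta }w . $$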
Now, we verify that the nonlinear null condition (1.18) holds; for this we use (3.8) to compute
which vanishes on the null cone \(g^{\alpha \beta} \xi _{\alpha }\xi _{\beta }= 0\).
In addition we would like the contribution of \(A\) to the linearized equation to be a null form. We get
3.6 Two conformally equivalent metrics
While the metric \(g\) is the primary metric used in this paper, for technical reasons we will also introduce two additional, conformally equivalent metrics, as follows:
(i) The metric \({\tilde{g}}\) is defined by
Then the minimal surface equation can be written as
while the linearized equation, written in divergence form is
where, still raising indices only with respect to \(g\),
The main feature of \(\tilde{g}\) is that \(\tilde{g}^{00}=1\). Because of this, it will be useful in the study of the linear paradifferential flow, in order to prevent a nontrivial paracoefficient in front of \(\partial _{0}^{2} v\) in the equations.
(ii) The metric \({\hat{g}}\) is defined by
Then the minimal surface equation can be written as
which is not so useful. Instead, the advantage of using this metric is that, using (3.13), the linearized equation can now be written in divergence form,
This will be very useful when we study the linearized equation in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) in two dimensions).
3.7 Paralinearization and the linear paradifferential flow
A key element in our study of the minimal surface equation is the associated linear paradifferential flow, which is derived from the linearized flow (3.15). In inhomogeneous form, this is
Similarly we can write the paradifferential equations associated to \({\tilde{g}}\), namely
as well as \({\hat{g}}\), which can be written in divergence form:
These are all equivalent up to perturbative errors. Accordingly, we introduce the notation
for the paradifferential wave operator as well as its counterparts \(T_{\tilde{P}}\) and \(T_{\hat{P}}\) with the metric \(g\) replaced by \({\tilde{g}}\), respectively \({\hat{g}}\).
We will first use the paradifferential equation in the study of the minimal surface problem (3.5), which we rewrite in the form
Here we carefully base our formula on the linearized flow (3.25), rather than on a direct paradifferential expansion in (3.5). This is in order to ensure that all nonlinear interactions in \(N(u)\) are frequency balanced at leading order.
A key contention of our paper is that the nonlinearity \(N\) plays a perturbative role. However, this has to be interpreted in a more subtle way: \(N\) becomes perturbative only after a well-chosen partial, variable coefficient normal form transformation.
Secondly, we will use it in the study of the linearized minimal surface equation, which we can write in the form
Here the nonlinearity \(N_{lin}\) will also play a perturbative role, in the same fashion as above. We caution the reader that this is not the linearization of \(N\).
4 Energy and Strichartz estimates
Both energy and Strichartz estimates play an essential role in this paper, in various forms and combinations. These are primarily applied first to the linear paradifferential flow, and then to the linearized flow associated to solutions to our main equation (1.7). Our goal here is to provide a brief overview of these estimates.
Importantly, in this section we do not prove any energy or Strichartz estimates. Instead, we simply provide definitions and context for what will be proved later in the paper, and prove a good number of equivalences between various well-posedness statements and estimates. We do this under absolutely minimal assumptions (e.g. boundedness) on the metric \(g\), in order to be able to apply these properties easily later on. In particular there are no commutator bounds needed or used in this section. The structure of the minimal surface equations also plays no role here.
4.1 The equations
For context, here we consider a pseudo-Riemannian metric \(g\) in \(I \times {\mathbb{R}}^{n}\), where \(I=[0,T]\) is a time interval of unspecified length. We will make some minimal universal assumptions on the metric \(g\):
- both \(g\) and its inverse are uniformly bounded,
- the time slices are uniformly space-like.
Associated to this metric \(g\), we will consider several equations:
- The linear paradifferential flow in divergence form:
$$ \partial _{\alpha }T_{g^{\alpha \beta}} \partial _{\beta }v = f, \qquad v[0] = (v_{0},v_{1}) . $$(4.1)
- The linear paradifferential flow in non-divergence form:
$$ T_{g^{\alpha \beta}} \partial _{\alpha }\partial _{\beta }v = f, \qquad v[0] = (v_{0},v_{1}) . $$(4.2)
- The linear flow in divergence form:
$$ \partial _{\alpha }g^{\alpha \beta} \partial _{\beta }v = f, \qquad v[0] = (v_{0},v_{1}) . $$(4.3)
- The linear flow in non-divergence form:
$$ g^{\alpha \beta} \partial _{\alpha }\partial _{\beta }v = f, \qquad v[0] = (v_{0},v_{1}) . $$(4.4)
Several comments are in order:
- As written, the above evolutions are inhomogeneous. If \(f = 0\) then we will refer to them as the homogeneous flows.
- In the context of this paper, we are primarily interested in the metric \({\hat{g}}\), in which case the equation (4.3) represents our main linearized flow, and (4.1) represents our main linear paradifferential flow. The metric \(g\) and the non-divergence form of the equations will be used in order to connect our results with the result of Smith-Tataru, which will be used in our proofs.
- One may also add a gradient potential in the equations above; with the gradient potential added, there is no difference between the divergence and the non-divergence form of the equations. We omit it in this section, as it plays no role.
We will consider these evolutions in the inhomogeneous Sobolev spaces \(\mathcal {H}^{s}\). In order to do this uniformly, we will assume that \(|I| \leq 1\); otherwise, homogeneous spaces would be more appropriate. The exponent \(s\) will be an arbitrary real number in the case of the paradifferential flows, but will have a restricted range otherwise.
4.2 Energy estimates and well-posedness for the homogeneous problem
Here we review some relatively standard definitions and facts about local well-posedness.
Definition 4.1
For any of the above flows in the homogeneous form, we say that they are (forward) well-posed in \(\mathcal {H}^{s}\) in the time interval \(I=[0,T]\) if for each initial data \(u[0] \in \mathcal {H}^{s}\) there exists a unique solution \(u\) with the property that
This corresponds to a linear estimate of the form
Sometimes one establishes additional bounds for the solution (e.g. Strichartz estimates) and these are then added in to the class of solutions for which uniqueness is established. We will comment on this where needed. If no such assumption is used, we call this unconditional uniqueness.
For completeness and reference, we now state without proof a classical well-posedness result:
Theorem 4.2
Assume that \(\partial g \in L^{1}(I;L^{\infty})\). Then
a) The paradifferential flows (4.1) and (4.2) are well-posed in \(\mathcal {H}^{s}\) for all real \(s\).
b) The divergence form evolution (4.3) is well-posed in \(\mathcal {H}^{s}\) for \(s \in [0,1]\), and the non-divergence form evolution (4.4) is well-posed in \(\mathcal {H}^{s}\) for \(s \in [1,2]\).
We remark that the metrics \(g\) associated with the solutions of Smith-Tataru satisfy the above hypothesis, but the ones associated to the solutions in our paper do not.
A slightly stronger form of well-posedness is to assert the existence of a suitable (time dependent) energy functional \(E^{s}\) in \(\mathcal {H}^{s}\):
Definition 4.3
An energy functional for either of the above problems in \(\mathcal {H}^{s}\) is a bounded quadratic form in \(\mathcal {H}^{s}\) that has the following two properties:
a) Coercivity,
$$ E^{s}(v[t]) \approx \| v[t] \|_{\mathcal {H}^{s}}^{2} . $$(4.6)
b) Bounded growth for solutions \(v\) to the homogeneous equation,
$$ \frac{d}{dt}E^{s}(v[t]) \lesssim B(t) \| v[t] \|_{\mathcal {H}^{s}}^{2}, $$(4.7)
where \(B \in L^{1}\) depends only on \(g\).
Later we will also interpret \(E^{s}\) as a symmetric bilinear form in \(\mathcal {H}^{s}\). Such an interpretation is unique.
We remark that, in the context of Theorem 4.2, where \(\partial g \in L^{1} L^{\infty}\), an energy functional \(E^{1}\) corresponding to \(s = 1\) is classically obtained by multiplying the equation with a suitable smooth time-like vector field and integrating by parts; we refer the reader to Section 7.2.1 where this procedure is described in greater detail. Then for \(s \neq 1\) one simply defines
and the corresponding control parameter \(B\) may be taken as
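The natural choices, presumably the ones intended here, are

$$ E^{s}(v[t]) := E^{1}\left( \langle D_{x} \rangle ^{s-1} v[t] \right), \qquad B(t) := \| \partial g(t) \|_{L^{\infty}} . $$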
4.3 The wave equation as a system and the inhomogeneous problem
Switching now to the associated inhomogeneous flows, the classical set-up is to take a source term \(f \in L^{1} H^{s-1}\), and then look for solutions \(v\) in \(C(I;\mathcal {H}^{s})\) as above. This is commonly done using the Duhamel principle, which is most readily applied by rewriting the wave equation as a system. We next describe this process.
A common choice is to write the system for the pair of variables \((v,\partial _{t} v)\). However, for us it will be more convenient to make a slightly different linear transformation, and use instead the pair
for (4.3) and (4.4), with products replaced by paraproducts in the case of the equation (4.1) or (4.2). For later use, we record the inverse of \(Q\); this is either
or its version with products replaced by paraproducts, as needed.
The system for \(\mathbf{v}\) will have the form
with the appropriate choice for the matrix operator ℒ. For instance in the case of the homogeneous equation (4.3) we have
which has the antisymmetry property
A similar property holds in the non-divergence case, but only for the principal part.
We will always work in settings where \(Q\) is bounded and invertible in \(\mathcal {H}^{s}\). This is nearly automatic in the paradifferential case; there we only need to make sure that the operator \(T_{g^{00}}\) is invertible. In the differential case we will have to ask that multiplication by \(g\) and by \((g^{00})^{-1}\) are bounded in \(H^{s-1}\). In such settings, \(\mathcal {H}^{s}\) well-posedness for our original wave equation and for the associated system are equivalent. If a good energy functional \(E^{s}\) exists for the wave equation, then we may define an associated energy functional for the system by setting
Then the properties (4.6) and (4.7) directly transfer to the homogeneous system (4.10).
If our system is (forward) well-posed in \(\mathcal {H}^{s}\), then solving it generates a (forward) evolution operator \(S(t,s)\) which is bounded in \(\mathcal {H}^{s}\) and maps the data at time \(s\) to the solution at time \(t\),
For the system it is easy to consider the inhomogeneous version
If \(\mathbf{f}\in L^{1} \mathcal {H}^{s}\) then the solution to (4.14) is given by Duhamel’s formula,
and satisfies the bound
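In the notation above, Duhamel's formula and the resulting bound take the standard form

$$ \mathbf{v}(t) = S(t,0) \mathbf{v}(0) + \int _{0}^{t} S(t,\sigma ) \mathbf{f}(\sigma ) \, d\sigma , \qquad \| \mathbf{v}\|_{L^{\infty }\mathcal {H}^{s}} \lesssim \| \mathbf{v}(0) \|_{\mathcal {H}^{s}} + \| \mathbf{f}\|_{L^{1} \mathcal {H}^{s}} . $$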
If we have a good energy \(\mathbf{E}^{s}\) for the homogeneous system, then Duhamel’s formula easily allows us to obtain the corresponding energy estimate for the inhomogeneous one, namely
where the first term on the right arises due to the fact that the energy is quadratic in \(\mathbf{v}(t)\).
Now we are ready to return to our original set of equations, add the source term \(f\) and reinterpret the above consequences of Duhamel’s formula there. As in the homogeneous case, we define \(\mathbf{v}(t) := Q v[t]\). Then adding the source term \(f\) in the original equation is equivalent to adding a source term \(\mathbf{f}\) in the above system. Indeed, it is readily seen that for all our four equations, \(\mathbf{f}\) is given by
To complete the correspondence, we note that for such \(\mathbf{f}\) we have
Then we immediately arrive at the following result:
Theorem 4.4
a) Assume that either of the homogeneous paradifferential flows (4.1) or (4.2) are well-posed in \(\mathcal {H}^{s}\). Then the associated inhomogeneous flows are well-posed in \(\mathcal {H}^{s}\) for \(f \in L^{1} H^{s-1}\), and the following estimate holds
In addition, if an energy functional \(E^{s}\) in \(\mathcal {H}^{s}\) exists, then
b) The same holds for the flows (4.3) or (4.4) under the additional assumption that multiplication by \(g\) and \(({g^{00}})^{-1}\) is bounded in \(H^{s-1}\), with the paraproduct replaced by the corresponding product.
For our purposes in this paper, we will also need to allow for a larger class of source terms of the form
To understand why this is natural, it is instructive to start from the inhomogeneous system (4.14) and argue backward.
Above, we have used the inhomogeneous system in the case where the first component of \(\mathbf{f}\) was zero. Now we will allow for both terms in \(\mathbf{f}\) to be nonzero, and derive the corresponding wave equation. For clarity we do this in the context of the equation (4.3), for which we have computed the corresponding operator ℒ in (4.11); however, a similar computation will apply in all four cases.
Starting with a solution \(\mathbf{v}= (\mathbf{v}_{1},\mathbf{v}_{2})^{\top}\) to the inhomogeneous problem (4.14), we begin by defining
as our candidate for the wave equation solution. Then the first equation of the system reads
or equivalently
Differentiating this with respect to time we obtain
Finally we substitute \(\partial _{t} v\) from (4.23) and \(\partial _{t} \mathbf{v}_{2}\) from the second equation of the system. We already know the right hand side should vanish if \(\mathbf{f}=0\), so it suffices to track the \(\mathbf{f}\) terms. Then we easily obtain the desired equation for \(v\):
Comparing this with (4.21), we obtain the correspondence between the source terms for the wave equation and the system:
We also record here the correspondence between the solutions, in the form
noting that, in contrast with (4.8), this relation is no longer homogeneous.
The last step in our analysis is to reinterpret the bounds (4.16) and (4.17) in terms of \(v\) and \(f\). To do this we make the assumption that multiplication by \(g\) and \((g^{00})^{-1}\) is bounded in both \(H^{s}\) and \(H^{s-1}\). Then from (4.16) we get
Similarly, from (4.17) and (4.13) we obtain the energy bound
Here we use (4.9) and (4.26) to compute
respectively, using also (4.25),
Thus we obtain the following natural extension of Theorem 4.4 above:
Theorem 4.5
a) Assume that the homogeneous evolution (4.4) or (4.3) is well-posed in \(\mathcal {H}^{s}\), and that multiplication by \(g\) and \((g^{00})^{-1}\) is bounded in \(H^{s}\) and in \(H^{s-1}\). Consider either of the two evolutions with a source term \(f\) of the form
Then a unique solution \(u \in C(I, \mathcal {H}^{s})\) exists. If in addition the homogeneous problem admits an energy functional \(E^{s}\) as in Definition 4.3then we have the energy estimate
with \(\tilde{v}\) and \(\tilde{f}\) defined above and \(B\) as in (4.7).
b) The same result applies for the paradifferential equations (4.1), respectively (4.2), where all instances of \(g\) above are replaced by the corresponding paraproduct operators \(T_{g}\).
We emphasize here the somewhat unusual function space for \(f_{1}\), in an intersection of two spaces. This reflects the fact that \(f_{1}\) has a dual role, both as a source term and as a velocity correction.
We remark that in the situations where we apply this result, the mapping properties for \(g\) and \((g^{00})^{-1}\) will be fairly straightforward to verify. In the paradifferential case, for instance, the continuity of \(g\) will suffice.
4.4 A duality argument
Duality plays an important role in many estimates for evolution equations. We will also use duality considerations in this paper for several arguments. We restrict the discussion below to the problems written in divergence form, as this is what we will use later in the paper. However, similar versions may be formulated in the nondivergence case.
At heart, this is based on the following identity, which in the context of the operator \(\partial _{\alpha }g^{\alpha \beta} \partial _{\beta}\) is written as follows:
This holds for any test functions \(v\) and \(w\). The integral on the right can be viewed as a duality relation between \(u[t]\) and \(v[t]\),
Precisely, assuming that \(g:H^{s-1} \to H^{s-1}\) as a multiplication operator, and that \(g^{00}\) is invertible, this expression has the following two properties
(1) Boundedness,
$$ B: \mathcal {H}^{s} \times \mathcal {H}^{1-s} \to {\mathbb{R}}. $$
(2) Coercivity,
$$ \sup _{\| v\|_{\mathcal {H}^{1-s}} \leq 1} B(u,v) \approx \| u \|_{ \mathcal {H}^{s}} . $$
A standard consequence of this relation is the following property:
Proposition 4.6
The evolutions (4.3), respectively (4.1) are forward well-posed in \(\mathcal {H}^{s}\) iff they are backward well-posed in \(\mathcal {H}^{1-s}\).
We remark that in the context of this paper forward and backward well-posedness are almost identical, so for us this property says that well-posedness in \(\mathcal {H}^{s}\) and \(\mathcal {H}^{1-s}\) are equivalent.
The above proposition may be equivalently reformulated as the corresponding result for the system (4.10). It will be more convenient to view it in this context. To do this, we reinterpret the above duality, in terms of the associated system (4.14). In view of the symmetry property (4.12), we have the relation
where the corresponding duality relation is
which provides the duality between \(\mathcal {H}^{s}\) and \(\mathcal {H}^{1-s}\). Incidentally, a consequence of (4.31) is the duality relation
where the duality between \(\mathcal {H}^{s}\) and \(\mathcal {H}^{1-s}\) is the one given by the bilinear form \({\mathbf {B}}\) above. This can be used to construct the backward evolution in \(\mathcal {H}^{1-s}\) given the forward evolution in \(\mathcal {H}^{s}\), and vice-versa. The full equivalence argument is standard, and is omitted.
4.5 Strichartz estimates
Here we discuss several versions of Strichartz estimates, as well as the connection between them.
4.5.1 Estimates for homogeneous equations
In the context of this paper, these have the form
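Presumably, (4.33) is the homogeneous Strichartz bound

$$ \| v \|_{S^{r}} \lesssim \| v[0] \|_{\mathcal {H}^{r}} , $$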
where for the Strichartz space \(S\) we will consider two different choices:
i) Almost lossless estimates, akin to those established in Smith-Tataru [38]. The corresponding Strichartz norms, denoted by \(S=S_{ST}\), are defined as
$$ \begin{aligned} &\| v\|_{S_{ST}^{r}} = \| v \|_{L^{\infty }H^{r}} + \|\langle D_{x} \rangle ^{r-\frac{3}{4}-\delta} v \|_{L^{4} L^{\infty}}, \qquad n = 2, \\ &\| v\|_{S_{ST}^{r}} =\| v \|_{L^{\infty }H^{r}} + \|\langle D_{x} \rangle ^{r-\frac{n-1}{2}-\delta} v \|_{L^{2} L^{\infty}}, \quad \, n \geq 3. \end{aligned} $$(4.34)
Here the loss of derivatives is measured by \(\delta > 0\), which is an arbitrarily small parameter.
ii) Estimates with derivative losses, precisely of the type that will be established in this paper. The corresponding Strichartz norms, denoted by \(S=S_{AIT}\), are defined as
$$ \begin{aligned} &\| v\|_{S_{AIT}^{r}} = \| v \|_{L^{\infty }H^{r}} + \|\langle D_{x} \rangle ^{r-\frac{3}{4} - \frac{1}{8}-\delta} v \|_{L^{4} L^{\infty}}, \qquad n = 2, \\ &\| v\|_{S_{AIT}^{r}} =\| v \|_{L^{\infty }H^{r}} + \|\langle D_{x} \rangle ^{r-\frac{n-1}{2}-\frac{1}{4}-\delta} v \|_{L^{2} L^{\infty}}, \quad \, n \geq 3. \end{aligned} $$ (4.35)

Here \(\delta > 0\) is again an arbitrarily small parameter, but we allow for an additional loss of derivatives in the endpoint (Pecher) estimate, namely \(1/8\) derivatives in two space dimensions and \(1/4\) in higher dimensions.
These estimates can be applied to any of the four equations discussed in this section. There are also appropriate counterparts for the corresponding system (4.10), which have the form
Under very mild assumptions on \(g\), these are equivalent to the ones for the corresponding wave equation:
Proposition 4.7
The Strichartz estimates (4.33) for the homogeneous wave equation are equivalent to the Strichartz estimates (4.36) for the associated system.
We also remark on a very mild extension of the estimate (4.33) to the inhomogeneous case. Precisely, if (4.33) holds then we also have the inhomogeneous bound
This follows in a straightforward manner by the Duhamel formula, see the discussion in Section 4.3.
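For context, the Duhamel argument takes the schematic form (again with illustrative notation \(S(t,s)\) for the forward evolution of the associated system)

$$ \mathbf{v}(t) = S(t,0)\, \mathbf{v}(0) + \int _{0}^{t} S(t,s)\, (0, f(s)) \, ds, $$

so that the Strichartz norm of the second term is controlled, by Minkowski's inequality in \(s\), in terms of an \(L^{1}_{t}\) norm of \(f\) at the appropriate Sobolev regularity.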
We conclude the discussion of the Strichartz estimates for the homogeneous equation with a simple but important case, which will be useful for us in the sequel, and applies in particular to the solutions in [38].
Proposition 4.8
Assume that \(\partial g \in L^{1}L^{\infty}\) and that the Strichartz estimates for the homogeneous equation (4.4) hold in \(\mathcal {H}^{1}\). Then the Strichartz estimates for the homogeneous equation hold in \(\mathcal {H}^{r}\) for all \(r \in {\mathbb{R}}\) for both paradifferential flows (4.1) and (4.2).
We remark that the implicit constant in these Strichartz estimates depends on the implicit constant in the Strichartz estimate in the hypothesis and on the bound for \(\|\partial g\|_{L^{1} L^{\infty}}\). Later when we apply this result we will have uniform control over both, so we obtain uniform control over the \(\mathcal {H}^{r}\) Strichartz norm.
Proof
It will be easier to work with the inhomogeneous bound (4.37), as it is more stable with respect to perturbations. We divide the proof into several steps, all of which are relatively standard.
Step 1: We start with the case \(r=1\) with the additional assumption \(g^{00}= -1\). Then the second equation in (4.2) can be seen as a perturbation of (4.4) with an \(L^{1} L^{2}\) source term. Hence the bound (4.37) for (4.4) implies the same bound for (4.2).
Step 2: Next, assuming still that \(g^{00}= -1\), we extend the bound (4.37) for (4.2) to all real Sobolev exponents \(r\) by conjugating by \(\langle D_{x} \rangle ^{\sigma}\) with \(\sigma = r-1\), where we can estimate perturbatively the commutator
This is valid for all real \(\sigma \) and, since it involves paraproducts, can be thought of as a frequency localized bound, which is essentially a version of Lemma 2.3.
Step 3: Multiplying by \(T_{g^{00}}\), we reduce the problem with nonconstant \(g^{00}\) to the case when \(g^{00}= -1\). Here we apply Lemma 2.7 with \(\gamma _{1}+\gamma _{2} = 1\) for the composition of paraproducts, and then interpolation; this applies equally for all real \(r\). At the conclusion of this step, we have the bound (4.37) for (4.2) for all \(r\).
Step 4: We commute the paracoefficients \(T_{g^{\alpha \beta}}\) inside \(\partial _{\alpha}\) perturbatively, in order to obtain the bound (4.37) for (4.1) for all \(r\). □
4.5.2 Dual Strichartz estimates
Here one considers the corresponding inhomogeneous problems, with source terms in dual Strichartz spaces. The estimates have the form
Classically, these are obtained by duality from the homogeneous estimates, as follows:
Proposition 4.9
If the homogeneous estimates (4.33) hold in \(\mathcal {H}^{r}\) for the forward (backward) evolution then the dual estimates (4.39) hold in \(\mathcal {H}^{1-r}\) for the backward (forward) evolution.
However, one can do better than this by going instead through the system form of the equations (4.14). The dual estimates for (4.14) have the form
These are directly obtained from the homogeneous estimates for the system (4.10) via the duality (4.31):
Proposition 4.10
If the homogeneous estimates hold in \(\mathcal {H}^{r}\) for the forward (backward) evolution (4.10) then the dual estimates hold in \(\mathcal {H}^{1-r}\) for the backward (forward) evolution (4.14).
One can now further return to the original inhomogeneous equation with a source term as in (4.21), and use the correspondence (4.25) and (4.26), in order to transfer the dual bounds back. These dual estimates, which represent a generalization of (4.39), have the form
We obtain the following strengthening of Proposition 4.9:
Proposition 4.11
If the homogeneous estimates (4.33) hold in \(\mathcal {H}^{r}\) for the forward (backward) evolution then the dual estimates (4.41) hold in \(\mathcal {H}^{1-r}\) for the backward (forward) evolution.
4.5.3 Full (retarded) Strichartz estimates
Here we combine the homogeneous and dual Strichartz estimates in a single bound for the inhomogeneous problem. The classical form is
However, here we need to take the extra step where we allow source terms of the form \(f = \partial _{t} f_{1} + f_{2}\), and then the estimates have the form
As we will see, this is closely related to the corresponding bound for the associated inhomogeneous system (4.14):
Our main result here is as follows:
Theorem 4.12
Consider either of the equations (4.3) or (4.1). If the homogeneous problem is well-posed forward in \(\mathcal {H}^{r}\) and backward in \(\mathcal {H}^{1-r}\) and satisfies the homogeneous Strichartz estimates (4.42) in both cases, then the solutions to the associated forward inhomogeneous problem with source term \(f = \partial _{t} f_{1} + f_{2}\) satisfy the bounds in (4.43).
Proof
The proof consists of four steps:
Step 1: If the homogeneous problem is well-posed forward in \(\mathcal {H}^{r}\) and satisfies the homogeneous Strichartz estimates (4.36), then so does the corresponding system, see Proposition 4.7.
Step 2: If the homogeneous problem is well-posed backward in \(\mathcal {H}^{1-r}\) and satisfies the homogeneous Strichartz estimates, then so does the corresponding system. By duality, the inhomogeneous system is well-posed forward in \(\mathcal {H}^{r}\) and satisfies the dual Strichartz bounds (4.40).
Step 3: We represent the forward \(\mathcal {H}^{r}\) solution by the Duhamel formula
The first term represents the solution to the homogeneous equation, and is estimated by (4.36). For the second term we have two bounds at our disposal: the dual bound, where we fix \(t\) and estimate the output in \(\mathcal {H}^{r}\) in terms of the input in the dual Strichartz space, and the homogeneous bound, where we fix \(s\), take \(\mathbf{f}(s) \in \mathcal {H}^{r}\) and estimate the output as a function of \(t\) in the Strichartz space. Concatenating the two, we get the restricted bound
where the source \(\mathbf{f}\) is supported in an interval \(I\) and the output \(\mathbf{v}\) is measured in an interval \(J\), so that \(I\) precedes \(J\). In two dimensions we can now apply the Christ-Kiselev lemma [9] (or the \(U^{p}\)-\(V^{p}\) spaces, see [28]) to get the full estimate. In three and higher dimensions we face a slight problem: neither method applies for bounds from \(L^{2}_{t}\) to \(L^{2}_{t}\). In our case, however, this is not an issue, because our estimates allow for at least a loss of \(\delta \) derivatives. Then we can afford to interpolate between the two endpoints and use the Christ-Kiselev lemma for bounds from \(L^{2-}_{t}\) to \(L^{2+}_{t}\), and then return to the endpoint setting by Bernstein's inequality in space and Hölder's inequality in time, all at the expense of an arbitrarily small increase in the size \(\delta \) of the loss.
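For the reader's convenience we recall the Christ-Kiselev lemma in the form used here (a standard formulation, not quoted verbatim from [9]): if the operator

$$ T f(t) = \int S(t,s) f(s) \, ds $$

is bounded from \(L^{p}_{t} X\) to \(L^{q}_{t} Y\) with \(p < q\), then so is its retarded version

$$ \widetilde{T} f(t) = \int _{s < t} S(t,s) f(s) \, ds. $$

The strict inequality \(p < q\) is exactly what fails for bounds from \(L^{2}_{t}\) to \(L^{2}_{t}\), and is recovered by interpolating to bounds from \(L^{2-}_{t}\) to \(L^{2+}_{t}\) at the price of a small loss.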
Step 4. We transfer the estimate (4.44) back to the original system via the correspondence (4.25), (4.26), in order to obtain (4.43). □
We conclude with a corollary of Theorem 4.12, which will be used later in the paper and follows by combining this result with Proposition 4.8:
Corollary 4.13
Assume that \(\partial g \in L^{1}L^{\infty}\) and that the Strichartz estimates for the homogeneous equation (4.4) hold in \(\mathcal {H}^{1}\). Then the full Strichartz estimates (4.43) hold in \(\mathcal {H}^{r}\) for all \(r \in {\mathbb{R}}\) for both paradifferential flows (4.1) and (4.2).
5 Control parameters and related bounds
5.1 Control parameters
Here we introduce our main control parameters associated to a solution \(u\) to the minimal surface equation, which serve to bound the growth of energy for both solutions to the minimal surface flow and for its linearization. We will use three such primary quantities, \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\) and ℬ, which are defined as Besov norms of the solution \(u\). Our notations here mirror similar notations in our earlier water wave paper [1].
We begin with \({\mathcal {A}}\), which is \(L^{\infty}\) based,
We next define the slightly stronger variant \({\mathcal {A}^{\sharp }}\gtrsim {\mathcal {A}}\), still at the same scaling but \(L^{2n}\) based,
Here the choice of the exponent \(2n\) is in no way essential, though it does provide some minor simplifications in one or two places.
Finally we define the time dependent ℬ control parameter which is again \(L^{\infty}\) based:
In a nutshell, the energy functionals we construct later in the paper will be shown to satisfy cubic balanced bounds of the form
which guarantee that energy bounds can be propagated for as long as \({\mathcal {A}^{\sharp }}\) remains finite and ℬ remains in \(L^{2}_{t}\). One should compare these bounds with the classical energy estimates, which have the form
and which require an extra half derivative in the control parameter.
We continue with a few comments concerning our choice of control parameters:
- Here \({\mathcal {A}}\) and \({\mathcal {A}^{\sharp }}\) are critical norms for \(u\), which may be described using the Besov notation as capturing the uniform norms in time
$$ {\mathcal {A}}= \| \partial u\|_{L^{\infty}_{t} B^{0}_{\infty ,1}}, \qquad {\mathcal {A}^{\sharp }}= \| \partial u\|_{L^{\infty}_{t} B^{\frac{1}{2}}_{2n,1}}. $$
In a first approximation, the reader should think of \({\mathcal {A}}\) as simply capturing the \(L^{\infty}\) norm of \(\partial u\); the slightly stronger Besov norm above is needed for minor technical reasons, and allows us to work with scale invariant bounds. Often we will simply rely on the simpler \(L^{\infty}\)-bound, since
$$ \| \partial u\|_{L^{\infty}} \lesssim {\mathcal {A}}\lesssim {\mathcal {A}^{\sharp }}. $$ (5.6)
- The control norm ℬ, taken at fixed time, is \(1/2\) derivative above scaling, and may also be described using the Besov notation as
$$ {\mathcal {B}}(t) = \| \partial u(t)\|_{B^{\frac{1}{2}}_{\infty ,2}}. $$
Again, in a first approximation one should simply think of it as \(\|\partial u\|_{BMO^{\frac{1}{2}}}\), which in effect suffices for most of the analysis. Indeed, we have
$$ \| \partial u\|_{BMO^{\frac{1}{2}}} \lesssim {\mathcal {B}}. $$ (5.7)
- Given the choice of these control parameters, it is not difficult to see that our energy estimates of the form (5.4) are invariant with respect to scaling. This by itself does not mean much; even the classical energy estimates, of the form (5.5), are scale invariant, but much less useful for low regularity well-posedness. What is important here is that our energy estimates are cubic and balanced.
- The fact that our control norms are based on uniform, rather than \(L^{2}\)-bounds, particularly at the level of ℬ, is also critical. This is what allows us to use Strichartz estimates to further improve the low regularity well-posedness threshold in our results.
- Concerning the dependence of constants in our estimates on \({\mathcal {A}}\), \({\mathcal {A}^{\sharp}}\) and ℬ, we adopt a two track system:
  - The dependence on ℬ is either linear or quadratic, and will always be explicitly stated in all estimates.
  - The dependence on \({\mathcal {A}}\) and \({\mathcal {A}^{\sharp}}\) is often nonlinear, in which case we use the notations \(\lesssim _{{\mathcal {A}}}\), \(\lesssim _{{\mathcal {A}^{\sharp}}}\). This dependence is less important, as beginning with Section 7 we will assume that \({\mathcal {A}^{\sharp}}\ll 1\), and drop it altogether except where the smallness is essential. But for clarity and also for later use, we do track this dependence in this and the next section.
- In terms of using \({\mathcal {A}}\) versus \({\mathcal {A}^{\sharp}}\), we first note that ideally we would like to avoid \({\mathcal {A}^{\sharp}}\) altogether, and just use the weaker control norm \({\mathcal {A}}\). But this appears not to be possible, which is why \({\mathcal {A}^{\sharp}}\) was introduced. To streamline the analysis, in what follows we will simply think of the implicit dependence as being on \({\mathcal {A}^{\sharp}}\), which suffices for our final result. One may even take the more radical step of dropping \({\mathcal {A}}\) altogether; we decided against that, both for historical reasons and for easy reference.
For bookkeeping reasons we will use a joint frequency envelope \(\{c_{k}\}_{k}\) for the dyadic components of each of \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\), and ℬ, so that
(i) \(\{c_{k}\}_{k}\) is normalized in \(\ell ^{2}\) and slowly varying,
(ii) We have control of dyadic Littlewood-Paley pieces as follows for \(\partial u\):
A-priori, these frequency envelopes depend on time. However, at the conclusion of the paper, we will see that for our rough solutions they can be taken to be independent of time, essentially equal to appropriate \(L^{2}\)-type frequency envelopes for the initial data.
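For clarity, we recall one common normalization of the slowly varying condition (the precise choice of constants is not important here): there is a small universal \(\delta _{0} > 0\) such that

$$ c_{j} \leq 2^{\delta _{0} |j-k|} c_{k} \qquad \text{for all } j, k, $$

which ensures that the envelope can be compared across nearby dyadic scales in all of the summations below.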
5.2 Related bounds
We will frequently need to use bounds that are similar to (5.8) in nonlinear expressions, so it is convenient to have a notation for the corresponding space:
Definition 5.1
The space \(\mathfrak{C}_{0}\) is the Banach space of all distributions \(v\) that satisfy the bounds
with the norm given by the best constant \(C\) in the above inequalities.
For this space we have the following algebra and Moser-type result:
Lemma 5.2
a) The space \(\mathfrak{C}_{0}\) is closed with respect to multiplication and para-multiplication. In particular \(\mathfrak{C}_{0}\) is an algebra.
b) Let \(F\) be a smooth function with \(F(0)=0\), and \(v \in \mathfrak{C}_{0}\). Then \(F(v) \in \mathfrak{C}_{0}\). In particular if \(\|v\|_{\mathfrak{C}_{0}} \lesssim 1\) then \(F(v)\) satisfies
In particular the above result applies to the metrics \(g\), \({\tilde{g}}\) and \({\hat{g}}\), all of which are smooth functions of \(\partial u\), and thus belong to \(\mathfrak{C}_{0}\) modulo constants (the constants being simply the Minkowski metric).
Proof
a) We first estimate the \(\mathfrak{C}_{0}\) norm for the paraproduct \(T_{f} g\) for \(f,g \in \mathfrak{C}_{0}\). This is straightforward, using the \(L^{\infty}\) bound for \(f\), for all but the uniform bound in (5.9). For the uniform bound, we change the summation order in the Littlewood-Paley expansion to obtain
It now remains to estimate \(\Pi (f,g)\) in \(\mathfrak{C}_{0}\). The uniform bound is almost identical to the one above. For the \({\mathcal {A}^{\sharp }}\) norm we use Bernstein’s inequality
and now the \(j\) summation is straightforward.
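The version of Bernstein's inequality invoked here is the standard one for frequency localized functions: for \(v\) localized at spatial frequency \(2^{j}\),

$$ \| v \|_{L^{\infty}} \lesssim 2^{\frac{nj}{p}} \| v \|_{L^{p}}, \qquad 1 \leq p \leq \infty , $$

so that with \(p = 2n\) one trades the \(L^{2n}\)-based \({\mathcal {A}^{\sharp }}\) bound for a uniform bound at the price of a factor of \(2^{j/2}\).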
For the ℬ norm, on the other hand, we estimate
and again the \(j\) summation is straightforward.
b) To prove the Moser inequality we use a continuous Littlewood-Paley decomposition, which leads to the expansion
To estimate \(P_{k} F(v)\) we consider several cases:
i) \(j = k+ O(1)\). Then \(c_{j} \approx c_{k}\), \(F'(v_{< j})\) is directly bounded in \(L^{\infty}\) and our bounds are straightforward.
ii) \(j < k-4\). Then we can insert an additional localization,
where we gain from the frequency difference
which more than compensates for the difference (ratio) between \(c_{j}\) and \(c_{k}\).
iii) \(j > k+4\). In this case we reexpand \(F'(v_{< j})\) and write
We further separate into two cases:
(iii.1) \(l = j + O(1)\). Then we simply bound \(F''(v_{< l})\) in \(L^{\infty}\), and estimate first for the \({\mathcal {A}^{\sharp }}\) bound using Bernstein’s inequality
where the \(j\) and \(l\) integrations are trivial. Next we estimate for the ℬ-bound
again with easy \(j\) and \(l\) integrations.
(iii.2) \(l < j-4\). Then we can insert another frequency localization,
and repeat the computation in (b.ii) but using (5.11) to account for the difference between \(l\) and \(j\). □
In order to avoid tampering with causality, the Littlewood-Paley projections we use in this paper are purely spatial. This is more of a choice between different evils than a necessity; see for instance the alternate choice made in [38]. A substantial but worthwhile price to pay is that on occasion we will need to separately estimate double time derivatives, in a somewhat imperfect but sufficient fashion.
A good starting point in this direction is to think of bounds for second derivatives of our solution \(u\). If at least one of the derivatives is spatial, then this is straightforward:
However, matters become more complex if instead we look at the second time derivative of \(u\). The natural idea is to use the main equation (3.5) to estimate \(\partial _{t}^{2} u\), by writing it in terms of spatial derivatives,
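Schematically, if the main equation is written in the form \({\tilde{g}}^{\alpha \beta} \partial _{\alpha} \partial _{\beta} u = 0\) with the normalization \({\tilde{g}}^{00} = -1\) used elsewhere in the paper, this reads

$$ \partial _{t}^{2} u = 2 {\tilde{g}}^{0j} \partial _{t} \partial _{j} u + {\tilde{g}}^{jk} \partial _{j} \partial _{k} u , $$

where each term on the right contains at most one time derivative.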
If one takes this view, the main difficulties we face are with the high-high interactions in this expression. But these high-high interactions have the redeeming feature that they are balanced, so they will often play a perturbative role. This leads us to define a corrected expression as follows:
Definition 5.3
“Good” second order derivatives
We denote by \(\widehat{\partial _{0} \partial _{0}} u\) or shortly \(\hat{\partial}_{t}^{2} u\) the expression
On the other hand if \((\alpha ,\beta ) \neq (0,0)\) then we define
With this notation, we have
Lemma 5.4
Assume that \(u\) solves the equation (3.5). Then for its second time derivative we have the decomposition
where the two components satisfy the uniform bounds
respectively
One should compare this with the easier direct bound (5.12) for spatial derivatives; the good part \(\hat {\partial }_{t}^{2} u\) satisfies a similar bound, but the error \(\pi _{2}(u)\) does not. Later, when such expressions are involved, we will systematically peel off the error perturbatively, and always avoid differentiating it further.
Proof
The main ingredient here is the Littlewood-Paley decomposition. For expository simplicity we prove (5.15) at fixed frequency \(k\). Using the notation in (5.13) we can rewrite equation (3.19) as
To finish the proof we consider the expression above localized at frequency \(2^{k}\), and evaluated in the \(L^{\infty}\)-norm
We bound each of the terms separately. For the second we use the fact that \({\tilde{g}}\) is bounded in \(L^{\infty}\), together with the third bound in (5.8), in order to get
For the first term we rely on the same procedure, and hence finish the proof of the first bound in (5.15). The second bound in (5.15) has as a starting point the same decomposition (5.18), except that this time we want to bound the RHS terms using the control norm \({\mathcal {A}}\). Here we use the first bound in (5.8) and the algebra property of \(L^{\infty}\) to obtain
The last bound to prove is (5.16), where, because the frequencies are balanced, we can easily even out the balance of derivatives and estimate each of the factors using the ℬ norm. Explicitly, \({\tilde{g}}^{\alpha \beta}\) is in \(\mathfrak{C}_{0} \) by Lemma 5.2, and hence we get that for indices \((\alpha , \beta )\neq (0, 0)\):
which is the first bound in (5.16). The second bound in (5.16) is similar, but replacing the \(L^{\infty}\) norms with \(L^{2n}\) norms. □
The above lemma motivates narrowing the space \(\mathfrak{C}_{0}\), in order to also include information about \(\partial _{t} v\). For later use, we also define two additional closely related spaces.
Definition 5.5
a) The space ℭ is the space of distributions \(v\) that satisfy (5.9) and, in addition, \(\partial _{t} v\) admits a decomposition \(\partial _{t} v = w_{1}+w_{2}\) so that
endowed with the norm defined as the best possible constant \(C\) in (5.9) and in the above inequality relative to all such possible decompositions.
b) The space \(\mathfrak{DC}\) consists of all functions \(f\) that admit a decomposition \(f = f_{1}+f_{2}\) so that
endowed with the norm defined as the best possible constant \(C\) in the above inequality relative to all such possible decompositions.
c) The space \(\partial _{x} \mathfrak{DC}\) consists of functions \(f\) that admit a decomposition \(f = f_{1}+f_{2}\) so that
endowed also with the corresponding norm.
We remark that, by definition, we have the simple inclusions
Based on what we have so far, we begin by identifying some elements of these spaces:
Lemma 5.6
We have
Proof
The bounds in (5.23) are trivial unless both derivatives are time derivatives, in which case they follow directly from the previous Lemma 5.4. □
The Moser estimates of Lemma 5.2 may be extended to this setting to include all smooth functions of \(\partial u\):
Lemma 5.7
a) We have the bilinear multiplicative relations
as well as
b) The space ℭ is closed under multiplication and para-multiplication; in particular it is an algebra.
c) Let \(F\) be a smooth function, and \(v \in \mathfrak{C}\). Then \(F(v) \in \mathfrak{C}\). In particular if \(\|v\|_{\mathfrak{C}} \lesssim 1\) and \(F(0) = 0\) then \(F(v)\) satisfies
d) In addition we also have the paralinearization error bounds
where \(R\) is as in Lemma 2.9, namely \(R(v) = F(v) - T_{F'(v)} v\).
Here part (a) is the main part, after which parts (b) and (c) become immediate improvements of Lemma 5.2. But the new interesting bound is the one in part (d), where, notably, we also bound the time derivative of \(R(v)\).
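The key elementary identity behind part (d), obtained directly by differentiating the definition of \(R\) and using the chain rule, is

$$ \partial _{t} R(v) = \left( F'(v) - T_{F'(v)} \right) \partial _{t} v - T_{\partial _{t} F'(v)} v , $$

in which both terms on the right are balanced: the first consists of the paraproduct contributions \(T_{\partial _{t} v} F'(v) + \Pi (F'(v), \partial _{t} v)\), while the second pairs the rough factor \(\partial _{t} F'(v)\) with the full function \(v\).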
Proof
a) Let \(z \in \mathfrak{C}_{0}\) and \(w \in \mathfrak{DC}\) with the decomposition \(w=w_{1}+w_{2}\) as in (5.20). We skip the first bound in (5.24), as it is a consequence of the rest of the estimates in (5.24) and (5.25), and first consider the paraproduct \(T_{z} w\). We will bound the contributions of \(w_{1}\) and \(w_{2}\) in the same norms as \(w_{1}\), respectively \(w_{2}\). Precisely, we have
respectively
Next we consider \(T_{w} z\), where we have two choices. The first choice is to use only the \({\mathcal {A}^{\sharp }}\) component of the \(\mathfrak{C}_{0}\) norm of \(z\), and prove the last bound in (5.24). Precisely, we have
respectively
Alternatively, we can use the ℬ component of the \(\mathfrak{C}_{0}\) norm of \(z\) in the bound for the \(w_{1}\) component,
and the \({\mathcal {A}^{\sharp}}\) component for the \(w_{2}\) term,
which leads to the first bound in (5.25).
It remains to consider the second bound in (5.25), where we have
respectively
b) Compared with part (a) of Lemma 5.2, it remains to estimate the time derivative of products and paraproducts. Using Leibniz’s rule, this reduces directly to the multiplicative bounds in (a).
c) Compared with part (b) of Lemma 5.2 it remains to estimate
as in (5.19), which is the same as placing it in \(\mathfrak{DC}\). By Lemma 5.2 we have \(F'(v) \in \mathfrak{C}_{0}\), while \(\partial _{0} v \in \mathfrak{DC}\). Then we can bound the product in \(\mathfrak{DC}\) by part (a) of this Lemma.
d) Subtracting the harmless linear part of \(F\), without any loss of generality we can assume that \(F'(0)=0\). We have
By (5.26) we can place \(F'(v) \in \mathfrak {C}\). Then the \({\mathcal {B}}^{2}\) bound follows directly from (5.25). For the \({\mathcal {A}^{\sharp}}^{2}\) bound we can replace \(\partial \) by \(\partial _{x}\) above, and use only \(\mathfrak {C}_{0}\) bounds. Then we can repeat the argument in Lemma 5.2 (a). □
Applying the above lemma shows that for smooth functions \(F\) with \(F(0)=0\) we have \(F(\partial u) \in \mathfrak{C}\), and in particular all components of the metrics \(g\), \({\tilde{g}}\) and \({\hat{g}}\) are in ℭ modulo constants. We also have \(F(\partial u) \partial ^{2} u \in \mathfrak{DC}\), which in particular shows that the gradient potentials \(A\) and \({\tilde{A}}\) belong to \(\mathfrak{DC}\).
We will use part (d) when \(w=\partial u\) and \(F=g\), in which case (5.27) reads
We remark that a similar \(H^{s}\) type bound for the same \(R\) is provided by (2.14), namely
The next lemma provides us with the primary example of elements of the space \(\partial _{x} \mathfrak{DC}\):
Lemma 5.8
We have
Proof
The bound in (5.30) is trivial if at least two derivatives are spatial, and follows from (5.23) unless all indices are zero. It remains to consider the case \(\alpha = \beta = \gamma = 0\). Here we rely on the earlier decomposition (5.17) to which we further apply a \(\partial _{t}\):
We now investigate each term separately, for fixed \((\alpha ,\beta ) \neq (0,0)\). We begin with the first term, which needs to be bounded in the \(\partial _{x} \mathfrak{DC}\) norm given in (5.21). We have
The term that contains the time derivative falling on the metric will be bounded using the Moser estimates of Lemma 5.7. Explicitly, we know that \({\tilde{g}}^{\alpha \beta}\) is in ℭ modulo constants, and due to Lemma 5.7, part (c), we get \(\partial _{t}{\tilde{g}}^{\alpha \beta}\in \mathfrak{DC}\), which allows us to decompose it as in (5.20), \(\partial _{t}{\tilde{g}}^{\alpha \beta}={\tilde{g}}_{1}^{\alpha \beta} + {\tilde{g}}_{2}^{\alpha \beta}\), where
We now turn to the last term, which we can estimate in two ways: first using the last part of (5.8),
respectively the first part of (5.8)
Putting together the bounds we have leads to
We now bound the second term in (5.31) as follows:
Here we know that \((\alpha , \beta ) \neq (0,0)\), hence there are two cases to consider: (i) either \(\alpha =0\) or \(\beta =0\), but not both, which overall means we need to bound \(\partial _{t} ^{2} \partial _{x} u\), or (ii) both \(\alpha , \beta \neq 0\), in which case we need a pointwise bound for \(\partial _{t}\partial ^{2}_{x} u\). However, both cases can be handled in the same way if we observe that \(\partial _{x} (\partial _{x}\partial _{t} u)\) and \(\partial _{x} (\partial ^{2}_{t} u)\) are elements of \(\partial _{x}\mathfrak{DC}\); this is a direct consequence of \(\partial ^{2} u \in \mathfrak{DC}\) as shown in (5.23), followed by the inclusion in (5.22).
Finally, the third and fourth terms in (5.31) can be treated in the same way the first term in (5.31) was shown to be bounded. □
We continue with another, slightly more subtle balanced bound:
Lemma 5.9
For \(g, h \in \mathfrak{C}\) define
Then we have the balanced bound
Proof
For \(\partial ^{2} u\) we use the \(\mathfrak{DC}\) decomposition as in (5.20),
We begin with the contribution \(r_{1}\) of \(f_{1}\), which we expand as
This vanishes unless the frequencies \(k_{1}\), \(k_{2}\) of \(g\) and \(h\) are either
(i) \(k_{1}, k_{2} \leq k\) and \(\max \{k_{1},k_{2}\} = k+O(1)\), or
(ii) \(k_{1} = k_{2} > k + O(1) \).
Then we use the \({\mathcal {A}^{\sharp }}\) component of the \(\mathfrak{C}_{0}\) norm for the lower frequency and the ℬ component for the higher frequency to estimate
as needed.
For the contribution \(r_{2}\) of \(f_{2}\) we use a similar expansion, and the first two lines of the estimate above are largely unchanged, except for the use of Bernstein's inequality in case (ii). But now we only use the \({\mathcal {A}^{\sharp}}\) component of the \(\mathfrak {C}_{0}\) norm for both the \(k_{1}\) and \(k_{2}\) frequencies. This leads to
which again suffices. □
As already discussed in the introduction, the paradifferential wave operator
as well as its counterparts \(T_{\tilde{P}}\) and \(T_{\hat{P}}\) with the metric \(g\) replaced by \({\tilde{g}}\), respectively \({\hat{g}}\), play an important role in our context.
Throughout the paper, we will interpret various objects related to \(u\) as approximate solutions for the \(T_{P}\) equation. We provide several results of this type, where we use our control parameters \({\mathcal {A}}\), ℬ in order to estimate the source term in the paradifferential equation for both \(u\) and for its derivatives.
Lemma 5.10
We have
as well as the similar bounds for \(T_{\tilde{P}}\) and \(T_{\hat{P}}\).
Proof
We first prove the bound (5.34), and for this we begin with the paradifferential equation associated to the minimal surface equation (3.5)
and further isolate the part we are interested in estimating
The estimate we want relies on getting bounds for the following terms
The bounds for all of these terms rely on the fact that \(g^{\alpha \beta}\) is in ℭ modulo constants and that \(\partial g^{\alpha \beta}, \partial _{\alpha}\partial _{\beta}u \in \mathfrak{DC}\) (a consequence of Lemma 5.7), as well as on the bound given by Lemma 5.4. Precisely, the estimate (5.25) implies that
Similar bounds are obtained for \(T_{\hat{P}}\) and \(T_{\tilde{P}}\) using the same results as in the proof of the bound (5.34) above. □
We next consider similar bounds for derivatives of \(u\). Here we will differentiate between space and time derivatives. We begin with spatial derivatives:
Lemma 5.11
We have
as well as the similar bounds for \(T_{\tilde{P}}\) and \(T_{\hat{P}}\).
Proof
For this proof we rely on the previous Lemma 5.10 and on Lemma 5.4. This becomes obvious after we commute the \(\partial _{x}\) across the \(T_{P}\) operator
The first term on the RHS of the identity above is bounded using (5.34) as follows:
Here we took advantage of the operator \(\partial _{x}\) accompanied by the frequency projector \(P_{k}\). A similar advantage will not present itself for the last term, where we need to distribute the \(\alpha \) derivative
We bound \(e_{1}\) using Lemma 5.7, by placing \(\partial _{x} \partial g^{\alpha \beta} \in \partial _{x} \mathfrak{DC}\) which means it will admit a decomposition as follows
where
Thus, we get
which leads to the desired bound once we estimate the last term accordingly. The bound can take one of the following forms
For the first term in the bracket we use the control norm \({\mathcal {A}}\), and for the second term we use the ℬ norm bound.
For \(e_{2}\) we use the decomposition in Lemma 5.4 for \(\partial _{\alpha}\partial _{\beta}u\) and for \(g\) we use the fact that \(g\) is in \(\mathfrak{C}_{0}\) modulo constants, where we can use either the \({\mathcal {A}}\) bound or the ℬ bound. The computations are similar to the case of \(e_{1}\).
The bounds for \(T_{\tilde{P}}\) and \(T_{\hat{P}}\) follow by exactly the same argument as the one used above in the \(T_{P}\) case. □
Lemma 5.12
a) We have
i.e. there exists a representation
b) We also have
Similar results hold with \(g\) replaced by \({\tilde{g}}\) or \({\hat{g}}\).
Proof
a) We write
Here for the first term we use Lemma 5.10, while for the second, by (5.25), we have
b) The first step here is to reduce to the case of the metric \({\tilde{g}}\). Each of the other two metrics may be written in the form \(h {\tilde{g}}\), with \(h= h(\partial u)\). Then we can write
The first term corresponds to our reduction, and the remaining terms need to be estimated perturbatively. This is straightforward unless \(\alpha = 0\), so we focus now on this case. Discarding constants, we can assume here that \(h(0)=0\).
For the middle term in (5.39) we can use the bound (5.26) in Lemma 5.7 to place \(h\) in ℭ. Then \(\partial _{0} h\) is in \(\mathfrak {DC}\). Using a \(\mathfrak {DC}\) decomposition for it, \(\partial _{0} h = h_{1}+h_{2}\), we can match the two terms with the two pointwise bounds for \(\widehat{ \partial _{\beta }\partial _{\gamma}} u\), namely
which follow from (5.8) and (5.15). This yields
which suffices.
For the last expression in (5.39) we distribute \(\partial _{0}\),
For the first term we can combine the bound (5.30) with the para-composition bound in Lemma 2.4 exactly as in the proof of Lemma 5.9. For the second term we use the same \(\mathfrak{DC}\) decomposition as above for \(\partial _{0} h\). For the \(h_{1}\) contribution we have a direct bound without using any cancellations, while for \(h_{2}\) we use again Lemma 2.4. The third term is similar to the second, with the roles of \(h\) and \({\tilde{g}}^{0\beta}\) interchanged. This concludes our reduction to the case of the metric \({\tilde{g}}\).
We continue with the second reduction, which is to switch \(\partial _{\alpha}\) and \(\partial _{\gamma}\) in the expression \(\partial _{\alpha }T_{{\tilde{g}}^{\alpha \beta}} \widehat{ \partial _{\beta }\partial _{\gamma}} u\); this allows us to replace the first term in (5.38) with the second. For fixed \(\alpha \) and \(\gamma \), we write
where we claim \(f\) satisfies
This is trivial if \(\alpha = \gamma = 0\). If both are nonzero, or if one of them is zero but \(\beta \neq 0\), then there is no hat correction and this is a straightforward commutator bound. It remains to discuss the case when \(\beta = 0\) and exactly one of \(\alpha \) and \(\gamma \) is zero, say \(\gamma = 0\). Then we need to consider the difference
which can be estimated as in (5.41) using the fact that \(\alpha \neq 0\) as well as the bound (5.15) for the second time derivative of \(u\), respectively the similar bound (5.26) (third estimate) for \(\partial _{0} {\tilde{g}}\). Hence (5.41) follows.
Finally, it remains to examine the expression
where, unlike above, we take advantage of the summation with respect to \(\alpha \) and \(\beta \). Then, using the \(u\) equation, we have
The term where both \(\alpha \) and \(\beta \) are zero vanishes since \({\tilde{g}}^{00}\) is constant; this was a motivation for the first reduction above. The remaining terms can be estimated directly as in (5.41) if \(\gamma \neq 0\), and using either (5.15) or (5.26) (third estimate) for \(\partial _{0} {\tilde{g}}\) otherwise. □
6 Paracontrolled distributions
To motivate this section, we start from the classical energy estimates for the wave equation, which are obtained using the multiplier method. Precisely, one multiplies the equation \(\Box _{g} u = f\) by \(Xu\) and simply integrates by parts. Here \(X\) is any regular time-like vector field. In the next section, we prove energy estimates for the paradifferential equation (3.25), by emulating this strategy at the paradifferential level. The challenge is then to uncover a suitable vector field \(X\). Unlike the classical case, here not every time-like vector field \(X\) will suffice. Instead \(X\) must be carefully chosen, and in particular it will inherently have a limited regularity.
Since the metric \(g\) is a function of \(\partial u\), scaling considerations indicate that the vector field \(X\) should be at the same regularity level. Naively, one might hope to have an explicit expression \(X = X(\partial u)\) for our vector field. Unfortunately, seeking such an \(X\) eventually leads to an overdetermined system. At the other extreme, one might enlarge the class of \(X\) to all distributions that satisfy the same \(H^{s}\) and Besov bounds as \(\partial u\), which is essentially the class of functions that satisfy (5.26). While this class will turn out to contain the correct choice for \(X\), it is nevertheless too large to allow for a clean implementation of the multiplier method.
Instead, there is a more subtle alternative, namely to require the vector field \(X\) to be paracontrolled by \(\partial u\). This terminology was originally introduced by Gubinelli, Imkeller and Perkowski [13] in connection with Bony's calculus, in order to study stochastic pde problems, see also [14]. However, similar constructions have been carried out earlier in renormalization arguments, e.g. for wave maps, in work of Tao [42], Tataru [47] and Sterbenz-Tataru [41]; the last reference used the name renormalizable for the corresponding class of distributions.
In the standard usage, this is more of a principle than an exact notion, which needs to be properly adapted to one’s purposes. For our own objective here, we provide a very precise definition of this notion, which is exactly tailored to the problem at hand.
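As background for this notion, it may help to recall Bony's paraproduct decomposition, which underlies both the paracontrolled calculus of [13] and the paradifferential framework used here. This is a standard formula; the frequency-gap convention below (a gap of four dyadic units) is our own illustrative choice and varies between authors:

```latex
% Bony decomposition into low-high, high-low and balanced (high-high) parts:
fg \;=\; T_f g \;+\; T_g f \;+\; \Pi(f,g), \qquad
T_f g \;=\; \sum_{k} f_{<k-4}\, g_k, \qquad
\Pi(f,g) \;=\; \sum_{|j-k|\leq 4} f_j\, g_k .
```

The paraproducts \(T_f g\) and \(T_g f\) retain the frequency localization of their high frequency factor, while the balanced part \(\Pi(f,g)\) is the only term where the two inputs interact at comparable frequencies.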
6.1 Definitions and key properties
Definition 6.1
We say that a function \(z\) is paracontrolled by \(\partial u\) in a time interval \(I\) if it admits a representation of the form \(z = T_{a^{\gamma}} \partial _{\gamma }u + r\),
where the vector field \(a\) and the error \(r\) have the following properties:
(i) bounded para-coefficients \(a\):
(ii) balanced error \(r\):
It is convenient to think of the space of distributions \(z\) paracontrolled by \(\partial u\) as a Banach space, which we denote by \({\mathfrak {P}}(\partial u)\), or simply \({\mathfrak {P}}\). The norm in this Banach space is defined to be the largest implicit constant in (6.2) and (6.3), minimized over all representations of the form (6.1). If \(\|z\|_{{\mathfrak {P}}} \lesssim _{{\mathcal {A}^{\sharp }}} 1\) then we will simply write
While for the most part this definition can be applied separately at each time \(t\), in our context we will think of both \(u\) and \(z\) as functions of time, and think of these bounds as uniform in \(t\). Precisely, above we think of \({\mathcal {A}^{\sharp }}\) as a global, time independent parameter, whereas ℬ is allowed to be a possibly unbounded function of \(t\).
To better understand the space \({\mathfrak {P}}\) of paracontrolled distributions, it is useful to relate it to the objects we have already discussed in the previous section:
Lemma 6.2
a) We have the inclusion \({\mathfrak {P}}\subset \mathfrak{C}\).
b) If \(F\) is a smooth function with \(F(0) = 0\), then \(F(\partial u) \in {\mathfrak {P}}\).
Proof
a) Clearly \(\partial u \in {\mathfrak {P}}\). Then the first term in (6.1) can be placed in ℭ by part (b) of Lemma 5.7. The error term \(r\) also belongs to \(\mathfrak {C}_{0}\) by Bernstein’s inequality and interpolation. This can be upgraded to ℭ using the \(\partial _{0} r\) bound in the second inequality in (6.3).
b) This is a direct consequence of parts (c), (d) of Lemma 5.7. □
Thus one may think of the class \({\mathfrak {P}}\) of paracontrolled distributions as an intermediate stage between the class of smooth functions of \(\partial u\), which is too narrow for our purposes, and the larger class ℭ, which does not carry sufficient structure.
Next we consider nonlinear properties:
Lemma 6.3
a) [Algebra property] The space \({\mathfrak {P}}(\partial u)\) is an algebra. Further, if \(z_{1}, z_{2} \in {\mathfrak {P}}\) have paracoefficients \(a_{1}\), respectively \(a_{2}\), then the paracoefficients of \(z_{1} z_{2}\) can be taken to be \(z_{1} a_{2}+z_{2} a_{1}\).
b) [Moser inequality] If \(F\) is a smooth function with \(F(0)=0\) and \(z \in {\mathfrak {P}}\), then
and \(F(z)\) satisfies
Further, if \(z \in {\mathfrak {P}}\) has paracoefficients \(a\), then the paracoefficients of \(F(z)\) can be taken to be \(F'(z) a\).
Proof
a) We consider the algebra property. Let
and expand \(z_{1}z_{2}\).
We first observe that we can place \(\Pi (z_{1},z_{2})\) into the error term. For this it suffices to use the ℭ norm for \(z_{1}\), \(z_{2}\) and apply the second bound in (5.25).
We next consider \(T_{z_{1}} z_{2}\), where for \(z_{1}\) we again use only the ℭ norm. We begin with \(T_{z_{1}} r_{2}\), which we also estimate as an error term. Here we again estimate the more difficult time derivative. If it falls on the first factor then we can bound the output exactly as in the balanced case above, see (5.25). Else, it suffices to use the uniform bound on \(z_{1}\).
Finally, we consider the expression
where the first term has a ℭ coefficient by the ℭ algebra property, and the second may be estimated perturbatively. Here if the time derivative goes on the first factor then we are back to the previous case and no cancellation is needed. Else for \(\partial _{t} \partial _{\gamma }u\) we use the decomposition in Definition 5.1(a) (or simply Lemma 5.4), combined with Lemma 2.7.
b) To prove the Moser inequality, our starting point is Lemma 5.7(d), which allows us to reduce the problem to estimating \(T_{F'(z)} z\), using only the ℭ norm of \(z\). But here we can bound \(F'(z)\) in ℭ using the Moser bound in ℭ, which allows us to conclude as in part (a). □
In addition to the above lemmas, functions in \({\mathfrak {P}}\) essentially solve a paradifferential \(\Box _{\tilde{P}}\) equation. This will be used later to estimate lower order terms in the proof of Theorem 7.1, and for (7.108):
Lemma 6.4
Let \(h \in {\mathfrak {P}}\). Then there exist functions \(f^{\alpha}\) so that we have the representation
which satisfy the following bounds:
respectively
The same result holds for the metrics \({\tilde{g}}\), \({\hat{g}}\).
Proof
We use the representation (6.1) for \(h\). The property in the lemma holds trivially for the \(r\) component of \(h\), with
Precisely, the bound (6.5) holds due to the second part of (6.3), while for the bound (6.6), the \(\partial _{0} r\) component cancels and then we can use the first part of (6.3).
It remains to consider \(h\) of the form \(h = T_{a^{\gamma}} \partial _{\gamma }u\). We write
noting that the expression on the left hand side of (6.6) is exactly \(h^{0}-f^{0}\). We begin by refining the expression for \(h^{\alpha}\), noting that corrections of size \({\mathcal {B}}^{2}\) may be directly included into \(f^{\alpha}\) without harming (6.6). For this we write
where the first term on the right is the leading term, while the remaining terms can be estimated by \({\mathcal {B}}^{2}\) as follows:
-
The second, fourth and fifth terms are estimated directly using (6.2) for \(\partial _{\beta }a^{\gamma}\), \(\partial _{\gamma }g^{\alpha \beta}\) respectively \(\partial _{\gamma }a^{\gamma}\).
-
The third term is estimated using the commutator bound in Lemma 2.4, as well as Lemma 5.4 if both \(\beta \) and \(\gamma \) are zero.
We have reduced the problem to the case when
At this point we rewrite
noting that
which allows us to switch \(h\) and \(\tilde{h}\) also in (6.6). Then we are allowed to correct \(\tilde{h}^{\gamma}\), by writing
Now both terms on the right can be estimated by \({\mathcal {B}}^{2}\) as follows:
-
The first term is estimated directly using (6.2) for \(\partial _{\alpha }a^{\gamma}\).
-
The second term is estimated using Lemma 5.10.
Hence the proof of the lemma is concluded. □
In addition to the class of paracontrolled distributions \({\mathfrak {P}}\) we also define a secondary class of distributions, which roughly speaking corresponds to derivatives of \({\mathfrak {P}}\) functions.
Definition 6.5
The space \({\mathfrak{DP}}\) of distributions consists of functions \(y\) that admit a representation
where
with the natural associated norm.
Due to the inclusion \({\mathfrak {P}}\subset \mathfrak{C}\), we can directly relate it to the class \(\mathfrak{DC}\) introduced earlier.
Lemma 6.6
We have \({\mathfrak{DP}}\subset \mathfrak{DC}\).
Next we verify that \({\mathfrak{DP}}\) is stable under multiplication by \({\mathfrak {P}}\) functions.
Lemma 6.7
We have the bilinear bound
As a corollary of this lemma, it follows that our gradient potentials \(A^{\gamma}\) and \({\tilde{A}}^{\gamma}\) are in \({\mathfrak{DP}}\).
Proof
For \(h,z \in {\mathfrak {P}}\) we consider the expansion
The first term is in \({\mathfrak{DP}}\) by Lemma 6.3(a). The three remaining terms can be perturbatively estimated by \({\mathcal {B}}^{2}\), using the bounds in (5.25). □
Finally, we consider decompositions for \({\mathfrak{DP}}\) functions which are akin to Lemma 5.4. We will do this in two different ways, one which is shorter but loses some structure, and another which is more involved but retains full structure.
Lemma 6.8
Let \(w \in {\mathfrak{DP}}\). Then \(w\) admits a representation of the form
where
Proof
It suffices to consider \(w\) of the form \(w = \partial _{0} z\) where \(z \in {\mathfrak {P}}\), with a representation as in (6.1),
with \(a^{\gamma}\), \(r\) as in (6.2), (6.3). The bound (6.3) allows us to discard the contribution of \(r\) to (6.10). It remains to produce an appropriate modification \(\partial _{x} z_{1}\), with \(z_{1} \in \partial _{x} {\mathfrak {P}}\), for the expression
We successively peel off perturbative \(O({\mathcal {B}}^{2})\) layers from \(q\). First we use (6.2) to write
At this point we have two cases to consider:
(i) \(\gamma \neq 0\). Then we write
and the remaining expression is in \(\partial _{x} {\mathfrak {P}}\).
(ii) \(\gamma = 0\). Here we use the equation for \(u\) to write
Here the first term on the right involves at least one spatial derivative and is treated as before, in the case \(\gamma \neq 0\), while the contributions of the last two terms are perturbative, and can be bounded by \({\mathcal {B}}^{2}\). □
Our second representation provides a more explicit recipe to obtain the corrected version not only of \({\mathfrak{DP}}\) functions, but also of \({\mathfrak {P}}\times {\mathfrak{DP}}\) functions:
Lemma 6.9
Let \(w = z_{1}^{\alpha }\partial _{\alpha }z_{2}\), where \(z_{1}, z_{2} \in {\mathfrak {P}}\), and \(z_{2}\) has the \({\mathfrak {P}}\) representation
Define
Then we have
while
Proof
The contribution of \(r\) is directly perturbative so we discard it. Furthermore, the bounds in (5.25) allow us to replace perturbatively \(w\) by
Using also Lemma 5.4 we obtain
Finally, we use Lemma 2.7 to combine the two paraproducts, arriving at
as needed. Finally, the bound (6.12) is also a consequence of Lemma 5.4. □
The last lemma helps us uncover a more subtle, hidden \(\Box _{g}\) structure which appears if we compute the double divergence of the metric \({\tilde{g}}\).
Lemma 6.10
We have
Proof
For fixed \(\beta \) we expand \(\partial _{\beta }{\tilde{g}}^{\alpha \beta}\) using the relation (3.9) to obtain
For this expression we define a corresponding ring correction
which is also chosen to vanish if \((\alpha ,\beta )=(0,0)\). We claim that the difference is perturbative for fixed \(\alpha \) and \(\beta \),
Indeed, if \(\alpha \neq 0\) then this follows directly from Lemma 6.9. On the other hand if \(\alpha =0\) then \(\beta \neq 0\) in which case the hat correction can be discarded and we may distribute the time derivative, using the fact that \(\partial u, {\tilde{g}}\in \mathfrak{C}\) modulo constants, see Lemma 5.7.
It remains to estimate the expression \(\partial _{\alpha}(\partial _{\beta }\mathring{{\tilde{g}}}^{\alpha \beta}) \), where we return to the standard summation convention and take the sum with respect to all \((\alpha ,\beta )\). Here we separate the three terms in \(\partial _{\beta }\mathring{{\tilde{g}}}^{\alpha \beta}\), in particular forfeiting the cancellation when \((\alpha ,\beta )=(0,0)\). By Lemma 5.7 all paracoefficients are in ℭ, which allows us to perturbatively commute \(\partial _{\alpha}\) with them as needed. Then it suffices to estimate the expression
For all terms here we may use Lemma 5.12(b) directly. Hence the proof of the lemma is concluded. □
6.2 Symbol classes and the \({\mathfrak {P}}\)DO calculus
In a similar fashion to the \(L^{\infty }S^{m}\) classes of symbols, our analysis will involve paradifferential operators with symbols that on the physical side are at either the \({\mathfrak {P}}\) or the \({\mathfrak{DP}}\) level. Precisely, we will work with both the symbol classes \({\mathfrak {P}}S^{m}\) and with the classes \({\mathfrak{DP}}S^{m}\).
For comparison purposes, we recall that for just paraproducts with \({\mathfrak {P}}\) functions \(f\), we have the uniform in time product bounds
as well as the time dependent bounds
and the corresponding commutator estimates. We also have, for \(h \in {\mathfrak{DP}}\),
Our objective in what follows is to extend these kinds of bounds to the \(\Psi \)DO setting. We will see that things become more complex there. Fortunately, in the present paper we will need such results primarily when one of the operators is a paraproduct, so we only prove our results in this case and merely make some comments about the general case.
We begin with the uniform in time bounds, i.e. the counterpart of (6.14), where not much changes:
Lemma 6.11
Let \(f \in {\mathfrak {P}}S^{j}\), \(g \in {\mathfrak {P}}S^{k}\). Then
Proof
By definition we have a decomposition \(f = f_{1} + f_{2}\) where \(f_{1}\) is an \(S^{j}\) multiplier and \(f_{2} \in {\mathcal {A}^{\sharp }}L^{\infty }S^{j}\), and similarly for \(g\). Since \(T_{f_{1}} = f_{1}(D)\) and \(T_{g_{1}} = g_{1}(D)\), the leading parts cancel and we are left only with \(O({\mathcal {A}^{\sharp }})\) terms, which can be estimated directly without using any cancellation. □
Our next result is concerned with the counterpart of (6.16), where again the result is similar:
Lemma 6.12
For \(g \in {\mathfrak {P}}\) and \(h \in {\mathfrak{DP}}S^{m}\) we have
Here, by a slight abuse of notation, by \(T_{g} h\) we mean the symbol paraproduct, where the Fourier variable is viewed as a parameter.
Proof
All operators in the lemma preserve dyadic frequency localization, so it suffices to fix a dyadic frequency size \(k\) and then show that we have
Here we can include the \(2^{mk}\) factor in \(h\) and reduce the problem to the case when \(m=0\).
In the first term we can also harmlessly replace \(g\) by \(g_{< k}\) and \(T_{g}\) by multiplication by \(g_{< k}\), as
while \(h_{< k} \in 2^{\frac{k}{2}} {\mathcal {B}}L^{\infty }S^{0}\) therefore
Similarly, in the second term we can replace \(T_{g} h\) by \(g_{< k} h\), as
akin to Lemma 6.7.
Thus it remains to bound in \(L^{2}\) the simpler operator
Our last simplification here is to separate variables in \(h\), and reduce to the case where \(h\) has a product form at frequency \(2^{k}\), namely
where \(f \in {\mathfrak{DP}}\) and \(a \in S^{0}\). This can be done for instance by thinking of \(h\) as a function of \(\xi \) in a dyadic frequency cube, smooth on the corresponding scale, and by taking a Fourier series in \(\xi \), with coefficients depending on \(x\). The coefficients will inherit the spatial regularity from \(h\), and will be rapidly decreasing since we are taking the Fourier series of a smooth function of \(\xi \).
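The separation of variables just described can be sketched schematically as follows, with our own indicative normalizations: on the region \(\{|\xi| \approx 2^k\}\), covered by a dyadic cube, one expands the symbol in a Fourier series in \(\xi\),

```latex
h(x,\xi) \;=\; \sum_{j \in {\mathbb Z}^n} f_j(x)\, e^{2\pi i\, j\cdot \xi / 2^{k+C}}
\qquad \text{for } |\xi| \approx 2^k ,
```

where the \(f_j\) are the \(x\)-dependent Fourier coefficients of \(h(x,\cdot)\) on the cube. Since \(h\) is smooth in \(\xi\) on the \(2^k\) scale, repeated integration by parts in \(\xi\) shows that the \(f_j\) decay rapidly in \(j\), while inheriting the spatial regularity of \(h\); this reduces matters to the product symbols \(f(x)\, a(\xi)\).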
After this simplification we may represent the operator \(T_{h}\) in the form
where the symbol of the bilinear form \(L_{lh}\) depends linearly (and explicitly) on \(a\). In this case we may rewrite the operator \(R\) in the form
At this point we can apply one last time the method of separation of variables to the symbol of \(L_{lh}\) to reduce the problem to the case when the bilinear form \(L_{lh}\) is of product type,
where both symbols \(b_{< k}\) and \(c P_{k}\) are bounded and smooth on the \(2^{k}\) scale. After this final reduction the operator \(R\) has a commutator structure,
Here \(|P_{< k} f| \lesssim 2^{\frac{k}{2}} {\mathcal {B}}\), while the commutator can be bounded by
Hence we obtain
and the proof of the lemma is concluded. □
In very limited circumstances, we will also need a more precise commutator expansion, which arises in the context where we commute one paradifferential operator with symbol \(h \in {\mathfrak {P}}S^{m}\) with a function \(g \in {\mathfrak {P}}\). This will be applied when \(g = {\tilde{g}}^{\alpha \beta}\), but the result holds more generally. The novelty in the commutator expansion below is that we do not simply expand
but instead we seek to better understand the structure of the error,
The principal part corresponds exactly to the Lie bracket of the two symbols, interpreted paradifferentially. For possible use later, we define this more generally for two symbols:
Definition 6.13
The para-Lie bracket of two symbols \(f \in {\mathfrak {P}}S^{j}\), \(g \in {\mathfrak {P}}S^{k}\) is defined as
This belongs to \({\mathfrak{DP}}S^{j+k-1}\).
We remark that if \(f\) is merely a function, then the first term on the right drops.
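For orientation, we recall the classical Poisson bracket, of which the para-Lie bracket above is a paradifferential counterpart (up to an overall sign convention):

```latex
\{f,g\}_p \;=\; \partial_\xi f \cdot \partial_x g \;-\; \partial_x f \cdot \partial_\xi g,
\qquad
\{\cdot,\cdot\}_p : S^{j} \times S^{k} \longrightarrow S^{j+k-1}.
```

In particular, if \(f = f(x)\) is a function then \(\partial_\xi f = 0\) and the first term drops, matching the remark above; the order \(j+k-1\) matches the class \({\mathfrak{DP}}S^{j+k-1}\) in the definition.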
While the principal part of the commutator can be described using a paradifferential operator with an appropriate symbol, the unbalanced subprincipal part has a more complex structure which would be described best using a variable coefficient bilinear form. In order to be able to describe this structure, we need a slight expansion of the class \(L_{lh}\) of bilinear operators in Definition 2.2:
Definition 6.14
By \({\mathfrak {P}}S^{m} L_{hl}\) we denote any bilinear operator which is a linear combination of operators of the form
which is either finite, or infinite but rapidly convergent.
With this notation, we have the following commutator result:
Proposition 6.15
For \(g \in {\mathfrak {P}}\) and \(h \in {\mathfrak {P}}S^{m}\) we have the commutator expansion
where
Proof
As in the proof of Lemma 6.12, we first localize in frequency to a dyadic scale \(2^{k}\) for the input/output, and reduce to the case \(s = 0\) and \(m = 2\).
We consider first the special case when \(h\) is a multiplier, \(h(x,\xi )=h(\xi )\). Then
In this case we claim that we have an exact formula,
A priori the last term on the right, \(C\), is a \(lh\) type translation invariant bilinear form in \(g\), \(u\); all we need to do is to compute its symbol \(R(\eta ,\xi )\), and verify that it has symbol type regularity and vanishes to second order when \(\eta =0\). The symbol for \(T_{g} u\) as a bilinear form in \(g\) and \(u\) is
Then the symbol for the commutator is
We expand the last difference as a Taylor series around the middle as
with \(r\) a smooth symbol in both \(\eta \) and \(\xi \) on the \(2^{k}\) scale for \(|\eta | \ll |\xi | \approx 2^{k}\). The middle term gives the symbol of the Weyl quantization for the Lie bracket \(\{h, g\}_{p}\). The last term yields the error term \(C\), which has the \(\eta ^{2}\) factor corresponding to the two derivatives of \(g\).
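To illustrate the expansion "around the middle": for a multiplier \(h\) at frequency \(2^k\), with \(\eta\) the frequency of \(g\) and \(|\eta| \ll |\xi| \approx 2^k\), the relevant difference takes the schematic form

```latex
h(\xi) - h(\xi - \eta)
\;=\; \eta \cdot \nabla h\!\left(\xi - \tfrac{\eta}{2}\right) \;+\; r(\eta,\xi).
```

By the symmetry of the two Taylor expansions around the midpoint \(\xi - \eta/2\), the quadratic terms cancel, so the remainder \(r\) vanishes at least to second order at \(\eta = 0\) (in fact to third order for smooth \(h\)), which is the source of the \(\eta^2\) factor, i.e. of the two derivatives falling on \(g\).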
Next we turn our attention to the general case, which we seek to reduce to the special case above. This is achieved by separating variables in the symbol \(h\), which allows us to assume without any restriction in generality that the symbol \(h\) has the form
Then we have a corresponding decomposition at the operator level,
Here we can estimate the commutator with \(T_{g}\) as an error term,
This is most readily seen using another separation of variables, which allows us to reduce the problem to the case when
after which we may apply Lemma 2.4. The same lemma also shows that the commutator \([T_{a},T_{g}]\) yields an error term, so we arrive at
For the commutator on the right we apply the formula (6.23), which yields
It remains to refine the first product,
for which we use Lemma 6.12. □
Our final result here is a product formula where we also need an expansion akin to (6.21). One should contrast this with Lemma 6.12, where such an expansion was not necessary.
Proposition 6.16
For \(g \in {\mathfrak {P}}S^{m}\) and \(h \in {\mathfrak{DP}}\) we have the commutator expansion
where
Proof
The proof follows the same outline as the proof of the previous proposition, so we only outline the main points.
We localize first in frequency to a dyadic frequency region at scale \(2^{k}\), and then separate variables in the first factor. If \(g\) is simply a multiplier then (6.25) is an exact identity akin to (6.23) above. If instead
then we expand \(T_{g}\) as in (6.24), and then replace \(T_{a}\) by multiplication by \(a_{< k}\), using (6.18), (6.19). After these simplifications, we are left with estimating the difference
This difference is easily turned into another commutator and estimated as in (6.26); this is achieved by separating again variables in the symbol of \(L_{lh}\) as in the analysis after (6.24). □
7 Energy estimates for the paradifferential equation
Our objective in this section is to prove that the linear paradifferential flow
is locally well-posed in a range of Sobolev spaces. Precisely, we will show that
Theorem 7.1
Let \(u\) be a smooth solution for the minimal surface equation (3.5) in a time interval \(I=[0,T]\), with associated control parameters \({\mathcal {A}^{\sharp }}\) and ℬ so that
Let \(s \in {\mathbb{R}}\). Then the linear paradifferential flow (7.1) is locally well-posed in \(\mathcal {H}^{s}\) in the time interval \(I\). Furthermore, there exists an energy functional \(E^{s}(v) = E^{s}(v[t])\), depending on \(u\), which is smooth in \(\mathcal {H}^{s+1}\), with the following two properties:
a) Energy equivalence:
b) Energy estimate:
The same result is also valid for the paradifferential equations (3.26), respectively (3.27) associated to the metrics \({\tilde{g}}\) and \({\hat{g}}\).
We remark on the modular structure of our arguments. Precisely, from this section it is only the conclusion of this theorem which is used later in the paper. We also remark on the smallness condition for \({\mathcal {A}^{\sharp }}\):
Remark 7.2
The condition that \({\mathcal {A}^{\sharp }}\ll 1\) in the theorem is a technical convenience rather than a necessity. It is only used in the reduction in Proposition 7.3 in order to ensure that the operator \(T_{g^{00}}\) is invertible, and then in Lemma 7.4 in order to ensure that our vector field \(X\) is forward time-like. Since \(|g^{00}| \gtrsim 1\), this may alternatively be guaranteed by a more careful choice of the quantization, respectively construction of \(X\). Another minor advantage is that with this assumption we no longer need to track the dependence on \({\mathcal {A}^{\sharp }}\) of the implicit constants in all the estimates.
It will be easier to prove the result for the paradifferential flow associated to the metric \({\tilde{g}}\). Because of this, our first step will be to reduce the problem to this case. Then we will prove the result for \({\tilde{g}}\) in two steps. First, we show that the desired result holds for \(s = 0\). Then, we use a paraconjugation argument to show that the same result holds for all real \(s\).
7.1 Equivalent metrics
The idea here is that we can replace the metric \(g\) with the conformally equivalent metric \({\tilde{g}}\) given by (3.18) in order to simplify the subsequent analysis. A similar equivalence holds for the metric \({\hat{g}}\); the argument is completely identical.
Then we have the following equivalence:
Proposition 7.3
Assume that \(v\) solves (7.1). Then it also satisfies an equation of the form
where \(E\) is invertible and elliptic,
and \({\tilde{R}}\) is balanced,
Proof
We first observe that, since \(g^{00}\) is a small, \(O({\mathcal {A}})\) perturbation of a nonzero constant, it follows that \(T_{(g^{00})^{-1}}\) is an invertible elliptic operator, with elliptic inverse \(E = (T_{(g^{00})^{-1}})^{-1}\), which satisfies (7.6) for all real \(s\).
Then \(v\) solves (7.5) with \({\tilde{R}}\) of the form
Here we have the algebraic relations
This allows us to estimate \({\tilde{R}}\) in a balanced fashion using Lemma 5.7 and Lemma 2.7, as desired. □
As a consequence of this result, we see that it suffices now to prove the result in Theorem 7.1 but with the equation (7.1) replaced by
7.2 The \(H^{1} \times L^{2}\) bound
For expository purposes, we first review the multiplier method for proving energy estimates for the wave equation in a simplified setting. Guided by this, we construct a suitable vector field, to be used as our multiplier. Finally, we reinterpret the energy estimates at the paradifferential level, and prove Theorem 7.1 with \(s = 1\).
7.2.1 Energy estimates via the multiplier method
Suppose that we have a function \(v\) that solves a divergence form wave equation,
Given a vector field \(X = X^{\alpha }\partial _{\alpha}\), the standard strategy is to multiply the equation by \(X v\) and integrate by parts. For expository purposes we will follow this path here, noting that another alternative would be to interpret the vector field in the Weyl calculus, and work instead with the skew-adjoint operator
At this point we only seek to identify the principal part of the energy estimates, which will lead us to the choice of the vector field \(X\), so we do not follow this second path. However, later on, once \(X\) is chosen and we have switched to the paradifferential setting, we will need to also carefully track the lower order terms, and we will add lower order corrections to our vector field.
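The skew-adjoint operator alluded to above would take, for a real vector field \(X\), the standard symmetrized form; this is only a sketch, as the paper's version would in addition quantize the rough coefficients paradifferentially:

```latex
X_{\mathrm{skew}}
\;=\; \tfrac12\left(X^\alpha \partial_\alpha + \partial_\alpha \circ X^\alpha\right)
\;=\; X^\alpha \partial_\alpha + \tfrac12\,(\partial_\alpha X^\alpha),
\qquad
X_{\mathrm{skew}}^* \;=\; -X_{\mathrm{skew}} .
```

The zero order term \(\tfrac12 (\partial_\alpha X^\alpha)\) is exactly what makes the operator skew-adjoint, and foreshadows the lower order corrections to the multiplier discussed below.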
To further place the following computations into context, we remark that vector field energy identities for the wave equation are often employed in their covariant form, which is derived by contracting the divergence-free relation for the energy momentum tensor with the vector field \(X\), and integrating with respect to the measure associated with the metric \(g\). Such a strategy would work, but would be counterproductive in our setting, where we will reinterpret all these identities in a paradifferential fashion.
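For completeness, we recall this classical covariant identity, with a common normalization (signature \((-+++)\)) that may differ from the paper's conventions. Denoting by \(\pi = \mathcal{L}_X g\) the deformation tensor of \(X\), for solutions of \(\Box_g v = 0\) one has

```latex
T_{\alpha\beta}[v]
\;=\; \partial_\alpha v\, \partial_\beta v
\;-\; \tfrac12\, g_{\alpha\beta}\, g^{\gamma\delta}\partial_\gamma v\, \partial_\delta v,
\qquad
\nabla^\alpha\!\left(T_{\alpha\beta} X^\beta\right)
\;=\; \tfrac12\, T^{\alpha\beta}\pi_{\alpha\beta},
\quad
\pi_{\alpha\beta} = \nabla_\alpha X_\beta + \nabla_\beta X_\alpha .
```

Integrating over a slab \([0,T]\times{\mathbb R}^n\) yields the energy identity, with the flux generated by the deformation tensor; in particular the flux vanishes exactly when \(X\) is Killing. Moreover, \(T(\partial_t, X)\) is pointwise positive definite in \(\partial v\) whenever \(\partial_t\) and \(X\) are both uniformly forward time-like; in the flat case with \(X = \partial_t\) one computes \(T_{00} = \tfrac12\big((\partial_t v)^2 + |\nabla_x v|^2\big)\).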
Assuming at first that the function \(v\) is compactly supported, integrating by parts several times, in order to essentially commute the second order part of \(P\) with \(X\), one arrives at the identity
where \(c_{X}\) is a quadratic expression in \(\partial v\) of the form
with coefficients given by the relation
where we recall that \(p(x,\xi ) = g^{\alpha \beta}\xi _{\alpha }\xi _{\beta}\). Removing the compact support assumption on \(v\) and introducing boundaries at times \(t = 0\) and \(t = T\), the identity above with the integral taken over \([0,T]\times {\mathbb{R}}^{n}\) still holds but with added contributions at these times,
where the contributions at the initial and final time can be thought of as energies. Here the energy density \(e_{X}\) is a bilinear expression of the form
This can be written in terms of the energy momentum tensor associated to the \(\Box _{g}\) operator,
Then we have
Thus we can define the energy functional associated to the vector field \(X\) as
The key property of the energy density \(e_{X}\) is that it is classically known to be positive definite in a pointwise sense,
provided that the vector fields \(\partial _{t}\) and \(X\) are uniformly forward time-like. Then we obtain the energy coercivity property
With these notations, we can rewrite the integral identity (7.13) as a differential identity
In a nutshell, this computation, interpreted paradifferentially, is at the heart of our proof of the energy estimates. In this context, the choice of the vector field \(X\) should naively be governed by the requirement that the energy flux form \(c_{X}\) is balanced. We note that one cannot ask for \(c_{X}\) to be zero, as this would produce an overdetermined system for \(X\), which in particular implies the condition that \(X\) is a conformal Killing field for the metric \(g\). Even the requirement that \(c_{X}\) is balanced turns out to be a bit too much, which is why we will need a second step to the above computation.
Precisely, the second step is based on another interesting observation, namely that the contribution of terms in \(c_{X}^{\alpha \beta}\) of the form
has a favourable structure and can be eliminated using a suitable Lagrangian type energy correction.
Indeed, for compactly supported \(v\), this contribution can be rewritten, integrating by parts, as
The first term can be interpreted as a correction to \(X\) in (7.10). Introducing the notation
it now takes the form
where the coefficient \(d\) of the additional zero order term is
Finally, adding in boundaries at \(t=0,T\) we obtain the integral relation
where the leading flux density is now
while the new energy density \(e_{X,q}\) has the form
We can also convert this into a differential relation akin to (7.19), namely
The identity (7.24) will heuristically provide the intuition for the proof of the desired energy estimate. However, to make this rigorous we will have to re-implement the above computation at the paradifferential level. There, the treatment of the lower order terms will differ slightly, in part in order to avoid a need for direct bounds on higher order time derivatives of \(u\).
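Schematically, ignoring the magnetic term and the boundary contributions, the Lagrangian-type correction rests on the following formal integration by parts for smooth, compactly supported \(v\) (our own normalization of the terms):

```latex
\int q\, g^{\alpha\beta}\partial_\alpha v\, \partial_\beta v \;dx\,dt
\;=\; -\int q\, v\, \partial_\alpha\!\left(g^{\alpha\beta}\partial_\beta v\right) dx\,dt
\;+\; \tfrac12 \int \partial_\beta\!\left(g^{\alpha\beta}\partial_\alpha q\right) v^2 \;dx\,dt .
```

The first term on the right couples \(q v\) to the second order part of \(P v\), and thus corrects the multiplier \(Xv\) by a zero order term, while the last term produces the zero order coefficient \(d\), whose leading part involves the second order part of \(P\) applied to \(q\); this is why good control of \(P q\) is needed in the choice of the weight.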
Based on the relation (7.24), the vector field \(X\) will have to be chosen so that the symbol for the bilinear form \(c_{X,q}\) is balanced, or equivalently so that \(c_{X}\) is balanced modulo a Lagrangian contribution. In turn, the Lagrangian correction weight \(q\) will have to be chosen carefully, so that it satisfies multiple requirements:
-
(1)
Comparing the form of \(c_{X,q}\) with the earlier expression for \(c_{X}\), a natural choice would seem to be
$$ q = \partial _{\gamma }X^{\gamma}. $$ (7.25)
-
(2)
Examining the lower order coefficient \(d\) above, we will need to have good control over the function \(P q\). Here it is the second order part of \(P\) that matters, as the effect of the magnetic term will turn out to be directly perturbative.
Reconciling these two requirements will play an important role later on in this section.
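The appearance of \(Pq\) in the second requirement can be illustrated by the following heuristic reconstruction of the Lagrangian correction mechanism, again with our own normalizations since the displays above are omitted. For the divergence form operator \(P_{0} = \partial _{\alpha }g^{\alpha \beta }\partial _{\beta }\) one has the product rule identity

$$ P_{0}(v^{2}) = 2 v \, P_{0} v + 2 g^{\alpha \beta } \partial _{\alpha }v \, \partial _{\beta }v, $$

so, for compactly supported \(v\), pairing with \(q\) and using that \(P_{0}\) is formally self-adjoint,

$$ \iint q\, g^{\alpha \beta }\partial _{\alpha }v\, \partial _{\beta }v \, dx dt = - \iint q\, v\, P_{0} v \, dx dt + \frac{1}{2} \iint (P_{0} q)\, v^{2} \, dx dt. $$

The first term on the right corresponds to the correction to the multiplier, while the second shows why the zero order coefficient involves \(Pq\), and thus why good control of \(Pq\) is needed.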
To complete our discussion here, we need to carry out an additional step, namely to investigate what happens if we replace \(g\), \(A\) by \({\tilde{g}}\), \({\tilde{A}}\). Observing that
it becomes natural to replace the vector field \(X\), the Lagrangian weight \(q\) and the multiplier by
Then the relation (7.23) remains essentially unchanged,
and the same applies to the differential form (7.24) of the same relation. Here the principal flux symbol can be equivalently expressed in the form
Our task is now twofold:
-
To identify a suitable time-like vector field \(X\) so that the energy flux above satisfies a balanced energy estimate, and
-
To recast the above computation in the paradifferential setting without losing the energy balance; this will also require a careful choice for \(q\).
7.2.2 The construction of the vector field \(X\)
Our objective here is to construct a forward time-like vector field \(X\) so that the flux coefficients in \(c_{X,q}\) are balanced for \(q\) as in (7.25). In essence, at this stage we disregard any paradifferential frequency localizations, and work as if \(v\) has infinite frequency. We also do not distinguish between \(g\) and \({\tilde{g}}\), as this does not play a role in the choice of \(X\). Our main result governing the choice of the vector field \(X\) is the first place where our notion of paracontrolled distributions is needed, and reads as follows:
Lemma 7.4
There exists a forward time-like vector field \(X\) that is paracontrolled by \(\partial u\), and so that we have the balanced bound
We remark that the fact that such a vector field exists is closely connected to the fact that our equation satisfies the nonlinear null condition in a strong sense. One should think of our vector field \(X\) as the next best thing to a Killing or conformal Killing vector field. Perhaps a good terminology would be a para-Killing vector field, i.e. one whose deformation tensor is balanced, rather than equal to zero or a multiple of the metric.
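In this terminology, the trichotomy can be recorded as follows (a standard formula, stated here for the inverse metric since the displays are omitted above):

$$ (\mathcal{L}_{X} g^{-1})^{\alpha \beta } = X^{\gamma }\partial _{\gamma }g^{\alpha \beta } - g^{\gamma \beta }\partial _{\gamma }X^{\alpha } - g^{\alpha \gamma }\partial _{\gamma }X^{\beta }. $$

A Killing field has \(\mathcal{L}_{X} g^{-1} = 0\), a conformal Killing field has \(\mathcal{L}_{X} g^{-1} = \lambda g^{-1}\), while for a para-Killing field in the above sense this expression is merely balanced, i.e. admits bounds as in (7.28).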
Proof
Starting from (7.12), we compute the expression in (7.28) as follows:
Here one could freely symmetrize the coefficients relative to the pair of indices \((\alpha , \gamma )\). We have chosen to forgo the symmetrization and instead make favourable choices. The above expression would cancel, for instance, if
This is an overdetermined system, so we cannot hope for an exact cancellation. Even if we symmetrize (raising the \(\beta \) index first) and equate the symmetric part of the two sides, it still remains overdetermined.
But we do not need exact cancellation; we only need the difference of the two sides to be balanced. Assume for the moment that \(X\) is at the same regularity level as \(\partial u\). Then, examining the right hand side, the expressions there are unbalanced only in the paraproduct case, where the \(\partial ^{2} u\) term is the high frequency, i.e. for the terms \(T_{h(h,\nabla u)} \partial ^{2} u\). Hence we heuristically arrive at the equivalent requirement
where we introduce the notation “\(\overset{bal}{\approx }\)” to indicate that the difference between the two expressions is balanced, i.e. can be estimated as in (7.28). Then, at leading order we may cancel the \(\beta \) derivative to obtain a single paradifferential relation at one regularity level higher, namely
Modulo balanced terms we may break the paraproducts above in two. This allows us to devise an inductive scheme to construct \(X\) as a dyadic sum of frequency localized pieces, by setting
starting with the forward time-like initialization
and where the functions \(X_{k}\), localized at frequency \(2^{k}\), are defined inductively by
It remains to show that, as defined above, the vector field \(X\) has all the properties in the Lemma. We will achieve this in three stages:
-
We show that \(X\) satisfies the same bounds as \(\partial u\), (see (5.8) and (5.15)),
$$ \| X - X_{0}\|_{\mathfrak{C}} \lesssim 1 . $$ (7.32)
Since \({\mathcal {A}^{\sharp}}\ll 1\), this in particular guarantees that \(X\) is forward time-like.
-
We show that \(X- X_{0}\) is paracontrolled by \(\partial u\).
-
Finally, we establish the balanced bound (7.28).
To simplify the notations, we will write schematically that
with coefficients \(h\) of the form \(h = F(\partial u)\), which belong to ℭ modulo constants.
I. Dyadic bounds for \(X\). These are proved at each dyadic frequency \(k\) by induction on \(k\). We do this in two steps, where we first estimate the \(\mathfrak{C}_{0}\) norm of \(X-X_{0}\). Precisely, the first set of statements to be proved by induction for \(k > 0\) is as follows:
with a fixed large universal constant \(C\). This implies that \(\|X-X_{0}\|_{\mathfrak{C}_{0}} \lesssim 1\). The induction hypothesis combined with Bernstein’s inequality yields the bound
Then we write
which yields
respectively
Thus the induction argument closes if \(C\) is a large constant and \({\mathcal {A}^{\sharp }}\ll 1\).
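Schematically, the mechanism by which such an induction closes can be illustrated as follows; the hypothesis here is a made-up stand-in for the actual dyadic bounds, chosen only to exhibit the role of the smallness condition. Suppose the increments satisfy \(\|X_{k}\| \le c_{k} (1 + \|X_{\leq k-1} - X_{0}\|)\) with \(\sum _{k} c_{k} \le C_{0} {\mathcal {A}^{\sharp }}\). Setting \(A_{k} = \|X_{\leq k} - X_{0}\|\), the triangle inequality gives

$$ 1 + A_{k} \le (1+c_{k})(1+A_{k-1}) \le \prod _{j \le k} (1+c_{j}) \le e^{C_{0} {\mathcal {A}^{\sharp }}}, $$

so \(A_{k} \lesssim {\mathcal {A}^{\sharp }} \ll 1\) uniformly in \(k\), and the iteration converges.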
The second step is to prove that
To achieve this we will prove by induction that
i.e. that \(\partial _{t} X_{\leq k}\) admits a decomposition \(\partial _{t} X_{\leq k} = f_{k1}+f_{k2}\), where
Here again \(C\) is a fixed large constant, unrelated to the earlier \(C\).
For this we write
Here the \(X\) coefficients involve only frequencies below \(2^{k}\), so we may use the induction hypothesis in the first term. For the second and third terms it suffices to use the \(\mathfrak{C}_{0}\) bound for \(X\), which we already have from the first induction. Hence, repeatedly applying the bounds in (5.24) we obtain
which closes the inductive proof of (7.35) if \(C \gg 1\) and \({\mathcal {A}^{\sharp }}\ll 1\).
II. \(X\) is paracontrolled by \(\partial u\). To prove this, we will establish the representation
This will play the role of (6.1). The Moser estimates in Lemma 5.7 show that the paracoefficients above satisfy the bounds required of \(a\) in (6.2), so it remains to establish that the errors \(r_{\alpha}\) satisfy the bounds (6.3). For this, we write
Now we apply Lemma 2.7 to estimate
as needed. It remains to bound the time derivative of \(r^{\alpha}\) in \(L^{\infty}\). For this we distribute the time derivative. If it falls on any of the para-coefficients then we can directly use the bound (5.25). Otherwise, we use Lemma 5.9.
III. The bound for \(c_{X}^{\alpha \beta} + \partial _{\gamma }X^{\gamma }g^{\alpha \beta}\). Here we recall that
To estimate this, our starting point is the relation (7.36), together with the bounds (7.37) for \(r^{\alpha}\). Denoting
we write
For the \(r^{\alpha}\) term we use (7.37), for the next term we use the earlier bound (5.32) and the terms on the last line are estimated directly using the algebra property for \(\mathfrak{C}_{0}\) and the bilinear estimate (5.25). □
7.2.3 Paradifferential energy estimates associated to \(X\)
Now we use our vector field \(X\) to prove the balanced energy estimates for \(v\). To do this, we repeat the computations leading to the key energy relations (7.23) and (7.24) at the paradifferential level.
To fix the notations, we denote by \(T_{\tilde{P}}\) the operator in (7.8),
By a slight abuse of notation, this is not exactly the same as the Weyl quantized operator with the corresponding symbol, though the difference between the two can be seen to be balanced and thus perturbative in our analysis.
For our multiplier, inspired by the energy relation (7.26), we will use the paradifferential operator
Here ideally we would like to have
However, such a choice causes some technical difficulties due to the lack of sufficient time regularity of \({\tilde{q}}\). To avoid this, we will forego the above explicit expression for \({\tilde{q}}\), and instead ask for \({\tilde{q}}\) to satisfy the following two properties:
-
it is close to the ideal setting,
$$ | {\tilde{q}}- g^{00} \partial _{\alpha }X^{\alpha }| \lesssim { \mathcal {B}}^{2} . $$ (7.39)
-
it has the form \({\tilde{q}}= \partial _{x} q_{1}\), where \(q_{1} \in {\mathfrak {P}}\).
We remark that the obvious choice \({\tilde{q}}_{0} := - g^{00} \partial _{\alpha }X^{\alpha}\) for the first criterion does not satisfy the second criterion, as it contains expressions involving \(\partial _{t}^{2} u\). However, by definition we have \({\tilde{q}}_{0} \in {\mathfrak{DP}}\); therefore, a good approximation \({\tilde{q}}\) for \({\tilde{q}}_{0}\) as above exists by Lemma 6.8. Note that for this it suffices to use the fact that \(X^{\alpha }\in {\mathfrak {P}}\) separately for each \(\alpha \), rather than the more precise representation in (7.36).
Now we implement the multiplier method to prove energy estimates in the paradifferential setting. We recall our objective, which is to establish an integral energy identity of the form
for a suitable positive definite energy functional \(E_{X}\) in ℋ,
This may also be interpreted as a differential energy identity,
Notation for errors: There are two types of error/correction terms that appear in our computations:
-
Corrections in the energy functional. Here we will denote by \(Err({\mathcal {A}^{\sharp }})\) any fixed time expressions that have size \(O({\mathcal {A}^{\sharp }}) \| v[t]\|_{\mathcal {H}^{1}}^{2}\). A typical example here is a lower order term of the form
$$ \int _{{\mathbb{R}}^{n}} \partial v \cdot T_{q} v \, dx, \qquad q \in \partial _{x} {\mathfrak {P}}, $$
where
$$ \| P_{< k} q \|_{L^{\infty}} \lesssim 2^{k} {\mathcal {A}}. $$
-
Corrections in the energy flux term. These are like the last term on the right in (7.40), respectively (7.42). For brevity we will denote the admissible errors in the two identities by \(Err({\mathcal {B}}^{2})\).
To establish (7.40), we consider the contributions of the two terms in \(T_{\tilde{\mathfrak {M}}}\).
I. The contribution of \(T_{\tilde{X}^{\alpha}} \partial _{\alpha}\). Integrating by parts and commuting, this is given by
so we obtain
For the double integral we peel off some perturbative contributions. The first term has a commutator structure, and we distinguish several cases. If \((\alpha , \beta ) = (0,0)\), then we simply write
If \((\alpha ,\beta )= (0,j)\) then we commute the derivative first,
where the contribution of the commutator term is estimated using Lemma 2.4,
If \((\alpha ,\beta )= (j,0)\) then we commute the paraproducts first,
where the contribution of the commutator is again perturbative once we integrate by parts with respect to \(x^{\gamma}\). If \(\gamma =0\) then this integration by parts contributes to the energy with the expression
which also plays a perturbative role. For the double paraproducts we use Lemma 2.7 to compound them, as in
We arrive at the relation
where we recall that \(c^{\alpha \beta}_{X}\) is given by the relation (7.27), and the energy functional \(E_{X}\) is given by
Here we may compound all double paraproducts and discard the commutator term, at the expense of \(Err({\mathcal {A}^{\sharp }})\) errors. We arrive at
with \(e^{\alpha \beta}_{X}\) given by (7.16), and which therefore belongs to \({\mathfrak {P}}\) modulo constants. Since \(X = \partial _{t} +O({\mathcal {A}^{\sharp }})\) is uniformly time-like, it follows that this matrix is positive definite, which implies the positivity property in (7.41).
II. The contribution of \(T_{q}\). Here we need to consider the integral
where we recall that \({\tilde{q}}= \partial _{x} q_{1}\) with \(q_{1} \in {\mathfrak {P}}\). The contribution of \({\tilde{A}}\) is directly perturbative, as \({\tilde{A}}\in {\mathfrak{DP}}\subset \mathfrak{DC}\); then one can use the \(\mathfrak {DC}\) decomposition \({\tilde{A}}= {\tilde{A}}_{1} + {\tilde{A}}_{2}\) as in Definition 5.5 (b), pairing each of the two associated bounds in (5.20) with the ℬ, respectively the \({\mathcal {A}^{\sharp}}\) bound in the \(\mathfrak {C}_{0}\) norm of \(q_{1}\):
Integrating by parts and using Lemmas 2.7, 2.8, we compute
Here the first term on the right is the one we want and the last term on the right yields an energy correction which is perturbative, i.e. of size \(Err({\mathcal {A}^{\sharp }})\). It remains to show that the second term, which we shall denote by \(I_{q}^{2}\), also yields only perturbative contributions. Heuristically, this should be relatively simple, in that we can integrate once more by parts, to obtain
Here we could estimate both integrals perturbatively and conclude directly if we knew that
Both of these bounds would be true if \(q\) contained no time derivatives of \(u\) in its expression. However, this is too much to hope for, so a more careful argument is needed. The first step in this argument has already been carried out earlier, where we saw that we may take \({\tilde{q}}\) of the form \({\tilde{q}}= \partial _{x} q_{1}\) with \(q_{1} \in {\mathfrak {P}}\). This removes one of the two potential time derivatives in \(q\), but not the second. We can use this property to write
where the uniform bound
shows that we can treat the second term perturbatively, to get
At this point, we can use the fact that \(q_{1} \in {\mathfrak {P}}\) implies that \(q_{1}\) solves an approximate paradifferential wave equation. The precise statement we use is the one in Lemma 6.4, which yields the representation
with
We use this representation to refine the outcome of the naive integration by parts above,
By the pointwise bound on \(f^{\alpha}\) in (7.45), the first term is perturbative, i.e. \(Err({\mathcal {B}}^{2})\). By the second bound in (7.45), the second term can be seen as a perturbative \(Err({\mathcal {A}^{\sharp }})\) energy correction.
We conclude that for \(I_{q}\) we have
III. Conclusion. To finish the proof of (7.40), and thus of Theorem 7.1 for \(s=0\), we combine the relations (7.43) and (7.46) to obtain
where \(E_{X}\) is redefined as the sum of the two contributions in (7.43) and (7.46), which still has the leading order term as in (7.44) plus an \(Err({\mathcal {A}^{\sharp }})\) correction.
It remains to examine the paracoefficient in the integral on the right, and show that it has size \(O({\mathcal {B}}^{2})\). At this point, we simply invoke the choice of our para-Killing vector field \(X\) in Lemma 7.4 for the first term (which we have not used so far), and the choice of \({\tilde{q}}\) in (7.39) for the second term, thereby completing the proof of (7.40).
7.3 The \(H^{s+1} \times H^{s}\) bound for the linear paradifferential flow
Here we prove Theorem 7.1 in the general case, where \(s \neq 0\). The argument will be a more complex variation of the argument in the case \(s=0\), where paraproduct based multipliers have to be replaced by paradifferential multipliers.
7.3.1 The conjugated equation
For simplicity of notation we will consider the linear paradifferential equation in \(\mathcal {H}^{s+1}\) with \(s \neq 0\). We begin by setting \(w = \langle D_{x} \rangle ^{s} v\), which solves a perturbed linear paradifferential equation of the form
where the conjugation error \({\tilde{B}}\) in the new source term is given by
Then we need to construct an \(H^{1} \times L^{2}\) balanced energy for the solution \(w\) to (7.48).
We note that \(\tilde{B}\) is a paradifferential operator, whose principal symbol \({\tilde{b}}_{0}\) is homogeneous of order one and a first degree polynomial in the time variable \(\xi _{0}\), and is given by
Using the expression (3.9) for the derivatives of the metric \(g\), this can be further written in the form
where
Here the unbalanced part of the coefficients corresponds to the case when the factor \(\partial ^{2} u\) is higher frequency compared to the \(\partial ^{\beta }u\) and \({\tilde{g}}^{\alpha \nu}\) factors. The important feature is that, at the operator level, \(T_{{\tilde{b}}_{0}^{\gamma}} \partial _{\gamma }w\) presents a null form structure of the type \(Q_{0}(\partial u, w)\), with additional, more regular paradifferential coefficients in \({\mathfrak {P}}\).
We switch the leading term \(2sT_{{\tilde{b}}_{0}^{\gamma}} \partial _{\gamma}\) to the left hand side of the equation; there it will play a role similar to the gradient term \({\tilde{A}}^{\gamma }\partial _{\gamma}\). The remainder \({\tilde{B}}- 2sT_{{\tilde{b}}_{0}^{\gamma}} \partial _{\gamma}\) will play a secondary role; one should think of it as renormalizable, though we will achieve this at the level of the energy, via an energy correction, rather than through an actual normal form transformation. Our equation (7.48) becomes
where the leading operator is denoted by
As in the previous case of the \(H^{1} \times L^{2}\) bounds, our strategy will be to construct a suitable vector field, or multiplier, denoted \(\tilde{X}_{s}\), which depends only on the principal symbol \(\tilde{b}_{0}\) above, and which formally generates a balanced energy estimate at the leading order. Then, reinterpreting all the analysis at the paradifferential level, we will rigorously prove that the generated energy satisfies favourable, balanced bounds.
7.3.2 The multiplier \(\tilde{X}_{s}\)
In the previous section, the multiplier \(\tilde{X}\in {\mathfrak {P}}\) was a well-chosen vector field which belongs to our space \({\mathfrak {P}}\) of paracontrolled distributions. Here, this can no longer work due to the presence of the operator \(T_{{\tilde{b}}_{0}}\), which is a pseudodifferential rather than a differential operator. For this reason we will instead use a pseudodifferential “vector field” \(i \tilde{X}_{s}\), where \(\tilde{X}_{s}\) has a real, odd symbol of the form
which will be homogeneous away from frequency zero. We carefully note that we want the symbol \(\tilde{X}_{s}\) to be a first order polynomial in \(\xi _{0}\); this is important so that we can still do integration by parts in time and have a well defined fixed time energy. The symbol \(\tilde{X}_{s}\) may be interpreted as a pseudodifferential operator using the Weyl paradifferential quantization,
However, as in the \(s=0\) case, we will allow a more general choice for the zero order component, and work instead with the modified multiplier
where the real, even zero order symbol \(\tilde{Y}_{s0} \in \partial _{x}{\mathfrak {P}}S^{0}\) will be carefully chosen later on in order to provide an appropriate Lagrangian correction in our energy estimates.
Repeating the heuristic computation in the previous subsection, in the absence of time boundaries we have an identity of the form
where \(c_{\tilde{X}_{s},B}(v,v)\) is a bilinear form whose principal symbol \(c_{\tilde{X}_{s},B}\) is of order two,
The objective would now be to choose the symbols \(\tilde{X}_{sj} \in {\mathfrak {P}}S^{j}\) so that we cancel the unbalanced part of the symbol \(c_{\tilde{X}_{s},B}\). However, it is immediately clear that this may be a bit too much to ask, as it conflicts with the requirement that \(\tilde{X}_{s}\) is a first degree polynomial in \(\xi _{0}\). Hence, as a substitute, we will seek to achieve this cancellation on the characteristic set \(p(x,\xi ) = 0\). Then, instead of asking for
we will settle for the slightly weaker property
where \(\tilde{Y}_{s0} \in {\mathfrak{DP}}S^{0}\) is a purely spatial zero homogeneous symbol, with the spatial dependence at the level of \(\partial ^{2} u\). This term will be harmless, as we will also be able to remove it in our energy estimates with a Lagrangian correction, by making a good choice for \(\tilde{Y}_{s0}\).
An additional requirement on our paradifferential “vector field” \(\tilde{X}_{s}\) will be that, in the energy estimate generated by \(\tilde{X}_{s}\), the associated energy functional \(E_{\tilde{X}_{s}}\) should be positive definite at the level of its principal part. Earlier, in the case when \(X\) was a vector field, this requirement was identified, via the energy momentum tensor, with the property that \(X\) is forward time-like. Here we will generalize this notion to symbols:
Definition 7.5
We say that the (real) symbol \(X= \xi _{0} X_{0}+ X_{1} \in C^{0} S^{1}\) is forward time-like if the following two properties hold:
a) \(X_{0}(x,\xi ') > 0\).
b) \(X(x,\xi _{0}^{1},\xi ') X(x,\xi _{0}^{2}, \xi ') < 0\), where \(\xi _{0}^{1}(x,\xi ') < \xi _{0}^{2}(x,\xi ')\) are the two real zeros of \(p(x,\xi )\) as a polynomial of \(\xi _{0}\).
We remark that, using \(X\) as a multiplier, relative to the metric \(g\), we will obtain an energy functional which at leading order can be described via the symbol
which should be compared with the expression (7.16) defined earlier in terms of the energy momentum tensor in the case when \(X\) is a vector field. Correspondingly, we define the energy functional
The main property of forward time-like symbols is as follows:
Lemma 7.6
The symbol \(e_{X}\) is positive definite iff \(X\) is forward time-like.
Proof
Assuming \(X_{0}\) is nonzero, we represent \(X\) in the form
where \(a_{1}+a_{2} = 1\). Then \(e_{X}\) has the form
Here \(g^{00} = -1\) and at least one of \(a_{1}\) and \(a_{2}\) is positive. Then \(e_{X}\) is positive definite iff \(X_{0} > 0\) and \(a_{1}, a_{2} > 0\). This is easily seen to be equivalent with the forward time-like condition in the above definition. □
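For the reader's convenience, the evaluation at the roots (whose displays are omitted above) can be reconstructed as follows, under the factorization \(p = g^{00}(\xi _{0} - \xi _{0}^{1})(\xi _{0} - \xi _{0}^{2})\): writing

$$ X = X_{0} \left[ a_{1} (\xi _{0} - \xi _{0}^{1}) + a_{2}(\xi _{0} - \xi _{0}^{2}) \right], \qquad a_{1} + a_{2} = 1, $$

we get

$$ X(\xi _{0}^{1})\, X(\xi _{0}^{2}) = - X_{0}^{2}\, a_{1} a_{2}\, (\xi _{0}^{2} - \xi _{0}^{1})^{2}, $$

which is negative iff \(a_{1} a_{2} > 0\); combined with \(a_{1} + a_{2} = 1\), this holds iff \(a_{1}, a_{2} > 0\), as in the proof above.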
7.3.3 The construction of \(\tilde{X}_{s}\)
Here we return to the matter of choosing \(\tilde{X}_{s}\), whose properties almost exactly mirror those of the vector field \(\tilde{X}\) in the previous subsection:
Proposition 7.7
There exists a real, odd homogeneous symbol of order one \(\tilde{X}_{s} \in \xi _{0}+ {\mathfrak {P}}S^{1}\), which is a first degree polynomial in \(\xi _{0}\), so that:
i) \(\tilde{X}_{s}\) is forward time-like.
ii) The principal symbol \(c_{\tilde{X}_{s},B}\) of the \(\tilde{X}_{s}\) energy flux admits a representation of the form
where \({\tilde{q}}_{2}\) is balanced,
and \({\tilde{q}}_{0}\) has \({\mathfrak{DP}}\) type regularity,
iii) The symbol \(\tilde{X}_{s}\) admits the \({\mathfrak {P}}S^{1}\) representation
where the para-coefficients \(a^{\gamma}(x,\xi ) = a^{\gamma}_{1}(x,\xi ') + a^{\gamma}_{0}(x,\xi ')\) with \(a^{\gamma}_{j} \in {\mathfrak {P}}S^{j}\) have the form
with \(q_{0}^{\gamma }\in {\mathfrak {P}}S^{0}\), independent of \(\xi _{0}\).
From the perspective of energy estimates, it might seem that parts (i) and (ii) are the important ones. However, part (ii) will be seen as an immediate consequence of the representation in part (iii), which thus can be thought of as the more fundamental property. Also, in the proof of the energy estimates it will on occasion be more convenient to directly use (7.64). In the sequel we will refer to \(a^{\gamma}\) as the para-coefficients of \(\tilde{X}_{s}\). We note that the choice of \(q_{0}^{\gamma}\) is uniquely determined by the requirement that \(a^{\gamma}\) are first degree polynomials in \(\xi _{0}\).
Proof
It will be somewhat easier to construct the corresponding symbol \(X_{s}\) as associated to \(p\), rather than to \(\tilde{p}\); this avoids the slight symmetry breaking in the transition from \(p\) to \(\tilde{p}\). Precisely, we will choose \(\tilde{X}_{s}\) of the form
and then express \(c_{\tilde{X}_{s},B}\) in terms of \(X_{s}\) as follows:
where \(q_{00}\) is given by
and \(b_{0}^{\gamma}\) has the form
Here we have separated the two terms in \({\tilde{b}}_{0}^{\gamma}\); the first has contributed to \(b_{0}^{\gamma}\), while the second has contributed the last term in the Lagrangian coefficient \(q_{00}\).
For clarity, we note that the exact expression of \(q_{00}\) is not important; we will only use the fact that \(q_{00} \in {\mathfrak{DP}}S^{0}\). On the other hand, for \(b_{0}^{\gamma}\) we will need the fact that it has a null structure.
Now we restate the proposition in terms of the new symbol \(X_{s}\). Our goal will be to find \(X_{s}\) in the same class as \(\tilde{X}_{s}\), so that the reduced symbol
can be represented in the form
Here there is a small twist in the argument. While \(c_{X_{s},B}\) is a second degree polynomial in \(\xi _{0}\), this is no longer the case for \(c^{red}_{X_{s},B}\), which contains the term \(\xi _{0}^{3} \{ g^{00},X_{s0}\}\). For this reason, in (7.67) we a priori have to allow for symbols \(q_{2}\), respectively \(q_{0}\), which are third, respectively first, degree polynomials in \(\xi _{0}\). However, we can eliminate the \(\xi _{0}^{3}\) term in \(q_{2}\) with a \(\xi _{0}\) correction in \(q_{0}\). Then, returning to \(c_{X_{s},B}\), we obtain the representation (7.60) with \({\tilde{q}}_{2}\) of second degree and \({\tilde{q}}_{0}\) of first degree. But \(c_{X_{s},B}\) is a second degree polynomial in \(\xi _{0}\), so we finally conclude that \({\tilde{q}}_{0}\) must be independent of \(\xi _{0}\).
We now proceed to construct the symbol \(X_{s}\). As a first step in the proof, we seek to obtain a variant \(X^{0}_{s}\) of the symbol \(X_{s}\) where we drop the requirement that \(X^{0}_{s}\) is a first order polynomial in \(\xi _{0}\) but we ask for the stronger property that the associated symbol \(c^{red}_{X^{0}_{s},B}\) is fully balanced, which corresponds to \(q_{0}=0\). Then, at the end, we choose \(X_{s}\) to be the first degree polynomial in \(\xi _{0}\) that matches \(X^{0}_{s}\) at the two roots of \(p(x,\xi ) = 0\) viewed as a polynomial in \(\xi _{0}\).
The relation we seek for \(X^{0}_{s}\) to satisfy on the characteristic set of \(p\) is
where, using the expressions (3.13) and (7.65) for \(A^{\gamma}\) and \(b_{0}^{\gamma}\),
Here we recall the expression for the derivatives of \(g\), see (3.9),
Substituting this in the previous expression for \(c^{red}_{X^{0}_{s},B}\), we need the following relation to hold modulo balanced terms:
We can rewrite this using the following operator
in the form
By Lemma 7.4, we already have a solution \(X\) for \(s=0\). Thinking of this multiplicatively, it is then natural to look for \(X^{0}_{s}\) of the form
where \(Z_{s}\) should be zero homogeneous in \(\xi \) and must satisfy
We will also assume that \(Z_{s}\) is a positive symbol; this will help later with the time-like condition. Then we can rewrite the above relation as a condition for \(\log Z_{s}\), namely
Here the inhomogeneous term is linear in \(s\), so we will also look for a solution \(\log Z_{s}\) which is linear in \(s\).
There is one last algebraic simplification, which is to replace \(Z_{s}\) by \(\tilde{Z}_{s} = Z_{s} |\xi '|^{2s}\), which is \(2s\)-homogeneous, even, and inherits the property that \(\log \tilde{Z}_{s}\) is linear in \(s\). Then \(\log \tilde{Z}_{s}\) must solve
Dispensing with the log, we replace this by
Now we interpret the last relation paradifferentially, formally cancelling the \(L\)’s. This suggests the following scheme to construct the dyadic parts of \(\tilde{Z}_{s}\) inductively by setting
A priori these dyadic parts have a nontrivial dependence on \(\xi \), which would have to be tracked when considering the convergence in the \(k\) summation. However, since \(\log \tilde{Z}_{s}\) is linear in \(s\), it suffices to solve this for some nonzero \(s\). The advantage here is that, if \(s\) is a positive integer (say \(s=1\)), then all our iterates are polynomials of degree \(2s\) in \(\xi \). Hence the convergence issue disappears, due to our smallness condition for \(u\), \({\mathcal {A}^{\sharp }}\ll 1\); this is exactly as in the construction of \(X\) in Section 7.2.2. This defines \(\tilde{Z}_{1}\) as a positive definite polynomial in \(\xi \) of degree 2, so that \(\tilde{Z}_{1} = \xi ^{2} (1+O({\mathcal {A}^{\sharp }}))\). Further, by the same argument as in the proof of Lemma 7.4, it follows that the coefficients of \({\tilde{Z}}_{1} -Z_{10} \) are paracontrolled by \(\partial u\); in other words, \({\tilde{Z}}_{1} - \xi ^{2} \in {\mathfrak {P}}S^{2}\). In addition, by (7.75), it also follows that (a choice for) the para-coefficients of \(\tilde{Z}_{1}\), as in Definition 6.1, is given by
Remark 7.8
We remark on the symbol \(\tilde{Z}_{1}\), which is quadratic in \(\xi \) and para-commutes with \(p\), in the sense that their Lie bracket is balanced and thus bounded by \({\mathcal {B}}^{2}\). This symbol plays a role that is similar to that of the first order symbol \(X\) constructed earlier.
Now that we have \(\tilde{Z}_{1}\), for all real \(s\) we may define
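Since \(\log \tilde{Z}_{s}\) is linear in \(s\), the omitted definition is presumably the power

$$ \tilde{Z}_{s} := \big( \tilde{Z}_{1} \big)^{s}, $$

which is well defined since \(\tilde{Z}_{1}\) is positive, is \(2s\)-homogeneous, and agrees with the earlier construction at \(s=1\).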
By Lemma 7.4 in the previous subsection we have \(X^{\alpha }\in {\mathfrak {P}}\). Combining this with the similar property of \(\tilde{Z}_{1}\), by the algebra and Moser properties of the space \({\mathfrak {P}}\) of paracontrolled distributions it follows that
. Finally, combining the representations of \(X\) and of \(\tilde{Z}_{1}\) as paracontrolled distributions, as in (7.36) and (7.76), we obtain the corresponding \({\mathfrak {P}}\) representation for \(X^{0}_{s}\) as in Definition 6.1 (see the relation (7.70))
This in turn yields the desired conclusion that \(c^{red}_{X^{0}_{s},B}\) is balanced,
Indeed, the equivalent form (7.70) can be obtained by directly applying the operator \(L\) in the relation (7.77); this is because the terms where the paracoefficients get differentiated are balanced, so we are left with the terms where \(L\) is applied to the main factors \(\partial u\).
Now we carry out the last step of the proof, and define the symbol \(X_{s}\) as the unique first degree polynomial in \(\xi _{0}\) with the property that
We now show that this choice for \(X_{s}\) has the desired properties.
Recall that \(\xi _{0}^{1}(x,\xi ') < \xi _{0}^{2}(x,\xi ')\) are the two real zeros of \(p(x,\xi )\) as a polynomial of \(\xi _{0}\), which are 1-homogeneous and smooth in \(\xi '\) and are also smooth functions of \(\partial u\). Thus,
The coefficients \(X_{s0}\) and \(X_{s1}\) in \(X_{s}\) are obtained by solving a linear system,
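With the convention \(X_{s} = \xi _{0} X_{s0} + X_{s1}\) of Definition 7.5 (the index conventions in the omitted display may differ), the matching conditions (7.79) read \(\xi _{0}^{j} X_{s0} + X_{s1} = X^{0}_{s}(x, \xi _{0}^{j}, \xi ')\) for \(j = 1,2\), with solution

$$ X_{s0} = \frac{X^{0}_{s}(x,\xi _{0}^{2},\xi ') - X^{0}_{s}(x,\xi _{0}^{1},\xi ')}{\xi _{0}^{2} - \xi _{0}^{1}}, \qquad X_{s1} = \frac{\xi _{0}^{2}\, X^{0}_{s}(x,\xi _{0}^{1},\xi ') - \xi _{0}^{1}\, X^{0}_{s}(x,\xi _{0}^{2},\xi ')}{\xi _{0}^{2} - \xi _{0}^{1}}. $$

The separation of the two roots makes this division smooth, consistent with the Malgrange preparation remark below.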
By the algebra and Moser properties of the space \({\mathfrak {P}}\) of paracontrolled distributions, it immediately follows that we have the symbol regularity properties \(X_{s0} \in {\mathfrak {P}}S^{0}\) and \(X_{s1} -1 \in {\mathfrak {P}}S^{1}\). By construction we also have a smooth division,
where we easily see that the quotient \(d\) has regularity \(d \in {\mathfrak {P}}S^{-1}\) by computing directly
One may also interpret this as a form of the Malgrange preparation theorem in an easier case where the roots are separated.
We can now use (7.79) to relate \(c^{red}_{X_{s},B}\) with \(c^{red}_{X^{0}_{s},B}\):
which is exactly the desired representation (7.67).
We can also use the relation (7.77) for part (iii) of the proposition. For this we first transition from \(X^{0}_{s}\) to \(X_{s}\). Using (7.79) and peeling off balanced terms, this gives the \({\mathfrak {P}}\) representation
In view of the paradifferential expansion (5.28) for \(g^{\alpha \beta}\), in the last bracket there is a leading order cancellation,
This implies that \(X_{s}\) admits a \({\mathfrak {P}}S^{1}\) representation of the form
where \(z \in {\mathfrak {P}}S^{-1}\). At this stage we only know that \(z\) and \(r_{s}\) are smooth as functions of \(\xi _{0}\). On the other hand, the remaining terms are at most second degree polynomials in \(\xi _{0}\). We claim that, without any loss of generality, we may take \(z\) independent of \(\xi _{0}\), and then \(r_{s}\) has to be at most a second degree polynomial in \(\xi _{0}\).
Subtracting a multiple of \(p\) from all the paracoefficients above and discarding balanced contributions, we may reduce to the case of a first degree polynomial, i.e. to a relation of the form
where \(z \in {\mathfrak {P}}S^{-1}\) and \(Z_{j} \in {\mathfrak {P}}S^{j}\), while \(\partial r_{s} = O({\mathcal {B}}^{2})\), with full symbol regularity in \(\xi \). We will show that in this case we must have \(\partial Z_{j}= O({\mathcal {B}}^{2})\), again with full symbol regularity. This would imply that we may include \(T_{p} z\) into \(r_{s}\), and thus take \(z=0\) in the last relation.
We begin by differentiating this relation in \(x\) and \(t\), noting that \(T_{\partial p} z\) may be placed in \(\partial r_{s}\):
We may also perturbatively replace \(T_{p}\) with \(p\), arriving at
where \(r^{1}\) has size \({\mathcal {B}}^{2}\) and symbol regularity,
For fixed \(x\), we examine this relation on the characteristic cone \(C = \{p(x,\xi )= 0\}\). There we have
so we may directly conclude that \(\partial Z_{1}(x,\xi '), \partial Z_{0}(x,\xi ') = O({\mathcal {B}}^{2})\). Next we need a similar bound for their derivatives \(\partial _{\xi '}^{\alpha }\partial Z_{1}(x,\xi ')\), \(\partial _{\xi '}^{ \alpha }\partial Z_{0}(x,\xi ')\) with respect to \(\xi '\). We fix \(x\) and argue by induction in \(|\alpha |\). Then it suffices to use derivatives which are tangent to the cone at that \(x\), which on one hand kill \(p\) but on the other hand give a full range of \(\xi '\) derivatives for \(Z_{j}\). Hence, we may indeed assume that \(z\) is independent of \(\xi _{0}\) and \(r_{s}\) is a second degree polynomial in \(\xi _{0}\).
Lastly we switch from \(X_{s}\) to \(\tilde{X}_{s}\). Again peeling off \(r_{s}\) type contributions, we have
It remains to expand the fourth term, using Lemma 8.2:
This finally yields the representation (7.63) with the paracoefficients in (7.64), thereby concluding the proof of part (iii) of Proposition 7.7.
The final property of \(\tilde{X}_{s}\) to be verified is that \(\tilde{X}_{s}\) is time-like. This property is easily seen to depend only on the sign of the symbol \(\tilde{X}_{s}\) on the characteristic set \(\{p=0\}\). But by construction, \(\tilde{X}_{s}\) has the same sign as \(X_{s}\) there, which in turn has the same sign as the vector field \(X\) in Section 7.2.2. Then the time-like property for \(\tilde{X}_{s}\) follows from the similar property of \(X\). □
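The positivity mechanism behind the time-like condition is the classical one from the vector field method, recalled here schematically for a smooth Lorentzian metric (standard formulas, not the precise paradifferential energy used in this section):

```latex
% Stress-energy tensor of the wave equation and the associated multiplier energy
T_{\alpha\beta}[w] \,=\, \partial_\alpha w\,\partial_\beta w
   \;-\; \tfrac12\, g_{\alpha\beta}\, g^{\gamma\delta}\,\partial_\gamma w\,\partial_\delta w,
\qquad
E_X[w](t) \,=\, \int_{\mathbb{R}^n} T_{\alpha\beta}[w]\, X^\alpha N^\beta\, dx,
```

where \(N\) denotes the future unit normal to the slice \(\{t = const\}\). The pointwise bound \(T_{\alpha \beta}[w] X^{\alpha }N^{\beta} \gtrsim |\partial w|^{2}\) holds precisely when \(X\) is forward time-like, and this condition depends only on the sign of the associated symbol on the characteristic set, which is the criterion invoked above.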
While it is more streamlined to state Proposition 7.7 and its proof directly in terms of the symbol \(c_{\tilde{X}_{s},B}\), in order to prove energy estimates it is more efficient to peel off balanced components of \(c_{\tilde{X}_{s},B}\), so that we are left with less debris to contend with.
To start with, let us assume that \(\tilde{X}_{s} \in {\mathfrak {P}}S^{1}\) admits the representation (7.63) with \(a^{\gamma }\in {\mathfrak {P}}\) but without requiring that \(a^{\gamma}\) satisfy the relation (7.64). For such \(\tilde{X}_{s}\), we peel off balanced components of \(c_{\tilde{X}_{s},B}\) following the two steps in Lemma 6.9. These steps are briefly reviewed in the sequel.
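For context, the peeling below is organized around the standard Littlewood–Paley trichotomy; the dyadic cutoffs shown use a common convention and are only a reminder, not necessarily the normalization fixed earlier in the paper:

```latex
% Bony decomposition of a product into low-high, high-low and balanced parts
f g \,=\, T_f g \,+\, T_g f \,+\, \Pi(f,g),
\qquad
T_f g \,=\, \sum_k S_{k-4} f \,\Delta_k g,
\qquad
\Pi(f,g) \,=\, \sum_{|j-k|\le 3} \Delta_j f\, \Delta_k g .
```

In the language used here, the paraproduct \(T_{f} \partial g\) retains the unbalanced interactions where the second factor has the higher frequency, while the remaining high-high portion \(\Pi \) is exactly the part that is discarded as balanced.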
In a first stage, we note that all expressions in \(c_{\tilde{X}_{s},B}\) can be seen as linear combinations of the form \({\mathfrak {P}}\cdot \partial {\mathfrak {P}}\), where the output is balanced unless the second factor has higher frequency, see Lemma 5.7(a). This allows us to replace such products by paraproducts of the form \(T_{{\mathfrak {P}}} \partial {\mathfrak {P}}\). Further, using the definition of paracontrolled distributions for the second factor, we can discard the error term as balanced and arrive at more precise paraproducts of the form \(T_{{\mathfrak {P}}} \partial ^{2} u\), namely
where the coefficients \(a^{\alpha \beta}\) and \(\tilde{a}^{\alpha \beta}\) are explicitly computable as algebraic expressions in terms of \(u\), \(\tilde{X}_{s}\) and the paracoefficients \(a^{\gamma}\) of \(\tilde{X}_{s}\). Precisely,
so we obtain, in unsymmetrized form, the relation (7.82) with
Finally, the last difficulty we face is that we do not have good enough estimates for \(\partial _{t}^{2} u\). This is rectified by using instead the corrected expression \(\hat{\partial}_{t}^{2} u \) introduced in (5.13). This yields a corresponding correction of \(c\), namely
With these notations, we can now state a more refined version of Proposition 7.7:
Proposition 7.9
Let \(\tilde{X}_{s}\) be the symbol constructed in Proposition 7.7. Then the conclusion of Proposition 7.7 holds equally for \({\mathring{c}}_{\tilde{X}_{s},B}\), with the corresponding expressions \({\mathring{q}}_{2}\) and \({\mathring{q}}_{0}\) satisfying a stronger version of (7.61),
and with \({\mathring{q}}_{0}\) having \(\partial _{x} {\mathfrak {P}}\) type regularity,
Proof
A direct computation using (7.83) and (7.64) shows that the coefficients \(a^{\alpha \beta}\) have the form
and thus
We need to express this in the form \(T_{\tilde{p}} \partial _{x} {\mathcal {B}}\) plus a balanced component. For this we consider two cases:
a) If \((\alpha ,\beta ) \neq (0,0)\) then the above component of \({\mathring{c}}\) has the form
Here the first term on the right is as needed, so we set
The first term is balanced by Lemma 5.7(a) and the second is balanced by Lemma 5.9. We still need to estimate \(\partial _{0} {\mathring{q}}_{2}\) as in (7.85), which is immediate using Lemma 5.4, Lemma 5.7(a) and Lemma 5.9.
b) If \((\alpha ,\beta ) = (0,0)\) then the above component has the form
The first term on the right is treated exactly as in case (a), by pulling out one spatial derivative, while the second is directly placed in \({\mathring{q}}_{2}\) using again Lemma 5.4 and (a minor variation of) Lemma 5.7(a). □
7.3.4 Paradifferential energy estimates associated to \(\tilde{X}_{s}\)
We now use the symbol \(\tilde{X}_{s}\) given by Proposition 7.7 in order to construct an \(H^{1} \times L^{2}\) balanced energy functional for the conjugated problem (7.48). This in turn gives an \(H^{s+1} \times H^{s}\) balanced energy functional for the original linear paradifferential flow (3.25), thus completing the proof of Theorem 7.1.
Broadly speaking, we will be following the analysis in the \(s=0\) case, but with more care since we are replacing the vector field \(X\) with the pseudodifferential multiplier \(\tilde{X}_{s}\). In particular, here, instead of paraproducts we will have to commute paraproducts with paradifferential operators. The difficulty is that we will no longer be able to estimate the commutator contributions in a direct, perturbative fashion; instead, we will need to take into account unbalanced subprincipal commutator terms, and devise an additional zero order correction to \(\tilde{X}_{s}\) in order to deal with them.
We begin by considering the conjugation operator \(B\), for which we provide a favourable decomposition:
Lemma 7.10
The operator \(B\) given by (7.49) admits a decomposition
where the three components are as follows:
(i) \(B_{0} = T_{b_{0}^{\gamma}} \partial _{\gamma}\) is the leading part, with symbol
(ii) \(B_{1}\) is unbalanced but with a favourable null structure,
with \(h\) depending smoothly on \(\partial u\).
(iii) \(B_{2}\) is balanced,
This result is a direct consequence of Proposition 6.15; we have stated it here separately only for quick reference in this section.
At this point, we can repeat the multiplier computation in the previous section, using as multiplier the operator \(T_{\tilde{\mathfrak {M}}_{s}}\) defined in (7.55). Here \(\tilde{X}_{s}\) will be the symbol constructed in the previous subsection, so it only remains to choose \(\tilde{Y}_{s0}\), which we take to be
with \({\mathring{q}}_{0}\) as in Proposition 7.9.
Using the \(T_{\tilde{\mathfrak {M}}_{s}}\) operator as a multiplier, we seek to derive an associated energy identity. Here, at leading order, we would like the energy functional \(E_{\tilde{X}_{s}}\) to be as in (7.59), described by the symbol \(e_{\tilde{X}_{s}}\) defined as in (7.58). On the other hand, the energy flux is to be described at leading order by the symbol \({\mathring{c}}_{\tilde{X}_{s},B}\) in (7.84), to which we add the contribution of \(\tilde{Y}_{s0}\).
To have a modular argument, at first we simply assume that
- \(\tilde{X}_{s} \in {\mathfrak {P}}S^{1}\), admitting the representation (7.63) with \(a^{\gamma }\in {\mathfrak {P}}S^{1}\), but without assuming that \(a^{\gamma}\) are given by (7.64);
- \(\tilde{Y}_{s0} \in \partial _{x} {\mathfrak {P}}\), but without assuming that \(\tilde{Y}_{s0}\) is as in Proposition 7.9.
Given such \(\tilde{X}_{s}\) and \(\tilde{Y}_{s0}\), we will describe the leading part of the energy flux using the symbol
This is a second degree polynomial in \(\xi _{0}\), which we expand as
To this expansion we associate the bilinear form
which, integrated also over time, would yield exactly the quadratic form generated by the symbol \({\tilde{c}}_{s}\) in Weyl calculus.
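Here and below, "switching to the Weyl calculus" refers to the standard symmetric quantization, recalled for reference (standard formula, not specific to this paper):

```latex
% Weyl quantization: the symbol is evaluated at the midpoint (x+y)/2
a^{w}(x,D)\,u(x) \,=\, (2\pi)^{-n}\int\!\!\int e^{i(x-y)\cdot\xi}\,
   a\Bigl(\tfrac{x+y}{2},\xi\Bigr)\, u(y)\, dy\, d\xi .
```

Its key feature for energy identities is that real symbols quantize to formally self-adjoint operators, so the bilinear form generated by a real symbol such as \({\tilde{c}}_{s}\) is automatically symmetric; the "switching to Weyl calculus" steps in the proof below account for the lower order discrepancy between this quantization and the left quantization used for paraproducts.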
Now we can state our main multiplier energy identity, which is as follows:
Proposition 7.11
Let \(\tilde{X}_{s} \in {\mathfrak {P}}S^{1}\) and \(\tilde{Y}_{s0} \in \partial _{x} {\mathfrak {P}}\) be as above, and the multiplier \(T_{\tilde{\mathfrak {M}}_{s}}\) be as in (7.55). Then there exists an energy function \(E_{\tilde{X}_{s},B}\) with the following properties:
i) Leading order expression:
ii) Energy identity:
We recall again that here we assume neither that \(\tilde{X}_{s}\) is the “vector field” constructed in the previous subsection nor that \(\tilde{X}_{s}\) is forward time-like. Instead we will add these two assumptions later on when we apply the Proposition, in order to guarantee that \(\tilde{C}_{s}(w,w)\) is controlled by \(Err({\mathcal {B}}^{2})\), respectively that \(E_{\tilde{X}_{s}}\) is positive definite.
Proof
As stated, the result in the Proposition is linear with respect to both \(\tilde{X}_{s}\) and \(\tilde{Y}_{s0}\), and also separately in \({\tilde{A}}\) and \({\tilde{b}}_{0}\). This allows us to divide the proof into several cases, which turn out to be easier to manage separately.
I. The contribution of \(\tilde{X}_{s1}\) with \({\tilde{A}}=0\) and \({\tilde{b}}_{0}=0\). Our starting point here is the integral
The operator \(T_{i \tilde{X}_{s1}}\) is purely spatial and antisymmetric, so we can integrate by parts three times in \([0,T] \times {\mathbb{R}}^{n}\) to rewrite \(I_{X}^{1}\) in the form
Here the expression on the second line should be thought of as the energy and the expression on the first line represents the energy flux. We remark that if there were no boundaries at times \(t=0,T\) then this would be akin to computing the commutator of \(T_{\tilde{P}}\) and \(T_{i\tilde{X}_{s1}}\).
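Schematically, this is the classical multiplier identity; as a sketch under simplifying assumptions, take the constant coefficient operator \(\Box = \partial _{t}^{2} - \Delta \) and a time-independent antisymmetric spatial operator \(S\):

```latex
% Model multiplier identity: energy = boundary term, flux = commutator
\frac{d}{dt}\,\big\langle \partial_t w,\, S w \big\rangle_{L^2_x}
 \,=\, \big\langle \Box w,\, S w \big\rangle
 \,+\, \tfrac12\, \big\langle w,\, [\Delta,\, S]\, w \big\rangle ,
```

since \(\langle \partial _{t} w, S \partial _{t} w\rangle = 0\) by antisymmetry, while \(2\langle \Delta w, S w\rangle = \langle w, [\Delta , S] w\rangle \) because \(\Delta \) is self-adjoint. Integrating over \([0,T]\) reproduces the pattern above: the boundary terms at \(t = 0, T\) are the energy, and the commutator contributes the flux.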
The above expression needs some further processing to put it in the desired form. We begin with the energy component, where we need to compound the paraproducts and separate the cases \(\alpha =0\) and \(\alpha \neq 0\). This is done using Lemma 6.11,
as needed.
We now successively consider the space-time integrals on the first line in \(I_{X}^{1}\). In the first integral, the components where the \({\tilde{g}}^{\alpha \beta}\) frequency is at least comparable to the \(\tilde{X}_{s1}\) frequency are balanced, and we can use Lemma 6.12 to compose the paraproducts as
where the integral on the right can be freely switched to the Weyl calculus if \(\alpha \neq 0\), and represents one of the desired components of our energy flux.
For the second space-time integral in \(I_{X}^{1}\) we use the commutator expansion in Proposition 6.15 to get a principal part, an unbalanced subprincipal part and a balanced term,
where the unbalanced subprincipal part \(I_{X,sub}^{1}\) has the form
We postpone the analysis of \(I_{X,sub}^{1}\) for later, and focus now on the principal part, which has symbol
As in Lemma 6.7, we may perturbatively (with \(O({\mathcal {B}}^{2} L^{\infty }S^{0})\) errors) replace this by
This is almost in the desired form, except that we need to switch it to Weyl calculus. We observe that we have no contribution if both \(\alpha \) and \(\beta \) are zero. We separate the remaining cases, where switching to the Weyl calculus yields errors as follows,
The last integral is an acceptable energy correction. For the first integral to be an acceptable energy flux error, it suffices to show that
It is easily seen that this is indeed the case if any of the derivatives apply to \(\tilde{X}_{s1}\), by using the time derivative component of the ℭ bound for either \(\tilde{X}_{s1}\) or \({\tilde{g}}^{\alpha \beta}\) to bound time derivatives (of which we can have at most one). So we are left with showing that
But for this we use Lemma 6.10.
II. The contribution of \(\tilde{X}_{s0}\) with \({\tilde{A}}=0\) and \({\tilde{b}}_{0}=0\). Here we will follow the same road map as in the case of \(\tilde{X}_{s1}\), but additional care will be needed in order to handle the additional time derivatives. The integral we need to consider is
We can integrate by parts once in \([0,T] \times {\mathbb{R}}^{n}\) to rewrite \(I_{X}^{0}\) in the form
In the middle term we switch the operator \(T_{{\tilde{g}}^{\alpha \beta}} T_{\tilde{X}_{s0}} \partial _{0}\) to the right, while integrating by parts once in time, in order to put it in the more symmetric form
For the energy term there is nothing new: as before, we use paraproduct rules to rewrite it as the desired leading part plus an acceptable error. We now consider the second space-time integral, where more care is needed. The operator
has a commutator structure, which is good. However we have to carefully decide on the order in which we commute, because, depending on whether \(\alpha =0\) or \(\beta = 0\), we might carelessly end up with a double time derivative. The positive feature, arising from the fact that we work with the metric \({\tilde{g}}\) rather than \(g\), is that if \((\alpha ,\beta )= (0,0)\) then there is a single commutator which does not involve time derivatives. For clarity we consider the four cases separately:
i) The case \(\alpha \neq 0\), \(\beta \neq 0\). This is the simplest case, where, commuting and peeling off operators of size \(O_{L^{2}}({\mathcal {B}}^{2})\), we write
$$ C = T_{\partial _{0} \tilde{X}_{s0}} T_{{\tilde{g}}^{\alpha \beta}}+ T_{\tilde{X}_{s0}} T_{\partial _{0} {\tilde{g}}^{\alpha \beta}} - [T_{{\tilde{g}}^{\alpha \beta}}, T_{\tilde{X}_{s0}}] \partial _{0}. $$
ii) The case \(\alpha = 0\), \(\beta \neq 0\). Here we use the same order as before.
iii) The case \(\alpha \neq 0\), \(\beta = 0\). Here we reverse the order, to write
$$ C = T_{{\tilde{g}}^{\alpha \beta}} T_{\partial _{0} \tilde{X}_{s0}} + T_{\partial _{0} {\tilde{g}}^{\alpha \beta}} T_{\tilde{X}_{s0}} - \partial _{0} [T_{{\tilde{g}}^{\alpha \beta}}, T_{\tilde{X}_{s0}}], $$
where the middle term is integrated by parts once more to move \(\partial _{0}\) together with \(\partial _{\alpha}\), at the expense of another negligible energy correction
$$ - \left . \int [T_{{\tilde{g}}^{\alpha 0}}, T_{\tilde{X}_{s0}}] \partial _{0} w \cdot \partial _{\alpha }w \, dx \right |_{0}^{T}. $$
iv) The case \(\alpha = 0\), \(\beta = 0\). Here we simply have
$$ C = T_{\partial _{0} \tilde{X}_{s0}}. $$
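The bookkeeping in all four cases rests on the exact Leibniz rule for paraproducts, which holds because \(T_{a}\) acts by frequency-localized multiplication:

```latex
% Exact Leibniz rule for paraproducts
\partial_0 \bigl( T_a w \bigr) \,=\, T_{\partial_0 a}\, w \,+\, T_a\, \partial_0 w,
\qquad\text{equivalently}\qquad
[\partial_0,\, T_a] \,=\, T_{\partial_0 a}.
```

The choice of commutation order in cases i)–iv) is then dictated by ensuring that, after all applications of this rule, no double time derivative ever falls on \(w\).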
Now we put together the terms in the four cases.
a) In the \(\partial _{0} \tilde{X}_{s0}\) term the multiplication order does not matter, and we can further replace it by \(T_{ T_{{\tilde{g}}^{\alpha \beta}} \partial _{0} \tilde{X}_{s0}}\) modulo \(O_{L^{2}}({\mathcal {B}}^{2})\) errors. Thus we retain the integral
b) In the \(\partial _{0} {\tilde{g}}^{\alpha \beta}\) term, however, the commutator is not negligible, so in addition to \(T_{\tilde{X}_{s0}} T_{\partial _{0} {\tilde{g}}^{\alpha \beta}}\) we also need the commutator \([T_{\partial _{0} {\tilde{g}}^{\alpha 0}}, T_{\tilde{X}_{s0}}]\). Hence we get two contributions,
c) In the \([T_{{\tilde{g}}^{\alpha \beta}}, T_{\tilde{X}_{s0}}]\) term, distinguishing between \(\beta = j \neq 0\) and \(\beta = 0\), we get
In the first integral we move \(\partial _{j}\) and the commutator term to the right, also commuting them, so the above expression is rewritten as
We retain the first term as it is, combine the second one with the second term in part (b) and discard the last one as perturbative, \(Err({\mathcal {B}}^{2})\).
Putting all terms together, we have rewritten \(I^{0}_{X}\), modulo perturbative terms, as
This can be simplified further by observing that, in view of Lemma 2.3, the term \(I_{X}^{05}\) is also perturbative. Thus we arrive at
We successively consider these terms:
IIa. The contribution of \(I_{X}^{01}\). This corresponds to the symbol
which is akin to one of the components of \(c_{X_{s},B}\) in (7.57). We can turn this into the corresponding component of \({\mathring{c}}_{\tilde{X}_{s},B}\). Precisely, given \(\tilde{X}_{s0}\) as in the representation (7.63), that component is
where we recall that the hat above is understood as nonexistent unless \(\beta = \gamma = 0\), in which case it is interpreted as the corrected expression (5.13). The difference between the two coefficients is easily seen to have size \({\mathcal {B}}^{2}\), so it is perturbative, as in Lemma 6.9. It remains to switch this modification of \(I_{X}^{01}\) to the Weyl calculus, which requires estimating the integral
This follows from the bound
which in turn follows from Lemma 5.12(b) after commuting \(a^{\gamma}_{0}\) out.
IIb. The contribution of \(I_{X}^{02}\). Exactly as above, this integral also corresponds to a term in \(c_{X_{s},B}\). Again, after a perturbative \(Err({\mathcal {B}}^{2})\) correction we can turn this into the corresponding term in \({\mathring{c}}_{\tilde{X}_{s},B}\), which has the para-coefficient
The associated integral is
We would like to switch this to Weyl calculus, but we need to be careful here because the convention for the Weyl form differs depending on whether \(\beta \) is zero or not.
If \(\beta \neq 0\) then the error corresponds to switching the operator on the left to Weyl calculus, and has the form
The same applies if \(\alpha = \beta = 0\). But if \(\alpha \neq 0\) and \(\beta = 0\) then we have to switch the paraproduct to the right, and then the Weyl correction is
We can rectify this discrepancy and switch this correction to the form in (7.98) by integrating twice by parts, first in \(x_{\beta}\) and then in \(x_{\alpha}\). Since we are in the case \(\beta = 0\), the first step yields an energy correction, namely
As \(\alpha \neq 0\), for this to be an acceptable \(Err({\mathcal {A}^{\sharp }})\) error we need the bound
This is obvious if \(\gamma \neq 0\), and follows from (5.15) otherwise.
Thus we are left with considering the correction in (7.98), summed over all \(\alpha \) and \(\beta \), which we would like to estimate perturbatively.
Here there is no structure in the \(\gamma \) summation, so we can fix \(\gamma \). The easier case is when \(\gamma \neq 0\). Then we can commute \(\partial _{\gamma}\) out, as well as \(a^{\gamma}\), and \(\partial _{\beta}\) in, writing
where the error term \(f\) satisfies
We may also correct the second order time derivative, arriving at
The remaining term is no longer directly perturbative, but its contribution may instead be estimated by integrating by parts,
The last term is a bounded energy correction, as
It remains to show that the first term is also perturbative,
Commuting \(\partial _{\alpha} \) inside and discarding \(a^{\gamma }\partial _{\gamma}\), this reduces to
which is again a consequence of Lemma 5.12(b).
It remains to consider the case \(\gamma =0\), where we take advantage of the hat correction. Precisely, using the \(u\) equation, we write
We substitute this into the paracoefficient in (7.98), peeling off perturbative contributions. Fixing \(\mu \) and \(\nu \) we may assume \(\mu \neq 0\) and arrive at
with \(f\) as in (7.99). At this point we can repeat the argument in the case \(\gamma \neq 0\).
IIc. The contribution of \(I_{X}^{03}\). We recall that this is
This term is easily seen to be perturbative unless the spatial frequency of \(\tilde{X}_{s0}\) is smaller than that of \(\partial _{0} {\tilde{g}}^{\alpha \beta}\), see Lemma 6.12. Thus we can think of the principal symbol of the product \(T_{\tilde{X}_{s0}} T_{\partial _{0} {\tilde{g}}^{\alpha \beta}}\) as being \(T_{\tilde{X}_{s0}} \partial _{0} {\tilde{g}}^{\alpha \beta}\). However some care is needed here with the error, which is lower order but not necessarily balanced. Precisely, using Proposition 6.16, we can expand this product into a leading part, an unbalanced subprincipal part and a perturbative term,
This yields a corresponding decomposition of \(I_{X}^{03}\) into
To better describe the first two terms we take a closer look at the coefficient \(\partial _{0} {\tilde{g}}^{\alpha \beta}\), for which we compute
Here we have a double time derivative \(\partial _{t}^{2} u\) when \(\delta = 0\), which we replace as before by \(\hat{\partial}_{t}^{2} u\) with perturbative errors. Once this is done, we may also replace all products by paraproducts, arriving at the modified expression
so that the difference is perturbative in the sense that
Finally we return to the operator setting, where we make the above substitution. In the principal part we can compound the outer paracoefficients at the expense of more negligible errors, writing it in a paradifferential form
where the order zero symbols \(q^{\alpha \beta}\) are given by
Here the symbol \(q^{\alpha \beta} \xi _{\alpha }\xi _{\beta}\) is a component of \({\mathring{c}}_{\tilde{X}_{s},B}\), as desired. All we need now is to convert the last expression for \(I_{X,main}^{03}\) to Weyl form. This conversion yields the additional error
which we need to estimate. Here we separate the three terms in \(q^{\alpha \beta}\). For the first term, after one commutation it remains to show that
which we get from Lemma 5.12. The second term is similar if we integrate by parts to switch \(\alpha \) and \(\beta \), at the expense of a bounded energy correction. Finally, the third term is exactly as in the case of \(I_{X}^{02}\).
Similarly, in the subprincipal term in (7.100) we may peel off perturbative errors to write it as a linear combination of expressions of the form
We postpone their analysis for later; for now we simply list the two types of contributions:
IId. The contribution of \(I_{X}^{04}\). We recall that this is
This has a similar treatment to \(I_{X}^{03}\). For the commutator above we must have again a smaller frequency on \(\tilde{X}_{s0}\), else this yields a perturbative contribution. Using Proposition 6.15 we expand the commutator into a leading part, an unbalanced subprincipal part and a perturbative term,
where the remainder \(R\) satisfies perturbative bounds of the form
We first consider the contribution of the leading part \(I_{X,main}^{04}\). For \(\partial _{j}{\tilde{g}}^{\alpha \beta}\) we use the expansion in (7.101) with the subscript 0 replaced by \(j \neq 0\), arriving at
where the order zero symbols \(q^{\alpha \beta}\) are given by
and the remainder \(R\) is as above. Then the leading part can be written as
Now the symbol \(q^{\alpha \beta} \xi _{0}\xi _{\alpha }\xi _{\beta}\) is a component of \({\mathring{c}}_{\tilde{X}_{s},B}\), as desired. It remains to convert the last expression for \(I_{X,main}^{04}\) to Weyl form. The error in doing that is
Estimating this expression requires the bound
Here \(q^{00}=0\) so we avoid the case of two time derivatives. This allows us to commute \(\partial _{\alpha }\partial _{\beta}\) inside and take \(\partial _{j}\) outside modulo perturbative terms. Then \(\partial _{j}\) yields the \(2^{k}\) factor, and we have reduced the problem to proving that
Relabeling, this becomes
The expression on the left vanishes if \(\delta =0\). This allows us to break the para-coefficient in two using Lemma 2.7 and replace this by
which is finally a consequence of Lemma 5.12.
Next we consider the subprincipal term. Here we use again the expansion in (7.101) and recombine paracoefficients to rewrite it as a linear combination of terms of the form
respectively
where \(h \in {\mathfrak {P}}S^{-2}\) roughly corresponds to \(\partial _{\xi}^{2} \tilde{X}_{s0}\). Here we can freely separate variables and reduce to the case when \(h\) is a function, including the multiplier part in \(L_{lh}\).
We remark that until now we were able to exclude the case when \(\alpha = \beta = 0\). However, at this point we need to separate the three types of contributions in order to take advantage of their structures. Because of this, from here on we have to also allow for the case \(\alpha =\beta = 0\), forfeiting the cancellation that would otherwise occur in this case between the different terms. We postpone the estimate for the subprincipal terms for the end of the proof.
III. The contribution of \(\tilde{Y}_{s0}\) with \({\tilde{A}}=0\) and \({\tilde{b}}_{0}=0\). Here we consider the integral
where we recall that \(\tilde{Y}_{s0} \in \partial _{x} {\mathfrak {P}}S^{0}\). We integrate once by parts to write
The last integral is an admissible energy correction. In both space-time integrals we move \(T_{\tilde{Y}_{s0}}\) to the left, and combine the two paraproducts as in Lemma 2.7, peeling off perturbative contributions, to get
The symbol of the bilinear form in the first integral is the desired component of \({\mathring{c}}_{s}\), but we need to convert it to Weyl calculus. This yields an error which is half of the second integral, which in turn needs to be estimated perturbatively. Commuting \(\partial _{\beta}\) inside, we are left with
Here \(\tilde{Y}_{s0}\) is of the form \(\tilde{Y}_{s0}= \partial _{x} h\), with \(h \in {\mathfrak {P}}S^{0}\). We can harmlessly commute \(\partial _{x}\) out, to arrive at
In the absence of boundaries at \(t = 0,T\) we could integrate by parts once more to rewrite this as
and then use Lemma 6.4. The same argument applies if we add in the boundaries, by carefully tracking the boundary contributions. Precisely, we use the lemma to rewrite the expression \(J\) in (7.108) as follows:
Now, in view of Lemma 6.4, both the energy and the flux terms are perturbative.
IV. The contribution of the gradient potential \({\tilde{A}}\) and of \({\tilde{b}}_{0}\). We discuss the two together, as their contributions are similar. This has the form
which we need to shift to Weyl calculus after peeling off a perturbative contribution. For instance the contribution of \(\tilde{Y}_{s0}\) is directly perturbative. On the other hand, \({\tilde{A}}^{\gamma}\) contains \(\partial _{0}^{2} u\) terms which need to be corrected, while \({\tilde{b}}_{0}^{\gamma}\) does not. In any case, the correction can be freely added as its contribution has size \(Err({\mathcal {B}}^{2})\). Below we denote by \({\mathring{\tilde{A}}}\) the corrected version of \({\tilde{A}}\).
Next we consider the contribution of \(\tilde{X}_{s1}\), where we need to shift the operator product \(T_{\tilde{X}_{s1}} T_{{\tilde{A}}^{\gamma}}\) to the Weyl calculus via Lemma 6.12:
and similarly for \(b_{0}\), i.e. the desired term plus a null unbalanced lower order term plus a perturbative contribution. We note here that the contribution of the null unbalanced lower order term has the form
Finally we consider the contribution of \(\tilde{X}_{s0}\),
We use again the product formula for paraproducts to write
which generates a leading term and a subprincipal term.
The leading term is
Its symbol is as needed, but we still have to switch it to Weyl calculus. This switch introduces an error
To bound its contribution, we would like to have the symbol bound
Here we use the expressions for \({\tilde{A}}^{\gamma}\) and \(b_{0}^{\gamma}\), take out bounded paracoefficients, and we are left with
But this is in turn a consequence of Lemma 5.12.
To conclude, we record the form of the subprincipal term,
V. The unbalanced lower order terms. These are the expressions identified earlier, which we recall here:
All of these exhibit a null structure.
We directly compress four of these into the expression
where the analysis will be slightly different depending on whether \(\gamma \) and \(\delta \) are zero or not.
In \(I_{X,sub}^{042}\) the case \(\alpha \neq 0\) is included above. If instead \(\alpha = 0\) then we move \(\partial _{\alpha}\) to the left by integrating by parts, so that, after a perturbative energy correction, we arrive at
Now we use the paradifferential \(T_{\tilde{P}}\) equation for \(w\), which after more perturbative errors allows us to replace the leading \(\partial _{0}^{2}\) operator by \(\partial \partial _{x}\), with a \({\mathfrak {P}}\) paracoefficient. Then \(\partial _{x}\) combines with \(T_{{\mathfrak {P}}S^{-2}}\) to give \(T_{{\mathfrak {P}}S^{-2}}\), thereby reducing the problem to the case of (7.118).
Finally in \(I_{X,sub}^{041}\) we commute inside and distribute the \(\partial _{\alpha}\) derivative, peeling off perturbative errors. We arrive at
The first term is estimated by commuting \(T_{{\tilde{g}}^{\alpha \delta}}\) inside \(L_{lh}\) and onto the first argument, after which we use Lemma 5.12. In the second term we pull \(\partial _{\beta}\) out, reducing the problem either to \(I_{X,sub}^{042}\), which was discussed earlier, or to
But here we can pull out one of the \(\partial _{x}\) operators to reduce to the case of (7.118).
After this discussion we have reduced the problem to the estimate for \(I_{sub,\gamma \delta}\). Here, from easiest to hardest, we need to consider the case when neither \(\gamma \) nor \(\delta \) is zero, then when exactly one of them is zero, and finally when both of them are zero. We will first illustrate the principle in the easiest case, and then describe the additional complications for the most difficult case. We leave the intermediate case for the reader.
A. The case \(\gamma ,\delta \neq 0\). The argument in this case consists of three integrations by parts in a circular manner. Here we have \(h \in {\mathfrak {P}}S^{-1}\). We may include \(\partial _{\delta}\) in \(h\) in which case \(h \in {\mathfrak {P}}S^{0}\). Separating variables, the problem can be further reduced to \(h \in {\mathfrak {P}}\). In the computations below we omit \(h\) altogether, as it does not play any role. Then it remains to bound the integral
Similarly, derivatives applied to \(g\) yield perturbative contributions, of size \(O({\mathcal {B}}^{2})\), and will not be explicitly written in order to avoid cluttering the formulas. In the absence of boundary terms, we compute as follows, integrating by parts in order to convert the null form into three \(T_{\tilde{P}}\) operators modulo admissible errors:
We now distribute \(T_{{\tilde{g}}^{\alpha \beta}}\), noting that any commutator errors involve derivatives of \({\tilde{g}}\) and thus are perturbative. We arrive at
It remains to add the boundary terms at times \(t=0,T\) into the above computation. Such boundary terms arise from the integration by parts with respect to \(x_{0}\). We obtain the following enhanced version of (7.120):
The boundary terms are easily seen to be lower order energy corrections, so it remains to estimate the interior contributions. For the first one we can use the \(w\) equation to get the fixed time bounds
which suffices by combining the two components of the last term with either the \({\mathcal {A}}\) or the ℬ bound for \(u\) in \(L_{lh}\). The other two interior contributions reduce to the bound
which is a consequence of Lemma 5.11 and which suffices to estimate the expressions in (7.121).
B. The case \(\gamma =\delta = 0\). In this case we seek to estimate the integral
Here the hat correction plays a perturbative role and could be omitted. However, in the computations below we need to keep it in order to be able to estimate energy corrections. Our computations emulate the simpler case considered above, but with some care in order to avoid iterated time derivatives. Integrating by parts the \(\beta \) derivative we get
where the last term accounts for the boundary contributions obtained when \(\beta =0\). In the first integral we perturbatively move \(T_{{\tilde{g}}^{\alpha \beta}}\) on the first \(L_{lh}\) argument and then use Lemma 5.11; this allows us to move the entire first integral into the error, leading us to
On the other hand we can perturbatively drop the hat and integrate by parts the \(\alpha \) derivative. This gives
In the first integral we perturbatively move \(T_{{\tilde{g}}^{\alpha \beta}}\) on the second \(L_{lh}\) argument; using the paradifferential \(w\) equation where the \({\tilde{A}}\) and \({\tilde{b}}\) terms are perturbative, this yields a lower order correction to our multiplier. Switching the \(\alpha \) and \(\beta \) indices we obtain
where \(\|Q\|_{L^{2} \to L^{2}} \lesssim {\mathcal {A}^{\sharp }}\).
The next step is to add the relations (7.125) and (7.126). Here we separate the cases \(\alpha = j \neq 0\) and \(\alpha =0\), where in the first case we can drop the hat and pull out the \(\partial _{\alpha}\) in \(L_{lh}\),
In the first integral we integrate by parts to switch \(\partial _{0}\) to the left and then \(\partial _{j}\) to the right. Then we distribute the \(\partial _{0}\) on the left. This yields
In the first integral we may perturbatively correct \(\partial _{0}^{2} u\); this allows us to put back together the cases when \(\alpha \) is zero and nonzero,
Finally, we commute \(T_{{\tilde{g}}^{\alpha \beta}} \) to the right factor, and use the \(w\) equation to add another perturbative factor to our multiplier. This gives
with a modified \(Q\), as desired.
The proof of Proposition 7.11 is now concluded. □
We now conclude the proof of Theorem 7.1 using Proposition 7.11, with the vector field \(\tilde{X}_{s}\) chosen as in Proposition 7.7 and \(\tilde{Y}_{s0}\) defined as in (7.91). For these we have at our disposal not only the conclusion of Proposition 7.7, but also the refined version in Proposition 7.9. This guarantees that the symbol \({\tilde{c}}_{s}\) in (7.92) has size \({\mathcal {B}}^{2}\), in the sense that its coefficients in (7.93) satisfy
These conditions, in turn, guarantee that the flux term \(C_{s}\) in our energy estimate (7.96) satisfies
and thus the conclusion of Theorem 7.1 follows.
8 Energy estimates for the full equation
Our objective here is to prove energy estimates for the solution \(u\) to the minimal surface equation (1.7) in \(\mathcal {H}^{s} = H^{s} \times H^{s-1}\), in terms of our control parameters \({\mathcal {A}^{\sharp }}\) and ℬ.
Theorem 8.1
For each \(s \geq 1\) there exists an energy functional \(E^{s}_{NL}\) for the minimal surface equation (1.7) in \(H^{s} \times H^{s-1}\) with the property that for all \(\mathcal {H}^{s}\) solutions \(u \) to (1.7) with \({\mathcal {A}^{\sharp }}\ll 1\) and \({\mathcal {B}}\in L^{2}\) we have:
a) coercivity,
b) energy bound,
Because of the assumption \({\mathcal {A}^{\sharp}}\ll 1\), in this section we no longer need to track the dependence of implicit constants on \({\mathcal {A}^{\sharp}}\). The exception to this is in the proof of Lemma 8.4, where the smallness of \({\mathcal {A}^{\sharp}}\) is used in order to guarantee the invertibility of our partial normal form transformation; even there, we only need to use linear and quadratic \({\mathcal {A}^{\sharp}}\) factors.
The rest of this section is devoted to the proof of the theorem. This has two main ingredients:
(1) Reduction to the paradifferential equation, using normal form analysis.

(2) Energy estimates for the paradifferential equation, which have already been proved in Theorem 7.1.
Hence, our primary objective here will be to carry out the above reduction. We recall the minimal surface equation,
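Schematically (with the precise normalization being that of (1.7)), the minimal surface equation for a graph over Minkowski space has the familiar divergence form:

```latex
\partial_\alpha \left( \frac{m^{\alpha\beta}\, \partial_\beta u}
{\sqrt{1 + m^{\gamma\delta}\, \partial_\gamma u\, \partial_\delta u}} \right) = 0,
\qquad m = \operatorname{diag}(-1,1,\dots,1),
```

which, once expanded, is a quasilinear wave equation \(g^{\alpha\beta}(\partial u)\, \partial_\alpha \partial_\beta u = 0\) with a metric \(g\) depending smoothly on \(\partial u\).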
In order to use the energy estimates obtained in the previous section, we write this in paradifferential form:
where the source term \(N(u)\) is given by
Here we cannot treat \(N\) perturbatively; precisely, we do not have an estimate of the form
even though \(N(u)\) is cubic in \(u\), and the above inequality is dimensionally correct. This is because \(N\) contains some unbalanced contributions.
To address this issue, our strategy will be to correct \(u\) via a well chosen normal form transformation, in order to eliminate the unbalanced part of \(N(u)\). But in order to do this, we have to first identify the unbalanced part of \(N(u)\), and reveal its null structure. A first step in this direction is to better describe the contributions of the metric coefficients \(g^{\alpha \beta}\) in \(N\); explicitly we want to extract the renormalizable terms (i.e. the terms to which we can apply a normal form correction). For this we express \(g^{\alpha \beta}\) paradifferentially as follows:
Lemma 8.2
The metric \(g^{\alpha \beta}\) can be expressed paradifferentially as follows
where \(R (u)\) satisfies the following balanced bounds for \(s \geq 1\):
as well as
Proof
The representation in (8.5) and the bound (8.6) for \(R\) follow from (3.8) and Lemma 2.9. To get (8.7) one estimates each term in \(R\) separately, using no cancellations. □
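Recall that throughout, \(T\) and \(\Pi\) denote the standard Bony paraproduct decomposition; schematically, with \(P_k\) the usual Littlewood-Paley projections (the frequency-separation constant 4 below is a matter of convention):

```latex
f\,h \;=\; T_f h \;+\; T_h f \;+\; \Pi(f,h), \qquad
T_f h := \sum_{k} P_{<k-4} f \; P_k h, \qquad
\Pi(f,h) := \sum_{|j-k| \le 4} P_j f \; P_k h.
```

In this language, extracting the renormalizable part of \(g^{\alpha\beta}\) amounts to isolating the low-high paraproduct terms, where the metric coefficients sit at low frequency.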
This suggests that the nonlinear contribution \(N(u)\) should be seen as the sum of two terms
where \(N_{1}\) has null structure and \(N_{2}\) is balanced,
We will first prove that \(N_{2}(u)\) is a perturbative term:
Lemma 8.3
The expression given by \(N_{2}(u)\) satisfies the bound
Proof
We begin with the first difference in \(N_{2}\), and look separately at each \(\alpha \), \(\beta \) and \(\gamma \). If \((\alpha , \beta )\neq (0,0)\) then we apply Lemma 2.7 to obtain
If \((\alpha , \beta )= (0,0)\) then we use the wave equation for \(u\) with the \(\tilde{g}\) metric (3.19) to write
exactly as in Lemma 5.4. Then for the first term we have the estimate
which suffices in order to apply Lemma 2.7 as above in order to estimate its contribution.
On the other hand, the bound for the contribution of \(\pi _{2}(u)\) is easier because by Lemma 5.4 we have the direct uniform bound
Now, we turn our attention to the second term in \(N_{2}(u)\), where we again discuss separately the \((\alpha , \beta )\neq (0,0)\) and \((\alpha , \beta ) = (0,0)\) cases.
For the \((\alpha , \beta )\neq (0,0)\) case we use the bound in (8.6) to obtain
Next we consider the case \((\alpha , \beta )= (0,0)\), and observe that we again need to use the decomposition (8.9). The contribution of \(\hat{\partial}_{0}^{2} u\) is estimated using (8.10) and the bound (8.6) for \(R\), exactly as above:
For the \(\pi _{2}(u)\) contribution we use the pointwise bound in (8.11) and the \(H^{s-1}\) bound in (8.7) for \(R\),
Finally, a similar analysis leads to the bound for the balanced term \(\Pi (\partial _{\alpha} \partial _{\beta}u, R(u))\). □
To account for the unbalanced part \(N_{1} (u)\) of \(N\) we introduce a normal form correction
Our goal will be to show that the normal form variable solves a linear inhomogeneous paradifferential equation with a balanced source term.
Lemma 8.4
The normal form correction above has the following properties:
a) It is bounded from above and below,
$$ \Vert \tilde{u}[t] \Vert _{\mathcal {H}^{s}}\approx \Vert u[t] \Vert _{ \mathcal {H}^{s}}. $$
b) It solves an equation of the form
$$ ( \partial _{\alpha} T_{g^{\alpha \beta}}\partial _{\beta} - T_{A^{ \gamma}}\partial _{\gamma}) \tilde{u}=N_{2}(u) + \partial _{t} R_{1}(u) +R_{2}(u), \tag{8.13} $$
where
$$ \Vert R_{1}(u)\Vert _{H^{s}}\lesssim {\mathcal {B}}^{2} \Vert u[t] \Vert _{\mathcal {H}^{s}}, \qquad \Vert R_{2}(u)\Vert _{H^{s-1}} \lesssim {\mathcal {B}}^{2} \Vert u[t]\Vert _{\mathcal {H}^{s}}, \tag{8.14} $$
and
$$ \Vert R_{1} (u)\Vert _{H^{s-1}}\lesssim {\mathcal {A}^{\sharp }}^{2} \Vert u[t]\Vert _{\mathcal {H}^{s}}. \tag{8.15} $$
We remark that here we expand the meaning of “balanced source terms” to include expressions of the form \(\partial _{t} R_{1}\) with \(R_{1}\) as above. This is needed because time derivatives are often more difficult to estimate in our context, and is permissible in view of the result in Theorem 4.5.
Proof
a) In view of the smallness of \({\mathcal {A}^{\sharp }}\), for the boundedness of the normal form it suffices to show that we have the fixed time bound
as well as
For the first bound we directly have
We prove the second bound in a similar manner, but we first apply the time derivative and analyze each term separately:
There are multiple cases, arising both from the strategy we implement for terms involving two time derivatives, and from the particular structure of each of the terms.
We begin with \(r_{1}\), where we need to separate the \(\beta \neq 0\) and \(\beta = 0\) cases. The easiest case is when \(\beta \neq 0\), where we have
Here we have used the energy control we have for \(\partial _{t} u\), which in turn gives control of all spatial derivatives of \(\partial _{t} u\). For the case \(\beta =0\) we use the decomposition for \(\partial _{t}^{2} u\) as in Lemma 5.4. For the first component we use the second bound in (5.15),
For the second component we argue similarly but using the second bound in (5.16) together with Bernstein’s inequality,
We continue with the bound for \(r_{2}\), where we do not need to distinguish between the time and space derivatives,
Lastly, we need to bound \(r_{3}\). Here we argue as in (8.19),
so it remains to prove the following bound for \(\partial _{t} \partial ^{\beta }u\),
Here we use the chain rule for the paracoefficient to write \(\partial _{t} \partial ^{\beta }u\) as a linear combination of \(\partial ^{2}_{t} u\) and \(\partial _{t}\partial _{x} u\),
where \(h:= h(\partial u)\). For the \(\partial ^{2}_{t}\) term we use the minimal surface equation (3.5), arriving at the representation
where \(\tilde{h}\) incorporates the corresponding metric coefficients. As before, we need to use a Littlewood-Paley decomposition
The first two terms are easy to estimate using only \(L^{\infty}\) bounds for \(\partial u\) and \(\tilde{h}(\partial u)\),
Finally for the third we use instead the \({\mathcal {A}^{\sharp }}\) component of the \(\mathfrak{C}_{0}\) norm for both \(\partial u\) and \(\tilde{h}(\partial u)\),
Hence (8.20) follows. This finishes the proof of (8.17), and thus of the boundedness from above and below of the normal form transformation in our desired Sobolev space \(\mathcal {H}^{s}\).
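Bernstein's inequality, used above to trade integrability for a frequency factor, takes the standard form: for a function localized at dyadic frequency \(2^k\) and \(1 \le p \le q \le \infty\),

```latex
\| P_k f \|_{L^q(\mathbb{R}^n)} \;\lesssim\; 2^{kn\left(\frac{1}{p} - \frac{1}{q}\right)} \, \| P_k f \|_{L^p(\mathbb{R}^n)}.
```

In particular \(\|P_k f\|_{L^\infty} \lesssim 2^{\frac{kn}{2}} \|P_k f\|_{L^2}\), which is the instance used here to convert \(L^2\) energy bounds into pointwise bounds.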
b) We begin with the supposedly easier contribution, namely the term \(T_{A^{\gamma}}\partial _{\gamma }u_{2}\). To bound this term we would like to commute the \(\partial _{\gamma}\) and place it in front of the product,
This works well for the first term on the RHS. However, the last term would be problematic, as it may contain three derivatives with respect to time. To avoid this issue we first substitute \(A^{\gamma}\), which by (3.13) is given by
with the more manageable leading part \(\mathring{A}^{\gamma}\) given by
Here the hat correction is from Definition 5.3. Then
We will successively place each of these terms in \(\partial _{t} R_{1}\) or \(R_{2}\). We place the first term in \(R_{2}\). To prove (8.14) for this term, we use the bounds (8.16) and (8.17) for \(u_{2}\), and Lemma 6.9 to bound the coefficient
We will place the second term in \(\partial _{t} R_{1}\) if \(\gamma =0\) and in \(R_{2}\) if \(\gamma \neq 0\).
In order to prove (8.14) we measure \(u_{2}\) in \(H^{s+\frac{1}{2}}\),
Then (8.14) follows if we can bound the coefficient \(\mathring {A}\) by
But this is also a consequence of Lemma 6.9, see (6.12). On the other hand for the bound (8.15) we estimate \(u_{2}\) in \(H^{s}\) as in (8.16), and then it suffices to show that
which is similar.
The last term is placed in \(R_{2}\) using again the bounds (8.16) and (8.17) for \(u_{2}\) on one hand and Lemma 5.12 on the other hand, to obtain
Now we consider the main term \(\partial _{\alpha} T_{g^{\alpha \beta}}\partial _{\beta}u_{2}\), which can be expanded as
Depending on whether \(\alpha =0\) or \(\alpha \neq 0\), we place the middle term into \(\partial _{t}R_{1}\) or \(R_{2}\), respectively:
Here we use the property \(\partial _{\beta}\partial ^{\gamma}u \in \mathfrak{DC}\) to handle the case when \(\beta =0\) for the first bound, and (8.20) for the second.
The first term can be rewritten in the form
by using Lemma 2.5, as well as Lemma 5.4 for the case \((\beta , \gamma )=(0,0)\). Similarly the last term can be rewritten in the analogous form
Finally we distribute the \(\alpha \) derivative in both (8.22) and (8.23). For the first term on the right in (8.22) we get
We place \(s_{1}\) in \(R_{2}\) using Lemma 5.12,
The term \(s_{2}\) is also estimated perturbatively using the fact that \(\partial _{\alpha}\partial ^{\gamma}u \in \mathfrak{DC}\), which allows us to decompose it as a sum \(f_{1}+f_{2}\) as in (5.20). Then we estimate
using Lemma 5.4 for \(\widehat{\partial _{\beta}\partial _{\gamma}} u\). In \(s_{3}\) we can switch \(T_{g}\) onto the other argument of \(\Pi \) using Lemma 2.5 and remove the hat correction, so that it becomes half of \(N_{1}\).
The last remaining term to bound is the one on the RHS of (8.23). Here we distribute again the \(\alpha \)-derivative
By inspection we observe that the first term in the equality above is the second half of \(N_{1}\). The remaining three terms can be estimated perturbatively using exactly the same approach as in the case of (8.22). □
In view of Lemma 8.3 we can include \(N_{2}(u)\) into \(R_{2}(u)\) in (8.13), obtaining the shorter representation of the source term
where \(R_{1}\) and \(R_{2}\) satisfy the bounds (8.14) and (8.15).
For the homogeneous paradifferential problem we have the \(\mathcal {H}^{s}\) energy \(E^{s}\) given by Theorem 7.1. We will use this to construct our desired nonlinear energy \(E^{s}_{NL}\) in Theorem 8.1. Because we have the source term \(\partial _{t}R_{1}\), the associated nonlinear energy will not be simply given by \(E^{s}(\tilde{u}[t])\). Instead, the correct energy is the one provided by Theorem 4.5, namely
where the correction \(r[t]\) is given by
Then by the estimate in (4.29) we obtain
where \(r_{1}[t]\) is as in (4.28),
Our nonlinear energy \(E^{s}_{NL}\) is coercive because \(r[t]\) is small,
due to the bound (8.15). Finally, we control the time derivative of the energy, because
This is due to the bound in (8.14).
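Schematically, the coercivity of \(E^{s}_{NL}\) together with the last bound closes the energy estimate via Gronwall's inequality, using that \({\mathcal{B}} \in L^{2}\) in time:

```latex
\frac{d}{dt} E^{s}_{NL}(u[t]) \;\lesssim\; {\mathcal {B}}^{2}(t)\, E^{s}_{NL}(u[t])
\;\;\Longrightarrow\;\;
E^{s}_{NL}(u[t]) \;\lesssim\; \exp\Big( C \int_{0}^{t} {\mathcal {B}}^{2}(\tau)\, d\tau \Big)\, E^{s}_{NL}(u[0]).
```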
9 Energy and Strichartz estimates for the linearized equation
Our objective here is to prove that the homogeneous linearized equation is well-posed in the space \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) in two dimensions) and satisfies Strichartz estimates with an appropriate loss of derivatives, namely (4.33) with \(S=S_{AIT}\), under the assumption that the associated linear paradifferential equation has similar properties. The main result of this section is as follows:
Theorem 9.1
Let \(s\) be as in (1.20) (respectively (1.19) in dimension \(n = 2\)). Let \(u\) be a smooth solution for the minimal surface equation in a unit time interval, and which satisfies the energy and Strichartz bounds
Assume also that the associated linear paradifferential equation
is well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) in dimension \(n = 2\)) in a unit time interval, and satisfies the full Strichartz estimates (4.43), with \(S= S_{AIT}\), in the same interval.
Then the homogeneous linearized equation
is also well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) in dimension \(n = 2\)), and satisfies the homogeneous form of the Strichartz estimates in (4.33) with \(S = S_{AIT}\).
We continue with several comments on the result in the theorem, in order to better place it into context.
- Up to this point we only know that both the full equation and the associated linear paradifferential equation satisfy good energy estimates, but we do not yet know that they also satisfy the corresponding Strichartz estimates. This is, however, not a problem, as the main result of this section, namely Theorem 9.1 above, will only be used as a module within our main bootstrap argument in the last section of the paper, by which time we will have already established the energy and Strichartz estimates for both the full equation and the linear paradifferential equation.
- The exponent \(s\) in the above result need not be the same as the one in our main result in Theorem 1.3; it can be taken to be smaller, as long as it still satisfies the constraints in (1.19), (1.20).
- While we can no longer control the linearized evolution purely in terms of the control parameters \({\mathcal {A}}\), \({\mathcal {A}^{\sharp }}\), and ℬ, these still play a role in the analysis. The hypothesis of the theorem guarantees that
$$ {\mathcal {A}^{\sharp }}\ll 1, \qquad \| {\mathcal {B}}\|_{L^{2}} \ll 1. $$
- The exponent \(\delta \) in the bound (4.43) with \(S= S_{AIT}\) should be thought of as being sufficiently small, compared with the distance between \(s\) and the threshold in (1.19), (1.20).
- The smoothness assumption on \(u\) is not used in any quantitative way. Its role is only to ensure that solutions of the linearized problem already exist, so that we can skip an existence argument. Thus, by a density argument, the result of the theorem may be seen as an a priori estimate for smooth solutions \(v\) to the linearized equation. As our rough solutions will be obtained in the last section as limits of smooth solutions, this assumption may be discarded a posteriori.
- The reason we only consider the homogeneous case in the linearized equation (9.3) is to shorten the proof, as this is all that is used later in the last section. However, the result also extends to the inhomogeneous case. In particular, in dimension \(n \geq 3\) this is an immediate consequence of Theorem 4.12, but in dimension \(n=2\) an extra argument would be needed.
One major simplification in this section, compared with the previous two sections, is that we no longer face the earlier difficulties in estimating second order, and even some third order, time derivatives of \(u\). In particular, we have the following relatively straightforward lemma:
Lemma 9.2
For solutions \(u\) to the minimal surface equation as in (9.1) we have the bounds
as well as
where \(\delta _{0} >0\) depends only on \(s\) and
respectively
Proof
For (9.4) we only need to consider the second order time derivatives, which we can write using the minimal surface equation as
By Moser inequalities we have \(\|h(\partial u)\|_{H^{s-1}} \ll 1\). Since \(s -1 > n/2\), it is easily verified that the space \(S^{s-2}_{AIT}\) is left unchanged by multiplication by \(h(\partial u)\). The same argument applies to derivatives of the metric \(\partial {\hat{g}}\).
For (9.5) we can use again the minimal surface equation to obtain the representation
where we can further write
Hence we need to multiply two functions in \(S^{s-2}_{AIT}\), which contains a range of mixed \(L^{p}\) norms at varying spatial Sobolev regularities. We can do this optimally if both mixed norms can be chosen to have non-negative Sobolev index. In order to avoid using Sobolev embeddings we further limit the range of exponents to the case when one of the Sobolev indices may be taken to be zero. This gives the range of exponents in the lemma. □
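The Moser-type bounds invoked in this and the following proofs are the classical ones: for a smooth function \(F\) with \(F(0)=0\) and \(\sigma > 0\),

```latex
\| F(w) \|_{H^{\sigma}} \;\le\; C\big( \| w \|_{L^{\infty}} \big)\, \| w \|_{H^{\sigma}} .
```

Applied with \(w = \partial u\) and \(\sigma = s-1 > n/2\), and combined with the algebra property of \(H^{s-1}\), this shows that multiplication by \(h(\partial u)\) preserves the spaces in question.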
Next we discuss the strategy of the proof. The first potential strategy here would be to try to view the equation (9.3) as a perturbation of (9.2). Unfortunately such a strategy does not seem to work in our context, because this would require a balanced estimate for the difference between the two operators, whereas this difference contains some terms which are clearly unbalanced.
To address the above difficulty, the key observation is that the aforementioned difference exhibits a null structure, at least in its unbalanced part. This opens the door to a partial normal form analysis, in order to develop a better reduction of (9.3) to (9.2). Because of this, the proof of the theorem will be done in two steps:
(1) The normal form analysis, where a suitable normal form transformation is constructed.

(2) Reduction to the paradifferential equation, using the above normal form.
9.1 Preliminary bounds for the linearized variable
The starting point for the proof of the theorem is to rewrite the divergence form of the linearized equation (9.3) as an inhomogeneous paradifferential evolution (9.2) with a perturbative source \(f\), as follows:
While we cannot directly prove a balanced cubic estimate for \(f\), a useful initial step is to establish a quadratic estimate for it. The expression for \(f\) involves \(v\) and \(\partial _{t} v\), which we already control, but also \(\partial _{t}^{2} v\), which we do not. So we estimate it first:
Lemma 9.3
For solutions \(v\) to (9.3) we have
Proof
We consider the case \(n \geq 3\), and comment on the case \(n=2\) at the end. Using the equation (9.3) for \(v\), we may write
Here by Moser inequalities we have
Then, using also (9.4) the conclusion of the Lemma follows from the straightforward multiplicative estimates
where it is important that \(s > \frac{n}{2}+1\) and \(s > \frac{5}{2}\). This last condition is not valid in dimension \(n=2\), where we only ask that \(s > 2+\frac{3}{8}\). This is why the Sobolev exponents in this case need to be increased by \(\frac{1}{8}\). □
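The multiplicative estimates above are instances of the standard Sobolev product rule: under the usual non-endpoint conditions,

```latex
\| f\,h \|_{H^{\sigma}} \;\lesssim\; \| f \|_{H^{\sigma_1}}\, \| h \|_{H^{\sigma_2}},
\qquad \sigma \le \min(\sigma_1, \sigma_2), \quad \sigma_1 + \sigma_2 \ge 0, \quad \sigma_1 + \sigma_2 - \sigma > \frac{n}{2}.
```

The constraints \(s > \frac{n}{2}+1\) and \(s > \frac{5}{2}\) in the proof arise precisely from verifying these conditions for the exponents at hand.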
We now return to the quadratic estimate for the source term in (9.6):
Lemma 9.4
Let \(v \in S^{\frac{1}{2}}_{AIT}\) satisfy (9.3). Then \(v\) also solves the inhomogeneous paradifferential equation
where the source term \(f\) satisfies the following bounds:
a) For \(n \geq 3\) we have the uniform bound
and the space-time bound
with
b) For \(n =2 \) we have the uniform bound
and the space-time bound
Proof
To avoid cluttering the notation, we prove the result in the case \(n \geq 3\). The two dimensional case is identical up to appropriate adjustments of the \(L^{p}\) exponents.
We write
For the two terms where \(\partial _{\alpha}\) has fallen on \(g\), we have
Finally, in the cases where the \(\partial _{\alpha}\) has fallen on \(v\), we easily obtain the same estimate due to a good balance of derivatives. Here we use Lemma 9.3 to bound second derivatives of \(v\). □
9.2 The normal form analysis
The estimate in Lemma 9.4 is suboptimal as it does not recognize the cubic structure of the source. This is due to components of the source in which the linearized variable \(v\) is the second highest frequency, and which are not efficiently balanced with respect to derivatives. In fact, these cubic terms may heuristically be viewed as quadratic with a low frequency coefficient.
To better understand the source terms, we begin with a better description of the metric coefficients. By applying Lemma 2.9 to \(g^{- \frac{1}{2}}g^{\alpha \beta}\) (see also (3.8)) and rearranging, we may write the paradifferential expansion
where \(R\) satisfies favourable balanced bounds, as in Lemma 5.7,
To obtain a cubic estimate for (9.6), we substitute (9.13) in (9.6) and write
where
consists of the essentially quadratic, nonperturbative components, while
consists of the balanced, directly perturbative components. We address the essentially quadratic components in \(N_{1}(u)\) by passing to a renormalization \(\tilde{v}\) of \(v\),
This renormalization eliminates the components of the source where the linearized variable \(v\) is the second highest frequency. We thus replace \(N_{1}(u)\) with a source consisting of terms where \(v\) appears only at the third highest frequency, and which may therefore be viewed as genuinely cubic.
Proposition 9.5
Let \(v \in S^{\frac{1}{2}}_{AIT}\) be a solution for (9.3). Then the following two properties hold in dimension \(n \geq 3\):
(i) Equivalent norms:
(ii) \(\tilde{v}\) solves a good linear paradifferential equation of the form
where the source terms are perturbative:
as well as
The same result holds in two space dimensions at the level of \(v \in \mathcal {H}^{\frac{5}{8}}\).
Proof
(i) For the bound (9.17), it suffices to estimate \(v_{2}\) as follows:
The first term of \(v_{2}\) can be directly estimated using scale invariant \({\mathcal {A}}\) bounds,
The third and the fourth terms are similar. However, for the second term we need to use the \({\mathcal {A}^{\sharp }}\) control norm combined with Bernstein’s inequality:
We next consider (9.22), where we distribute the time derivative, obtaining several types of terms:
a) Terms with distributed derivatives, namely \(T_{T_{\partial u} \partial v} \partial u\) and \(\Pi (T_{\partial u} \partial v , \partial u)\). We estimate the first by
and the second, using Sobolev embeddings, by
b) Terms with two derivatives on the high frequency \(u\), namely \(T_{T_{\partial u} v} \partial ^{2} u\) and \(\Pi (T_{\partial u} v , \partial ^{2} u)\). In view of the bound (9.4), the corresponding estimate is nearly identical to case (a) above.
c) Terms with \(\partial _{t} \partial ^{\gamma }u\). Here we know that \(\partial _{t} \partial ^{\gamma }u \in H^{s-2}\), so we arrive at estimates which are also similar to case (a).
d) Terms with two derivatives on \(v\). If one of them is spatial (i.e. \(\gamma \neq 0\)) then this is similar to or better than case (a). So we are left with the expressions \(T_{T_{\partial u} \partial _{t}^{2} v} u\) and \(\Pi (T_{\partial u} \partial _{t}^{2} v , u)\). But there we can use the bound (9.7) and complete the analysis again as in case (a).
(ii) The proof of (9.18) along with the estimates (9.19), (9.20) will be accomplished in four steps.
1) We first estimate the balanced source term component \(N_{2}(u)\) from (9.15). We consider below the paraproduct \(T\) term, but the \(\Pi \) term is similar. We first consider the cases where the outer derivative \(\partial _{\alpha }= \partial _{i}\) is a spatial derivative, which we will place in \(f_{2}\). We have by Lemma 5.7 (see (9.14) above)
where the \({\mathcal {B}}^{2}\) factor is integrable in time. We place the case where \(\partial _{\alpha }= \partial _{0}\) in \(\partial _{t} f_{1}\), estimating
and
as well as
Here we have a single ℬ factor, which is \(L^{2}\) in time, as needed for the \(L^{2} L^{\infty}\) Strichartz norm in (9.20).
2) Next, we apply product and commutator lemmas to exchange \(N_{1}(u)\) for an equivalent expression up to perturbative errors, in preparation for comparison with the contribution from the normal form corrections. Here, we discuss the first term of \(N_{1}(u)\),
but the remaining terms, including the balanced \(\Pi \) terms, are similar, using the analogous product and commutator lemmas. We first consider the cases where the outer derivative \(\partial _{\alpha }= \partial _{i}\) is a spatial derivative, and place all perturbative errors in \(f_{2}\). By an application of product and commutator Lemmas 2.7 and 2.4, we may replace (9.23) with
Then applying Lemma 2.4 and the estimate (2.13) in Lemma 2.8, it suffices to consider
In the case where \(\partial _{\alpha }= \partial _{0}\), we place all perturbative errors in \(\partial _{t} f_{1}\). The bound for \(f_{1}\) in (9.19) is similar to the one for \(f_{2}\), but there is a price to pay, namely that we also need to prove (9.20). Fortunately for (9.20) we may disregard all commutator structure and discard all the para-coefficients, as they are bounded and gain an \({\mathcal {A}^{\sharp }}\) factor, so we are left with proving a bound of the form
Here for the uniform bound we simply write at fixed time
and for the \(L^{2} L^{\infty}\) bound we have
using ℬ for the square integrability in time and then applying Bernstein’s inequality in space to convert the \(L^{2}\) bound into \(L^{\infty}\).
Applying the same analysis to the other terms of \(N_{1}(u)\), we have reduced the problem to
3) We next establish the cancellation between the normal form correction and \(N_{1}'(u)\). In this step, we discuss only the low-high \(T\) paraproduct contributions, and return to the \(\Pi \) contributions in Step 4. Applying \(T_{{\hat{P}}}\) to the \(T\) term of \(v_{2}\) in (9.16), we have the contribution
a) We first observe that the cases where the derivatives \(\partial _{\beta}\) and \(\partial _{\gamma}\) are split between \(v\) and the high frequency \(u\) cancel with the first two terms of \(N_{1}'(u)\). Before doing so, the main task is to verify that the cases where \(\partial _{\beta}\) falls on the lowest frequency para-coefficient \(\partial ^{\gamma }u\) are perturbative, due to an efficient balance of derivatives, and may be absorbed into \(f_{2}\) or \(f_{1}\). To see this, we analyze separately the cases involving spatial versus time derivatives. In the case of spatial derivatives \(\partial _{\alpha }= \partial _{i}\) and \(\partial _{\beta }= \partial _{j}\), we directly estimate
In the case where \(\partial _{\beta }= \partial _{0}\), we obtain the same estimate in the same manner, except when \(\partial _{\gamma }= \partial _{0}\). In this case, we may use Lemma 5.4 to estimate the lowest frequency \(\partial _{0}^{2} u\).
It remains to consider the case \(\partial _{\alpha }= \partial _{0}\), which we place in \(\partial _{t} f_{1}\). We have
as before. For \(f_{1}\) however, we also require an estimate for the full Strichartz norm in (9.20). We separate \(\partial _{\beta}\) again into spatial and time derivatives. For the spatial case, we have by Sobolev embeddings,
for the uniform bound, as well as
for the \(L^{2} L^{\infty}\) bound.
For the case \(\partial _{\beta }= \partial _{0}\), the lowest frequency includes an instance of \(\partial _{0}^{2} u\), where we apply Lemma 5.4. This contributes a spatial component \(\hat{\partial}^{2}_{t} u\) which is estimated as before, as well as a balanced \(\Pi \) interaction, namely \(\pi _{2}(u)\). This case is estimated by
for the energy norm, respectively
for the \(L^{2} L^{\infty}\) bound.
Having dismissed the perturbative cases via this analysis, we observe an exact cancellation with the first two terms of \(N_{1}'(u)\). Collecting the remaining paraproduct terms from \(N_{1}'(u)\) and (9.24), we are left with the expression
b) Before proceeding, we further process the first term in (9.25), with the key step being an integration by parts which reveals an instance of \(T_{{\hat{P}}}v\). Reindexing, we rewrite this term as
Then applying Lemma 2.4 and the estimate (2.13) in Lemma 2.8 to commute \(T_{{\hat{g}}^{\alpha \beta}}\), similar to step 2), we replace this by
Simulating an integration by parts with respect to \(\partial _{\alpha}\), we write this as
We will carry the first of these terms forward to 3c), while the latter term is perturbative. To see this, we observe that \(\partial _{\alpha}\) may commute through \(T_{\partial ^{\gamma }u}\), similar to the analysis in 3a). Thus we arrive at the expression
We consider separately via \(f_{2}\) and \(\partial _{t} f_{1}\) the contributions corresponding to \(\partial _{\gamma }= \partial _{i}\) and \(\partial _{\gamma }= \partial _{0}\) respectively. For the bound (9.19), using Lemma 9.4 and the Strichartz exponents \((p_{1},q_{1})\) given by
we estimate, in both cases,
It remains to prove the bound (9.20), but this is again a simpler bound where we have a considerable gain. Indeed, using only \(H^{s}\) Sobolev bounds but including (9.4) and (9.7) we obtain at fixed time
which suffices for all the Strichartz bounds.
c) Returning to (9.25) and replacing the first term via the analysis in 3b), we are now left with
We observe a cancellation between the first two terms. To see this, we apply the Leibniz rule for the \(\partial _{\gamma}\) derivative on the first term. Similar to 3a), cases where the derivative falls on the lowest frequency \(\partial ^{\gamma }u\) or \({\hat{g}}^{\beta \alpha}\) are perturbative. We also have a term which cancels the second term, leaving us with
Applying also the commutator Lemma 2.4 and the bound (2.13) in Lemma 2.8 as in 2), we rewrite this as
d) We apply the Leibniz rule with respect to \(\partial _{\alpha}\). Here we observe that cases where \(\partial _{\alpha}\) falls on lower frequency instances of \(u\) or \(g\) are perturbative. Note that in contrast to the previous substeps, we no longer have the \(\partial _{\alpha}\) divergence and so we must put all terms in \(f_{2}\).
We consider for instance the term
Excluding the case of two time derivatives in \(\partial _{\alpha}\partial ^{\gamma }u\), this is easily estimated due to a favorable balance of derivatives. In the case of two time derivatives, we have \(\partial _{\alpha}\partial ^{\gamma }u \in \mathfrak{DC}\) so we can use the decomposition in Definition 5.1, say \(\partial _{\alpha}\partial ^{\gamma }u = h_{1}+h_{2}\). The first component can be thought of as a spatial derivative and is again easily estimated. It remains to consider the contribution of the second term \(h_{2} \in {\mathcal {B}}^{2} L^{\infty}\):
A similar analysis applies in the cases where \(\partial _{\alpha}\) falls on a low frequency metric coefficient \(g\).
e) We record the remaining terms after applying the Leibniz rule, and will observe instances of \(T_{{\hat{P}}}\) for which we use the equation (9.6), as well as a cancellation. We arrive at
Reindexing the second term, and applying the bound (2.13) in Lemma 2.8 in the second and the third term, the above expression may be reduced to the form
Now in the two middle terms we have a commutator structure, which can be estimated directly by Lemma 2.4. It remains to consider the first and the last term. We apply (9.10) to the first term, and estimate in a dual Strichartz norm with exponents \((p_{1},q_{1})\) as in (9.26),
For the last term, on the other hand, we use a Strichartz bound for \(v\) and match it with the bound (9.5) in Lemma 9.2,
where
Here the Strichartz exponents \(p_{3}\) and \(q_{3}\) are chosen so that the first factor on the right is controlled by \(\| v\|_{S^{\frac{1}{2}}_{AIT}}\) and \(\delta \) is arbitrarily small. On the other hand \(\delta _{0}\) is a fixed positive parameter which depends only on the distance between \(s\) and its lower bound.
4) It remains to consider the cancellation between the balanced \(\Pi \) terms in the normal form correction and in \(N_{1}'(u)\). Here the analysis is identical to the analysis for the low-high \(T\) contributions in Step 3, due to the analogous structure for the \(T\) and \(\Pi \) terms in both \(v_{2}\) and \(N_{1}'(u)\). The main care that is needed is to observe that all negative Sobolev exponent norms have been addressed in Step 3 by either using a divergence structure, or by Sobolev embeddings, which apply equally well to the balanced \(\Pi \) case. □
9.3 Reduction to the paradifferential equation
Here we first use the well-posedness result for the linear paradifferential equation in order to obtain a good bound for \(\tilde{v}\). The source terms are perturbative by (9.19) and Theorem 4.12, so the solution \(\tilde{v}\) must satisfy the bound
It remains to show that the Strichartz estimates carry over to \(v\). For this, it suffices to show that
If this is true, then combining the last two bounds with the norm equivalence (9.17) we obtain the desired bound for the linearized evolution (9.3), namely
with a universal implicit constant. This concludes the proof of Theorem 9.1 in dimension \(n \geq 3\). The case \(n=2\) is virtually identical.
It remains to prove (9.29). The energy norm for \(v_{2}\) has already been estimated in part (i) of Proposition 9.5, so it remains to consider the \(L^{2} L^{\infty}\) norm in three and higher dimensions. This is a soft bound, where we only need to use the energy bound for \(v\) on the right, rather than the full Strichartz norm, which we would also have been allowed to use. There are eight norms to estimate; most of them are similar, so we consider a representative sample, leaving the rest for the reader.
For a streamlined unbalanced bound we consider the term
where we have used Bernstein’s inequality twice and the Strichartz bound for \(u\). This pattern is followed for all unbalanced terms.
For the worst balanced case, we apply the time derivative to \(v\) in the next to last term in \(v_{2}\). Then we have to estimate
where we have used Bernstein’s inequality twice, Lemma 9.3 and the Strichartz bound for \(u\).
10 Short time Strichartz estimates
The aim of this section is to give a more detailed overview of the local well-posedness result in [38], and at the same time to provide a formulation of this result which applies in a large data setting, albeit on a short time interval. Instead of working with the equation (1.13), here it is easier to work with the problem
for a possibly vector-valued function \({\mathbf {u}}\). This is exactly the set-up of [38], and has the advantage that it is scale invariant. We recall that the scaling exponent for this problem is \(s_{c} = \frac{n}{2}\). In our problem, we will apply the results in this section to the function \({\mathbf {u}}= \partial u\).
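For the reader's convenience, the scaling count behind this exponent can be sketched as follows; the quasilinear model form in the comment is an assumption about the general shape of (10.1), not a quotation of it.

```latex
% Scaling heuristic, assuming (10.1) has the schematic quasilinear form
%   g({\mathbf u}) \, \partial^2 {\mathbf u} = q({\mathbf u}) \, (\partial {\mathbf u})^2 .
% If {\mathbf u} is a solution, so is the rescaled function
{\mathbf u}_\lambda (t,x) := {\mathbf u}(\lambda t, \lambda x), \qquad \lambda > 0 ,
% and the homogeneous Sobolev norms of the data transform as
\| {\mathbf u}_\lambda (0) \|_{\dot H^{\sigma}}
   = \lambda^{\sigma - \frac{n}{2}} \, \| {\mathbf u}(0) \|_{\dot H^{\sigma}} ,
% which is scale invariant exactly when \sigma = s_c = \frac{n}{2} .
```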
We begin with a review of the local well-posedness result in [38], but where we describe also the structure of the Strichartz estimates:
Theorem 10.1 (Smith-Tataru [38])
Consider the problem (10.1) with initial data satisfying
where
respectively
Then the solution exists on the time interval \([0,1]\), and satisfies the following Strichartz estimates
respectively
with a small \(\delta _{0} > 0\).
In addition, another conclusion of the work in [38], which is used as an intermediate step in the proof of the theorem above, is that the linearized problem around the solutions in Theorem 10.1 is well-posed in a range of Sobolev spaces, and almost lossless Strichartz estimates hold for them. Precisely, we have the following:
Theorem 10.2 ([38])
Let \({\mathbf {u}}\) be a solution for (10.1) in the time interval \([0,1]\) as in Theorem 10.1. Then the linear equation
is well-posed in \(\mathcal {H}^{r}\) in the same time interval for \(1 \leq r \leq s_{1}+1\), and the solutions satisfy the uniform and Strichartz estimates (4.33) for the same range of \(r\).
We note that in [38] it is also assumed that \(g^{00}= -1\), akin to our metric \({\tilde{g}}\); but it is clear that such an assumption is not needed in the above theorems, as one can simply divide the equation by \(-g^{00}\).
We also remark that the equation (10.7) is not the same as the linearized equation. The reason (10.7) is preferred in [38] is the extended upper bound for \(r\). It is also noted in [38] that, for a range of \(r\) with a smaller upper bound, the conclusion of the last theorem also holds for the full linearized equation; this is a straightforward perturbative argument. As for the lower bound, the Sobolev exponent \(r = 1\) suffices in dimension \(n\geq 3\) in [38], though it is also clear that this is not optimal. Indeed, in dimension \(n = 2\) the above result is extended in [38] to the range \(\frac{3}{4} \leq r \leq s_{1}+1\), and the linearized equation is shown to be well-posed in \(\mathcal {H}^{\frac{3}{4}}\); see [38, Lemma A4]. The same method also works in higher dimensions.
We also remark that if the linearized equation is in divergence form, (which can be arranged in the present paper, see (3.24)), then, by duality, (forward/backward) well-posedness in \(\mathcal {H}^{r}\) implies (backward/forward) well-posedness in \(\mathcal {H}^{1-r}\), with the center point at \(r = \frac{1}{2}\). This motivates why, in the context of the present paper, it is easiest to study the linearized equation exactly in \(\mathcal {H}^{\frac{1}{2}}\). Unfortunately our argument runs into a technical obstruction in dimension \(n=2\), which is why we make a slight adjustment there and work instead in \(\mathcal {H}^{\frac{5}{8}}\).
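The duality mechanism can be sketched as follows; the divergence form operator below is a schematic stand-in for the precise form in (3.24).

```latex
% Formal duality sketch, assuming the linearized operator has the
% divergence form  L v = \partial_\alpha \big( P^{\alpha\beta} \partial_\beta v \big).
% For a forward solution v and a backward solution w of L = 0, the pairing
E(t) = \int_{{\mathbb R}^n}
          \big( P^{0\beta} \partial_\beta v \big)\, w
        \;-\; v \, \big( P^{0\beta} \partial_\beta w \big) \; dx
% is formally conserved in time. Testing v[0] \in \mathcal H^{r} against
% all data w[T] \in \mathcal H^{1-r} and using the conservation of E,
% forward bounds in \mathcal H^{r} translate into backward bounds in
% \mathcal H^{1-r}; the self-dual point is r = \tfrac12 .
```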
To summarize, in the present paper we will not need directly the conclusion of Theorem 10.2, but rather a minor variation of it where we also consider the divergence form equation and its associated paradifferential flow, and we lower the range for \(r\) in order to include the space \(\mathcal {H}^{\frac{1}{2}}\) (\(\mathcal {H}^{\frac{5}{8}}\) in dimension two).
In the proof of the main result of this paper, we will need to use this result for solutions that are not small in \(\mathcal {H}^{s}\), so we cannot apply it directly. Instead, we will seek to rephrase it and use it in a large data setting via a scaling argument.
The difficulty we face is that rescaling leaves the homogeneous Sobolev norms unchanged, but not the inhomogeneous ones. A first step in this direction is to consider smooth solutions which may nevertheless be large at low frequency:
Theorem 10.3
Consider the problem (10.1) with initial data satisfying
Then the solution exists up to time 1, and satisfies the uniform bound
and the Sobolev bound
In addition,
whenever the right hand side is finite.
Proof
Locally, after subtracting a constant, the data is small in \(\mathcal {H}^{N}\) so the existence of regular solutions is classical. It remains to establish energy estimates in homogeneous Sobolev norms. The problem reduces to the case of the paradifferential flow, and, by conjugation with a power of \(\langle D_{x} \rangle \), to bounds in \(\mathcal {H}^{1}\) that are straightforward. □
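The conjugation step in this proof can be indicated schematically, with the perturbative terms left implicit.

```latex
% Conjugation by a power of \langle D_x \rangle, schematically: setting
w = \langle D_x \rangle^{\sigma - 1} v ,
% the paradifferential flow for v transforms as
\langle D_x \rangle^{\sigma - 1}\,
   T_{g^{\alpha\beta}} \partial_\alpha \partial_\beta \,
   \langle D_x \rangle^{1 - \sigma} w
 = T_{g^{\alpha\beta}} \partial_\alpha \partial_\beta w
   + \text{(perturbative commutators)} ,
% so homogeneous Sobolev bounds of order \sigma for v reduce to the
% straightforward \mathcal H^{1} energy bounds for w.
```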
A second step is the following variation of Theorem 10.1, where we consider a small \(\mathcal {H}^{s_{1}}\) perturbation of a small and smooth data:
Theorem 10.4
Consider the problem (10.1) with initial data \({\mathbf {u}}[0]\) of the form
where the two components satisfy
Then the solution \({\mathbf {u}}\) exists on the time interval \([0,1]\), and satisfies the following Strichartz estimates
respectively
with a small \(\delta _{0} > 0\).
We remark that the solutions in this second theorem are still covered by Theorem 10.1. The only difference is that the constant in the Strichartz bound depends only on the \({\mathbf {u}}^{hi}[0]\) bound.
Proof
This follows by a direct application of the results in Theorem 10.1 and Theorem 10.2. We write an equation for \({\mathbf {u}}^{hi}={\mathbf {u}}-{\mathbf {u}}^{lo}\),
where the source term \(\mathbf{f}^{hi}\) can be estimated at fixed time by
and thus it is perturbative. Then we apply the Strichartz estimates in Theorem 10.2 to \({\mathbf {u}}^{hi}\), and the desired conclusion follows. □
Now we consider the large data problem, where we show local well-posedness by a scaling argument. The price to pay is that the time interval on which the solutions exist will be shorter. Precisely, we will show the following:
Theorem 10.5
For any \(s_{1}\) as in (10.3), (10.4) there exists \(\delta _{0} > 0\) so that the following holds: For any \(M > 0\) and any solution \({\mathbf {u}}\) to the problem (10.1) with initial data satisfying
We have:
a) The solution exists up to time \(T_{M}\) given by
with uniform bounds
as well as
whenever the right hand side is finite.
b) The solution \({\mathbf {u}}\) satisfies the following Strichartz estimates in \([0,T_{M}]\):
respectively
c) Furthermore, the homogeneous Strichartz estimates (4.33) also hold in \(\mathcal {H}^{r}\) for the associated linear equations (10.7), on the same time intervals for \(r \in [1,s_{1}]\).
Proof
As stated, the result is invariant with respect to scaling. Precisely \(M\) plays the role of a scaling parameter, and by scaling we can set it to 1.
It remains to prove the result for \(M=1\), in which case \(T_{M}=1\). In a nutshell, the idea of the proof is to use the finite speed of propagation to localize the problem and, by scaling, to reduce it to the case when Theorems 10.3, 10.4 can be applied. To fix the notations, we will consider the case \(n \geq 3\) in what follows; the two dimensional case is identical after obvious changes in notations.
On the Fourier side we split the initial data into two components,
and we denote by \({\mathbf {u}}\) and \({\mathbf {u}}^{lo}\) the corresponding solutions.
On the other hand on the physical side we partition the initial time slice \(t=0\) into cubes \(Q\) of size 1, and consider a partition of unity associated to the covering by \(8Q\),
and define the localized initial data
which agrees with \({\mathbf {u}}[0]\) in \(6Q\) up to a constant. The speed of propagation for solutions \({\mathbf {u}}\) with \(|{\mathbf {u}}| \ll 1\) is close to 1, therefore the corresponding solutions \({\mathbf {u}}_{Q}\) agree with \({\mathbf {u}}\) in \(4Q\) (again, up to a constant) in \([0,1]\), assuming both exist up to this time.
Next we consider the existence and properties of the solutions \({\mathbf {u}}_{Q}\) in the time interval \([0,1]\). For \({\mathbf {u}}_{Q}[0]\) we have a low-high decomposition,
Now we consider energy bounds for the initial data. For \({\mathbf {u}}^{lo}\) we have
Since \(s_{1}-1 < n/2\), after localization this also implies that the low frequency components satisfy
which is exactly as in Theorem 10.3, respectively Theorem 10.4.
On the other hand, for the high frequency bounds we have the almost orthogonality relation
By Theorem 10.1, it follows that the solutions \({\mathbf {u}}_{Q}\) exist up to time 1, and satisfy the Strichartz bounds
Theorem 10.4 allows us to improve this to
The solutions \({\mathbf {u}}_{Q}\), respectively \({\mathbf {u}}^{lo}_{Q}\), agree with \({\mathbf {u}}\), \({\mathbf {u}}^{lo}\) in \([0,1]\times 4Q\). Then we can recombine the \({\mathbf {u}}_{Q}\) bounds using a partition of unity on the unit spatial scale. We obtain a \({\mathbf {u}}\) bound, namely
On the other hand for \({\mathbf {u}}^{lo}\) we have the bounds given by Theorem 10.3.
The energy bounds for \({\mathbf {u}}-{\mathbf {u}}^{lo}\) and \({\mathbf {u}}^{lo}\) combined yield the desired energy bound (10.18) in the theorem. As for the Strichartz bounds (10.20), we already have them for \({\mathbf {u}}- {\mathbf {u}}^{lo}\), so it remains to prove them for \({\mathbf {u}}^{lo}\). But there they follow trivially from Sobolev embeddings and Hölder's inequality in time.
It remains to consider the Strichartz estimates for \(\mathcal {H}^{1}\) solutions to the linearized equation. By the same finite speed of propagation argument as above, it suffices to prove them for the linearization around the localized solutions \({\mathbf {u}}_{Q}\). But this follows by Theorem 10.2. □
To conclude this section we reinterpret the above result in the context of the minimal surface equation, exactly in the form in which it will be used in the last section. We keep the same notations, with the only change that now \(s_{c} = \frac{n}{2}+1\):
Theorem 10.6
For any \(s_{1}\) as in (10.3), (10.4) there exists \(\delta _{0} > 0\) so that the following holds: For any \(M > 0\) and any solution \(u\) to the problem (1.7) with initial data satisfying
We have:
a) The solution exists up to time \(T_{M}\) given by
with uniform bounds
as well as
whenever the right hand side is finite.
b) The solution \(u\) satisfies the following Strichartz estimates in \([0,T_{M}]\):
respectively
c) Furthermore, the homogeneous Strichartz estimates (4.33) also hold in \(\mathcal {H}^{r}\) for the associated linear equations (10.7), on the same time intervals for \(r \in [1,s_{1}]\). Also the full Strichartz estimates (4.42) with \(S= S_{ST}\) for the linear paradifferential equation hold in \(\mathcal {H}^{r}\) on the same time intervals for all real \(r\).
The theorem is obtained by applying the previous theorem to \({\mathbf {u}}= \partial u\). For the Strichartz estimates for the linear paradifferential equation we observe in addition that we have the bound
Then the \(r=1\) case of the Strichartz estimates for the linear equations (10.7) together with Proposition 4.8 imply the desired conclusion.
11 Conclusion: proof of the main result
After using the finite speed of propagation to reduce to the small data problem, here we combine our balanced energy estimates with the short time Strichartz bounds in order to complete the proof of our main result in Theorem 1.3. Our rough solutions are constructed as limits of smooth solutions obtained by regularizing the initial data, so the emphasis is on obtaining favourable estimates for these smooth solutions.
11.1 Reduction to small data
By Sobolev embeddings, the initial data satisfies
Then given \(x_{0} \in {\mathbb{R}}^{n}\), within a small ball \(B(x_{0},4r)\) we have
This allows us to truncate the above differences near \(x_{0}\) to obtain the localized data
where \(\chi \in \mathcal {D}({\mathbb{R}}^{n})\) is equal to 1 in \(B(0,2)\) and 0 outside \(B(0,4)\).
Let \(\epsilon > 0\). Then for small enough \(r\), depending on \(\epsilon \), these initial data are close to the initial data for the linear solution to the minimal surface equation given by
in the sense that
This will be our smallness condition for the initial data, with \(\partial {\mathbf {u}}^{x_{0},r}\) in a compact subset of the set described in (1.12).
To reduce the problem to the case when the initial data satisfies instead the simpler smallness condition
it suffices to apply a linear transformation in the Minkowski space \({\mathbb{R}}^{n+2}\) that preserves the time slices but maps our linear solution \(\tilde{u}^{x_{0},r}\) to the zero solution. The price we pay for this is that the background Minkowski metric is then changed to another Lorentzian metric. But the new metric belongs to a compact set in the space of flat Lorentzian metrics for which the time slices are uniformly space-like and the graph of the zero function is uniformly time-like. Hence our small data result applies uniformly to these localized solutions, see Remark 3.1. Then, due to the finite speed of propagation, we also obtain solutions up to time \(O(r)\) for the original problem.
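A hedged sketch of this transformation, under the assumption that the reference linear solution is affine (the model case suggested by the construction):

```latex
% Schematic normalization: if \tilde u^{x_0,r}(t,x) = a + b \cdot x + c\, t
% is affine, then the linear map of {\mathbb R}^{n+2} = \{(t,x,y)\},
(t, x, y) \longmapsto \big( t,\, x,\, y - a - b \cdot x - c\, t \big) ,
% preserves every time slice \{ t = \mathrm{const} \} and sends the graph
% y = \tilde u^{x_0,r}(t,x) to the graph of the zero function, while the
% Minkowski metric pulls back to a constant coefficient Lorentzian metric
% determined by (b,c); smallness of (b,c) keeps the time slices uniformly
% space-like and the zero graph uniformly time-like.
```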
11.2 Uniform bounds for regularized solutions
Let \(s\) be as in Theorem 1.3. Given an initial data \(u[0] \in \mathcal {H}^{s}\) that is small,
we consider a continuous family of frequency localizations
to frequencies \(\leq 2^{h}\). For fixed \(h\) and a short time which may depend on \(h\), these solutions exist by Theorem 10.5. Further, they are smooth and also depend smoothly on \(h\). Finally, we consider the functions
These functions solve the linearized equation around \(u^{h}\), with initial data
which is localized at frequency \(2^{h}\). The functions \(v^{h}\) will be measured in \(\mathcal {H}^{\frac{1}{2}}\) in dimension \(n\geq 3\) and in \(\mathcal {H}^{\frac{5}{8}}\) in dimension \(n=2\). Thus the initial data for \(v^{h}\) satisfies the bound
Our first objective will be to show that these solutions exist on a time interval that does not depend on \(h\), and satisfy uniform bounds:
Theorem 11.1
The above solutions \(u^{h}\) have the following properties:
a) Uniform lifespan and uniform bounds
The solutions \(u^{h}\) exist up to time 1, with uniform bounds
$$ \| u^{h}[\cdot ] \|_{C([0,1];\mathcal {H}^{s})} \lesssim \epsilon , $$(11.6)
and higher regularity bounds
$$ \| u^{h}[\cdot ] \|_{C([0,1];\mathcal {H}^{s+1})} \lesssim 2^{h} \epsilon . $$(11.7)
b) Bounds for the linearized flow
The linearized equation around \(u^{h}\) is well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (\(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)), with uniform estimates in \([0,1]\), uniformly in \(h\),
$$ \begin{aligned} \| v\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \lesssim \|v[0]\|_{ \mathcal {H}^{\frac{1}{2}}}, \qquad n \geq 3, \\ \| v\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{5}{8}})} \lesssim \|v[0]\|_{ \mathcal {H}^{\frac{5}{8}}}, \qquad n =2, \end{aligned} $$(11.8)
and uniform Strichartz estimates with loss of derivatives,
$$ \begin{aligned} \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{4}-\delta} \partial v \|_{L^{2} L^{\infty}} \lesssim \|v[0]\|_{\mathcal {H}^{\frac{1}{2}}}, \qquad n \geq 3, \\ \| \langle D_{x} \rangle ^{-\frac{n}{2}-\frac{1}{4}-\delta} \partial v \|_{L^{4} L^{\infty}} \lesssim \|v[0]\|_{\mathcal {H}^{\frac{5}{8}}}, \qquad n = 2, \end{aligned} $$(11.9)
for any \(\delta > 0\).
The exponent \(s+1\) in (11.7) is chosen so that it falls into the range of existing theory, where we already have well-posedness and continuous dependence. We remark that, as a corollary of part (b), we also obtain uniform bounds for the functions \(v_{h}\), namely
11.3 The bootstrap assumptions
Our proof of the main result in Theorem 11.1 will be formulated as a bootstrap argument. The question is then what constitutes a good bootstrap assumption. Having the bounds for the linearized equation as part of the bootstrap assumption would be technically complicated. On the other hand, having no assumptions at all related to the linearized equation would make it too difficult to get the argument started. As it turns out, there is a good middle ground, which is to include the uniform energy bounds on both \(u^{h}\) and \(v^{h}\) among the bootstrap assumptions, which are then set as follows:
i) Uniform \(\mathcal {H}^{s}\) bounds:
$$ \| u^{h}[\cdot ] \|_{C([0,1];\mathcal {H}^{s})} \leq 1, $$(11.11)
ii) Higher regularity bounds:
$$ \| u^{h}[\cdot ] \|_{C([0,1];\mathcal {H}^{s+1})} \leq 2^{h}, $$(11.12)
iii) Difference bounds:
$$ \begin{aligned} \| v_{h}\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{1}{2}})} \leq 2^{-(s- \frac{1}{2})h}, \qquad n\geq 3, \\ \| v_{h}\|_{L^{\infty}([0,1];\mathcal {H}^{\frac{5}{8}})} \leq 2^{-(s- \frac{5}{8})h}, \qquad n=2. \end{aligned} $$(11.13)
The \(v_{h}\) bootstrap bound will be useful in particular in order to obtain good low frequency bounds for differences of the \(u^{h}\) functions,
with the obvious change in two dimensions.
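The way the \(v_{h}\) bound produces such difference bounds can be indicated heuristically; identifying \(v^{h}\) with the \(h\)-derivative of the family \(u^{h}\) is the natural reading of the construction in Section 11.2.

```latex
% Heuristic telescoping in h: with v^{h} = \frac{d}{dh} u^{h}, the
% fundamental theorem of calculus gives
u^{h_2} - u^{h_1} = \int_{h_1}^{h_2} v^{h} \, dh ,
% so the bootstrap bound (11.13) integrates (for n \ge 3) to
\| u^{h_2} - u^{h_1} \|_{L^{\infty}([0,1];\, \mathcal H^{\frac12})}
   \lesssim 2^{-(s - \frac12) h_1}, \qquad h_1 \le h_2 .
```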
To avoid having a bootstrap assumption on a noncompact set of functions, we may freely restrict the range of \(h\). Precisely, given an arbitrary threshold \(h_{0}\), we assume the bootstrap assumption to hold for all \(h \leq h_{0}\) and show that the desired bounds hold in the same range. Since \(h_{0}\) plays no role in the analysis, we will simply drop it in the proofs.
11.4 Short time Strichartz estimates for \(u^{h}\) and \(v^{h}\)
Our goal here is to use the results in Theorem 10.6 together with our bootstrap assumption in order to obtain short time Strichartz estimates for both \(u^{h}\) and \(v^{h}\).
By the bootstrap assumptions (11.11) and (11.12), we may bound the local well-posedness norm \(\mathcal {H}^{s_{1}}\) of the solution \(u^{h}\) by
Then the result of Theorem 10.5 is valid on time intervals \(I_{h}\) of length
In practice, \(s_{1}\) will be chosen as close as possible to the threshold in (10.3), (10.4). This will ensure that in all dimensions we have
In particular, by Theorem 10.5 it follows that the solution \(u^{h}\) satisfies full Strichartz estimates on such intervals,
respectively
Also the linearized problem and the linear paradifferential flow will be well-posed in \(\mathcal {H}^{\frac{1}{2}}\) and will satisfy Strichartz estimates on similar time intervals,
respectively
where the \(L^{\infty}\) norm on the right may be replaced by the same \(\mathcal {H}^{\frac{1}{2}}\) norm evaluated at some fixed time within \(I_{h}\). The last set of bounds may be in particular applied to \(v^{h}\), which, in view of our bootstrap assumption, yields
respectively
11.5 Long time Strichartz estimates for \(u^{h}\) and \(v^{h}\)
Our objective now is to obtain long time Strichartz bounds by simply adding up the short time bounds. Some care is needed when using (11.21) and (11.20) because, as \(h\) increases, we gain on one hand in the bound on the right, but we lose in the size of the interval \(I_{h}\). However, the gain outweighs the loss, so integrating in \(h\) we arrive at the difference bound
respectively
Now we are able to obtain Strichartz bounds for \(u^{h}\) on the full time interval \([0,1]\), simply by adding the short time bounds. Precisely, we claim that for some small universal \(\delta _{1} > 0\) we have
respectively
To see this, we distinguish cases depending on the relative size of \(k\) and \(h\). We fix the dimension to \(n \geq 3\) for clarity.
a) If \(k \geq h\), then we simply apply (11.16) or (11.17), taking the loss from the number of intervals. For instance in three and higher dimensions we get for \(\delta _{1} \leq \delta _{0}\)
for a favourable choice of \(s_{1}\); for instance \(s_{1} = s+\frac{1}{4}\) suffices, as then \(\frac{s_{1}-s}{s_{1}-s_{c}} < \frac{1}{2}\). The two dimensional argument is similar.
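The underlying interval bookkeeping can be spelled out; here the formula for \(T_{h}\) is a hedged reconstruction, chosen to be consistent with (11.14) and with the exponent \(\frac{s_{1}-s}{s_{1}-s_{c}}\) appearing above.

```latex
% Interval bookkeeping, assuming T_h \sim 2^{-\frac{(s_1 - s) h}{s_1 - s_c}}:
% the unit time interval splits into \sim T_h^{-1} intervals I_h, and if
% each interval carries the same short time bound C, then
\| F \|_{L^2([0,1];\, L^{\infty})}^2
  = \sum_{I_h \subset [0,1]} \| F \|_{L^2(I_h;\, L^{\infty})}^2
  \lesssim T_h^{-1}\, C^2 ,
% an overall loss of T_h^{-\frac12} \sim 2^{\frac{(s_1 - s) h}{2 (s_1 - s_c)}}.
% The condition \frac{s_1 - s}{s_1 - s_c} < \frac12 (e.g. s_1 = s + \frac14)
% keeps this loss summable against the gains in (11.16), (11.17).
```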
b) If \(k< h\) instead, then we first write
Here the first term was already estimated before, while for the second we use (11.22) or (11.23), where the loss from the interval size is only in terms of \(k\) and not \(h\). In dimension three and higher this yields
again for a good choice of \(s_{1}\) (same as above) and a small enough \(\delta \).
In particular, the estimates (11.24), respectively (11.25) allow us to estimate our control parameter ℬ as follows:
respectively
This in turn allows us to use Theorem 8.1 to control the energy growth for the full equation, and in particular to prove the bounds (11.6) and (11.7), thus closing part of the bootstrap loop, namely the bounds (11.11) and (11.12).
11.6 Strichartz estimates for the paradifferential flow
Our objective here is to establish Strichartz estimates with loss of derivatives for the linear paradifferential flow around \(u^{h}\). Thus, we consider an \(\mathcal {H}^{\frac{1}{2}}\) solution \(v\) for the paradifferential flow around \(u^{h}\), and we seek to estimate its dyadic pieces in the Strichartz norm, with frequency losses:
Proposition 11.2
Under the bootstrap assumptions (11.11), (11.12) and (11.13), \(\mathcal {H}^{r}\) solutions \(v\) for the linear paradifferential equation
satisfy the Strichartz estimates (4.43) with \(S = S_{AIT}\) for all \(r \in {\mathbb{R}}\).
Compared with the full Strichartz bounds, here we have a loss of \(1/4\) derivative in dimension 3 and higher, respectively \(1/8\) derivative in dimension 2.
Proof
Our starting point is Theorem 4.12, which allows us to reduce the problem to proving the homogeneous Strichartz estimates (4.33) for the corresponding homogeneous equation, again for all real \(r\). To prove the proposition in this case, we have two tools at our disposal:
(i) The energy estimates of Theorem 7.1. In view of the bounds (11.27) and (11.26), these give uniform \(\mathcal {H}^{r}\) bounds for \(v\),
$$ \| v[\cdot ] \|_{L^{\infty }\mathcal {H}^{r}} \lesssim \|v[0]\|_{ \mathcal {H}^{r}}. $$
(ii) The short time Strichartz estimates (4.33) with \(S=S_{ST}\) on the \(T_{h}\) time scale, provided by Theorem 10.6. Adding these with respect to the time intervals, we arrive at
$$ \| |D|^{-\frac{n}{2}-\delta} \partial v \|_{L^{2}([0,1]; L^{\infty})} \lesssim T_{h}^{-\frac{1}{2}} \| v[0]\|_{\mathcal {H}^{\frac{1}{2}}}, \qquad n \geq 3, $$(11.29)
respectively
$$ \| |D|^{-\frac{9}{8}-\delta} \partial v \|_{L^{4}([0,1]; L^{\infty})} \lesssim T_{h}^{-\frac{1}{4}} \| v[0]\|_{\mathcal {H}^{\frac{5}{8}}}, \qquad n =2. $$(11.30)
Now we want to use these tools in order to prove the long term bounds (4.33) with \(S=S_{AIT}\) on the unit time scale. Given the expression for \(T_{h}\), our first observation is that the estimates (11.29), respectively (11.30), suffice for our bounds at frequencies \(2^{k}\) with \(k \geq h\), but not below that.
Thus, consider a lower frequency \(k < h\), and seek to estimate \(P_{k} v\). At this frequency, we have the correct estimate for the solution \(\tilde{v}\) to the linear paradifferential equation around \(u^{k}\). It remains to compare \(v\) and \(\tilde{v}\). For this we use the \(T_{\tilde{P}(u^{k})}\) flow, and we think of \(P_{k} v\) as an approximate solution for this flow,
We can bound the source terms as follows, fixing the dimension to \(n \geq 3\):
To conclude it suffices to estimate
The first bound follows from our earlier Strichartz estimates for \(u^{k}\), see (11.17), (11.16). For the second bound, we expand, and it then suffices to have
with a positive constant \(c\) in order to allow for integration in \(j\). We expand paradifferentially, depending on the frequencies of the two factors above. It suffices to consider the following two cases:
a) \(v^{j}\) has frequency below \(2^{k}\). Then we use the Strichartz bounds for \(v^{j}\) over intervals \(I_{j}\), and then sum over such intervals. For instance in dimension \(n \geq 3\) we get
Here the coefficient of \(j-k\) is negative by a large margin, while the coefficient of \(k\) in the middle factor is also negative since \(s_{1} - s_{c} > \frac{1}{2}\) and \(\delta \) is arbitrarily small. Hence we obtain a bound as desired in (11.33).
b) The balanced case, where both frequencies have size \(2^{l}\) with \(l \geq k\). This is easier, as we have a better energy bound for the first factor. Hence in this case it is more efficient to estimate the output by applying Bernstein’s inequality first,
which is better than in case (a). □
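For reference, the form of Bernstein's inequality invoked at several points in this argument is the standard one:

```latex
% Bernstein's inequality: for F localized at spatial frequency \sim 2^{l},
\| F \|_{L^{\infty}_x} \lesssim 2^{\frac{n l}{2}} \| F \|_{L^{2}_x} ,
% and more generally, for 1 \le p \le q \le \infty,
\| F \|_{L^{q}_x} \lesssim 2^{n l \left( \frac{1}{p} - \frac{1}{q} \right)} \| F \|_{L^{p}_x} .
```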
11.7 Strichartz estimates for the linearized flow
Our aim here is to show that we have \(\mathcal {H}^{\frac{1}{2}}\) well-posedness and Strichartz estimates with loss of derivatives for the linearized flow around \(u^{h}\) in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)):
Proposition 11.3
Under the bootstrap assumptions (11.11), (11.12) and (11.13), the linearized equation around \(u^{h}\) is well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)), and its solutions satisfy the full Strichartz estimates (4.43) with \(S = S_{AIT}\).
Here we use the analysis in Section 9. Precisely Theorem 9.1 there shows that the above proposition follows directly from the similar result in Proposition 11.2 for the linear paradifferential equation.
11.8 Closing the bootstrap argument
Combining the Strichartz estimates for the linear paradifferential equation in Proposition 11.2 with the result of Theorem 9.1, it follows that the linearized flow around \(u^{h}\) is well-posed in \(\mathcal {H}^{\frac{1}{2}}\) (respectively \(\mathcal {H}^{\frac{5}{8}}\) if \(n=2\)), with the same Strichartz estimates as in Proposition 11.2, which is exactly part (b) of Theorem 11.1. As a consequence, the initial data bound (11.5) for \(v\) implies the uniform bound (11.10), which in turn closes the bootstrap assumption (11.13).
11.9 The well-posedness result
To obtain a complete well-posedness argument, we follow the outline in [21], and measure the size of the functions \(u^{h}\) and \(v^{h}\) in terms of frequency envelopes. Precisely, we consider a normalized frequency envelope \(\epsilon c_{h}\) for \(u[0]\) in \(\mathcal {H}^{s}\). Then for the localized initial data we have the bounds
On the other hand, fixing the dimension to \(n \geq 3\), we will measure \(v^{h}\) in \(\mathcal {H}^{\frac{1}{2}}\), where for the initial data we have
Then by Theorem 11.1, we obtain corresponding uniform bounds for the solutions on the time interval \([0,1]\),
Similarly, the linearized increments \(v_{h}\) satisfy the uniform bounds
Integrating the last bound with respect to \(h\), we obtain the difference bounds
This implies that the limit
exists in \(C([0,1];\mathcal {H}^{\frac{1}{2}})\). In view of (11.37), the limit \(u\) will also satisfy
We can also prove that the previous convergence holds in this stronger topology. To see this, we consider unit increments in \(h\), and compare \(u^{h}\) with \(u^{h+1}\), using (11.38) on one hand, and (11.40) on the other hand. This yields
respectively
These two bounds balance exactly at frequency \(2^{h}\), and measure the \(\mathcal {H}^{s}\) norm with decay away from frequency \(2^{h}\). Hence the differences are almost orthogonal in \(\mathcal {H}^{s}\), and, summing them up, we obtain
This implies uniform convergence in \(\mathcal {H}^{s}\). Thus our solution \(u\) is uniquely identified as the strong \(\mathcal {H}^{s}\) uniform limit of \(u^{h}\).
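The almost orthogonal summation can be sketched as follows; the decay exponent \(\delta\) below is a stand-in for the rate obtained by interpolating the two bounds above.

```latex
% Schematic almost orthogonal summation: writing d_h := u^{h+1} - u^{h},
% the two bounds above interpolate to frequency localized decay
\| P_k \, d_h \|_{L^\infty([0,1];\, \mathcal H^{s})}
   \lesssim \epsilon \, c_h \, 2^{-\delta |k - h|}
% for some \delta > 0. Then Cauchy-Schwarz in h gives
\Big\| \sum_{h \ge h_0} d_h \Big\|_{\mathcal H^{s}}^2
  \lesssim \sum_{k} \Big( \sum_{h \ge h_0} \epsilon \, c_h \, 2^{-\delta |k-h|} \Big)^2
  \lesssim \epsilon^2 \sum_{h \ge h_0} c_h^2 ,
% which yields the uniform \mathcal H^{s} convergence of the family u^{h}.
```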
The continuous dependence and the weak-Lipschitz dependence follow exactly as in [21].
Notes
Of course with an implicit constant that may depend on \(\delta \).
See Section 2 for our Besov norm notations.
Note that these two reductions are interchangeable.
Such a representation might not be unique in general, though later in the paper we often identify specific choices for the paracoefficients.
Here the \(\tilde{P}_{B} w\) term may be interpreted as arising from a lower order correction to our multiplier \(\tilde{\mathfrak {M}}_{s}\).
A slight expansion of the argument shows that it is in effect invertible.
References
Ai, A., Ifrim, M., Tataru, D.: Two dimensional gravity waves at low regularity I: Energy estimates (2019)
Alazard, T., Delort, J.-M.: Global solutions and asymptotic behavior for two dimensional gravity water waves. Ann. Sci. Éc. Norm. Supér. (4) 48(5), 1149–1238 (2015)
Alazard, T., Delort, J.-M.: Sobolev estimates for two dimensional gravity water waves. Astérisque (374), viii+241 (2015)
Bahouri, H., Chemin, J.-Y.: Équations d’ondes quasilinéaires et effet dispersif. Int. Math. Res. Not. 21, 1141–1178 (1999)
Bahouri, H., Chemin, J.-Y.: Équations d’ondes quasilinéaires et estimations de Strichartz. Am. J. Math. 121(6), 1337–1377 (1999)
Bony, J.-M.: Calcul symbolique et propagation des singularités pour les équations aux dérivées partielles non linéaires. Ann. Sci. Éc. Norm. Supér. (4) 14(2), 209–246 (1981)
Brendle, S.: Hypersurfaces in Minkowski space with vanishing mean curvature. Commun. Pure Appl. Math. 55(10), 1249–1279 (2002)
Christ, M., Kiselev, A.: Maximal functions associated to filtrations. J. Funct. Anal. 179(2), 409–425 (2001)
Donninger, R., Krieger, J., Szeftel, J., Wong, W.: Codimension one stability of the catenoid under the vanishing mean curvature flow in Minkowski space. Duke Math. J. 165(4), 723–791 (2016)
Ettinger, B.: Well-posedness of the three-form field equation and the minimal surface equation in Minkowski space. Thesis (Ph.D.), University of California, Berkeley. ProQuest LLC, Ann Arbor, MI (2013)
Ginibre, J., Velo, G.: Generalized Strichartz inequalities for the wave equation. J. Funct. Anal. 133(1), 50–68 (1995)
Gubinelli, M., Imkeller, P., Perkowski, N.: Paracontrolled distributions and singular PDEs. Forum Math. Pi 3(e6), 75 (2015)
Gubinelli, M., Koch, H., Oh, T.: Paracontrolled approach to the three-dimensional stochastic nonlinear wave equation with quadratic nonlinearity. arXiv e-prints (2018). arXiv:1811.07808
Hoppe, J.: Some classical solutions of relativistic membrane equations in 4-space-time dimensions. Phys. Lett. B 329(1), 10–14 (1994)
Hörmander, L.: Pseudo-differential operators of type \(1,1\). Commun. Partial Differ. Equ. 13(9), 1085–1111 (1988)
Hörmander, L.: Lectures on Nonlinear Hyperbolic Differential Equations. Mathématiques & Applications (Berlin) [Mathematics & Applications], vol. 26. Springer, Berlin (1997)
Hughes, T.J.R., Kato, T., Marsden, J.E.: Well-posed quasi-linear second-order hyperbolic systems with applications to nonlinear elastodynamics and general relativity. Arch. Ration. Mech. Anal. 63(3), 273–294 (1977)
Hunter, J.K., Ifrim, M., Tataru, D., Wong, T.K.: Long time solutions for a Burgers-Hilbert equation via a modified energy method. Proc. Am. Math. Soc. 143(8), 3407–3412 (2015)
Hunter, J.K., Ifrim, M., Tataru, D.: Two dimensional water waves in holomorphic coordinates. Commun. Math. Phys. 346(2), 483–552 (2016)
Ifrim, M., Tataru, D.: Local well-posedness for quasilinear problems: a primer. Bull. Am. Math. Soc. (N.S.) 60(2), 167–194 (2023)
Jerrard, R.: Defects in semilinear wave equations and timelike minimal surfaces in Minkowski space. Anal. PDE 4(2), 285–340 (2011)
Kapitanskiĭ, L.V.: Estimates for norms in Besov and Lizorkin-Triebel spaces for solutions of second-order linear hyperbolic equations. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 171 (Kraev. Zadachi Mat. Fiz. i Smezh. Voprosy Teor. Funktsiĭ 20), 106–162, 185–186 (1989)
Keel, M., Tao, T.: Endpoint Strichartz estimates. Am. J. Math. 120(5), 955–980 (1998)
Klainerman, S.: The null condition and global existence to nonlinear wave equations. In: Nonlinear Systems of Partial Differential Equations in Applied Mathematics, Part 1 (Santa Fe, N.M., 1984). Lectures in Appl. Math., vol. 23, pp. 293–326. Am. Math. Soc., Providence (1986)
Klainerman, S., Rodnianski, I.: Improved local well-posedness for quasilinear wave equations in dimension three. Duke Math. J. 117(1), 1–124 (2003)
Klainerman, S., Rodnianski, I., Szeftel, J.: The bounded \(L^{2}\) curvature conjecture. Invent. Math. 202(1), 91–216 (2015)
Koch, H., Tataru, D.: Dispersive estimates for principally normal pseudodifferential operators. Commun. Pure Appl. Math. 58(2), 217–284 (2005)
Krieger, J., Lindblad, H.: On stability of the catenoid under vanishing mean curvature flow on Minkowski space. Dyn. Partial Differ. Equ. 9(2), 89–119 (2012)
Lindblad, H.: Counterexamples to local existence for quasilinear wave equations. Math. Res. Lett. 5(5), 605–622 (1998)
Lindblad, H.: A remark on global existence for small initial data of the minimal surface equation in Minkowskian space time. Proc. Am. Math. Soc. 132(4), 1095–1102 (2004)
Métivier, G.: Para-Differential Calculus and Applications to the Cauchy Problem for Nonlinear Systems. Centro di Ricerca Matematica Ennio De Giorgi (CRM) Series, vol. 5. Edizioni Della Normale, Pisa (2008)
Mockenhaupt, G., Seeger, A., Sogge, C.D.: Local smoothing of Fourier integral operators and Carleson-Sjölin estimates. J. Am. Math. Soc. 6(1), 65–130 (1993)
Ohlmann, G.: Ill-posedness of a quasilinear wave equation in two dimensions for data in \(H^{7/4}\). arXiv e-prints (2021). arXiv:2107.03732
Smith, H.F.: A parametrix construction for wave equations with \(C^{1,1}\) coefficients. Ann. Inst. Fourier (Grenoble) 48(3), 797–835 (1998)
Smith, H.F., Sogge, C.D.: On Strichartz and eigenfunction estimates for low regularity metrics. Math. Res. Lett. 1(6), 729–737 (1994)
Smith, H.F., Tataru, D.: Sharp counterexamples for Strichartz estimates for low regularity metrics. Math. Res. Lett. 9(2–3), 199–204 (2002)
Smith, H.F., Tataru, D.: Sharp local well-posedness results for the nonlinear wave equation. Ann. Math. (2) 162(1), 291–366 (2005)
Sogge, C.D.: Lectures on Non-linear Wave Equations, 2nd edn. International Press, Boston (2008)
Stefanov, A.: Global regularity for the minimal surface equation in Minkowskian geometry. Forum Math. 23(4), 757–789 (2011)
Sterbenz, J., Tataru, D.: Energy dispersed large data wave maps in \(2+1\) dimensions. Commun. Math. Phys. 298(1), 139–230 (2010)
Tao, T.: Global regularity of wave maps. II. Small energy in two dimensions. Commun. Math. Phys. 224(2), 443–544 (2001)
Tataru, D.: Strichartz estimates for operators with nonsmooth coefficients and the nonlinear wave equation. Am. J. Math. 122(2), 349–376 (2000)
Tataru, D.: Strichartz estimates for second order hyperbolic operators with nonsmooth coefficients. II. Am. J. Math. 123(3), 385–423 (2001)
Tataru, D.: Nonlinear wave equations. In: Proceedings of the International Congress of Mathematicians, Vol. III, Beijing, 2002, pp. 209–220. Higher Ed. Press, Beijing (2002)
Tataru, D.: Strichartz estimates for second order hyperbolic operators with nonsmooth coefficients. III. J. Am. Math. Soc. 15(2), 419–442 (2002)
Tataru, D.: Rough solutions for the wave maps equation. Am. J. Math. 127(2), 293–377 (2005)
Wong, W.W.Y.: Global existence for the minimal surface equation on \(\mathbb{R}^{1,1}\). Proc. Am. Math. Soc. Ser. B 4, 47–52 (2017)
Wong, W.W.Y.: Singularities of axially symmetric time-like minimal submanifolds in Minkowski space. J. Hyperbolic Differ. Equ. 15(1), 1–13 (2018)
Zwiebach, B.: A First Course in String Theory, 2nd edn. Cambridge University Press, Cambridge (2009). With a foreword by David Gross
Acknowledgements
The first author was supported by the Henry Luce Foundation. The second author was supported by a Luce Associate Professorship, by the Sloan Foundation, and by an NSF CAREER grant DMS-1845037. The third author was supported by the NSF grant DMS-2054975 as well as by a Simons Investigator grant from the Simons Foundation.
This material is also based upon work supported by the National Science Foundation under Grant No. DMS-1928930 while all three authors participated in the program Mathematical problems in fluid dynamics hosted by the Mathematical Sciences Research Institute in Berkeley, California, during the Spring 2021 semester.
In addition, the authors would like to express their thanks and gratitude to the anonymous referee, whose very thorough report led to many improvements and clarifications.
Ai, A., Ifrim, M. & Tataru, D. The time-like minimal surface equation in Minkowski space: low regularity solutions. Invent. math. 235, 745–891 (2024). https://doi.org/10.1007/s00222-023-01231-3