Introduction

Landau damping is a collisionless relaxation mechanism discovered by Landau [46] (after preliminary works of Vlasov [86]) predicting the decay of spatial oscillations in a plasma when perturbed around certain stable spatially homogeneous distributions. This effect is now considered fundamental to modern plasma physics (see e.g. [17, 67, 77, 80]). It was later “exported” to galactic dynamics by Lynden-Bell [55, 56] where it is thought to play a key role in the stability of galaxies. Moreover, Landau damping is one example of a more general effect usually referred to as “phase mixing”, which arises in many physical settings; see [9, 10, 67] and the references therein. See also [10, 20, 23, 67] for a discussion about the differences and similarities with dispersive phenomena.

The physical model we will be focusing on is the nonlinear Vlasov equations, which are posed in the periodic box \(x \in {\mathbb {T}}^d_L := [-L,L]^d\) of size \(L>0\):

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \partial _t f + v\cdot \nabla _x f + F(t,x)\cdot \nabla _v f = 0, \\ \displaystyle F(t,x) = -\nabla _x W *_{x} \left( \rho _f(t,x) - L^{-d}\int _y \rho _f(t,y) \, \mathrm{d}y\right) , \\ \displaystyle \rho _f(t,x) = \int _{{\mathbb {R}}^d} f(t,x,v) \, \mathrm{d}v, \\ \displaystyle f(t=0,x,v) = f_{{\mathrm{in}}}(x,v), \end{array} \right. \end{aligned}$$
(1.1)

with \(f(t,x,v) :{\mathbb {R}}\times {\mathbb {T}}^d _L \times {\mathbb {R}}^d \rightarrow [0,\infty )\), the distribution function in phase space, and W the non-local interaction potential. We are interested in solutions of the form \(f(t,x,v) = f^0(v) + h(t,x,v)\), where \(f^0(v)\) is a spatially homogeneous background distribution and h is a mean-zero perturbation. If we denote the perturbation density simply by \(\rho (t,x)\), then the Vlasov equations can be written as

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \partial _t h + v\cdot \nabla _x h + F(t,x)\cdot \nabla _v h + F(t,x)\cdot \nabla _vf^0 = 0, \\ \displaystyle F(t,x) = -\nabla _x W *_{x} \rho (t,x), \\ \displaystyle \rho (t,x) = \int _{{\mathbb {R}}^d} h(t,x,v) \, \mathrm{d}v, \\ \displaystyle h(t=0,x,v) = h_{{\mathrm{in}}}(x,v). \end{array} \right. \end{aligned}$$
(1.2)

The potential W(x) describes the mean-field interaction between particles; the cases of most physical interest are (1) Coulomb repulsive interactions \(F = e \nabla _x \Delta _x ^{-1} \rho \) between electrons in plasmas (where \(e >0\) is the electron charge-to-mass ratio) and (2) Newtonian attractive interactions \(F = -m \mathcal {G}\nabla _x \Delta _x ^{-1} \rho \) between stars in galaxies (where \(m>0\) is the mass of the identical stars and \(\mathcal {G}\) is the gravitational constant). In Fourier variables (see later for the notation) these two cases correspond respectively to \(\widehat{W}(k) = (2\pi )^{-2} e L^2 \left| k\right| ^{-2}\) and \(\widehat{W}(k) = - (2\pi )^{-2} m\mathcal {G}L^2 \left| k\right| ^{-2}\). The former arises in plasma physics where (1.1) describes the distribution of electrons in a plasma interacting with a spatially homogeneous background of ions ensuring global electrical neutrality, after neglecting magnetic effects and ion acceleration. The latter arises in galactic dynamics where (1.1) describes a distribution of stars interacting via Newtonian gravitation, neglecting smaller planetary objects as well as relativistic effects, and assuming Jeans’ swindle (see [67] and the references therein).

By re-scaling t, x and W, we may normalize the size of the box to \(L=2\pi \) without loss of generality and write \({\mathbb {T}}^d = {\mathbb {T}}^d_{2\pi }\) (see Remark 2). For simplicity of notation and mathematical generality, we consider a general class of potentials, with Coulomb/Newton representing the most singular examples. Specifically, we only require that there exist \(C_W < \infty \) and \(\gamma \ge 1\) such that

$$\begin{aligned} \forall \,k \in {\mathbb {Z}}^d \setminus \{ 0\}, \quad |\widehat{W}(k)| \le \frac{C_W}{|k|^{1+\gamma }}. \end{aligned}$$
(1.3)
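
For instance, with the normalization of the Coulomb interaction recalled above, (1.3) holds with the most singular admissible exponent \(\gamma = 1\):

$$\begin{aligned} |\widehat{W}(k)| = \frac{e L^2}{(2\pi )^{2}\left| k\right| ^{2}} \le \frac{C_W}{\left| k\right| ^{1+\gamma }}, \qquad \gamma = 1, \quad C_W = \frac{e L^2}{(2\pi )^{2}}, \end{aligned}$$

and similarly in the Newtonian case with e replaced by \(m\mathcal {G}\).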

This paper is concerned with the mathematically rigorous treatment of Landau damping for the full nonlinear mean-field dynamics, as initiated in [20, 42, 67].

Historical Context

The foundations of classical mechanics are Newton’s laws, and one of the most fundamental issues is the understanding of irreversibility at the macroscopic level on the basis of these (reversible) laws. Landau damping is, as we shall see, part of the quest for the understanding of irreversibility at a statistical level, despite the fact that it is a reversible process. Consider the N-particle system governed by binary interactions,

$$\begin{aligned} \dot{p}_j = - \frac{\partial \mathcal {H}}{\partial q_j}, \qquad \dot{q}_j = \frac{\partial \mathcal {H}}{\partial p_j}, \qquad \mathcal {H}(q,p) := \sum _{j=1} ^N \frac{\left| p_j \right| ^2}{2} + \sum _{j_1 \not = j_2} \phi \left( q_{j_1} - q_{j_2}\right) \end{aligned}$$
(1.4)

with some radially symmetric interaction potential \(\phi \) (note we are taking an electrostatic approximation by neglecting magnetic effects). Consider now indistinguishable particles with identical mass (say normalized to 1), and reduce the canonical Hamiltonian variables to simply \(q_j=x_j\) and \(p_j = v_j\) (position and velocity of the j-th particle, both belonging to \({\mathbb {R}}^3\)). Denote \(X=(x_1,\dots ,x_N)\) and \(V=(v_1,\dots ,v_N)\). Liouville’s theorem states that the distribution function \(F^N(t,X,V)\) is constant along any trajectory in phase space. This translates into the Liouville equation

$$\begin{aligned} 0 = \frac{\partial F^N}{\partial t} + \left\{ F^N, \mathcal {H}\right\} = \frac{\partial F^N}{\partial t} + V \cdot \nabla _X F^N - \sum _{j_1=1} ^N \left( \sum _{j_2 \not = j_1} \nabla \phi (x_{j_2} - x_{j_1}) \cdot \nabla _{v_{j_1}} F^N \right) . \end{aligned}$$
(1.5)

Observe that this evolution equation preserves the symmetry of the distribution function (i.e. the fact that it is invariant under permutation of the particles). Liouville’s theorem also implies in particular that along this microscopic evolution the Boltzmann entropy is preserved:

$$\begin{aligned} {\frac{\mathrm d}{\mathrm dt}}\int _{{\mathbb {R}}^{6N}} F^N \log F^N {\, \mathrm d}X {\, \mathrm d}V = 0. \end{aligned}$$
(1.6)

This highlights the fact that the microscopic evolution is reversible and that no loss of information occurs along time.

In the many-particle limit \(N \rightarrow +\infty \), one tries to close an equation on the reduced one-particle distribution function

$$\begin{aligned} f(t,x_1,v_1) := \int _{{\mathbb {R}}^{6(N-1)}} F^N(t,X,V) {\, \mathrm d}x_2 \cdots {\, \mathrm d}x_N {\, \mathrm d}v_2 \cdots {\, \mathrm d}v_N. \end{aligned}$$
(1.7)

Maxwell proposed in [61] the so-called Boltzmann equation for the statistical evolution of f. In the case of short-range interactions (e.g. \(\phi \) has compact support), Boltzmann [14] then suggested a formal derivation in the limit \(N \rightarrow +\infty \), with the so-called Boltzmann-Grad scaling \(N r^2 = O(1)\) (r being the radius of interaction), and discovered the celebrated H-theorem, which shows that this collisional equation is irreversible and thus quite different from (1.2).

In 1936, Landau [47] proposed a modification of the Boltzmann equation in order to model collisional gases of charged particles, such as electrons in a plasma. However, it was noted by Vlasov [85] that collisions are relatively weak in plasmas (see also [17]) and hence, for many phenomena, it makes sense to consider only mean-field electromagnetic fields. Assuming a mean-field scale \(N \phi = O(1)\) where \(\phi \) is the interaction potential between particles, he derived the Vlasov-Poisson equations (1.2).

Observe that in both cases (Coulomb or Newton forces) Vlasov-Poisson is time-reversible: it is invariant under the change of variable \((t,x,v) \rightarrow (-t,x,-v)\). This of course reflects the time-reversibility of the Newton equations at a microscopic level, which, unlike for collisional kinetic models, has been preserved in the mean-field limit. Hence, it follows that the entropy is constant.

A few years later, Landau [46] discovered a (then) mysterious “damping” effect: the exponential decay of longitudinal electrostatic waves, predicted by linearizing (1.2) around a homogeneous Maxwellian steady state. This provides a certain “asymptotic stability” for the spatially homogeneous steady state in the sense of the asymptotic stability of the spatial density. It was later argued by Lynden-Bell [55] that a similar phenomenon occurs in galactic dynamics, where the gas of electrons interacting by electrostatic forces is replaced by a “gas of stars” interacting by gravitational forces. See §2.1 below for a review of how to derive this at the linear level. In the case of plasmas, the effect was confirmed experimentally in [57].

Landau damping at first appears at odds with the reversibility of (1.2). It was van Kampen in [81] who pointed out that Landau damping is actually a phase-mixing effect, that is, the streaming of particles creates rapid oscillations (in v) of the distribution function which are averaged away by the v integral in the non-local law for F. Information is not lost, but is simply transferred to the small scales in v, and hence while the behavior appears to be irreversible, in fact, it is completely reversible, as demonstrated by the weakly nonlinear plasma echoes [58]. Mixing as a reversible relaxation mechanism had already been discovered in the context of 2D Euler by Orr in 1907 [71] (where the linear problem is much easier), which is now known as inviscid damping (see also [10]). This effect was also demonstrated to be reversible by ‘Euler echo’ experiments [88, 89].
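
The simplest instance of this mechanism is free transport, i.e. (1.2) with the force term switched off:

$$\begin{aligned} \partial _t h + v\cdot \nabla _x h = 0 \; \Longrightarrow \; h(t,x,v) = h_{{\mathrm{in}}}(x-vt,v), \qquad \hat{\rho }(t,k) = \hat{h}_{{\mathrm{in}}}(k,kt), \end{aligned}$$

so each non-zero spatial mode of the density decays at a rate dictated solely by the smoothness of \(h_{{\mathrm{in}}}\) in v (Riemann-Lebesgue), while h itself merely oscillates at ever finer scales in v and converges only weakly: no information is lost.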

The original works in physics neglected nonlinear effects, which led to some speculation, and Landau himself was seemingly very prudent about the exact validity of his analysis for the nonlinear equation and remained silent on this point. The mathematically rigorous theory of the linear damping was pioneered by Backus [4] and Penrose [73] (see also the discussion in §2.1 below), and further clarified by many mathematicians, see e.g. [23, 59]. Penrose seemed optimistic that Landau damping should occur in the nonlinear equations, and highlighted the fact that near equilibrium, the linear evolution would cause the nonlinear electric field to decay and that the nonlinear equations would be asymptotically linear. Such a situation is very reminiscent of scattering in dispersive equations, as discussed in more detail in e.g. [10, 20, 23, 67]. However, Backus [4] made the important point that while the electric field may decay, the gradients \(\nabla _v f\) will grow, and hence perhaps the linearization will cease to make good predictions.

The first nonlinear results were obtained by [20, 42], which showed that Landau damping was at least possible in (1.2) for analytic data (see also [52] for a negative result). However, Landau damping was not shown to hold for all small data until [67]. The results therein hold in analytic spaces or in close to analytic Gevrey spaces [29] and the authors made heuristic conjectures about the minimal regularity required. The proof involved an intricate use of Eulerian and Lagrangian coordinates combined with a global-in-time Newton iteration reminiscent of the proof of the KAM theorem (see therein for a discussion about broader analogies with KAM theory). A key step in the analysis of [67] was controlling the potentially destabilizing influence of the plasma echoes, a weakly nonlinear effect discovered by Malmberg et al. in [58]. The basic mechanism of the plasma echo is as follows. The force field is damped due to mixing in phase space transferring information from large spatial scales to small scales in the velocity distribution; however, as mixing is time-reversible, un-mixing will then create growth in the force field by transferring information back from the velocity distribution to the large spatial scales. Therefore, any nonlinear effect which transfers information to modes which are un-mixing will lead to a large force field in the future when that information un-mixes (hence ‘echo’). These can potentially chain into a cascade of echoes, as demonstrated experimentally in plasmas [58] and 2D Euler [88, 89]; hence we cannot expect to rule them, or the potential cascade, out. See [67] and below for more discussion of the role this resonance plays in the nonlinear theory. In fluid mechanics, the growth caused by un-mixing was noticed earlier by Orr [71] and is known there as the Orr mechanism; see [10] for more discussion.

Objective of New Work

In this work, we provide a new proof of Landau damping for (1.2) that nearly obtains the “critical” Gevrey regularity predicted in [67]. Obviously, both works have common themes (as these are coming from the physics), namely,

  1. the abstract linear stability condition is the same,

  2. the physical mechanism of phase mixing transferring regularity to decay is the same,

  3. the isolation and control of the plasma echoes is still the main challenge.

However, on a mathematical level, the two proofs are quite different; see §2 for a full discussion.

Our proof is shorter, and arguably simpler (assuming some knowledge of paradifferential calculus), than that of [67]. More importantly, this new approach is more robust for developing a general theory. The paper [67] was a breakthrough that potentially opens the way for mathematical studies on phase mixing for nonlinear PDEs, but the main drawback was that the conceptual and technical construction of the proof was so long and mixed so many different abstract viewpoints that it has proved hard to use in a wider context. Our new approach, by avoiding the mixture of Lagrangian and Eulerian viewpoints, avoiding the Newton scheme, and using paradifferential energy estimates which take full advantage of time-varying multipliers, makes the proof significantly less “rigid”. There is a pressing need for more flexible and powerful analytical tools, as the ‘basic’ situation studied in [67] is only a tiny part of what remains to be understood, and we believe that our work is a key step to addressing many interesting and qualitatively different questions ahead. Indeed, the tools developed in this paper have already been adapted to prove a nonlinear Landau damping result in relativistic plasmas [87]. Other examples where our methods may prove useful are damping in un-confined geometries by separating scales in frequencies; geometries and cyclotron effects imposed by external magnetic fields, or more subtle self-created magnetic effects; damping in weakly collisional plasmas; and finally the stability of self-created gravitational geometries [49, 66] or nonlinear BGK waves [13].

Our proof combines the viewpoint in the original work [67] with the recent work on the 2D Euler equations in [10]. This latter work proves the asymptotic stability (in a suitable sense) of sufficiently smooth shear flows near Couette flow in \({\mathbb {T}}\times {\mathbb {R}}\) via inviscid damping (the hydrodynamic analogue of Landau damping). The proof in [10] uses a number of ideas specific to the structure of 2D Euler, however, some aspects of the viewpoint taken therein will be useful here as well (when suitably combined with ideas from [67]).

One of the main ingredients of our proof, employed also in [10], is the use of the paradifferential calculus to split nonlinear terms into one that carries the transport structure and another that is analogous to the nonlocal interaction term referred to as ‘reaction’ in [67]. It has long been believed that Nash-Moser or similar Newton iterations (see the classical work [64, 65]) can generally be replaced by a more standard fixed point argument if one makes better use of all the structure in the equation. This has been the case in most examples in the literature (e.g. Nash’s isometric embedding theorem [68], reproved by Günther [34–36]), with maybe the only exception being KAM theory. Other examples can be found in [40], where Hörmander specifically points out paradifferential calculus as a useful tool in this context.

It is well known that one of the main physical barriers to Landau damping is nonlinear particle trapping, whereby particles are trapped in the potential wells of (say) electrostatic waves. Exact traveling wave solutions of this type exist in plasma physics and are known as BGK waves [13]. They were used by Lin and Zeng in [52] to show that one needs at least \(H^\sigma \) with \(\sigma > 3/2\) regularity on the distribution function to expect Landau damping in (1.2) in the neighborhood of Landau-stable stationary solutions (see Definition 1.1 below). The plasma echoes provide a natural nonlinear bootstrap mechanism by which the electrostatic waves can persist long enough to trap particles. After modding out by particle free streaming, the bootstrap appears as a cascade of information to high frequencies, and the regularity requirement of Gevrey-\((2+\gamma )^{-1}\) comes from formal ‘worst-case’ calculations carried out in [67]. Mathematically, the same requirement arises here in §6. Lower regularity is an open question: it seems plausible that Theorem 1 is false for all Gevrey-\(\frac{1}{s}\) with \(s < (2+\gamma )^{-1}\); however, there might be additional cancellations that could allow it to hold in lower regularities. Finally, in weakly collisional plasmas, the requirement could perhaps be relaxed in some suitable sense (e.g. permitting data which is Gevrey plus a smaller rough contribution that will be instantly regularized, as in the recent work on the Navier-Stokes equations [11]).

It is well known that many areas in physics present striking analogies with each other and many important developments have come from a good understanding of these analogies. This work is an example where the analogy between 2D incompressible Euler and Vlasov-Poisson proved fruitful. The connection between inviscid damping and Landau damping has been acknowledged by a number of authors, for example [6, 16, 19, 32, 51, 79]. Both have similar weakly nonlinear echoes [82, 83, 88, 89]; moreover, the works of Lin and Zeng [51, 52] and [9] show that particle trapping and vortex roll-up may in fact be essentially the same ‘universal’ over-turning instability. On a more general level, both systems are conservative transport equations governed by a single scalar quantity (the vorticity in the 2D Euler equation and the distribution function in Vlasov-Poisson equations). Both equations have a large set of stationary solutions: the shear flows for the Euler equation and spatially homogeneous distributions for Vlasov-Poisson equations being the simplest. Each can be viewed as a Hamiltonian system, and variational methods have been used for both to provide nonlinear stability results in low-regularity spaces (i.e. functional spaces invariant under the free-streaming operator): we refer for instance, among a huge literature, to [3] in the context of the Euler equation, and to the recent remarkable results [37, 49] in the context of the gravitational Vlasov-Poisson equation (see also the review paper [66] and the references therein). One can also derive the incompressible Euler system from the Vlasov-Poisson system in the quasi-neutral regime for cold electrons (vanishing initial temperature) [18, 60].

The Main Result

In what follows, we denote the Gevrey-\(\nu ^{-1}\) norms, \(\nu \in (0,1]\), with Sobolev corrections \(\sigma \in {\mathbb {R}}\) (see §3.1 for Fourier analysis conventions and §1.3.1 for other notations),

$$\begin{aligned} \left\| f\right\| ^2_{\mathcal {G}^{\lambda ,\sigma ;\nu }}&= \sum _{k \in \mathbb {Z}^d}\int _\eta \left| \hat{f}_k(\eta )\right| ^2 \langle k,\eta \rangle ^{2\sigma } e^{2\lambda \langle k,\eta \rangle ^{\nu }} \, \mathrm{d}\eta , \end{aligned}$$
(1.8a)
$$\begin{aligned} \left\| \rho (t)\right\| ^2_{\mathcal {F}^{\lambda ,\sigma ;\nu }}&= \sum _{k \in \mathbb {Z}^d} \left| \hat{\rho }_k(t)\right| ^2 \langle k,kt \rangle ^{2\sigma } e^{2\lambda \langle k,kt \rangle ^{\nu }}. \end{aligned}$$
(1.8b)

When \(\sigma = 0\) or \(\nu = s\) (with s defined below), these parameters are usually omitted.

Recall the sufficient linear stability condition (L) introduced in [67] for analytic background distributions, which we slightly adapt to the norms we are using.

Definition 1.1

Given a homogeneous analytic distribution \(f^0(v)\) we say it satisfies stability condition (L) if there exist constants \(C_0,\bar{\lambda },\kappa > 0\) and an integer \(M > d/2\) such that

$$\begin{aligned} \sum _{\alpha \in {\mathbb {N}}^d : \left| \alpha \right| \le M} \left\| v^{\alpha }f^0\right\| ^2_{\mathcal {G}^{\bar{\lambda };1}} \le C_0, \end{aligned}$$
(1.9)

and for all \(\xi \in \mathbb {C}\) with \(Re \, \xi < \bar{\lambda }\),

$$\begin{aligned} \inf _{k \in \mathbb {Z}^d}\left| {\mathcal {L}}(\xi ,k) - 1\right| \ge \kappa , \end{aligned}$$
(1.10)

where \({\mathcal {L}}\) is defined by the following (where \(\bar{\xi }\) denotes the complex conjugate of \(\xi \)),

$$\begin{aligned} {\mathcal {L}}(\xi ,k) = -\int _0^\infty e^{\bar{\xi }\left| k\right| t} \widehat{f^0}\left( kt\right) \widehat{W}(k)\left| k\right| ^2 t \, \mathrm{d}t. \end{aligned}$$
(1.11)

Remark 1

Note (1.9) implies the integral in (1.11) is absolutely convergent by the \(H^{d/2+}\hookrightarrow C^0\) embedding theorem.

In [67], it is shown that (L) is practically equivalent to several well known stability criteria in plasma and galactic dynamics (see §4.3 for a completion of the proof that the Penrose condition [73] implies (L)).
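
Condition (1.10) can also be explored numerically. The following minimal sketch (purely illustrative, and not part of any proof) approximates \({\mathcal {L}}(\xi ,k)\) from (1.11) by quadrature for a normalized Maxwellian background and a repulsive Coulomb-type interaction, with all physical and Fourier-normalization constants set to 1, and then samples \(\left| {\mathcal {L}}(\xi ,k) - 1\right| \) on a finite grid of \(\xi \) with \(\mathrm {Re}\,\xi \le \bar{\lambda }\); the function name and all numerical parameters are arbitrary choices made for this illustration.

```python
import numpy as np

# Illustrative check of the stability condition (L): the background is a
# normalized Maxwellian, so hat{f0}(eta) is proportional to e^{-|eta|^2/2};
# the interaction is repulsive Coulomb-type with hat{W}(k)|k|^2 = 1.  All
# constants are set to 1, so the output only indicates how such a check
# could be organized.
def L_dispersion(xi, k_abs, t_max=40.0, n=4000):
    """Trapezoidal approximation of
       L(xi,k) = -int_0^infty e^{conj(xi)|k|t} hat{f0}(kt) hat{W}(k)|k|^2 t dt."""
    t = np.linspace(0.0, t_max, n)
    integrand = np.exp(np.conj(xi) * k_abs * t - (k_abs * t) ** 2 / 2.0) * t
    dt = t[1] - t[0]
    return -np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dt

# Sample |L(xi,k) - 1| for Re(xi) up to lambda_bar and the lowest spatial
# modes; condition (L) asks for a uniform lower bound kappa > 0.
lambda_bar = 0.05
grid = [a + 1j * b for a in np.linspace(-2.0, lambda_bar, 40)
        for b in np.linspace(-20.0, 20.0, 80)]
kappa_est = min(abs(L_dispersion(xi, k) - 1.0)
                for k in (1.0, 2.0, 3.0) for xi in grid)
print(f"estimated kappa on the sampled grid: {kappa_est:.3f}")
```

Of course, such a computation only samples a compact grid; verifying (1.10) rigorously requires the arguments of §4.3 (or the Penrose criterion).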

We prove the following nonlinear Landau damping result, which for Coulomb/Newton interaction nearly obtains the Gevrey-3 regularity predicted heuristically in [67]. In §2 we give the outline of the proof and discuss the relationship with the original proof in [67] and the proof of inviscid damping in 2D Euler [10].

Theorem 1

Let \(f^0\) be given which satisfies stability condition (L) with constants M,\(\bar{\lambda }\), \(C_0\) and \(\kappa \). Let \(\frac{1}{(2+\gamma )} < s \le 1\) and \(\lambda _0 > {\lambda ^\prime } > 0\) be arbitrary (if \(s = 1\) we require \(\bar{\lambda } > \lambda _0\)). Then there exists an \(\epsilon _0 = \epsilon _0(d,M,\bar{\lambda },C_0,\kappa ,\lambda _0,{\lambda ^\prime },s)\) such that if \(h_{{\mathrm{{in}}}}\) is mean zero (that is, \(\int \int h_{in}(x,v) dx dv = 0\)) and

$$\begin{aligned} \sum _{\alpha \in {\mathbb {N}}^d \, : \, \left| \alpha \right| \le M}\left\| v^{\alpha }h_{{\mathrm{{in}}}}\right\| ^2_{\mathcal {G}^{\lambda _0;s}} < \epsilon ^2 \le \epsilon ^2_0, \end{aligned}$$

then there exists a mean zero \(h_{+\infty } \in \mathcal {G}^{\lambda ';s}\) satisfying,

$$\begin{aligned} \left\| h(t,x+vt,v) - h_{+\infty }(x,v)\right\| _{\mathcal {G}^{\lambda ^{\prime };s}}&\lesssim \epsilon e^{-\frac{1}{2}(\lambda _0 - {\lambda ^\prime })t^{s}}, \end{aligned}$$
(1.12a)
$$\begin{aligned} \left\| \rho (t)\right\| _{\mathcal {F}^{\lambda ^\prime ;s}}&\lesssim \epsilon e^{-\frac{1}{2}(\lambda _0 - {\lambda ^\prime })t^{s}}, \end{aligned}$$
(1.12b)

for all \(t \ge 0\).

Remark 2

Through the rescaling on W, our estimate of \(\epsilon _0\) in Theorem 1 is a decreasing function of the side-length of the original torus, L. That is, the restriction for nonlinear stability becomes more stringent as the confinement is removed. Moreover, through the rescaling on time, Theorem 1 predicts damping on a characteristic time-scale of O(L). See [30, 31, 67, 84] for more discussion about what can happen without confinement.
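
To make the rescaling concrete (a sketch which only tracks the scalings, not the constants), set

$$\begin{aligned} \tilde{x} = \frac{2\pi }{L}\, x, \qquad \tilde{t} = \frac{2\pi }{L}\, t, \qquad \tilde{v} = v, \qquad \tilde{W}(\tilde{x}) = \Big (\frac{L}{2\pi }\Big )^d\, W\Big (\frac{L}{2\pi }\, \tilde{x}\Big ). \end{aligned}$$

Then (1.1) posed on \({\mathbb {T}}^d_L\) becomes the same system on \({\mathbb {T}}^d_{2\pi }\) with interaction potential \(\tilde{W}\); for Coulomb/Newton this produces the factor \(L^2\) in \(\widehat{W}\) recorded in §1.1, so \(C_W\) in (1.3) grows like \(L^2\), while a decay rate \(e^{-c\tilde{t}^{s}}\) in the rescaled time reads \(e^{-c(2\pi t/L)^{s}}\) in the original variables, i.e. damping over time-scales O(L).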

Remark 3

It is immediate to deduce estimates on the complete distribution \(f(t,x,v) = f^0(v) + h(t,x,v)\) of the original equation (1.1). In particular, Theorem 1 shows that all solutions to (1.1) with non-trivial spatial dependence close to \(f^0\) satisfy \(\left\| f(t)\right\| _{H^N} \approx \langle t \rangle ^N\) (the same as free transport).

Remark 4

From (1.12a) we have the homogenization \(h(t,x,v) \rightharpoonup <h_{+\infty }(\cdot ,v)>_x\) and the exponential decay of the electrical or gravitational field F[h](tx). Note however that the estimates on h and \(\rho \) are more precise as they show decay rates which increase with k (the spatial frequency).
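
Indeed, unwinding definition (1.8b) (with \(\sigma = 0\)) and using the convention \(\left| k,kt\right| = \left| k\right| (1+t)\), the bound (1.12b) gives, mode by mode,

$$\begin{aligned} \left| \hat{\rho }_k(t)\right| \le e^{-{\lambda ^\prime }\langle k,kt \rangle ^{s}} \left\| \rho (t)\right\| _{\mathcal {F}^{\lambda ^\prime ;s}} \lesssim \epsilon \, e^{-{\lambda ^\prime }\left| k\right| ^{s}(1+t)^{s}}, \qquad k \ne 0, \end{aligned}$$

so higher spatial frequencies of \(\rho \) (and hence of F) indeed decay at faster rates.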

Remark 5

The theorem also holds backwards in time for some \(h_{-\infty } \in \mathcal {G}^{\lambda ';s}\).

Remark 6

The asymptotic distribution function \(f_{+\infty }(x,v) := f^0(v) + h_{+\infty }(x,v)\) depends on the entire nonlinear evolution, however, at least one can show \(f_{+\infty }\) is within \(O(\varepsilon ^2)\) of the distribution predicted by the linear theory in \(\mathcal {G}^{\lambda ^{\prime \prime };s}\) for any \(\lambda ^{\prime \prime } < {\lambda ^\prime }\) (similar to [67]). See §7 for a sketch.

Remark 7

Though \(f^0\) is analytic, the statement shows the asymptotic stability of homogeneous distributions within a small neighborhood of \(f^0\) in Gevrey-\(s^{-1}\) since we are really making a perturbation of \(f^0(v) + <h_{{\mathrm{in}}}(\cdot ,v)>_x\). However, the size of that neighborhood depends on the parameters in Definition 1.1, so it still must be close to an analytic distribution satisfying (L).

Remark 8

Requiring \(h_{{\mathrm{in}}}\) to be average zero does not lose any generality. Indeed, if \(h_{{\mathrm{in}}}\) were not mean zero we may apply Theorem 1 to \(\tilde{f}^0(v) = f^0(v) +\left( <h_{{\mathrm{in}}}(\cdot ,v)>_x\right) _{<1}\) and \(\tilde{h}_{{\mathrm{in}}} = h_{{\mathrm{in}}} -\left( <h_{{\mathrm{in}}}(\cdot ,v)>_x\right) _{<1}\), where \(g_{<1}\) denotes projection onto frequencies less than one. Since (L) is open, for \(\epsilon _0\) sufficiently small, \(\tilde{f}^0\) still satisfies (L) with slightly adjusted parameters \(C_0\) and \(\kappa \).

Remark 9

If W is in the Schwartz space, then Theorem 1 holds for all \(0 < s < 1\) (although \(\epsilon _0\) goes to zero as \(s \searrow 0\)). The heuristics of [67] suggest that even in the case of analytic W, we cannot hope to work in the Sobolev scale without major new ideas, if such a result holds at all.

Notation and Conventions

We denote \({\mathbb {N}}= \left\{ 0,1,2,\dots \right\} \) (including zero) and \(\mathbb {Z}_*= \mathbb {Z}\setminus \left\{ 0\right\} \). For \(\xi \in \mathbb {C}\) we use \(\bar{\xi }\) to denote the complex conjugate. We will denote the \(\ell ^1\) vector norm \(\left| k,\eta \right| = \left| k\right| + \left| \eta \right| \), which by convention is the norm taken in our work. We denote

$$\begin{aligned} \langle v \rangle = \left( 1 + \left| v\right| ^2 \right) ^{1/2}. \end{aligned}$$

We use the multi-index notation: given \(\alpha = (\alpha _1,\dots ,\alpha _d) \in {\mathbb {N}}^d\) and \(v = (v_1,\dots ,v_d) \in {\mathbb {R}}^d\) then

$$\begin{aligned} v^\alpha = v^{\alpha _1}_1v^{\alpha _2}_2 \dots v^{\alpha _d}_d, \quad \quad \quad D_\eta ^\alpha = (i\partial _{\eta _1})^{\alpha _1} \dots (i\partial _{\eta _d})^{\alpha _d}. \end{aligned}$$

We denote Lebesgue norms for \(p,q \in [1,\infty ]\) and a, b either in \({\mathbb {R}}^d\), \(\mathbb {Z}^d\) or \({\mathbb {T}}^d\) as

$$\begin{aligned} \left\| f\right\| _{L_a^p L_b^q} = \left( \int _a \left( \int _b \left| f(a,b)\right| ^q \, \mathrm{d}b \right) ^{p/q} \, \mathrm{d}a\right) ^{1/p} \end{aligned}$$

and Sobolev norms (usually applied to Fourier transforms) as

$$\begin{aligned} \left\| \hat{f}\right\| ^2_{H^M_\eta } = \sum _{\alpha \in {\mathbb {N}}^d; \left| \alpha \right| \le M} \left\| D_\eta ^\alpha \hat{f}\right\| _{L^2_\eta }^2. \end{aligned}$$

We will often use the short-hand \(\left\| \cdot \right\| _2\) for \(\left\| \cdot \right\| _{L^2_{z,v}}\) or \(\left\| \cdot \right\| _{L^2_v}\) depending on the context.

See §3 for the Fourier analysis conventions we are taking. A convention we generally use is to denote the discrete x (or z) frequencies as subscripts. We use Greek letters such as \(\eta \) and \(\xi \) to denote frequencies in \({\mathbb {R}}^d\) and lowercase Latin characters such as k and \(\ell \) to denote frequencies in \(\mathbb {Z}^d\). Another convention we use is to denote \(N,N^\prime \) as dyadic integers \(N \in {\mathbb {D}}\) where

$$\begin{aligned} {\mathbb {D}} = \left\{ \frac{1}{2},1,2,\dots ,2^j,\dots \right\} . \end{aligned}$$

When a sum is written with indices N or \(N^\prime \) it will always be over a subset of \({\mathbb {D}}\). This will be useful when defining Littlewood-Paley projections and paraproduct decompositions, see §3. Given a function \(A = A(k,\eta ) \in L^\infty \), we define the Fourier multiplier operator \(\mathcal {A} g = A(\nabla _{z,v}) g\) by

$$\begin{aligned} \left( \widehat{A(\nabla _{z,v})g}\right) _k(\eta ) := A( (ik,i\eta ) ) \hat{g}_k(\eta ), \end{aligned}$$

and we also define the corresponding operator \(\mathcal {A}_|(\rho ) = A(\nabla _{z,zt})\rho =: A_|(\nabla _z) \rho \), acting on functions of x only, along the frequencies (k, kt) given by the velocity averages along the moving frame:

$$\begin{aligned} \widehat{A(\nabla _{z,zt}) \rho }_k(t) = A((ik,ikt))\hat{\rho }_k(t), \quad A_|(\nabla _z) = A(\nabla _{z,zt}). \end{aligned}$$
(1.13)

We use the notation \(f \lesssim g\) when there exists a constant \(C > 0\) independent of the parameters of interest such that \(f \le Cg\) (we analogously define \(f \gtrsim g\)). Similarly, we use the notation \(f \approx g\) when there exists \(C > 0\) such that \(C^{-1}g \le f \le Cg\). We sometimes use the notation \(f \lesssim _{\alpha } g\) if we want to emphasize that the implicit constant depends on some parameter \(\alpha \).

Outline of the Proof

Linear Behavior

In [85], Vlasov sought to understand (1.2) for small h by first linearizing the equations around the steady state \(f^0(v) = e^{-\frac{|v|^2}{2 T}}/(2 \pi T)^{3/2}\):

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \partial _t h + v\cdot \nabla _x h + F(t,x)\cdot \nabla _vf^0 = 0, \\ \displaystyle F(t,x) = -\nabla _x W *_{x} \rho (t,x), \\ h(t=0,x,v) = h_{{\mathrm{in}}}(x,v). \end{array} \right. \end{aligned}$$
(2.1)

However, Vlasov searched for plane wave solutions via a normal mode method, and in general the spectrum of (2.1) turns out to be purely continuous. Landau understood that the normal mode method was bound to fail for (2.1) (indeed he described the approach of Vlasov as being “without any foundation”) and instead argued for studying and solving the Cauchy problem for given analytic initial data, i.e. the “vibrations for a given initial distribution”. He then solved (2.1) via a Fourier-Laplace transform method. Interestingly, a very similar exchange had already occurred in fluid mechanics long before Vlasov and Landau: when studying the stability of shear flows, Lord Rayleigh had also attempted to take a normal mode approach [76], but it was subsequently understood by Lord Kelvin and Orr [43, 71] that this would not work, for reasons that are indeed the same. However, we do note that by properly dealing with “singular eigenfunctions” for the continuous spectrum, one can actually solve (2.1) with a spectral method, see e.g. [68, 62, 63, 81] for more discussion.

Landau’s argument is subtle and strongly reminiscent of the by-now classical Gearhart-Prüss-Greiner Theorem (see for instance [24, Theorem V.1.11, page 302]; see also below in §4). However, as pointed out by Penrose [73], Backus [4] and others, it is not completely rigorous. For our discussion of linear stability, let us follow a variant of the approach of Penrose, whose method is the most straightforward and flexible. By taking the Fourier transform of (2.1) in x and then integrating in v and time, one derives the set of decoupled Volterra equations:

$$\begin{aligned} \hat{\rho }(t,k) = \hat{h}_{{\mathrm{in}}} (k,kt) - \int _0 ^t \widehat{W}(k)\, k \cdot k (t-s)\, \widehat{f^0}\left( k(t-s)\right) \hat{\rho }(s,k) {\, \mathrm d}s, \end{aligned}$$
(2.2)
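
For completeness, here is the computation behind (2.2) (a sketch, with the Fourier conventions of §3.1 and constants suppressed). Integrating (2.1) along free transport (Duhamel) gives

$$\begin{aligned} \hat{h}(t,k,\eta ) = \hat{h}_{{\mathrm{in}}}(k,\eta +kt) - \int _0^t \widehat{F\cdot \nabla _v f^0}\big (s,k,\eta +k(t-s)\big ) {\, \mathrm d}s, \qquad \widehat{F\cdot \nabla _v f^0}(s,k,\xi ) = \widehat{W}(k)\, k\cdot \xi \, \widehat{f^0}(\xi )\, \hat{\rho }(s,k), \end{aligned}$$

and (2.2) follows upon evaluating at \(\eta = 0\), since \(\hat{\rho }(t,k) = \hat{h}(t,k,0)\).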

Here \(\hat{\rho }(t,k)\) denotes the Fourier transform in x and \(\hat{h}_{{\mathrm{in}}}(k,\eta )\) denotes the Fourier transform in both x and v. Upon extending all the functions by zero for negative times and (formally) taking the Fourier transform in time, we have

$$\begin{aligned} \tilde{\rho }(\omega ,k) = A(\omega ,k) + {\mathcal {L}}\left( \frac{i\omega }{\left| k\right| },k \right) \tilde{\rho }(\omega ,k), \end{aligned}$$
(2.3)

where \({\mathcal {L}}\) is given above in (1.11) and

$$\begin{aligned} A(\omega ,k) = \int _0^\infty e^{-i\omega t} \hat{h}_{{\mathrm{in}}}(k,kt) {\, \mathrm d}t. \end{aligned}$$

Therefore, if (1.10) holds then we can write

$$\begin{aligned} \tilde{\rho }(\omega ,k) = \frac{A(\omega ,k)}{1-{\mathcal {L}}\left( \frac{i\omega }{\left| k\right| },k \right) }, \end{aligned}$$

then using that the Fourier transform is unitary on \(L^2\) we have, at least,

$$\begin{aligned} \int _0^\infty \left| \hat{\rho }(t,k)\right| ^2 {\, \mathrm d}t \lesssim \kappa ^{-1} \int _0^\infty \left| \hat{h}_{{\mathrm{in}}}(k,kt)\right| ^2 {\, \mathrm d}t. \end{aligned}$$

The RHS is finite as soon as \(\left| v\right| ^{\frac{d-1}{2}+}h_{in} \in L^2\) by taking the Sobolev space restriction on the Fourier side; see Lemma 3.4 below. To obtain higher decay rates one can use several methods. To obtain polynomial rates of decay, the simplest approach is to take derivatives in \(\omega \) of (2.3). For example, one derivative is enough to prove that \(\hat{\rho }(t,k)\) is integrable in time, allowing one to prove that solutions to (2.1) are asymptotically ballistic, i.e. \(h(t,x,v) \sim h_{\infty }(x-tv,v)\) for some \(h_{\infty }\) as \(t \rightarrow \infty \). Hence, by considering \(\widehat{h_\infty (x-tv,v)}(k,\eta ) = \widehat{h_\infty }(k,\eta +kt)\), we see that the information in the distribution function is being moved to higher frequencies/smaller scales in v. Moreover, we can see that \(h(t) \rightharpoonup <h_\infty >_x\), and so the evolution is weakly (but not strongly) returning to a spatially homogeneous equilibrium. For analytic regularity, the approach taken in [67] was to multiply (2.2) by \(e^{\left| k\right| t}\) and deduce an estimate on \(\Phi (t,k) = e^{\left| k\right| t} \hat{\rho }(t,k)\). This inspires what is done below in §4 in the Gevrey class.
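
To flesh out the one-derivative claim above (a sketch, under suitable moment and regularity assumptions on \(h_{{\mathrm{in}}}\) and \(f^0\)): differentiating (2.3) in \(\omega \) and applying Plancherel as before controls \(\langle t \rangle \hat{\rho }(t,k)\) in \(L^2_t\), and then Cauchy-Schwarz gives

$$\begin{aligned} \int _0^\infty \left| \hat{\rho }(t,k)\right| {\, \mathrm d}t \le \left( \int _0^\infty \langle t \rangle ^{2}\left| \hat{\rho }(t,k)\right| ^2 {\, \mathrm d}t\right) ^{1/2}\left( \int _0^\infty \langle t \rangle ^{-2} {\, \mathrm d}t\right) ^{1/2} < \infty , \end{aligned}$$

which is the integrability in time used to deduce the ballistic behavior.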

Several comments are left to be made. First, the above argument is only formal, as in order to take the Fourier transform in time to deduce (2.3), one already needed to have some a priori decay estimate on \(\hat{\rho }(t,k)\). This can be resolved in several ways, see §4 below for a full discussion. Second, Penrose derived a reasonably practical way of checking (1.10) for general background distributions, known as the Penrose condition, which is also known to be essentially sharp [52]. See §4 below for a proof of this.

Summary, Weakly Nonlinear Heuristics, and Comparison with Original Proof [67]

Next, let us consider how to prove Theorem 1 and extend the linear results to the nonlinear level. Landau damping predicts that the solution evolves by kinetic free transport as \(t \rightarrow \infty \):

$$\begin{aligned} h(t,x,v) \sim h_\infty (x-vt,v). \end{aligned}$$

We ‘mod out’ by the characteristics of free transport and work in the coordinates \(z = x-vt\) by making the definition \(g(t,z,v) := h(t,z + tv, v)\). Then (1.12a) becomes equivalent to \(g(t) \rightarrow h_\infty \) strongly in Gevrey\(-\frac{1}{s}\). This coordinate shift was used in [20, 42] and is related to the notion of ‘gliding regularity’ used in [67] (although we will not have to use the time-shifting tricks employed therein). Moreover, it is also closely related to the notion of ‘profile’ used in nonlinear dispersive equations (see e.g. [28]). A related coordinate change is also used in [10].

A straightforward computation gives the evolution equation:

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t g + F(t,z+vt) \cdot (\nabla _v -t\nabla _z)g + F(t,z+vt)\cdot \nabla _vf^0 = 0, \\ g(t = 0,z,v) = h_{{\mathrm{in}}}(z,v). \end{array} \right. \end{aligned}$$
(2.4)
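
The computation is simply the chain rule: writing \(g(t,z,v) = h(t,z+tv,v)\),

$$\begin{aligned} \partial _t g = \left( \partial _t h + v\cdot \nabla _x h\right) (t,z+tv,v), \qquad \nabla _z g = \nabla _x h(t,z+tv,v), \qquad \nabla _v g = \left( \nabla _v h + t\nabla _x h\right) (t,z+tv,v), \end{aligned}$$

so that \(\nabla _v h(t,z+tv,v) = (\nabla _v - t\nabla _z) g(t,z,v)\), and substituting into (1.2) gives (2.4).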

Note that the density satisfies \(\hat{\rho }_k(t) = \hat{g}_k(t,kt)\); by the \(H^{d/2+} \hookrightarrow C^0\) embedding theorem and the requirement \(M > d/2\), this formula at least makes sense pointwise in time provided \(\sum _{\alpha \le M}\left\| v^\alpha g\right\| _2\) is finite. Moreover, from this formula we see that a uniform bound on the regularity of g translates directly into decay of the density: this is precisely the phase mixing mechanism. Hence, to prove (1.12b), we are aiming for a uniform-in-time bound on the regularity in a velocity polynomial-weighted space (so that we may restrict the Fourier transform using \(H^{d/2+} \hookrightarrow C^0\)).
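
Indeed, with the Fourier conventions of §3.1, the identity \(\hat{\rho }_k(t) = \hat{g}_k(t,kt)\) is nothing but the change of variables \(x = z + tv\):

$$\begin{aligned} \hat{\rho }_k(t) = \frac{1}{(2\pi )^{d}}\int _{{\mathbb {T}}^d \times {\mathbb {R}}^d} e^{-i k\cdot x}\, g(t,x-tv,v) \, \mathrm{d}v\, \mathrm{d}x = \frac{1}{(2\pi )^{d}}\int _{{\mathbb {T}}^d \times {\mathbb {R}}^d} e^{-i k\cdot z - i (kt)\cdot v}\, g(t,z,v) \, \mathrm{d}z\, \mathrm{d}v = \hat{g}_k(t,kt). \end{aligned}$$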

All of our analysis is on the Fourier side; taking Fourier transforms and using \(\widehat{[F(t,z+vt)]}(t,k,\eta ) = -(2\pi )^dik\widehat{W}(k)\hat{\rho }_k(t) \delta _{\eta \,{=}\, kt}\), (2.4) becomes

$$\begin{aligned}&\partial _t \hat{g}_k(t,\eta ) + \hat{\rho }_k(t) \widehat{W}(k) k \cdot (\eta - tk) \hat{f}^0(\eta - kt)\nonumber \\&\quad + \sum _{\ell \in \mathbb {Z}_*^d} \hat{\rho }_\ell (t)\widehat{W}(\ell )\ell \cdot \left[ \eta - tk\right] \hat{g}_{k-\ell }(t,\eta - t\ell ) = 0. \end{aligned}$$
(2.5)

Note in the summation that \(\eta - tk = (\eta -t\ell ) - t(k-\ell )\). By the formula \(\hat{\rho }_k(t) = \hat{g}_k(t,kt)\), (2.5) is a closed equation for g; however, we will derive a separate integral equation for \(\rho (t)\) and look at \((g,\rho )\) as a coupled system (similar to [67]). Integrating (2.5) in time and evaluating at \(\eta = kt\) gives

$$\begin{aligned} \hat{\rho }_k(t)&= \hat{h}_{{\mathrm{in}}}(k,kt) - \int _0^t\hat{\rho }_k(\tau ) \widehat{W}(k)k \cdot k(t - \tau ) \widehat{f^0}\left( k(t-\tau )\right) {\, \mathrm d}\tau \nonumber \\&\quad - \int _0^t \sum _{\ell \in \mathbb {Z}_*^d} \hat{\rho }_\ell (\tau )\widehat{W}(\ell )\ell \cdot k\left( t-\tau \right) \hat{g}_{k-\ell }(\tau ,kt - \tau \ell ) {\, \mathrm d}\tau . \end{aligned}$$
(2.6)

As in [67], our goal now is to use the system (2.5)–(2.6) to derive a uniform control on the regularity of g in the moving frame (referred to as ‘gliding regularity’ in [67]). The linear term in (2.6) is handled with the help of the abstract stability condition (L); the difference here with [67] is that we must adapt this to the Gevrey norms we are using, which is done using a slightly technical decomposition technique similar to one which appeared in [67] to treat Gevrey data (carried out below in §4). The main point of departure from [67] is our treatment of the nonlinear terms in (2.5)–(2.6). There are schematically two mechanisms of potential loss of regularity in (2.5)–(2.6) and one potential loss of localization in velocity space:

  1. Equation (2.5) describes a transport structure in \({\mathbb {T}}^d \times {\mathbb {R}}^d\), and hence we can expect this to induce the loss of regularity usually associated with transport by controlled coefficients. A different loss of regularity occurs in (2.6) due to the term \( k(t-\tau )\hat{g}_{k-\ell }(\tau ,kt-\ell \tau )\): here there is a derivative of g appearing but no transport structure to take advantage of. However, we still refer to this effect as ‘transport’ and remark that it seems related to the beam instability [17].

  2. We can see from (2.6) that \(\hat{\rho }_\ell (\tau )\) has a strong effect on \(\hat{\rho }_k(t)\) when \(kt \sim \ell \tau \), which is referred to as reaction. These nonlinear resonances are exactly the plasma echoes of [58], and arise due to interaction with the oscillations in the velocity variable. These effects can potentially amplify high frequencies (e.g. costing regularity) at localized times of strong interaction. A related effect will appear in (2.5) when \(\rho \) forces g via interaction with lower frequencies, and we will refer to this as reaction as well. It was shown in [67] via formal heuristics that an infinite cascade of echoes could lose Gevrey-\((2 + \gamma )^{-1}\) regularity, and hence the restriction \(s > (2 + \gamma )^{-1}\); an informal version of this computation is sketched just after this list.

  3. The density \(\rho \) is a restriction of the Fourier transform of g and the nonlinear terms in (2.5) and (2.6) each involve Fourier restrictions. This issue was treated in [67] by adapting carefully the norms used in order to keep under control some \(L^1\)-based norms of regularity.
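
To indicate informally where the critical exponent comes from (this is only a caricature of the worst-case computations of [67] and of §6 below, written in \(d = 1\)): an echo from the mode \(\ell = k+1\) at time \(\tau \approx \eta /(k+1)\) onto the mode k at time \(t \approx \eta /k\) carries, through the kernel \(\widehat{W}(\ell )\ell \cdot k(t-\tau )\) in (2.6), a factor of size roughly \(\eta \left| k\right| ^{-1-\gamma }\), active over a time window of length \(O(1/\ell )\); hence each echo in the chain amplifies by roughly \(C\epsilon \eta k^{-(2+\gamma )}\), and composing the chain from \(k_0\) down to 1 gives

$$\begin{aligned} \prod _{k=1}^{k_0} \frac{C\epsilon \eta }{k^{2+\gamma }} \approx \frac{(C\epsilon \eta )^{k_0}}{(k_0!)^{2+\gamma }} \lesssim \exp \left( c\, (\epsilon \eta )^{\frac{1}{2+\gamma }}\right) , \end{aligned}$$

a frequency growth consistent with (and no worse than) the loss of a Gevrey-\((2+\gamma )^{-1}\) radius of regularity; this is also why \(\lambda (t)\) below is chosen to decrease slowly in time (see (2.8)).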

The proof of [67] employs a global-in-time Newton scheme which loses a decreasing amount of analytic regularity at each step. The linearization of the Newton scheme provides a natural way to isolate transport effects from reaction effects. The transport is treated by Lagrangian methods to estimate analytic regularity along the characteristic trajectories. The reaction effects are treated via time-integrated estimates on (2.6) that account carefully for the localized, time-delayed effects of the plasma echoes. The proof treats (2.5) and (2.6) as a coupled system in the sense that the main estimates are two coupled but separate controls, one on g and one on \(\rho \). To extend the results to Gevrey regularity, a frequency decomposition of the initial data is employed so that at each step in the scheme everything is analytic.

Here the two different mechanisms of transport and reaction are recovered by a rather different approach, employed recently in the proof of inviscid damping [10]. We use a paraproduct decomposition in order to split the bilinear terms as

$$\begin{aligned} G_1 G_2 = (G_1)_{{\mathrm{lower}}} (G_2)_{{\mathrm{higher}}} + (G_1)_{{\mathrm{higher}}} (G_2)_{{\mathrm{lower}}} + (G_1 G_2)_{{\mathrm{similar~ frequencies}}}. \end{aligned}$$

In a general sense, one of the first two terms on the RHS will capture the transport effects and the other will capture the reaction effects. The last term is a remainder which roughly corresponds to the quadratic error term in the Newton iteration. Indeed, the paraproduct decomposition can be thought of as formally linearizing the evolution of higher frequencies around lower frequencies.
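
Concretely, in terms of the Littlewood-Paley projections and the dyadic set \({\mathbb {D}}\) of §3, the decomposition takes the schematic form (the frequency-separation ratio, written here as N/8, is purely illustrative; the precise cutoffs are fixed in §3):

$$\begin{aligned} G_1 G_2 = \sum _{N \in {\mathbb {D}}} (G_1)_{<N/8}\, (G_2)_{N} + \sum _{N \in {\mathbb {D}}} (G_1)_{N}\, (G_2)_{<N/8} + \sum _{N, N^\prime \in {\mathbb {D}}:\, N/8 \le N^\prime \le 8N} (G_1)_{N^\prime }\, (G_2)_{N}, \end{aligned}$$

where \((G)_N\) projects onto frequencies of size approximately N and \((G)_{<N/8} = \sum _{N^\prime \in {\mathbb {D}}:\, N^\prime < N/8} (G)_{N^\prime }\).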

As in [10], the transport terms in (2.5) are treated via an adaptation of the Gevrey energy methods of [25, 50]. The essential content is a commutator estimate to take advantage of the cancellations inherent in the natural transport structure. For this step to work we need to use \(L^2\) based norms, and so to deal with the Fourier restrictions, we use norms with polynomial weights in velocity.

The reaction effects in (2.6) are treated here by making use of a refinement of the integral estimates on \(\rho \) of [67] but with some important conceptual changes inspired from [10]: the loss of Gevrey regularity can occur along time rather than iteratively in a Newton scheme. Together with the paraproduct decompositions, this allows us to significantly shorten the proof.

Since we do not use a Newton scheme, we have an additional new constraint: we are not allowed to lose any derivatives in our coupled estimates on \(\rho \) and g, a problem due to the derivative of g appearing in (2.5)–(2.6). However, since regularity can be traded for decay, we solve this problem by propagating controls on both “high” and “low” norms of regularity, the “high” norm being allowed to grow mildly in time, a general scheme which is common in the study of wave and dispersive equations (see e.g. [28, 44, 53] and the references therein).

Gevrey Functional Setting

As discussed in the previous section, our goal now is to use the system (2.5)–(2.6) to derive a uniform control on the regularity of g. Unlike the norms used in [10] and [67] we will only need the standard norm \(\mathcal {G}^{\lambda ,\sigma ;s}\) (and the variant \(\mathcal {F}^{\lambda ,\sigma ;s}\)) defined in (1.8) with time-dependent \(\lambda (t)\). For future notational convenience, we define the Fourier multiplier A(t) such that \(\left\| A(t)g\right\| _2 = \left\| g\right\| _{\mathcal {G}^{\lambda (t),\sigma ;s}}\):

$$\begin{aligned} A_k(t,\eta )&= e^{\lambda (t)\langle k,\eta \rangle ^s} \langle k,\eta \rangle ^\sigma , \end{aligned}$$
(2.7)

where \(\sigma > d+8\) is fixed and \(\lambda (t)\) is an index (or ‘radius’) of regularity which is decreasing in time. In the sequel, we also denote for \(\sigma ^\prime \in {\mathbb {R}}\):

$$\begin{aligned} A^{(\sigma ^\prime )}_k(t,\eta )&= e^{\lambda (t)\langle k,\eta \rangle ^s} \langle k,\eta \rangle ^{\sigma + \sigma ^\prime }. \end{aligned}$$

We will choose \(\lambda \) and s so as to absorb the potential loss of regularity due to the plasma echoes. In particular, the choice \(s > (2+\gamma )^{-1}\) will be necessary to ensure that the nonlinear plasma echoes do not destabilize the phase mixing mechanism. This restriction is used only in equations (6.6) and (6.9) of §6. Additionally, in order to absorb the loss of regularity from the echoes, we will need to choose \(\lambda (t)\) to decay slowly in time; this restriction is also only used in (6.6) and (6.9) of §6. It will suffice to make the following choices:

$$\begin{aligned} s > (2+\gamma )^{-1}, \quad \quad \alpha _0 = \frac{\lambda _0}{2} + \frac{\lambda ^\prime }{2}, \quad \quad a = \frac{(2+\gamma )s-1}{(1+\gamma )}, \end{aligned}$$

(note \(0 < a < s\) if \(s < 1\)) and

$$\begin{aligned} \lambda (t) = \frac{1}{8}\left( \lambda _0 - {\lambda ^\prime }\right) \left( 1-t\right) _+ + \alpha _0 + \frac{1}{4}\left( \lambda _0 - {\lambda ^\prime }\right) \min \left( 1,\frac{1}{t^a}\right) . \end{aligned}$$
(2.8)

Then \(\alpha _0 \le \lambda (t) \le 7\lambda _0/8 + {\lambda ^\prime }/\,8\) and the derivative never vanishes:

$$\begin{aligned} \dot{\lambda }(t) \lesssim -\frac{a(\lambda _0-{\lambda ^\prime })}{\langle t \rangle ^{1+a}}. \end{aligned}$$
(2.9)
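
Indeed, \(\lambda \) is decreasing and its maximum is attained at \(t = 0\),

$$\begin{aligned} \lambda (0) = \frac{1}{8}\left( \lambda _0 - {\lambda ^\prime }\right) + \frac{\lambda _0 + {\lambda ^\prime }}{2} + \frac{1}{4}\left( \lambda _0 - {\lambda ^\prime }\right) = \frac{7}{8}\lambda _0 + \frac{1}{8}{\lambda ^\prime }, \end{aligned}$$

while \(\lambda (t) \searrow \alpha _0\) as \(t \rightarrow \infty \); the bound (2.9) is read off from (2.8) by treating \(t \le 1\) and \(t > 1\) separately.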

Norms such as \({\mathcal {G}}^{\lambda (t),\sigma ;s}\) are common when dealing with analytic or Gevrey regularity, for example, see the works [10, 21, 22, 25, 27, 45, 50, 67]. The Sobolev correction \(\sigma \) is included mostly for technical convenience. This correction allows us to avoid paying Gevrey regularity where Sobolev regularity would suffice; for example, as \(\lambda (t) \gtrsim 1\) and \(s < 1\), \({\mathcal {G}}^{\lambda (t),0;s}\) is an algebra; however, it is simpler to use instead that \(H^\sigma ({\mathbb {T}}^d \times {\mathbb {R}}^d)\) is an algebra for \(\sigma > d\). In order to study the analytic case, \(s = 1\), we would need to add an additional Gevrey-\(\frac{1}{\zeta }\) correction with \((2+\gamma )^{-1} < \zeta < 1\) as an intermediate regularity (and define a in terms of \(\zeta \)) so that we may take advantage of beneficial properties of Gevrey norms (see, for example, Lemma 3.3). For the duration of the proof we assume \(s < 1\) and do not address the additional minor technical issues in the limiting case \(s=1\).

The reason for using \(\langle \cdot \rangle \) in the definition of \(A_k\) and \(A_k^{(\sigma ')}\) in (2.7), as opposed to \(\left| \cdot \right| \), is so that for all \(\alpha \in {\mathbb {N}}^d\) and \(\sigma ^\prime \in {\mathbb {R}}\),

$$\begin{aligned} \left| D_\eta ^\alpha A^{(\sigma ^\prime )}_k(t,\eta )\right|&\lesssim _{\left| \alpha \right| ,\lambda _0,\sigma ^\prime } \frac{1}{\langle k,\eta \rangle ^{\left| \alpha \right| (1-s)}}A^{(\sigma ^\prime )}_k(t,\eta ), \end{aligned}$$
(2.10)

which is useful when estimating velocity moments.
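
For \(\left| \alpha \right| = 1\) this is a direct computation (the general case follows by iterating it): using \(\left| \partial _{\eta _j} \langle k,\eta \rangle \right| \le 1\) and \(\langle k,\eta \rangle ^{-1} \le \langle k,\eta \rangle ^{s-1}\),

$$\begin{aligned} \left| \partial _{\eta _j} A^{(\sigma ^\prime )}_k(t,\eta )\right| \le \Big ( \lambda (t)\, s\, \langle k,\eta \rangle ^{s-1} + \left| \sigma + \sigma ^\prime \right| \langle k,\eta \rangle ^{-1}\Big ) A^{(\sigma ^\prime )}_k(t,\eta ) \lesssim _{\lambda _0,\sigma ^\prime } \frac{A^{(\sigma ^\prime )}_k(t,\eta )}{\langle k,\eta \rangle ^{1-s}}; \end{aligned}$$

with \(\left| k,\eta \right| \) in place of \(\langle k,\eta \rangle \) the corresponding prefactor would degenerate at low frequencies.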

Uniform in Time Regularity Estimates

In this section we set up the continuity argument we use to derive a uniform bound on \((\rho ,g)\). In order to ensure the formal computations are rigorous, we first regularize the initial data to be analytic. The following standard lemma provides local existence of a unique analytic solution which remains analytic as long as a suitable Sobolev norm remains finite. The local existence of analytic solutions can be proved with an abstract Cauchy-Kovalevskaya theorem, see for example [69, 70]. The propagation of analyticity by Sobolev regularity can be proved by a variant of the arguments in [50] along with the inequality \(\left\| B\rho \right\| _2 \lesssim \sum _{\alpha \le M} \left\| v^\alpha B g\right\| _2\) for all Fourier multipliers B (with our notation (1.13)) and all integers \(M > d/2\). We omit the proof of Lemma 2.1 for brevity.
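
For the reader's convenience, here is the short argument behind the stated inequality (a sketch, with the constants from the Fourier conventions of §3.1 suppressed): since \(\hat{\rho }_k(t) = \hat{g}_k(t,kt)\) and B acts as a Fourier multiplier,

$$\begin{aligned} \left\| B\rho (t)\right\| _2^2 \approx \sum _{k \in \mathbb {Z}^d} \left| \widehat{Bg}_k(t,kt)\right| ^2 \lesssim \sum _{k \in \mathbb {Z}^d} \left\| \widehat{Bg}_k(t,\cdot )\right\| ^2_{H^M_\eta } \approx \sum _{\alpha \in {\mathbb {N}}^d : \left| \alpha \right| \le M} \left\| v^\alpha B g(t)\right\| _2^2, \end{aligned}$$

using the \(H^{M}_\eta \hookrightarrow L^\infty _\eta \) embedding (as \(M > d/2\)) and the relation \(\widehat{v^\alpha B g} = D^\alpha _\eta \widehat{Bg}\).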

Lemma 2.1

(Local existence and propagation of analyticity) Let \(M > d/2\) be an integer and \(\tilde{\lambda }>0\). Suppose \(h_{{\mathrm{{in}}}}\) is analytic such that

$$\begin{aligned} \sum _{\alpha \in {\mathbb {N}}^d: \left| \alpha \right| \le M}\left\| v^\alpha h_{{\mathrm{{in}}}}\right\| _{\mathcal {G}^{\tilde{\lambda };1}} < \infty . \end{aligned}$$

Then there exists some \(T_0 > 0\) such that there exists a unique analytic solution g(t) to (2.4) on [0, T] for all \(T < T_0\) and for some index \(\tilde{\lambda }(t)\) with \(\inf _{t \in [0,T]} \tilde{\lambda }(t) > 0\) we have,

$$\begin{aligned} \sup _{t \in [0,T]}\sum _{\alpha \in {\mathbb {N}}^d: \left| \alpha \right| \le M}\left\| v^\alpha g(t)\right\| _{\mathcal {G}^{\tilde{\lambda }(t);1}} < \infty . \end{aligned}$$

Moreover, if for some \(T \le T_0\) and \(\sigma > d/2\), \(\limsup _{t \nearrow T} \sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M}\left\| v^\alpha g(t)\right\| _{H^{\sigma }_{x,v}} < \infty \), then \(T < T_0\).

Remark 10

If \(d \le 3\) and the solution has finite kinetic energy, then global analytic solutions to (1.2) exist even for large data. See the classical results [12, 41, 54, 74, 78] for the global existence of strong solutions (from which analyticity can be propagated by a variant of e.g. [50]). We remark that in \(d \ge 4\), finite time blow-up is possible at least for gravitational interactions [48], however, this case is still covered by Theorem 1.

Remark 11

To treat the case \(s = 1\) in Theorem 1, we would need to be slightly more careful in applying Lemma 2.1. In this case, we may regularize the data to a larger radius of analyticity \(\tilde{\lambda } > \lambda _0\), perform our a priori estimates until \(\tilde{\lambda }(t) = \lambda (t)\), at which point we may stop, re-regularize and restart iteratively.

Lemma 2.1 implies that as long as we retain control on the moments and regularity of the regularized solutions, they exist and remain analytic. We perform the a priori estimates on these solutions, for which the computations are rigorous, and then we may pass to the limit to show that the original solutions exist globally and satisfy the same estimates as the regularized solutions. For the remainder of the paper, we omit these details and discuss only the a priori estimates.

For constants \(K_i\), \(1 \le i \le 3\) fixed in the proof depending only on \(C_0,\bar{\lambda },\kappa \), s, d, \(\lambda _0\) and \({\lambda ^\prime }\), let \(I \subset {\mathbb {R}}_+\) be the largest interval of times such that \(0 \in I\) and the following controls hold for all \(t \in I\):

$$\begin{aligned} \sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M}\left\| \langle \nabla _{z,v} \rangle A(v^\alpha g)(t)\right\| ^2_{2}&= \sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M}\left\| A^{(1)}(v^\alpha g)(t)\right\| ^2_{2} \le 4K_1\langle t \rangle ^7\epsilon ^2, \end{aligned}$$
(2.11a)
$$\begin{aligned} \sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M}\left\| \langle \nabla _{z,v} \rangle ^{-\beta }A(v^\alpha g)(t)\right\| ^2_{2}&= \sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M}\left\| A^{(-\beta )}(v^\alpha g)(t)\right\| ^2_{2} \le 4K_2\epsilon ^2, \end{aligned}$$
(2.11b)
$$\begin{aligned} \int _0^t \left\| A\rho (\tau )\right\| _2^2 {\, \mathrm d}\tau&\le 4K_3\epsilon ^2, \end{aligned}$$
(2.11c)

where we may fix \(\beta > 2\) arbitrary and \(\epsilon \) satisfies a certain smallness assumption as in Theorem 1. Recall the definition of A in (2.7). It is clear from the assumptions that if \(K_i \ge 1\) then \(0 \in I\). The primary step in the proof of Theorem 1 is to show that \(I = [0,\infty )\).

For the regularized solutions it will be clear from the ensuing arguments that the quantities on the LHS of (2.11) are continuous in time, from which it follows that I is relatively closed in \({\mathbb {R}}_+\). Hence define \(T^\star \le \infty \) such that \(I = [0,T^\star ]\). In order to prove that \(T^\star = \infty \) it suffices to establish that I is also relatively open, which is implied by the following bootstrap.

Proposition 2.2

(Bootstrap) There exist \(\epsilon _0, K_i\) depending only on \(d,M,\bar{\lambda },C_0,\kappa ,\lambda _0,{\lambda ^\prime }\) and s such that if (2.11) holds on some time interval \([0,T^\star ]\) and \(\epsilon < \epsilon _0\), then for \(t \le T^\star \),

$$\begin{aligned}&\displaystyle \sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M}\left\| \langle \nabla _{z,v} \rangle A(v^\alpha g)(t)\right\| ^2_{2} = \sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M} \left\| A^{(1)}(v^\alpha g)(t)\right\| ^2_{2} < 2K_1\langle t \rangle ^7 \epsilon ^2 \nonumber \\ \end{aligned}$$
(2.12a)
$$\begin{aligned}&\displaystyle \sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M}\left\| \langle \nabla _{z,v} \rangle ^{-\beta }A(v^\alpha g)(t)\right\| ^2_{2} =\sum _{\alpha \in {\mathbb {N}}^d:\left| \alpha \right| \le M} \left\| A^{(-\beta )} (v^\alpha g)(t)\right\| ^2_{2} < 2K_2\epsilon ^2 \nonumber \\ \end{aligned}$$
(2.12b)
$$\begin{aligned}&\displaystyle \int _0^{t} \left\| A\rho (\tau )\right\| _2^2 {\, \mathrm d}\tau < 2K_3\epsilon ^2, \end{aligned}$$
(2.12c)

from which it follows that \(T^\star = \infty \).

Once Proposition 2.2 is deduced, Theorem 1 follows quickly. This is carried out in §7.

Remark 12

In order to close the bootstrap in Proposition 2.2 we need to understand how, or if, the constants \(K_i\) depend on each other so that we can be sure that they can be chosen self-consistently. In fact, \(K_3\) is determined by the linearized evolution (from \(C_{LD}\) in Lemma 4.1) then \(K_1\) is fixed in §5.3 depending on \(K_3\) (and s,d,\(\lambda _0\),\({\lambda ^\prime }\)) and \(K_2\) is analogously fixed in §5.4 depending on \(K_3\) (and s, d, \(\lambda _0\), \({\lambda ^\prime }\) but not directly \(K_1\)). Finally, \(\epsilon _0\) is chosen small with respect to everything.

Remark 13

The imbalance of a whole derivative between (2.12c) and (2.12a) uses the regularization of the interaction potential and is the only aspect of the proof which requires \(\gamma \ge 1\).

The purpose of the weights in velocity is to control derivatives of the Fourier transform so that the trace Lemma 3.4 and the \(H^{d/2+} \hookrightarrow C^0\) embedding theorem can be applied to restrict the Fourier transform along rays and pointwise. Both are necessary to deduce the controls on the density in §5.1 and §5.2. Moreover, from (2.11b) and (2.10) we have,

$$\begin{aligned} \left\| A^{(-\beta )}\hat{g}\right\| _{L_k^2H^M_\eta }&\lesssim \sum _{\alpha \in {\mathbb {N}}^d: \left| \alpha \right| \le M} \left\| D^\alpha _\eta A^{(-\beta )}\hat{g}\right\| _{L_k^2 L_\eta ^2}\nonumber \\&\le \sum _{\alpha \in {\mathbb {N}}^d: \left| \alpha \right| \le M}\sum _{j \le \alpha } \frac{\alpha !}{j!(\alpha -j)!}\left\| (D_\eta ^{\alpha -j}A^{(-\beta )})(D_\eta ^{j}\hat{g})\right\| _{L_k^2 L_\eta ^2} \nonumber \\&\lesssim _{M} \sqrt{K_2} \epsilon . \end{aligned}$$
(2.13)

Similarly, (2.11a) implies

$$\begin{aligned} \left\| A^{(1)}\hat{g}\right\| _{L_k^2 H^M_\eta } \lesssim _{M} \sqrt{K_1} \epsilon \langle t \rangle ^{7/2}. \end{aligned}$$
(2.14)

Let us briefly summarize how Proposition 2.2 is proved. The main step is the time-integral estimate on the \(L^2\) norm of the density in (2.12c), deduced in §5.1. This is done by analyzing (2.6). The linear term in (2.6) is treated using a Fourier-Laplace transform and (L) as in [67] with a technical time decomposition in order to get an estimate in Gevrey regularity using A(t). The nonlinear term in (2.6) is decomposed using a paraproduct into reaction, transport and remainder terms. As discussed above in §2.2, the reaction term in (2.6) is connected to the plasma echoes. Once the paraproduct decomposition and (2.11b) have allowed us to isolate this effect, it is treated with an adaptation of §7 in [67], carried out in §6 (our treatment is in the same spirit but not quite the same). The transport term describes the interaction of \(\rho \) with ‘higher’ frequencies of g; this effect is controlled using (2.11a). Once the time-integral estimate has been established, we also derive a relatively straightforward pointwise-in-time control in §5.2.

The estimate (2.12a) is deduced in §5.3 via an energy estimate in the spirit of [10]. A paraproduct is again used to decompose the nonlinearity into reaction, transport and remainder terms. As in [10], the transport term is treated using an adaptation of the methods of [25, 45, 50]. However, perhaps more like [67], the reaction term is treated using (2.11c). The time growth is due to the fact that there is no regularity available to transfer to decay; that the estimate is closable at all requires the regularization from \(\gamma \ge 1\). The low norm estimate (2.12b) is proved in §5.4 in a fashion similar to that of (2.12a) (the uniform bound is possible due to the regularity gap of \(\beta > 2\) derivatives between (2.11c) and (2.12b), which can be traded for time decay on \(\rho \)).

Toolbox

In this section we review some of the technical tools used in the proof of Theorem 1: the Littlewood-Paley dyadic decomposition, paraproducts and useful inequalities for working in Gevrey regularity.

Fourier Analysis Conventions

For a function \(g=g(z,v)\) we write its Fourier transform \(\hat{g}_k(\eta )\) where \((k,\eta ) \in \mathbb {Z}^d \times {\mathbb {R}}^d\) with

$$\begin{aligned} \hat{g}_k(\eta )&:= \frac{1}{(2\pi )^{d}}\int _{{\mathbb {T}}^d \times {\mathbb {R}}^d} e^{-i z k - iv\eta } g(z,v) \, \mathrm{d}z\, \mathrm{d}v, \\ g(z,v)&:= \frac{1}{(2\pi )^{d}}\sum _{k \in \mathbb {Z}^d} \int _{{\mathbb {R}}^d} e^{i z k + iv\eta } \hat{g}_k(\eta ) {\, \mathrm d}\eta . \end{aligned}$$

We use an analogous convention for the Fourier transforms of functions of x or v alone.

With these conventions we have the following relations:

$$\begin{aligned} {\left\{ \begin{array}{ll}\displaystyle \int _{{\mathbb {T}}^d \times {\mathbb {R}}^d} g(z,v) \overline{g}(z,v) \, \mathrm{d}z\, \mathrm{d}v\displaystyle = \sum _{k \in {\mathbb {Z}}^d}\int _{{\mathbb {R}}^d} \hat{g}_k(\eta ) \overline{\hat{g}}_{k}(\eta ) {\, \mathrm d}\eta , \\ \displaystyle \widehat{g^1g^2} = \displaystyle \frac{1}{(2\pi )^{d}}\hat{g^1} *\hat{g^2},\\ \displaystyle (\widehat{\nabla g})_k(\eta ) = (ik,i\eta )\hat{g}_k(\eta ), \\ \displaystyle (\widehat{v^\alpha g})_k(\eta ) = D_\eta ^\alpha \hat{g}_k(\eta ). \end{array}\right. } \end{aligned}$$

The following versions of Young’s inequality occur frequently in the proof.

Lemma 3.1

  1. (a)

    Let \(g_k^1(\eta ),g_k^2(\eta ) \in L^2(\mathbb {Z}^d \times {\mathbb {R}}^d)\) and \(\langle k \rangle ^\sigma r_k(t) \in L^2(\mathbb {Z}^d)\) for \(\sigma > d/2\). Then, for any \(t \in {\mathbb {R}}\),

    $$\begin{aligned} \left| \sum _{k,\ell } \int _\eta g_k^1(\eta ) r_\ell (t) g_{k-\ell }^2(\eta -t\ell ) {\, \mathrm d}\eta \right| \lesssim _{d,\sigma } \left\| g^1\right\| _{L^2_{k,\eta } }\left\| g^2\right\| _{L^2_{k,\eta } } \left\| \langle k \rangle ^\sigma r(t)\right\| _{L_k^2}. \end{aligned}$$
    (3.1)
  2. (b)

    Let \(g_k^1(\eta ), \langle k \rangle ^\sigma g_k^2(\eta ) \in L^2(\mathbb {Z}^d \times {\mathbb {R}}^d)\) and \(r_k \in L^2(\mathbb {Z}^d)\) for \(\sigma > d/2\). Then, for any \(t \in {\mathbb {R}}\),

    $$\begin{aligned} \left| \sum _{k,\ell } \int _\eta g_k^1(\eta ) r_\ell (t) g_{k-\ell }^2(\eta -t\ell ) {\, \mathrm d}\eta \right| \lesssim _{d,\sigma } \left\| g^1\right\| _{L^2_{k,\eta } }\left\| \langle k \rangle ^{\sigma }g^2\right\| _{L^2_{k,\eta } } \left\| r(t)\right\| _{L_k^2}. \end{aligned}$$
    (3.2)

Proof

To prove (a):

$$\begin{aligned}&\left| \sum _{k,\ell } \int _\eta g_k^1(\eta ) r_\ell g_{k-\ell }^2(\eta -t\ell ) {\, \mathrm d}\eta \right| \\&\quad \le \sum _{k} \left( \int _{\eta } \left| g_k^1(\eta )\right| ^2 {\, \mathrm d}\eta \right) ^{1/2} \sum _{\ell } r_\ell \left( \int _{\eta } \left| g_{k-\ell }^2(\eta -\ell t)\right| ^2 {\, \mathrm d}\eta \right) ^{1/2} \\&\quad = \sum _{k} \left( \int _{\eta } \left| g_k^1(\eta )\right| ^2 {\, \mathrm d}\eta \right) ^{1/2} \sum _{\ell } r_\ell \left( \int _{\eta } \left| g_{k-\ell }^2(\eta )\right| ^2 {\, \mathrm d}\eta \right) ^{1/2} \\&\quad \le \left( \sum _{k}\int _{\eta }\left| g_k^1(\eta )\right| ^2 {\, \mathrm d}\eta \right) ^{1/2} \left[ \sum _{k} \left( \sum _{\ell } r_\ell \left( \int _{\eta } \left| g_{k-\ell }^2(\eta )\right| ^2 {\, \mathrm d}\eta \right) ^{1/2}\right) ^{2}\right] ^{1/2} \\&\quad \le \left\| g^1\right\| _{L^2_{k,\eta }} \left\| g^2\right\| _{L^2_{k,\eta }} \sum _{\ell } \left| r_{\ell }\right| \\&\quad \lesssim _{d,\sigma } \left\| g^1\right\| _{L^2_{k,\eta }} \left\| g^2\right\| _{L^2_{k,\eta }} \left\| \langle k \rangle ^\sigma r_k\right\| _{L_k^2}, \end{aligned}$$

where the penultimate line followed from the \(L^2 \times L^1 \mapsto L^{2}\) Young’s inequality. The proof of (b) is analogous, simply putting \(g^2\) rather than r in \(L^1_k\). \(\square \)

Littlewood-Paley Decomposition

This work makes use of the Littlewood-Paley dyadic decomposition. Here we fix conventions and review the basic properties of this classical theory (see e.g. [5] for more details).

Define the joint variable \(\Xi := (k,\eta ) \in {\mathbb {Z}}^d \times {\mathbb {R}}^d\). Let \(\psi \in C_0^\infty ({\mathbb {Z}}^d \times {\mathbb {R}}^d;{\mathbb {R}})\) be a radially symmetric non-negative function such that \(\psi (\Xi ) = 1\) for \(\left| \Xi \right| \le 1/2\) and \(\psi (\Xi ) = 0\) for \(\left| \Xi \right| \ge 3/4\). Then we define \(\phi (\Xi ) := \psi (\Xi /2) - \psi (\Xi )\), a non-negative, radially symmetric function supported in the annulus \(\{ 1/2 \le |\Xi | \le 3/2 \}\). Then we define the rescaled functions \(\phi _N(\Xi ) = \phi (N^{-1}\Xi )\), which satisfy

$$\begin{aligned} supp \, \phi _N(\Xi ) = \{ N/2 \le \left| \Xi \right| \le 3N/2 \} \end{aligned}$$

and we have classically the partition of unity,

$$\begin{aligned} 1 = \psi (\Xi ) + \sum _{N \in 2^{\mathbb {N}}} \phi _N(\Xi ), \end{aligned}$$

where the sum runs over the dyadic numbers \(N = 1,2,4,8,\dots ,2^{j},\dots \) (observe that, for each \(\Xi \), there are always at most two non-zero terms in this sum).

For \(g \in L^2({\mathbb {T}}^d \times {\mathbb {R}}^d)\) we define

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle g_{N} &{} := \phi _N\left( \nabla _{z,v}\right) g, \\ g_{\frac{1}{2}} &{} := \psi \left( \nabla _{z,v}\right) g, \\ g_{< N} &{} := \displaystyle g_{\frac{1}{2}} + \sum _{N' \in 2^{{\mathbb {N}}} \, : \, N' < N} g_{N'}, \end{array}\right. } \end{aligned}$$

and we have the natural decomposition,

$$\begin{aligned} g = \sum _{N \in {\mathbb {D}}} g_N = g_{\frac{1}{2}} + \sum _{N \in 2^{\mathbb {N}}} g_N, \quad {\mathbb {D}} := \left\{ \frac{1}{2}, 1, 2, \dots , 2^j,\dots \right\} . \end{aligned}$$

Normally one would use \(g_0\) or \(g_{-1}\) rather than the slightly inconsistent \(g_{1/2}\); however, here \(g_0\) is reserved for the much more frequently used projection onto the zero mode in z or x.

There holds the almost orthogonality and the approximate projection property

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \sum _{N \in {\mathbb {D}}} \left\| g_N\right\| _2^2 \le \left\| g\right\| ^2_2 \le 2 \sum _{N \in {\mathbb {D}}} \left\| g_N\right\| _2^2 \\ \displaystyle \left\| (g_{N})_N\right\| _2 \le \left\| g_N\right\| _2. \end{array}\right. } \end{aligned}$$
(3.3)

Similarly to (3.3) but more generally, if \(g = \sum _{{\mathbb {D}}} g_N\) with \(\text{ supp } \, g_N \subset \{ C^{-1}N \le |\Xi | \le C N\}\) for \(N \ge 1\) and \(\text{ supp } \, g_{\frac{1}{2}} \subset \left\{ \left| \Xi \right| \le C\right\} \) then we have

$$\begin{aligned} \left\| g\right\| ^2_2 \approx _C \sum _{N \in {\mathbb {D}}} \left\| g_N\right\| _2^2. \end{aligned}$$
(3.4)

Moreover, the dyadic decomposition behaves nicely with respect to differentiation:

$$\begin{aligned} \left\| \langle \nabla _{z,v} \rangle g_N\right\| _2 \approx N \left\| g_N\right\| _2. \end{aligned}$$
(3.5)

We also define the notation

$$\begin{aligned} g_{\sim N} = \sum _{N' \in {\mathbb {D}} \, : \, C^{-1} N \le N' \le C N} g_{N'}, \end{aligned}$$

for some constant C which is independent of N. Generally the exact value of C which is being used is not important; what is important is that it is finite and independent of N.
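Before specializing the decomposition to the density, here is a minimal numerical sketch checking the telescoping partition of unity and the two-term overlap property noted above. It is an illustration only: no explicit bump function is fixed in the text, and the mollifier-type construction of \(\psi \) below is an assumption of the sketch, not the paper's choice.

```python
import numpy as np

def smooth_step(x):
    # C^infinity transition equal to 0 for x <= 0 and to 1 for x >= 1
    h = lambda y: np.where(y > 0, np.exp(-1.0 / np.maximum(y, 1e-300)), 0.0)
    return h(x) / (h(x) + h(1.0 - x))

def psi(r):
    # radial cutoff: equals 1 for r <= 1/2 and vanishes for r >= 3/4
    return smooth_step((0.75 - r) / 0.25)

def phi_N(r, N):
    # dyadic shell at scale N: phi(./N) with phi = psi(./2) - psi(.)
    return psi(r / (2.0 * N)) - psi(r / N)

r = np.linspace(0.0, 50.0, 2001)                      # |Xi| sampled radially
dyadic = [2.0 ** j for j in range(12)]                # N = 1, 2, 4, ..., 2048
total = psi(r) + sum(phi_N(r, N) for N in dyadic)     # telescopes to psi(r / 2**12) = 1 on this range
active = sum((phi_N(r, N) > 1e-12).astype(int) for N in dyadic)

print(np.max(np.abs(total - 1.0)))   # ~ 0 up to rounding: partition of unity
print(active.max())                  # 2: at most two non-zero terms at any frequency
```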

In some steps of the proof, we will apply the Littlewood-Paley decomposition to the spatial density \(\rho (t,x)\). In this case it will be more convenient to use the following definition that uses kt in place of the frequency in v, a natural convention when one recalls that \(\hat{\rho }_k(t) = \hat{g}_k(t,kt)\):

$$\begin{aligned} {\left\{ \begin{array}{ll} \widehat{\rho (t)}_N &{} = \phi _N(\left| k,kt\right| )\hat{\rho }_k(t), \\ \widehat{\rho (t)}_{\frac{1}{2}} &{} = \psi (\left| k,kt\right| )\hat{\rho }_k(t), \\ \widehat{\rho (t)}_{< N} &{} = \displaystyle \widehat{\rho (t)}_{\frac{1}{2}} + \sum _{N' \in 2^{{\mathbb {N}}}: N' < N} \widehat{\rho (t)}_{N'}. \end{array}\right. } \end{aligned}$$
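For completeness, the identity \(\hat{\rho }_k(t) = \hat{g}_k(t,kt)\) invoked above follows from the free-transport change of unknown (we assume here, as is standard for this system, that g denotes the perturbation in the gliding frame, \(g(t,z,v) = h(t,z+tv,v)\)): for each \(k \in {\mathbb {Z}}^d\),

$$\begin{aligned} \int _{{\mathbb {T}}^d} e^{-ik \cdot x} \rho (t,x) {\, \mathrm d}x = \int _{{\mathbb {T}}^d}\int _{{\mathbb {R}}^d} e^{-ik \cdot x} h(t,x,v) {\, \mathrm d}v {\, \mathrm d}x = \int _{{\mathbb {T}}^d}\int _{{\mathbb {R}}^d} e^{-ik \cdot z - i (kt)\cdot v} g(t,z,v) {\, \mathrm d}z {\, \mathrm d}v, \end{aligned}$$

so that, with consistent normalizations, \(\hat{\rho }_k(t) = \hat{g}_k(t,kt)\).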

Remark 14

We have opted to use the compact notation above, rather than the commonly used alternatives \(\Delta _{j}g = g_{2^j}\) and \(S_jg = g_{<2^j}\), in order to reduce the number of characters in long formulas.

The Paraproduct Decomposition

Another key Fourier analysis tool employed in this work is the paraproduct decomposition, introduced by Bony [15] (see also [5]). Given suitable functions \(g^1,g^2\) we may define the paraproduct decomposition (in either (z, v) or just v),

$$\begin{aligned} g^1g^2= & {} \sum _{N \ge 8} g^1_{<N/8}g^2_N + \sum _{N \ge 8} g^1_N g^2_{<N/8} + \sum _{N \in {\mathbb {D}}}\sum _{N/8 \le N^\prime \le 8N} g^1_{N} g^2_{N^\prime }\nonumber \\:= & {} T_{g^1}g^2 + T_{g^2}g^1 + {\mathcal {R}}\left( g^1,g^2\right) \end{aligned}$$
(3.6)

where all the sums are understood to run over \({\mathbb {D}}\).

The advantage of this decomposition in the energy estimates is that the first term \(T_{g^1} g^2\) carries the highest derivatives on \(g^2\) while allowing us to exploit the frequency cutoff on the first function \(g^1\), whereas the second term \(T_{g^2} g^1\) carries the highest derivatives on \(g^1\) while allowing us to exploit the frequency cutoff on the second function \(g^2\). Finally, the last “remainder” term contains the contribution of comparable frequencies, which makes it possible to split the regularity evenly between the two factors.
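To make the gain from the frequency cutoff explicit, note that each summand of \(T_{g^1}g^2\) is itself localized to an annulus of size N: since \(\text{ supp } \, \widehat{g^1_{<N/8}} \subset \{\left| \Xi \right| \le 3N/32\}\) and \(\text{ supp } \, \widehat{g^2_N} \subset \{N/2 \le \left| \Xi \right| \le 3N/2\}\), the convolution structure of the product gives

$$\begin{aligned} \text{ supp } \, \widehat{g^1_{<N/8}\, g^2_N} \subset \left\{ \frac{N}{2} - \frac{3N}{32} \le \left| \Xi \right| \le \frac{3N}{2} + \frac{3N}{32}\right\} = \left\{ \frac{13N}{32} \le \left| \Xi \right| \le \frac{51N}{32}\right\} , \end{aligned}$$

and similarly for \(T_{g^2}g^1\). This is what allows (3.4) to be applied to the first two terms of (3.6), and it is the source of comparability constraints such as (3.15c) below.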

Elementary Inequalities for Gevrey Regularity

In this section we discuss a set of elementary, but crucial, inequalities for working in Gevrey regularity spaces. First we point out that Gevrey and Sobolev regularities can be related with the following two inequalities.

  1. (i)

    For \(x \ge 0\), \(\alpha > \beta \ge 0\), \(C,\delta > 0\),

    $$\begin{aligned} e^{Cx^{\beta }} \le e^{C\left( \frac{\alpha -\beta }{\alpha } \right) \left( \frac{C}{\delta }\right) ^{\frac{\beta }{\alpha - \beta }}} e^{\delta x^{\alpha }}. \end{aligned}$$
    (3.7)
  2. (ii)

    For \(x \ge 0\), \(\alpha > 0\), \(\sigma ,\delta > 0\),

    $$\begin{aligned} e^{-\delta x^{\alpha }} \lesssim _{\sigma , \alpha } \frac{1}{\delta ^{\frac{\sigma }{\alpha }} \langle x \rangle ^{\sigma }}. \end{aligned}$$
    (3.8)
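For the reader's convenience, (3.7) (for \(\beta > 0\); the case \(\beta = 0\) is immediate) follows by optimizing the exponent: for \(x \ge 0\),

$$\begin{aligned} Cx^{\beta } - \delta x^{\alpha } \le \max _{y \ge 0}\left( Cy^{\beta } - \delta y^{\alpha }\right) = C\left( \frac{\alpha -\beta }{\alpha }\right) \left( \frac{C\beta }{\delta \alpha }\right) ^{\frac{\beta }{\alpha -\beta }} \le C\left( \frac{\alpha -\beta }{\alpha }\right) \left( \frac{C}{\delta }\right) ^{\frac{\beta }{\alpha -\beta }}, \end{aligned}$$

the maximum being attained at \(y_*^{\alpha -\beta } = C\beta /(\delta \alpha )\); (3.8) follows from the same kind of optimization applied to \(\langle x \rangle ^{\sigma } e^{-\delta x^{\alpha }}\).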

Next, we state several useful inequalities regarding the weight \(\langle x \rangle = (1 + \left| x\right| ^2)^{1/2}\). In particular, the improvements to the triangle inequality for \(s < 1\) given in (3.10), (3.11) and (3.12) are important for getting useful bilinear (and trilinear) estimates. The proof is straightforward and is omitted.

Lemma 3.2

Let \(0 < s < 1\) and \(x,y \ge 0\).

  1. (i)

    We have the triangle inequalities (which hold also for \(s = 1\)),

    $$\begin{aligned}&\displaystyle \langle x + y \rangle ^s \le \langle x \rangle ^s + \langle y \rangle ^s&\end{aligned}$$
    (3.9a)
    $$\begin{aligned}&\displaystyle \left| \langle x \rangle ^s - \langle y \rangle ^s\right| \le \langle x-y \rangle ^s&\end{aligned}$$
    (3.9b)
    $$\begin{aligned}&\displaystyle C_s\left( \langle x \rangle ^s + \langle y \rangle ^s\right) \le \langle x+y \rangle ^s,&\end{aligned}$$
    (3.9c)

    for some \(C_s > 0\) depending only on s.

  2. (ii)

    In general,

    $$\begin{aligned} \left| \langle x \rangle ^s - \langle y \rangle ^s\right| \lesssim _s \frac{1}{\langle x \rangle ^{1-s} + \langle y \rangle ^{1-s}}\langle x-y \rangle . \end{aligned}$$
    (3.10)
  3. (iii)

    If \(\left| x-y\right| \le x/K\) for some \(K > 1\), then we have the improved triangle inequality

    $$\begin{aligned} \left| \langle x \rangle ^s - \langle y \rangle ^s\right| \le \frac{s}{(K-1)^{1-s}}\langle x-y \rangle ^s. \end{aligned}$$
    (3.11)
  4. (iv)

    We have the improved triangle inequality for \(x \ge y\),

    $$\begin{aligned} \langle x + y \rangle ^s \le \left( \frac{\langle x \rangle }{\langle x+y \rangle }\right) ^{1-s}\left( \langle x \rangle ^s + \langle y \rangle ^s\right) . \end{aligned}$$
    (3.12)
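As a sample of the omitted computations, (3.11) follows from the mean value theorem: if \(0 < \left| x-y\right| \le x/K\) then \(\min (x,y) \ge x(K-1)/K \ge (K-1)\left| x-y\right| \), and hence, for some \(\xi \) between x and y,

$$\begin{aligned} \left| \langle x \rangle ^s - \langle y \rangle ^s\right| = s\,\xi \langle \xi \rangle ^{s-2}\left| x-y\right| \le s\,\xi ^{s-1}\left| x-y\right| \le \frac{s}{(K-1)^{1-s}}\left| x-y\right| ^{s} \le \frac{s}{(K-1)^{1-s}}\langle x-y \rangle ^{s}. \end{aligned}$$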

The following product lemma will be used several times in the sequel. Notice that \(\tilde{c} < 1\) for \(s < 1\), which shows that we gain something by working in Gevrey spaces with \(s < 1\); indeed this important gain is used many times in the nonlinear estimates. We sketch the proof as it provides a representative example of arguments used several times in §5.

Lemma 3.3

(Product Lemma) For all \(0<s<1\), and \(\sigma \ge 0\) there exists a \(\tilde{c} = \tilde{c}(s,\sigma ) \in (0,1)\) such that the following holds for all \(\lambda > 0\), \(g^1,g^2 \in {\mathcal {G}}^{\lambda ,\sigma ;s}({\mathbb {T}}^d \times {\mathbb {R}}^d)\) and \(r(t) \in {\mathcal {F}}^{\lambda ,\sigma ;s}({\mathbb {T}}^d)\):

$$\begin{aligned}&\sum _{k \in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}^d} \int _{{\mathbb {R}}^d} \langle k,\eta \rangle ^{2\sigma } e^{2\lambda \langle k,\eta \rangle ^s} \left| \hat{g^1}_k(\eta )\hat{r}_{\ell }(t)\hat{g^2}_{k-\ell }(\eta -\ell t)\right| {\, \mathrm d}\eta&\nonumber \\&\qquad \lesssim _{\lambda ,\sigma ,s,d} \left\| g^1\right\| _{\mathcal {G}^{\lambda ,\sigma ;s}}\left\| g^2\right\| _{\mathcal {G}^{\tilde{c}\lambda ,0;s}}\left\| r(t)\right\| _{\mathcal {F}^{\lambda ,\sigma ;s}} + \left\| g^1\right\| _{\mathcal {G}^{\lambda ,\sigma ;s}}\left\| g^2\right\| _{\mathcal {G}^{\lambda ,\sigma ;s}} \left\| r(t)\right\| _{\mathcal {F}^{\tilde{c}\lambda ,0;s}}. \end{aligned}$$
(3.13)

Moreover, we have the algebra property, for \(g \in {\mathcal {G}}^{\lambda ,\sigma ;s}({\mathbb {T}}^d \times {\mathbb {R}}^d)\) and \(r(t) \in {\mathcal {F}}^{\lambda ,\sigma ;s}({\mathbb {T}}^d)\),

$$\begin{aligned} \sum _{k \in \mathbb {Z}^d} \int _{{\mathbb {R}}^d}\left| \sum _{\ell \in \mathbb {Z}^d} e^{\lambda \langle k,\eta \rangle ^s}\langle k,\eta \rangle ^\sigma \hat{r}_{\ell }(t)\hat{g}_{k-\ell }(\eta - \ell t) \right| ^2 {\, \mathrm d}\eta \lesssim \left\| r(t)\right\| ^2_{\mathcal {F}^{\lambda ,\sigma ;s}} \left\| g\right\| ^2_{\mathcal {G}^{\lambda ,\sigma ;s}}. \end{aligned}$$
(3.14)

Proof

We only prove (3.13), which is slightly harder. Denote \(B_k(\eta ) = \langle k,\eta \rangle ^{\sigma } e^{\lambda \langle k,\eta \rangle ^s}\). The proof proceeds by decomposing with a paraproduct:

$$\begin{aligned}&\sum _{k,\ell \in \mathbb {Z}^d} \int _\eta \left| B\hat{g^1}_k(\eta )B_k(\eta )\hat{r}_{\ell }(t)\hat{g^2}_{k-\ell }(\eta -\ell t)\right| {\, \mathrm d}\eta \\&\quad \le \sum _{N \ge 8}\sum _{k,\ell \in \mathbb {Z}_*^d} \int _\eta \left| B\hat{g^1}_k(\eta )B_k(\eta )\hat{r}_{\ell }(t)_N\hat{g^2}_{k-\ell }(\eta -\ell t)_{<N/8}\right| {\, \mathrm d}\eta \\&\quad \quad + \sum _{N \ge 8}\sum _{k,\ell \in \mathbb {Z}_*^d} \int _\eta \left| B\hat{g^1}_k(\eta )B_k(\eta )\hat{r}_{\ell }(t)_{<N/8}\hat{g^2}_{k-l}(\eta -\ell t)_{N}\right| {\, \mathrm d}\eta \\&\quad \quad + \sum _{N \in {\mathbb {D}}} \sum _{N/8 \le N^\prime \le 8N}\sum _{k,\ell \in \mathbb {Z}_*^d} \int _\eta \left| B\hat{g^1}_k(\eta )B_k(\eta )\hat{r}_{\ell }(t)_{N^\prime }\hat{g^2}_{k-\ell }(\eta -\ell t)_{N}\right| {\, \mathrm d}\eta \\&\quad = R + T + {\mathcal {R}}. \end{aligned}$$

Note that the R and T terms are almost, but not quite, symmetric. Consider first the R term. On the support of the integrand we have the frequency localizations

$$\begin{aligned} \frac{N}{2} \le \left| \ell ,\ell t\right|&\le \frac{3N}{2}, \end{aligned}$$
(3.15a)
$$\begin{aligned} \left| k-\ell ,\eta - \ell t\right|&\le \frac{3N}{32}, \end{aligned}$$
(3.15b)
$$\begin{aligned} \frac{13}{16} \le \frac{\left| k,\eta \right| }{\left| \ell ,t \ell \right| }&\le \frac{19}{16}. \end{aligned}$$
(3.15c)

Therefore, by (3.11), there exists some \(c = c(s) \in (0,1)\) such that

$$\begin{aligned} B_k(\eta )&\le \langle k,\eta \rangle ^{\sigma } e^{\lambda \langle \ell ,\ell t \rangle ^s} e^{c\lambda \langle k-\ell ,\eta -\ell t \rangle ^s} \lesssim _\sigma \langle \ell ,t\ell \rangle ^{\sigma } e^{\lambda \langle \ell ,\ell t \rangle ^s} e^{c\lambda \langle k-\ell ,\eta -\ell t \rangle ^s} \\&\lesssim _{\lambda } \langle \ell ,t\ell \rangle ^{\sigma } e^{\lambda \langle \ell ,\ell t \rangle ^s}\langle k-\ell ,\eta -\ell t \rangle ^{-\frac{d}{2}-1} e^{\frac{1}{2}(c+1)\lambda \langle k-\ell ,\eta -\ell t \rangle ^s}, \end{aligned}$$

where in the last inequality we applied (3.8). Adding a frequency localization with (3.15), denoting \(\tilde{c} = \frac{1}{2}(c + 1)\), and using (3.2) we have

$$\begin{aligned} R&\lesssim \sum _{N \ge 8}\sum _{k,\ell \in \mathbb {Z}_*^d} \int _\eta \left| B\hat{g^1}_k(\eta )_{\sim N}B_\ell (\ell t)\hat{r}_{\ell }(t)_Ne^{\tilde{c}\lambda \langle k-\ell ,\eta -\ell t \rangle ^s} \right. \nonumber \\&\left. \quad \langle k-\ell ,\eta -\ell t \rangle ^{-\frac{d}{2}-1} \hat{g^2}_{k-\ell }(\eta -\ell t)_{<N/8}\right| {\, \mathrm d}\eta \\&\lesssim \sum _{N \ge 8} \left\| g^1_{\sim N}\right\| _{\mathcal {G}^{\lambda ,\sigma ;s}} \left\| r(t)_N\right\| _{\mathcal {F}^{\lambda ,\sigma ;s}} \left\| g^2\right\| _{\mathcal {G}^{\tilde{c}\lambda ,0;s}}. \end{aligned}$$

By Cauchy-Schwarz and (3.4) we have

$$\begin{aligned} R \lesssim \left\| g^1\right\| _{\mathcal {G}^{\lambda ,\sigma ;s}} \left\| r(t)\right\| _{\mathcal {F}^{\lambda ,\sigma ;s}} \left\| g^2\right\| _{\mathcal {G}^{\tilde{c}\lambda ,0;s}}, \end{aligned}$$

which appears on the RHS of (3.13). Treating the T term is essentially the same except reversing the role of \((\ell ,t\ell )\) and \((k-\ell ,\eta -t\ell )\) and applying (3.1) as opposed to (3.2).

To treat the \({\mathcal {R}}\) term we use a simple variant. We claim that there exists some \(c^\prime = c^\prime (s) \in (0,1)\) such that on the support of the integrand we have

$$\begin{aligned} B_k(\eta ) \lesssim _\sigma e^{c^\prime \lambda \langle k-\ell ,\eta -t\ell \rangle ^s}e^{c^\prime \lambda \langle \ell ,t\ell \rangle ^s}. \end{aligned}$$
(3.16)

To see this, consider separately the cases (say) \(N \ge 128\) and \(N < 128\). On the latter, \(B_k(\eta )\) is simply bounded by a constant. In the case \(N \ge 128\) we have the frequency localizations

$$\begin{aligned} \frac{N}{2} \le \left| k-\ell ,\eta - \ell t\right|&\le \frac{3N}{2}, \end{aligned}$$
(3.17a)
$$\begin{aligned} \frac{N^\prime }{2} \le \left| \ell ,\ell t\right|&\le \frac{3N^\prime }{2}, \end{aligned}$$
(3.17b)
$$\begin{aligned} \frac{1}{24} \le \frac{\left| k-\ell ,\eta - \ell t\right| }{\left| \ell ,\ell t\right| }&\le 24, \end{aligned}$$
(3.17c)

and hence we may apply (3.12) since in this case \(64 \le \left| k-\ell ,\eta -\ell t\right| \approx \left| \ell ,\ell t\right| \). Further, we can use (3.8) to absorb the Sobolev corrections and, indeed, we have (3.16) on the support of the integrand in \({\mathcal {R}}\). Hence,

$$\begin{aligned} {\mathcal {R}} \lesssim \sum _{N \in {\mathbb {D}}} \sum _{N^\prime \approx N}\sum _{k,\ell \in \mathbb {Z}_*^d} \int _\eta \left| B\hat{g^1}_k(\eta )e^{c^\prime \lambda \langle \ell ,\ell t \rangle ^s}\hat{r}_{\ell }(t)_{N^\prime } e^{c^\prime \lambda \langle k-\ell ,\eta - \ell t \rangle ^s}\hat{g^2}_{k-l}(\eta -lt)_{N}\right| {\, \mathrm d}\eta . \end{aligned}$$

Applying (3.2) followed by (3.5) and (3.8) (since \(c^\prime < 1\)) implies,

$$\begin{aligned} {\mathcal {R}}&\lesssim \sum _{N \in {\mathbb {D}}} \sum _{N^\prime \approx N} \left\| g^1\right\| _{\mathcal {G}^{\lambda ,\sigma ;s}} \left\| r_{N^\prime }\right\| _{\mathcal {F}^{c^\prime \lambda ,\frac{d}{2}+1;s}} \left\| g^2_N\right\| _{\mathcal {G}^{c^\prime \lambda ,0;s}} \\&\lesssim \sum _{N \in {\mathbb {D}}} \frac{1}{N} \left\| g^1\right\| _{\mathcal {G}^{\lambda ,\sigma ;s}} \left\| r_{\sim N}\right\| _{\mathcal {F}^{c^\prime \lambda ,\frac{d}{2}+2;s}} \left\| g^2_N\right\| _{\mathcal {G}^{c^\prime \lambda ,0;s}} \\&\lesssim _{\lambda ,\sigma } \left\| g^1\right\| _{\mathcal {G}^{\lambda ,\sigma ;s}} \left\| r\right\| _{\mathcal {F}^{\lambda ,\sigma ;s}} \left\| g^2\right\| _{\mathcal {G}^{c^\prime \lambda ,0;s}}. \end{aligned}$$

Hence (after possibly adjusting \(\tilde{c}\)) this term appears on the RHS of (3.13). \(\square \)

We also need the standard Sobolev space trace lemma, which we will apply on the Fourier side.

Lemma 3.4

(\(L^2\) Trace) Let u be smooth on \({\mathbb {R}}^d\) and \(C \subset {\mathbb {R}}^d\) be an arbitrary straight line. Then for all \(\sigma \in {\mathbb {R}}_+\) with \(\sigma > (d-1)/2\),

$$\begin{aligned} \left\| u\right\| _{L^2(C)} \lesssim \left\| u\right\| _{H^\sigma ({\mathbb {R}}^d)}. \end{aligned}$$

Proof

It follows by induction on co-dimension with the standard \(H^{1/2}\) restriction theorem [1]. \(\square \)
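In coordinates, the inductive step is the classical restriction estimate: after a rotation and translation we may assume C is the \(x_1\)-axis, and for \(w \in H^{\tau }({\mathbb {R}}^{m})\) with \(\tau > 1/2\),

$$\begin{aligned} \left\| w(\cdot ,0)\right\| _{H^{\tau - 1/2}({\mathbb {R}}^{m-1})} \lesssim \left\| w\right\| _{H^{\tau }({\mathbb {R}}^{m})}; \end{aligned}$$

applying this \(d-1\) times (in the variables \(x_d, x_{d-1}, \dots , x_2\)) gives \(\left\| u\right\| _{L^2(C)} \le \left\| u|_C\right\| _{H^{\sigma - (d-1)/2}({\mathbb {R}})} \lesssim \left\| u\right\| _{H^\sigma ({\mathbb {R}}^d)}\), which is admissible precisely when \(\sigma > (d-1)/2\).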

Linear Damping in Gevrey Regularity

The first step to proving (2.12c) is understanding (forced) linear Landau damping in the \(L^2\) Gevrey norms we are using. This is provided by the following lemma, which also shows that (L) implies linear damping in all Gevrey regularities (\(s > 1/3\) is not relevant to the proof). It is crucial that the same norm appears on both sides of (4.2) so that we may use it in the nonlinear estimate on \(\rho (t)\). The main idea of the proof of Lemma 4.1 appears in [67] to treat damping in Gevrey regularity and is based on decomposing the solution into analytic/exponentially decaying sub-components. We note that Lin and Zeng in [52] have linear damping results at much lower regularities (similarly, we believe the following proof can be adapted also to Sobolev spaces). We will first give a formal proof of Lemma 4.1 as an a priori estimate in §4.1 and then explain the rigorous justification in §4.2. In §4.3, we discuss the proof that the Penrose condition [73] implies (L).

Lemma 4.1

(Linear integral-in-time control) Let \(f^0\) satisfy the condition (L) with constants \(M> d/2\) and \(C_0,\bar{\lambda },\kappa > 0\). Let \(A_k(t,\eta )\) be the multiplier defined in (2.7) for \(s \in (0,1)\), and \(\lambda =\lambda (t) \in (0,\lambda _0)\) as defined in (2.8). Let F(t) and \(T^\star > 0\) be given such that, if we denote \(I = [0,T^\star )\),

$$\begin{aligned} \int _0 ^{T_*} \left\| F(t)\right\| _{\mathcal {F}^{\lambda (t),\sigma ;s}} ^2 {\, \mathrm d}t = \sum _{k \in {\mathbb {Z}}^d_*}\left\| A_k(t,kt) F_k(t)\right\| _{L_t^2(I)}^2 < \infty . \end{aligned}$$

Then there exists a constant \(C_{LD} = C_{LD}(C_0,\bar{\lambda },\kappa ,s,d,\lambda _0,{\lambda ^\prime })\) such that for all \(k \in \mathbb {Z}_*^d\), the solution \(\phi _k(t)\) to the system

$$\begin{aligned} \phi _k(t)= F_k(t) + \int _0^tK^0(t-\tau ,k) \phi _k(\tau ) {\, \mathrm d}\tau \end{aligned}$$
(4.1)

in \(t \in {\mathbb {R}}_+\) with \(K^0(t,k) := -\tilde{f}^0\left( kt\right) \widehat{W}(k)\left| k\right| ^2t\) satisfies the mode-by-mode estimate

$$\begin{aligned} \forall \, k \in \mathbb {Z}_*^d, \quad \int _0 ^{T_*} A_k(t,kt)^2 |\phi _k(t)|^2 {\, \mathrm d}t \le C_{LD} ^2 \int _0 ^{T_*} A_k(t,kt)^2 |F_k(t)|^2 {\, \mathrm d}t \end{aligned}$$
(4.2)

which, after summation over \(k \in {\mathbb {Z}}^d_*\), gives \(\int _0 ^{T_*} \left\| \phi \right\| _{\mathcal {F}^{\lambda (t),\sigma ;s}}^2 {\, \mathrm d}t \le C_{LD} ^2 \int _0 ^{T_*} \left\| F\right\| _{\mathcal {F}^{\lambda (t),\sigma ;s}}^2 {\, \mathrm d}t\).

Remark 15

The proof proceeds slightly differently in the case \(s = 1\), where the additional requirement \(\bar{\lambda } > \lambda (0)\) occurs naturally (and the constant depends badly on the parameter \(\bar{\lambda } - \lambda (0)\)).
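Purely to illustrate the structure of the mode-by-mode equation (4.1) (not the estimates), here is a minimal numerical sketch solving a scalar Volterra equation of the second kind by the trapezoidal rule; the kernel and forcing are hypothetical stand-ins chosen only so that the code runs, not the ones of the paper.

```python
import numpy as np

def solve_volterra(F, K, t):
    """Solve phi(t) = F(t) + int_0^t K(t - tau) phi(tau) dtau on the uniform grid t."""
    dt = t[1] - t[0]
    phi = np.zeros_like(F)
    phi[0] = F[0]
    for i in range(1, len(t)):
        # trapezoidal rule: half weights at tau = 0 and tau = t_i, full weights inside
        conv = 0.5 * K[i] * phi[0] + np.dot(K[i-1:0:-1], phi[1:i])
        phi[i] = (F[i] + dt * conv) / (1.0 - 0.5 * dt * K[0])
    return phi

t = np.linspace(0.0, 20.0, 4001)
K = -t * np.exp(-0.5 * t ** 2)        # hypothetical decaying kernel with K(0) = 0, as for K^0
F = np.exp(-0.1 * (t - 5.0) ** 2)     # hypothetical smooth forcing
phi = solve_volterra(F, K, t)
print(np.trapz(phi ** 2, t))          # an (unweighted) analogue of the L^2_t quantity in (4.2)
```

The division by \(1 - \tfrac{1}{2}\Delta t\, K(0)\) handles the implicit endpoint weight of the trapezoidal rule; here it is trivial since the kernel vanishes at \(t=0\), mirroring the factor t in \(K^0\).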

Proof of the A Priori Estimate

We only consider the \(s < 1\) case; the analytic case is only a slight variation. As the hypothesis on F is known a priori only to hold on \([0,T_*)\), we simply extend \(F_k(t)\) to be zero for all \(t \ge T_*\).

Step 1. Rough Grönwall bound. First we deduce a rough bound using Grönwall’s inequality with no attempt to be optimal. This bound shows in particular that the integral equation (4.1) is globally well-posed (for each frequency \(k \in {\mathbb {Z}}^d_*\)) in the norm associated with the multiplier A. By (3.9), the definition of \(K^0\), (1.3) and that \(\lambda (t)\) is non-increasing in time,

$$\begin{aligned} A_k(t,kt) \left| \phi _k(t)\right|&\le A_k(t,kt)\left| F_k(t)\right| + \int _0^t A_k(t,kt)\left| K^0(t-\tau ,k)\right| \left| \phi _k(\tau )\right| {\, \mathrm d}\tau \\&\lesssim A_k(t,kt)\left| F_k(t)\right| + \int _0^t \langle k(t-\tau ) \rangle ^\sigma e^{\lambda (t)\langle k(t-\tau ) \rangle ^s} \\&\quad \times \left| \widehat{\nabla _v f^0}(k(t-\tau ))\right| A_k(\tau ,k\tau )\left| \phi _k(\tau )\right| {\, \mathrm d}\tau . \end{aligned}$$

Then by (3.8), the \(H^{d/2+} \hookrightarrow C^0\) embedding theorem and (1.9),

$$\begin{aligned} A_k(t,kt) \left| \phi _k(t)\right|&\lesssim A_k(t,kt)\left| F_k(t)\right| + \left( \sup _{\eta \in {\mathbb {R}}^d}\langle \eta \rangle ^\sigma e^{\lambda (0)\langle \eta \rangle ^s} \left| \widehat{f^0}(\eta )\right| \right) \\&\quad \times \int _0^t A_k(\tau ,k\tau )\left| \phi _k(\tau )\right| {\, \mathrm d}\tau \\&\lesssim A_k(t,kt)\left| F_k(t)\right| + \left\| \langle \eta \rangle ^\sigma e^{\lambda (0)\langle \eta \rangle ^s} \widehat{f^0}(\eta )\right\| _{H^M_\eta } \\&\quad \times \int _0^t A_k(\tau ,k\tau )\left| \phi _k(\tau )\right| {\, \mathrm d}\tau . \end{aligned}$$

By (1.9) with (3.8), it follows by Grönwall’s inequality that for some \(C > 0\) we have the following (using also Cauchy-Schwarz and (3.8) in the last inequality),

$$\begin{aligned} A_k(t,kt) \left| \phi _k(t)\right| \lesssim e^{Ct} \int _0^t \left| A_k(\tau ,k\tau )F_k(\tau )\right| {\, \mathrm d}\tau \lesssim e^{2Ct}\left\| AF_k\right\| _{L^2_t(I)}. \end{aligned}$$
(4.3)

Step 2. Frequency localization. We would like to use the Fourier-Laplace transform as in [67], but \(F_k(t)\) does not decay exponentially in time. Instead, we deduce the estimate by decomposing the problem into a countable number of exponentially decaying contributions. Let \(R \ge e\) be a constant, fixed later, depending only on \(\bar{\lambda }\), \(\lambda (0)^{-1}\) and s, let \(F^n_k(t) = F_k(t) \mathbf {1}_{Rn \le \left| kt\right| ^s \le R(n+1)}\), and define \(\phi _k^n\) as solutions to

$$\begin{aligned} \phi _k^n(t) = F_k^n(t) + \int _0^t K^0(t-\tau ,k) \phi _k^n(\tau ) {\, \mathrm d}\tau . \end{aligned}$$
(4.4)

Then \(\phi _k(t) = \sum _{n=0}^\infty \phi _k^n(t)\) by linearity of the equation, and by the definition of \(F^n_k(t)\), \(\phi _k^n(t)\) is supported for \(\left| kt\right| ^{s} \ge Rn\). Moreover, obviously (4.3) holds for each \(\phi _k^n\). Now we will use the Fourier-Laplace transform in time to get an \(L_t^2\) estimate on \(\phi _k^n\) as in [67], but with a contour which gets progressively closer to the imaginary axis as n increases. Define the indices \(\mu _n\) as

$$\begin{aligned} \mu _0&= \mu _1, \\ \mu _n&= \frac{1}{(Rn)^{1/s}}\left[ \lambda \left( \frac{(Rn)^{1/s}}{\left| k\right| }\right) \langle (Rn)^{1/s} \rangle ^s + \sigma \log \langle (Rn)^{1/s} \rangle \right] , \quad \quad n \ge 1. \end{aligned}$$

We will require that \(\mu _n < \bar{\lambda }\) so that our integration contour always lies in the half-plane defined in (L). As long as \(s < 1\) this only requires choosing R sufficiently large (depending on \(\bar{\lambda }\), \(\lambda (0)\) and s), so to fix ideas suppose that R is large enough so that \(\sup _n \mu _n < \bar{\lambda }/2\). Define the amplitude corrections

$$\begin{aligned} N_{k,0}&= \langle k \rangle ^{\sigma } \exp \left[ \lambda \left( \frac{(R)^{1/s}}{\left| k\right| }\right) \langle k \rangle ^s \right] , \\ N_{k,n}&= \frac{\langle k,(Rn)^{1/s} \rangle ^{\sigma }}{\langle (Rn)^{1/s} \rangle ^{\sigma }} \exp \left[ \lambda \left( \frac{(Rn)^{1/s}}{\left| k\right| }\right) \left[ \langle k,(Rn)^{1/s} \rangle ^s - \langle (Rn)^{1/s} \rangle ^s\right] \right] \quad \quad n \ge 1. \end{aligned}$$

The role of this correction is highlighted by the fact that when \(\left| kt\right| = (Rn)^{1/s}\) and \(n \ge 1\), then

$$\begin{aligned} {\left\{ \begin{array}{ll}\displaystyle e^{\mu _n|kt|} = \langle kt \rangle ^{\sigma } e^{\lambda (t) \langle kt \rangle ^s} \\ \displaystyle N_{n,k} = \frac{\langle k,kt \rangle ^\sigma }{\langle kt \rangle ^\sigma } e^{\lambda (t)\left( \langle k, kt \rangle ^s - \langle kt \rangle ^s \right) } \\ \displaystyle e^{\mu _n|kt|} N_{n,k} = \langle k, kt \rangle ^\sigma e^{\lambda (t) \langle k, kt \rangle ^s} = A_k(t,kt). \end{array}\right. } \end{aligned}$$

Hence the index \(\mu _n\) is related to the radius of analyticity in the velocity variable, whereas the correction \(N_{k,n}\) measures the ratio of what is lost by not taking into account the regularity in the space variable. A further error is introduced by the fact that these regularity weights only exactly fit the multiplier A at the left endpoint of the interval of the decomposition.

Step 3. The one-block estimate via the Fourier-Laplace transform. Now multiply (4.4) by \(e^{\mu _n\left| k\right| t}N_{k,n}\); denoting

$$\begin{aligned} {\left\{ \begin{array}{ll} R^n_k(t) &{} = e^{\mu _n\left| k\right| t} N_{k,n} F_k^n(t)\\ \Phi ^n_k(t) &{} = e^{\mu _n\left| k\right| t}N_{k,n} \phi _k^n(t), \end{array}\right. } \end{aligned}$$

we have

$$\begin{aligned} \Phi _k^n(t) = R_k^n(t) + \int _0^t e^{\mu _n\left| k\right| (t-\tau )} K^0(t-\tau ,k) \Phi _k^n (\tau ){\, \mathrm d}\tau . \end{aligned}$$
(4.5)

Taking the Fourier transform in time, \(\hat{G}(\omega ) = (1/\sqrt{2\pi }) \int _{\mathbb {R}}e^{-it\omega } G(t) {\, \mathrm d}t\) (extending \(R_k^n\), \(\Phi _k^n\) and \(K^0\) by zero for negative times), we obtain

$$\begin{aligned} \hat{\Phi }_k ^n(\omega ) = \hat{R}^n_k(\omega ) + \mathcal {L}\left( k,\mu _n + i \frac{\omega }{|k|} \right) \hat{\Phi }^n _k(\omega ), \end{aligned}$$

(where \({\mathcal {L}}(k,\xi ) := \int _0^{+\infty } e^{\bar{\xi }\left| k\right| t}K^0(t,k) {\, \mathrm d}t\)), which is formally solved as

$$\begin{aligned} \hat{\Phi }_k^n(\omega ) = \frac{\hat{R}^n_k(\omega )}{1 - \mathcal {L}\left( k,\mu _n + i \frac{\omega }{|k|} \right) }. \end{aligned}$$

Applying the stability condition (L) and Plancherel’s theorem implies

$$\begin{aligned} \left\| \Phi _k^n(t)\right\| _{L_t^2({\mathbb {R}})} \lesssim \frac{1}{\kappa }\left\| R_k^n(t)\right\| _{L^2_t({\mathbb {R}})}. \end{aligned}$$
(4.6)

Step 4. Coming back to the original multiplier A. Consider \(Rn \le \left| kt\right| ^{s} \le R(n+1)\) for \(n \ge 1\). From the definition (2.8) of \(\lambda (t)\) we see that if R is chosen sufficiently large then for \(t \ge R^{1/s}\) we have

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} \left( \lambda (t) \langle k,kt \rangle ^s \right)&= \left( \dot{\lambda }(t) + s\lambda (t) \frac{\left| k\right| \left| k,kt\right| }{\langle k,kt \rangle ^2} \right) \langle k,kt \rangle ^s \\&= \left( -\frac{a(\lambda _0-{\lambda ^\prime })}{4\langle t \rangle ^{1+a}} + s\left| k\right| \lambda (t)\frac{\left| k,kt\right| }{\langle k,kt \rangle ^{2}}\right) \langle k,kt \rangle ^s > 0, \end{aligned}$$

and hence \(\lambda (t)\langle k,kt \rangle ^s\) is increasing since \(\left| k\right| \ge 1\). Applying this gives,

$$\begin{aligned} N_{k,n} e^{\mu _n\left| kt\right| }&= \exp \left[ \lambda \left( \frac{(Rn)^{1/s}}{\left| k\right| }\right) \left[ \langle k,(Rn)^{1/s} \rangle ^s - \langle (Rn)^{1/s} \rangle ^s\right] \right] \frac{\langle k,(Rn)^{1/s} \rangle ^{\sigma }}{\langle (Rn)^{1/s} \rangle ^{\sigma }} \nonumber \\&\quad \times \exp \left[ \frac{\left| kt\right| }{(Rn)^{1/s}}\left( \lambda \left( \frac{(Rn)^{1/s}}{\left| k\right| }\right) \langle (Rn)^{1/s} \rangle ^s + \sigma \log \langle (Rn)^{1/s} \rangle \right) \right] \nonumber \\&\le \exp \left[ \lambda \left( t\right) \langle k,kt \rangle ^s \right] \langle k,kt \rangle ^{\sigma }\langle (Rn)^{1/s} \rangle ^{\sigma \left( \frac{(n+1)^{1/s}}{n^{1/s}} -1\right) } \nonumber \\&\quad \times \exp \left[ \lambda \left( \frac{(Rn)^{1/s}}{\left| k\right| }\right) \left( \frac{(n+1)^{1/s}}{(n)^{1/s}} - 1\right) \langle (Rn)^{1/s} \rangle ^s\right] \nonumber \\&\lesssim _{R,\lambda _0,s} A_k(t,kt). \end{aligned}$$
(4.7)

A similar result holds for \(n=0\) when \(\left| kt\right| ^s \le R\) using that \(\lambda (t)\) is non-increasing,

$$\begin{aligned} N_{k,0} e^{\mu _0\left| kt\right| }&= \exp \left[ \lambda \left( \frac{R^{1/s}}{\left| k\right| }\right) \langle k \rangle ^s\right] \langle k \rangle ^{\sigma }\nonumber \\&\quad \times \exp \left[ \frac{\left| kt\right| }{R^{1/s}}\left( \lambda \left( \frac{R^{1/s}}{\left| k\right| }\right) \langle R^{1/s} \rangle ^s + \sigma \log \langle R^{1/s} \rangle \right) \right] \\&\lesssim _R A_k(t,kt). \end{aligned}$$

Hence it follows from (4.6) that

$$\begin{aligned} \left\| \Phi _k^n(t)\right\| _{L_t^2({\mathbb {R}})} \lesssim \frac{1}{\kappa }\left\| AF_k(t) \mathbf {1}_{Rn \le \left| kt\right| ^s \le R(n+1)}\right\| _{L_t^2({\mathbb {R}})}. \end{aligned}$$
(4.8)

Step 5. Summation of the different frequency blocks. Now we want to estimate \(A\phi _k\) by summing over n which will require some almost orthogonality. This is possible as each \(\phi _k^n\) is exponentially localized near \(\left| kt\right| ^s \approx Rn\). Computing, noting that by construction \(\phi _k^n\) is only supported on \(\left| kt\right| ^s \ge Rn\),

$$\begin{aligned} \left\| A\phi _k\right\| _{L_t^2}^2&= \int _0^{+\infty } \left| \sum _{n=0}^{+\infty } A_k(t,kt)\phi ^n_k(t) \right| ^2 {\, \mathrm d}t \nonumber \\&\lesssim \int _0^{\infty } \left| A_k(t,kt)\phi ^0_k(t)\right| ^2 {\, \mathrm d}t + \int _0^{\infty } \left| \sum _{n=1}^{\infty } A_k(t,kt)e^{-\mu _n\left| kt\right| } N_{k,n}^{-1} \mathbf {1}_{\left| kt\right| ^s \ge Rn} \Phi ^n_k(t) \right| ^2 {\, \mathrm d}t \nonumber \\&= \int _0^{\infty } \left| A_k(t,kt)\phi ^0_k(t)\right| ^2 {\, \mathrm d}t + \int _0^{\infty } \sum _{n,n^\prime \ge 1} A_k(t,kt)^2 e^{-\mu _n\left| kt\right| }\nonumber \\&\quad \times N_{k,n}^{-1}e^{-\mu _{n^\prime }\left| kt\right| } N_{k,n^\prime }^{-1} \mathbf {1}_{\left| kt\right| ^s \ge Rn} \mathbf {1}_{\left| kt\right| ^s \ge Rn^\prime } \Phi ^n_k(t) \Phi ^{n^\prime }_k(t) {\, \mathrm d}t. \end{aligned}$$
(4.9)

We first treat the infinite sum, which is the more challenging term. For this we use Schur’s test. Indeed, if we denote the interaction kernel

$$\begin{aligned} K_{n,n^\prime }(t,k) := A_k(t,kt)^2 e^{-\mu _n\left| kt\right| } N_{k,n}^{-1}e^{-\mu _{n^\prime }\left| kt\right| } N_{k,n^\prime }^{-1} \mathbf {1}_{\left| kt\right| ^s \ge Rn} \mathbf {1}_{\left| kt\right| ^s \ge Rn^\prime }, \end{aligned}$$

then Schur’s test (or Cauchy-Schwarz three times) implies

$$\begin{aligned} \int _0^{+\infty } \sum _{n,n^\prime \ge 1}K_{n,n^\prime }(t,k)\Phi ^{n^\prime }_k(t)\Phi ^{n}_k(t) {\, \mathrm d}t&\le \left( \sup _{t \in [0,\infty )}\sup _{n\ge 1} \sum _{n^\prime =1}^{\infty } K_{n,n^\prime }(t,k)\right) ^{1/2} \nonumber \\&\quad \times \left( \sup _{t \in [0,\infty )}\sup _{n^\prime \ge 1} \sum _{n=1}^{\infty } K_{n,n^\prime }(t,k)\right) ^{1/2} \nonumber \\&\quad \times \sum _{n=1}^{\infty } \left\| \Phi _k^n(t)\right\| ^2_{L^2_t({\mathbb {R}})}. \end{aligned}$$
(4.10)

It remains to see that the row and column sums of the interaction kernel are uniformly bounded in time. Since the kernel is symmetric in n and \(n^\prime \) it suffices to consider only one of the sums. The computations above to deduce (4.7) can be adapted to show at least that

$$\begin{aligned} A_k(t,kt)e^{-\mu _{n^\prime }\left| kt\right| } N_{k,n^\prime }^{-1}\mathbf {1}_{\left| kt\right| ^s \ge Rn^\prime } \lesssim _R 1, \end{aligned}$$

and hence, since \(\lambda (t)\) is decreasing,

$$\begin{aligned}&\sum _{n=1}^{+\infty } K_{n,n^\prime }(t,k) \lesssim _R \sum _{n=1}^{+\infty } A_k(t,kt) e^{-\mu _n\left| kt\right| } N_{k,n}^{-1}\mathbf {1}_{\left| kt\right| ^s \ge Rn} \\&\quad = \sum _{n=1}^{+\infty } \mathbf {1}_{\left| kt\right| ^s \ge Rn} e^{\lambda (t)\langle k,kt \rangle ^s - \lambda \left( \frac{(Rn)^{1/s}}{\left| k\right| }\right) \left( \langle k,(Rn)^{1/s} \rangle ^s - \langle (Rn)^{1/s} \rangle ^s\right) }\frac{\langle k,kt \rangle ^\sigma \langle (Rn)^{1/s} \rangle ^\sigma }{\langle k,(Rn)^{1/s} \rangle ^\sigma } e^{-\mu _n\left| kt\right| } \\&\quad \lesssim \sum _{n = 1}^{+\infty } \mathbf {1}_{\left| kt\right| ^s \ge Rn} e^{\lambda (t)\langle k,kt \rangle ^s - \lambda \left( t\right) \left( \langle k,(Rn)^{1/s} \rangle ^s - \langle (Rn)^{1/s} \rangle ^s\right) }\frac{\langle k,kt \rangle ^\sigma \langle (Rn)^{1/s} \rangle ^\sigma }{\langle k,(Rn)^{1/s} \rangle ^\sigma } \\&\qquad \times e^{-\frac{\left| kt\right| }{(Rn)^{1/s}}\left[ \lambda \left( t\right) \langle (Rn)^{1/s} \rangle ^s + \sigma \log \langle (Rn)^{1/s} \rangle \right] }. \end{aligned}$$

Since \(e \le (Rn)^{1/s} \le \left| kt\right| \), and since \(-\langle k,(Rn)^{1/s} \rangle ^s + \langle (Rn)^{1/s} \rangle ^s\) is increasing as a function of n and equals \(\langle kt \rangle ^s - \langle k,kt \rangle ^{s}\) when \((Rn)^{1/s} = \left| kt\right| \), we have \(-\langle k,(Rn)^{1/s} \rangle ^s + \langle (Rn)^{1/s} \rangle ^s \le \langle kt \rangle ^s - \langle k,kt \rangle ^{s}\), and therefore,

$$\begin{aligned}&\sum _{n=1}^{+\infty } K_{n,n^\prime }(t,k)\\&\quad \lesssim \sum _{n = 1}^{+\infty } \mathbf {1}_{\left| kt\right| ^s \ge Rn} e^{\lambda (t)\langle kt \rangle ^s\left[ 1- \frac{\left| kt\right| \langle (Rn)^{1/s} \rangle ^s}{ (Rn)^{1/s} \langle kt \rangle ^s}\right] } e^{\sigma \log \left[ \frac{\langle k,kt \rangle \langle (Rn)^{1/s} \rangle }{\langle k,(Rn)^{1/s} \rangle }\right] - \sigma \frac{\left| kt\right| }{(Rn)^{1/s}}\log \langle (Rn)^{1/s} \rangle }. \end{aligned}$$

Since \(\langle x \rangle \langle k,x \rangle ^{-1}\) is increasing in x for \(\left| k\right| \ge 1\) and \(x \ge 0\), we have

$$\begin{aligned} \sum _{n=1}^{+\infty } K_{n,n^\prime }(t,k) \lesssim \sum _{n = 1}^{+\infty } \mathbf {1}_{\left| kt\right| ^s \ge Rn} e^{\lambda (t)\langle kt \rangle ^s\left[ 1- \frac{\left| kt\right| \langle (Rn)^{1/s} \rangle ^s}{ (Rn)^{1/s} \langle kt \rangle ^s}\right] } e^{\sigma \log \langle kt \rangle \left( 1 - \frac{\left| kt\right| \log \langle (Rn)^{1/s} \rangle }{(Rn)^{1/s} \log \langle kt \rangle } \right) }. \end{aligned}$$

Finally using that \(x / \log \langle x \rangle \) is increasing for \(x \ge e\) we get

$$\begin{aligned} \sum _{n=1}^{+\infty } K_{n,n^\prime }(t,k) \lesssim \sum _{n = 1}^{+\infty } \mathbf {1}_{\left| kt\right| ^s \ge Rn} e^{\lambda (t)\langle kt \rangle ^s\left[ 1- \frac{\left| kt\right| \langle (Rn)^{1/s} \rangle ^s}{ (Rn)^{1/s} \langle kt \rangle ^s}\right] }. \end{aligned}$$

Using that \(\langle x^{1/s} \rangle ^s \ge x\), the sum can be bounded uniformly in t and k, by a constant depending only on the fixed parameters R, s, \(\sigma \), \(\lambda _0\) and \({\lambda ^\prime }\).

This shows that the row sums of \(K_{n,n^\prime }(t,k)\) are uniformly bounded; by symmetry the column sums are also bounded and, by (4.10), this completes the treatment of the summation in (4.9).

Now we turn our attention to the \(n = 0\) term in (4.9). By (4.3),

$$\begin{aligned} \int _0^{+\infty } \left| A_k(t,kt)\phi _k^0(t)\right| ^2 {\, \mathrm d}t&= \int _0^{R^{1/s}} \left| A_k(t,kt)\phi _k^0(t)\right| ^2 {\, \mathrm d}t + \int _{R^{1/s}} ^{+\infty } \left| A_k(t,kt)\phi _k^0(t)\right| ^2 {\, \mathrm d}t \nonumber \\&\lesssim _R \left\| AF_k\right\| ^2_{L^2_t({\mathbb {R}})} + \int _{R^{1/s}}^{+\infty } \left| A_k(t,kt)\phi _k^0(t)\right| ^2 {\, \mathrm d}t. \end{aligned}$$
(4.11)

However, for \(\left| t\right| \ge R^{1/s}\), we have, using (3.9), the fact that \(\lambda (t)\) is non-increasing, and (3.7),

$$\begin{aligned} A_k(t,kt) \le e^{\lambda (t)\langle k \rangle ^s + \lambda (t)\langle kt \rangle ^s}\langle k \rangle ^\sigma \langle kt \rangle ^\sigma \lesssim _R e^{\lambda \left( \frac{R^{1/s}}{\left| k\right| }\right) \langle k \rangle ^s}\langle k \rangle ^\sigma e^{\mu _0\left| kt\right| } \lesssim N_{k,0} e^{\mu _0\left| kt\right| }, \end{aligned}$$

which implies with (4.11) and (4.8) that

$$\begin{aligned} \int _0^{\infty } \left| A_k(t,kt)\phi _k^0(t)\right| ^2 {\, \mathrm d}t \lesssim \left( 1 + \frac{1}{\kappa ^2}\right) \left\| AF_k\right\| ^2_{L^2_t({\mathbb {R}})}. \end{aligned}$$
(4.12)

Combining (4.9), (4.12), (4.10) with (4.8) we have

$$\begin{aligned} \left\| A\phi _k\right\| _{L^2_t(I)}^2&\lesssim _R \left( 1 + \frac{1}{\kappa ^2}\right) \left\| AF_k\right\| ^2_{L^2_t({\mathbb {R}})} + \sum _{n=1}^\infty \left\| \Phi ^n_k(t)\right\| ^2_{L^2_t(I)} \\&\lesssim \left( 1 + \frac{1}{\kappa ^2}\right) \sum _{n=0}^\infty \left\| AF_k(t) \mathbf {1}_{Rn \le \left| kt\right| ^s \le R(n+1)}\right\| ^2_{L^2_t({\mathbb {R}})}, \end{aligned}$$

which completes the proof of the a priori estimate.

Rigorous Justification of the A Priori Estimate

The reader may have noticed that in the previous subsection it seems that we only used the bound from below \(|1-\mathcal {L}(k,\xi )|\ge \kappa \) with \(\xi = \mu _n + i \omega /|k|\), i.e. in the strip \(\mathfrak {R}e\, \xi \in (0,\bar{\lambda }/2)\). The subtlety is that the Fourier-Laplace transform of \(\Phi _k^n (t)\) is only guaranteed to exist when some \(L^2\) integrability as \(t \rightarrow \infty \) is known.

To be more specific, consider (4.5). From the Grönwall bound (4.3) established in step 1, it is clear that the Fourier-Laplace transform would exist if one chooses \(\mu _n< -2C\); however, it is not clear that we can perform the computation as \(\mu _n\) approaches the imaginary axis. In order to avoid a circular argument (establishing some time decay by assuming the existence of a Fourier-Laplace transform, which already requires some time decay), we can appeal to several arguments:

  1. 1.

    We can use as a black box the Paley-Wiener theory (see [72, Chap. 18] or [33, Chap. 2]): for every \(f \in L^1_{loc}({\mathbb {R}}_+)\) there exists a unique solution \(u \in L^1_{loc}({\mathbb {R}}_+)\) to the integral equation \(u = f + k * u\), with \(k \in L^1_t\) satisfying \(|k(t)| \le e^{Ct}\) for some constant \(C>0\) (where the convolution over \(t \in {\mathbb {R}}_+\) is defined as before by extending functions by zero on \({\mathbb {R}}_-\)), given by \(u = f + f* r\), where \(r \in L^1_{loc}({\mathbb {R}}_+)\) is the so-called resolvent kernel of k. The latter is the unique solution to \(r = k + r *k\) (so that indeed \(k*u = k*f + k*r*f = r*f\) and \(u = f + k*u\)), and the key result of the theory is that \(r \in L^1({\mathbb {R}}_+)\) iff the Fourier-Laplace transform of k satisfies \(\mathcal {L}[k](\xi ) \not =1\) for any \(\mathfrak {R}e \, \xi \le 0\). As soon as \(r, f \in L^1({\mathbb {R}}_+)\) we have \(u \in L^1({\mathbb {R}}_+)\). Then Step 3 of the proof of Lemma 4.1 can be justified by applying this theory to \(u(t) := \Phi ^n_k(t)\), \(f(t) :=R^n_k(t)\) and \(k(t) := e^{\mu _n\left| k\right| t} K^0(t,k)\); a concrete worked example is given after this list.

  2. 2.

    A second method is to use an approximation argument, in the spirit of energy methods in PDEs, which was discussed in [84, Section 3]. Define \(\Phi ^n_{k,\delta }(t) := \Phi ^n_k(t) e^{-\delta t^2/2}\). Now the Fourier-Laplace transform of \(\Phi ^n_{k,\delta }(t)\) exists for any \(\mu _n\), thanks to the Gaussian decay in time and (4.3). Using the Fourier-Laplace transform on (4.5) and (L), we may deduce, for \(\mu _n < -2C\) (where 2C comes from the estimate (4.3)), that the following formula holds:

    $$\begin{aligned} \hat{\Phi }^n_{k,\delta }(\omega ) = \left( \frac{\hat{R}^n_k(\omega )}{1 - \mathcal {L}\left( k,\mu _n + i \frac{\omega }{|k|} \right) } \right) *\gamma _\delta , \quad \gamma _\delta (\omega ) := \frac{e^{-\frac{|\omega |^2}{2 \delta }}}{\sqrt{2 \pi \delta }}. \end{aligned}$$

    Since this is an analytic function in \(\mu _n\) and \(\omega \) as long as we do not approach a singularity, by analytic continuation we may deduce that this formula remains valid for all \(\mu _n < \bar{\lambda }\) by (L). Therefore by Plancherel’s theorem

    $$\begin{aligned} \left\| \Phi _{k,\delta }^n(t)\right\| _{L_t^2({\mathbb {R}})} \lesssim \frac{1}{\kappa }\left\| R_k^n(t)\right\| _{L^2_t({\mathbb {R}})} \left\| \gamma _\delta \right\| _{L^1({\mathbb {R}})} \lesssim \frac{1}{\kappa }\left\| R_k^n(t)\right\| _{L^2_t({\mathbb {R}})} \end{aligned}$$

    which is an estimate independent of \(\delta >0\). We then let \(\delta \) go to zero and deduce by Fatou’s lemma the desired bound (4.6):

    $$\begin{aligned} \left\| \Phi _{k}^n(t)\right\| _{L_t^2({\mathbb {R}})} \lesssim \frac{1}{\kappa }\left\| R_k^n(t)\right\| _{L^2_t({\mathbb {R}})} \end{aligned}$$

    (which also justifies the existence of the Fourier-Laplace transform).
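As a concrete, purely illustrative instance of the Paley-Wiener criterion recalled in item 1 above, consider the scalar kernel \(k(t) = a\, e^{-t}\) on \({\mathbb {R}}_+\) with \(a > 0\), and write \(\mathcal {L}[g](\xi ) := \int _0^{+\infty } e^{\xi t} g(t) {\, \mathrm d}t\) (so that \(\mathcal {L}[g_1 * g_2] = \mathcal {L}[g_1]\mathcal {L}[g_2]\) where defined). Then

$$\begin{aligned} \mathcal {L}[k](\xi ) = \frac{a}{1-\xi }, \qquad \mathcal {L}[r](\xi ) = \frac{\mathcal {L}[k](\xi )}{1 - \mathcal {L}[k](\xi )} = \frac{a}{1 - a - \xi }, \qquad r(t) = a\, e^{-(1-a)t}, \end{aligned}$$

so the resolvent r belongs to \(L^1({\mathbb {R}}_+)\) exactly when \(a < 1\), which is precisely the condition that \(\mathcal {L}[k](\xi ) = 1\) has no root with \(\mathfrak {R}e \, \xi \le 0\) (the unique root being \(\xi = 1-a\)); in that case \(u = f + f*r \in L^1({\mathbb {R}}_+)\) whenever \(f \in L^1({\mathbb {R}}_+)\).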

Remark 16

To conclude this discussion, let us mention that it is related to the Gearhart-Herbst-Prüss-Greiner theorem [2, 26, 39, 75] (see also [24]) for semigroups in Hilbert spaces. The latter asserts that the semigroup decay is given by the spectral bound, under a pointwise control on the resolvent alone. While the constants seem to be non-constructive in the first versions of this theorem, Engel and Nagel gave a comprehensive and elementary proof with a constructive constant in [24, Theorem 1.10, Chapter V]. Let us also mention, on the same subject, more recent works such as [38]. The main idea in the proof of [24, Theorem 1.10, Chapter V], which is also used in [38], is to use a Plancherel identity on the resolvent in Hilbert spaces in order to obtain explicit rates of decay on the semigroup in terms of bounds on the resolvent. However, for the Volterra integral equation we study here, there is no semigroup structure on the unknown \(\phi _k(t)\), and we cannot appeal directly to these classical results.

The Penrose Criterion

A generalized form of the Penrose criterion [73] was given in [67] as follows:

$$\begin{aligned} (\mathbf{P}) \qquad \forall \, k \in {\mathbb {Z}}^d_* \ \text{ and } \ w \in {\mathbb {R}}\ \text{ s.t. } \ (f^0_k)'(w) =0, \quad \widehat{W}(k) \left( \text{ p.v. } \int _{\mathbb {R}}\frac{(f^0_k)'(r)}{r-w} {\, \mathrm d}r\right) <1, \end{aligned}$$
(4.13)

where \(f_k^0\) denotes the marginals of the background \(f^0\) along arbitrary wave vector \(k \in {\mathbb {Z}}^d_*\):

$$\begin{aligned} f^0_k(r) := \int _{kr/|k|+k^\bot } f^0(w) {\, \mathrm d}w, \quad r \in {\mathbb {R}}. \end{aligned}$$

The proof that condition (P) implies the condition (L) was not quite complete in [67] as it was proved only that (P) implies the lower bound \(|1-\mathcal {L}(k,\xi )|\ge \kappa \) in a strip and not in a half-plane. The complete proof due to Penrose relies on the argument principle. The starting point is to observe that \(\mathcal {L}(k,\xi ) = \int _0 ^{+\infty } e^{\bar{\xi }|k|t} K^0(t,k) {\, \mathrm d}t\) with \(\xi = \lambda + i \zeta \) and \(K^0(t,k) := -\hat{f}^0\left( kt\right) \widehat{W}(k)\left| k\right| ^2t\) is well-defined for \(\lambda < \bar{\lambda }\) by the analyticity of \(f^0\) and is small for large \(\zeta \) by integration by parts. We therefore restrict ourselves to a compact interval \(|\zeta |\le C\) in \(\zeta \), and we compute by the Plemelj formula (see [67] for more details) that

$$\begin{aligned} \mathcal {L}(k,i\zeta ) = \hat{W}(k) \left[ \left( \text{ p.v. } \int _{\mathbb {R}}\frac{(f^0_k)'(r)}{r-\zeta } \, dr \right) - i \pi (f^0_k)'(\zeta ) \right] . \end{aligned}$$
(4.14)

Therefore, the condition (P) implies that \(|1-\mathcal {L}(k,\xi )| \ge 2\kappa \) for some \(\kappa >0\) at \(\lambda =0\) and for \(|\zeta |\le C\). Combined with smallness for large \(\zeta \) and continuity, we deduce the lower bound \(|1-\mathcal {L}(k,\xi )| \ge \kappa \) in a strip \(\mathfrak {R}e \, \xi \in [0,\lambda ']\) for some \(\lambda '>0\). Since the function \(\xi \mapsto \mathcal {L}(k,\xi )\) is holomorphic on \(\mathfrak {R}e \, \xi < \bar{\lambda }\) and the value 1 is not taken on \(i{\mathbb {R}}\), by the argument principle, the value 1 can only be taken on \(\mathfrak {R}e \, \xi <0\) if \(\Xi : \zeta \mapsto \mathcal {L}(k,i\zeta )\) has a positive winding number around this value. However, this would imply that the curve \(\Xi \) crosses the real axis above 1, which is again prohibited by (4.14) and (P). This concludes the proof.
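As an illustration of (4.13) (a standard computation, not taken from the text above): in dimension \(d=1\) with the normalized Maxwellian background \(f^0(v) = (2\pi )^{-1/2} e^{-v^2/2}\), the marginal \(f^0_k\) coincides with \(f^0\) and its only critical point is \(w = 0\), where

$$\begin{aligned} \text{ p.v. } \int _{\mathbb {R}}\frac{(f^0_k)'(r)}{r-0} {\, \mathrm d}r = -\int _{\mathbb {R}}\frac{e^{-r^2/2}}{\sqrt{2\pi }} {\, \mathrm d}r = -1 \end{aligned}$$

(no principal value is actually needed since the integrand is smooth). Hence (P) reduces to \(\widehat{W}(k) > -1\) for all \(k \in {\mathbb {Z}}_*\): it holds automatically for repulsive interactions (\(\widehat{W} \ge 0\)), while in the attractive case it amounts to a Jeans-type smallness condition on the interaction.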

Energy Estimates on the \((\rho ,g)\) System

In this section we perform the necessary energy estimates to prove Proposition 2.2, that is, we deduce the multi-tier controls stated in (2.12) from (2.11) for suitable \(K_i\) and sufficiently small \(\epsilon \).

\(L^2_t(I)\) Estimates on the Density (Equation (2.12c))

The most fundamental estimate we need to make is the \(L_t^2(I)\) control (2.12c), which requires the linear damping Lemma 4.1 and the crucial plasma echo analysis carried out in §6 (which we apply as a black box in this section). The controls (2.11b) and (2.11a) were chosen specifically for this.

To highlight its primary importance, we state the inequality as a separate proposition.

Proposition 5.1

(Nonlinear control of \(\rho \)) For suitable \(K_i\) and \(\epsilon _0\) sufficiently small, the estimate (2.12c) holds under the bootstrap hypotheses (2.11).

Proof

Proposition 5.1 requires two main controls, which follow from (2.11b) and (2.11a) respectively.

  1. (a)

    Define the time-response kernel \(\bar{K}_{k,\ell }(t,\tau )\) for some \(c = c(s) \in (0,1)\) (determined by the proof):

    $$\begin{aligned} \bar{K}_{k,\ell }(t,\tau )&= \frac{1}{\left| \ell \right| ^{\gamma }}e^{(\lambda (t) - \lambda (\tau ))\langle k,kt \rangle ^s}e^{c\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s}\nonumber \\&\quad \times \left| k(t-\tau )\hat{g}_{k-\ell }(\tau ,kt-\ell \tau )\right| \mathbf {1}_{\ell \ne 0}. \end{aligned}$$
    (5.1)

    Proposition 5.1 will depend on the estimate

    $$\begin{aligned} \left( \sup _{t \ge 0} \sup _{k \in \mathbb {Z}_*^d}\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \left( \sup _{\tau \ge 0} \sup _{\ell \in \mathbb {Z}_*^d} \sum _{k \in \mathbb {Z}_*^d} \int _{\tau }^\infty \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}t\right) \lesssim K_2\epsilon ^2. \end{aligned}$$
    (5.2)
  2. (b)

    Proposition 5.1 will also depend on the estimate

    $$\begin{aligned} \sup _{\tau \ge 0}e^{(c-1)\alpha _0\langle \tau \rangle ^s} \sum _{k \in \mathbb {Z}^d} \sup _{\omega \in \mathbb {Z}_*^d} \sup _{x\in {\mathbb {R}}^d} \int _{-\infty }^\infty \left| (A \widehat{\nabla g})_{k}\left( \tau ,\frac{\omega }{\left| \omega \right| }\zeta - x\right) \right| ^2 d\zeta \lesssim K_1\epsilon ^2. \end{aligned}$$
    (5.3)

The condition (5.2) controls reaction: the interaction of the density with the lower frequencies of g; condition (5.3) controls transport: the interaction with higher frequencies of g. The latter, condition (5.3), follows from (2.11a): by Lemma 3.4 followed by (2.14),

$$\begin{aligned} \sum _{k \in \mathbb {Z}^d} \sup _{\omega \in \mathbb {Z}_*^d} \sup _{x \in {\mathbb {R}}^d} \int _{-\infty }^\infty \left| (A \widehat{\nabla g})_{k}\left( \tau ,\frac{\omega }{\left| \omega \right| }\zeta -x\right) \right| ^2 {\, \mathrm d}\zeta&\lesssim _M \left\| A \widehat{\nabla g}(\tau )\right\| ^2_{L_k^2 H^M_\eta } \lesssim K_1 \epsilon ^2 \langle \tau \rangle ^7, \end{aligned}$$

from which (5.3) follows by (3.8) and \(c < 1\). Since condition (5.2) is much harder to verify and contains the physical mechanism of the plasma echoes, we prove Proposition 5.1 assuming (5.2). In §6 below, we prove that (5.2) follows from (2.11b).

Expanding the integral equation (2.6) using the paraproduct decomposition:

$$\begin{aligned} \hat{\rho }_k(t)&= \hat{h}_{{\mathrm{in}}}(k,kt) - \int _0^t \hat{\rho }_k(\tau )\left| k\right| ^2 \widehat{W}(k)(t-\tau )\hat{f}^0(k(t-\tau )) {\, \mathrm d}\tau \nonumber \\&\quad - \int _0^t\sum _{\ell \in \mathbb {Z}_*^d}\sum _{N \ge 8} \hat{\rho }_\ell (\tau )_{<N/8} \widehat{W}(\ell ) \ell \cdot k(t-\tau ) \hat{g}_{k-\ell }(\tau ,kt-\ell \tau )_{N} {\, \mathrm d}\tau \nonumber \\&\quad - \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \sum _{N \ge 8} \hat{\rho }_\ell (\tau )_{N} \widehat{W}(\ell ) \ell \cdot k(t-\tau ) \hat{g}_{k-\ell }(\tau ,kt-\ell \tau )_{<N/8} {\, \mathrm d}\tau \nonumber \\&\quad - \int _0^t\sum _{\ell \in \mathbb {Z}_*^d}\sum _{N \in {\mathbb {D}}} \sum _{N/8 \le N^\prime \le 8N} \hat{\rho }_\ell (\tau )_{N^\prime } \widehat{W}(\ell ) \ell \cdot k(t-\tau ) \hat{g}_{k-\ell }(\tau ,kt-\ell \tau )_{N} {\, \mathrm d}\tau , \nonumber \\&= \hat{h}_{{\mathrm{in}}}(k,kt) - \int _0^t \hat{\rho }_k(\tau )\left| k\right| ^2 \widehat{W}(k)(t-\tau )\hat{f}^0(k(t-\tau )) {\, \mathrm d}\tau \nonumber \\&\quad - T_k(t) - R_k(t) - {\mathcal {R}}_k(t). \end{aligned}$$
(5.4)

Recall our convention that the Littlewood-Paley projection of \(\hat{\rho }_\ell (\tau )\) treats \(\ell \tau \) in place of the v frequency. We begin by applying Lemma 4.1 to (5.4), which implies for each \(k \in \mathbb {Z}_*^d\),

$$\begin{aligned} \left\| A\hat{\rho }_k\right\| ^2_{L_t^2(I)}&\lesssim C_{LD}\left\| A_k(\cdot ,k\cdot )\hat{h}_{{\mathrm{in}}}(k,k\cdot )\right\| ^2_{L_t^2(I)} + C_{LD}\left\| AT_k\right\| ^2_{L_t^2(I)} \nonumber \\&\quad + C_{LD}\left\| AR_k\right\| ^2_{L_t^2(I)} + C_{LD}\left\| A{\mathcal {R}}_k\right\| ^2_{L_t^2(I)}. \end{aligned}$$
(5.5)

First, Lemma 3.4 and a version of the argument applied in (2.13) (using also that \(\lambda (t)\) is decreasing (2.8)) imply

$$\begin{aligned} \sum _{k \in \mathbb {Z}_*^d} \left\| A_k(\cdot ,k\cdot )\hat{h}_{{\mathrm{in}}}(k,k\cdot )\right\| ^2_{L_t^2(I)}&= \sum _{k \in \mathbb {Z}_*^d} \int _0^{T^\star } \left| A_k(t,kt) \hat{h}_{{\mathrm{in}}}(k,kt)\right| ^2 {\, \mathrm d}t \nonumber \\&\lesssim \sum _{k \in \mathbb {Z}_*^d}\left\| A_k(0,\cdot )\hat{h}_{{\mathrm{in}}}(k,\cdot )\right\| _{H^M({\mathbb {R}}^d_\eta )}^2 \lesssim \epsilon ^2. \end{aligned}$$
(5.6)
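In more detail, the restriction step behind (5.6) uses only that \(\lambda (t) \le \lambda (0)\) together with Lemma 3.4, parametrizing the ray \(\{\eta = kt \, : \, t \ge 0\}\) (so that \({\, \mathrm d}\sigma = \left| k\right| {\, \mathrm d}t\)):

$$\begin{aligned} \int _0^{T^\star } \left| A_k(t,kt) \hat{h}_{{\mathrm{in}}}(k,kt)\right| ^2 {\, \mathrm d}t \le \frac{1}{\left| k\right| } \int _{\{\eta = kt\}} \left| A_k(0,\eta ) \hat{h}_{{\mathrm{in}}}(k,\eta )\right| ^2 {\, \mathrm d}\sigma (\eta ) \lesssim \left\| A_k(0,\cdot )\hat{h}_{{\mathrm{in}}}(k,\cdot )\right\| _{H^M({\mathbb {R}}^d_\eta )}^2, \end{aligned}$$

where the last inequality is Lemma 3.4 (recall \(M > d/2 > (d-1)/2\)) applied to \(A_k(0,\cdot )\hat{h}_{{\mathrm{in}}}(k,\cdot )\).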


Now we turn to the nonlinear contributions in (5.4).

Reaction

Our goal is to show

$$\begin{aligned} \left\| AR\right\| _{L_k^2L^2_t(I)}^2 \lesssim K_2\epsilon ^2\left\| A\hat{\rho }\right\| ^2_{L_k^2L^2_t(I)}, \end{aligned}$$
(5.7)

since for \(\epsilon \) chosen sufficiently small, this contribution can then be absorbed on the LHS of (5.5).

First, by applying (1.3),

$$\begin{aligned} \left\| AR\right\| _{L_k^2L^2_t(I)}^2&\lesssim \sum _{k \in \mathbb {Z}_*^d}\int _0^{T^\star } \left[ A_k(t,kt)\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \sum _{N \ge 8}\left| \hat{g}_{k-\ell }(\tau ,kt-\ell \tau )_{<N/8}\right. \right. \\&\quad \left. \left. \frac{\left| k(t-\tau )\right| }{\left| \ell \right| ^{\gamma }} \hat{\rho }_\ell (\tau )_N\right| {\, \mathrm d}\tau \right] ^{2} {\, \mathrm d}t. \end{aligned}$$

By definition, the Littlewood-Paley projections imply the frequency localizations (as in (3.15)):

$$\begin{aligned} \frac{N}{2} \le \left| \ell \right| + \left| \ell \tau \right|&\le \frac{3N}{2}, \end{aligned}$$
(5.8a)
$$\begin{aligned} \left| k-\ell \right| + \left| kt-\ell \tau \right|&\le \frac{3N}{32}, \end{aligned}$$
(5.8b)
$$\begin{aligned} \frac{13}{16} \le \frac{\left| k,kt\right| }{\left| \ell ,\tau \ell \right| }&\le \frac{19}{16}. \end{aligned}$$
(5.8c)

From (5.8), on the support of the integrand, (3.11) implies that for some \(c = c(s) \in (0,1)\):

$$\begin{aligned} A_k(t,kt)&= e^{(\lambda (t) - \lambda (\tau ))\langle k,kt \rangle ^s}A_k(\tau ,kt) \lesssim e^{(\lambda (t) - \lambda (\tau ))\langle k,kt \rangle ^s}e^{c\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s}A_\ell (\tau ,\ell \tau ). \end{aligned}$$
(5.9)

Therefore, by definition of \(\bar{K}\) we have (dropping the Littlewood-Paley projection on g),

$$\begin{aligned} \left\| AR\right\| _{L_k^2L^2_t(I)}^2&\lesssim \sum _{k\in \mathbb {Z}_*^d}\int _0^{T^\star } \left[ \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) A_\ell (\tau ,\ell \tau )\sum _{N \ge 8} \left| \hat{\rho }_\ell (\tau )_N\right| {\, \mathrm d}\tau \right] ^{2} {\, \mathrm d}t. \end{aligned}$$

Since the Littlewood-Paley projections define a partition of unity,

$$\begin{aligned} \left\| AR\right\| _{L_k^2L^2_t(I)}^2&\lesssim \sum _{k\in \mathbb {Z}_*^d}\int _0^{T^\star } \left[ \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) A_\ell (\tau ,\ell \tau )\left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right] ^{2} {\, \mathrm d}t. \end{aligned}$$

From here we may proceed analogously to §7 in [67], applying Schur’s test in \(L_k^2L_t^2\). Indeed,

$$\begin{aligned} \left\| AR\right\| _{L_k^2L^2_t(I)}^2&\lesssim \sum _{k\in \mathbb {Z}_*^d}\int _0^{T^\star }\left( \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \\&\quad \times \left( \int _0^t \sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) \left| A_\ell (\tau ,\ell \tau )\hat{\rho }_\ell (\tau )\right| ^2 {\, \mathrm d}\tau \right) {\, \mathrm d}t \\&\le \left( \sup _{t \ge 0} \sup _{k\in \mathbb {Z}_*^d}\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \\&\quad \times \sum _{k\in \mathbb {Z}_*^d}\int _0^{T^\star }\left( \int _0^t \sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) \left| A_\ell (\tau ,\ell \tau )\hat{\rho }_\ell (\tau )\right| ^2 {\, \mathrm d}\tau \right) {\, \mathrm d}t. \end{aligned}$$

By Fubini’s theorem,

$$\begin{aligned} \left\| AR\right\| _{L_k^2L^2_t(I)}^2&\lesssim \left( \sup _{t \ge 0} \sup _{k\in \mathbb {Z}_*^d}\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \\&\quad \times \sum _{\ell \in \mathbb {Z}_*^d} \int _0^{T^\star } \left( \int _{\tau }^{T^\star } \sum _{k\in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}t\right) \left| A_\ell (\tau ,\ell \tau )\hat{\rho }_\ell (\tau )\right| ^2 {\, \mathrm d}\tau \\&\lesssim \left( \sup _{t \ge 0} \sup _{k\in \mathbb {Z}_*^d}\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \\&\quad \times \left( \sup _{\tau \ge 0} \sup _{\ell \in \mathbb {Z}_*^d} \sum _{k\in \mathbb {Z}_*^d} \int _{\tau }^{T^\star }\bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}t\right) \left\| A\hat{\rho }\right\| ^2_{L^2_k L_t^2(I)}. \end{aligned}$$

Hence, condition (5.2) implies (5.7).
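For the reader’s convenience, we record the form of Schur’s test used in the two displays above: for a non-negative kernel \(K_{k,\ell }(t,\tau )\) and non-negative \(\varphi \),

$$\begin{aligned} \sum _{k \in \mathbb {Z}_*^d}\int _0^{T^\star }\left[ \int _0^t \sum _{\ell \in \mathbb {Z}_*^d} K_{k,\ell }(t,\tau )\varphi _\ell (\tau ) {\, \mathrm d}\tau \right] ^2 {\, \mathrm d}t \le \left( \sup _{t,k}\int _0^t\sum _{\ell } K_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \left( \sup _{\tau ,\ell }\sum _{k}\int _\tau ^{T^\star } K_{k,\ell }(t,\tau ) {\, \mathrm d}t\right) \left\| \varphi \right\| ^2_{L^2_\ell L^2_\tau }, \end{aligned}$$

which is exactly the combination of Cauchy-Schwarz (in \((\ell ,\tau )\)) and Fubini carried out above, applied with \(K = \bar{K}\) and \(\varphi _\ell (\tau ) = \left| A_\ell (\tau ,\ell \tau )\hat{\rho }_\ell (\tau )\right| \).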

Transport

As above, first apply (1.3),

$$\begin{aligned} \left\| AT\right\| _{L_k^2L_t^2(I)}^2&\lesssim \sum _{k \in \mathbb {Z}_*^d}\int _0^{T^\star } \left[ A_k(t,kt)\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \sum _{N \ge 8} \left| \hat{g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right. \right. \nonumber \\&\quad \left. \left. \frac{\left| k(t-\tau )\right| }{\left| \ell \right| ^{\gamma }} \hat{\rho }_\ell (\tau )_{<N/8}\right| {\, \mathrm d}\tau \right] ^{2} {\, \mathrm d}t. \end{aligned}$$

By the Littlewood-Paley projections, on the support of the integrand there holds,

$$\begin{aligned}&\displaystyle \frac{N}{2} \le \left| k-\ell \right| + \left| kt - \ell \tau \right| \le \frac{3N}{2}&\end{aligned}$$
(5.10a)
$$\begin{aligned}&\displaystyle \left| \ell \right| + \left| \ell \tau \right| \le \frac{3N}{32}&\end{aligned}$$
(5.10b)
$$\begin{aligned}&\displaystyle \frac{13}{16} \le \frac{\left| k,kt\right| }{\left| k-\ell ,kt-\tau \ell \right| } \le \frac{19}{16}.&\end{aligned}$$
(5.10c)

By (5.10), on the support of the integrand, (3.11) implies that for some \(c = c(s) \in (0,1)\):

$$\begin{aligned} A_k(t,kt)&= e^{(\lambda (t) - \lambda (\tau ))\langle k,kt \rangle ^s}A_k(\tau ,kt) \\&\lesssim e^{(\lambda (t) - \lambda (\tau ))\langle k,kt \rangle ^s}e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}A_{k-\ell }(\tau ,kt-\ell \tau ). \end{aligned}$$

Using that

$$\begin{aligned} \left| k(t-\tau )\right| \le \left| kt-\ell \tau \right| + \tau \left| k-\ell \right| \le \langle \tau \rangle \left| k-\ell ,kt-\ell \tau \right| , \end{aligned}$$
(5.11)

we have (ignoring the Littlewood-Paley projection on \(\rho \) and the \(\left| \ell \right| ^{-\gamma }\) which are not helpful),

$$\begin{aligned} \left\| AT\right\| _{L_k^2L_t^2(I)}^2&\lesssim \sum _{k \in \mathbb {Z}_*^d} \int _0^{T^\star }\left[ \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t \sum _{N \ge 8} \left| (A \widehat{\nabla g})_{k-\ell }(\tau ,kt-\ell \tau )_N\right| \right. \nonumber \\&\quad \left. e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right] ^2 {\, \mathrm d}t. \end{aligned}$$

Since the Littlewood-Paley projections define a partition of unity (and using also the Cauchy-Schwarz inequality),

$$\begin{aligned} \left\| AT\right\| _{L_k^2 L_t^2(I)}^2&\lesssim \sum _{k \in \mathbb {Z}_*^d} \int _0^{T^\star } \left[ \sum _{\ell \in \mathbb {Z}_*^d}\int _0^t e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right] \\&\quad \times \left[ \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t\left| (A \widehat{\nabla g})_{k-\ell }(\tau ,kt-\ell \tau )\right| ^2 e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right] {\, \mathrm d}t. \end{aligned}$$

By Cauchy-Schwarz and \(\sigma > \frac{d}{2}+2\),

$$\begin{aligned} \sum _{\ell \in \mathbb {Z}_*^d}\int _0^t e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau&\le \left( \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} e^{2(c-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle ^2\langle \ell ,\ell \tau \rangle ^{-2\sigma } {\, \mathrm d}\tau \right) ^{1/2} \left\| A\hat{\rho }\right\| _{L_k^2 L^2_t(I)} \nonumber \\&\lesssim \left\| A\hat{\rho }\right\| _{L_k^2 L_t^2(I)}. \end{aligned}$$
(5.12)
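We note, for clarity, that the final inequality in (5.12) uses that the bracketed factor is bounded uniformly in t: since \(c < 1\) the exponential is at most one, and since \(\langle \ell ,\ell \tau \rangle \gtrsim \langle \ell \rangle \langle \tau \rangle \) for \(\ell \ne 0\), a sketch of the bound (using only \(\sigma > \frac{d}{2}+2\)) is

$$\begin{aligned} \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} e^{2(c-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle ^2\langle \ell ,\ell \tau \rangle ^{-2\sigma } {\, \mathrm d}\tau \lesssim \sum _{\ell \in \mathbb {Z}_*^d} \langle \ell \rangle ^{-2\sigma } \int _0^\infty \langle \tau \rangle ^{2-2\sigma } {\, \mathrm d}\tau \lesssim _{\sigma ,d} 1. \end{aligned}$$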

Then (5.12) and Fubini’s theorem imply

$$\begin{aligned}&\left\| AT\right\| _{L_k^2 L_t^2(I)}^2\\&\quad \lesssim \left\| A\hat{\rho }\right\| _{L^2_kL_t^2(I)} \sum _{k \in \mathbb {Z}_*^d} \int _0^{T^\star } \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t\left| (A \widehat{\nabla g})_{k-\ell }(\tau ,kt-\ell \tau )\right| ^2 e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau {\, \mathrm d}t \\&\quad \lesssim \left\| A\hat{\rho }\right\| _{L_k^2L_t^2(I)}\sum _{\ell \in \mathbb {Z}_*^d} \int _0^{T^\star } \left( \sum _{k \in \mathbb {Z}_*^d} \int _\tau ^{T^\star }\left| (A \widehat{\nabla g})_{k-\ell }(\tau ,kt-\ell \tau )\right| ^2 {\, \mathrm d}t \right) e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \\&\quad \le \left\| A\hat{\rho }\right\| _{L_k^2L_t^2(I)}\sum _{\ell \in \mathbb {Z}_*^d} \int _0^{T^\star } \left( \sum _{k \in \mathbb {Z}_*^d} \int _{-\infty }^\infty \left| (A \widehat{\nabla g})_{k-\ell }(\tau ,kt - \ell \tau )\right| ^2 {\, \mathrm d}t \right) e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \\&\quad \le \left\| A\hat{\rho }\right\| _{L_k^2L_t^2(I)} \left( \sup _{\tau \ge 0}e^{(c-1)\alpha _0\langle \tau \rangle ^s} \sum _{k \in \mathbb {Z}^d}\sup _{\omega \in \mathbb {Z}_*^d} \sup _{x \in {\mathbb {R}}^d} \int _{-\infty }^\infty \left| (A \widehat{\nabla g})_{k}\left( \tau ,\frac{\omega }{\left| \omega \right| }\zeta - x\right) \right| ^2 {\, \mathrm d}\zeta \right) \\&\qquad \times \sum _{\ell \in \mathbb {Z}_*^d} \int _0^{T^\star } e^{\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau . \end{aligned}$$

Proceeding as in (5.12) then gives

$$\begin{aligned}&\left\| AT\right\| _{L_k^2 L_t^2(I)}^2 \lesssim \left\| A\hat{\rho }\right\| _{L_k^2L_t^2(I)}^2 \\&\quad \times \left( \sup _{\tau \ge 0}e^{(c-1)\alpha _0\langle \tau \rangle ^s} \sum _{k \in \mathbb {Z}^d}\sup _{\omega \in \mathbb {Z}_*^d}\sup _{x \in {\mathbb {R}}^d}\int _{-\infty }^\infty \left| (A \widehat{\nabla g})_{k}\left( \tau ,\frac{\omega }{\left| \omega \right| }\zeta -x\right) \right| ^2 {\, \mathrm d}\zeta \right) . \end{aligned}$$

Using condition (5.3), we derive

$$\begin{aligned} \left\| AT\right\| _{L_k^2 L_t^2(I)}^2&\lesssim _{\alpha _0} K_1 \epsilon ^2\left\| A\hat{\rho }\right\| _{L_k^2L^2_t(I)}^2, \end{aligned}$$
(5.13)

which suffices to treat transport.

Remainders

We treat the remainder with a variant of the method used to treat transport. First, by (1.3):

$$\begin{aligned} \left\| A{\mathcal {R}}\right\| ^2_{L_k^2L_t^2(I)}&\lesssim \sum _{k \in \mathbb {Z}_*^d} \int _0^{T^*}\left[ A_k(t,kt)\int _0^t\sum _{\ell \in \mathbb {Z}_*^d}\sum _{N \in {\mathbb {D}}} \sum _{N/8 \le N^\prime \le 8N} \left| \hat{\rho }_\ell (\tau )_{N^\prime }\right| \left| k(t-\tau )\right| \right. \nonumber \\&\qquad \left. \left| \hat{g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| {\, \mathrm d}\tau \right] ^2 {\, \mathrm d}t. \end{aligned}$$

Next we claim that on the support of the integrand there holds, for some \(c^\prime = c^\prime (s) \in (0,1)\),

$$\begin{aligned} A_k(t,kt) \lesssim _{\lambda _0,\alpha _0} e^{c^\prime \lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s} e^{c^\prime \lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}. \end{aligned}$$
(5.14)

Indeed, this follows by repeating the argument used to deduce (3.16), with kt replacing \(\eta \). Therefore, by (5.14) and the Cauchy-Schwarz inequality,

$$\begin{aligned} \left\| A{\mathcal {R}}\right\| ^2_{L_k^2L_t^2(I)}&\lesssim \sum _{k \in \mathbb {Z}_*^d} \int _0^{T^*}\left[ \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \left( \sum _{N \in {\mathbb {D}}}e^{2\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\langle \tau \rangle ^2\left| \hat{\rho }_\ell (\tau )_{\sim N}\right| ^2\right) {\, \mathrm d}\tau \right] \\&\quad \times \left[ \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \left( \sum _{N \in {\mathbb {D}}}e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}e^{2c^{\prime }\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s}\right. \right. \\&\quad \quad \left. \left. \left| \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| ^2 \right) {\, \mathrm d}\tau \right] {\, \mathrm d}t. \end{aligned}$$

By the almost orthogonality of the Littlewood-Paley decomposition (3.4) and \(\sigma > 1\),

$$\begin{aligned} \left\| A{\mathcal {R}}\right\| ^2_{L_k^2L_t^2(I)}&\lesssim \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)}\sum _{k \in \mathbb {Z}_*^d} \int _0^{T^*} \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \left( \sum _{N \in {\mathbb {D}}} e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\right. \\&\quad \,\times \left. e^{2c^{\prime }\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s}\left| \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| ^2 \right) {\, \mathrm d}\tau {\, \mathrm d}t. \end{aligned}$$

By Fubini’s theorem and Lemma 3.4,

$$\begin{aligned} \left\| A{\mathcal {R}}\right\| ^2_{L_k^2L_t^2(I)}&\lesssim \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)} \int _0^{T^\star }\sum _{\ell \in \mathbb {Z}_*^d} e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\\&\quad \times \left( \sum _{N \in {\mathbb {D}}} \sum _{k \in \mathbb {Z}_*^d} \int _\tau ^{T^\star } \left| A\widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| ^2 {\, \mathrm d}t\right) {\, \mathrm d}\tau \\&\lesssim \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)}\int _0^{T^\star }\sum _{\ell \in \mathbb {Z}_*^d} e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \\&\quad \times \left( \sum _{N \in {\mathbb {D}}} \sum _{k \in \mathbb {Z}^d} \sup _{\omega \in \mathbb {Z}_*^d} \sup _{x\in {\mathbb {R}}^d} \int _{-\infty }^{\infty } \left| A \widehat{\nabla g}_{k}\left( \tau ,\frac{\omega }{\left| \omega \right| } \zeta -x\right) _{N}\right| ^2 {\, \mathrm d}\zeta \right) {\, \mathrm d}\tau \\&\lesssim \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)} \int _0^{T^\star }\sum _{\ell \in \mathbb {Z}_*^d} e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \\&\quad \times \left( \sum _{N \in {\mathbb {D}}} \sum _{k \in \mathbb {Z}^d} \left\| A^{(1)}\hat{g}_k(\tau )_N\right\| ^2_{H^M({\mathbb {R}}^d_\eta )} \right) {\, \mathrm d}\tau . \end{aligned}$$

The Littlewood-Paley projections do not commute with derivatives in frequency space; however, since the projections have bounded derivatives, we still have (see §3),

$$\begin{aligned} \left\| A{\mathcal {R}}\right\| ^2_{L_k^2L_t^2(I)}&\lesssim _M \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)} \int _0^{T^\star }\sum _{\ell \in \mathbb {Z}_*^d} e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \\&\quad \times \left( \sum _{N \in {\mathbb {D}}} \sum _{\left| \alpha \right| \le M} \left\| (D_\eta ^\alpha A^{(1)} \hat{g}_k)(\tau )_{\sim N}\right\| ^2_{L_k^2L^2_\eta } \right) {\, \mathrm d}\tau . \end{aligned}$$

Then by the almost orthogonality (3.4) with (2.14), (3.8) and \(c^{\prime } < 1\), we have

$$\begin{aligned} \left\| A{\mathcal {R}}\right\| ^2_{L_k^2L_t^2(I)}&\lesssim K_1\epsilon ^2 \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)} \int _0^{T^\star }\sum _{\ell \in \mathbb {Z}_*^d} e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\langle \tau \rangle ^7 {\, \mathrm d}\tau \nonumber \\&\lesssim K_1\epsilon ^2 \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)}, \end{aligned}$$
(5.15)

which suffices to treat remainder contributions.

Conclusion of \(L^2\) Bound

Putting (5.6), (5.7), (5.13) and (5.15) together with (5.5) we have for some \(\tilde{K} = \tilde{K}(s,d,M,\lambda _0,\alpha _0)\),

$$\begin{aligned} \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)} \le \tilde{K} C_{LD}\epsilon ^2 + \tilde{K} C_{LD}(K_1 + K_2)\epsilon ^2\left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)}. \end{aligned}$$

Therefore for \(\epsilon ^2 < \frac{1}{2}(\tilde{K} C_{LD}(K_1 + K_2))^{-1}\) we have

$$\begin{aligned} \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)} < 2 \tilde{K} C_{LD} \epsilon ^2. \end{aligned}$$
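For the reader’s convenience, this is the standard absorption argument: abbreviating \(X = \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)}\) (finite on \(I = [0,T^\star ]\) by the bootstrap hypotheses (2.11)) and \(B = \tilde{K} C_{LD}(K_1+K_2)\epsilon ^2 < \frac{1}{2}\), the previous inequality reads \(X \le \tilde{K} C_{LD}\epsilon ^2 + BX\), and hence

$$\begin{aligned} X \le (1-B)^{-1}\tilde{K} C_{LD}\epsilon ^2 < 2\tilde{K} C_{LD}\epsilon ^2. \end{aligned}$$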

Hence, Proposition 5.1 follows provided we fix \(K_3 = \tilde{K} C_{LD}\).

Pointwise-in-time Estimate on the Density

The constant \(K_3\) essentially depends only on the linearized Vlasov equation with homogeneous background \(f^0\). The same is not true of the pointwise-in-time estimate we deduce next.

Lemma 5.2

(Pointwise estimate) For \(\epsilon _0\) sufficiently small, under the bootstrap hypotheses (2.11), there exists some \(K_4 = K_4(C_0,\bar{\lambda },\kappa ,M,s, d,\lambda _0,{\lambda ^\prime },K_1,K_2,K_3)\) such that for \(t \in [0,T^\star ]\),

$$\begin{aligned} \left\| A\rho (t)\right\| _2^2&\le K_4\langle t \rangle \epsilon ^2. \end{aligned}$$
(5.16)

Proof

As in [67], we use the \(L_t^2\) bound together with (2.6). Our starting point is again the paraproduct decomposition (5.4):

$$\begin{aligned} \left\| A\hat{\rho }(t)\right\| _{L_k^2}^2&\lesssim \sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)\hat{h}_{{\mathrm{in}}}(k,kt)\right| ^2 \nonumber \\&\quad + \sum _{k \in \mathbb {Z}_*^d} \left( A_k(t,kt)\int _0^t \hat{\rho }_k(\tau )\left| k\right| ^2 \widehat{W}(k)(t-\tau )\hat{f}^0(k(t-\tau )) {\, \mathrm d}\tau \right) ^{2} \nonumber \\&\quad + \sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)T_k(t)\right| ^2 + \sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)R_k(t)\right| ^2 \nonumber \\&\quad + \sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt){\mathcal {R}}_k(t)\right| ^2. \end{aligned}$$
(5.17)

To treat the initial data we use the \(H^{d/2+} \hookrightarrow C^0\) embedding and that \(\lambda (t)\) is decreasing (2.8):

$$\begin{aligned} \sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)\hat{h}_{{\mathrm{in}}}(k,kt)\right| ^2&\le \sum _{k \in \mathbb {Z}_*^d} \sup _{\eta \in {\mathbb {R}}^d} \left| A_k(t,\eta )\hat{h}_{{\mathrm{in}}}(k,\eta )\right| ^2 \lesssim \left\| A(0)\hat{h}_{{\mathrm{in}}}\right\| ^2_{L^2_k H_\eta ^M} \lesssim \epsilon ^2, \end{aligned}$$
(5.18)

where we used an argument analogous to (2.13) to deduce the last inequality.

Linear Contribution

Next consider the term in (5.17) coming from the homogeneous background. By (1.3), (3.9), (3.8), and the \(H^{d/2+}\hookrightarrow C^0\) embedding with (3.7), (1.9) and (2.12c),

$$\begin{aligned}&\sum _{k \in \mathbb {Z}_*^d} \left( A_k(t,kt)\int _0^t \hat{\rho }_k(\tau )\left| k\right| ^2 \widehat{W}(k)(t-\tau )\hat{f}^0(k(t-\tau )) {\, \mathrm d}\tau \right) ^{2} \nonumber \\&\quad \lesssim \sum _{k \in \mathbb {Z}_*^d} \left( \int _0^t A_k(\tau ,k\tau )\left| \hat{\rho }_k(\tau )\right| \langle k(t-\tau ) \rangle ^{\sigma +1} e^{\lambda (t)\langle k(t-\tau ) \rangle ^s} \left| \hat{f}^0(k(t-\tau ))\right| {\, \mathrm d}\tau \right) ^{2} \nonumber \\&\quad \lesssim \left( \sup _\eta e^{\lambda _0\langle \eta \rangle ^s} \left| \hat{f}^0(\eta )\right| \right) ^2 \sum _{k \in \mathbb {Z}_*^d} \left( \int _0^t A_k(\tau ,k\tau )\left| \hat{\rho }_k(\tau )\right| e^{\frac{1}{2}(\lambda (0)-\lambda _0)\langle t - \tau \rangle ^s} {\, \mathrm d}\tau \right) ^2 \nonumber \\&\quad \lesssim C_0^2\left( \sum _{k \in \mathbb {Z}_*^d} \int _0^t \left| A_k(\tau ,k\tau )\hat{\rho }_k(\tau )\right| ^2 {\, \mathrm d}\tau \right) \left( \int _0^t e^{(\lambda (0)-\lambda _0)\langle t - \tau \rangle ^s} {\, \mathrm d}\tau \right) \nonumber \\&\quad \lesssim \sum _{k \in \mathbb {Z}_*^d }\left\| A\hat{\rho }_k\right\| ^2_{L^2_t(I)} \nonumber \\&\quad \lesssim K_3\epsilon ^2. \end{aligned}$$
(5.19)

Reaction

Next we treat the reaction term in (5.17), which by (1.3) satisfies

$$\begin{aligned} \sum _{k \in \mathbb {Z}_*^d} \left| A_k(kt)R_k\right| ^2&\lesssim \sum _{k \in \mathbb {Z}_*^d} \left[ A_k(t,kt)\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \sum _{N \ge 8}\left| \hat{g}_{k-\ell }(\tau ,kt-\ell \tau )_{<N/8}\right| \right. \nonumber \\&\left. \quad \frac{\left| k(t-\tau )\right| }{\left| \ell \right| ^{\gamma }} \left| \hat{\rho }_\ell (\tau )_N\right| {\, \mathrm d}\tau \right] ^{2}. \end{aligned}$$

As in the \(L_t^2\) estimate we have by (5.9) and the definition (5.1) of \(\bar{K}\):

$$\begin{aligned} \sum _{k \in \mathbb {Z}_*^d} \left| A_k(kt)R_k(t)\right| ^2&\lesssim \sum _{k \in \mathbb {Z}_*^d} \left[ \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) \left| A_\ell (\tau ,\ell \tau ) \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right] ^{2}. \end{aligned}$$

By the Cauchy-Schwarz inequality and Fubini’s theorem,

$$\begin{aligned}&\sum _{k \in \mathbb {Z}_*^d} \left| A_k(kt)R_k\right| ^2 \lesssim \sum _{k \in \mathbb {Z}_*^d} \left( \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \\&\quad \quad \times \left( \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t \bar{K}_{k,\ell }(t,\tau )\left| A_\ell (\tau ,\ell \tau ) \hat{\rho }_\ell (\tau )\right| ^2 {\, \mathrm d}\tau \right) \\&\quad \lesssim \left( \sup _{t \ge 0}\sup _{k \in \mathbb {Z}_*^d}\int _0^t\sum _{\ell \in \mathbb {Z}_*^d } \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t \left( \sum _{k \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) \right) \left| A_\ell (\tau ,\ell \tau ) \hat{\rho }_\ell (\tau )\right| ^2 {\, \mathrm d}\tau \\&\quad \lesssim \left( \sup _{t \ge 0}\sup _{k \in \mathbb {Z}_*^d}\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \right) \left( \sup _{0 \le \tau \le t} \sup _{\ell \in \mathbb {Z}_*^d} \sum _{k \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) \right) \left\| A\hat{\rho }\right\| ^2_{L^2_k L^2_t(I)}. \end{aligned}$$

The first factor appears in (5.2) and is controlled by Lemma 6.1. The second factor is controlled by Lemma 6.3 and accounts for the loss of the power of \(\langle t \rangle \). Therefore by (2.12c),

$$\begin{aligned} \sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)R_k(t)\right| ^2 \lesssim K_2 K_3 \langle t \rangle \epsilon ^4, \end{aligned}$$
(5.20)

which suffices to treat the reaction term.

Transport

By (1.3) and \(\left| k(t-\tau )\right| \le \langle \tau \rangle \left| k-\ell ,kt-\ell \tau \right| \), the transport term is bounded by

$$\begin{aligned}&\sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)T_k(t)\right| ^2\\&\quad \lesssim \sum _{k \in \mathbb {Z}_*^d} \left[ A_k(t,kt)\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \sum _{N \ge 8} \left| \widehat{\nabla _{z,v} g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )_{<N/8}\right| {\, \mathrm d}\tau \right] ^{2}. \end{aligned}$$

We begin as in Proposition 5.1. By the frequency localizations (5.10) (which hold on the support of the integrand), (3.11) implies that for some \(c = c(s) \in (0,1)\) we have (using also that the Littlewood-Paley projections define a partition of unity),

$$\begin{aligned}&\sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)T_k(t)\right| ^2 \\&\quad \lesssim \sum _{k \in \mathbb {Z}_*^d} \left[ \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t \left| (A \widehat{\nabla g})_{k-\ell }(\tau ,kt-\ell \tau )\right| e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right] ^2. \end{aligned}$$

From Cauchy-Schwarz, (5.12) and (2.12c),

$$\begin{aligned} \sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)T_k(t)\right| ^2&\lesssim \sqrt{K_3}\epsilon \sum _{k \in \mathbb {Z}_*^d}\sum _{\ell \in \mathbb {Z}_*^d} \int _0^t\left| (A \widehat{\nabla g})_{k-\ell }(\tau ,kt-\ell \tau )\right| ^2\\&\quad \times e^{c\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \langle \tau \rangle \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau . \end{aligned}$$

By Fubini’s theorem, (3.8), the \(H^{d/2+} \hookrightarrow C^0\) embedding theorem and (2.14) with (3.8),

$$\begin{aligned}&\sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)T_k(t)\right| ^2\\&\quad \lesssim \sqrt{K_3}\epsilon \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t\left( \sum _{k \in \mathbb {Z}_*^d}\left| (A \widehat{\nabla g})_{k-\ell }(\tau ,kt-\ell \tau )\right| ^2 e^{\frac{1}{2}(c-1)\alpha _0\langle \tau \rangle ^s}\right) e^{\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \\&\quad \lesssim \sqrt{K_3}\epsilon \left( \sup _{\tau \le t}e^{\frac{1}{2}(c-1)\alpha _0\langle \tau \rangle ^s}\sum _{k \in \mathbb {Z}^d} \sup _{\eta \in {\mathbb {R}}^d} \left| (A \widehat{\nabla g})_{k}(\tau ,\eta )\right| ^2\right) \left( \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t e^{\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right) \\&\quad \lesssim \sqrt{K_3}\epsilon \left( \sup _{\tau \le t}e^{\frac{1}{2}(c-1)\alpha _0\langle \tau \rangle ^s}\left\| A^{(1)}\hat{g}\right\| ^2_{L^2_k H^M_\eta }\right) \left( \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t e^{\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right) \\&\quad \lesssim K_1\sqrt{K_3}\epsilon ^3 \left( \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t e^{\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\left| \hat{\rho }_\ell (\tau )\right| {\, \mathrm d}\tau \right) . \end{aligned}$$

Proceeding as in (5.12) and applying (2.12c), we get

$$\begin{aligned} \sum _{k \in \mathbb {Z}_*^d} \left| A_k(t,kt)T_k(t)\right| ^2 \lesssim K_3K_1 \epsilon ^4. \end{aligned}$$
(5.21)

Remainders

The remainder is treated by a slight variant of the argument used to treat transport. By (1.3),

$$\begin{aligned} \left\| A{\mathcal {R}}(t)\right\| _{L_k^2}^2&\lesssim \sum _{k \in \mathbb {Z}_*^d}\left[ A_k(t,kt)\int _0^t\sum _{\ell \in \mathbb {Z}_*^d}\sum _{N \in {\mathbb {D}}} \sum _{N/8 \le N^\prime \le 8N} \left| \hat{\rho }_\ell (\tau )_{N^\prime }\right| \left| k(t-\tau )\right| \right. \\&\qquad \left. \left| \hat{g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| {\, \mathrm d}\tau \right] ^2. \end{aligned}$$

As in Proposition 5.1, (5.14) holds on the support of the integrand and hence, by (5.11) and using the Cauchy-Schwarz inequality,

$$\begin{aligned}&\left\| A{\mathcal {R}}(t)\right\| _{L_k^2}^2 \lesssim \sum _{k \in \mathbb {Z}_*^d} \left[ \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \left( \sum _{N \in {\mathbb {D}} }e^{2\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\langle \tau \rangle ^2\left| \hat{\rho }_\ell (\tau )_{\sim N}\right| ^2\right) {\, \mathrm d}\tau \right] \\&\quad \times \left[ \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \left( \sum _{N \in {\mathbb {D}}}e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}e^{2c^{\prime }\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s}\left| \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| ^2 \right) {\, \mathrm d}\tau \right] . \end{aligned}$$

By the almost orthogonality of the Littlewood-Paley decomposition (3.4) and \(\sigma > 1\),

$$\begin{aligned} \left\| A{\mathcal {R}}(t)\right\| ^2_{L_k^2}&\lesssim \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)}\sum _{k \in \mathbb {Z}_*^d} \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \left( \sum _{N \in {\mathbb {D}}}e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}e^{2c^{\prime }\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s} \right. \\&\quad \left. \left| \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| ^2 \right) {\, \mathrm d}\tau . \end{aligned}$$

By Fubini’s theorem, the \(H^{d/2+} \hookrightarrow C^0\) embedding theorem and (3.8),

$$\begin{aligned} \left\| A{\mathcal {R}}(t)\right\| ^2_{L_k^2}&\lesssim \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)}\sum _{\ell \in \mathbb {Z}_*^d} \int _0^t e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s}\left( \sum _{k \in \mathbb {Z}_*^d} \sum _{N \in {\mathbb {D}}} e^{2c^{\prime }\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s}\right. \\&\quad \left. \left| \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )_{N}\right| ^2 \right) {\, \mathrm d}\tau \\&\lesssim \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)}\sum _{\ell \in \mathbb {Z}_*^d} \int _0^t e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \\&\quad \quad \times \left( \sum _{k \in \mathbb {Z}^d} \sum _{N \in {\mathbb {D}}} \left\| (A^{(-\beta )}\hat{g}_{k})(\tau )_{N}\right\| _{H^M_\eta }^2 \right) {\, \mathrm d}\tau . \end{aligned}$$

The Littlewood-Paley projections do not commute with derivatives in frequency space; however, since the projections have bounded derivatives, we still have (see §3),

$$\begin{aligned} \left\| A{\mathcal {R}}(t)\right\| ^2_{L_k^2}&\lesssim _M \left\| A\hat{\rho }\right\| ^2_{L_k^2L_t^2(I)} \sum _{\ell \in \mathbb {Z}_*^d} \int _0^t e^{2(c^{\prime }-1)\lambda (\tau )\langle \ell ,\ell \tau \rangle ^s} \nonumber \\&\quad \times \left( \sum _{N \in {\mathbb {D}}} \sum _{\left| \alpha \right| \le M} \left\| (D_\eta ^\alpha A^{(-\beta )} \hat{g}_k)(\tau )_{\sim N}\right\| ^2_{L_k^2L^2_\eta } \right) {\, \mathrm d}\tau . \end{aligned}$$

Hence by (3.4), (2.13), (2.12c) and (3.8),

$$\begin{aligned} \left\| A{\mathcal {R}}(t)\right\| ^2_{L_k^2}&\lesssim K_3 K_2 \epsilon ^4. \end{aligned}$$
(5.22)

Summing (5.18), (5.19), (5.20), (5.21) and (5.22) implies the result with \(K_4 \approx 1 + K_3 + K_2K_3 + K_3K_1\) (in fact we are rather sub-optimal); note that the factor \(\langle t \rangle \) in (5.16) arises only from the reaction contribution (5.20), the other contributions being uniform in time. \(\square \)

Proof of High Norm Estimate (2.12a)

In this section we derive the high norm estimate on the full distribution, (2.12a). Fix a multi-index \(\alpha \in {\mathbb {N}}^{d}\) with \(\left| \alpha \right| \le M\) and compute the time derivative

$$\begin{aligned} \frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}t}\left\| A^{(1)}D_\eta ^\alpha \hat{g}\right\| _2^2&= \sum _{k\in \mathbb {Z}^d}\int _\eta \dot{\lambda }(t)\langle k,\eta \rangle ^{s} \left| A^{(1)} D_\eta ^\alpha \hat{g}_k(\eta )\right| ^2 {\, \mathrm d}\eta \nonumber \\&\quad + \sum _{k \in \mathbb {Z}^d} \int _\eta A^{(1)} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} A^{(1)} D_\eta ^\alpha \partial _t \hat{g}_k(\eta ) {\, \mathrm d}\eta \nonumber \\&= CK + E. \end{aligned}$$
(5.23)

Like similar terms appearing in [10, 45, 50], the CK term (for ‘Cauchy-Kovalevskaya’) is used to absorb the highest order terms coming from E.

Turning to E, we separate into the linear and nonlinear contributions

$$\begin{aligned} E&= -\sum _{k \in \mathbb {Z}^d_*} \int _\eta A^{(1)}D_\eta ^\alpha \overline{\hat{g}_k(\eta )} A^{(1)}_k(t,\eta )D_\eta ^\alpha \left[ \hat{\rho }_k(t) \widehat{W}(k)k \cdot (\eta - tk) \widehat{f^0}(\eta - kt)\right] {\, \mathrm d}\eta \nonumber \\&\quad -\sum _{k\in \mathbb {Z}^d} \int _\eta A^{(1)}D_\eta ^\alpha \overline{\hat{g}_k(\eta )} A^{(1)}_k(t,\eta )D_\eta ^\alpha \left[ \sum _{\ell \in \mathbb {Z}_*^d} \hat{\rho }_\ell (t)\widehat{W}(\ell ) \ell \cdot (\eta - tk) \hat{g}_{k-\ell }(t,\eta - t\ell )\right] {\, \mathrm d}\eta \nonumber \\&= -E_{L} - E_{NL}. \end{aligned}$$
(5.24)

Linear Contribution

The linear contribution \(E_L\) is easier to handle from a regularity standpoint than \(E_{NL}\), since we may lose regularity when estimating \(f^0\). However, \(E_L\) carries one fewer power of \(\epsilon \), which requires some care to handle and is the reason we cannot simply take \(K_1\) in (2.12a) to be O(1). The treatment of \(E_L\) begins with the product lemma (3.13):

$$\begin{aligned} \left| E_L\right|&\lesssim \left\| A^{(1)} D_\eta ^\alpha \hat{g}\right\| _2\left\| A^{(1)} D_\eta ^\alpha (\eta \hat{f}^0(\eta ))\right\| _{L^2_\eta } \left\| \nabla _x W *_x \rho \right\| _{\mathcal {F}^{\tilde{c}\lambda (t);s}} \\&\quad + \left\| A^{(1)} D_\eta ^\alpha \hat{g}\right\| _2\left\| v^\alpha (\nabla _vf^0)\right\| _{\mathcal {G}^{\tilde{c} \lambda (t);s}}\left\| A^{(1)}\nabla _x W *_x \rho (t)\right\| _{2}. \end{aligned}$$

By (1.9), (3.8) and (3.7),

$$\begin{aligned} \left\| A^{(1)}D^{\alpha }_\eta (\eta \hat{f}^0(\eta ))\right\| _2&\le \left\| A^{(1)}\left( \eta D^{\alpha }_\eta \hat{f}^0(\eta )\right) \right\| _2 + \sum _{\left| j\right| = 1;j \le \alpha }\left\| A^{(1)} \left( \eta D^{\alpha -j}_\eta \hat{f}^0(\eta )\right) \right\| _2 \nonumber \\&\lesssim C_0. \end{aligned}$$
(5.25)

Next we use \(\gamma \ge 1\) to deduce from (1.3),

$$\begin{aligned} A^{(1)}_k(t,kt) \left| \widehat{W}(k)k\right| = \langle k,kt \rangle \left| \widehat{W}(k)k\right| A_k(t,kt) \lesssim \langle t \rangle A_k(t,kt), \end{aligned}$$

which implies (also using \(\tilde{c} < 1\) and (3.8)),

$$\begin{aligned} \left| E_L\right|&\lesssim e^{(\tilde{c} - 1)\alpha _0\langle t \rangle ^s}\left\| A^{(1)}D_\eta ^\alpha \hat{g}\right\| _2\left\| A\rho (t)\right\| _2 + \langle t \rangle \left\| A^{(1)} D_\eta ^\alpha \hat{g}\right\| _2\left\| A\rho (t)\right\| _2 \nonumber \\&\lesssim \langle t \rangle \left\| A^{(1)} D_\eta ^\alpha \hat{g}\right\| _2\left\| A\rho (t)\right\| _2. \end{aligned}$$
(5.26)

Commutator Trick for the Nonlinear Term

Turn now to the nonlinear term in (5.24), \(E_{NL}\). Here we cannot lose much regularity on any of the factors involved; however, we have additional powers of \(\epsilon \), which will eliminate the large constants. First, we expand the \(D_\eta ^\alpha \) derivative

$$\begin{aligned} E_{NL}&= \sum _{k \in \mathbb {Z}^d} \int _\eta A^{(1)}D_\eta ^\alpha \overline{\hat{g}_k(\eta )}\left( A^{(1)}_k(t,\eta ) \left[ \sum _{\ell \in \mathbb {Z}_*^d} \hat{\rho }_\ell (t)\widehat{W}(\ell )\ell \cdot ( \eta - tk ) D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right] \right) {\, \mathrm d}\eta \\&\quad + \sum _{k\in \mathbb {Z}^d} \int _\eta A^{(1)}D_\eta ^\alpha \overline{\hat{g}_k(\eta )}\left( A^{(1)}_k(t,\eta ) \sum _{\left| j\right| = 1; j \le \alpha }\left[ \sum _{\ell \in \mathbb {Z}_*^d} \hat{\rho }_\ell (t)\widehat{W}(\ell )\ell _j D_\eta ^{\alpha -j} \hat{g}_{k-\ell }(t,\eta - t\ell )\right] \right) {\, \mathrm d}\eta \\&= E_{NL}^1 + E_{NL}^2. \end{aligned}$$

Consider first \(E_{NL}^1\), as this contains an extra derivative which results in a loss of regularity that must be balanced by the CK term in (5.23). To gain from the cancellations inherent in transport we follow the commutator trick used in (for example) [10, 25, 45, 50] by applying the identity,

$$\begin{aligned} \frac{1}{2}\int _{{\mathbb {T}}^d \times {\mathbb {R}}^d} F(t,z+tv)\cdot (\nabla _v - t\nabla _z) \left[ A^{(1)}(v^\alpha g)\right] ^2 {\, \mathrm d}z {\, \mathrm d}v = 0, \end{aligned}$$
(5.27)

to write,

$$\begin{aligned} E_{NL}^1&= \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta A^{(1)} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} \hat{\rho }_\ell (t)\widehat{W}(\ell )\ell \cdot (\eta - kt) \\&\quad \times \left[ A^{(1)}_{k}(t,\eta ) - A^{(1)}_{k-\ell }(t,\eta -t\ell )\right] \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) {\, \mathrm d}\eta . \end{aligned}$$
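For completeness, the identity (5.27) used just above follows from integration by parts (assuming sufficient localization of g in v to justify it): the field \(\nabla _v - t\nabla _z\) annihilates any function of \(z+tv\), so

$$\begin{aligned} \int _{{\mathbb {T}}^d \times {\mathbb {R}}^d} F(t,z+tv)\cdot (\nabla _v - t\nabla _z) \left[ A^{(1)}(v^\alpha g)\right] ^2 {\, \mathrm d}z {\, \mathrm d}v = -\int _{{\mathbb {T}}^d \times {\mathbb {R}}^d} \left[ (\nabla _v - t\nabla _z)\cdot F(t,z+tv)\right] \left[ A^{(1)}(v^\alpha g)\right] ^2 {\, \mathrm d}z {\, \mathrm d}v = 0, \end{aligned}$$

since \((\nabla _v - t\nabla _z)\left[ F_j(t,z+tv)\right] = t(\nabla F_j)(t,z+tv) - t(\nabla F_j)(t,z+tv) = 0\) for each component \(F_j\).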

We divide further via paraproduct:

$$\begin{aligned} E_{NL}^1&= \sum _{N \ge 8} T^1_N + \sum _{N \ge 8} R^1_N + {\mathcal {R}}^1, \end{aligned}$$
(5.28)

where the transport term is given by

$$\begin{aligned} T^1_N&= \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta A^{(1)}D_\eta ^\alpha \overline{\hat{g}_k(\eta )} \hat{\rho }_\ell (t)_{<N/8}\widehat{W}(\ell )\ell \cdot (\eta - kt) \nonumber \\&\quad \times \left[ A^{(1)}_{k}(t,\eta ) - A^{(1)}_{k-\ell }(t,\eta -t\ell )\right] \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{N} {\, \mathrm d}\eta , \end{aligned}$$
(5.29)

and the reaction term by

$$\begin{aligned} R^1_N&= \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta A^{(1)}D_\eta ^\alpha \overline{\hat{g}_k(\eta )} \hat{\rho }_\ell (t)_{N}\widehat{W}(\ell )\ell \cdot \left( \eta - kt\right) \nonumber \\&\quad \times \left[ A^{(1)}_{k}(t,\eta ) - A^{(1)}_{k-\ell }(t,\eta -t\ell )\right] \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{<N/8} {\, \mathrm d}\eta . \end{aligned}$$
(5.30)

The remainder, \({\mathcal {R}}^1\), is whatever is left over.

Transport

On the support of the integrand in (5.29) we have

$$\begin{aligned}&\displaystyle \frac{N}{2} \le \left| k-\ell ,\eta -t\ell \right| \le \frac{3N}{2},&\end{aligned}$$
(5.31a)
$$\begin{aligned}&\displaystyle \left| \ell ,\ell t\right| \le \frac{3N}{32},&\end{aligned}$$
(5.31b)
$$\begin{aligned}&\displaystyle \frac{13}{16} \le \frac{\left| k,\eta \right| }{\left| k-\ell ,\eta -t\ell \right| } \le \frac{19}{16}.&\end{aligned}$$
(5.31c)

By (5.31a) we can gain from the multiplier:

$$\begin{aligned} \left| \frac{A^{(1)}_k(\eta )}{A^{(1)}_{k-l}(\eta -\ell t)} - 1\right|&= \left| \frac{e^{\lambda \langle k,\eta \rangle ^s} \langle k,\eta \rangle ^{\sigma +1}}{e^{\lambda \langle k-\ell ,\eta -\ell t \rangle ^s} \langle k-\ell ,\eta -\ell t \rangle ^{\sigma +1}} - 1 \right| \nonumber \\&\le \left| e^{\lambda \langle k,\eta \rangle ^s -\lambda \langle k-\ell ,\eta -\ell t \rangle ^s} - 1 \right| \nonumber \\&\quad + e^{\lambda \langle k,\eta \rangle ^s -\lambda \langle k-\ell ,\eta -\ell t \rangle ^s}\left| \frac{\langle k,\eta \rangle ^{\sigma +1}}{\langle k-\ell ,\eta -t\ell \rangle ^{\sigma +1}} -1 \right| . \end{aligned}$$
(5.32)

By \(\left| e^x - 1\right| \le xe^x\), (3.10) and (3.11) (using (5.31a)), there is some \(c = c(s) \in (0,1)\) such that:

$$\begin{aligned} \left| e^{\lambda \langle k,\eta \rangle ^s -\lambda \langle k-\ell ,\eta -\ell t \rangle ^s} - 1\right|&\le \lambda \left| \langle k,\eta \rangle ^s - \langle k-\ell ,\eta -\ell t \rangle ^s\right| e^{\lambda \langle k,\eta \rangle ^s -\lambda \langle k-\ell ,\eta -\ell t \rangle ^s} \nonumber \\&\lesssim \frac{\langle \ell ,\ell t \rangle }{\langle k,\eta \rangle ^{1-s} +\langle k-\ell ,\eta -\ell t \rangle ^{1-s}} e^{c\lambda \langle \ell ,\ell t \rangle ^s}. \end{aligned}$$
(5.33)
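The other term in (5.32) is handled by the mean-value theorem; a brief sketch, using the comparability (5.31c) (with implicit constants depending on \(\sigma \)) and the 1-Lipschitz bound \(\left| \langle k,\eta \rangle - \langle k-\ell ,\eta -t\ell \rangle \right| \le \left| \ell ,\ell t\right| \), is

$$\begin{aligned} \left| \frac{\langle k,\eta \rangle ^{\sigma +1}}{\langle k-\ell ,\eta -t\ell \rangle ^{\sigma +1}} -1 \right| \lesssim _\sigma \frac{\left| \langle k,\eta \rangle - \langle k-\ell ,\eta -t\ell \rangle \right| }{\langle k-\ell ,\eta -t\ell \rangle } \lesssim \frac{\langle \ell ,\ell t \rangle }{\langle k,\eta \rangle ^{1-s} +\langle k-\ell ,\eta -t\ell \rangle ^{1-s}}, \end{aligned}$$

where the last step uses \(\langle k-\ell ,\eta -t\ell \rangle \ge \langle k-\ell ,\eta -t\ell \rangle ^{1-s} \gtrsim \langle k,\eta \rangle ^{1-s} + \langle k-\ell ,\eta -t\ell \rangle ^{1-s}\) (by (5.31c)).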

Hence, after multiplying by the exponential prefactor, which (3.11) bounds by \(e^{c\lambda \langle \ell ,\ell t \rangle ^s}\) as above, the other term in (5.32) also obeys a bound not worse than (5.33). Therefore, applying (1.3), (5.32), (5.33) and adding a frequency localization by (5.31a) to \(T^1_N\) implies

$$\begin{aligned} \left| T^1_N\right|&\lesssim \sum _{k \in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)} D_\eta ^\alpha \hat{g}_k(\eta )\right| \left| \hat{\rho }_\ell (t)_{<N/8}\right| \frac{\left| \eta - \ell t - t(k-\ell )\right| \langle \ell ,\ell t \rangle }{\langle k,\eta \rangle ^{1-s} +\langle k-\ell ,\eta -\ell t \rangle ^{1-s}} e^{c\lambda \langle \ell ,\ell t \rangle ^s} \\&\quad \times A^{(1)}_{k-\ell }(t,\eta -t\ell )\left| \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right) _{N}\right| {\, \mathrm d}\eta \\&\lesssim \langle t \rangle ^2\sum _{k \in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| \left( A^{(1)} D_\eta ^\alpha \hat{g}_k(\eta )\right) _{\sim N}\right| \left| \hat{\rho }_\ell (t)\right| \langle \ell \rangle \left| k-\ell ,\eta - t\ell \right| ^{s/2}\left| k,\eta \right| ^{s/2} e^{c\lambda \langle \ell ,\ell t \rangle ^s} \\&\quad \times A^{(1)}_{k-\ell }(t,\eta -t\ell )\left| \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right) _{N}\right| {\, \mathrm d}\eta . \end{aligned}$$

Applying (3.1) implies

$$\begin{aligned} \left| T^1_N\right|&\lesssim \langle t \rangle ^2 \left\| \left| \nabla _{z,v}\right| ^{s/2}A^{(1)}(v^\alpha g)_{\sim N}\right\| _2\left\| \left| \nabla _{z,v}\right| ^{s/2}A^{(1)}(v^\alpha g)_{N}\right\| _2 \left\| \rho (t)\right\| _{\mathcal {F}^{c\lambda (t),\frac{d}{2}+2;s}}. \end{aligned}$$

Using the regularity gap provided by \(c < 1\) and (3.8) (also \(\sigma > \frac{d}{2}+2\)),

$$\begin{aligned} \left| T^1_N\right|&\lesssim \langle t \rangle ^2 e^{(c-1)\lambda (t) \langle t \rangle ^{s}}\left\| A\rho (t)\right\| _2\left\| \langle \nabla _{z,v} \rangle ^{s/2}A^{(1)}(v^\alpha g)_{\sim N}\right\| _2\left\| \langle \nabla _{z,v} \rangle ^{s/2}A^{(1)}(v^\alpha g)_{N}\right\| _2 \nonumber \\&\lesssim _{\alpha _0} e^{\frac{1}{2}(c-1)\alpha _0 \langle t \rangle ^{s}} \left\| A\rho (t)\right\| _2 \left\| \langle \nabla _{z,v} \rangle ^{s/2}A^{(1)}(v^\alpha g)_{\sim N}\right\| _2^2. \end{aligned}$$
(5.34)

We will find that this term is eventually absorbed by the CK term in (5.23).

Reaction

Next we consider the reaction contribution, where the commutator introduced by the identity (5.27) to deal with transport will not be useful. Hence, write \(R^1_N = R_N^{1;1} + R_N^{1;2}\) where

$$\begin{aligned} R_N^{1;1}&= \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta A^{(1)}D_\eta ^\alpha \overline{\hat{g}_k(\eta )} A^{(1)}_{k}(t,\eta ) \hat{\rho }_\ell (t)_{N}\widehat{W}(\ell )\ell \cdot \left( \eta - kt\right) \nonumber \\&\quad \times \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{<N/8}{\, \mathrm d}\eta . \end{aligned}$$

We focus on \(R_N^{1;1}\) first; \(R_N^{1;2}\) is easier since the norm lands on the ‘low frequency’ factor. On the support of the integrand, we have the frequency localizations (3.15), from which it follows by (3.11) that there exists some \(c = c(s) \in (0,1)\) such that

$$\begin{aligned} \left| R_N^{1;1}\right|&\lesssim \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)}D_\eta ^\alpha \hat{g}_k(\eta )\right| A^{(1)}_\ell (t,\ell t)\left| \widehat{W}(\ell ) \ell \hat{\rho }_\ell (t)_{N}\right| \nonumber \\&\quad \times e^{c\lambda \langle k-\ell ,\eta -t\ell \rangle ^s} \left| \left[ \eta - tk\right] \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right) _{<N/8} \right| {\, \mathrm d}\eta . \end{aligned}$$
(5.35)

Here again we make crucial use of the assumption \(\gamma \ge 1\), as in the treatment of \(E_L\):

$$\begin{aligned} A^{(1)}_\ell (t,\ell t)\left| \widehat{W}(\ell ) \ell \right| \lesssim \frac{\langle \ell ,\ell t \rangle }{\left| \ell \right| } A_\ell (t,\ell t)&\lesssim \langle t \rangle A_\ell (t,\ell t). \end{aligned}$$

Therefore (adding a frequency localization by (3.15)), by \(\left| \eta -kt\right| \le \langle t \rangle \left| k-\ell ,\eta -t\ell \right| \) and (3.8)

$$\begin{aligned} \left| R_N^{1;1}\right|&\lesssim \langle t \rangle \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)}\left( D_\eta ^\alpha \hat{g}_k(\eta )\right) _{\sim N}\right| \left| A_\ell (t,t \ell ) \hat{\rho }_\ell (t)_{N}\right| e^{c\lambda \langle k-\ell ,\eta -t\ell \rangle ^s}\nonumber \\&\quad \quad \left| (\eta - tk) \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right) \right| {\, \mathrm d}\eta \\&\lesssim \langle t \rangle ^2\sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d}\int _\eta \left| A^{(1)} \left( D_\eta ^\alpha \hat{g}_k(\eta )\right) _{\sim N}\right| \left| A_\ell (t,t \ell ) \hat{\rho }_\ell (t)_{N}\right| e^{\lambda \langle k-\ell ,\eta -t\ell \rangle ^s}\\&\quad \quad \left| \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right) \right| {\, \mathrm d}\eta . \end{aligned}$$

Applying (3.2) and \(\sigma - \beta > \frac{d}{2}+1\),

$$\begin{aligned} \left| R_N^{1;1}\right|&\lesssim \langle t \rangle ^2 \left\| A^{(1)}\left( v^\alpha g\right) _{\sim N}\right\| _2 \left\| A\rho _N\right\| _2 \left\| A^{(-\beta )}v^\alpha g\right\| _2 \nonumber \\&\lesssim \frac{\left\| A^{(-\beta )}v^\alpha g\right\| _2}{\langle t \rangle ^2} \left\| A^{(1)}\left( v^\alpha g\right) _{\sim N}\right\| ^2_2 + \left\| A^{(-\beta )}v^\alpha g\right\| _2 \langle t \rangle ^6 \left\| A\rho _N\right\| ^2_2, \end{aligned}$$
(5.36)

which will suffice to treat this term.
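For clarity, the last line of (5.36) is simply an instance of Young’s inequality: with \(X = \left\| A^{(1)}\left( v^\alpha g\right) _{\sim N}\right\| _2\) and \(Y = \left\| A\rho _N\right\| _2\),

$$\begin{aligned} \langle t \rangle ^2 XY \le \frac{1}{2\langle t \rangle ^2}X^2 + \frac{\langle t \rangle ^6}{2}Y^2, \end{aligned}$$

multiplied through by \(\left\| A^{(-\beta )}v^\alpha g\right\| _2\).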

Next we turn to \(R_N^{1;2}\), which is easier. By (1.3),

$$\begin{aligned} \left| R_N^{1;2}\right|&\lesssim \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)} D_\eta ^\alpha \hat{g}_k(\eta )\right| \left| \hat{\rho }_\ell (t)_{N}\right| \left| \eta - kt\right| \\&\quad \quad \left| A^{(1)}_{k-\ell }(t,\eta -t\ell ) \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{<N/8}\right| {\, \mathrm d}\eta . \end{aligned}$$

Since the frequency localizations (3.15) hold also on the support of the integrand of \(R_N^{1;2}\) (in particular \(\left| \eta -kt\right| \le \langle t \rangle \left| k-\ell ,\eta -t\ell \right| \lesssim \langle t \rangle \left| \ell ,\ell t\right| \)),

$$\begin{aligned} \left| R_N^{1;2}\right|&\lesssim \langle t \rangle \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)} D_\eta ^\alpha \hat{g}_k(\eta )_{\sim N}\right| \langle \ell ,t\ell \rangle \left| \hat{\rho }_\ell (t)_{N}\right| A^{(1)}_{k-\ell }(t,\eta -t\ell )\nonumber \\&\quad \left| \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{<N/8}\right| {\, \mathrm d}\eta . \end{aligned}$$

Therefore, by (3.1), (3.5), (3.8), (3.15) and \(\sigma > \frac{d}{2} + 3\),

$$\begin{aligned} \left| R_N^{1;2}\right|&\lesssim \langle t \rangle \left\| A^{(1)}(v^\alpha g)_{\sim N}\right\| _2 \left\| \rho (t)_N\right\| _{\mathcal {F}^{0,\frac{d}{2}+2}} \left\| A^{(1)}(v^\alpha g)\right\| _2 \nonumber \\&\lesssim \frac{\langle t \rangle }{N} \left\| A^{(1)}(v^\alpha g)_{\sim N}\right\| _2 \left\| \rho (t)_N\right\| _{\mathcal {F}^{0,\frac{d}{2}+3}} \left\| A^{(1)}(v^\alpha g)\right\| _2 \nonumber \\&\lesssim \frac{e^{-\alpha _0\langle t \rangle ^s}}{N} \left\| A^{(1)}(v^\alpha g)\right\| ^2_2 \left\| A\rho (t)\right\| _2, \end{aligned}$$
(5.37)

which suffices to treat this term.

Remainders

In order to complete the treatment of \(E_{NL}^1\) it remains to estimate the remainder \({\mathcal {R}}^1\). As with \(R_N^{1}\), the commutator introduced by (5.27) is not helpful, so we divide into two pieces:

$$\begin{aligned} {\mathcal {R}}^1&= \sum _{N \in {\mathbb {D}}} \sum _{N/8 \le N^\prime \le 8N} \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta A^{(1)} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} \hat{\rho }_\ell (t)_{N^\prime }\widehat{W}(\ell )\ell \cdot (\eta - kt) \\&\quad \times \left[ A^{(1)}_{k}(t,\eta ) - A^{(1)}_{k-\ell }(t,\eta -t\ell )\right] \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{N} {\, \mathrm d}\eta \\&= {\mathcal {R}}^{1;1} + {\mathcal {R}}^{1;2}. \end{aligned}$$

Analogously to (5.14), we claim that on the support of the integrand there holds, for some \(c^\prime = c^\prime (s) \in (0,1)\),

$$\begin{aligned} A^{(1)}_k(t,\eta ) \lesssim _{\lambda _0,\alpha _0} e^{c^\prime \lambda (t)\langle k-\ell ,\eta -t\ell \rangle ^s} e^{c^\prime \lambda (t)\langle \ell ,\ell t \rangle ^s}, \end{aligned}$$
(5.38)

which again follows by the argument used to deduce (3.16). Therefore, (5.38) implies (using also (1.3) and \(\left| \eta -kt\right| \le \langle t \rangle \langle k-\ell ,\eta - t\ell \rangle \)),

$$\begin{aligned} \left| {\mathcal {R}}^{1;1}\right|&\lesssim \langle t \rangle \sum _{N \in {\mathbb {D}}} \sum _{N^\prime \approx N} \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)} D_\eta ^\alpha \hat{g}_k(\eta )\right| e^{c^\prime \lambda (t)\langle \ell ,\ell t \rangle ^s}\left| \hat{\rho }_\ell (t)_{N^\prime }\right| \\&\quad \times \langle k-\ell ,\eta -t\ell \rangle e^{c^\prime \lambda (t)\langle k-\ell ,\eta -t\ell \rangle ^s}\left| \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{N}\right| {\, \mathrm d}\eta . \end{aligned}$$

Applying (3.2), \(\sigma >\frac{d}{2}+2\), (3.5) and (3.3), we have

$$\begin{aligned} \left| {\mathcal {R}}^{1;1}\right|&\lesssim e^{\frac{1}{2}(c^\prime - 1)\alpha _0\langle t \rangle ^s}\sum _{N \in {\mathbb {D}}} \sum _{N^\prime \approx N}\left\| A^{(1)}(v^\alpha g)\right\| _2 \frac{1}{N^\prime } \left\| A \rho (t)_{N^\prime }\right\| _2 \left\| (v^\alpha g)_{N}\right\| _{\mathcal {G}^{c^\prime \lambda (t);\frac{d}{2}+2;s}} \nonumber \\&\lesssim \left\| A\rho (t)\right\| _2 e^{\frac{1}{2}(c^\prime - 1)\alpha _0\langle t \rangle ^s}\sum _{N \in {\mathbb {D}}}\left\| A^{(1)}(v^\alpha g)\right\| _2 \frac{1}{N} \left\| A^{(1)}(v^\alpha g)_{N}\right\| _2 \nonumber \\&\lesssim \left\| A\rho (t)\right\| _2\langle t \rangle ^{-1}e^{\frac{1}{4}(c^\prime - 1)\alpha _0\langle t \rangle ^s} \left\| A^{(1)}(v^\alpha g)\right\| ^2_2, \end{aligned}$$
(5.39)

which will suffice to treat this term.

Treating \({\mathcal {R}}^{1;2}\) is very similar to \({\mathcal {R}}^{1;1}\). Indeed, on the support of the integrand \(\langle k-\ell ,\eta -t\ell \rangle \lesssim \langle \ell ,t\ell \rangle \) by the same logic used to deduce (5.38) and hence (1.3) and (3.8) imply

$$\begin{aligned} \left| {\mathcal {R}}^{1;2}\right|&\lesssim \langle t \rangle \sum _{N \in {\mathbb {D}}} \sum _{N^\prime \approx N} \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)} D_\eta ^\alpha \hat{g}_k(\eta )\right| \left| \hat{\rho }_\ell (t)_{N^\prime }\right| \\&\quad \times \langle k-\ell ,\eta -t\ell \rangle A^{(1)}_{k-\ell }(t,\eta -\ell t)\left| \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{N}\right| {\, \mathrm d}\eta \\&\lesssim \langle t \rangle \sum _{N \in {\mathbb {D}}} \sum _{N^\prime \approx N} \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)} D_\eta ^\alpha \hat{g}_k(\eta )\right| e^{\frac{1}{2}\lambda (t)\langle \ell ,\ell t \rangle ^s}\left| \hat{\rho }_\ell (t)_{N^\prime }\right| \\&\quad \times A^{(1)}_{k-\ell }(t,\eta -\ell t)\left| \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{N}\right| {\, \mathrm d}\eta , \end{aligned}$$

which implies by (3.1), (3.8), (3.3) and \(\sigma > d/2 + 2\),

$$\begin{aligned} \left| {\mathcal {R}}^{1;2}\right|&\lesssim \langle t \rangle e^{-\frac{1}{2}\alpha _0\langle t \rangle ^s}\sum _{N \in {\mathbb {D}}} \sum _{N^\prime \approx N}\left\| A^{(1)}(v^\alpha g)\right\| _2 \frac{1}{N^\prime } \left\| A\hat{\rho }(t)_{N^\prime }\right\| _2 \left\| A^{(1)}(v^\alpha g)_{N}\right\| _2 \nonumber \\&\lesssim \langle t \rangle e^{-\frac{1}{2}\alpha _0\langle t \rangle ^s}\left\| A\rho (t)\right\| _2 \sum _{N \in {\mathbb {D}}}\left\| A^{(1)}(v^\alpha g)\right\| _2 \frac{1}{N} \left\| A^{(1)}(v^\alpha g)_{N}\right\| _2 \nonumber \\&\lesssim \langle t \rangle ^{-1}\left\| A\rho (t)\right\| _2 e^{-\frac{1}{4}\alpha _0\langle t \rangle ^s} \left\| A^{(1)}(v^\alpha g)\right\| ^2_2, \end{aligned}$$
(5.40)

which suffices to treat \({\mathcal {R}}^{1;2}\).

Treatment of Lower Moments

Next we turn to the treatment of \(E_{NL}^2\). First apply (1.3) (using \(\gamma \ge 1\)),

$$\begin{aligned} \left| E_{NL}^2\right|&\lesssim \sum _{\left| j\right| = 1; j \le \alpha } \sum _{k \in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(1)}D_\eta ^\alpha \hat{g}_k(\eta )\right| \\&\quad A^{(1)}_k(t,\eta ) \left| \langle \ell \rangle ^{-1}\hat{\rho }_\ell (t)D_\eta ^{\alpha -j} \hat{g}_{k-\ell }(t,\eta - t\ell )\right| {\, \mathrm d}\eta . \end{aligned}$$

Then we apply (3.13)

$$\begin{aligned} \left| E_{NL}^2\right|&\lesssim \sum _{\left| j\right| = 1; j \le \alpha } \left\| A^{(1)} v^\alpha g\right\| _2 \left\| A^{(1)} v^{\alpha -j} g\right\| _2 \left\| \rho (t)\right\| _{\mathcal {F}^{\tilde{c}\lambda (t),0;s}} \\&\quad + \sum _{\left| j\right| = 1; j \le \alpha }\left\| A^{(1)} v^\alpha g\right\| _2 \left\| v^{\alpha -j} g\right\| _{\mathcal {G}^{\tilde{c}\lambda ,0;s}}\left\| \langle k \rangle ^{-1} A_k^{(1)} \hat{\rho }_k(t)\right\| _{L_k^2}. \end{aligned}$$

From here, we take advantage of the regularity gap and \(\langle k \rangle ^{-1}A^{(1)}_k(t,k t) \lesssim \langle t \rangle A_k(t,k t)\) to deduce

$$\begin{aligned} \left| E_{NL}^2\right|&\lesssim \sum _{\left| j\right| = 1; j \le \alpha } e^{(\tilde{c} - 1)\alpha _0\langle t \rangle ^s}\left\| A^{(1)} v^\alpha g\right\| _2 \left\| A^{(1)} v^{\alpha -j} g\right\| _2 \left\| A\rho (t)\right\| _2 \nonumber \\&\quad + \sum _{\left| j\right| = 1; j \le \alpha }\langle t \rangle \left\| A^{(1)} v^\alpha g\right\| _2 \left\| A^{(-\beta )} v^{\alpha -j} g\right\| _2 \left\| A\rho (t)\right\| _{2}, \end{aligned}$$
(5.41)

which suffices to treat this contribution.

Conclusion of High Norm Estimate

Denote \(\delta = -\frac{1}{4}\min \left( c-1,\tilde{c} - 1,c^\prime - 1\right) \alpha _0\). Collecting the contributions of (5.23), (5.26), (5.34), (5.36), (5.37), (5.39), (5.40) and (5.41), and then summing in N with (3.4) (note we used \(1 \lesssim \langle k,\eta \rangle ^{s/2}\) to group (5.37), (5.39) and (5.40) with (5.34)), we have the following for some \(\tilde{K} = \tilde{K}(s,d,M,\sigma ,\lambda _0,{\lambda ^\prime },C_0)\),

$$\begin{aligned} \frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}t} \left\| A^{(1)} v^\alpha g\right\| _2^2&\le \left( \tilde{K} \langle t \rangle ^{-1} e^{-\delta \langle t \rangle ^{s}} \left\| A\rho (t)\right\| _2 + \dot{\lambda }(t)\right) \left\| \langle \nabla _{z,v} \rangle ^{s/2} A^{(1)}(v^\alpha g)\right\| _2^2 \\&\quad + \tilde{K}\langle t \rangle \left\| A^{(1)} v^\alpha g\right\| _2 \left\| A\rho (t)\right\| _2 \\&\quad + \tilde{K}\frac{\left\| A^{(-\beta )} v^\alpha g\right\| _2}{\langle t \rangle ^2} \left\| A^{(1)} v^\alpha g\right\| ^2_2 + \tilde{K}\langle t \rangle ^6\left\| A^{(-\beta )} v^\alpha g\right\| _2\left\| A\rho \right\| ^2_2 \\&\quad + \tilde{K}\sum _{\left| j\right| = 1; j \le \alpha } e^{-\delta \langle t \rangle ^s}\left\| A^{(1)} v^\alpha g\right\| _2 \left\| A^{(1)} v^{\alpha -j} g\right\| _2 \left\| A\rho (t)\right\| _2 \\&\quad + \tilde{K}\sum _{\left| j\right| = 1; j \le \alpha }\langle t \rangle \left\| A^{(1)} v^\alpha g\right\| _2 \left\| A^{(-\beta )} v^{\alpha -j} g\right\| _2 \left\| A\rho (t)\right\| _2. \end{aligned}$$

Introducing a small parameter b to be fixed depending only on \(\tilde{K}\) and \(\lambda (t)\),

$$\begin{aligned} \frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}t} \left\| A^{(1)} v^\alpha g\right\| _2^2&\le \left( \tilde{K} e^{-\delta \langle t \rangle ^{s}} \langle t \rangle ^{-1}\left\| A\rho (t)\right\| _2 + \frac{b\tilde{K}}{\langle t \rangle ^2} +\dot{\lambda }(t)\right) \left\| \langle \nabla \rangle ^{s/2} A^{(1)}(v^\alpha g)\right\| _2^2 \nonumber \\&\quad + \frac{\tilde{K}}{b}\langle t \rangle ^{4}\left\| A\rho (t)\right\| ^2_2 \nonumber \\&\quad + \tilde{K}\frac{\left\| A^{(-\beta )} v^\alpha g\right\| _2}{\langle t \rangle ^2} \left\| A^{(1)} v^\alpha g\right\| ^2_2 + \tilde{K}\langle t \rangle ^6\left\| A^{(-\beta )} v^\alpha g\right\| _2\left\| A\rho \right\| ^2_2 \nonumber \\&\quad + \tilde{K}\sum _{\left| j\right| = 1; j \le \alpha } e^{-\delta \langle t \rangle ^s}\left\| A^{(1)} v^\alpha g\right\| _2 \left\| A^{(1)} v^{\alpha -j} g\right\| _2 \left\| A\rho (t)\right\| _2 \nonumber \\&\quad + \tilde{K}\sum _{\left| j\right| = 1; j \le \alpha }\langle t \rangle \left\| A^{(1)} v^\alpha g\right\| _2 \left\| A^{(-\beta )} v^{\alpha -j} g\right\| _2 \left\| A\rho (t)\right\| _2. \end{aligned}$$
(5.42)
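For clarity, the term \(\tilde{K}\langle t \rangle \left\| A^{(1)} v^\alpha g\right\| _2 \left\| A\rho (t)\right\| _2\) from the previous display is split in (5.42), for any \(b>0\), via Young’s inequality:

$$\begin{aligned} \tilde{K}\langle t \rangle \left\| A^{(1)} v^\alpha g\right\| _2 \left\| A\rho (t)\right\| _2 \le \frac{b\tilde{K}}{\langle t \rangle ^2}\left\| A^{(1)} v^\alpha g\right\| _2^2 + \frac{\tilde{K}}{b}\langle t \rangle ^{4}\left\| A\rho (t)\right\| ^2_2, \end{aligned}$$

and the first term is then grouped with the CK bracket using \(1 \lesssim \langle k,\eta \rangle ^{s/2}\).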

By (3.8) and (2.9) we may fix b and \(\epsilon \) small such that

$$\begin{aligned} \tilde{K} e^{-\delta \langle t \rangle ^{s}}\sqrt{K_4}\epsilon + \frac{b\tilde{K}}{\langle t \rangle ^2} \le \frac{1}{2}\left| \dot{\lambda }(t)\right| . \end{aligned}$$

Note that this requires fixing \(\epsilon \) small relative to \(K_4\), but not b. Then by (5.16) we deduce that the first term in (5.42) is negative. Therefore, summing in \(\alpha \), integrating in time, and applying the bootstrap hypotheses (2.11) and (5.16), we obtain (adjusting \(\tilde{K}\) to \(\tilde{K}^\prime \))

$$\begin{aligned} \sum _{\left| \alpha \right| \le M}\left\| A^{(1)} v^\alpha g\right\| _2^2&\le \epsilon ^2 + \tilde{K}^\prime \langle t \rangle ^5 K_3 \epsilon ^2 + \tilde{K}^\prime K_1 \sqrt{K_2} \langle t \rangle ^6 \epsilon ^3 + \tilde{K}^\prime \sqrt{K_2} \langle t \rangle ^7 K_3 \epsilon ^3 \\&\quad + \tilde{K}^\prime K_1 \sqrt{K_4}\epsilon ^3 + \tilde{K}^\prime \sqrt{K_1K_2 K_4} \langle t \rangle ^6 \epsilon ^3. \end{aligned}$$

Hence we may take \(K_1 = \tilde{K}^\prime K_3 + 1\) and we have (2.12a) by choosing

$$\begin{aligned} \epsilon < K_1\left( 4\tilde{K}^\prime \right) ^{-1}\left( K_1\sqrt{K_2} + \sqrt{K_2}K_3 + K_1\sqrt{K_4} + \sqrt{K_1K_2K_4} \right) ^{-1}. \end{aligned}$$

Proof of Low Norm Estimate (2.12b)

This proof proceeds analogously to that of (2.12a), replacing \(A^{(1)}\) with \(A^{(-\beta )}\). First compute the derivative as in (5.23),

$$\begin{aligned} \frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}t}\left\| A^{(-\beta )}D_\eta ^\alpha \hat{g}\right\| _2^2&= \sum _{k\in \mathbb {Z}^d}\int _\eta \dot{\lambda }(t)\langle k,\eta \rangle ^{s} \left| A^{(-\beta )} D_\eta ^\alpha \hat{g}_k(\eta )\right| ^2 {\, \mathrm d}\eta \nonumber \\&\quad + \sum _{k \in \mathbb {Z}^d} \int _\eta A^{(-\beta )} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} A^{(-\beta )} D_\eta ^\alpha \partial _t \hat{g}_k(\eta ) {\, \mathrm d}\eta \nonumber \\&= CK_{\mathcal {L}} + E_{\mathcal {L}}, \end{aligned}$$
(5.43)

where

$$\begin{aligned} E_{\mathcal {L}}&= -\sum _{k \in \mathbb {Z}^d} \int _\eta A^{(-\beta )} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} A^{(-\beta )}_k(t,\eta )\hat{\rho }_k(t) \widehat{W}(k)\nonumber \\&\quad \times D_\eta ^\alpha \left[ k \cdot (\eta - tk) \hat{f}^0(\eta - kt)\right] {\, \mathrm d}\eta \nonumber \\&\quad - \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d}\int _\eta A^{(-\beta )} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} A^{(-\beta )}_k(t,\eta ) \hat{\rho }_\ell (t)\widehat{W}(\ell )\nonumber \\&\quad \times D_\eta ^\alpha \left[ \ell \cdot (\eta - tk) \hat{g}_{k-\ell }(t,\eta - t\ell )\right] {\, \mathrm d}\eta \nonumber \\&= -E_{\mathcal {L};L} - E_{\mathcal {L};NL}. \end{aligned}$$
(5.44)

As in the treatment of \(E_L\) in §5.3.1, we may use the product lemma (3.13) and (1.3) to deduce

$$\begin{aligned} \left| E_{\mathcal {L};L}\right|&\lesssim \left\| A^{(-\beta )} D_\eta ^\alpha \hat{g}\right\| _2\left\| A^{(-\beta )} D_\eta ^\alpha (\eta \hat{f}^0(\eta ))\right\| _{L^2_\eta } \left\| \rho (t)\right\| _{\mathcal {F}^{\tilde{c} \lambda (t),0;s}} \\&\quad + \left\| A^{(-\beta )} D_\eta ^\alpha \hat{g}\right\| _2\left\| A^{(-\beta )} D_\eta ^\alpha (\eta \hat{f}^0(\eta ))\right\| _{L^2_\eta }\left\| A^{(-\beta )} \rho (t)\right\| _2. \end{aligned}$$

By the analogue of (5.25), \(\tilde{c} < 1\), the regularity gap between \(A^{(-\beta )}\) and A, and (3.8),

$$\begin{aligned} \left| E_{\mathcal {L};L}\right|&\lesssim e^{(\tilde{c}-1)\alpha _0\langle t \rangle ^s} \left\| A^{(-\beta )} D_\eta ^\alpha \hat{g}\right\| _2\left\| A\rho (t)\right\| _2 + \langle t \rangle ^{-\beta } \left\| A^{(-\beta )} D_\eta ^\alpha \hat{g}\right\| _2 \left\| A\rho (t)\right\| _2 \nonumber \\&\lesssim \langle t \rangle ^{-\beta } \left\| A^{(-\beta )} v^\alpha g\right\| _2 \left\| A\rho (t)\right\| _2, \end{aligned}$$
(5.45)

which suffices to treat this term.

We now turn to the treatment of \(E_{\mathcal {L};NL}\), which, as in §5.3.2, is expanded as

$$\begin{aligned} E_{\mathcal {L};NL}&= \sum _{k \in \mathbb {Z}^d} \int _\eta A^{(-\beta )}D_\eta ^\alpha \overline{\hat{g}_k(\eta )}\\&\quad \times \left( A^{(-\beta )}_k(t,\eta ) \left[ \sum _{\ell \in \mathbb {Z}_*^d} \hat{\rho }_\ell (t)\widehat{W}(\ell )\ell \cdot ( \eta - tk ) D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right] \right) {\, \mathrm d}\eta \\&\quad +\sum _{k\in \mathbb {Z}^d} \int _\eta A^{(-\beta )}D_\eta ^\alpha \overline{\hat{g}_k(\eta )}\\&\quad \times \left( A^{(-\beta )}_k(t,\eta ) \sum _{\left| j\right| = 1; j \le \alpha }\left[ \sum _{\ell \in \mathbb {Z}_*^d} \hat{\rho }_\ell (t)\widehat{W}(\ell )\ell _j D_\eta ^{\alpha -j} \hat{g}_{k-\ell }(t,\eta - t\ell )\right] \right) {\, \mathrm d}\eta \\&= E_{\mathcal {L};NL}^1 + E_{\mathcal {L};NL}^2. \end{aligned}$$

First consider \(E_{\mathcal {L};NL}^1\), to which we apply (5.27) (with \(A^{(-\beta )}\) instead of \(A^{(1)}\)) and then decompose via paraproduct as in (5.28):

$$\begin{aligned} E_{\mathcal {L};NL}^1&= \sum _{N \ge 8} T^1_{\mathcal {L};N} + \sum _{N \ge 8} R^1_{\mathcal {L};N} + {\mathcal {R}}_{\mathcal {L}}^1, \end{aligned}$$
(5.46)

where the transport term is given by

$$\begin{aligned} T^1_{\mathcal {L};N}&= \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta A^{(-\beta )} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} \hat{\rho }_\ell (t)_{<N/8}\widehat{W}(\ell )\ell \cdot (\eta - kt) \nonumber \\&\quad \times \left[ A^{(-\beta )}_{k}(t,\eta ) - A^{(-\beta )}_{k-\ell }(t,\eta -t\ell )\right] \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{N} {\, \mathrm d}\eta , \end{aligned}$$
(5.47)

and the reaction term by

$$\begin{aligned} R^1_{\mathcal {L};N}&= \sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta A^{(-\beta )} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} \hat{\rho }_\ell (t)_{N}\widehat{W}(\ell )\ell \cdot \left( \eta - kt\right) \nonumber \\&\quad \times \left[ A^{(-\beta )}_{k}(t,\eta ) - A^{(-\beta )}_{k-\ell }(t,\eta -t\ell )\right] \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell ) \right) _{<N/8}{\, \mathrm d}\eta , \end{aligned}$$
(5.48)

and as before the remainder is whatever is left over. The treatment of the transport term \(T_{\mathcal {L};N}^1\) and the remainder \({\mathcal {R}}_{\mathcal {L}}^1\) is unchanged from the corresponding treatments of \(T_{N}^1\) and \({\mathcal {R}}^1\) in §5.3.3 and §5.3.5, respectively. Hence, we omit the details and simply conclude as in (5.34), (5.39) and (5.40):

$$\begin{aligned} \left| T_{\mathcal {L};N}^1\right|&\lesssim \langle t \rangle ^{-1}e^{\frac{1}{2}(c-1)\alpha _0 \langle t \rangle ^{s}} \left\| A\rho (t)\right\| _2 \left\| \langle \nabla _{z,v} \rangle ^{s/2}A^{(-\beta )} (v^\alpha g)_{\sim N}\right\| _2^2, \end{aligned}$$
(5.49)
$$\begin{aligned} \left| {\mathcal {R}}_{\mathcal {L}}^1\right|&\lesssim \left\| A\rho (t)\right\| _2\langle t \rangle ^{-1}e^{\frac{1}{4}(c^\prime - 1)\alpha _0\langle t \rangle ^s} \left\| A^{(-\beta )} (v^\alpha g)\right\| ^2_2. \end{aligned}$$
(5.50)

The treatment of the reaction term is slightly altered so as to gain from the regularity gap and obtain a uniform bound (as in the linear contribution \(E_{\mathcal {L};L}\)). As in the treatment of the reaction term in \(E_{NL}^1\) in §5.3.4, we separate \(R_{\mathcal {L};N}^{1} = R_{\mathcal {L};N}^{1;1} + R_{\mathcal {L};N}^{1;2}\), where the leading order reaction term is given by

$$\begin{aligned} R_{\mathcal {L};N}^{1;1}&= -\sum _{k\in \mathbb {Z}^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta A^{(-\beta )} D_\eta ^\alpha \overline{\hat{g}_k(\eta )} A^{(-\beta )}_k(t,\eta )\hat{\rho }_\ell (t)_{N}\widehat{W}(\ell ) \ell \cdot \left[ \eta - tk\right] \\&\quad \times \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right) _{<N/8} {\, \mathrm d}\eta . \end{aligned}$$

By the frequency localizations (3.15), (3.11) implies, for some \(c = c(s) \in (0,1)\) (using also (1.3) and \(\left| \eta - kt\right| \le \langle t \rangle \langle k-\ell ,\eta -t\ell \rangle \)),

$$\begin{aligned} \left| R_{\mathcal {L};N}^{1;1}\right|&\lesssim \langle t \rangle \sum _{k \in \mathbb {Z}_*^d} \sum _{\ell \in \mathbb {Z}_*^d} \int _\eta \left| A^{(-\beta )} \left( D_\eta ^\alpha \hat{g}_k(\eta )\right) _{\sim N}\right| A^{(-\beta )}_\ell (t,t\ell ) \left| \hat{\rho }_\ell (t)_{N}\right| \\&\quad \times e^{c\lambda (t)\langle k-\ell ,\eta -\ell t \rangle ^s} \langle k-\ell ,\eta - \ell t \rangle \left| \left( D_\eta ^\alpha \hat{g}_{k-\ell }(t,\eta - t\ell )\right) _{<N/8}\right| {\, \mathrm d}\eta . \end{aligned}$$

Proceeding as in the proof of (5.36), applying (3.2) (along with \(\sigma >d/2 + 2\)) and using the regularity gap between \(A^{(-\beta )}\) and A implies

$$\begin{aligned} \left| R_{\mathcal {L};N}^{1;1}\right|&\lesssim \langle t \rangle \left\| A^{(-\beta )} \left( v^\alpha g\right) _{\sim N}\right\| _2 \left\| A^{(-\beta )}\rho _N\right\| _2 \left\| A^{(-\beta )} v^\alpha g\right\| _2 \nonumber \\&\lesssim \langle t \rangle ^{1-\beta } \left\| A^{(-\beta )} \left( v^\alpha g\right) _{\sim N}\right\| _2 \left\| A\rho _N\right\| _2\left\| A^{(-\beta )} v^\alpha g\right\| _2 \nonumber \\&\lesssim \langle t \rangle ^{1-\beta } \left\| A^{(-\beta )} v^\alpha g\right\| _2 \left\| A^{(-\beta )}\left( v^\alpha g\right) _{\sim N}\right\| ^2_2 + \langle t \rangle ^{1-\beta }\left\| A^{(-\beta )} v^\alpha g\right\| _2\left\| A\rho _N\right\| ^2_2, \end{aligned}$$
(5.51)

which will be sufficient for the proof of (2.12b). The term \(R_{\mathcal {L};N}^{1;2}\) can be treated exactly as \(R_{N}^{1;2}\), and hence we omit the details and simply conclude

$$\begin{aligned} \left| R_{\mathcal {L};N}^{1;2}\right|&\lesssim \frac{e^{-\alpha _0\langle t \rangle ^s}}{N} \left\| A^{(-\beta )}(v^\alpha g)\right\| ^2_2 \left\| A\rho (t)\right\| _2. \end{aligned}$$
(5.52)

The term \(E_{\mathcal {L};NL}^2\) is treated as in §5.3.6. By (1.3), (3.13) and the regularity gap between \(A^{(-\beta )}\) and A (also (3.8) in the last line),

$$\begin{aligned} \left| E_{\mathcal {L};NL}^2\right|&\lesssim \sum _{\left| j\right| = 1; j \le \alpha } \left\| A^{(-\beta )} v^\alpha g\right\| _2 \left\| A^{(-\beta )} v^{\alpha -j} g\right\| _2 \left\| \rho (t)\right\| _{\mathcal {F}^{\tilde{c} \lambda (t),0;s}} \nonumber \\&\quad + \sum _{\left| j\right| = 1; j \le \alpha }\left\| A^{(-\beta )} v^\alpha g\right\| _2 \left\| v^{\alpha -j} g\right\| _{\mathcal {G}^{\tilde{c}\lambda ,\sigma ;s}} \left\| A^{(-\beta )} \rho (t)\right\| _2 \nonumber \\&\lesssim e^{(\tilde{c} - 1)\alpha _0\langle t \rangle ^s}\sum _{\left| j\right| = 1; j \le \alpha } \left\| A^{(-\beta )} v^\alpha g\right\| _2 \left\| A^{(-\beta )} v^{\alpha -j} g\right\| _2 \left\| A\rho (t)\right\| _2 \nonumber \\&\quad + \langle t \rangle ^{-\beta } \sum _{\left| j\right| = 1; j \le \alpha }\left\| A^{(-\beta )} v^\alpha g\right\| _2 \left\| A^{(-\beta )} v^{\alpha -j} g\right\| _2\left\| A\rho (t)\right\| _2 \nonumber \\&\lesssim \langle t \rangle ^{-\beta } \sum _{\left| j\right| = 1; j \le \alpha }\left\| A^{(-\beta )} v^\alpha g\right\| _2 \left\| A^{(-\beta )} v^{\alpha -j} g\right\| _2\left\| A\rho (t)\right\| _2. \end{aligned}$$
(5.53)

Denote \(\delta = -\frac{1}{4}\min \left( c-1,\tilde{c} - 1,c^\prime - 1\right) \alpha _0\). Collecting (5.45), (5.49), (5.50), (5.51), (5.52) and (5.53), summing in N, splitting the linear terms with a small parameter b, and combining (5.50) and (5.52) with (5.49) as in (5.42) (using also (3.8)), we have the following for some \(\tilde{K} = \tilde{K}(s,\sigma ,\alpha _0,C_0,d)\) (not the same as the \(\tilde{K}\) in (5.42), but this is irrelevant),

$$\begin{aligned} \frac{1}{2}\frac{\mathrm{d}}{\mathrm{d}t} \left\| A^{(-\beta )} v^\alpha g\right\| _2^2&\le \left( \tilde{K} \langle t \rangle ^{-1}e^{-\delta \langle t \rangle ^{s}} \left\| A\rho (t)\right\| _2 + \tilde{K}\langle t \rangle ^{-\beta }b + \dot{\lambda }(t)\right) \nonumber \\&\quad \times \left\| \langle \nabla _{z,v} \rangle ^{s/2} A^{(-\beta )} (v^\alpha g)\right\| _2^2 + \frac{\tilde{K}}{b}\langle t \rangle ^{-\beta }\left\| A\rho (t)\right\| _2^2 \nonumber \\&\quad + \tilde{K} \langle t \rangle ^{1-\beta } \left\| A^{(-\beta )} (v^\alpha g)\right\| _2\left( \left\| A^{(-\beta )}(v^\alpha g)\right\| _2^2 + \left\| A\rho (t)\right\| _2^2 \right) \nonumber \\&\quad + \tilde{K} \langle t \rangle ^{-\beta } \sum _{\left| j\right| = 1; j \le \alpha }\left\| A^{(-\beta )} v^\alpha g\right\| _2 \left\| A^{(-\beta )} v^{\alpha -j} g\right\| _2 \left\| A\rho (t)\right\| _2. \end{aligned}$$
(5.54)

By (3.8) and (2.9) we may fix b and \(\epsilon \) small such that

$$\begin{aligned} \tilde{K} \sqrt{K_4} \epsilon e^{-\delta \langle t \rangle ^{s}} + \tilde{K}\langle t \rangle ^{-\beta }b \le \frac{1}{2}\left| \dot{\lambda }(t)\right| , \end{aligned}$$

which, by (5.16), implies that the first term in (5.54) is non-positive. Therefore, summing in \(\alpha \), integrating in time (using \(\beta > 2\) so that \(\langle t \rangle ^{1-\beta }\) is integrable) and applying the bootstrap hypotheses (2.11) and (5.16) implies (adjusting \(\tilde{K}\) to \(\tilde{K}^\prime \)),

$$\begin{aligned} \sum _{\left| \alpha \right| \le M} \left\| A^{(-\beta )} v^\alpha g\right\| _2^2 \le \epsilon ^2 + \tilde{K}^\prime K_3\epsilon ^2 + \tilde{K}^\prime K_2^{3/2}\epsilon ^3 + \tilde{K}^\prime \sqrt{K_2}K_3 \epsilon ^3 + \tilde{K}^\prime K_2\sqrt{K_4} \epsilon ^3. \end{aligned}$$

Hence, we take \(K_2 = 1 + \tilde{K}^\prime K_3\) and \(\epsilon < K_2 \left( 3\tilde{K}^\prime \right) ^{-1}\left( K_2^{3/2} + \sqrt{K_2}K_3 + K_2\sqrt{K_4}\right) ^{-1}\) to deduce (2.12b).

Analysis of the Plasma Echoes

The most important step in pushing linear Landau damping to the nonlinear level is analyzing and controlling the dominant weakly nonlinear effect: the plasma echo. Mathematically, this comes down to verifying condition (5.2) on the time-response kernels, which is crucial to the proof of (2.12c) in §5.1. Our choices of \(\lambda (t)\) for \(t \gg 1\) (in particular the choice of a) and of \(s > 1/(2+\gamma )\) are both determined in this section. The analysis here is similar to the moment estimates carried out on the time-response kernels in §7 of [67], except that the regularity loss encoded by our choice of \(\lambda (t)\) takes the place of amplitude growth. The distinction is arguably minor, but this increased precision allows for a slightly cleaner treatment and highlights more clearly the origin of the regularity requirement.

Lemma 6.1

(Time response estimate I) Under the bootstrap hypotheses (2.11), there holds

$$\begin{aligned} \sup _{t \in [0,T^\star ]} \sup _{k\in \mathbb {Z}_*^d}\int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}\tau \lesssim _{a,s,d,\lambda _0,{\lambda ^\prime }} \sqrt{K_2} \epsilon . \end{aligned}$$

Proof

Consider first the effect of \(g_0\), the homogeneous part of g, which corresponds to \(\bar{K}_{k,k}(t,\tau )\):

$$\begin{aligned} {\mathcal {I}}_{inst}(t)&:= \int _0^t e^{(\lambda (t) - \lambda (\tau ))\langle k,kt \rangle ^s} e^{c\lambda (\tau )\langle k(t-\tau ) \rangle ^s} \frac{\left| k(t-\tau )\right| }{\left| k\right| ^{\gamma }}\left| \widehat{g}_{0}(\tau ,k(t-\tau ))\right| {\, \mathrm d}\tau \\&\le \int _0^te^{c\lambda (\tau )\langle k(t-\tau ) \rangle ^s} \frac{\left| k(t-\tau )\right| }{\left| k\right| ^{\gamma }}\left| \widehat{g}_{0}(\tau ,k(t-\tau ))\right| {\, \mathrm d}\tau . \end{aligned}$$

Here \(\textit{inst}\) stands for ‘instantaneous’ as this effect has no time delay (unlike \(k \ne \ell \) below); this terminology was borrowed from [67]. Also note that this is only controlling the effect of ‘low’ frequencies in \(g_0\). From the \(H^{d/2+} \hookrightarrow C^0\) embedding, \(\sigma > \beta + 1\) and (2.13),

$$\begin{aligned} {\mathcal {I}}_{inst}(t)&\le \int _0^t e^{(c-1)\lambda (\tau )\langle k(t-\tau ) \rangle ^s} \left( \sup _{\eta \in {\mathbb {R}}^d} e^{\lambda (\tau )\langle \eta \rangle ^s} \left| \eta \right| \left| \widehat{g}_{0}(\tau ,\eta )\right| \right) {\, \mathrm d}\tau \\&\lesssim _M \int _0^t e^{(c-1)\lambda (\tau )\langle k(t-\tau ) \rangle ^s} \left\| A^{(-\beta )}g_0(\tau )\right\| _{H^{M}_\eta } {\, \mathrm d}\tau \\&\lesssim _{\alpha _0} \sqrt{K_2}\epsilon . \end{aligned}$$

Next turn to the contributions from the case \(k \ne \ell \), which is the origin of the plasma echoes. Using \(\left| k(t-\tau )\right| \le \langle \tau \rangle \left| k-\ell ,kt - \ell \tau \right| \) and the definition of \(\bar{K}\) in (5.1),

$$\begin{aligned} \mathbf {1}_{k \ne \ell } \bar{K}_{k,\ell }(t,\tau )&\lesssim e^{(\lambda (t) - \lambda (\tau ))\langle k,kt \rangle ^s}e^{c\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s}\frac{\langle \tau \rangle }{\left| \ell \right| ^{\gamma }}\left| \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )\right| \mathbf {1}_{k \ne \ell \ne 0}. \end{aligned}$$
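For the reader's convenience, we note that the elementary inequality used just above is simply the triangle inequality followed by Cauchy–Schwarz:

$$\begin{aligned} \left| k(t-\tau )\right| = \left| (kt - \ell \tau ) + (\ell - k)\tau \right| \le \left| kt - \ell \tau \right| + \tau \left| k-\ell \right| \le \langle \tau \rangle \left| k-\ell ,kt - \ell \tau \right| . \end{aligned}$$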

In what follows denote

$$\begin{aligned} -\nu (t,\tau ) = \lambda (t) - \lambda (\tau ). \end{aligned}$$

Then using that \(\lambda (t) \ge \alpha _0\) and \(c < 1\), if we write \(\delta = (1 - c)\alpha _0\) we are left to estimate,

$$\begin{aligned} {\mathcal {I}}(t)&:= \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} e^{-\nu (t,\tau )\langle k,kt \rangle ^s}e^{c\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s}\frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma }\left| \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )\right| \mathbf {1}_{k \ne \ell } {\, \mathrm d}\tau \nonumber \\&\lesssim \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} e^{-\nu (t,\tau )\langle k,kt \rangle ^s} \frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s}\nonumber \\&\quad \times \left| e^{\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s} \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )\right| \mathbf {1}_{k\ne \ell } {\, \mathrm d}\tau . \end{aligned}$$
(6.1)

By \(\sigma \ge \beta +1\), the \(H^{d/2+} \hookrightarrow C^0\) embedding and (2.13),

$$\begin{aligned} \left| e^{\lambda (\tau )\langle k-\ell ,kt-\ell \tau \rangle ^s} \widehat{\nabla g}_{k-\ell }(\tau ,kt-\ell \tau )\right|&\le \sup _{\eta \in {\mathbb {R}}^d} e^{\lambda (\tau )\langle k-\ell ,\eta \rangle ^s} \langle k-\ell ,\eta \rangle \left| \widehat{g}_{k-\ell }(\tau ,\eta )\right| \\&\le \left( \sum _{k \in \mathbb {Z}^d} \sup _{\eta \in {\mathbb {R}}^d} e^{2\lambda (\tau )\langle k,\eta \rangle ^s} \langle k,\eta \rangle ^2 \left| \widehat{g}_{k}(\tau ,\eta )\right| ^2\right) ^{1/2} \\&\lesssim \left\| A^{(-\beta )}g(\tau )\right\| _{L_k^2 H_\eta ^M} \\&\lesssim _M \sqrt{K_2}\epsilon . \end{aligned}$$

Applying this to (6.1) implies

$$\begin{aligned} {\mathcal {I}}(t) \lesssim \sqrt{K_2}\epsilon \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} e^{-\nu (t,\tau )\langle k,kt \rangle ^s} \frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} \mathbf {1}_{\ell \ne k} {\, \mathrm d}\tau . \end{aligned}$$
(6.2)

Following an argument similar to that in [67] we may reduce to the \(d = 1\) case. By (3.9c),

$$\begin{aligned} {\mathcal {I}}(t)&\lesssim \sqrt{K_2}\epsilon \int _0^t\sum _{\ell \in \mathbb {Z}_*^d} \sum _{j: \ell _j \ne k_j} e^{-\nu (t,\tau )\langle k_j,k_jt \rangle ^s} \frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } e^{-C_s\delta \langle k_j-\ell _j,k_jt-\ell _j\tau \rangle ^s} \\&\qquad \prod _{i \ne j}^d e^{-C^{d-1}_s\delta \langle k_i - \ell _i \rangle ^s} \mathbf {1}_{\ell \ne k} {\, \mathrm d}\tau \\&\lesssim \frac{\sqrt{K_2}\epsilon }{\delta ^{\frac{d-1}{s}}} \sum _{1 \le j \le d}\int _0^t\sum _{\ell _j \in \mathbb {Z}} e^{-\nu (t,\tau )\langle k_j,k_jt \rangle ^s} \frac{\langle \tau \rangle }{\langle \ell _j \rangle ^\gamma } e^{-C_s\delta \langle k_j-\ell _j,k_jt-\ell _j\tau \rangle ^s} \mathbf {1}_{\ell _j \ne k_j} {\, \mathrm d}\tau . \end{aligned}$$

Notice that we may not assert that both \(k_j\) and \(\ell _j\) are non-zero. However, if either \(k_j\) or \(\ell _j\) is zero we have by (3.9c), (3.8) and \(\tau \le t\),

$$\begin{aligned} \langle \tau \rangle e^{-C_s\delta \langle k_j-\ell _j,k_jt-\ell _j\tau \rangle ^s}&\le \langle \tau \rangle e^{-C_s^2\delta \langle k_j-\ell _j \rangle ^s - C_s^2 \delta \langle k_jt-\ell _j\tau \rangle ^s}\\&\lesssim \delta ^{-1/s}e^{-C_s^2\delta \langle k_j-\ell _j \rangle ^s - \frac{1}{2}C_s^2 \delta \langle \tau \rangle ^s}. \end{aligned}$$

Hence, such cases contribute to \({\mathcal {I}}(t)\) only an amount which is bounded uniformly in t and k. Therefore, up to adjusting the definition of \(\delta \) by a constant, we may focus on the cases in which both \(k,\ell \in \mathbb {Z}_*\) and \(k \ne \ell \). Let us now consider one such choice:

$$\begin{aligned} {\mathcal {I}}_{k,\ell }(t) := \int _0^t e^{-\nu (t,\tau )\langle k,kt \rangle ^s} \frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} {\, \mathrm d}\tau . \end{aligned}$$

This term isolates a single possible echo at \(\tau = t k/\ell \): notice that the integrand is sharply localized near this time, which accounts for the effect that the density at spatial frequency \(\ell \) and time \(\tau \) has on the density at frequency k and time t. Summing over \(\ell \) deals with the cumulative effect of all the echoes. See [67] for more discussion. By symmetry, we need only consider the case \(k \ge 1\).
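Concretely (an elementary observation recorded only for orientation), writing \(kt - \ell \tau = -\ell \left( \tau - kt/\ell \right) \) shows

$$\begin{aligned} e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} \le e^{-\delta \left| \ell \right| ^s\left| \tau - kt/\ell \right| ^s}, \end{aligned}$$

so the integrand is concentrated in a window of width \(O(\delta ^{-1/s}\left| \ell \right| ^{-1})\) around the echo time \(\tau = kt/\ell \), which lies in (0, t) precisely when \(\ell > k \ge 1\).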

Let us first eliminate the irrelevant early times; indeed by (3.9c),

$$\begin{aligned} \int _0^{\min (1,t)} e^{-\nu (t,\tau )\langle k,kt \rangle ^s} \frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} {\, \mathrm d}\tau&\lesssim \int _0^{\min (1,t)} \frac{1}{\left| \ell \right| ^\gamma } e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} {\, \mathrm d}\tau \nonumber \\&\lesssim \frac{e^{-C_s\delta \langle k-\ell \rangle ^s}}{\delta ^{1/s}\left| \ell \right| ^{1+\gamma }}. \end{aligned}$$
(6.3)
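The factor \(\delta ^{-1/s}\left| \ell \right| ^{-1}\) in (6.3), and the analogous factors \(\delta ^{-1/s}\) and \(\delta ^{-2/s}\) appearing below, come from the following elementary bounds, which we record for the reader's convenience (with constants depending only on s; presumably these are among the inequalities encoded in (3.8)): for \(b > 0\) and \(s \in (0,1)\),

$$\begin{aligned} \sup _{x \ge 0} x e^{-bx^s} = (bs)^{-1/s}e^{-1/s} \lesssim _s b^{-1/s}, \qquad \int _0^\infty e^{-b\langle u \rangle ^s} {\, \mathrm d}u \le \int _0^\infty e^{-bu^s} {\, \mathrm d}u = \Gamma (1+1/s)\, b^{-1/s}. \end{aligned}$$

In particular, the change of variables \(u = kt - \ell \tau \) in the second bound produces the extra factor \(\left| \ell \right| ^{-1}\) in (6.3) and (6.4).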

Now let us turn to the more interesting contributions of \(t \ge \tau \ge 1\). Given t,k and \(\ell \), define the resonant interval as

$$\begin{aligned} I_R = \left\{ \tau \in [1,t] : \left| kt - \ell \tau \right| < \frac{t}{2} \right\} \end{aligned}$$

and divide \({\mathcal {I}}_{k,\ell }\) into three contributions (one from (6.3)):

$$\begin{aligned} {\mathcal {I}}_{k,\ell }(t)&\lesssim \frac{1}{\delta ^{1/s}\left| \ell \right| ^{1+\gamma }} e^{-C_s\delta \langle k-\ell \rangle ^s} + \left( \int _{[1,t] \cap I_R} + \int _{[1,t] \setminus I_R}\right) \nonumber \\&\quad \quad \times \frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma }e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} e^{-\nu (t,\tau )\langle k,kt \rangle ^s} {\, \mathrm d}\tau \\&= \frac{1}{\delta ^{1/s}\left| \ell \right| ^{1+\gamma }} e^{-C_s\delta \langle k-\ell \rangle ^s} + \mathcal {I}_{R} + \mathcal {I}_{NR}. \end{aligned}$$

Here ‘NR’ stands for ‘non-resonant’. Note that if \(\ell \le k-1\) then in fact \(I_R = \emptyset \).
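Indeed, this can be checked directly: for \(k \ge 1\), \(\ell \in \mathbb {Z}_*\) with \(\ell \le k-1\), and \(\tau \in [1,t]\),

$$\begin{aligned} \left| kt - \ell \tau \right| \ge (k-\ell ) t \ge t \quad \text {if } 1 \le \ell \le k-1, \qquad \left| kt - \ell \tau \right| \ge kt \ge t \quad \text {if } \ell \le -1, \end{aligned}$$

so the resonance condition \(\left| kt - \ell \tau \right| < t/2\) cannot hold.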

Consider first the easier case of \({\mathcal {I}}_{NR}\). Since \(\left| kt - \ell \tau \right| \ge t/2\) on the support of the integrand, by (3.9c) and (3.8) we have

$$\begin{aligned} {\mathcal {I}}_{NR}&\le \frac{\langle t \rangle }{\left| \ell \right| ^\gamma }\int _{[1,t] \setminus I_R} e^{-C_s\delta \langle k-\ell \rangle ^s-C_s\delta \langle kt - \ell \tau \rangle ^s} e^{- \nu (t,\tau ) \langle k,kt \rangle ^s} {\, \mathrm d}\tau \nonumber \\&\le \frac{\langle t \rangle }{\left| \ell \right| ^\gamma } e^{-C_s\delta \langle k-\ell \rangle ^s - \frac{1}{2}C_s\delta \langle \frac{t}{2} \rangle ^s}\int _0 ^te^{- \frac{1}{2}C_s\delta \langle kt - \ell \tau \rangle ^s} {\, \mathrm d}\tau \nonumber \\&\lesssim \frac{\langle t \rangle }{\delta ^{1/s}\left| \ell \right| ^{1+\gamma }} e^{-C_s\delta \langle k-\ell \rangle ^s - \frac{1}{2}C_s\delta \langle \frac{t}{2} \rangle ^s} \nonumber \\&\lesssim \frac{1}{\delta ^{2/s}\left| \ell \right| ^{1+\gamma }} e^{-C_s\delta \langle k-\ell \rangle ^s}, \end{aligned}$$
(6.4)

which suffices to treat all of the non-resonant contributions.

Now focus on the resonant contribution \({\mathcal {I}}_{R}\), which, as pointed out above, is only present if \(\ell \ge k+1\), due to the echo at \(\tau = tk/\ell \in (0,t)\). Since we are interested in \(t \ge \tau \ge 1\), by the definition of \(\lambda (t)\) in (2.8), there exists some constant \(\delta ^\prime \) (possibly adjusted by the reduction to one dimension), proportional to \(\lambda _0 - {\lambda ^\prime }\), such that on \([1,t]\cap I_R\),

$$\begin{aligned} \nu (t,\tau ) = \delta ^\prime \left( \tau ^{-a} - t^{-a} \right) = \delta ^\prime \left( \frac{t^a - \tau ^a}{\tau ^at^a} \right) . \end{aligned}$$

For t and \(\tau \) well separated, this provides a gap of regularity that helps us to control \(\mathcal {I}_R\). Hence, the most dangerous echoes occur for \(\ell \approx k\), as these echoes stack up near t and the regularity gap provided by \(\nu \) becomes very small. From the formal analysis of [67] we expect the requirement \(s > 1/(2+\gamma )\) to arise precisely from this effect. Indeed, we will see that this is the case; in fact, this is the only place in the proof of Theorem 1 where the requirement is used (together with the analogous step in the proof of Lemma 6.2 below). By the mean-value theorem and the restriction that \(\tau \in I_R\) (using also \(\tau \le \frac{3kt}{2\ell }\) and \(\ell - k \ge 1\)), we have

$$\begin{aligned} \nu (t,\tau ) \ge a\delta ^\prime \frac{t - \tau }{\tau ^a t}&= \frac{a\delta ^\prime }{\tau ^a t}\left[ t - \frac{kt}{\ell }\right] -\frac{a\delta ^\prime }{\tau ^a t}\left[ \tau - \frac{kt}{\ell }\right] \nonumber \\&\ge \frac{a\delta ^\prime }{\tau ^a}\left[ 1 - \frac{k}{\ell }\right] - \frac{a\delta ^\prime }{2 \tau ^a \ell } \nonumber \\&\ge \frac{a\delta ^\prime }{2\tau ^a \ell } \nonumber \\&\ge \frac{a\delta ^\prime }{2^{1-a} 3^a (kt)^a \ell ^{1-a}}. \end{aligned}$$
(6.5)

Let \(\tilde{\delta }^{\prime } =\frac{a\delta ^\prime }{2^{1-a}3^a}\). The lower bound (6.5) precisely measures the usefulness of \(\nu \). Indeed, by (6.5), (3.9c), (3.8) and \((2+\gamma )(s-a) = 1-a\) we have

$$\begin{aligned} {\mathcal {I}}_R&\lesssim \int _{I_R} \frac{kt}{\ell ^{1+\gamma }}e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} e^{-\frac{\tilde{\delta }^\prime }{\ell ^{1-a}}\left| kt\right| ^{s-a}} {\, \mathrm d}\tau \nonumber \\&\lesssim \frac{kt}{\delta ^{1/s} \ell ^{2+\gamma }}e^{-\frac{\tilde{\delta }^\prime }{\ell ^{1-a}}\left| kt\right| ^{s-a}} e^{-C_s \delta \langle k-\ell \rangle ^s} \nonumber \\&\lesssim \frac{kt}{\delta ^{1/s} \ell ^{2+\gamma }}\left( \frac{\ell ^{\frac{1-a}{s-a}}}{(\tilde{\delta }^{\prime })^{\frac{1}{s-a}}kt} \right) e^{-C_s \delta \langle k-\ell \rangle ^s} \nonumber \\&\lesssim _{s,a} e^{-C_s \delta \langle k-\ell \rangle ^s} \frac{1}{\delta ^{1/s}(a\delta ^\prime )^{\frac{1}{s-a}}}. \end{aligned}$$
(6.6)

The use of \((2+\gamma )(s-a) \ge 1-a\) above is exactly the mathematical origin of the requirement \(s > (2+\gamma )^{-1}\). Notice also that (6.6) can be summed in either k or \(\ell \), but not in both.
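To make the resulting constraint on s explicit (an elementary computation; the parameter a enters through the definition of \(\lambda (t)\) in (2.8)): the relation \((2+\gamma )(s-a) = 1-a\) used above is equivalent to

$$\begin{aligned} a = \frac{(2+\gamma )s - 1}{1+\gamma }, \end{aligned}$$

and this value satisfies \(0< a < s\) precisely when \((2+\gamma )^{-1}< s < 1\).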

Assembling (6.3), (6.4) and (6.6) implies the lemma after summing in \(\ell \) and taking the supremum in t and k. \(\square \)

The next estimate is in some sense the ‘dual’ of Lemma 6.1 and is proved in the same way.

Lemma 6.2

(Time response estimate II) Under the bootstrap hypotheses (2.11) there holds

$$\begin{aligned} \sup _{\tau \in [0,T^\star ]} \sup _{\ell \in \mathbb {Z}_*^d} \sum _{k \in \mathbb {Z}_*^d} \int _{\tau }^{T^\star }\bar{K}_{k,\ell }(t,\tau ) {\, \mathrm d}t \lesssim _{a,s,d,\lambda _0,{\lambda ^\prime }} \sqrt{K_2} \epsilon . \end{aligned}$$

Proof

First consider \(\bar{K}_{k,k}(t,\tau )\), which corresponds to the homogeneous part of g:

$$\begin{aligned} {\mathcal {I}}_{inst}(\tau )&:= \int _\tau ^{T^\star } e^{(\lambda (t) - \lambda (\tau ))\langle k,kt \rangle ^s} e^{c\lambda (\tau )\langle k(t-\tau ) \rangle ^s} \frac{\left| k(t-\tau )\right| }{\left| k\right| ^{\gamma }}\left| \widehat{g}_{0}(\tau ,k(t-\tau ))\right| {\, \mathrm d}t. \end{aligned}$$

By the same argument as used in Lemma 6.1, it is straightforward to show

$$\begin{aligned} {\mathcal {I}}_{inst}(\tau )&\lesssim \sqrt{K_2}\epsilon . \end{aligned}$$

Next consider the case \(k \ne \ell \). By following the analysis of Lemma 6.1 the problem reduces to analyzing the analogue of (6.2):

$$\begin{aligned} {\mathcal {I}}(\tau ) = \sqrt{K_2}\epsilon \int _\tau ^{T^\star } \sum _{k \in \mathbb {Z}_*^d} e^{-\nu (t,\tau )\langle k,kt \rangle ^s}e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s}\frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } \mathbf {1}_{\ell \ne k} {\, \mathrm d}t \end{aligned}$$

where \(\nu (t,\tau ) = \lambda (\tau ) - \lambda (t)\) and \(\delta \) are defined as in Lemma 6.1. As before we may reduce to the one dimensional case at the cost of adjusting the constant and the definition of \(\delta \). Hence consider the one dimensional integrals with \(k,\ell \in \mathbb {Z}_*\), \(k \ne \ell \) and \(k \ge 1\) (by symmetry):

$$\begin{aligned} {\mathcal {I}}_{k,\ell }(\tau ) = \int _\tau ^{T^\star } e^{-\nu (t,\tau )\langle k,kt \rangle ^s}e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s}\frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } {\, \mathrm d}t. \end{aligned}$$
(6.7)

As in the proof of Lemma 6.1, we may eliminate early times; we omit the details and conclude

$$\begin{aligned} \int _\tau ^{\max (\tau ,\min (1,T^\star ))} e^{-\nu (t,\tau )\langle k,kt \rangle ^s}e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s}\frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } {\, \mathrm d}t \lesssim \frac{1}{\delta ^{1/s}\left| \ell \right| ^{\gamma }\left| k\right| } e^{-C_s\delta \langle k-\ell \rangle ^s}. \end{aligned}$$

For the remainder of the proof we assume \(T^\star > \tau \ge 1\). Following the proof of Lemma 6.1, define the resonant interval as

$$\begin{aligned} I_R = \left\{ t \in [\tau ,T^\star ] : \left| kt - \ell \tau \right| < \frac{\tau }{2} \right\} \end{aligned}$$

and divide the integral into two main contributions:

$$\begin{aligned} {\mathcal {I}}_{k,\ell }(\tau )&= \left( \int _{[\tau ,T^\star ) \cap I_R} + \int _{ [\tau ,T^\star ) \setminus I_R}\right) \frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma }e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} e^{-\nu (t,\tau )\langle k,kt \rangle ^s} {\, \mathrm d}t \\&= \mathcal {I}_{R} + \mathcal {I}_{NR}. \end{aligned}$$

The non-resonant contribution follows essentially the same proof as (6.4) in Lemma 6.1; we omit the details and conclude

$$\begin{aligned} \mathcal {I}_{NR} \lesssim \frac{1}{\delta ^{2/s}\left| \ell \right| ^{\gamma } k} e^{-C_s\delta \langle k-\ell \rangle ^s}. \end{aligned}$$
(6.8)
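For completeness, here is one way to carry out the omitted computation, following (6.4). On \([\tau ,T^\star ] \setminus I_R\) we have \(\left| kt - \ell \tau \right| \ge \tau /2\), so by (3.9c) and \(\nu (t,\tau ) \ge 0\),

$$\begin{aligned} \mathcal {I}_{NR}&\le \frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma } e^{-C_s\delta \langle k-\ell \rangle ^s - \frac{1}{2}C_s\delta \langle \frac{\tau }{2} \rangle ^s} \int _\tau ^{T^\star } e^{-\frac{1}{2}C_s\delta \langle kt - \ell \tau \rangle ^s} {\, \mathrm d}t \lesssim \frac{1}{\delta ^{2/s}\left| \ell \right| ^{\gamma } k} e^{-C_s\delta \langle k-\ell \rangle ^s}, \end{aligned}$$

where the last step uses \(\langle \tau \rangle e^{-\frac{1}{2}C_s\delta \langle \frac{\tau }{2} \rangle ^s} \lesssim \delta ^{-1/s}\) and the change of variables \(u = kt - \ell \tau \) in the t-integral, which produces the factor \((\delta ^{1/s} k)^{-1}\).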

Turn now to the resonant integral, for which necessarily \(\ell \ge k+1\) and there is an echo at \(t = \ell \tau /k \in (\tau ,\infty )\). Since we are interested in \(t \ge \tau \ge 1\), by the definition of \(\lambda (t)\) in (2.8), there exists some constant \(\delta ^\prime \) (possibly adjusted by the reduction to one dimension), proportional to \(\lambda _0 - {\lambda ^\prime }\), such that by the mean-value theorem and the restriction that \(t \in I_R\) (using also \(\frac{kt}{2\ell } \le \tau \)),

$$\begin{aligned} \nu (t,\tau ) \ge a \delta ^\prime \frac{t - \tau }{\tau ^a t}&\ge \frac{a\delta ^\prime }{\tau ^a t}\left[ \frac{\ell \tau }{k} - \tau \right] -\frac{a\delta ^\prime }{\tau ^a t}\left[ t - \frac{\ell \tau }{k}\right] \\&\ge \frac{a\delta ^\prime \tau ^{1-a}}{2tk} \\&\ge \frac{a\delta ^\prime }{2^{2-a}\ell ^{1-a}(kt)^a}. \end{aligned}$$

If we now let \(\tilde{\delta }^{\prime } = \frac{a\delta ^\prime }{2^{2-a}}\) and apply (3.9c), (3.8) and \((2+\gamma )(s-a) = 1-a\) then we have

$$\begin{aligned} {\mathcal {I}}_R&\lesssim \int _{I_R} \frac{kt}{\ell ^{1+\gamma }}e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s} e^{-\frac{\tilde{\delta }^{\prime }}{\ell ^{1-a}}\left| kt\right| ^{s-a}} {\, \mathrm d}t \nonumber \\&\lesssim \int _{I_R} \frac{\ell ^{\frac{1-a}{s-a}}}{\ell ^{1+\gamma } (\tilde{\delta }^{\prime })^{\frac{1}{s-a}}}e^{-C_s\delta \langle k-\ell \rangle ^s - C_s\delta \langle kt-\ell \tau \rangle ^s} {\, \mathrm d}t \nonumber \\&\lesssim \frac{\ell ^{\frac{1-a}{s-a}}}{\ell ^{2+\gamma } \delta ^{1/s} (\tilde{\delta }^{\prime })^{\frac{1}{s-a}}} \left( \frac{\ell e^{-C_s\delta \langle k-\ell \rangle ^s}}{k}\right) \nonumber \\&\lesssim \frac{1}{\delta ^{2/s} (\tilde{\delta }^{\prime })^{\frac{1}{s-a}}} e^{-\frac{1}{2}C_s\delta \langle k-\ell \rangle ^s}, \end{aligned}$$
(6.9)

which is summable in k uniformly in \(\ell \) (the extra power of k in the denominator of the penultimate line came from the time integration).
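For completeness, the passage from the penultimate to the last line of (6.9) uses \((1-a)/(s-a) = 2+\gamma \) together with the elementary absorption of the factor \(\ell /k\): since \(1 \le k < \ell \),

$$\begin{aligned} \frac{\ell }{k} \le 1 + \left| \ell - k\right| \le \sqrt{2}\langle k-\ell \rangle , \qquad \langle k-\ell \rangle e^{-C_s\delta \langle k-\ell \rangle ^s} \lesssim \delta ^{-1/s} e^{-\frac{1}{2}C_s\delta \langle k-\ell \rangle ^s}, \end{aligned}$$

which accounts for the additional factor \(\delta ^{-1/s}\) in the last line.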

Assembling the contributions of (6.8) and (6.9), summing in k and taking the supremum in \(\ell \) and \(\tau \in [0,T^\star ]\) completes the proof of Lemma 6.2. \(\square \)

The following simple lemma is used in §5.2 above to deduce the pointwise-in-time control on \(\rho \).

Lemma 6.3

Under the bootstrap hypotheses (2.11) we have

$$\begin{aligned} \sup _{0 \le \tau \le t} \sup _{\ell \in \mathbb {Z}_*^d} \sum _{k \in \mathbb {Z}_*^d} \bar{K}_{k,\ell }(t,\tau ) \lesssim \sqrt{K_2} \epsilon \langle t \rangle . \end{aligned}$$

Proof

As in the proof of Lemmas 6.1 and 6.2, we may control g by (2.11b) and reduce to dimension one, leaving us to analyze the analogue of (6.7) except without the time integral:

$$\begin{aligned} {\mathcal {I}}_{k,\ell }(t,\tau ) = e^{-\nu (t,\tau )\langle k,kt \rangle ^s}e^{-\delta \langle k-\ell ,kt-\ell \tau \rangle ^s}\frac{\langle \tau \rangle }{\left| \ell \right| ^\gamma }. \end{aligned}$$

By using (3.9c) we have,

$$\begin{aligned} {\mathcal {I}}_{k,\ell }(t,\tau ) \lesssim e^{-C_s\delta \langle k-\ell \rangle ^s}\langle \tau \rangle , \end{aligned}$$

which after summing in k and taking the supremum in \(\ell \) and \(\tau \le t\) gives the lemma. \(\square \)

Final Steps of the Proof

By Proposition 2.2, (2.12) holds uniformly in time. By (1.3) and the algebra property (3.14),

$$\begin{aligned}&\int _0^T \left\| F(t,z+vt) \cdot (\nabla _v - t\nabla _z)(f^0+g)(t) \right\| _{\mathcal {G}^{\alpha _0}} {\, \mathrm d}t\\&\quad \lesssim \int _0^T \left\| \rho (t)\right\| _{\mathcal {F}^{\alpha _0}}\left\| (\nabla _v - t\nabla _z)(f^0+g)(t)\right\| _{\mathcal {G}^{\alpha _0}} {\, \mathrm d}t. \end{aligned}$$

Therefore, (2.12), (3.8), \(\lambda (t) \ge \alpha _0\), (1.9) and \(\sigma > \beta +1\) imply

$$\begin{aligned}&\int _0^T \left\| F(t,z+vt) \cdot (\nabla _v - t\nabla _z)(f^0+g)(t) \right\| _{\mathcal {G}^{\alpha _0}} {\, \mathrm d}t\\&\quad \lesssim \int _0^T \langle t \rangle ^{-\sigma +1} \left\| A\rho (t)\right\| _{2} \left\| A^{(-\beta )}(f^0 + g)(t)\right\| _{2} {\, \mathrm d}t \\&\quad \lesssim \left( \int _0^T \left\| A\rho (t)\right\| ^2_{2} {\, \mathrm d}t \right) ^{1/2} \left( \int _0^T \langle t \rangle ^{-2\sigma + 2} \left\| A^{(-\beta )}(f^0 + g)(t)\right\| ^2_{2} {\, \mathrm d}t\right) ^{1/2} \\&\quad \lesssim \epsilon . \end{aligned}$$

Therefore, we may define \(g_\infty \) satisfying \(\left\| g_\infty \right\| _{\mathcal {G}^{\alpha _0}} \lesssim \epsilon \) by the absolutely convergent integral

$$\begin{aligned} g_\infty = h_{{\mathrm{in}}} - \int _0^\infty F(\tau ,z+v\tau )\cdot (\nabla _v - \tau \nabla _z)(f^0 + g)(\tau ) {\, \mathrm d}\tau . \end{aligned}$$

Moreover, again by (3.14), (2.12) and (3.8),

$$\begin{aligned} \left\| g(t) - g_\infty \right\| _{\mathcal {G}^{\lambda ^{\prime }}}&\lesssim \int _t^\infty e^{({\lambda ^\prime } - \alpha _0)\langle \tau \rangle ^s} \langle \tau \rangle ^{-\sigma + 1} \left\| A\rho (\tau )\right\| _{2} \left\| A^{(-\beta )}(f^0+g)(\tau )\right\| _{2} {\, \mathrm d}\tau \\&\lesssim \epsilon e^{\frac{1}{2}({\lambda ^\prime } - \lambda _0)\langle t \rangle ^s}, \end{aligned}$$

which implies (1.12a). Then Lemma 5.2 implies (1.12b) (since \(\sigma > 1/2\)), completing the proof of Theorem 1.

We briefly sketch the refinement mentioned in Remark 6; specifically, we verify that the final state predicted by the linear theory is accurate to within \(O(\epsilon ^2)\). Indeed, let \(g^L\) be the solution to

$$\begin{aligned} \left\{ \begin{array}{l} \partial _t g^L + F^L(t,z+vt)\cdot \nabla _vf^0 = 0, \\ \widehat{F^L(t,z+vt)}(t,k,\eta ) = -ik\widehat{W}(k)\widehat{g^L}_k(t,kt) \delta _{\eta \,{=}\, kt}, \\ g^L(t = 0,z,v) = h_{{\mathrm{in}}}(z,v). \end{array} \right. \end{aligned}$$
(7.1)

By the analysis of §4 we have that \(h^L(t,x,v) = g^L(t,x-tv,v)\) satisfies the conclusions of Theorem 1 for \(h_\infty ^L = h_\infty ^L(z,v)\) given by

$$\begin{aligned} h_\infty ^L(z,v) = h_{{\mathrm{in}}}(z,v) - \int _0^\infty F^L(t,z+vt) \cdot \nabla _v f^0(v) {\, \mathrm d}t. \end{aligned}$$

Consider next the PDE

$$\begin{aligned} \partial _t \left( g-g^L\right) + \left( F - F^L\right) (t,z+vt) \cdot \nabla _vf^0 = -F(t,z+vt) \cdot (\nabla _v -t\nabla _z)g. \end{aligned}$$

By treating the right-hand side as a decaying external force, the analysis of §4, with \({\lambda ^\prime }\) replaced by some \(\lambda ^{\prime \prime } < {\lambda ^\prime }\), then implies

$$\begin{aligned} \left\| g(t)-g^L(t)\right\| _{\mathcal {G}^{\lambda ^{\prime \prime }}} \lesssim _{{\lambda ^\prime } - \lambda ^{\prime \prime }} \epsilon ^2, \end{aligned}$$

which shows that the nonlinearity changes the linear behavior at the expected \(O(\epsilon ^2)\) order. Justifying higher-order expansions should also be possible, but proving the convergence of a Newton iteration is significantly more challenging, as the constants would need to be quantified carefully.