1 Introduction

In this work, we investigate the thin film equation with linear mobility in arbitrary space dimensions, that is, the partial differential equation

$$\begin{aligned} \partial _\tau u+\nabla \cdot \left( u\nabla \Delta u\right) =0 \end{aligned}$$
(1)

in the whole space \(\mathbb {R}^N\). This equation models the flow of an \(N+1\) dimensional viscous fluid with high surface tension over a flat substrate, and thus, the real physical three-dimensional setting corresponds to the case \(N=2\). The evolving scalar variable \(u=u(\tau ,y)\) in (1) represents the height of the liquid film, and is assumed to be nonnegative [58, 59]. In the \(1+1\) dimensional case, equation (1) can also be seen as the lubrication approximation in a two-dimensional Hele–Shaw cell [34].

The thin film equation is degenerate parabolic in the sense that the diffusion flux decreases to zero where u vanishes. It follows that the speed of propagation is finite and thus droplet configurations stay compactly supported for all times. On a mathematical level, we are thus concerned with a free boundary problem. We will be focusing on a setting in which droplet solutions are slowly spreading over the full space, a regime that is commonly referred to as complete wetting. This is obtained mathematically by prescribing the contact angle at the droplet boundary \(\partial \{u>0\}\) to be zero, that is, \(\nabla u=0\).

A reference spreading droplet configuration is given by Smyth and Hill’s self-similar solution [6, 25, 66]

$$\begin{aligned} u_*(\tau ,y) = \frac{1}{\tau ^{\frac{N}{N+4}}}\alpha _N\left( \sigma _M-\frac{\vert y\vert ^2}{\tau ^{\frac{2}{N+4}}}\right) ^2_+, \end{aligned}$$
(2)

where \(\alpha _N=\frac{1}{8(N+4)(N+2)}\) and \(\sigma _M\) is a positive constant only depending on the mass constraint

$$\begin{aligned} \int \limits _{\mathbb {R}^N} u_*\text {d}y=M. \end{aligned}$$

Moreover, we write \((s)_+\) for the positive part \( \max \!\left\{ 0,s\right\} \!\) of a quantity s. These source-type solutions (2) play a distinguished role in the theory since they are, similar to related parabolic problems, believed to describe the large time asymptotic behavior of any solution of mass M to the thin film equation, that is,

$$\begin{aligned} u(\tau ,y)\approx u_*(\tau ,y)\quad \text{ for } \text{ any } \tau \gg 1. \end{aligned}$$
(3)

This convergence has been proved for strong solutions in the one-dimensional setting (\(N=1\)) via entropy methods by Carrillo and Toscani [13] and for minimizing movement solutions in arbitrary dimensions via gradient flow techniques by Matthes, McCann and Savaré [53]. Both contributions provide sharp rates of convergence and exploit the intimate relation between the thin film equation (1) and the porous medium equation

$$\begin{aligned} \partial _{\tau } u - \Delta u^{m}=0 \end{aligned}$$
(4)

in the case \(m=3/2\). In fact, up to a suitable rescaling, the Smyth–Hill solutions (2) coincide with the self-similar Barenblatt solutions [2, 61, 71] of the porous medium equation (4), and the surface energy, which is dissipated by the thin film equation (1), coincides with the rate of dissipation of the Tsallis entropy under the porous medium flow (4). See [53, 54] for a clean formulation of this entropy-information relation from a gradient flow perspective.

The link between the two equations can be further exploited in order to get deeper insights into the large time behavior of solutions to (1): When linearizing both equations about the self-similar solutions, it turns out that the linear porous medium operator \(\mathcal {L}\) translates into the linear thin film operator in a simple algebraic way, namely \(\mathcal {L}^2 +N\mathcal {L}\) [54]. It immediately follows that the eigenfunctions of both operators agree, while the transformation of the eigenvalues from the porous medium setting to the thin film setting obeys the same algebraic formula. The operator \(\mathcal {L}\) was diagonalized in [63, 70], and thus, the full spectral information is also available for the thin film equation [54], see Theorem 2.2 below. The spectrum of the one-dimensional operator was computed earlier in [7].

Knowledge of the complete spectrum gives information not only on the sharp rate of convergence (for which information about the spectral gap would be sufficient), but also on the geometry of all modes through the knowledge of all eigenfunctions. One may thus analyze in detail the role played by affine symmetries such as dilations, rotations or shears, and we will in this paper obtain improved rates of convergence for the thin film equation (1) by quotienting out such symmetries. Further details on the large time asymptotics can be formulated after a suitable change of variables.

Higher order large time asymptotics for the porous medium equation (4) with \(m>1\) were obtained in one dimension by Angenent [1], building on the spectral information in [70] and, more recently, in any dimension by the first author [64], building on [63]. While Angenent derived fine series expansions around the limiting solution, the later multidimensional contribution takes a geometric point of view by constructing finite-dimensional invariant manifolds that the solutions approximate to any given order. In the present work, we will derive a parallel theory for the thin film equation. Invariant manifold studies can be found in numerous applications in the field of nonlinear partial differential equations, for instance, [12, 17, 18, 19, 23, 26, 27, 28, 38, 68, 69]. What is particularly challenging in [64] and the present paper is the moving free boundary at which solutions cease to be smooth.

What is needed for linking the spectrum of the linear operators to the nonlinear dynamics (1) or (4) is a regularity framework in which solutions depend differentiably on the initial configuration. This is necessary since the precise rate of convergence in the limit (3) is dictated by the particular choice of the initial datum. Identifying such a framework is far from trivial. A crucial first step is a nonlinear change of variables that transforms the free boundary problem into an evolution equation on a fixed domain, which can be chosen as the unit ball. The linear leading order part of the equation can then be seen as a degenerate parabolic equation, whose degeneracy can be cured by interpreting the dynamics as a fourth-order heat flow on a weighted Riemannian manifold. For the porous medium equation, this setting was proposed by Koch in his habilitation thesis [47], further refined in the work of Kienzler [42] and then adapted in [64]. An analogous theory for the thin film equation was derived by John in [40] and later adapted by the first author in [65]. After some necessary refinements, the latter will be the starting point for the present study.

We would also like to mention the related studies by Denzler, Koch and McCann [21, 22] and Choi, McCann and the first author [15, 16], who derived some improved large time asymptotics for the fast diffusion equation, that is, (4) with \(m<1\), in the full space and a bounded domain, respectively. The full space setting is particularly challenging due to the occurrence of continuous spectrum, which arises from the fact that the associated Barenblatt profile possesses only a finite number of moments, while in a bounded domain, in which solutions become extinct in finite time, negative (unstable) eigenvalues challenge the leading order asymptotics [9].

On a technical level, the passage from the second order problem [64] to the present fourth order problem is far from being trivial. Indeed, for the construction of invariant manifolds a truncation has to be introduced that reflects the maximal regularity properties of the linear part of the equation. In order to remove the truncation eventually, improved regularity and new smoothing estimates are crucial. We elaborate on this issue later in Section 6.

A major improvement with respect to the analogous second order paper [64] is a group theoretical point of view that we introduce here and with the help of which we are able to classify and mod out symmetries in order to obtain convergence rates of higher order.

Before giving in the next section a specific description of our setting and of our main results, we want to finish this introductory section with a brief discussion of the state of the art in the mathematical theory for the thin film equation. Existence of nonnegative weak solutions was established with the help of compactness arguments and estimates on the free surface energy by Bernis and Friedman [5]. This approach is not adequate to prove a general uniqueness result even though the regularity of these solutions could be improved, see [4, 8, 20]. In a neighborhood of stationary solutions (of infinite mass), well-posedness and regularity of one-dimensional solutions could be established in a weighted Sobolev setting [32] and in Hölder spaces [31]. Moreover, the aforementioned work [40] deals with the multidimensional case and lowers the regularity requirements to Lipschitz norms and Carleson-type measures. The latter approach was adapted to neighborhoods of the Smyth–Hill self-similar solution in [65]. The one-dimensional setting was also considered in [35] using weighted Hilbert spaces. We finally remark that for nonlinear mobilities, solutions are in general not smooth, see [3, 29, 30, 36, 43, 44, 45] for results in the complete wetting and partial wetting regime (positive contact angle). Moreover, even though convergence to the self-similar or stationary solutions [6, 25] is expected, no results in this direction are available so far. (Equilibration towards stationary profiles in the \(1+1\) dimensional linear-mobility partial wetting regime can be found in [24, 52].)

Organization of the Paper. In the next section, we state and discuss our results on the large time asymptotics in self-similar variables. In Section 3, we rewrite the thin film equation as a perturbation equation around the self-similar Smyth–Hill solution and present our main theorems of this paper, including the Invariant Manifold Theorem. We will describe in Section 4 how these results for the perturbation equation translate into the large-time asymptotics for the thin film equation. Section 5 collects information on the well-posedness of the perturbation equation and improves on known regularity estimates. The subsequent Section 6 deals with a truncated version of the perturbation equation. Well-posedness and regularity estimates are provided. Moreover, we introduce and discuss the time-one mapping that will be our main object of consideration in our construction of invariant manifolds in Section 7. The final Section 8 exploits the invariant manifold theory to prove the large-time asymptotic expansions for the perturbation equation. We conclude with two appendices, one with a derivation of the perturbation equation, one with inequalities for weighted Sobolev spaces.

2 Higher Order Asymptotics for the Thin Film Equation

In order to study the convergence towards self-similar solutions, it is customary to perform a self-similar change of variables. In view of the particular form of the Smyth–Hill solution (2), we choose

$$\begin{aligned} x = \frac{1}{\sqrt{\sigma _M}}\frac{1}{\tau ^{\frac{1}{N+4}}}y, \quad t= \gamma ^{-1} \log \left( \tau ^{\frac{1}{N+4}}\right) \quad \text { and } \quad v = \frac{(N+4)\gamma }{\sigma _M^2}\tau ^{\frac{N}{N+4}}u, \end{aligned}$$
(5)

where \(\gamma = 2(N+2)\), which transforms equation (1) into the confined thin film equation

$$\begin{aligned} \partial _t v + \nabla \cdot \left( v\nabla \Delta v\right) - \gamma \nabla \cdot \left( xv\right) =0, \end{aligned}$$
(6)

and turns the self-similar solution (2) into a stationary one,

$$\begin{aligned} v_*(x) = \frac{1}{4}\left( 1-\vert x \vert ^2\right) _+^2. \end{aligned}$$
(7)
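The stationarity of (7) can be checked by hand inside the unit ball: the flux in (6) is \(v(\nabla \Delta v-\gamma x)\), and it vanishes as soon as \(\nabla \Delta v_*=\gamma x\). The following Python sketch performs this check with exact arithmetic. The helper names and the representation of even radial polynomials as dictionaries are illustrative choices of ours, and the computation covers only the interior of the support, not the free boundary.

```python
from fractions import Fraction

def radial_laplacian(poly, N):
    """Laplacian in R^N of an even radial polynomial sum_j a_j r^j,
    stored as a dict {j: a_j}; uses Delta r^j = j (j + N - 2) r^(j-2)."""
    out = {}
    for j, a in poly.items():
        if j >= 2:
            out[j - 2] = out.get(j - 2, 0) + a * j * (j + N - 2)
    return out

def gradient_factor(poly):
    """Return g (as a dict) such that grad f(x) = g(r) x for the even
    radial polynomial f, using grad r^j = j r^(j-2) x."""
    return {j - 2: a * j for j, a in poly.items() if j >= 2}

def stationary_flux_factor(N):
    """Factor g with grad(Delta v_*)(x) = g(r) x inside the unit ball,
    where v_* = (1 - r^2)^2 / 4 = 1/4 - r^2/2 + r^4/4."""
    v_star = {0: Fraction(1, 4), 2: Fraction(-1, 2), 4: Fraction(1, 4)}
    return gradient_factor(radial_laplacian(v_star, N))

if __name__ == "__main__":
    for N in (1, 2, 3):
        # The constant factor should equal gamma = 2 (N + 2).
        print(N, stationary_flux_factor(N))
```

In each dimension the factor reduces to the constant \(2(N+2)=\gamma \), so that \(v_*\nabla \Delta v_*=\gamma xv_*\) on the support.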

We remark that under this change of variables, the time \(\tau =1\) of the original equation corresponds to \(t=0\), while the original initial time \(\tau =0\) is sent to \(t=-\infty \). As we are interested in the solutions’ large time behavior only, we will hereafter treat 0 as the initial time for the transformed equation. Moreover, the rescaling incorporates the total mass M through \(\sigma _M\) in such a way that the stationary \(v_*\) is the limiting solution only if v and \(v_*\) have the same total mass. In what follows, we will assume that this is always the case by requiring that

$$\begin{aligned} \int \limits _{\mathbb {R}^N} v_0\, \text {d}x = \int \limits _{\mathbb {R}^N} v_*\, \text {d}x, \end{aligned}$$
(8)

if \(v_0\) is the initial configuration for the evolution in (6).

The theory in [65] guarantees that the confined thin film equation (6) has a unique regular solution provided that \(v_0\) and \(v_*\) are sufficiently close in the sense that

$$\begin{aligned} \Vert \sqrt{v_0} - V_* \Vert _{W^{1,\infty }({{\,\textrm{supp}\,}}v_0)} \ll 1, \end{aligned}$$
(9)

where \(V_*( x) = \frac{1}{2}(1-| x|^2)\) is the (unsigned) extension of \(\sqrt{ v_*}\) to \(\mathbb {R}^N\). This condition actually yields strong estimates between \(v_0\) and the exact stationary solution \(v_*\), as will be explained in the following remark:

Remark 2.1

Choosing the globally decaying \(V_*\) over \(\sqrt{v_*}\) in (9) has the advantage that we can simultaneously infer from it information on the support of \(v_0\), a global estimate on the difference of \(v_0\) and \(v_*\), and a bound on the slope of \(v_0\).

Indeed, regarding the first, restricting to the boundary of the support, where \(v_0\) vanishes, and noticing that \(V_*(x)\sim {{\,\textrm{dist}\,}}(x,\partial B_1(0))\), we directly deduce

$$\begin{aligned} \sup \limits _{x\in \partial {{\,\textrm{supp}\,}}v_0}{{\,\textrm{dist}\,}}(x, \partial B_1(0)) \ll 1. \end{aligned}$$

Next, we observe that \(V_*=\sqrt{v_*}\) inside the ball \(B_1(0)\). Outside of \(B_1(0)\) it holds that \(0\le \sqrt{v_0} - \sqrt{v_*} = \sqrt{v_0} \le \sqrt{v_0}-V_*\), and thus, we find

$$\begin{aligned} \Vert \sqrt{v_0} - \sqrt{v_*}\Vert _{L^{\infty }(A)} \ll 1 \end{aligned}$$
(10)

on the set \(A = {{\,\textrm{supp}\,}}v_0\) as a consequence of (9). Moreover, since \(V_*- \sqrt{v_0}= V_* = \sqrt{v_*}\) on \(B_1(0)\cap \partial {{\,\textrm{supp}\,}}v_0\) and since \(v_*\) is decaying towards the boundary, the estimate (10) holds true also on \(A = B_1(0) {\setminus } {{\,\textrm{supp}\,}}v_0 \). It remains to notice that \(v_0 = v_*=0\) on the remaining set \(A=B_1(0)^c\cap ({{\,\textrm{supp}\,}}v_0)^c\), and thus (10) is proved to be true with \(A=\mathbb {R}^N\). We immediately deduce that

$$\begin{aligned} \left\| v_0-v_*\right\| _{L^\infty \left( \mathbb {R}^N\right) }\ll 1, \end{aligned}$$

because \(|v_0-v_*| = |\sqrt{v_0}-\sqrt{v_*}|(\sqrt{v_0}+\sqrt{v_*}) \lesssim |\sqrt{v_0}-\sqrt{v_*}|\), where the last inequality is true because \(v_0\) and \(v_*\) are bounded.

Finally, we can also extract a condition on the slope of \(v_0\), namely,

$$\begin{aligned} \left\| \nabla v_0 + 2x \sqrt{v_0}\right\| _{L^\infty \left( \mathbb {R}^N\right) } \ll 1. \end{aligned}$$
(11)

To establish (11), we first note that the left-hand side vanishes provided that x does not lie in the support of \(v_0\). Inside \( {{\,\textrm{supp}\,}}v_0\), we have \(|\nabla v_0+ 2x\sqrt{v_0}| = 2\sqrt{v_0}|\nabla \sqrt{v_0}- \nabla V_*|\lesssim |\nabla \sqrt{v_0}- \nabla V_*|\), because \(v_0\) is bounded. Condition (9) then yields the claim. Notice that the left-hand side in (11) vanishes precisely for \(v_0=v_*\) (under the mass constraint (8)).

The main results of the referred work [65] are repeated in more detail later in Section 5. This section also contains the main results of the present work. At this stage, we present some consequences of that general theory for the confined thin film equation (6), which provide exemplary improved convergence rates towards equilibrium by quotienting out symmetries.

Since the rates of relaxation are intimately related to the spectrum of the linear operator \(\mathcal {L}^2+N\mathcal {L}\) associated to the confined equation (6) (see Section 3 below), we recall the findings of the spectral analysis from the literature for a better understanding of the results presented in the sequel.

Theorem 2.2

([7, 54]) The operator \(\mathcal {L}^2+N\mathcal {L}\) has a purely discrete spectrum consisting of the eigenvalues

$$\begin{aligned} \mu _{l,k}= \lambda _{l,k}^2+N\lambda _{l,k}, \end{aligned}$$

where the \(\lambda _{l,k}\) are the eigenvalues of \(\mathcal {L}\). They are given by

$$\begin{aligned} \lambda _{l,k}= 2\left( l+2k\right) +2k\left( k+l+\frac{N}{2}-1\right) , \end{aligned}$$

for \((l,k)\in \mathbb {N}_0\times \mathbb {N}_0\) if \(N\ge 2\) and \((l,k) \in \{0,1\}\times \mathbb {N}_0\) if \(N=1\). The corresponding eigenfunctions are polynomials of degree \(l+2k\), namely

$$\begin{aligned} \psi _{l,n,k}(x) = {}_2F_1\left( -k,1+l+\frac{N}{2}+k;l+\frac{N}{2};|x|^2\right) Y_{l,n}\left( \frac{x}{|x|}\right) |x|^{l}, \end{aligned}$$

where \(n\in \{1,\dots ,N_l\}\) with \(N_0=1\), \(N_1=N\) and \(N_l=\frac{\left( N+l-3\right) !\left( N+2l-2\right) }{l!\left( N-2\right) !}\) if \(l\ge 2\). Besides, \({}_2F_1(a,b;c;d)\) is a hypergeometric function and \(Y_{l,n}\) is a spherical harmonic (of degree l) if \(N\ge 2\), corresponding to the eigenvalue \(l(l+N-2)\) of \(-\Delta _{\mathbb {S}^{N-1}}\) with multiplicity \(N_l\). If \(N=1\), we have \(Y_{l,n}\left( \pm 1\right) = \left( \pm 1\right) ^l.\)

The computation of the linear operator in [54] was rather formal and was derived from the gradient flow interpretation of (6) with respect to the Wasserstein metric tensor [33, 53, 60]. It occurs naturally after suitable rescaling in the perturbation equation (19).

In the statement of the theorem, the linear operator is analyzed with respect to the Hilbert space introduced in (21) below, and the eigenfunctions \(\psi _{l,n,k}\) give rise to an orthogonal basis of that Hilbert space.
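To get a feeling for the ordering of the eigenvalues in Theorem 2.2, one can simply tabulate them. The Python sketch below evaluates the formulas for \(\lambda _{l,k}\) and \(\mu _{l,k}\) in integer arithmetic (using \(2k(k+l+\frac{N}{2}-1)=k(2k+2l+N-2)\)) and lists all index pairs below a given bound. The function names and the `bound` parameter are our own illustrative choices.

```python
def porous_medium_eigenvalue(l, k, N):
    """lambda_{l,k} = 2 (l + 2k) + 2k (k + l + N/2 - 1), kept as an integer."""
    return 2 * (l + 2 * k) + k * (2 * k + 2 * l + N - 2)

def thin_film_eigenvalue(l, k, N):
    """mu_{l,k} = lambda_{l,k}^2 + N lambda_{l,k}."""
    lam = porous_medium_eigenvalue(l, k, N)
    return lam * lam + N * lam

def spectrum(N, bound):
    """Sorted list of triples (mu_{l,k}, l, k) with mu_{l,k} <= bound.

    For N = 1 only l in {0, 1} is admissible, as in Theorem 2.2.
    The loops stop early because mu_{l,k} increases in both l and k.
    """
    modes = []
    l = 0
    while thin_film_eigenvalue(l, 0, N) <= bound and (N >= 2 or l <= 1):
        k = 0
        while thin_film_eigenvalue(l, k, N) <= bound:
            modes.append((thin_film_eigenvalue(l, k, N), l, k))
            k += 1
        l += 1
    return sorted(modes)

if __name__ == "__main__":
    for mu, l, k in spectrum(2, 100):
        print(f"mu_({l},{k}) = {mu}")
```

For \(N=2\), the list shows, for instance, that \(\mu _{0,1}\) and \(\mu _{3,0}\) take the same value.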

We recall that hypergeometric functions can be written as power series of the form

$$\begin{aligned} {}_2F_1(a,b;c;z)=\sum \limits _{j=0}^{\infty } \frac{(a)_j(b)_j}{(c)_jj!}z^j, \end{aligned}$$

where \(a,b,c,z \in \mathbb {R}\) and c is not a non-positive integer, see, for example, [62]. The definition uses extended factorials, also known as Pochhammer symbols,

$$\begin{aligned} (s)_j=s(s+1)\cdots (s+j-1), \quad \text { for } j\ge 1 \text { and } (s)_0=1. \end{aligned}$$

The hypergeometric functions with \(z = |x|^2\) reduce to a polynomial of degree 2k if we plug in \(-k\) for a. In this case, they can be expressed as Jacobi polynomials.
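The truncation of the series can be made concrete with a few lines of Python. The sketch below computes the coefficients of \({}_2F_1(-k,1+l+\frac{N}{2}+k;l+\frac{N}{2};z)\) in powers of z with exact rational arithmetic, directly from the Pochhammer symbols defined above. The function names are our own, and only the radial factor of the eigenfunctions is treated, not the spherical harmonics.

```python
from fractions import Fraction
from math import factorial

def pochhammer(s, j):
    """Pochhammer symbol (s)_j = s (s+1) ... (s+j-1), with (s)_0 = 1."""
    result = Fraction(1)
    for i in range(j):
        result *= s + i
    return result

def radial_coefficients(l, k, N):
    """Coefficients c_0, ..., c_k of 2F1(-k, 1+l+N/2+k; l+N/2; z) in powers of z.

    Since (-k)_j vanishes for j > k, the power series stops at degree k in z,
    that is, at degree 2k in |x| when z = |x|^2.
    """
    a = Fraction(-k)
    b = Fraction(N, 2) + 1 + l + k
    c = Fraction(N, 2) + l
    return [
        pochhammer(a, j) * pochhammer(b, j) / (pochhammer(c, j) * factorial(j))
        for j in range(k + 1)
    ]

if __name__ == "__main__":
    # Radial factor for l = 0, k = 1 in dimension N = 2.
    print(radial_coefficients(0, 1, 2))
```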

Fig. 1 The spectrum of the linear operator with multiplicities in the range [0, 400] in the \(2+1\) dimensional setting (\(N=2\))

In the one-dimensional setting, all eigenvalues have multiplicity one. In higher dimensions, all eigenvalues with \(l\ge 2\) have a dimension dependent multiplicity that stems from the multiplicity of the eigenvalue \(l(l+N-2)\) associated with the spherical harmonics, that is, the eigenfunctions of the Laplace–Beltrami operator \(\Delta _{\mathbb {S}^{N-1}}\). In addition, there are certain intersections between the eigenvalues \(\mu _{\cdot , k}\) and \(\mu _{\cdot , k+n}\). For instance, in two dimensions, it holds that \(\mu _{l,k} = \mu _{l(k+1) + k(k+2),0}\) for any k and l, see Figure 1.
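Both the multiplicity formula and the two-dimensional intersection identity are easy to test numerically. The following Python sketch does so. The names `harmonic_dimension` and `total_multiplicity` are hypothetical helpers of ours, and the total multiplicity of a value is computed by brute force as the sum of \(N_l\) over all index pairs (l, k) producing that value.

```python
from math import factorial, isqrt

def thin_film_eigenvalue(l, k, N):
    """mu_{l,k} = lambda^2 + N lambda with lambda = 2(l+2k) + 2k(k+l+N/2-1)."""
    lam = 2 * (l + 2 * k) + k * (2 * k + 2 * l + N - 2)
    return lam * lam + N * lam

def harmonic_dimension(l, N):
    """N_l from Theorem 2.2 (the general formula requires N >= 2 when l >= 2)."""
    if l == 0:
        return 1
    if l == 1:
        return N
    return factorial(N + l - 3) * (N + 2 * l - 2) // (factorial(l) * factorial(N - 2))

def total_multiplicity(value, N):
    """Sum of N_l over all (l, k) with mu_{l,k} == value, for N >= 2."""
    # lambda <= sqrt(mu), and lambda >= 2l, lambda >= 4k bound the search.
    limit = isqrt(value) + 1
    return sum(
        harmonic_dimension(l, N)
        for l in range(limit)
        for k in range(limit)
        if thin_film_eigenvalue(l, k, N) == value
    )

if __name__ == "__main__":
    # Intersection identity in two dimensions.
    for l in range(6):
        for k in range(6):
            target = l * (k + 1) + k * (k + 2)
            assert thin_film_eigenvalue(l, k, 2) == thin_film_eigenvalue(target, 0, 2)
    print(total_multiplicity(48, 2))
```

In two dimensions, the value 48 is attained by both \(\mu _{0,1}\) and \(\mu _{3,0}\), so the radial mode and the two modes of degree 3 share this eigenvalue.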

2.1 Leading Order Asymptotics

Apparently, \(\mu _{0,0}=0\) is the smallest eigenvalue. It corresponds to a situation in which the convergence in (3) fails, which is precisely the case if the equal mass condition (8) is not satisfied. Conversely, by requiring that (8) holds, this eigenvalue is automatically eliminated. The exact leading order asymptotics are then governed by the second smallest eigenvalue \(\mu _{1,0}=4+2N\), which is our first result for solutions to the confined thin film equation. We will derive it from a more general statement in Theorem 3.1 in Section 3 and present it thus as a corollary here.

Corollary 2.3

(Exact leading order asymptotics) Let v be the solution to (6) with initial data \(v_0\) satisfying the mass constraint (8) and being sufficiently close to \(v_*\) in the sense of (9). Then it holds that

$$\begin{aligned} \left\| \sqrt{v(t)}-V_*\right\| _{W^{1,\infty }\left( {{\,\textrm{supp}\,}}v(t)\right) }&\lesssim e^{-(4+2N) t} \quad \text { for all } t\ge 0. \end{aligned}$$

The result entails the convergence of v(t) towards \(v_*\) as outlined in Remark 2.1.

The same rate of convergence was established earlier in terms of the relative Tsallis entropy and the \(L^1\) norm by Carrillo and Toscani [13] in the one-dimensional setting and by Matthes, McCann and Savaré [53] in any dimension (if one takes into account the difference in the time scaling that we introduced in (5) through the \(\gamma ^{-1}\) factor). It corresponds to an \(O(\tau ^{-(N+1)/(N+4)})\) convergence in the limit (3) for the original thin-film equation (1).
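The translation between exponential rates in t and algebraic rates in \(\tau \) follows from (5): since \(\tau ^{1/(N+4)}=e^{\gamma t}\), a decay \(e^{-\mu t}\) becomes \(\tau ^{-\mu /(\gamma (N+4))}\), and the amplitude factor in (5) contributes a further \(\tau ^{-N/(N+4)}\). The short Python sketch below evaluates the resulting exponent. It assumes that the difference is measured in the supremum norm, so that only these two factors enter; the function name is ours.

```python
from fractions import Fraction

def algebraic_decay_exponent(mu, N):
    """Exponent p such that a rate e^{-mu t} for v - v_* corresponds to
    O(tau^{-p}) for u - u_* (assumption: supremum norm, scaling (5))."""
    gamma = 2 * (N + 2)
    amplitude = Fraction(N, N + 4)              # from the prefactor tau^{N/(N+4)} in (5)
    relaxation = Fraction(mu, gamma * (N + 4))  # from e^{-mu t} = tau^{-mu/(gamma (N+4))}
    return amplitude + relaxation

if __name__ == "__main__":
    for N in (1, 2, 3):
        # Leading order rate mu_{1,0} = 4 + 2N.
        print(N, algebraic_decay_exponent(4 + 2 * N, N))
```

For \(\mu =\mu _{1,0}=4+2N\), this reproduces the exponent \((N+1)/(N+4)\) stated above.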

The convergence rate in this corollary is sharp and is saturated by spatial translations of the stationary solution \(v_*\). Indeed, for every vector \(b\in \mathbb {R}^N\), the function \(v(t,x)=v_*(x-e^{-\gamma t}b)\) solves the confined thin film equation exactly and approaches \(v_*(x)\) with exponential rate \(\gamma =4+2N\), as can be readily checked via Taylor expansion. However, because the original equation (1) is invariant under spatial translations, the convergence in (3) with rate \(O(\tau ^{-(N+1)/(N+4)})\) remains true for any shifted version of the Smyth–Hill solution, that is, \(u(\tau ,y)\approx u_*(\tau ,y-b)\), and the significance of this rate is thus an artifact of this symmetry. Indeed, the above argument shows that the convergence in Corollary 2.3 is sharp only if we are not willing to pick the “correctly” centered Smyth–Hill solution. We may equivalently adjust the initial datum by a suitable translation in \(\mathbb {R}^N\). As we will see, the “correct” choice for b is the center of mass, which is preserved under the original evolution (1) and pushed towards the origin by the confined equation,

$$\begin{aligned} \int xv(t,x)\text {d}x = e^{- \gamma t}b_0,\quad b_0=\int x v_0(x)\text {d}x \end{aligned}$$
(12)

for all \(t\ge 0\), because our rescaling (5) has eliminated the translation invariance. Supposing that \(v_0\) is centered at the origin, \(b_0=0\), the eigenvalue \(\mu _{1,0}\) drops out of the spectrum and we obtain a better rate of convergence, namely the one given by the next smallest eigenvalue, which is \(\mu _{0,1}=30\) if \(N=1\), and \(\mu _{2,0} = 16+4N\) if \(N\ge 2\).

Corollary 2.4

Let v be as in Corollary 2.3 and assume in addition that \(v_0\) is centered at the origin, that is, \(b_0=0\) in (12). Then, it holds that

$$\begin{aligned} \left\| \sqrt{v(t)}-V_*\right\| _{W^{1,\infty }\left( {{\,\textrm{supp}\,}}v(t)\right) }&\lesssim e^{- 30t} \quad \text { for all } t\ge 0 \end{aligned}$$

if \(N=1\), and

$$\begin{aligned} \left\| \sqrt{v(t)}-V_*\right\| _{W^{1,\infty }\left( {{\,\textrm{supp}\,}}v(t)\right) }&\lesssim e^{-(16 +4N)t} \quad \text { for all }t\ge 0 \end{aligned}$$

if \(N\ge 2\).

This rate of convergence is again sharp for solutions that start, if \(N\ge 2\), from affine transformations of the stationary solution, and if \(N=1\), from dilated stationary solutions. Because we will discuss dilated stationary solutions later also in the multi-dimensional case, we will restrict ourselves here to the setting \(N\ge 2\). Solutions starting from affine transformations of \(v_*\) are then to leading order (modulo rescaling to fit the mass constraint) described by \(v(t,x)\approx v_*(x - e^{-\mu _{2,0}t} Ax)\) for a symmetric and trace-free matrix A. The validity of this asymptotics is best understood in terms of the perturbation equation, that we will introduce in the subsequent section.

The occurrence of such affine transformations can be explained on the level of the eigenfunctions computed in Theorem 2.2: The finite displacements \(v_s\) generated by an eigenfunction \(\psi \) are described by

$$\begin{aligned} v_s( x+s\nabla \psi (x)) \det (I+s\nabla ^2 \psi (x)) = v_*(x), \end{aligned}$$
(13)

provided that \(|s|\ll 1\). For \(k=0\), the eigenfunctions are homogeneous harmonic polynomials of degree l, namely \(\psi _{l,n,0}(x) = Y_{l,n}(x/|x|) |x|^l\). If \(l=2\), the generating polynomials are quadratic, and thus of the form \(\psi (x) = x\cdot Ax\) for a symmetric and trace-free matrix A. In this case, (13) defines affine transformations.
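As a concrete instance of (13), consider \(N=2\) and \(\psi (x)=a(x_1^2-x_2^2)\), that is, \(A=\mathrm {diag}(a,-a)\). Then \(\nabla \psi (x)=2Ax\) and \(\nabla ^2\psi =2A\), so (13) can be solved explicitly for \(v_s\). The Python sketch below evaluates the resulting profile. The function names and the parameter `a` are illustrative choices of ours, and the formula is only meaningful for \(|2sa|<1\).

```python
def v_star(x1, x2):
    """Stationary profile v_*(x) = (1 - |x|^2)_+^2 / 4 in two dimensions."""
    r2 = x1 * x1 + x2 * x2
    return 0.25 * max(0.0, 1.0 - r2) ** 2

def displaced_profile(z1, z2, s, a=1.0):
    """v_s from (13) for psi(x) = a (x1^2 - x2^2), i.e. A = diag(a, -a).

    Here x + s grad psi(x) = (I + 2 s A) x, so v_s(z) is v_* evaluated at
    (I + 2 s A)^{-1} z, divided by det(I + 2 s A).
    """
    d1 = 1.0 + 2.0 * s * a
    d2 = 1.0 - 2.0 * s * a
    det = d1 * d2
    return v_star(z1 / d1, z2 / d2) / det

if __name__ == "__main__":
    s = 0.1
    # The support is an ellipse with semi-axes 1 + 2s and 1 - 2s.
    print(displaced_profile(1.1, 0.0, s), displaced_profile(0.0, 0.9, s))
```

The unit ball is mapped to an ellipse, which illustrates the affine deformation generated by the modes with \(l=2\).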

For further improvements on the rate of convergence, we have to quotient out affine transformations.

2.2 Higher Order Corrections and the Role of Symmetries

In order to improve on the convergence rates even further, we exploit symmetry invariances of the thin film equation in conjunction with symmetry properties of spherical harmonics, which determine the angular modulations of our eigenfunctions, see Theorem 2.2. More precisely, we will obtain higher order convergence rates by assuming that the initial datum \(v_0\) is invariant under certain orthogonal transformations. Because such transformations leave the thin film equation invariant and thanks to the uniqueness of solutions near self-similarity [65], the invariance under those orthogonal transformations is inherited by the solution for all times. We will show that the orthogonality condition leads to a selection among the eigenfunctions forcing a large class of eigenmodes to remain inactive during the evolution. The slowest active mode will then govern the large-time asymptotics.

Fig. 2 The finite displacements of \(v_*\) generated by the eigenfunctions \(\psi _{l,n,0}\) in the physical case \(N=2\)

To motivate our approach for modding out certain modes, it is enlightening to study briefly the situation in two space dimensions, \(N=2\). In Figure 2, we have plotted some finite displacements, cf. (13), generated by eigenfunctions \(\psi _{l,n,0}\) with \(n\in \{1,\dots , N_l\}\). Apparently, displacements generated by \(\psi _{l,n,0}\) (and then also by any polynomial of the form \(p(|x|)Y_{l,n}(x/|x|)\) including \(\psi _{l,n,k}\)) share precisely the symmetry properties of a regular l-polygon. Under the assumption that the solution has the symmetry properties of such a regular l-polygon, all eigenmodes generated by \(\psi _{m,n,k}\) with \(m<l\) are necessarily inactive. In Remark 4.2 below, we will discuss the short elementary argument that rigorously supports this observation.

In higher space dimensions, the situation gets more involved and the structure of the spherical harmonics is more complex. In order to mod out eigenmodes, taking a more abstract approach is strongly advised. We choose a group theoretical approach, noticing that the symmetry group of a regular l-polygon is a finite subgroup of the group of orthogonal transformations O(N). Our goal is to determine geometric conditions on an arbitrary function, more precisely, invariances under the action of a given finite subgroup of O(N), which guarantee that the \(L^2\)-projections of that function onto all spherical harmonics of a given degree l vanish. To achieve this goal, we will eventually apply tools originating from the field of representation theory of groups, see, for example, [10, 57] for elementary considerations.

The space of square integrable functions on the unit sphere \(L^2\left( \mathbb {S}^{N-1}\right) \) can be decomposed into a direct Hilbert sum over the eigenspaces of \(\Delta _{\mathbb {S}^{N-1}}\),

$$\begin{aligned} L^2\left( \mathbb {S}^{N-1}\right) = \bigoplus \limits _{l\in \mathbb {N}_0}H_l, \end{aligned}$$

where the eigenspace \(H_l\) is spanned by the spherical harmonics of degree l and its dimension is given by \(N_l\), see Theorem 2.2. We remark that every eigenspace \(H_l\) is invariant under the action of orthogonal transformations. More precisely, given an orthogonal matrix \(g\in O(N)\), for every \(f\in H_l \) we have that \(f\circ g^{-1} \in H_l\).

If E is a finite subgroup of O(N), we denote by \(H_l^E\) the subspace of \(H_l\) consisting of all functions that are invariant under the action of all elements of E, that is, \(f\circ g^{-1}=f\) for any \(f\in H_l^E\) and \(g\in E\). The eigenmodes corresponding to an eigenvalue \(\mu _{l,k}\) are all modded out by the action of elements in E if that subspace is trivial, \(\dim (H_l^E)=0\). We present and discuss our final convergence result under such an abstract condition and will discuss thereafter some specific choices of E, for which we will need some deeper insights from the representation theory of finite groups.

Corollary 2.5

Let \(N\ge 2\) and v be given as in Corollary 2.4 satisfying

$$\begin{aligned} \left\| \sqrt{v(t)}-V_*\right\| _{L^\infty ({{\,\textrm{supp}\,}}v(t))}\lesssim e^{-\mu _{l,k}t} \quad \text { for all }t\ge 0 \end{aligned}$$
(14)

for some \(l\in \mathbb {N}\) and \(k\in \mathbb {N}_0\), such that the multiplicity of \(\mu _{l,k}\) is given by \(N_l\). Assume in addition that \(v_0\) is invariant under the action of a finite subgroup E of O(N) such that

$$\begin{aligned} \dim \left( H_l^E\right) =0. \end{aligned}$$
(15)

Then it holds that

$$\begin{aligned} \left\| \sqrt{v(t)}-V_*\right\| _{W^{1,\infty }({{\,\textrm{supp}\,}}v(t))}&\lesssim e^{-\mu _+t} \quad \text { for all }t\ge 0, \end{aligned}$$

where \(\mu _+\) is the smallest eigenvalue that is larger than \(\mu _{l,k}\).

We shall briefly comment on the assumptions on v(t) in the latter corollary.

Remark 2.6

It may be surprising that it suffices to demand the decay of \(\sqrt{v(t)}-V_*\) in \(L^\infty \) instead of \(W^{1,\infty }\), which would be the expected setting in view of the previous results. Due to the regularizing properties of the equation and the Lipschitz bound (9) for the initial time, we will eventually see that both assumptions are in fact equivalent in the given situation. We will discuss this phenomenon briefly in the proof of Corollary 2.5.

Not every eigenfunction corresponds to an orthogonal transformation and thus, a symmetry condition like (15) is in general not sufficient to jump from one eigenvalue to another. Indeed, all eigenfunctions \(\psi _{0,1,k}\) are radially symmetric polynomials, and the slowest of the corresponding modes is generated by delayed Smyth–Hill solutions \(u_*(\tau + \tau _0,y)\) of (1), which turn into the dilations \(\lambda (t)^{-N}v_*(\lambda (t)^{-1}x)\) with \(\lambda (t)\approx 1 + \frac{1}{N+4}\tau _0 e^{-\mu _{0,1}t} \) solving the confined equation (6), and converging towards the stationary \(v_*\) with exponential rate \(\mu _{0,1}\). We do not know if these modes can be eliminated by a reasonable assumption on the initial configuration nor do we see how they can be suitably controlled during the evolution. Therefore, in order to raise the convergence rates beyond eigenvalues \(\mu _{0,k}\), the decay hypothesis (14) seems necessary to ensure that the respective radial modes are inactive. We have to demand that the multiplicity of the eigenvalue \(\mu _{l,k}\) in (14) is precisely \(N_l\), in order to exclude possible resonances with any spherical harmonics of different order (such that \(\mu _{l,k} = \mu _{{\tilde{l}},{\tilde{k}}}\)).

To conclude the discussion about higher order asymptotics on the level of the confined thin film equation, we remark that the number of eigenvalues we are able to remove from the spectrum before reaching \(\mu _{0,1}\) (provided we find a suitable subgroup of O(N)) depends on the space dimension: If the dimension is odd, \(N=2m-1\), then \(\mu _{0,1}\) is the \((m+2)\)th eigenvalue and has multiplicity one. In even dimensions, \(N=2m\), it coincides with \(\mu _{m+2,0} \).

We finally recall from the introduction that further and, in fact, much stronger statements on the large time asymptotics can be derived after a customary change of variables. These will be presented and discussed in the next section.

It remains to identify finite subgroups E of O(N), which mod out spherical harmonics of a given order l in the sense of (15). We will do that by applying a surprisingly helpful tool, the Molien series, which originates from the field of representation theory of groups. It was suggested to us by our colleague Linus Kramer.

The subspace \(H_l\subseteq L^2\left( \mathbb {S}^{N-1}\right) \) of spherical harmonics of degree l can be identified with the space of symmetric, trace-free tensors of rank l that we will further denote by \(H_l\) as well. The generating function \(h_E(s)\) for the dimensions \( \dim \left( H_l^E\right) \) of the subspaces \(H_l^E\subseteq H_l\) that are invariant under the action of E can be formally expressed as the power series

$$\begin{aligned} h_E(s)=\sum \limits _{l=0}^{\infty }\dim \left( H_l^E\right) s^l, \end{aligned}$$
(16)

which is called Molien series or Hilbert series in the literature, cf. [57, p. 11] or [67, p. 479]. A beautiful and functional way that is often used to compute this series explicitly is given by Molien’s formula

$$\begin{aligned} h_E(s)=\frac{1}{|E|}\sum \limits _{g\in E}\frac{1-s^2}{\det \left( I-sg\right) }, \end{aligned}$$

see [55, 56]. In the physical case \(N=2\), the Molien series is known for all finite subgroups of O(2), as will be discussed in the following:

  • Cyclic groups. The first class of subgroups, \(\mathfrak {S}_n\) for \(n\in \mathbb {N}\), is generated by rotations by an angle of \(2\pi /n\). The corresponding Molien series is given by

    $$\begin{aligned} h_{\mathfrak {S}_n}(s)=\frac{1+s^n}{1-s^n}=(1+s^n)\sum \limits _{l=0}^\infty s^{ln} = 1 +2s^{n}+2s^{2n}+2s^{3n}+\dots , \end{aligned}$$

    see [56, p. 143]. In view of the Hilbert series representation (16) of \(h_{\mathfrak {S}_n}(s)\), this formula proves that the corresponding invariant subspaces are trivial in the sense of (15) precisely if l is not divisible by n. In other words, the projection of a function that is invariant under rotations by an angle of \(2\pi /n\) onto the subspace spanned by the spherical harmonics of degree l has to vanish if l is not divisible by n. Moreover, if non-trivial, \(H_l^{\mathfrak {S}_n}\) has dimension 2, and thus, recalling that for \(N=2\) each of the tensor spaces \(H_l\) with \(l\ge 1\) is two-dimensional, \(N_l=2\), we have \(H_l^{\mathfrak {S}_n} = H_l\).

  • Dihedral groups. The second class of finite subgroups, \(\mathfrak {D}_n\) for \(n\in \mathbb {N}\), is generated by two elements: again a rotation by the angle \(2\pi /n\) and, additionally, a reflection. In this case the Molien series reads as

    $$\begin{aligned} h_{\mathfrak {D}_n}(s)=\frac{1}{1-s^n}=\sum _{l=0}^\infty s^{ln}=1+s^{n}+s^{2n}+s^{3n}+\dots , \end{aligned}$$

    see [39, p. 59]. If a function is invariant under the action of \(\mathfrak {D}_n\) instead of \(\mathfrak {S}_n\), the projection onto \(H_l\) vanishes for the same l as before. This time, however, the nontrivial subspaces are one-dimensional.
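
The two planar series can be cross-checked by evaluating Molien's formula directly. The following Python sketch is an illustration only and is not taken from the cited references; the helper names `series_quotient` and `molien_o2` are hypothetical. It assumes that a rotation g by the angle \(\theta \) satisfies \(\det (I-sg)=1-2s\cos \theta +s^2\), that a reflection satisfies \(\det (I-sg)=1-s^2\), and that \(\mathfrak {D}_n\) consists of n rotations and n reflections.

```python
import math


def series_quotient(numer, denom, order):
    """Power series coefficients of numer(s)/denom(s) up to s**order (denom[0] != 0)."""
    coeffs = []
    for l in range(order + 1):
        acc = numer[l] if l < len(numer) else 0.0
        for j in range(1, min(l, len(denom) - 1) + 1):
            acc -= denom[j] * coeffs[l - j]
        coeffs.append(acc / denom[0])
    return coeffs


def molien_o2(n, with_reflections, order):
    """Dimensions dim H_l^E for l = 0..order via Molien's formula in O(2)."""
    numer = [1.0, 0.0, -1.0]                       # 1 - s^2
    total = [0.0] * (order + 1)
    for k in range(n):                             # rotations by 2*pi*k/n
        c = math.cos(2.0 * math.pi * k / n)
        term = series_quotient(numer, [1.0, -2.0 * c, 1.0], order)
        total = [a + b for a, b in zip(total, term)]
    size = n
    if with_reflections:                           # n reflections, eigenvalues +1 and -1
        term = series_quotient(numer, [1.0, 0.0, -1.0], order)
        total = [a + n * b for a, b in zip(total, term)]
        size = 2 * n
    # The averaged coefficients are integers up to rounding errors.
    return [round(t / size) for t in total]


print(molien_o2(3, False, 9))   # cyclic group, n = 3
print(molien_o2(3, True, 9))    # dihedral group, n = 3
```

For n = 3 the nonzero coefficients appear only at degrees divisible by 3, with value 2 in the cyclic case and 1 in the dihedral case (and 1 at degree zero in both cases), in agreement with the two closed formulas.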

We remark that the zeroth order term \(s^0=1\) in the Molien series does not affect the convergence rates since the mass of the initial datum \(v_0\) is already fixed.

In higher dimensions, classifying the finite subgroups of O(N) becomes more complicated. For \(N=3\), we discuss the subgroups of O(3) that only consist of rotations in more detail. The following results, together with more far-reaching ones, can be found in [56, p. 143].

  • Cyclic groups. The class \(\mathfrak {S}_n\) for \(n\in \mathbb {N}\) is generated by rotations by an angle of \(2\pi /n\) around a fixed axis. The corresponding Molien series is given by

    $$\begin{aligned} h_{\mathfrak {S}_n}(s)= \frac{1}{1-s}\frac{1+s^n}{1-s^n}=\left( 1+s+s^2+s^3+\dots \right) \left( 1 +2s^{n}+2s^{2n}+2s^{3n}+\dots \right) . \end{aligned}$$

    This formula shows that no invariant subspace is ensured to be trivial in this case.

  • Dihedral groups. In three dimensions, the dihedral group \(\mathfrak {D}_n\) is generated by two rotations: A rotation by an angle of \(2\pi /n\) around a fixed axis and a rotation by an angle of \(\pi \) around an axis perpendicular to the first one. The corresponding Molien series is given by

    $$\begin{aligned} h_{\mathfrak {D}_n}(s) = \frac{1}{1-s^2}\frac{1+s^{n+1}}{1-s^n}= \left( 1+s^{n+1}\right) \left( 1+s^2+s^4+\dots \right) \left( 1+s^n+s^{2n}+\dots \right) . \end{aligned}$$

    In this case, the invariant subspace \(H_l^E\) becomes trivial if and only if \(l \ne (n+1)k+nm_1+2m_2\) for all \(k\in \left\{ 0,1\right\} \) and \(m_1,m_2\in \mathbb {N}_0\).

  • Platonic solids. The last class consists of the three rotation groups of the platonic solids. The tetrahedral group \(\mathfrak {T}\) (the rotation group of the tetrahedron) has the Molien series

    $$\begin{aligned} h_{\mathfrak {T}}(s) = \frac{1}{1-s^4}\frac{1+s^{6}}{1-s^3}=\left( 1+s^6\right) \left( 1+s^4+s^8+\dots \right) \left( 1+s^3+s^6+\dots \right) . \end{aligned}$$

    In this case, the invariant subspace \(H_l^E\) becomes trivial if and only if \(l \ne 4m_1+3m_2\) for all \(m_1,m_2\in \mathbb {N}_0\).

    The octahedral group \(\mathfrak {O}\) (the rotation group of the cube or the octahedron) has the Molien series

    $$\begin{aligned} h_{\mathfrak {O}}(s) = \frac{1}{1-s^4}\frac{1+s^{9}}{1-s^6}=\left( 1+s^{9}\right) \left( 1+s^4+s^8+\dots \right) \left( 1+s^6+s^{12}+\dots \right) . \end{aligned}$$

    In this case, the invariant subspace \(H_l^E\) becomes trivial if and only if \(l \ne 9k+4m_1+6m_2\) for all \(k\in \left\{ 0,1\right\} \) and \(m_1,m_2\in \mathbb {N}_0\).

    The icosahedral group \(\mathfrak {I}\) (the rotation group of the dodecahedron or the icosahedron) has the Molien series

    $$\begin{aligned} h_{\mathfrak {I}}(s) = \frac{1}{1-s^{10}}\frac{1+s^{15}}{1-s^6}=\left( 1+s^{15}\right) \left( 1+s^{10}+s^{20}+\dots \right) \left( 1+s^6+s^{12}+\dots \right) . \end{aligned}$$

    In this case, the invariant subspace \(H_l^E\) becomes trivial if and only if \(l \ne 15k+10m_1+6m_2\) for all \(k\in \left\{ 0,1\right\} \) and \(m_1,m_2\in \mathbb {N}_0\).
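
The degrees l with a nontrivial invariant subspace can be read off by expanding the rational functions above. The following Python sketch does this with integer arithmetic; it is an illustration under the assumption that all four three-dimensional series have the common form \((1+s^p)/((1-s^{d_1})(1-s^{d_2}))\) as displayed, and the function names are hypothetical.

```python
def geometric(step, order):
    """Coefficients of 1/(1 - s^step) up to s**order."""
    return [1 if l % step == 0 else 0 for l in range(order + 1)]


def truncated_product(a, b, order):
    """Product of two coefficient lists, truncated after s**order."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out


def rotation_series(p, d1, d2, order):
    """Coefficients of (1 + s^p) / ((1 - s^d1)(1 - s^d2)) up to s**order."""
    numerator = [0] * (order + 1)
    numerator[0] = 1
    if p <= order:
        numerator[p] += 1
    denominator_part = truncated_product(geometric(d1, order), geometric(d2, order), order)
    return truncated_product(numerator, denominator_part, order)


def active_degrees(series):
    """Degrees l >= 1 for which the invariant subspace H_l^E is nontrivial."""
    return [l for l, dim in enumerate(series) if l >= 1 and dim > 0]


ORDER = 16
groups = {
    "dihedral, n = 5": (6, 2, 5),
    "tetrahedral": (6, 4, 3),
    "octahedral": (9, 4, 6),
    "icosahedral": (15, 10, 6),
}
for name, (p, d1, d2) in groups.items():
    print(name, active_degrees(rotation_series(p, d1, d2, ORDER)))
```

In this expansion the lowest nontrivial degree is l = 3 for the tetrahedral group, l = 4 for the octahedral group and l = 6 for the icosahedral group, so the icosahedral symmetry removes the largest initial block of spherical harmonics among the three.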

Regarding the four-dimensional case O(4), extensive results can be found in [55]. In addition, various results regarding the Molien series in general dimensions are available; see, for example, [39].

Remark 2.7

We remark that some of the references given above do not work in exactly the same setting that we consider here. In fact, it is not necessary to decompose the space \(L^2\left( \mathbb {S}^{N-1}\right) \) into eigenspaces of \(\Delta _{\mathbb {S}^{N-1}}\). Instead, one could also decompose it into spaces of homogeneous polynomials of fixed degree,

$$\begin{aligned} L^2\left( \mathbb {S}^{N-1}\right) = \bigoplus \limits _{l\in \mathbb {N}_0}P_l, \end{aligned}$$

where \(P_l\) is the space of homogeneous polynomials of degree l. Given a finite subgroup E of O(N), we similarly denote by \(P_l^E\) the subspace of \(P_l\) consisting of all functions that are invariant under the action of all elements of E. Let

$$\begin{aligned} p_E(s) = \sum _{l=0}^\infty \dim \left( P_l^E\right) s^l \end{aligned}$$

be the corresponding generating function. In this situation Molien’s formula has to be adapted, namely

$$\begin{aligned} p_E(s)= \frac{1}{|E|}\sum \limits _{g\in E}\frac{1}{\det \left( I-sg\right) }, \end{aligned}$$

see, for example, [57, p. 13]. We obtain \(h_E(s)=(1-s^2)p_E(s)\), which enables us to transfer results to the given setting.
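
The relation \(h_E(s)=(1-s^2)p_E(s)\) can be made concrete for the trivial group \(E=\{I\}\), for which \(p_E(s)=(1-s)^{-N}\) and \(\dim P_l=\binom{l+N-1}{N-1}\). The short Python sketch below (with hypothetical function names) only illustrates this special case and computes the resulting dimensions of \(H_l\).

```python
from math import comb


def dim_polynomials(l, n_dim):
    """Dimension of the homogeneous polynomials of degree l in n_dim variables."""
    return comb(l + n_dim - 1, n_dim - 1) if l >= 0 else 0


def dim_harmonics(l, n_dim):
    """Coefficient of s^l in (1 - s^2) p_E(s) for the trivial group E = {I}."""
    return dim_polynomials(l, n_dim) - dim_polynomials(l - 2, n_dim)


for n_dim in (2, 3, 4):
    print(n_dim, [dim_harmonics(l, n_dim) for l in range(6)])
```

For N = 2 this reproduces the value \(N_l=2\) for all \(l\ge 1\) that was used in the discussion of the cyclic groups.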

3 New Variables and Main Results

As announced earlier, one of the main analytical challenges in deriving fine large time asymptotics for the (confined) thin film equation is the free moving boundary. Following Koch [47], we perform a von Mises-type change of dependent and independent variables, which brings the equation into a setting in which solutions depend differentiably on the initial datum [65]. The transformation applies when the solution is Lipschitz close to the stationary solution in the sense of (9), cf. [64, 65]. The underlying geometric procedure is the following, which is also illustrated in Figure 3.

Fig. 3 The change of variables from \((x,v(x))\) to \((z,w(z))\)

The stationary \((4 v_*)^{1/4}\) describes a hemisphere over the N-dimensional unit ball \(B=B_1(0)\). We orthogonally project each point \((x,(4 v(x))^{1/4})\) of the graph of \((4 v)^{1/4}\) onto the closest point \((z,(4 v_*(z))^{1/4})\) on the hemisphere and denote by w(z) the (minimal) distance. Analytically this amounts to the choice

$$\begin{aligned} z=\frac{x}{\sqrt{2( v(t,x))^{1/2}+|x|^2}} \end{aligned}$$
(17)

for the new independent variable, and we see that \(x=z\) precisely if v is the stationary solution (7). The formula for the dependent variables reads as

$$\begin{aligned} 1+w(t,z)= \sqrt{2( v(t,x))^{1/2}+|x|^2}, \end{aligned}$$
(18)

and thus w vanishes if v is \(v_*\). We will accordingly refer to w as the perturbation.
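
The formulas (17) and (18) are easy to evaluate pointwise. The following Python sketch is an illustration with hypothetical names: it applies the transformation to the test profile \(v_c(x)=\frac{1}{4}\left( (1+c)^2-|x|^2\right) _+^2\), which is our own choice and reduces to a profile with \((4v)^{1/4}\) describing the unit hemisphere for \(c=0\). For this family the perturbation comes out as the constant \(w\equiv c\) and the new independent variable as \(z=x/(1+c)\).

```python
import math


def to_perturbation(x, v_value):
    """Apply (17) and (18) at the point x (list of coordinates) with height v(x)."""
    radius = math.sqrt(2.0 * math.sqrt(v_value) + sum(xi * xi for xi in x))
    z = [xi / radius for xi in x]
    w = radius - 1.0
    return z, w


def dilated_profile(x, c):
    """Hypothetical test profile v_c(x) = ((1 + c)^2 - |x|^2)_+^2 / 4."""
    gap = (1.0 + c) ** 2 - sum(xi * xi for xi in x)
    return 0.25 * max(gap, 0.0) ** 2


c = 0.05
for x in ([0.0, 0.0], [0.3, -0.4], [0.9, 0.2]):
    z, w = to_perturbation(x, dilated_profile(x, c))
    print(x, z, w)
```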

The transformation is also applicable in situations in which v and \(v_*\) do not have the same mass. This observation is reflected by the fact that \(\mu _{0,0}=0\) occurs in the spectrum of the linear operator, see Theorem 2.2. We will not eliminate this eigenvalue on the level of the perturbation, but only for the original variables through the mass constraint (8). For the general theory that we perform in terms of the perturbation, any constant solution \(w\equiv {{\,\textrm{const}\,}}\) is admissible and corresponds to a Smyth–Hill solution (2) of arbitrary mass M.

The derivation of an evolution equation for the new variable w is lengthy and tedious. It has been described in detail already in [65], using the sloppy \(\star \) notation, see (20) below. For our purposes it is necessary to rederive the transformed equation in a way that carries more structure than the formulation chosen in [65]. We postpone these computations to the appendix and state here our findings only. The perturbation equation for the w variables is

$$\begin{aligned} \partial _tw+\mathcal {L}^2w+N\mathcal {L}w = \frac{1}{\rho } \nabla \cdot \left( \rho ^2F[w]\right) +\rho F[w ] \quad \text { on } \left( 0,\infty \right) \times B_1(0), \end{aligned}$$
(19)

where \(\rho (z) = \frac{1}{2}(1-|z|^2)\) is a weight function degenerating at the boundary, \(\mathcal {L}w=-\rho ^{-1}\nabla \cdot \left( \rho ^2\nabla w\right) = -\rho \Delta w +2z\cdot \nabla w\) is the building block of the thin film linear operator and

$$\begin{aligned} F[w] = p\star R[w]\star \left( \rho \nabla ^3w\star \nabla w+ \rho (\nabla ^2w)^{2\star } + \nabla ^2w\star \nabla w+ (\nabla w)^{2\star }\right) \end{aligned}$$
(20)

is the nonlinearity. The star product \(a\star b\) denotes an arbitrary linear combination of entries of the tensors a and b, and thus, in particular, the above F[w] defines a class of nonlinearities and both representatives in (19) may be different from each other. We write \(a^{k\star } = a \star \cdots \star a\), where the \(\star \)-product has k factors. Moreover, p is a polynomial tensor in z, which might have zero entries. The rational factors R[w] are tensors of the form

$$\begin{aligned} R[w] = \frac{(\nabla w)^{k\star }}{(1+w+z\cdot \nabla w)^l}, \end{aligned}$$

for some \(k\in \mathbb {N}_0\) and \(l\in \mathbb {N}\). Finally, the distributive property respects only the tensor class, for example \(p\star (a+b) = p\star a +{\tilde{p}}\star b\) with two possibly different polynomial tensors p and \({\tilde{p}}\). This shortened \(\star \) notation is suitable in the present work because the exact form of the nonlinearity is not important for our analysis. We finally recall from our introduction that the linear operator \(\mathcal {L}\) also occurs in the context of the porous medium equation (4) with \(m=\frac{3}{2}\), and was analyzed, for instance, in [63, 64]. It is readily checked that \(\mathcal {L}\) is symmetric (and, in fact, self-adjoint [63]) with respect to the inner product

$$\begin{aligned} \langle w,{\tilde{w}}\rangle = \int _{B_1(0)} w{\tilde{w}}\, \rho \text {d}z, \end{aligned}$$
(21)

which induces a Hilbert space with norm \(\Vert \cdot \Vert \) in the obvious way.
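
In one space dimension the action of \(\mathcal {L}\) on polynomials can be checked by hand or with a few lines of exact arithmetic. The Python sketch below is an illustration for \(N=1\) only, with hypothetical helper names; polynomials are stored as coefficient lists in increasing degree. It verifies that z and \(z^2-\frac{1}{5}\) are eigenfunctions of \(\mathcal {L}\) with eigenvalues 2 and 5, that \(\mathcal {L}\) is symmetric with respect to (21) on two sample polynomials, and it evaluates the corresponding eigenvalues \(\lambda ^2+N\lambda \) of \(\mathcal {L}^2+N\mathcal {L}\).

```python
from fractions import Fraction

RHO = [Fraction(1, 2), Fraction(0), Fraction(-1, 2)]   # rho(z) = (1 - z^2)/2


def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(a, b):
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])


def mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def scale(c, p):
    return trim([c * x for x in p])


def diff(p):
    return trim([k * p[k] for k in range(1, len(p))] or [Fraction(0)])


def apply_L(p):
    """L w = -rho w'' + 2 z w' for N = 1."""
    return add(scale(-1, mul(RHO, diff(diff(p)))),
               mul([Fraction(0), Fraction(2)], diff(p)))


def inner(p, q):
    """Inner product (21) on (-1, 1): integral of p q rho dz."""
    r = mul(mul(p, q), RHO)
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(r) if k % 2 == 0)


N = 1
one = [Fraction(1)]
psi_a = [Fraction(0), Fraction(1)]                     # z
psi_b = [Fraction(-1, 5), Fraction(0), Fraction(1)]    # z^2 - 1/5

for psi, lam in ((psi_a, 2), (psi_b, 5)):
    assert apply_L(psi) == scale(lam, psi)
    print("lambda =", lam, " lambda^2 + N*lambda =", lam * lam + N * lam)

print("orthogonality:", inner(one, psi_b), inner(psi_a, psi_b))
```

The value 30 obtained for \(z^2-\frac{1}{5}\) agrees with \(\mu _{0,1}=30\) quoted for \(N=1\) in the proof of Corollary 2.4.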

The perturbation equation (19) is well-posed for small Lipschitz initial data \(w_0\),

$$\begin{aligned} \Vert w_0\Vert _{W^{1,\infty }} \ll 1, \end{aligned}$$
(22)

as was proved in [65]. We will recall the precise statement in Theorem 5.1 below. The above smallness condition is equivalent to (9) under the change of variables.

It follows from the statement of Theorem 2.2 that the order of the eigenvalues \(\mu _{l,k}\) depends on the space dimension N. For us, it only plays a role when we want to determine conditions on the initial datum \(v_0\) that lead to improvements in the convergence rates for the confined thin film equation, see Corollaries 2.3, 2.4, and 2.5 presented above. On the level of the perturbation equation, it is more convenient to rename the eigenvalues \(\left\{ \mu _k\right\} _{k\in \mathbb {N}_0}\) and order them in a strictly increasing way, that is \(\mu _k<\mu _{k+1}\). Correspondingly, we denote by \(\psi _{k,n}\) all eigenfunctions corresponding to \(\mu _k\) for \(n\in \left\{ 1,\dots ,\tilde{N}_k\right\} \). We note that the multiplicity of \(\mu _k\) may change due to intersections between the eigenvalues, see Figure 1. We mostly stick to this notation for the remaining work.

All announced asymptotic results for solutions v to the confined thin film equation will be derived from the following theorem that fully describes the higher order asymptotics of the perturbation equation. It is one of the two main results of the present work; its proof can be found in Section 8.

Theorem 3.1

For any fixed \(K\in \mathbb {N}_0\), there exists an \(\varepsilon _0>0\) with the following properties: Let w be a solution to (19) with initial datum \(w_0\) satisfying \(\Vert w_0\Vert _{W^{1,\infty }}\le \varepsilon _0\). Then, under the assumption

$$\begin{aligned} \lim \limits _{t\rightarrow \infty }e^{\mu _kt}\langle \psi _{k,n},w(t)\rangle = 0 \quad \text { for all } k\in \left\{ 0,\dots ,K\right\} \text { and } n\in \left\{ 1,\dots ,\tilde{N}_k\right\} , \end{aligned}$$
(23)

it holds that

$$\begin{aligned} \left\| w(t)\right\| _{W^{1,\infty } }\lesssim e^{-\mu _{K+1}t} \text { for all } t\ge 0. \end{aligned}$$

To clarify the meaning of this theorem, we first consider the case \(K=0\). The smallest eigenvalue \(\mu _K= \mu _0= 0\) corresponds to the constant eigenfunction 1, and thus, condition (23) turns into the requirement

$$\begin{aligned} \lim \limits _{t\rightarrow \infty } \int _{B_1(0)} w(t,z)\rho (z)\text {d}z=0. \end{aligned}$$
(24)

As we will see in the proof of Corollary 2.3, the latter is equivalent to the mass constraint (8) for the v variable. By imposing a condition on the solution’s mass, we rule out \(\mu _0=0\) as a relevant eigenvalue for the evolution, or, in other words, the corresponding mode is inactive. It follows that the leading order asymptotics are dominated by the next eigenvalue in order, \(\mu _1\), in the sense that it determines the rate of convergence and governs the evolution towards the stationary \(v_*\).

The theorem states that this procedure can be iterated. Because the mappings \(\langle \psi _{k,n},\cdot \rangle \) act as projections onto the respective eigenspaces, condition (23) ensures that the modes corresponding to \(\mu _0,\dots ,\mu _K\) (with their multiplicities) are inactive during the evolution, that is, these modes do not affect the long-time behavior anymore. We can thus improve the rate of convergence and the theorem shows that the leading order asymptotics are then governed by the smallest active mode. In the proofs of Corollaries 2.4 and 2.5 we identify symmetry conditions for solutions to the thin film equation which ensure the decay (23) for the perturbation equation.

The proof for the higher-order asymptotics of the perturbation variable w in Theorem 3.1 is based on the construction of invariant manifolds, which are localized around the stationary solution \(w\equiv 0\). This is our second main result, which is of independent interest. To state it properly, we have to introduce some further notation.

First, we denote by \(S^t(g)\) the flow generated by the perturbation equation, that is \(S^t(g)=w(t,\cdot )\) where \(w(t,z)\) solves the perturbation equation with initial datum g. We consider the Hilbert space H that is induced by the inner product

$$\begin{aligned} \langle v,w\rangle _H = \langle v,w\rangle + \langle \mathcal {L}v,w\rangle =\langle v,w\rangle + \langle v,\mathcal {L}w\rangle = \langle v,w\rangle +\langle \sqrt{\rho }\nabla v,\sqrt{\rho } \nabla w \rangle \end{aligned}$$

and the norm

$$\begin{aligned} \left\| w\right\| ^2_H = \left\| w\right\| ^2+ \Vert \mathcal {L}^{1/2}w\Vert ^2= \left\| w\right\| ^2+\Vert \sqrt{\rho }\nabla w\Vert ^2, \end{aligned}$$

where \(\Vert \cdot \Vert \) was defined via (21). It is equivalent to a scale invariant Hilbert space norm,

$$\begin{aligned} \Vert w\Vert _H^2 \sim \left\| w\right\| ^2_{L^2}+\left\| \rho \nabla w\right\| ^2_{L^2}, \end{aligned}$$
(25)

as can be seen with the help of Hardy’s inequality, cf. Lemma B.2 in the appendix. Furthermore, \(E_c\) is the eigenspace spanned by the eigenfunctions \(\psi _{k,n}\) for \(k\le K\) and \(n\in \left\{ 1,\dots ,\tilde{N}_k\right\} \) with \(K\in \mathbb {N}_0\) fixed and \(E_s\) denotes its orthogonal complement in H, such that \(H=E_c \oplus E_s\). In the following theorem, \(E_c\) and \(E_s\) are the center and stable eigenspaces, respectively. We finally have to refine the analysis from [65] by considering

$$\begin{aligned} \Vert w\Vert _{W} = \Vert w\Vert _{L^{\infty }} + \Vert \nabla w\Vert _{L^{\infty }} + \Vert \rho \nabla ^2w \Vert _{L^{\infty }} + \Vert \rho ^2\nabla ^3 w\Vert _{L^{\infty }}, \end{aligned}$$
(26)

instead of the Lipschitz norm only. The necessity of considering (scale-invariant) higher-order norms is a crucial observation in our definition and analysis of the truncated equation (45). We will comment on this further in Section 6.

Theorem 3.2

For any fixed \(K\in \mathbb {N}_0\) and \(\mu \in \left( \mu _K,\mu _{K+1}\right) \), there exist two constants \(\varepsilon>\varepsilon _0>0\) (with \(\varepsilon _0\) possibly smaller than in Theorem 3.1), and a Lipschitz continuous mapping \(\theta _\varepsilon :E_c\rightarrow E_s\) that is differentiable at zero with \(\theta _\varepsilon (0)=0\) and \(D\theta _\varepsilon (0)=0\) such that \(W_{loc}^c \) given by

$$\begin{aligned} W_{loc}^c = \left\{ g\in H: g=g_c+\theta _\varepsilon \left( g_c\right) , g_c\in E_c, \Vert g\Vert _{H}\le \varepsilon \right\} \end{aligned}$$

has the following properties:

  1. For every \(g\in W_{loc}^c\) with \(\Vert g\Vert _{H}\le \varepsilon _0\) it holds that \(S^t(g)\in W_{loc}^c\) for all \(t\ge 0\).

  2. For every \(g\in H\) with \(\Vert g\Vert _{W}\le \varepsilon _0\) there exists a unique \(\tilde{g} \in W_{loc}^c\) such that

    $$\begin{aligned} \left\| S^t\left( g\right) -S^t\left( \tilde{g}\right) \right\| _{W} \lesssim e^{-\mu t} \end{aligned}$$

    for every \(t\ge 1\).

The first property simply states that the local center manifold \(W^c_{loc}\) is locally invariant under the nonlinear evolution (19). From the properties of \(\theta _{\varepsilon }\) we infer that this manifold touches the center eigenspace \(E_c\) tangentially at the origin. The second property provides a finite-dimensional approximation at a given rate by solutions in \(W_{loc}^c\) for any given solution with sufficiently small initial datum. It is this feature that we exploit in order to derive fine large time asymptotics for the thin film equation.

The invariant manifold theorem is interesting on its own as it provides a nonlinear finite-dimensional object which solutions approximate at a given rate in the large time limit. In other words, once a rate of convergence is determined, any sufficiently small solution belonging to an infinite-dimensional function space can be approximated with the prescribed rate by a solution on a finite-dimensional manifold. As outlined in the introduction, similar results have been derived earlier. What is particularly challenging here is the delicate degenerate parabolicity of the fourth-order equation (19) modeling a free boundary problem whose mathematical understanding is still poor.

The construction of the invariant manifolds will be done in Section 7, and will be carried out for a truncated version of the perturbation equation first. In fact, our analysis provides even more information, which we omit here because it is not relevant for the large time asymptotics. For instance, we will show that the finite-dimensional approximation emerges from a foliation of the Hilbert space H over a global invariant manifold.

4 From Invariant Manifolds to Higher Order Asymptotics

The goal in this section is the derivation of the main results for the thin film equation stated in Corollaries 2.3, 2.4 and 2.5 from Theorem 3.1 on the mode-by-mode asymptotics for the perturbation equation.

We start by noting that the transformations (17) and (18) yield that

$$\begin{aligned} v(x) = \rho (\Phi (x))^2\left( 1+w(\Phi (x))\right) ^4, \end{aligned}$$
(27)

where \(\Phi (x)=z\) is the diffeomorphism introduced in (17).

In our proof of the leading order asymptotics, we apply Theorems 3.1 and 3.2 with \(K=0\).

Proof of Corollary 2.3

In a first step, we have to ensure that the mass constraint (8) implies the vanishing mean condition (24), which is the \(K=0\) version of (23). We start by rewriting (8) with the help of the change of variables formula (27) and the expression for the Jacobian determinant (68) in the appendix,

$$\begin{aligned} \begin{aligned} \int _{\mathbb {R}^N} v_*(x)\, \text {d}x&= \int _{\mathbb {R}^N} v(t,x)\, \text {d}x\\&= \int \limits _{B_1(0)} \rho (z)^2 (1+w(t,z))^{N+3}\left( 1+w(t,z)+z\cdot \nabla w(t,z)\right) \, \text {d}z. \end{aligned} \end{aligned}$$
(28)

The term on the right-hand side can be simplified via an integration by parts,

$$\begin{aligned}&\int \limits _{B_1(0)} \rho ^2 (1+w)^{N+3}\left( 1+w+z\cdot \nabla w\right) \text {d}z\\&\quad = \int \limits _{B_1(0)} \rho ^2 (1+w)^{N+4}\text {d}z + \frac{1}{N+4}\int \limits _{B_1(0)} \rho ^2 z\cdot \nabla \left( 1+w\right) ^{N+4}\text {d}z \\&\quad =\frac{1}{N+4}\int \limits _{B_1(0)}\rho \left( 1+w\right) ^{N+4}\left( \left( N+4\right) \rho -N\rho +2|z|^2\right) \text {d}z\\&\quad = \frac{2}{N+4}\int \limits _{B_1(0)} \left( 1+w\right) ^{N+4}\rho \, \text {d}z, \end{aligned}$$

where we have used that \(4\rho (z) +2|z|^2=2\) in the last identity. In particular, as \(v_*\) is mapped onto \(w_*=0\) under the change of variables, the latter identity entails that

$$\begin{aligned} \int _{\mathbb {R}^N} v_*\, \text {d}x = \frac{2}{N+4}\int \limits _{B_1(0)} \rho \, \text {d}z\quad \left( = \frac{2|B_1(0)|}{(N+2)(N+4)}\right) , \end{aligned}$$

which can also be verified via an elementary computation. Hence, we may cancel this term on both sides of (28) to obtain

$$\begin{aligned} \frac{2}{N+4}\int \limits _{B_1(0)}\left( \left( 1+w(t,z)\right) ^{N+4}-1\right) \rho (z) \text {d}z=0 \quad \text { for all } t\ge 0. \end{aligned}$$
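
The chain of identities above can be tested numerically in one space dimension. The Python sketch below is an illustration only: the sample perturbation \(w(z)=0.1\cos (3z)\), the midpoint rule and all function names are our own choices. It compares the right-hand side of (28) with the reduced expression \(\frac{2}{N+4}\int \rho (1+w)^{N+4}\,\text {d}z\) and evaluates the latter at \(w=0\), where for \(N=1\) the formula \(\frac{2|B_1(0)|}{(N+2)(N+4)}\) gives \(\frac{4}{15}\).

```python
import math

N = 1  # space dimension of this illustration


def rho(z):
    return 0.5 * (1.0 - z * z)


def midpoint(f, cells=20000):
    """Midpoint rule on the interval (-1, 1)."""
    h = 2.0 / cells
    return h * sum(f(-1.0 + (i + 0.5) * h) for i in range(cells))


def mass_in_w(w, dw):
    """Right-hand side of (28) for N = 1; dw is the derivative of w."""
    return midpoint(lambda z: rho(z) ** 2 * (1.0 + w(z)) ** (N + 3)
                    * (1.0 + w(z) + z * dw(z)))


def reduced_mass(w):
    """The expression obtained after the integration by parts."""
    return 2.0 / (N + 4) * midpoint(lambda z: rho(z) * (1.0 + w(z)) ** (N + 4))


def sample_w(z):
    return 0.1 * math.cos(3.0 * z)


def sample_dw(z):
    return -0.3 * math.sin(3.0 * z)


print(mass_in_w(sample_w, sample_dw), reduced_mass(sample_w))
print(reduced_mass(lambda z: 0.0), 4.0 / 15.0)
```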

Now we notice that any solution to the perturbation equation w(t) converges to leading order to a constant \(a\in \mathbb {R}\). Indeed, if \(K=0\), the local center manifold constructed in Theorem 3.2 is simply a ball \(B_{\varepsilon }(0)\) in \(\mathbb {R}\). (We comment on this simple fact briefly in the proof of Theorem 8.1.) Hence passing to the large time limit, the previous identity translates into \( (1+a)^{N+4}-1=0\), where \(|a|\le \varepsilon \) and thus \(a=0\). This proves (24).

Applying now Theorem 3.1 gives the decay estimate

$$\begin{aligned} \left\| w(t)\right\| _{W^{1,\infty }}\lesssim e^{-\mu _{1,0}t}. \end{aligned}$$

Using the transformation formulas (17) and (18), we see that

$$\begin{aligned} w(t,\Phi (x)) + \frac{1}{2}w(t,\Phi (x))^2 = \sqrt{v(t,x)} - V_*(x) \end{aligned}$$

for any \(x\in {{\,\textrm{supp}\,}}v(t)\), and that the quadratic term on the left-hand side is of higher order because w(t) is small. Therefore, the decay estimate for w(t) implies the first part of the statement

$$\begin{aligned} \Vert \sqrt{v(t)}-V_*\Vert _{L^{\infty }({{\,\textrm{supp}\,}}v(t))} \lesssim e^{-\mu _{1,0}t}. \end{aligned}$$

Lastly, we turn to the decay of the first derivatives. With the help of (4) we derive \( \partial _i \left( \sqrt{v(x)}-V_*\right) = \left( 1+w\right) \nabla w \cdot \partial _i\Phi \). Recalling the transformation formulas (17) and (18), we compute

$$\begin{aligned}\partial _i\Phi (x) = \frac{e_i}{1+w} - \Phi (x)\frac{\partial _i\left( \sqrt{v(t)}-V_*\right) }{(1+w)^2}\end{aligned}$$

and thus obtain

$$\begin{aligned} \nabla \left( \sqrt{v(t,x)}-V_*(x)\right) = \frac{1+w(t,\Phi (x))}{1+w(t,\Phi (x))+\Phi (x)\cdot \nabla w(t,\Phi (x))}\nabla w(t,\Phi (x)) \end{aligned}$$

for all \(x\in {{\,\textrm{supp}\,}}v(t)\). Having this identity at hand, the decay estimate for w(t) in \(W^{1,\infty }\) directly yields the second part of the statement,

$$\begin{aligned} \left\| \nabla \left( \sqrt{v(t)}-V_*\right) \right\| _{L^\infty ({{\,\textrm{supp}\,}}v(t))}\lesssim e^{-\mu _{1,0}t}. \end{aligned}$$

\(\square \)

The proofs of Corollaries 2.4 and 2.5 will build on the fact that the \(L^2\)-projections of solutions v(t) of the confined thin film equation onto eigenspaces generated by certain eigenfunctions \(\psi _{l,n,k}\) vanish for all times if they vanish initially. The next lemma illustrates how exactly this condition can be translated to the perturbation equation.

Lemma 4.1

Let \(\psi _{l,n,k}\) be an eigenfunction of the linear operator \(\mathcal {L}^2+N\mathcal {L}\) as given in Theorem 2.2. Then it holds that

$$\begin{aligned} \int _{\mathbb {R}^N} \left( v(x)-v_*(x)\right) \psi _{l,n,k}(x)\, \text {d}x = 2\int _{B_1(0)} w(z)\psi _{l,n,k}(z)\rho (z)\, \text {d}z + \mathcal {O}\left( \Vert w\Vert ^2_{L^{\infty }}\right) , \end{aligned}$$

provided that w is small in the sense of (22).

This lemma, in particular, entails that

$$\begin{aligned} \left| \langle w, \psi _{l,n,k}\rangle \right| \lesssim \left\| w\right\| _{L^\infty }^2, \end{aligned}$$

provided that \(\int _{\mathbb {R}^N} v \psi _{l,n,k}\text {d}x =\int _{\mathbb {R}^N} v_* \psi _{l,n,k}\text {d}x\). We will exploit this observation in the sequel.
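
The quadratic size of the remainder in Lemma 4.1 can be observed numerically. The following Python sketch is an illustration for \(N=1\), \(l=1\), \(k=0\), so that the eigenfunction is \(\psi (z)=z\); the one-parameter family \(w_\varepsilon (z)=\varepsilon (z+z^2)\), the midpoint rule and the function names are our own choices. The moment of v is evaluated in the z variables, using \(v=\rho ^2(1+w)^4\), \(x=(1+w(z))z\) and \(\text {d}x=(1+w+zw')\,\text {d}z\) in one dimension.

```python
def rho(z):
    return 0.5 * (1.0 - z * z)


def midpoint(f, cells=20000):
    """Midpoint rule on the interval (-1, 1)."""
    h = 2.0 / cells
    return h * sum(f(-1.0 + (i + 0.5) * h) for i in range(cells))


N = 1


def psi(z):
    return z   # harmonic polynomial of degree one in 1D


def moment_of_v(eps):
    """Integral of v * psi over the real line, pulled back to (-1, 1)."""
    w = lambda z: eps * (z + z * z)
    dw = lambda z: eps * (1.0 + 2.0 * z)
    # psi(x) = (1 + w) psi(z), v = rho^2 (1 + w)^4, dx = (1 + w + z w') dz
    return midpoint(lambda z: rho(z) ** 2 * (1.0 + w(z)) ** 5 * psi(z)
                    * (1.0 + w(z) + z * dw(z)))


def linear_part(eps):
    """Leading term 2 * <w, psi> with the weighted inner product (21)."""
    w = lambda z: eps * (z + z * z)
    return 2.0 * midpoint(lambda z: w(z) * psi(z) * rho(z))


def remainder(eps):
    return moment_of_v(eps) - moment_of_v(0.0) - linear_part(eps)


for eps in (0.04, 0.02, 0.01):
    print(eps, remainder(eps) / eps ** 2)
```

The printed ratios stay bounded as \(\varepsilon \) decreases, which is the behavior predicted by the \(\mathcal {O}\left( \Vert w\Vert ^2_{L^{\infty }}\right) \) term.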

Proof

Theorem 2.2 shows that every eigenfunction \(\psi _{l,n,k}(x)\) is given as a product of a polynomial in \(|x|^2\) and a homogeneous harmonic polynomial of degree l, that is,

$$\begin{aligned} \psi _{l,n,k}(x) = \sum \limits _{j=0}^{k} c(l,k,j)|x|^{2j}\psi _{l}(x), \end{aligned}$$

where \(\psi _{l}\) denotes an arbitrary homogeneous harmonic polynomial of degree l and \(c(l,k,j)\) a real-valued coefficient. Due to this structure of the eigenfunctions, the problem boils down to proving that

$$\begin{aligned} \int _{\mathbb {R}^N} \left( v(x)-v_*(x)\right) \psi _l(x)|x|^{2j}\text {d}x = 2\int _{B_1(0)} w(z)\psi _l(z)|z|^{2j}\rho (z)\text {d}z + \mathcal {O}\left( \Vert w\Vert ^2_{L^{\infty }}\right) \end{aligned}$$
(29)

for any integer \(j\le k\).

To address (29), we first notice that by our choice of the perturbation variables (17) and (18), it holds that \(\psi _l(x) = (1+w(z))^l\psi _l(z)\) and \(|x| = (1+w(z))|z|\). Therefore, we find with the help of the transformation identities (27) and (68) that

$$\begin{aligned} \int _{\mathbb {R}^N} v\psi _l |x|^{2j}\text {d}x&= \int _{B_1(0)} \rho ^2(1+w)^{3+l+2j+N} \psi _l(z) |z|^{2j}\left( 1+w+z\cdot \nabla w \right) \,\text {d}z\\&=\int _{B_1(0)} \rho ^2(1+w)^{4+l+2j+N}\psi _l(z)|z|^{2j}\text {d}z\\&\quad + \int _{B_1(0)} \rho ^2(1+w)^{3+l+2j+N} \psi _l(z)|z|^{2j}z\cdot \nabla w\, \text {d}z. \end{aligned}$$

In the last term on the right-hand side, we integrate by parts and find after a short computation that

$$\begin{aligned}&\int _{B_1(0)} \rho ^2 (1+w)^{3+l+2j+N} \psi _l(z)|z|^{2j}z\cdot \nabla w\, \text {d}z\\&\quad = -\int _{B_1(0)} \rho ^2 (1+w)^{4+l+2j+N} \psi _l(z)|z|^{2j}\, \text {d}z \\&\qquad + \frac{2}{4+l+2j+N}\int _{B_1(0)} \rho (1+w)^{4+l+2j+N} \psi _l(z)|z|^{2j}\, \text {d}z, \end{aligned}$$

where we have used the identities \(z\cdot \nabla \psi _l = l\psi _l\), which holds true because \(\psi _{l}\) is a homogeneous polynomial of degree l, and \(2\rho +|z|^2=1\). It follows that

$$\begin{aligned} \int _{\mathbb {R}^N} v\psi _l|x|^{2j}\, \text {d}x = \frac{2}{4+l+2j+N}\int _{B_1(0)} \rho (1+w)^{4+l+2j+N} \psi _l(z)|z|^{2j}\, \text {d}z. \end{aligned}$$

Next, we take into account the identity \((1+w)^m=1+mw+\mathcal {O}\left( \Vert w\Vert ^2_{L^\infty }\right) \), which holds for \(m\in \mathbb {N}\) and \(\Vert w\Vert _{L^{\infty }}\) small by Taylor expansion, and derive

$$\begin{aligned}&\int _{\mathbb {R}^N} v\psi _l|x|^{2j}\text {d}x=2 \int _{B_1(0)} \rho w \psi _l|z|^{2j}\text {d}z \\&+\frac{2}{4+l+2j+N}\int _{B_1(0)} \rho \psi _l|z|^{2j} \text {d}z + \mathcal {O}\left( \Vert w\Vert _{L^{\infty }}^2\right) . \end{aligned}$$

It remains to show that

$$\begin{aligned}&\frac{2}{4+l+2j+N}\int _{B_1(0)} \rho \psi _l|z|^{2j}\text {d}z \\&\quad = \int _{\mathbb {R}^N} v_*\psi _l|x|^{2j}\text {d}x= \frac{1}{2}\int _{B_1(0)} \psi _l\left( 1-|x|^2\right) |x|^{2j}\rho \, \text {d}x, \end{aligned}$$

where the last identity uses \(v_*=\rho ^2\), cf. (27) with \(w=0\). In the case \(l\ge 1\) both terms vanish thanks to the orthogonality of the eigenfunctions with respect to the inner product introduced in (21). Indeed, the harmonic polynomial \(\psi _l\) can be written as a linear combination of the eigenfunctions \(\psi _{l,n,0}\) with \(n\in \{1,\dots ,N_l\}\), while the radial weights \(|z|^{2j}\) and \( \left( 1-|x|^2\right) |x|^{2j}\) lie in the spaces \({{\,\textrm{span}\,}}\left\{ \psi _{0,1,i}:\,i\le j\right\} \) and \({{\,\textrm{span}\,}}\left\{ \psi _{0,1,i}:\,i\le j+1\right\} \), respectively. For \(l=0\), it holds that \(\psi _0=1\) and the claim follows via an elementary computation. This establishes (29) and thus the proof is finished. \(\square \)
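
The elementary computation for \(l=0\) amounts to a comparison of radial moments: after passing to polar coordinates and dropping the common factor \(|\mathbb {S}^{N-1}|\), the identity \(\frac{2}{4+2j+N}\int \rho |z|^{2j}\text {d}z=\int v_*|x|^{2j}\text {d}x\) with \(v_*=\rho ^2\) becomes a statement about \(\int _0^1\rho ^p r^{N+2j-1}\,\text {d}r\) for \(p\in \{1,2\}\). The following Python sketch (hypothetical helper name, exact rational arithmetic) checks it for a range of N and j.

```python
from fractions import Fraction


def radial_moment(p, a):
    """Integral of rho(r)^p r^(a-1) over (0, 1) for rho = (1 - r^2)/2, p in {1, 2}."""
    if p == 1:
        return Fraction(1, 2) * (Fraction(1, a) - Fraction(1, a + 2))
    if p == 2:
        return Fraction(1, 4) * (Fraction(1, a) - Fraction(2, a + 2) + Fraction(1, a + 4))
    raise ValueError("only p = 1 and p = 2 are implemented")


for n_dim in range(1, 6):
    for j in range(0, 4):
        a = n_dim + 2 * j                       # exponent from |z|^{2j} r^{N-1}
        left = Fraction(2, a + 4) * radial_moment(1, a)
        right = radial_moment(2, a)
        assert left == right == Fraction(2, a * (a + 2) * (a + 4))

print(radial_moment(2, 1))   # N = 1, j = 0: half of the mass of v_* on (-1, 1)
```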

With the help of the previous lemma, the proof of Corollary 2.4 reduces to an easy combination of the already established results.

Proof of Corollary 2.4

As the solution of the confined thin film equation remains centered at the origin provided its initial data is, cf. (12), we can make use of Lemma 4.1 with \(\psi _{l,n,k} = \psi _{1,n,0}\) to obtain

$$\begin{aligned} 0 = \int _{\mathbb {R}^N} x_iv(t,x)\text {d}x =2 \int _{B_1(0)} z_i w(t,z)\rho \text {d}z +\mathcal {O}\left( \Vert w\Vert ^2_{L^{\infty }}\right) \end{aligned}$$

for every \(i \in \left\{ 1,\dots ,N\right\} \). In the proof of Corollary 2.3 we already established convergence rates for w, namely \(\Vert w\Vert _{L^{\infty }}\lesssim e^{-\mu _{1,0}t}\). This directly yields

$$\begin{aligned} \lim \limits _{t\rightarrow \infty }e^{\mu _{1,0}t} \int _{B_1(0)}zw(t,z)\rho \text {d}z =0, \end{aligned}$$

which makes Theorem 3.1 applicable. We therefore obtain

$$\begin{aligned} \Vert w(t)\Vert _{W^{1,\infty }}\lesssim e^{-\mu t}, \end{aligned}$$

where \(\mu \) is the next eigenvalue in line, which is \(\mu = \mu _{0,1} =30\) if \(N=1\) and \(\mu = \mu _{2,0} = 16+4N\) if \(N\ge 2\).

It remains to translate the convergence result for the perturbation equation into a convergence result for the confined thin film equation. The argument proceeds in exactly the same way as the proof of Corollary 2.3. We drop the details. \(\square \)

The last proof of this section is based on similar ideas and exploits Lemma 4.1 in more generality.

Proof of Corollary 2.5

In a first step we establish the uniform decay estimate \(\left\| w(t)\right\| _{L^\infty }\lesssim e^{-\mu _{l,k}t}\), which directly implies \(\lim \limits _{t\rightarrow \infty }e^{\mu t}\langle \psi ,w(t)\rangle = 0 \) for all \(\mu < \mu _{l,k}\) and their corresponding eigenfunctions \(\psi \). Towards this uniform estimate, we notice that on the one hand it holds \(|w(t,z)| \lesssim \left| w(t,z)+\frac{1}{2}w(t,z)^2\right| \), because w(t) is small as a consequence of the leading order asymptotics in Corollary 2.3. On the other hand, we deduce from the transformation formulas (17) and (18) that \(\left| w(t,z)+\frac{1}{2}w(t,z)^2\right| = \left| \sqrt{v(t,x)}-V_*(x)\right| \). A combination of both and (14) gives the estimate on w(t).

Before we continue with the proof, we insert a short discussion about the assumptions on the decay of \(v(t)-\sqrt{V_*}\), cf. Remark 2.6. Since all eigenmodes corresponding to eigenvalues \(\mu \) smaller than \(\mu _{l,k}\) decay fast enough, Theorem 3.1 provides a decay estimate for w(t) in \(W^{1,\infty }\), namely \(\left\| w(t)\right\| _{W^{1,\infty }}\lesssim e^{-\mu _{l,k}t}\). Proceeding in the same way as in the proof of Corollary 2.3, we obtain \(\left\| v(t)-\sqrt{V_*}\right\| _{W^{1,\infty }\left( {{\,\textrm{supp}\,}}v(t)\right) }\lesssim e^{-\mu _{l,k}t}\). This shows that extending the norm in the decay assumption in Corollary 2.5 from \(L^\infty \) to \(W^{1,\infty }\) eventually provides an equivalent condition.

Let us now turn back to the actual proof. To deduce a better convergence rate for w(t) from Theorem 3.1, we also have to show that the eigenmodes corresponding to \(\mu _{l,k}\) are inactive, that is

$$\begin{aligned} \lim \limits _{t\rightarrow \infty }e^{\mu _{l,k} t}\langle \psi _{l,n,k},w(t)\rangle = 0 \quad \text { for all } n\in \left\{ 1,\dots ,N_l\right\} . \end{aligned}$$
(30)

Once this is proved, we obtain with the help of Theorem 3.1 that \(\Vert w\Vert _W \lesssim e^{-\mu _+ t}\), where \(\mu _+\) is the next larger eigenvalue following \(\mu _{l,k}\). From this point on, the proof proceeds in the same way as before.

Let us now turn to the proof of (30). Recalling that \(\left\| w\right\| _{L^\infty }\lesssim e^{-\mu _{l,k}t}\) and Lemma 4.1, it suffices to prove that

$$\begin{aligned} \int _{\mathbb {R}^N} \left( v(t,x)-v_*(x)\right) \psi _{l,n,k}(x)\, \text {d}x =0 \quad \text { for all }n\in \left\{ 1,\dots ,N_l\right\} \end{aligned}$$
(31)

for all \(t\ge 0\). The argument for this identity is based on the invariance of v(t) under orthogonal transformations contained in E. Since the confined thin film equation is invariant under orthogonal transformations, uniqueness of solutions to this equation guarantees that the solution v(t) inherits this property from its initial datum \(v_0\) for every time t.

By the right choice of E, this geometric invariance ensures that the projection of v(t) onto every homogeneous, harmonic polynomial of degree l vanishes. The same trivially holds true for \(v_*\). In order to exploit this fact, we have a closer look at the structure of the eigenfunctions \(\psi _{l,n,k}\) appearing in (31). Due to the condition that \(\mu _{l,k}\) has multiplicity \(N_l\), we know from Theorem 2.2 that every \(\psi _{l,n,k}\) has the form

$$\begin{aligned} \psi _{l,n,k} = \sum \limits _{j=1}^{k} c(l,k,j)|x|^{2j}\psi _{l}(x), \end{aligned}$$

where \(\psi _l\) denotes a homogeneous harmonic polynomial of degree l.

Note that the product \(v(t) \sum c(l,k,j)|x|^{2j}\) satisfies the same geometrical properties as v(t) and thus its projection onto every homogeneous harmonic polynomial vanishes as well, that is

$$\begin{aligned} 0 = \int _{\mathbb {R}^N} v(t,x)\sum \limits _{j=1}^{k} c(l,k,j)|x|^{2j}\psi _{l}(x)\,\text {d}x = \int _{\mathbb {R}^N}v(t,x)\psi _{l,n,k} \,\text {d}x. \end{aligned}$$

Again, the same holds true for \(v_*\) and thus the proof of (31) is completed. \(\square \)

Remark 4.2

In the two-dimensional case \(N=2\), Corollary 2.5 can also be proved easily in a more direct way, thanks to the fact that both the spherical harmonics and the orthogonal transformations have a handy, explicit form in two dimensions. In polar coordinates, the spherical harmonics of degree l are given by \(\cos (l\varphi )\) and \(\sin (l\varphi )\). Recalling the form of a rotation or reflection matrix, a straightforward computation yields the same results as Corollary 2.5.

However, this strategy becomes impracticable in higher dimensions, particularly because there is no longer such a convenient representation for general orthogonal transformations.
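As a hedged illustration of the direct two-dimensional computation mentioned in this remark, suppose (this is an assumption made only for the example) that E contains the rotation by the angle \(2\pi /m\) for some integer \(m\ge 2\), and write v in polar coordinates \((r,\varphi )\). The quantity \(a_l(r)\) below is notation introduced for this sketch. The invariance \(v(r,\varphi +2\pi /m)=v(r,\varphi )\) and the periodicity of the integrand give, for every fixed r,

$$\begin{aligned} a_l(r){:}{=}\int _0^{2\pi } v(r,\varphi )e^{il\varphi }\,\text {d}\varphi =\int _0^{2\pi } v\left( r,\varphi +\tfrac{2\pi }{m}\right) e^{il\left( \varphi +\frac{2\pi }{m}\right) }\,\text {d}\varphi =e^{2\pi i l/m}a_l(r). \end{aligned}$$

Hence \(a_l(r)=0\) whenever l is not a multiple of m. Taking real and imaginary parts shows that, for these l, the projections of v onto \(\cos (l\varphi )\) and \(\sin (l\varphi )\), multiplied by any radial weight, vanish.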

5 Theory for the Perturbation Equation

In this section, we will recall the main aspects of the theory for the perturbation equation (19) derived earlier in [65], and we will provide higher order regularity estimates. Such estimates will be an important tool in our invariant manifold theory, which we will develop in the subsequent sections.

We start by recalling that the operator \(\mathcal {L}\) is symmetric in \(L^2(\rho )\) and satisfies the maximal regularity estimate

$$\begin{aligned} \Vert \nabla w\Vert +\Vert \rho \nabla ^2w\Vert \lesssim \Vert \mathcal {L}w\Vert . \end{aligned}$$
(32)

Indeed, such an estimate holds true for the more general class of degenerate elliptic operators

$$\begin{aligned} \mathcal {L}_\sigma w{:}{=}-\rho ^{-\sigma }\nabla \cdot \left( \rho ^{\sigma +1}\nabla w\right) , \end{aligned}$$
(33)

that naturally occur in the context of the porous medium equation, see [42, 47, 63, 64]. In this case, the underlying Hilbert space is \(L^2\left( \rho ^\sigma \right) \). We state the corresponding maximal regularity estimate for the fourth order linear problem associated to the perturbation equation (19), that is,

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _tw+\mathcal {L}^2w+ N \mathcal {L}w &{}= f \quad \text { in } (0,\infty )\times B_1(0)\\ w(0,\cdot )&{}=w_0 \quad \text { in } B_1(0). \end{array}\right. } \end{aligned}$$
(34)

This problem is well-posed for \(L^2(\rho )\) initial data and \(L^2((0,T);L^2(\rho ))\) inhomogeneities; see Lemma 7 in [65]. In the case with zero initial data, \(w_0=0\), there is the maximal regularity estimate

$$\begin{aligned} \begin{aligned}&\Vert \partial _tw\Vert _{L^p\left( \left( 0,T\right) ;L^p(\rho ^\sigma )\right) }\\&+\Vert \nabla ^2w\Vert _{L^p\left( (0,T);L^p(\rho ^\sigma )\right) }+\Vert \rho \nabla ^3w\Vert _{L^p\left( (0,T);L^p(\rho ^\sigma )\right) }+\Vert \rho ^2\nabla ^4w\Vert _{L^p\left( (0,T);L^p(\rho ^\sigma )\right) } \\&\quad \lesssim \Vert f\Vert _{L^p\left( (0,T);L^p(\rho ^\sigma )\right) }, \end{aligned} \end{aligned}$$
(35)

which holds true for any \(p\in \left( 1,\infty \right) \), \(\sigma >0\) and \(T>0\), see Lemma 8 and Proposition 19 (and its proof) in [65].
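For orientation, the divergence form in (33) can be expanded with the product rule. The first line of the following sketch uses only (33); the second line additionally assumes \(\rho (z)=\frac{1}{2}(1-|z|^2)\), hence \(\nabla \rho =-z\), which is not restated in this section.

$$\begin{aligned} -\rho ^{-\sigma }\nabla \cdot \left( \rho ^{\sigma +1}\nabla w\right)&=-\rho ^{-\sigma }\left( \rho ^{\sigma +1}\Delta w+(\sigma +1)\rho ^{\sigma }\nabla \rho \cdot \nabla w\right) =-\rho \Delta w-(\sigma +1)\nabla \rho \cdot \nabla w\\&=-\rho \Delta w+(\sigma +1)z\cdot \nabla w. \end{aligned}$$

In particular, the second order part degenerates linearly in \(\rho \) towards the boundary, whereas the coefficient of the first order part does not vanish there.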

In order to motivate the results that are collected and derived in the following, we have a closer look at the nonlinearity occurring in (20). The natural framework to prove well-posedness of the nonlinear problem (19) is the class \(C^{0,1}(B_1(0))\), in which the singular terms \(R_l[w]\) can be suitably controlled, at least if w is small in that class. Moreover, in such a situation, the nonlinearity is of the same regularity order as the linear elliptic operator \(\mathcal {L}^2 \), and the inhomogeneity can thus be treated as a quadratic perturbation term. We will carry this out in a simple Hilbert space setting later in Section 6 (after a necessary truncation). A complete theory for the nonlinear equation (19) forces us to construct higher order norms that match the scaling of the (homogeneous) Lipschitz norm. This naturally leads to considering Carleson or Whitney measures; more precisely, we set

$$\begin{aligned} \Vert w\Vert _{X(p)}&=\sum \limits _{(l,k,|\beta |)\in \mathcal {E}} \sup \limits _{\begin{array}{c} z\in \overline{B_1(0)}\\ 0<r\le 1 \end{array}} \frac{r^{4k+|\beta |-1}}{\theta (r,z)^{2l-|\beta |+1}}\left| Q_r^d(z)\right| ^{-\frac{1}{p}}\Vert \rho ^l\partial _t^k\partial _z^\beta w\Vert _{L^p\left( {Q_r^d(z)}\right) }\\&\quad +\sum \limits _{(l,k,|\beta |)\in \mathcal {E}} \sup \limits _{T\ge 1} \Vert \rho ^l\partial _t^k\partial _z^\beta w\Vert _{L^p\left( Q(T)\right) },\\ \Vert f\Vert _{Y(p)}&= \sup \limits _{\begin{array}{c} z\in \overline{B_1(0)}\\ 0<r\le 1 \end{array}} \frac{r^3}{\theta (r,z)}\left| Q_r^d(z)\right| ^{-\frac{1}{p}}\Vert f\Vert _{L^p\left( Q_r^d(z)\right) } +\sup \limits _{T\ge 1} \Vert f\Vert _{L^p\left( Q(T)\right) }, \end{aligned}$$

where \( \mathcal {E}=\left\{ (0,1,0),(0,0,2),(1,0,3),(2,0,4) \right\} \) and \(\theta (r,z) = \max \{r,\sqrt{\rho (z)}\}\). Moreover, \(Q_r^d(z)\) is the Whitney cube \((r^4/2,r^4)\times B_r^d(z)\) and \(Q(T) = (T,T+1)\times B_1(0)\). We remark that the balls \(B_r^d(z)=\left\{ z'\in \overline{B_1(0)}: d(z,z')<r\right\} \) are not defined with respect to the Euclidean metric on \(B_1(0)\) but the semi-distance

$$\begin{aligned} d(z,z'){:}{=}\frac{|z-z'|}{\sqrt{\rho (z)}+\sqrt{\rho (z')}+\sqrt{|z-z'|}}. \end{aligned}$$
(36)

The occurrence of this semi-distance can be motivated by interpreting the parabolic problem (34) as a (fourth order) heat flow on a weighted Riemannian manifold \((\mathcal {M},{\textbf {g}},\omega {\textbf {vol}})\), cf. [37]. Indeed, considering \({\textbf {g}}=\rho ^{-1}(\text {d}x)^2\) as the Riemannian metric on the disc B and choosing a suitable weight \(\omega \) on the volume form, the elliptic operator \(\mathcal {L}\) turns out to be the Laplace–Beltrami operator on \((\mathcal {M},{\textbf {g}},\omega {\textbf {vol}})\). On this manifold, the induced geodesic distance is equivalent to \(d(z,z')\) in (36).

Considering this intrinsic metric is helpful as the theories for heat flows are often also available on weighted manifolds [37]. For the subsequent computations, we recall some properties of the intrinsic distance from [64]: The intrinsic balls are equivalent to Euclidean balls, more precisely there exists a positive constant C such that

$$\begin{aligned} B_{C^{-1}r\theta (r,z)}(z)\subseteq B_r^d(z)\subseteq B_{Cr\theta (r,z)}(z) \end{aligned}$$
(37)

for every z in \(\overline{B_1(0)}\) and any r. Furthermore, it holds for any r that

$$\begin{aligned} \sqrt{\rho (z')}\lesssim r \quad \Rightarrow \quad \sqrt{\rho (z)}\lesssim r \quad \text { for all } z\in B_r^d(z') \end{aligned}$$

and

$$\begin{aligned} \sqrt{\rho (z')}\gg r \quad \Rightarrow \quad \rho (z)\sim \rho (z') \quad \text { for all } z\in B_r^d(z') \end{aligned}$$

which, in particular, implies that

$$\begin{aligned} \theta (r,\cdot )\sim \theta (r,z')\quad \text{ in } B_r^d(z'). \end{aligned}$$
(38)
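To see where the Euclidean radius \(r\theta (r,z)\) in (37) comes from, it may help to evaluate (36) in two extreme regimes. The following is only a heuristic sketch; it assumes that \(\rho \) vanishes on \(\partial B_1(0)\), as is the case for \(\rho (z)=\frac{1}{2}(1-|z|^2)\), a definition that is not restated in this section.

$$\begin{aligned} z,z'\in \partial B_1(0):&\quad d(z,z')=\frac{|z-z'|}{\sqrt{|z-z'|}}=\sqrt{|z-z'|}, \quad \text {so } d(z,z')<r \iff |z-z'|<r^2=r\theta (r,z),\\ \sqrt{\rho (z)}\gg r,\ z'\in B_r^d(z):&\quad d(z,z')\sim \frac{|z-z'|}{\sqrt{\rho (z)}}, \quad \text {so } d(z,z')<r \text { corresponds to } |z-z'|\lesssim r\sqrt{\rho (z)}=r\theta (r,z). \end{aligned}$$

In the second line we used \(\rho (z')\sim \rho (z)\) from the implication preceding (38), together with \(|z-z'|\lesssim r\sqrt{\rho (z)}\le \rho (z)\) from (37), to discard \(\sqrt{|z-z'|}\) in the denominator.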

Variants of these norms were considered earlier in the treatment of the Navier–Stokes equations, a class of geometric flows, the porous medium equation and the thin film equation [40, 42, 48, 50, 64], see also the review in [49]. The choice of the large time contributions is rather arbitrary, see also Remark 5.2.

Still on the level of the linear equation (34), it is proved in [65] that for any \(p>N+4\), the solution to (34) satisfies the estimate

$$\begin{aligned} \Vert w\Vert _{W^{1,\infty }} +\Vert w\Vert _{X(p)}\lesssim \Vert f\Vert _{Y(p)} + \Vert w_0\Vert _{W^{1,\infty }}, \end{aligned}$$
(39)

provided that the right-hand side is finite. The well-posedness theory for the perturbation equation (19) and our higher-order regularity estimate below rely heavily on that bound.

For further reference, we recall the main results for (19) from the literature.

Theorem 5.1

([65]) Let \(p>N+4\) be given. There exists \(\varepsilon _0 >0\) such that for every \(w_0\in W^{1,\infty }\) with \( \Vert w_0\Vert _{W^{1,\infty }} \le \varepsilon _0\) there exists a solution w to the nonlinear equation (19) with initial datum \(w_0\) and w is unique among all solutions with \(\Vert w\Vert _{L^\infty (W^{1,\infty })} + \Vert w\Vert _{X(p)} \lesssim \varepsilon _0\). Moreover, this solution w satisfies the estimate

$$\begin{aligned} \Vert w\Vert _{L^\infty (W^{1,\infty })} + \Vert w\Vert _{X(p)}\lesssim \Vert w_0\Vert _{W^{1,\infty }} \end{aligned}$$

and is smooth, and analytic in time and angular direction.

Strictly speaking, the result stated here differs slightly from the one in [65]; see Remark 5.2.

Remark 5.2

For accuracy, we remark that in [65], the linear bound (39) and the nonlinear theory in Theorem 5.1 were derived for slightly different X(p) and Y(p) norms. Indeed, in this earlier work the large time contributions \( \Vert \rho ^l\partial _t^k\partial _z^\beta w\Vert _{L^p\left( Q(T)\right) }\) and \( \Vert f\Vert _{L^p\left( Q(T)\right) }\) both came with a factor T. With regard to the theory developed in the present paper, dropping this factor is more convenient.

In the present paper, we have to extend the theory from \(C^{0,1}\) data to a higher regularity setting. Indeed, it turns out that the truncation that we introduce on the level of the nonlinearity in Section 6 needs to cut off derivatives up to third order. In order to subsequently relate the truncated equation to the original one (19), these derivatives need to be controlled by the initial data. We will choose the uniform higher-order norms whose homogeneous parts have the same scaling as the homogeneous Lipschitz norm at the boundary, \(\Vert \cdot \Vert _W\), which we introduced in (26).

Our main contribution in the present section is the following higher order regularity result:

Theorem 5.3

There exists \(\varepsilon _0>0\), possibly smaller than in Theorem 5.1, such that for every \(w_0\in W^{1,\infty }\) with \( \Vert w_0\Vert _{W}\le \varepsilon _0\), the unique solution w from Theorem 5.1 satisfies

$$\begin{aligned} \Vert w\Vert _W \lesssim \Vert w_0\Vert _W. \end{aligned}$$

Proof

Step 1. Second order derivatives. We will prove the slightly stronger bound

$$\begin{aligned} \Vert \rho \nabla ^2 w\Vert _{L^{\infty }} + \Vert \rho \nabla w\Vert _{X(p)} \lesssim \Vert w_0\Vert _{W^{1,\infty }}+\Vert \rho \nabla ^2w_0\Vert _{L^\infty }. \end{aligned}$$
(40)

For this purpose, for every \(i=1,\dots ,N\), we consider the dynamics of \(\rho \partial _i w\) under the nonlinear equation (19), that is,

$$\begin{aligned} \partial _t(\rho \partial _iw)+\mathcal {L}^2(\rho \partial _iw)+N\mathcal {L}(\rho \partial _iw)=\rho \partial _if[w] + NE[w]+\mathcal {L}E[w] + E[\mathcal {L}w], \end{aligned}$$

where \(E[v] = -\rho z_i \Delta v - 2\rho \partial _iv + (N\rho -2|z|^2)\partial _iv+ 2\rho z \cdot \nabla \partial _iv\) is the commutator of the operators \(\rho \partial _i \) and \( \mathcal {L}\), and this equation is equipped with the initial datum \(\rho \partial _iw_0.\) From the a priori bound in (39), we know that

$$\begin{aligned}&\Vert \rho \partial _iw\Vert _{W^{1,\infty }} +\Vert \rho \partial _iw\Vert _{X(p)}\lesssim \Vert \rho \partial _if[w]\Vert _{Y(p)} \\&\quad + \Vert NE[w]+\mathcal {L}E[w] + E[\mathcal {L}w]\Vert _{Y(p)}+ \Vert \rho \partial _iw_0\Vert _{W^{1,\infty }}. \end{aligned}$$
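The commutator E appearing on the right-hand side can be computed by hand. The following sketch assumes \(\rho (z)=\frac{1}{2}(1-|z|^2)\), so that \(\nabla \rho =-z\), \(\Delta \rho =-N\) and \(\mathcal {L}v=-\rho \Delta v+2z\cdot \nabla v\); these identities are not restated in this section.

$$\begin{aligned} \mathcal {L}(\rho \partial _iv)&=-\rho \left( \Delta \rho \,\partial _iv+2\nabla \rho \cdot \nabla \partial _iv+\rho \Delta \partial _iv\right) +2z\cdot \left( \nabla \rho \,\partial _iv+\rho \nabla \partial _iv\right) \\&=(N\rho -2|z|^2)\partial _iv+4\rho z\cdot \nabla \partial _iv-\rho ^2\Delta \partial _iv,\\ \rho \partial _i(\mathcal {L}v)&=\rho z_i\Delta v-\rho ^2\Delta \partial _iv+2\rho \partial _iv+2\rho z\cdot \nabla \partial _iv,\\ \mathcal {L}(\rho \partial _iv)-\rho \partial _i(\mathcal {L}v)&=-\rho z_i\Delta v-2\rho \partial _iv+(N\rho -2|z|^2)\partial _iv+2\rho z\cdot \nabla \partial _iv. \end{aligned}$$

Applying the last identity once with \(v=w\) and once with \(v=\mathcal {L}w\) accounts for the three terms \(NE[w]\), \(\mathcal {L}E[w]\) and \(E[\mathcal {L}w]\) in the evolution equation for \(\rho \partial _iw\).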

In view of the bound from Theorem 5.1, in order to prove (40) it suffices thus to prove that

$$\begin{aligned} \begin{aligned}&\Vert \rho \partial _if[w]\Vert _{Y(p)} + \Vert NE[w]+\mathcal {L}E[w] + E[\mathcal {L}w]\Vert _{Y(p)}\\&\quad \lesssim \Vert w\Vert _{W^{1,\infty }} + \Vert w\Vert _{X(p)} + \varepsilon _0 \left( \Vert \rho \nabla w\Vert _{W^{1,\infty }} + \Vert \rho \nabla w\Vert _{X(p)} \right) , \end{aligned} \end{aligned}$$
(41)

and to choose \(\varepsilon _0\) sufficiently small.

From [65] we are aware of another form of the nonlinearity f[w] of the perturbation equation (19), namely \(f[w]=f^1[w]+f^2[w]+f^3[w],\) where

$$\begin{aligned} f^1[w]&= p \star R[w]\star \left( \left( \nabla w\right) ^{2\star }+\nabla w \star \nabla ^2w \right) , \\ f^2[w]&= p\star R[w]\star \rho \left( \left( \nabla ^2w\right) ^{2\star }+ \nabla ^3w\star \nabla w\right) , \\ f^3[w]&=p \star R[w]\star \rho ^2 \left( \left( \nabla ^2w\right) ^{3\star } +\nabla ^2w\star \nabla ^3w+ \nabla w\star \nabla ^4w \right) , \end{aligned}$$

and

$$\begin{aligned} R[w]= \frac{\left( \nabla w\right) ^{k\star }}{\left( 1+w +z\cdot \nabla w\right) ^{l}} \end{aligned}$$

for some \(k \in \mathbb {N}_0\), \(l\in \mathbb {N}\), whose values may be different in any occurrence of R[w]. (Of course, the reader may derive this presentation also directly from (19) and (20).) The computation of derivatives of these expressions is tedious but straightforward. As an auxiliary result we notice that \(\nabla R [w] = p\star R + p\star R\star \nabla ^2 w\). Here are the final formulas:

$$\begin{aligned} \partial _i f^1[w]&=p\star R[w]\star \left( (\nabla w)^{2\star } +\nabla w\star \nabla ^2 w + (\nabla ^2 w)^{2\star } +\nabla w\star \nabla ^3 w\right) ,\\ \partial _i f^2[w]&=p\star R[w]\star \left( (\nabla ^2 w)^{2\star }\right. \\&\quad \left. +\nabla w\star \nabla ^3 w + \rho (\nabla ^2 w)^{3\star } +\rho \nabla ^2 w\star \nabla ^3 w + \rho \nabla w\star \nabla ^4w\right) ,\\ \partial _i f^3[w]&= p\star R[w]\star \left( \rho (\nabla ^2 w)^{3\star } + \rho \nabla ^2 w\star \nabla ^3 w+\rho \nabla w\star \nabla ^4w + \rho ^2 (\nabla ^2 w)^{4\star } \right. \\&\quad +\left. \rho ^2 \nabla w\star \nabla ^5 w+ \rho ^2 (\nabla ^2 w)^{2\star }\star \nabla ^3w +\rho ^2 (\nabla ^3 w)^{2\star } +\rho ^2 \nabla ^2 w\star \nabla ^4 w\right) . \end{aligned}$$

Combining them, and multiplying by \(\rho \), we thus find that

$$\begin{aligned} \rho \partial _i f[w] = p\star R[w]\star \left( I + J \right) , \end{aligned}$$

where

$$\begin{aligned} I&= (\nabla w)^{2\star } +\rho \nabla w\star \nabla ^2 w +\rho (\nabla ^2 w)^{2\star } +\rho \nabla w\star \nabla ^3 w +\rho ^2 \nabla ^2 w \star \nabla ^3 w\\&\quad + \rho ^2 \nabla w\star \nabla ^4 w +\rho ^2 \nabla ^2 w\star \nabla ^3 w + \rho ^3 \nabla ^2 w\star \nabla ^4 w + \rho ^3 \nabla w\star \nabla ^5 w,\\ J&= \rho ^3 (\nabla ^3 w)^{2\star } +\rho ^2 (\nabla ^2 w)^{3\star } +\rho ^3 (\nabla ^2 w)^{2\star } \star \nabla ^3 w + \rho ^3 (\nabla ^2 w)^{4\star } . \end{aligned}$$
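The auxiliary identity for \(\nabla R[w]\) used in deriving these formulas follows from the quotient rule. In the sketch below, k and l are the exponents in R[w]. As an assumption on the notation of [65] that is not restated here, p stands for a generic polynomial tensor in z and \(\star \) for an arbitrary linear combination of contractions.

$$\begin{aligned} \nabla \left( 1+w+z\cdot \nabla w\right)&=2\nabla w+z\cdot \nabla ^2w,\\ \nabla R[w]&=\frac{(\nabla w)^{(k-1)\star }\star \nabla ^2w}{\left( 1+w+z\cdot \nabla w\right) ^{l}} - l\,\frac{(\nabla w)^{k\star }\star \left( 2\nabla w+z\cdot \nabla ^2w\right) }{\left( 1+w+z\cdot \nabla w\right) ^{l+1}}. \end{aligned}$$

Each quotient is again of the form R[w] with modified exponents, possibly multiplied by z or by \(\nabla ^2w\), which gives \(\nabla R[w]=p\star R+p\star R\star \nabla ^2w\). For \(k=0\) the first quotient is absent.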

Because \(|p\star R[w]|\lesssim 1\) thanks to the control of w and \(\nabla w\) during the evolution, in our estimate of \(\rho \partial _i f[w]\) it is enough to control I and J. Here, the first term is much easier to handle. Indeed, using the fact that \(\Vert \nabla w\Vert _{Y(p)} \lesssim \Vert \nabla w\Vert _{L^{\infty }}\) and \(\Vert \nabla ^2 w\Vert _{Y(p)} + \Vert \rho \nabla ^3w\Vert _{Y(p)}+\Vert \rho ^2\nabla ^4w\Vert _{Y(p)} \lesssim \Vert w\Vert _{X(p)} \), which comes directly out of the definition of the Y(p) norm, and invoking the a priori estimate in Theorem 5.1, we readily find that

$$\begin{aligned} \Vert I\Vert _{Y(p)}&\lesssim \left( \Vert \nabla w\Vert _{L^{\infty }} + \Vert w\Vert _{X(p)}\right) \left( \Vert \nabla w\Vert _{L^{\infty }} + \Vert \rho \nabla ^2 w\Vert _{L^{\infty }} + \Vert \rho ^3 \nabla ^5 w\Vert _{Y(p)}\right) \\&\lesssim \Vert w_0\Vert _{W^{1,\infty }} + \varepsilon _0 \left( \Vert \rho \nabla ^2 w\Vert _{L^{\infty }} + \Vert \rho ^3 \nabla ^5 w\Vert _{Y(p)}\right) . \end{aligned}$$
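The comparison between the Y(p) and X(p) norms used here can be read off from the weights in their definitions. As a sketch: for \(0<r\le 1\) one has \(r\le \theta (r,z)\) by the definition of \(\theta \), and the small-scale weights of \(\nabla ^2w\), \(\rho \nabla ^3w\) and \(\rho ^2\nabla ^4w\) in X(p), corresponding to the indices (0, 0, 2), (1, 0, 3) and (2, 0, 4) in \(\mathcal {E}\), are \(r\theta \), \(r^2\) and \(r^3/\theta \), respectively. The weight \(r^3/\theta \) of Y(p) is dominated in each case, since

$$\begin{aligned} \frac{r^3}{\theta }=\frac{r^2}{\theta ^2}\,r\theta \le r\theta , \qquad \frac{r^3}{\theta }=\frac{r}{\theta }\,r^2\le r^2, \qquad \frac{r^3}{\theta }\le \frac{r^3}{\theta }. \end{aligned}$$

Similarly, \(r^3/\theta \le r^2\le 1\) together with \(\left| Q_r^d(z)\right| ^{-\frac{1}{p}}\Vert \nabla w\Vert _{L^p\left( Q_r^d(z)\right) }\le \Vert \nabla w\Vert _{L^\infty }\) and \(\Vert \nabla w\Vert _{L^p(Q(T))}\lesssim \Vert \nabla w\Vert _{L^\infty }\) gives \(\Vert \nabla w\Vert _{Y(p)}\lesssim \Vert \nabla w\Vert _{L^\infty }\).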

The estimates of the terms appearing in J are more involved, as we have to make use of suitable interpolations; some were already discussed in [65], but we present the ideas here for the convenience of the reader. Let \(\eta \) be a smooth cut-off function satisfying \(\eta = 1\) in \(B_r^d(z_0)\) and \(\eta =0\) outside \(B_{2r}^d(z_0)\) for \(r\le 1\). Inside of the ball \(B_r^d(z_0)\), we then have that

$$\begin{aligned} \rho ^3 |\nabla ^3 w|^2 \lesssim \rho |\nabla \zeta |^2 + \rho |\nabla ^2 w|^2, \end{aligned}$$

where \(\zeta =\eta \rho \nabla ^2w\). It follows that

$$\begin{aligned} \Vert \rho ^3 |\nabla ^3 w|^2\Vert _{L^p\left( B_r^d(z_0)\right) }&\lesssim \Vert \rho |\nabla \zeta |^2\Vert _{L^p} + \Vert \rho |\nabla ^2 w|^2\Vert _{L^p\left( B_r^d(z_0)\right) }. \end{aligned}$$
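The pointwise inequality behind this step is a consequence of the product rule. A sketch, assuming \(|\nabla \rho |\lesssim 1\) on \(B_1(0)\) (which holds for \(\rho (z)=\frac{1}{2}(1-|z|^2)\)) and using that \(\eta =1\) on \(B_r^d(z_0)\), so that \(\zeta =\rho \nabla ^2w\) there:

$$\begin{aligned} \rho \nabla ^3w&=\nabla \zeta -\nabla \rho \otimes \nabla ^2w,\\ \rho ^3|\nabla ^3w|^2&=\rho \left| \rho \nabla ^3w\right| ^2\le 2\rho |\nabla \zeta |^2+2\rho |\nabla \rho |^2|\nabla ^2w|^2\lesssim \rho |\nabla \zeta |^2+\rho |\nabla ^2w|^2. \end{aligned}$$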

To estimate the first term on the right hand side, we make use of the interpolation inequality (71) with \(m=2\) and \(i=1\) in Lemma B.3 of the appendix and find

$$\begin{aligned} \Vert \rho |\nabla \zeta |^2\Vert _{L^p}&= \Vert \nabla \zeta \Vert _{L^{2p}\left( \rho ^p\right) }^2 \lesssim \Vert \zeta \Vert _{L^{\infty }} \Vert \nabla ^2\zeta \Vert _{L^p\left( \rho ^p\right) }. \end{aligned}$$

We then deduce from the definition of \(\zeta \), by using Leibniz’ rule and the fact that \(|\nabla ^k \eta |\lesssim r^{-k}\theta (r,z_0)^{-k}\), which follows from the behavior of the intrinsic balls in (37), that

$$\begin{aligned} \Vert \rho |\nabla \zeta |^2\Vert _{L^p}&\lesssim \Vert \rho \nabla ^2 w\Vert _{L^{\infty }} \left( \Vert \rho ^2 \nabla ^4 w\Vert _{L^p\left( B_{2r}^d(z_0)\right) } + \Vert \rho \nabla ^3 w\Vert _{L^p(B_{2r}^d(z_0))} \right. \\&\quad +\frac{1}{r\theta (r,z_0)}\Vert \rho \nabla ^2 w\Vert _{L^p\left( B_{2r}^d(z_0)\right) }+\frac{1}{r\theta (r,z_0)}\Vert \rho ^2 \nabla ^3 w\Vert _{L^p\left( B_{2r}^d(z_0)\right) }\\&\quad \left. +\frac{1}{r^2\theta (r,z_0)^2}\Vert \rho ^2 \nabla ^2 w\Vert _{L^p\left( B_{2r}^d(z_0)\right) }\right) . \end{aligned}$$

The \(\rho \)’s can always be pulled out of the norms by estimating against \(\theta (r,z_0)^2\), because \(\theta (r,z_0)\sim \theta (r,z)=\max \left\{ r,\sqrt{\rho (z)}\right\} \) by (38). In view of the definitions of the Y(p) and X(p) norms, we then deduce that

$$\begin{aligned} \Vert \rho ^3 |\nabla ^3 w|^2 \Vert _{Y(p)} \lesssim \Vert w\Vert _{X(p)} \Vert \rho \nabla ^2 w\Vert _{L^{\infty }} + \Vert \rho |\nabla ^2 w|^2\Vert _{Y(p)}, \end{aligned}$$

and the second term can be estimated as in our bound for I, so that we find that

$$\begin{aligned} \Vert \rho ^3 |\nabla ^3 w|^2 \Vert _{Y(p)} \lesssim \varepsilon _0 \Vert \rho \nabla ^2 w\Vert _{L^{\infty }}, \end{aligned}$$
(42)

thanks to the estimates from Theorem 5.1.

The second term in J can be estimated very similarly. This time we choose \(\zeta = \eta \nabla w\) and eventually arrive at

$$\begin{aligned} \Vert \rho ^2 |\nabla ^2 w|^3\Vert _{Y(p)} \lesssim \Vert \nabla w\Vert _{L^{\infty }}^2 \left( \Vert \nabla w\Vert _{L^{\infty }} + \Vert w\Vert _{X(p)}\right) \lesssim \Vert w_0\Vert _{W^{1,\infty }}, \end{aligned}$$

thanks to the a priori estimates in Theorem 5.1. (Notice that details for this estimate can be found in [65].) The latter bound also entails an estimate for the fourth term in J. Indeed, we have

$$\begin{aligned} \Vert \rho ^3 |\nabla ^2 w|^4\Vert _{Y(p)}&\le \Vert \rho \nabla ^2 w\Vert _{L^{\infty }} \Vert \rho ^2 |\nabla ^2 w|^3\Vert _{Y(p)}\\&\lesssim \Vert w_0\Vert _{W^{1,\infty }}\Vert \rho \nabla ^2 w\Vert _{L^{\infty }} \le \varepsilon _0 \Vert \rho \nabla ^2 w\Vert _{L^{\infty }}. \end{aligned}$$
(43)

Finally, in order to bound the third term in J, we interpolate between (42) and (43). Altogether, we find the estimate

$$\begin{aligned} \Vert J\Vert _{Y(p)} \lesssim \Vert w_0\Vert _{W^{1,\infty }} + \varepsilon _0 \Vert \rho \nabla ^2 w\Vert _{L^{\infty }}. \end{aligned}$$
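One way to carry out the interpolation for the third term of J, given here as a sketch and not necessarily the argument intended in [65], is the pointwise Young inequality

$$\begin{aligned} \rho ^3|\nabla ^2w|^2|\nabla ^3w|=\left( \rho ^{\frac{3}{2}}|\nabla ^2w|^2\right) \left( \rho ^{\frac{3}{2}}|\nabla ^3w|\right) \le \frac{1}{2}\rho ^3|\nabla ^2w|^4+\frac{1}{2}\rho ^3|\nabla ^3w|^2, \end{aligned}$$

whose right-hand side is controlled in Y(p) by (43) and (42), respectively.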

Our estimates on I and J yield the desired control on \(\rho \partial _i f[w]\). To prove the full statement in (41), it remains only to choose \(\varepsilon _0\) small enough and to notice that

$$\begin{aligned} |NE[w]+\mathcal {L}E[w] + E[\mathcal {L}w]| \lesssim |\rho ^2\nabla ^4w|+|\rho \nabla ^3w|+|\nabla ^2w|+|\nabla w|, \end{aligned}$$

which provides

$$\begin{aligned} \Vert NE[w]+\mathcal {L}E[w] + E[\mathcal {L}w]\Vert _{Y(p)} \lesssim \Vert w\Vert _{W^{1,\infty }} + \Vert w\Vert _{X(p)} \lesssim \Vert w_0\Vert _{W^{1,\infty }} \end{aligned}$$

in a similar manner as before. This finishes the proof of (41), and hence of (40).

Step 2. Third order derivatives. The proof of the estimates proceeds analogously to the first step, only this time, many more terms have to be considered. For every \(i,j=1,\dots ,N\) we consider the dynamics of \(\rho \partial _j(\rho \partial _i w)\), that is,

$$\begin{aligned}&\partial _t\left( \rho \partial _j(\rho \partial _iw)\right) + \mathcal {L}^2\left( \rho \partial _j(\rho \partial _iw)\right) +N\mathcal {L}\left( \rho \partial _j(\rho \partial _iw)\right) \\&\quad =\rho \partial _j\left( \rho \partial _if[w]\right) +\rho \partial _j \left( NE[w]+\mathcal {L}E[w] + E[\mathcal {L}w]\right) \\&\qquad + NE[\rho \partial _iw]+\mathcal {L}E[\rho \partial _iw] + E[\mathcal {L}(\rho \partial _iw)], \end{aligned}$$

which is equipped with the initial datum \(\rho \partial _j(\rho \partial _iw_0)\). Again, thanks to the a priori bound (39), we know that

$$\begin{aligned}&\Vert \rho \partial _j\left( \rho \partial _iw\right) \Vert _{W^{1,\infty }}+ \Vert \rho \partial _j\left( \rho \partial _iw\right) \Vert _{X(p)}\\&\quad \lesssim \Vert \rho \partial _j\left( \rho \partial _if[w]\right) \Vert _{Y(p)}+\Vert \rho \partial _j \left( NE[w]+\mathcal {L}E[w] + E[\mathcal {L}w]\right) \Vert _{Y(p)}\\&\qquad +\Vert NE[\rho \partial _iw]+\mathcal {L}E[\rho \partial _iw] + E[\mathcal {L}(\rho \partial _iw)]\Vert _{Y(p)}+\Vert \rho \partial _j\left( \rho \partial _iw_0\right) \Vert _{W^{1,\infty }}, \end{aligned}$$

which can be rewritten as

$$\begin{aligned}&\Vert \rho ^2 \partial _{ij}^2 w \Vert _{W^{1,\infty }}+ \Vert \rho ^2 \partial _{ij}^2 w \Vert _{X(p)}\\&\quad \lesssim \Vert \rho ^2 \partial _{ij}^2 f[w] \Vert _{Y(p)}+\Vert \rho \partial _j \left( NE[w]+\mathcal {L}E[w] + E[\mathcal {L}w]\right) \Vert _{Y(p)}\\&\qquad +\Vert NE[\rho \partial _iw]+\mathcal {L}E[\rho \partial _iw] + E[\mathcal {L}(\rho \partial _iw)]\Vert _{Y(p)}+\Vert w_0\Vert _{W }, \end{aligned}$$

by virtue of the second order estimate (40). The linear terms are, again, relatively easy to bound, as we have

$$\begin{aligned}&|NE[\rho \partial _i w] +\mathcal {L}E[\rho \partial _i w] + E[\mathcal {L}(\rho \partial _i w)]| + |\rho \partial _j \left( NE[w] + \mathcal {L}E[w] +E[\mathcal {L}w]\right) |\\&\quad \lesssim \rho ^3 |\nabla ^5 w| +\rho ^2 |\nabla ^4 w| + \rho |\nabla ^3 w| + |\nabla ^2 w| + |\nabla w|, \end{aligned}$$

and thus, the Y(p) norm of the linear terms is controlled by the X(p) and \(L^{\infty }\) norms of w and \(\rho \nabla w\), which are in turn bounded by \(\Vert w_0\Vert _W\) by virtue of Theorem 5.1 and the second order estimates in (40).

Let us thus focus on the nonlinear terms. They take the form

$$\begin{aligned} \rho ^2 \partial _{ij}^2 f[w]&= p\star R[w]\star K , \end{aligned}$$

where

$$\begin{aligned} K&= \rho (\nabla w)^{2\star } + \rho \nabla w\star \nabla ^2 w + \rho (\nabla ^2 w)^{2\star } + \rho \nabla w \star \nabla ^3 w + \rho ^2 (\nabla ^2 w)^{3\star } +\rho ^2 \nabla ^2 w\star \nabla ^3 w \\&\quad + \rho ^2 \nabla w\star \nabla ^4 w+ \rho ^3 (\nabla ^2 w)^{2\star } \star \nabla ^3 w+\rho ^3 (\nabla ^2 w)^{4\star } + \rho ^3 \nabla ^2 w\star \nabla ^4 w +\rho ^3 \nabla w\star \nabla ^5 w \\&\quad + \rho ^4 \nabla w \star \nabla ^6 w+ \rho ^4 \nabla ^2 w \star \nabla ^5 w + \rho ^4 (\nabla ^2 w)^{2\star } \star \nabla ^4 w+ \rho ^4 (\nabla ^2 w)^{3\star }\star \nabla ^3 w\\&\quad +\rho ^4 (\nabla ^2 w)^{3\star }\star \nabla ^3 w+ \rho ^4 (\nabla ^2 w)^{5\star } + \rho ^3 (\nabla ^3 w)^{2\star } + \rho ^4 \nabla ^2 w \star (\nabla ^3 w)^{2\star } + \rho ^4 \nabla ^3 w\star \nabla ^4 w , \end{aligned}$$

as the reader may check in a lengthy but straightforward exercise. The bound of K is surprisingly simple as, thanks to the second order estimates (40), no interpolations have to be performed. We simply have

$$\begin{aligned} \Vert K\Vert _{Y(p)}&\lesssim \left( \Vert \nabla w\Vert _{L^{\infty }} + \sum _{k=1}^4\Vert \rho \nabla ^2 w\Vert _{L^{\infty }}^k\right) \left( \Vert \nabla w\Vert _{L^{\infty }} + \Vert w\Vert _{X(p)} + \Vert \rho \nabla w\Vert _{X(p)}\right) \\&\quad + \Vert \nabla w\Vert _{L^{\infty }} \Vert \rho ^4 \nabla ^6 w\Vert _{Y(p)} \\&\quad + \Vert w\Vert _{X(p)} \Vert \rho ^2 \nabla ^3 w\Vert _{L^{\infty }} + \Vert \rho \nabla ^2 w\Vert _{L^{\infty }} \Vert w\Vert _{X(p)} \Vert \rho ^2 \nabla ^3 w\Vert _{L^{\infty }}\\&\lesssim \Vert w_0\Vert _{W^{1,\infty }} + \Vert \rho \nabla ^2 w_0 \Vert _{L^{\infty }} + \varepsilon _0\left( \Vert \rho ^2 \nabla ^2 w\Vert _{L^{\infty }} + \Vert \rho ^2 \nabla ^2 w\Vert _{X(p)}\right) , \end{aligned}$$

where we invoked the second order estimates (40) and the a priori estimates from Theorem 5.1 in the second inequality. We derive the statement of the theorem by choosing \(\varepsilon _0\) sufficiently small. \(\square \)

6 The Truncated Problem

The particular form of the nonlinearity limits the well-posedness theory for the Cauchy problem for (19) to a small neighborhood of the trivial solution \(w\equiv 0\). It follows that the resulting semi-flow is necessarily local. In order to construct a global semi-flow, whose existence simplifies the construction of invariant manifolds significantly, it is customary to consider a truncated version of the perturbation equation. We thus introduce a cut-off function that eliminates the nonlinear terms (locally) near points where the solution w, or one of its (suitably weighted) derivatives, is too large. This way, the equation becomes linear at these points. The cut-off remains inactive as long as the solution is globally small with respect to \(\Vert \cdot \Vert _W\), which is the case for solutions of the perturbation equation for sufficiently small initial datum due to Theorem 5.3.

To make this truncation more precise we recall that the perturbation equation reads as

$$\begin{aligned} \partial _tw+\mathcal {L}^2w+N\mathcal {L}w=\rho ^{-1}\nabla \cdot \left( \rho ^2 F[w]\right) +\rho F[w], \end{aligned}$$
(44)

where the nonlinear terms are schematically given by

$$\begin{aligned} F[w] = p\star R_l[w]\star \left( \rho \nabla ^3w\star \nabla w+ \rho (\nabla ^2w)^{2\star } + \nabla ^2w\star \nabla w+ (\nabla w)^{2\star }\right) , \end{aligned}$$

cf. (19) and (20). Let \(\hat{\eta }:[0,\infty ) \rightarrow [0,1]\) be a smooth cut-off function that is supported on [0, 2) with \(\hat{\eta }(x) =1\) if \(0 \le x\le 1\). For \(\varepsilon \in (0,1)\), we define

$$\begin{aligned} \eta _\varepsilon&= \eta _\varepsilon \left[ w,\nabla w,\rho \nabla ^2w,\rho ^2\nabla ^3w\right] \\&{:}{=}\hat{\eta }\left( \frac{w^2}{\varepsilon ^2}\right) \hat{\eta }\left( \frac{|\nabla w|^2}{\varepsilon ^2}\right) \hat{\eta }\left( \frac{\left| \rho \nabla ^2w\right| ^2}{\varepsilon ^2}\right) \hat{\eta }\left( \frac{\left| \rho ^2\nabla ^3w\right| ^2}{\varepsilon ^2}\right) . \end{aligned}$$

The truncated problem we consider now is the following:

$$\begin{aligned} \partial _tw+\mathcal {L}^2w+N\mathcal {L}w=\rho ^{-1}\nabla \cdot \left( \rho ^2 F_{\varepsilon }[w]\right) +\rho F_{\varepsilon }[w],\quad F_{\varepsilon } = \eta _{\varepsilon }F. \end{aligned}$$
(45)

It is clear that this equation coincides with (19) as long as all terms \(\left| w\right| \), \(\left| \nabla w\right| \), \(\left| \rho \nabla ^2w\right| \) and \(\left| \rho ^2 \nabla ^3w\right| \) are globally bounded from above by \(\varepsilon \). As we already know for solutions w(t) of the full perturbation equation (19) that \(\left\| w(t)\right\| _{W}\) is controlled by \( \left\| w_0\right\| _{W },\) provided that the initial datum \(w_0\) is sufficiently small, the solutions of both equations coincide if \(\left\| w_0\right\| _{W}\ll \varepsilon \). Thus, in this situation the truncation does not change the dynamics, while it has the advantage that we end up with a globally well-posed equation, see Theorem 6.3. We remark that the choice of a pointwise truncation is necessary in order to ensure the differentiability of the nonlinearity in w. It has, however, the drawback that the regularity estimates from [65] do not seem to carry over to the truncated problem. The technical difficulties arise from the fact that derivatives fall onto the cut-off functions and the resulting terms fail to be controlled in a way analogous to the nonlinear terms in the original problem.
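To make the effect of the cut-off concrete, the following sketch records what the plateau and the support of \(\hat{\eta }\) imply pointwise. The abbreviation m[w] is introduced only for this illustration, and the last line additionally assumes that \(\varepsilon \) is small enough for \(|p\star R_l[w]|\lesssim 1\) to hold wherever \(\eta _\varepsilon \ne 0\).

$$\begin{aligned} m[w]&{:}{=}\max \left\{ |w|,|\nabla w|,\left| \rho \nabla ^2w\right| ,\left| \rho ^2\nabla ^3w\right| \right\} ,\\ m[w]\le \varepsilon&\quad \Rightarrow \quad \eta _\varepsilon =1,\\ \eta _\varepsilon \ne 0&\quad \Rightarrow \quad m[w]<\sqrt{2}\,\varepsilon ,\\ \left| \rho F_\varepsilon [w]\right|&\lesssim \varepsilon \left( \left| \rho ^2\nabla ^3w\right| +\left| \rho \nabla ^2w\right| +|\nabla w|\right) . \end{aligned}$$

The last line uses that every term in \(\rho F[w]\) is at least quadratic in the truncated quantities, so that one factor can be estimated by \(\sqrt{2}\,\varepsilon \) on the support of \(\eta _\varepsilon \).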

Moreover, it is crucial that derivatives up to third order are suitably truncated. At first glance, this looks surprising because the original theory [65] for the perturbation equation (44) requires only the control of Lipschitz norms. However, it turns out that the well-posedness theory for a truncated equation becomes unexpectedly subtle if the truncation is performed only up to first order.

We will prove well-posedness of (45) in the Hilbert space H, which, as we will see, appears very naturally in the treatment of the truncated equation. Even though it is in general not necessary to work in a Hilbert space setting to construct invariant manifolds, see, for example, [14], this choice will be extremely convenient. Moreover, we can take advantage of the spectral analysis developed in [54] in a nearly identical setting.

In order to prove well-posedness of the truncated problem in H, we need to extend the maximal regularity result (32) for the operator \(\mathcal {L}\) to the Hilbert space H.

Lemma 6.1

The operator \(\mathcal {L}\) satisfies the maximal regularity estimate

$$\begin{aligned} \left\| \nabla w\right\| _H+\Vert \rho \nabla ^2 w\Vert _H \lesssim \left\| \mathcal {L}w\right\| _H. \end{aligned}$$

For the proof we refer to the theory for the operator \(\mathcal {L}_\sigma \) in (33) and its derivatives developed in [65], more precisely Lemmas 1, 2 and 4 and their proofs. The proof of Lemma 6.1 can be carried out analogously. It mainly relies on the observation that the operator \(\mathcal {L}_\sigma \) commutes with tangential derivatives and that its radial derivative \(\partial _r\mathcal {L}_\sigma w\) can be rewritten in terms of \(\mathcal {L}_{\sigma +1}\partial _r w\) and lower order terms. This makes the maximal regularity estimate for \(\mathcal {L}_\sigma \), equation (32), applicable.

The proof of well-posedness of the truncated problem exploits a fixed point argument. For this it is necessary to control the Lipschitz constants of the nonlinear terms \(F_{\varepsilon }\) in a suitable way.

Lemma 6.2

It holds that

$$\begin{aligned}&\left\| \sqrt{\rho } F_{\varepsilon } \left[ w_1\right] -\sqrt{\rho }F_{\varepsilon } \left[ w_2\right] \right\| \\&\quad \lesssim \varepsilon \left( \left\| \rho \nabla ^2 w_1-\rho \nabla ^2 w_2 \right\| _H+\left\| \nabla w_1-\nabla w_2\right\| _H + \left\| w_1-w_2\right\| _H \right) . \end{aligned}$$

Proof

This is a straightforward computation embarking from the pointwise estimate

$$\begin{aligned}&\left| \rho F_{\varepsilon } [w_1]-\rho F_{\varepsilon } [w_2]\right| \\&\quad \lesssim \varepsilon \left( \left| \rho ^2 \nabla ^3w_1-\rho ^2\nabla ^3w_2\right| +\left| \rho \nabla ^2w_1-\rho \nabla ^2w_2\right| +\left| \nabla w_1-\nabla w_2\right| +\left| w_1-w_2\right| \right) , \end{aligned}$$

which in turn can be readily checked. Indeed, the latter implies that

$$\begin{aligned}&\left\| \sqrt{\rho } F_{\varepsilon } [w_1] - \sqrt{\rho } F_{\varepsilon } [w_2]\right\| \\&\quad =\left\| \rho F_{\varepsilon } [w_1] - \rho F_{\varepsilon } [w_2]\right\| _{L^2}\\&\quad \lesssim \varepsilon \left( \left\| \rho ^2 \nabla ^3w_1-\rho ^2\nabla ^3w_2\right\| _{L^2}+\left\| \rho \nabla ^2w_1-\rho \nabla ^2w_2\right\| _{L^2}+\left\| \nabla w_1-\nabla w_2\right\| _{L^2}+\left\| w_1-w_2\right\| _{L^2}\right) \\&\quad \lesssim \varepsilon \left( \left\| \rho \nabla ^2w_1-\rho \nabla ^2w_2\right\| _{H}+\left\| \nabla w_1-\nabla w_2\right\| _{H}+ \left\| w_1-w_2\right\| _{H}\right) , \end{aligned}$$

where we have used (25) in the last inequality. \(\square \)

With this preparation, we are in the position to derive well-posedness.

Theorem 6.3

(Global well-posedness in H) There exists \(\varepsilon ^*>0\) such that for every \(\varepsilon \le \varepsilon ^*\) and every initial datum \(w_0\in H\) the truncated problem (45) has a unique global solution w. Moreover, the solution w satisfies

$$\begin{aligned} \left\| w\right\| _{L^\infty \left( (0,\infty );H\right) }+\left\| \nabla w\right\| _{L^2\left( (0,\infty );H\right) }+\Vert \rho \nabla ^2w\Vert _{L^2\left( (0,\infty );H\right) }\lesssim \left\| w_0\right\| _{H}. \end{aligned}$$

Proof

We commence by considering the linear initial value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t\tilde{w}+\mathcal {L}^2\tilde{w}+N\mathcal {L}{\tilde{w}} &{}= \rho ^{-1} \nabla \cdot \left( \rho ^2 \tilde{F}\right) +\rho \tilde{F}\\ \tilde{w}(0,\cdot )&{}=w_0 \end{array}\right. } \end{aligned}$$
(46)

for fixed \(\tilde{F} \in L^2((0,\infty );L^2(\rho ^2 ))\). The problem (46) has a unique weak solution \(\tilde{w}\) on the time interval (0, T); see Lemma 7 in [65]. This satisfies the estimate

$$\begin{aligned} \begin{aligned}&\left\| \tilde{w}\right\| _{L^\infty \left( (0,T);H\right) }+\Vert \nabla \tilde{w}\Vert _{L^2\left( (0,T);H\right) }+\left\| \rho \nabla ^2\tilde{w} \right\| _{L^2\left( (0,T);H\right) }\\&\quad \le C_T\left( \left\| \rho \tilde{F}\right\| _{L^2\left( (0,T);L^2\right) } +\left\| w_0\right\| _{H} \right) . \end{aligned}\end{aligned}$$
(47)

To derive (47), we test the equation with \(\tilde{w}\) in the inner product \(\langle \cdot ,\cdot \rangle _H\), and obtain, after multiple integrations by parts,

$$\begin{aligned} \frac{1}{2}\frac{d}{\text {d}t}\left\| {\tilde{w}}\right\| ^2_H + \left\| \mathcal {L}{\tilde{w}}\right\| ^2_H+N\left\| \mathcal {L}^{1/2}{\tilde{w}}\right\| ^2_H= - \langle \rho \tilde{F},\nabla {\tilde{w}}\rangle _H+\langle \rho \tilde{F},{\tilde{w}}\rangle _H. \end{aligned}$$
(48)

Using the Cauchy-Schwarz inequality in the energy space \(L^2(\rho )\), we furthermore notice that

$$\begin{aligned} \left| \langle \rho \tilde{F},\nabla {\tilde{w}}\rangle _H\right|&\le \left| \langle \rho {\tilde{F}},\nabla {\tilde{w}}\rangle \right| +\left| \langle \rho {\tilde{F}},\nabla \mathcal {L}{\tilde{w}}\rangle \right| \\ {}&\le \Vert \sqrt{\rho } {\tilde{F}}\Vert \left( \Vert \sqrt{\rho } \nabla {\tilde{w}}\Vert + \Vert \sqrt{\rho } \nabla \mathcal {L}{\tilde{w}}\Vert \right) \ \le \Vert \sqrt{\rho } {\tilde{F}} \Vert \left( \Vert {\tilde{w}}\Vert _H + \Vert \mathcal {L}{\tilde{w}}\Vert _{H}\right) \end{aligned}$$

and

$$\begin{aligned} \left| \langle \rho \tilde{F},{\tilde{w}} \rangle _H \right|&\le \left| \langle \rho \tilde{F}, {\tilde{w}}\rangle \right| + \left| \langle \rho \tilde{F},\mathcal {L}{\tilde{w}}\rangle \right| \\&\le \left\| \rho \tilde{F}\right\| \left( \left\| {\tilde{w}}\right\| +\left\| \mathcal {L}{\tilde{w}}\right\| \right) \le \left\| \rho \tilde{F}\right\| \left( \left\| {\tilde{w}}\right\| _H+\left\| \mathcal {L}^{1/2}{\tilde{w}}\right\| _H\right) . \end{aligned}$$

We now invoke Young’s inequality and the fact that \(\rho \le 1\) and we drop the non-negative lower-order term on the left-hand side to derive the differential inequality

$$\begin{aligned} \frac{d}{\text {d}t}\left\| {\tilde{w}}\right\| ^2_H + \left\| \mathcal {L}{\tilde{w}}\right\| ^2_H \lesssim \left\| \sqrt{\rho }\tilde{F}\right\| ^2 + \left\| {\tilde{w}}\right\| ^2_H. \end{aligned}$$

We deduce (47) with the help of the maximal regularity result of Lemma 6.1 and a Grönwall-type argument.
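The Grönwall step can be sketched explicitly. The following is only an illustration under the assumption that C denotes the implicit constant in the preceding differential inequality; the symbol C is introduced here and is not part of the original argument. Multiplying the inequality by \(e^{-Ct}\) and integrating over (0, t) with \(t\le T\) gives

$$\begin{aligned} \frac{d}{\text {d}t}\left( e^{-Ct}\left\| {\tilde{w}}\right\| ^2_H\right) + e^{-Ct}\left\| \mathcal {L}{\tilde{w}}\right\| ^2_H&\le C e^{-Ct}\left\| \sqrt{\rho }\tilde{F}\right\| ^2,\\ \left\| {\tilde{w}}(t)\right\| ^2_H + \int \limits _0^t \left\| \mathcal {L}{\tilde{w}}\right\| ^2_H\, \text {d}s&\le e^{CT}\left( \left\| w_0\right\| ^2_H + C\int \limits _0^T \left\| \sqrt{\rho }\tilde{F}\right\| ^2\, \text {d}s\right) . \end{aligned}$$

Since \(\Vert \sqrt{\rho }\tilde{F}\Vert = \Vert \rho \tilde{F}\Vert _{L^2}\), taking the supremum over \(t\in (0,T)\) and applying Lemma 6.1 to the dissipation term would then give an estimate of the form (47), with a constant \(C_T\) that grows at most like a multiple of \(e^{CT/2}\) in this sketch.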

To show well-posedness for the nonlinear problem, we apply a fixed point argument. The estimate in Lemma 6.2 shows that the nonlinearity \(F_{\varepsilon }[w]\) belongs to \( L^2((0,T);L^2(\rho ^2))\) whenever \(w\in L^\infty ((0,T);H)\) is given such that \(\nabla w, \rho \nabla ^2w \in L^2((0,T);H)\). By the linear theory, there thus exists a solution \(\tilde{w}=\tilde{w}(w,w_0)\) to the Cauchy problem (46) with \(\tilde{F} = F_{\varepsilon }[w]\), and the estimate (47) and Lemma 6.2 (applied to \(w_1=w\) and \(w_2=0\)) yield that

$$\begin{aligned}&\left\| \tilde{w}\right\| _{L^\infty \left( (0,T);H\right) }+\left\| \nabla \tilde{w}\right\| _{L^2\left( (0,T);H\right) }+\left\| \rho \nabla ^2\tilde{w}\right\| _{L^2\left( (0,T);H\right) } \\&\quad \le C_T\varepsilon \left( \left\| \rho \nabla ^2w\right\| _{L^2\left( (0,T);H\right) }+\left\| \nabla w\right\| _{L^2\left( (0,T);H\right) } +\left\| w\right\| _{L^\infty \left( \left( 0,T\right) ;H\right) }\right) +C_T\left\| w_0\right\| _{H}. \end{aligned}$$

Similarly, given \(w_1\) and \(w_2\) in the same class of functions, the difference of the corresponding solutions \(\tilde{w}_1\left( w_1,w_0\right) \) and \(\tilde{w}_2\left( w_2,w_0\right) \) to the associated linear problems is bounded by

$$\begin{aligned}&\left\| \rho \nabla ^2 \tilde{w}_1-\rho \nabla ^2 \tilde{w}_2\right\| _{L^2\left( H\right) }+\left\| \nabla \tilde{w}_1-\nabla \tilde{w}_2\right\| _{L^2\left( H\right) }+\left\| \tilde{w}_1-\tilde{w}_2\right\| _{L^\infty (H)} \\&\quad \le C_T\varepsilon \left( \left\| \rho \nabla ^2 w_1-\rho \nabla ^2 w_2\right\| _{L^2\left( H\right) }+\left\| \nabla w_1-\nabla w_2\right\| _{L^2\left( H\right) }+\left\| w_1-w_2\right\| _{L^\infty (H)}\right) . \end{aligned}$$

We conclude that, for \(\varepsilon \) sufficiently small, the mapping \(w\mapsto \tilde{w}(w,w_0)\) is a contraction on the space \(\left\{ w\in L^\infty \left( (0,T); H \right) \text { with } \nabla w \in L^2\left( (0,T); H\right) \text { and } \rho \nabla ^2w \in \right. \left. L^2\left( (0,T); H\right) \right\} \). An application of Banach’s fixed point theorem shows that there exists a unique solution w to the truncated problem (45) with initial datum \(w_0\in H\). We stress that the constructed solution is defined locally in time and that the size of the admissible \(\varepsilon \) depends on T. In what follows, we choose \(\varepsilon \) for \(T=1\) and show that the constructed solution can be extended globally in time.
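To illustrate the smallness condition in the contraction argument, one may abbreviate the norm of the solution space by \(\Vert w\Vert _{X_T} {:}{=} \Vert w\Vert _{L^\infty ((0,T);H)}+\Vert \nabla w\Vert _{L^2((0,T);H)}+\Vert \rho \nabla ^2 w\Vert _{L^2((0,T);H)}\); the shorthand \(X_T\) is introduced here only for this sketch. The difference estimate then reads

$$\begin{aligned} \left\| \tilde{w}_1-\tilde{w}_2\right\| _{X_T} \le C_T\varepsilon \left\| w_1-w_2\right\| _{X_T} \le \frac{1}{2}\left\| w_1-w_2\right\| _{X_T} \quad \text {whenever } \varepsilon \le \frac{1}{2C_T}. \end{aligned}$$

The threshold \(1/(2C_T)\) is merely one convenient choice; any \(\varepsilon \) with \(C_T\varepsilon <1\) would produce a contraction. This makes the dependence of the admissible \(\varepsilon \) on T visible through the constant \(C_T\) from (47).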

Our starting point is the estimate for the linear problem (48), in which we choose \({\tilde{w}}=w\) and \({\tilde{F}}=F_{\varepsilon }[w]\). In order to avoid a time-dependency in the estimate for w, we should estimate the nonlinearities slightly differently than above. We notice that the nonlinearity obeys the pointwise estimate

$$\begin{aligned} |F_{\varepsilon }[w]| \lesssim \rho |\nabla w||\nabla ^3 w| + \rho |\nabla ^2 w|^2 + |\nabla w| |\nabla ^2 w| + |\nabla w|^2 , \end{aligned}$$
(49)

which implies that

$$\begin{aligned} \Vert \rho F_{\varepsilon }[w]\Vert _{L^1} \lesssim \Vert \nabla w\Vert _{L^2} \Vert \rho ^2 \nabla ^3 w\Vert _{L^2} + \Vert \rho \nabla ^2 w\Vert _{L^2}^2 + \Vert \nabla w\Vert _{L^2} \Vert \rho \nabla ^2 w\Vert _{L^2} + \Vert \nabla w\Vert _{L^2}^2 \end{aligned}$$

via the Cauchy–Schwarz inequality. In view of the norm characterization in (25), the latter can be rewritten as

$$\begin{aligned} \Vert \rho F_{\varepsilon }[w]\Vert _{L^1} \lesssim \Vert \nabla w\Vert _H\left( \Vert \nabla w\Vert _H + \Vert \rho \nabla ^2 w\Vert _H\right) . \end{aligned}$$

We also notice that

$$\begin{aligned} |w| + |\nabla w| + |\mathcal {L}w|+ \rho |\nabla \mathcal {L}w| \lesssim |w| + |\nabla w| + \rho |\nabla ^2 w| + \rho ^2 |\nabla ^3 w| \lesssim \varepsilon \end{aligned}$$

in the support of the nonlinearity \(F_{\varepsilon }\) by our choice of the cut-off. Thanks to the previous two bounds, the nonlinear terms on the right-hand side of (48) are estimated as follows:

$$\begin{aligned}&\left| \langle \rho {F}_{\varepsilon }[w],\nabla w\rangle _H\right| +\left| \langle \rho {F}_{\varepsilon }[w], w\rangle _H\right| \\&\quad \le \left| \langle \rho {F}_{\varepsilon }[w], w\rangle \right| +\left| \langle \rho {F}_{\varepsilon }[w],\nabla w\rangle \right| +\left| \langle \rho {F}_{\varepsilon }[w],\mathcal {L}w\rangle \right| + \left| \langle \rho {F}_{\varepsilon }[w],\nabla \mathcal {L}w\rangle \right| \\&\quad \lesssim \varepsilon \Vert \rho F_{\varepsilon }[w]\Vert _{L^1}\\&\quad \lesssim \varepsilon \Vert \nabla w\Vert _H\left( \Vert \nabla w\Vert _H + \Vert \rho \nabla ^2 w\Vert _H\right) . \end{aligned}$$

Substitution into (48) thus yields

$$\begin{aligned} \frac{d}{\text {d}t} \Vert w\Vert _H^2 + \Vert \mathcal {L}w\Vert _{H}^2 \lesssim \varepsilon \Vert \nabla w\Vert _H\left( \Vert \nabla w\Vert _H + \Vert \rho \nabla ^2 w\Vert _H\right) , \end{aligned}$$

where we have again dropped the lower order term on the left-hand side. In view of the maximal regularity estimate from Lemma 6.1, the right-hand side can be absorbed into the left-hand side provided that \(\varepsilon \) is chosen sufficiently small. This gives

$$\begin{aligned} \frac{d}{\text {d}t} \Vert w\Vert _H^2 + \frac{1}{C}\Vert \mathcal {L}w\Vert _{H}^2 \le 0 \end{aligned}$$

for some \(C>1\), and the local solution can thus be extended globally for all times. The estimate in the assertion of the theorem follows. \(\square \)
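The absorption step in the preceding proof can be spelled out as follows. This is a sketch in which \(C_1\) stands for the implicit constant in the differential inequality and \(C_2\) for the constant in Lemma 6.1; both names are introduced here for illustration only. For \(\varepsilon \le 1/(2C_1C_2^2)\) one finds

$$\begin{aligned} C_1\varepsilon \Vert \nabla w\Vert _H\left( \Vert \nabla w\Vert _H + \Vert \rho \nabla ^2 w\Vert _H\right)&\le C_1\varepsilon \left( \Vert \nabla w\Vert _H + \Vert \rho \nabla ^2 w\Vert _H\right) ^2 \le C_1C_2^2\varepsilon \Vert \mathcal {L}w\Vert _H^2 \le \frac{1}{2}\Vert \mathcal {L}w\Vert _H^2,\\ \Vert w(t)\Vert _H^2 + \frac{1}{2}\int \limits _0^t \Vert \mathcal {L}w\Vert _H^2\, \text {d}s&\le \Vert w_0\Vert _H^2 \quad \text {for all } t\ge 0. \end{aligned}$$

Under these assumptions, the second line follows by integrating the resulting inequality \(\frac{d}{\text {d}t}\Vert w\Vert _H^2+\frac{1}{2}\Vert \mathcal {L}w\Vert _H^2\le 0\) in time, which corresponds to the choice \(C=2\). A further application of Lemma 6.1 to the dissipation term would then bound \(\nabla w\) and \(\rho \nabla ^2w\) in \(L^2((0,\infty );H)\), as claimed in Theorem 6.3.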

It will be crucial for our analysis to have some smoothing properties established for the truncated equation (45). This will be achieved in the following two lemmas:

Lemma 6.4

There exists \(\varepsilon ^*\), possibly smaller than in Theorem 6.3, such that for any \(0< \varepsilon \le \varepsilon ^*\) the following holds: If w is the solution to the truncated equation (45) with initial datum \(w_0\in H\), then

$$\begin{aligned}&\left\| \partial _tw\right\| _{L^q\left( (1/4,2);L^q(\rho )\right) } +\left\| w\right\| _{L^q\left( (1/4,2);L^q(\rho )\right) }+\left\| \nabla w\right\| _{L^q\left( (1/4,2);L^q(\rho )\right) }\\&\quad + \left\| \nabla ^2w\right\| _{L^q\left( (1/4,2);L^q(\rho )\right) } +\left\| \rho \nabla ^3w\right\| _{L^q\left( (1/4,2);L^q(\rho )\right) }+\left\| \rho ^2\nabla ^4w\right\| _{L^q\left( (1/4,2);L^q(\rho )\right) } \lesssim \left\| w_0\right\| _{H} \end{aligned}$$

for any \(q\in (1,\infty )\).

Proof

We will perform an iterative argument for which it is convenient to localize time on an arbitrary scale. For this purpose, we fix \(T\in (0,2)\) and introduce a smooth cut-off function \(\phi _1:\mathbb {R}^+_0\rightarrow [0,1]\), satisfying \(\phi _1(t)=0\) if \(t\le T\) and \(\phi _1(t)=1\) if \(t\ge 2T\). Of course, its growth rate is inversely proportional to the cut-off scale T, but having this quantity uniformly finite throughout the proof, we will simply write \(|\phi '_1|\lesssim 1\) for convenience. Smuggling \(\phi _1\) into the truncated equation (45) gives

$$\begin{aligned} \partial _t(w \phi _1) + \mathcal {L}^2(w\phi _1) + N\mathcal {L}(w\phi _1) = \rho ^{-1}\nabla \cdot \left( \rho ^2F_{\varepsilon }[w]\right) \phi _1 +\rho F_{\varepsilon }[w]\phi _1 + w\phi _1'. \end{aligned}$$

We note that \(w\phi _1\) has zero initial datum, which makes the maximal regularity theory for \(\mathcal {L}^2+N\mathcal {L}\) applicable: From (35) and elementary computations we infer the maximal regularity estimate

$$\begin{aligned} \begin{aligned}&\left\| \partial _t (w\, \phi _1)\right\| _{L^2\left( L^2(\rho )\right) }+\left\| \nabla ^2w\,\phi _1\right\| _{L^2\left( L^2(\rho )\right) }+\left\| \rho \nabla ^3w\,\phi _1\right\| _{L^2\left( L^2(\rho )\right) } + \left\| \rho ^2\nabla ^4w\, \phi _1\right\| _{L^2\left( L^2(\rho )\right) } \\&\quad \lesssim \Vert \eta _{\varepsilon } F\, \phi _1\Vert _{L^2(L^2(\rho ))} + \Vert \rho \nabla \eta _{\varepsilon }\, F \phi _1\Vert _{L^2(L^2(\rho ))} + \Vert \rho \eta _{\varepsilon } \nabla F\, \phi _1\Vert _{L^2(L^2(\rho ))} + \left\| w\phi _1'\right\| _{L^2\left( L^2(\rho )\right) }, \end{aligned} \end{aligned}$$
(50)

where we have set \(F=F_{\varepsilon }[w]\) and dropped the time interval (0, 2) in the norms for brevity. The final term on the right-hand side is easily controlled via the a priori estimates from Theorem 6.3 and the defining properties of the temporal cut-off \(\phi _1\); it holds that

$$\begin{aligned} \left\| w\phi _1'\right\| _{L^2\left( L^2(\rho )\right) } \lesssim \left\| w\right\| _{L^\infty \left( L^2(\rho )\right) } \le \Vert w\Vert _{L^{\infty }(H)} \lesssim \left\| w_0\right\| _{H}. \end{aligned}$$

For the first and the second term, we use the pointwise bound on the nonlinearity on the support of \(\eta _{\varepsilon }\),

$$\begin{aligned} |F[w]| \lesssim \varepsilon \left( |\nabla w| + |\nabla ^2 w| + \rho |\nabla ^3 w|\right) \lesssim \rho ^{-1}\varepsilon ^2, \end{aligned}$$
(51)

cf. (49). More precisely, plugging the first of the two estimates into the first term on the right-hand side of (50), we find that

$$\begin{aligned} \Vert \eta _{\varepsilon } F\, \phi _1\Vert _{L^2(L^2(\rho ))}&\lesssim \varepsilon \left( \Vert \nabla w \phi _1\Vert _{L^2(L^2(\rho ))}+ \Vert \nabla ^2 w \phi _1\Vert _{L^2(L^2(\rho ))} + \Vert \rho \nabla ^3 w \phi _1\Vert _{L^2(L^2(\rho ))}\right) . \end{aligned}$$

We interpolate the first term with the help of Lemma B.3 in the appendix, so that

$$\begin{aligned} \Vert \eta _{\varepsilon } F\, \phi _1\Vert _{L^2(L^2(\rho ))}&\lesssim \varepsilon \left( \Vert w \phi _1\Vert _{L^2(L^2(\rho ))}+ \Vert \nabla ^2 w \phi _1\Vert _{L^2(L^2(\rho ))} + \Vert \rho \nabla ^3 w \phi _1\Vert _{L^2(L^2(\rho ))}\right) . \end{aligned}$$

The last two terms on the right-hand side can be absorbed into the left-hand side of (50) if \(\varepsilon \) is chosen sufficiently small, while the first term is controlled by the initial datum through the energy estimate of Theorem 6.3.

To estimate the second term on the right-hand side of (50), we notice that

$$\begin{aligned} |\nabla \eta _{\varepsilon }| \lesssim 1 + \frac{1}{\varepsilon }\left( |\nabla ^2 w| + \rho |\nabla ^3 w| +\rho ^2 |\nabla ^4 w|\right) , \end{aligned}$$

and thus, using that \(\rho \le 1\) and the second estimate in (51), we find that

$$\begin{aligned}&\Vert \rho \nabla \eta _{\varepsilon }\, F \phi _1\Vert _{L^2(L^2(\rho ))}\\&\quad \lesssim \Vert \chi _{{{\,\textrm{supp}\,}}\eta _{\varepsilon } } \, F \phi _1\Vert _{L^2(L^2(\rho ))}\\&\qquad + \varepsilon \left( \Vert \nabla w \phi _1\Vert _{L^2(L^2(\rho ))} +\Vert \nabla ^2 w \phi _1\Vert _{L^2(L^2(\rho ))} +\Vert \rho \nabla ^3 w \phi _1\Vert _{L^2(L^2(\rho ))} \right) . \end{aligned}$$

The first term can be estimated as before and the second one can be absorbed into the left-hand side of (50) if \(\varepsilon \) is sufficiently small.

It remains to study the third term on the right-hand side of (50). Here, we find, after a small computation, that

$$\begin{aligned} \rho |\nabla F| \lesssim |F| +\varepsilon \left( |\nabla w|+ |\nabla ^2 w| + \rho |\nabla ^3w | +\rho ^2|\nabla ^4w|\right) . \end{aligned}$$

Hence, in view of the bound in (51), the only new term we have to deal with is the fourth-order term. This one, however, can be controlled in the same way as the second- and third-order terms before, by absorption into the left-hand side of (50).

Combining all the estimates that we discussed, adding the lower order term from the energy inequality in Theorem 6.3 to the left-hand side, making use of the interpolation inequality in Lemma B.3 in the appendix to include the first order spatial gradient and finally dropping all higher order terms, we arrive at

$$\begin{aligned} \left\| w\, \phi _1 \right\| _{L^2\left( L^2(\rho )\right) }+ \left\| \partial _t (w\, \phi _1)\right\| _{L^2\left( L^2(\rho )\right) }+\left\| \nabla ( w\,\phi _1)\right\| _{L^2\left( L^2(\rho )\right) } \lesssim \Vert w_0\Vert _H. \end{aligned}$$
(52)

We are now in the position to invoke the Sobolev inequality Lemma B.1 in the appendix, namely

$$\begin{aligned} \left\| w\right\| _{L^q\left( L^q(\rho )\right) }\lesssim \left\| \partial _tw\right\| _{L^p\left( L^{p}(\rho )\right) } + \left\| w\right\| _{L^p\left( L^{p}(\rho )\right) } + \left\| \nabla w\right\| _{L^p\left( L^{p}(\rho )\right) }, \end{aligned}$$

where the integrability exponents \(1\le p\le q<\infty \) are such that

$$\begin{aligned} 1-\frac{N+2}{p} = -\frac{N+2}{q}. \end{aligned}$$

In our situation, that is \(p=p_1=2\), we deduce from (52) the inequality

$$\begin{aligned} \left\| w\phi _1\right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) } \lesssim \left\| w_0\right\| _{H}, \end{aligned}$$
(53)

where we now have that \( q= q_1 = \frac{2(N+2)}{N}\).
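The value of \(q_1\) can be checked directly from the scaling relation between p and q. As a brief worked computation, rewriting the relation as \(\frac{1}{q}=\frac{1}{p}-\frac{1}{N+2}\) and inserting \(p=2\) gives

$$\begin{aligned} \frac{1}{q_1} = \frac{1}{2}-\frac{1}{N+2} = \frac{N}{2(N+2)}, \qquad \text {that is,}\quad q_1=\frac{2(N+2)}{N}. \end{aligned}$$

For instance, in the physically relevant case \(N=2\) this yields \(q_1=4\), while \(N=1\) gives \(q_1=6\). Each application of the Sobolev inequality thus lowers \(1/q\) by \(1/(N+2)\), which suggests why finitely many iterations, with a number depending only on N, suffice to reach any prescribed integrability exponent.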

In order to further increase the order of integrability, we have to use the maximal regularity estimate in \(L^q\), see (35). We introduce a new smooth cut-off function \(\phi _2:\mathbb {R}^+_0\rightarrow [0,1]\), such that \(\phi _2(t)=0\) if \(t\le 2T\) and \(\phi _2(t)=1\) if \(t\ge 3T\). Using the maximal regularity estimate for \(w\phi _2\) and \(q_1\), we get

$$\begin{aligned}&\left\| \partial _t(w\phi _2)\right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) } + \left\| \nabla ^2(w\phi _2)\right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) }+\left\| \rho \nabla ^3(w\phi _2)\right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) }+\left\| \rho ^2\nabla ^4(w\phi _2)\right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) } \\&\quad \lesssim \Vert \eta _{\varepsilon } F\, \phi _2\Vert _{L^{q_1}(L^{q_1}(\rho ))} + \Vert \rho \nabla \eta _{\varepsilon }\, F \phi _2\Vert _{L^{q_1}(L^{q_1}(\rho ))} + \Vert \rho \eta _{\varepsilon } \nabla F\, \phi _2\Vert _{L^{q_1}(L^{q_1}(\rho ))}+ \left\| w\phi _2'\right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) }. \end{aligned}$$

The treatment of the right-hand side is almost identical to the \(p=2\) case, except that now estimate (53) is invoked where the energy estimate was used before. We eventually arrive at

$$\begin{aligned} \left\| w\, \phi _2 \right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) }+ \left\| \partial _t (w\, \phi _2)\right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) }+\left\| \nabla ( w\,\phi _2)\right\| _{L^{q_1}\left( L^{q_1}(\rho )\right) } \lesssim \Vert w_0\Vert _H, \end{aligned}$$

and we may use the Sobolev inequality once more with \(p_2 \le \min \{q_1,N+2\}\). By iterating this procedure, the order of integrability can be further increased. After finitely many steps, depending only on the space dimension, and by choosing T carefully, the statement follows. \(\square \)

Theorem 6.3 shows that the truncated equation generates a global semiflow in the Hilbert space setting. We define \(S^t_\varepsilon : H \rightarrow H\) as the corresponding flow map,

$$\begin{aligned} S_{\varepsilon }^t(w_0) = w(t,\cdot ) \end{aligned}$$

where w is the unique solution to the truncated nonlinear problem (45) with initial datum \(w_0\). Our invariant manifold construction is based on that flow. More precisely, we choose to consider a discrete time setting by working with the time-one map rather than with the continuous flow. Compared to constructing the manifolds for the semiflow directly, this has the advantage that the differentiability of the time-one map is a weaker property than its counterpart for flows, the variation of constants formula. We write \(S_\varepsilon {:}{=}S_\varepsilon ^1\).

The main regularity results for the perturbation variable w are stated uniformly in time and space, while our invariant manifold theory will rely on Hilbert spaces. Connecting the two settings requires suitable smoothing estimates. We will establish them in the next lemma, which we improve after one time step. As we are interested in the long-time behavior, such a delayed smoothing statement does not cause any problems.

Lemma 6.5

Let \(\varepsilon ^*\) be as in Lemma 6.4 and \(\varepsilon \le \varepsilon ^*\). For any \(w_0\in H\) the following holds: If \(w(t)= S_\varepsilon ^t(w_0)\) is the solution to the truncated equation, then

$$\begin{aligned} \left\| w(t)\right\| _{L^\infty } +\left\| \nabla w(t)\right\| _{L^\infty } + \left\| \rho \nabla ^2w(t)\right\| _{L^\infty }+\left\| \rho ^2\nabla ^3w(t)\right\| _{L^\infty } \lesssim \left\| w_0\right\| _{H} \end{aligned}$$

for all \(t\ge 1/2\). In particular, this yields \(\Vert S_\varepsilon (w_0)\Vert _{W}\lesssim \left\| w_0\right\| _{H}\). Moreover, there exists \(\varepsilon ^0\le \min \left\{ \varepsilon ,\varepsilon _0\right\} \) such that \(S^t_\varepsilon \left( S_\varepsilon (w_0)\right) =S^t\left( S_\varepsilon (w_0)\right) \) for \(t>0\), provided that \(\left\| w_0\right\| _H\le \varepsilon ^0\).

Proof

Due to the Morrey-type embedding inequality B.4 in the appendix, we have that \(\left\| w\right\| _{L^\infty }\lesssim \left\| w\right\| _{L^q(\rho )}+\left\| \nabla w\right\| _{L^q(\rho )}\), provided that q is sufficiently large. We can extend this estimate to higher order derivatives and find

$$\begin{aligned} \Vert w\Vert _{W} \lesssim \left\| w\right\| _{L^q(\rho )}+\left\| \nabla w\right\| _{L^q(\rho )} + \Vert \nabla ^2 w\Vert _{L^{q}(\rho )} + \Vert \rho \nabla ^3 w\Vert _{L^{q}(\rho )}+ \Vert \rho ^2\nabla ^4 w\Vert _{L^{q}(\rho )}. \end{aligned}$$
(54)

Thus, in order to establish the asserted estimate, we have to improve the estimate in Lemma 6.4 to a pointwise-in-time statement. For this, we invoke a simple construction.

For an arbitrarily given function \(f\in L^q(1/4,1/2)\), we consider the set

$$\begin{aligned} J_f=\left\{ t\in (1/4,1/2): \left| f(t)\right| >8\left\| f\right\| _{L^q(1/4,1/2)}\right\} . \end{aligned}$$

By Chebyshev’s inequality, it holds that

$$\begin{aligned} \left\| f\right\| _{L^q(1/4,1/2)}\ge \left\| f\right\| _{L^q(J_f)} \ge 8\left\| f\right\| _{L^q(1/4,1/2)} |J_f|^{1/q}, \end{aligned}$$

where \(\left| \cdot \right| \) denotes the Lebesgue measure, and thus, \(|J_f|\le \left( 1/8\right) ^q\). Moreover, since \(q\ge 1\), we have also an estimate on the complementary set in (1/4, 1/2), namely \(|J_f^c|\ge 1/4-\left( 1/8\right) ^q \ge 1/8\). Applying this estimate to the function \(f(t)=\left\| w(t)\right\| _{L^q(\rho )}+\left\| \nabla w(t)\right\| _{L^q(\rho )}+\left\| \nabla ^2 w(t)\right\| _{L^q(\rho )}+\left\| \rho \nabla ^3w(t)\right\| _{L^q(\rho )}+\left\| \rho ^2\nabla ^4w(t)\right\| _{L^q(\rho )}\) and using the above estimate (54), we find that

$$\begin{aligned}&\left\| w\right\| _{L^\infty (J_f^c;W)}\lesssim \Vert f \Vert _{L^{\infty }(J_f^c)} \lesssim \Vert f\Vert _{L^{q}(1/4,1/2)}\\&\lesssim \left\| w\right\| _{L^q\left( (1/4,1/2);L^q(\rho )\right) }+\left\| \nabla w\right\| _{L^q\left( (1/4,1/2);L^q(\rho )\right) }+ \left\| \nabla ^2w\right\| _{L^q\left( (1/4,1/2);L^q(\rho )\right) }\\&\quad +\left\| \rho \nabla ^3w\right\| _{L^q\left( (1/4,1/2);L^q(\rho )\right) }+\left\| \rho ^2\nabla ^4w\right\| _{L^q\left( (1/4,1/2);L^q(\rho )\right) } . \end{aligned}$$

By virtue of Lemma 6.4, the right-hand side is bounded by \(\Vert w_0\Vert _H\). This shows that there exists a time \(\hat{t}\in (1/4,1/2)\) such that

$$\begin{aligned} \Vert w(\hat{t})\Vert _{W} = \Vert S^{\hat{t}}_\varepsilon (w_0)\Vert _{W} \le C \Vert w_0\Vert _{H}. \end{aligned}$$
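The existence of such a time \(\hat{t}\) rests on the fact that the complement of \(J_f\) has positive measure. As a short worked version of the Chebyshev step above, assuming \(\Vert f\Vert _{L^q(1/4,1/2)}>0\) (otherwise f vanishes almost everywhere and \(J_f\) is a null set), one has

$$\begin{aligned} \left\| f\right\| ^q_{L^q(1/4,1/2)} \ge \int \limits _{J_f} |f(t)|^q\, \text {d}t&\ge 8^q\left\| f\right\| ^q_{L^q(1/4,1/2)}\, |J_f|,\\ |J_f| \le 8^{-q} \le \frac{1}{8}, \qquad |J_f^c|&= \frac{1}{4}-|J_f| \ge \frac{1}{8}>0. \end{aligned}$$

In particular, \(J_f^c\) is nonempty, and any \(\hat{t}\in J_f^c\) at which (54) applies satisfies \(|f(\hat{t})|\le 8\Vert f\Vert _{L^q(1/4,1/2)}\), which is the pointwise-in-time bound used above.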

Now suppose that \(\left\| w_0\right\| _H\le \varepsilon ^0\). From Theorems 5.1 and 5.3 we know that the nonlinear flow \(S^t(w_0)\) can be controlled in W by its initial datum \(w_0\) in the W-norm, that is, \(\Vert S^t(w_0)\Vert _{W} \le \tilde{C}\Vert w_0\Vert _{W}\) for every \(t\ge 0\), provided that \(\Vert w_0\Vert _{W}\) is sufficiently small. If we now choose \(\varepsilon ^0\) in such a way that \(\tilde{C}C\varepsilon ^0\le \varepsilon \), we obtain that \(S^t_\varepsilon \left( S^{\hat{t}}_\varepsilon (w_0)\right) = S^t\left( S_\varepsilon ^{\hat{t}}(w_0)\right) \) for every \(t\ge 0 \), and thus

$$\begin{aligned} \Vert S^{\hat{t}+t}_\varepsilon (w_0)\Vert _{W} = \Vert S^t\left( S_\varepsilon ^{\hat{t}}(w_0)\right) \Vert _{W}\le \tilde{C}\Vert S^{\hat{t}}_\varepsilon (w_0)\Vert _{W}\le \tilde{C}C\Vert w_0\Vert _{H} \end{aligned}$$

for every \(t\ge 0\). Since \(\hat{t}\in (1/4,1/2)\), this gives the result. \(\square \)

By construction of the solution in Theorem 6.3, we know that \(S_\varepsilon \) is Lipschitz-continuous. We decompose the global flow \(S_\varepsilon \) into a linear and nonlinear part

$$\begin{aligned} S_\varepsilon = L + R_\varepsilon , \quad \text { where } L{:}{=}e^{-\left( \mathcal {L}^2+N\mathcal {L}\right) }. \end{aligned}$$

As a difference of Lipschitz continuous functions, \(R_\varepsilon \) is Lipschitz continuous as well. In fact, its Lipschitz constant can be estimated in terms of \(\varepsilon \), so that \(R_\varepsilon \) becomes a contraction if \(\varepsilon \) is sufficiently small.

Lemma 6.6

Let \(\varepsilon ^*>0\) be as in Lemma 6.4 and \(0<\varepsilon \le \varepsilon ^*\). Then, for any g, \(\tilde{g} \in H\) it holds that

$$\begin{aligned} \left\| R_\varepsilon (g)-R_\varepsilon (\tilde{g})\right\| _{H}\lesssim \varepsilon \left\| g-\tilde{g}\right\| _{H}. \end{aligned}$$

Proof

Let g, \(\tilde{g}\in H\) be given. Then \(w(t,x)=S_\varepsilon ^t(g)\) and \(\tilde{w}(t,x)=S_\varepsilon ^t(\tilde{g})\) solve the truncated problem (45) with initial data g and \(\tilde{g}\), respectively. We set \(v(t)=w(t)-L^tg\), where \(L^tg\) is the solution to the linear problem with initial datum g, so that, in particular, \(v(1,x)=R_\varepsilon (g)\). Analogously, we define \(\tilde{v}\). Then \(v-\tilde{v}\) solves the equation

$$\begin{aligned}&\partial _t(v-\tilde{v}) + \mathcal {L}^2(v-\tilde{v})+N\mathcal {L}(v-\tilde{v}) \\&\quad = \frac{1}{\rho }\nabla \cdot \left( \rho ^2\left( F_{\varepsilon }[w]- F_{\varepsilon }[\tilde{w}] \right) \right) + \rho \left( F_{\varepsilon }[w]-F_{\varepsilon }[\tilde{w}]\right) , \end{aligned}$$

with zero initial datum. With the help of estimate (47) from the proof of Theorem 6.3 we deduce that

$$\begin{aligned}&\left\| v(1)-\tilde{v}(1)\right\| _{H} + \left\| \nabla v-\nabla \tilde{v}\right\| _{L^2\left( (0,1);H\right) } + \left\| \rho \nabla ^2v-\rho \nabla ^2\tilde{v}\right\| _{L^2\left( (0,1);H\right) }\\&\quad \lesssim \left\| \rho F_{\varepsilon }[w] -\rho F_{\varepsilon }[\tilde{w}]\right\| _{L^2\left( (0,1);L^2\right) } \\&\quad \lesssim \varepsilon \left( \left\| w-\tilde{w}\right\| _{L^\infty \left( (0,1);H\right) } + \left\| \nabla w-\nabla \tilde{w}\right\| _{L^2\left( (0,1);H\right) } + \left\| \rho \nabla ^2w-\rho \nabla ^2\tilde{w}\right\| _{L^2\left( (0,1);H\right) } \right) , \end{aligned}$$

where we used Lemma 6.2 in the last step. Since \(S_\varepsilon \) is Lipschitz continuous, the right-hand side is controlled by \(\varepsilon \Vert g-\tilde{g}\Vert _{H}\). This finishes the proof. \(\square \)

Additionally, we would like to know that \(R_\varepsilon \) is quadratic near the origin. The superlinear behavior entails the differentiability of \(R_\varepsilon \) at the origin, with derivative zero. Neither this information nor the regularity will be necessary for our construction of the invariant manifolds. However, as we will see, it provides the additional geometric insight that the center manifold \(W_\varepsilon ^c\) touches the center eigenspace \(E_c\) tangentially, see Theorem 7.1. The proof of the quadratic estimate is rather technical and exploits smoothing properties of the nonlinear flow. We are able to show the quadratic behavior after a regularizing time step, in a similar way as in Lemma 6.5, which is still sufficient for our purpose.

Lemma 6.7

Let \(\varepsilon ^*\) be as in Lemma 6.4. For all \(0<\varepsilon \le \varepsilon ^*\) and every \(g\in H\) it holds that

$$\begin{aligned} \left\| R_\varepsilon \left( S_\varepsilon (g)\right) \right\| _{H} \lesssim \left\| g\right\| ^2_{H}. \end{aligned}$$

Proof

Let \(w(t,x)=S^t_\varepsilon (g)\) and set \(W(t,x) = w(t+1,x)\), which yields \(W(0,\cdot )=S_\varepsilon (g)\). Let v solve the initial value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _tv+\mathcal {L}^2v+N\mathcal {L}v &{}= \frac{1}{\rho } \nabla \cdot \left( \rho ^2F_{\varepsilon }[W]\right) +\rho F_{\varepsilon }[W]\quad \text { in } (0,\infty )\times B_1(0)\\ v(0,\cdot )&{}=0 \quad \text { in } B_1(0), \end{array}\right. } \end{aligned}$$

so that \( v(1,\cdot ) = R_\varepsilon \left( W(0,\cdot )\right) = R_\varepsilon \left( S_\varepsilon (g)\right) \). Thanks to the proof of Theorem 6.3, more precisely estimate (47), we know that

$$\begin{aligned} \left\| v(1)\right\| ^2_{H}&\lesssim \int \limits _0^1 \left\| \sqrt{\rho }F_{\varepsilon }[W]\right\| ^2\, \text {d}t = \int \limits _1^2 \left\| \sqrt{\rho }F_{\varepsilon }[w]\right\| ^2\, \text {d}t, \end{aligned}$$

and by virtue of the pointwise estimate (49) and Young’s inequality, we deduce

$$\begin{aligned} \left\| v(1)\right\| _{H}&\lesssim \left\| \rho \nabla ^3w\right\| _{L^4\left( (1,2);L^4(\rho ^2)\right) }^2 +\left\| \nabla ^2w\right\| _{L^4\left( (1,2);L^4(\rho ^2)\right) }^2+\left\| \nabla w\right\| _{L^4\left( (1,2);L^4(\rho ^2)\right) }^2. \end{aligned}$$

It thus remains to invoke the smoothing property from Lemma 6.4 with \(q=4\) and the bound \(\rho \le 1\) in order to prove the lemma. \(\square \)
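The passage from (49) to the \(L^4\) bound can be illustrated on the leading term. The following sketch assumes, as in the proof of Lemma 6.2, that \(\Vert \cdot \Vert \) denotes the norm of \(L^2(\rho )\), so that \(\Vert \sqrt{\rho }F\Vert ^2=\int \rho ^2|F|^2\,\text {d}x\). By the Cauchy–Schwarz and Young inequalities,

$$\begin{aligned} \int \rho ^2\left( \rho |\nabla w||\nabla ^3w|\right) ^2 \text {d}x&= \int \left( \rho |\nabla w|^2\right) \left( \rho ^3|\nabla ^3w|^2\right) \text {d}x \\&\le \left( \int \rho ^2|\nabla w|^4\, \text {d}x\right) ^{1/2}\left( \int \rho ^2|\rho \nabla ^3w|^4\, \text {d}x\right) ^{1/2}\\&\le \frac{1}{2}\left( \left\| \nabla w\right\| ^4_{L^4(\rho ^2)}+\left\| \rho \nabla ^3w\right\| ^4_{L^4(\rho ^2)}\right) . \end{aligned}$$

The remaining terms in (49) can be treated in the same manner, using \(\rho \le 1\) where needed. Integrating in time over (1, 2) and taking square roots then turns the fourth powers into the squared \(L^4\) norms in the bound for \(\Vert v(1)\Vert _H\). Since \(\rho ^2\le \rho \), each of these norms is controlled by the corresponding \(L^4(\rho )\) norm, which Lemma 6.4 with \(q=4\) bounds by \(\Vert g\Vert _H\).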

Lemma 6.8

Let \(\varepsilon ^*\) be as in Lemma 6.4 and \(\varepsilon \le \varepsilon ^*\). Let \(\varepsilon ^0 \le \min \left\{ \varepsilon , \varepsilon ^*\right\} \) be as in Lemma 6.5. Then, for any \(g, \tilde{g}\in H^1_{1,2}\) with \(\left\| g\right\| _{H},\left\| \tilde{g}\right\| _{H}\le \varepsilon ^0\) it holds that

$$\begin{aligned} \left\| S^{\hat{t}}_\varepsilon (g)-S_\varepsilon ^{\hat{t}}(\tilde{g}) \right\| _{W} \lesssim \left\| g-\tilde{g}\right\| _{H} \end{aligned}$$

for some \({\hat{t}}\in (\frac{4}{5},1)\).

Proof

Similar to the previous proof, we will make use of a maximal regularity estimate for the linear equation. However, this proof will be less technical, because the previous lemma, combined with a result of [65], will allow us to consider the flow without the cut-off function \(\eta _\varepsilon \).

Let \(w(t)=S^t_\varepsilon (g)\) and \({\tilde{w}}(t)=S^t_\varepsilon (\tilde{g})\). By Lemma 6.5, we know that \(\left\| w(t)\right\| _{W} \lesssim \left\| g\right\| _{H}\) for every \(t\ge 1/2\). At this point we invoke Theorem 2 of [65] to gain better control (in terms of \(\rho \)) on the higher derivatives: it guarantees that the unique solution w of the full nonlinear perturbation equation (19) with sufficiently small initial data g satisfies \(\left| \nabla ^2w(t,x)\right| +\left| \rho \nabla ^3w(t,x)\right| +\left| \rho ^2\nabla ^4w(t,x)\right| \lesssim t^{-\kappa } \left\| g\right\| _{W^{1,\infty }}\) for some \(\kappa >0\). Applying this result with w(1/2) as the initial datum, we obtain the estimate

$$\begin{aligned}&\left\| w(t)\right\| _{L^\infty }+\left\| \nabla w(t)\right\| _{L^\infty }+\left\| \nabla ^2 w(t)\right\| _{L^\infty }+\left\| \rho \nabla ^3w(t)\right\| _{L^\infty }+\left\| \rho ^2\nabla ^4w(t)\right\| _{L^\infty }\nonumber \\&\quad \lesssim \left\| g\right\| _{H}\le \varepsilon ^0 \end{aligned}$$
(55)

uniformly in time for every \(t\ge 3/4\). The same holds true for \({\tilde{w}}(t)\) and \(\tilde{g}\). That is, for \(t\ge 3/4\) both w(t) and \({\tilde{w}}(t)\) solve the full nonlinear equation.

We now introduce \(v=w-{\tilde{w}}\), which solves the initial value problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t v+\mathcal {L}^2v+N\mathcal {L}v &{}= \rho ^{-1} \nabla \cdot \left( \rho ^2 \left( F_1[w]-F_1[{\tilde{w}}]\right) \right) +\rho \left( F_2[w]-F_2[{\tilde{w}}]\right) ,\\ v(0,\cdot )&{}=g-\tilde{g}. \end{array}\right. } \end{aligned}$$

Arguing as in the proof of Lemma 6.4, but using (55) instead of the truncation, we arrive at

$$\begin{aligned}&\left\| \partial _t v\right\| _{L^q\left( (4/5,1);L^q(\rho )\right) } +\left\| v\right\| _{L^q\left( (4/5,1);L^q(\rho )\right) }+\left\| \nabla v\right\| _{L^q\left( (4/5,1);L^q(\rho )\right) }\\&\quad + \left\| \nabla ^2v\right\| _{L^q\left( (4/5,1);L^q(\rho )\right) } +\left\| \rho \nabla ^3 v \right\| _{L^q\left( (4/5,1);L^q(\rho )\right) }+\left\| \rho ^2\nabla ^4v\right\| _{L^q\left( (4/5,1);L^q(\rho )\right) } \lesssim \left\| g-\tilde{g}\right\| _{H} \end{aligned}$$

for any \(q\in (1,\infty )\). Lastly, we proceed as in the proof of Lemma 6.5 to prove the existence of a \(\hat{t}\in (4/5,1)\), such that

$$\begin{aligned} \Vert w({\hat{t}}) - {\tilde{w}}({\hat{t}})\Vert _{W} = \Vert v({\hat{t}})\Vert _{W} \lesssim \left\| g-\tilde{g}\right\| _{H}. \end{aligned}$$

\(\square \)

7 Dynamical System Arguments

In this part we will construct invariant manifolds and prove Theorem 3.2. We begin with a heuristic picture of the concept; see also Figure 4 for a geometric illustration. The center manifold, see Proposition 7.1, can be represented as the graph of a Lipschitz continuous function over the finite-dimensional center eigenspace, and it touches the center eigenspace tangentially at the origin. Here, the center eigenspace is the subspace of H spanned by the eigenfunctions of the first \(K+1\) eigenvalues of \(\mathcal {L}^2+N\mathcal {L}\), where K is an arbitrarily fixed nonnegative integer. Solutions to the truncated flow that lie on the center manifold remain on it for all subsequent times. The stable manifolds, see Proposition 7.3, intersect the center manifold in exactly one point, and they thus form a foliation of the underlying Hilbert space H over the center manifold. This foliation is invariant under the flow. The stable manifolds can be described as (displaced) graphs over the stable eigenspace, that is, the orthogonal complement of the center eigenspace. Given an arbitrary solution to the truncated perturbation equation, our construction provides a solution that approximates the given one with an exponential rate of at least \(\mu _{K}\).

Fig. 4

The flow \(S^t_\varepsilon (g)\) starting from an arbitrary \(g\in H\) approaches the flow \(S^t_\varepsilon (\tilde{g})\) starting from the unique intersection point \(\tilde{g}\) of \(W_\varepsilon ^c\) and \(M_g^\varepsilon \); the latter flow stays on the center manifold, and its long-time behavior dominates the asymptotics of \(S^t_\varepsilon (g)\).

Throughout this section, we fix \(\varepsilon ^*\) as in Lemma 6.4 and choose some \( \varepsilon ^0 \le \min \left\{ \varepsilon , \varepsilon _0\right\} \) as in Lemma 6.5. With these choices, all results from the previous two sections are applicable.

The linear operator \(\mathcal {L}^2+N\mathcal {L}\) and the associated semi-flow operator \(L = e^{-\mathcal {L}^2-N\mathcal {L}}\) share the same eigenfunctions, and an eigenvalue \(\mu \) of \(\mathcal {L}^2+N\mathcal {L}\) turns into the eigenvalue \(e^{-\mu }\) of L. We recall that all spectral information is contained in Theorem 2.2. The fact that the spectrum is discrete will facilitate our analysis substantially.

In our construction of the invariant manifolds, we follow an approach by Koch, see [46], and mainly stick to his notation. From now on we keep \(K\in \mathbb {N}_0\) fixed, and we denote by \(E_c\) the finite-dimensional subspace of H spanned by the eigenfunctions corresponding to the eigenvalues \(\{\mu _0,\dots ,\mu _K\}\), which we call the center eigenspace. The projection of H onto the space \(E_c\) is denoted by \(P_c\). The stable eigenspace \(E_s\) is defined as the orthogonal complement of the center eigenspace, that is \(E_s {:}{=}E_c^\perp \), such that \(H=E_c \oplus E_s\), and \(P_s=1-P_c\). We denote the restriction of L to \(E_s\) by \(L_s\); it can be estimated via \(\left\| L_s\right\| _H\le e^{-\mu _{K+1}}\). Indeed, for \(w \in H\), it holds that

$$\begin{aligned} \left\| L_sw\right\| ^2_{H} = \sum \limits _{k>K}\sum _{l} \langle L w, \psi _{k,l}\rangle ^2_H = \sum \limits _{k>K}\sum _{l} e^{-2\mu _k} \langle w, \psi _{k,l} \rangle ^2_H \le e^{-2\mu _{K+1}} \left\| w\right\| ^2_{H}, \end{aligned}$$

where the \(\psi _{k,l}\) are the eigenfunctions corresponding to \(\mu _k\). For \(L_c\), the restriction of L to \(E_c\), we similarly obtain \(\left\| L_c^{-1}\right\| \le e^{\mu _K}\). Indeed, for \(w\in E_c\) we have

$$\begin{aligned} \left\| L_c^{-1}w\right\| ^2_{H} = \sum \limits _{k\le K} \sum _{l}\langle L^{-1} w,\psi _{k,l} \rangle ^2_H = \sum \limits _{k \le K}\sum _l e^{2\mu _k}\langle w, \psi _{k,l}\rangle ^2_H \le e^{2\mu _K}\left\| w\right\| ^2_{H}. \end{aligned}$$

We define

$$\begin{aligned} \Lambda _c = e^{-\mu _K}, \quad \Lambda _s = e^{-\mu _{K+1}} \quad \text { and } \Lambda _{max}=1 \end{aligned}$$

and conclude

$$\begin{aligned} \begin{aligned} \left\| L_c^{-1}\right\|&\le \Lambda _c^{-1}\quad \text { or }\quad \Lambda _c\left\| w\right\| _{H} \le \left\| Lw\right\| _{H} \text { for all } w\in E_c, \\ \left\| L_s\right\|&\le \Lambda _s \quad \ \ \text { or }\ \ \quad \left\| Lw\right\| _{H} \le \Lambda _s\left\| w\right\| _{H} \text { for all } w \in E_s,\\ \text { and }\quad \left\| L\right\|&\le \Lambda _{max} \ \ \text { or }\quad \ \ \left\| Lw\right\| _{H} \le \left\| w\right\| _{H} \text { for all }w\in H. \end{aligned} \end{aligned}$$
(56)
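The two formulations in each line of (56) are equivalent; for the first line this can be seen in one step. As a short check, using only that \(E_c\) is spanned by eigenfunctions of L (so that \(Lw=L_cw\in E_c\) for \(w\in E_c\)), we have

$$\begin{aligned} \left\| w\right\| _{H} = \left\| L_c^{-1}L_cw\right\| _{H} \le \Lambda _c^{-1}\left\| Lw\right\| _{H} \quad \text { for all } w\in E_c, \end{aligned}$$

which is the lower bound \(\Lambda _c\left\| w\right\| _{H}\le \left\| Lw\right\| _{H}\). Conversely, inserting \(w=L_c^{-1}{\tilde{w}}\) with \({\tilde{w}}\in E_c\) into the lower bound recovers \(\left\| L_c^{-1}{\tilde{w}}\right\| _{H}\le \Lambda _c^{-1}\left\| {\tilde{w}}\right\| _{H}\).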

We arbitrarily choose \(\Lambda _s<\Lambda _- = e^{-\mu _-} <\Lambda _c\) with \(\mu _-< \mu _{K+1}<2\mu _-\) and \(\Lambda _{max} < \Lambda _+\) and introduce the following norms, that will be used for the construction of the manifolds:

  • For \(w \in H\) we define \({\left| \!\left| \!\left| w\right| \!\right| \!\right| } {:}{=}\max \left\{ \left\| P_cw\right\| _{H}, \left\| P_sw\right\| _{H} \right\} \).

  • For \(\left\{ w_k\right\} _{k\in \mathbb {Z}} \subseteq H\) we set \(\left\| \left\{ w_k\right\} _{k\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}{:}{=}\sup \limits _{k\in \mathbb {N}_0} \max \left\{ \Lambda _+^{-k}{\left| \!\left| \!\left| w_k\right| \!\right| \!\right| }, \Lambda _-^k{\left| \!\left| \!\left| w_{-k}\right| \!\right| \!\right| }\right\} .\)

  • For \(\left\{ w_k\right\} _{k\in \mathbb {N}_0} \subseteq H\) we set \(\left\| \left\{ w_k\right\} _{k\in \mathbb {N}_0}\right\| _{\Lambda _-,+}{:}{=}\sup \limits _{k\in \mathbb {N}_0} \Lambda _-^{-k}{\left| \!\left| \!\left| w_k\right| \!\right| \!\right| }.\)

The corresponding Banach spaces of sequences are denoted by \(\ell _{\Lambda _-,\Lambda _+}\) and \(\ell _{\Lambda _-,+}\), respectively.
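It may help to unpack what membership in these spaces means. As a sketch, writing A and B for the values of the two sequence norms (these labels are introduced here only for illustration), the definitions are equivalent to the pointwise bounds

$$\begin{aligned} {\left| \!\left| \!\left| w_k\right| \!\right| \!\right| }&\le A\,\Lambda _+^{k} \quad \text { and } \quad {\left| \!\left| \!\left| w_{-k}\right| \!\right| \!\right| } \le A\,\Lambda _-^{-k} \quad \text { for all } k\in \mathbb {N}_0 \quad \text { in } \ell _{\Lambda _-,\Lambda _+},\\ {\left| \!\left| \!\left| w_k\right| \!\right| \!\right| }&\le B\,\Lambda _-^{k} \quad \text { for all } k\in \mathbb {N}_0 \quad \text { in } \ell _{\Lambda _-,+}. \end{aligned}$$

Since \(\mu _-<\mu _{K+1}<2\mu _-\) forces \(\mu _->0\), we have \(\Lambda _-<1=\Lambda _{max}<\Lambda _+\). Thus sequences in \(\ell _{\Lambda _-,\Lambda _+}\) may grow at most like \(\Lambda _+^{k}\) forward in time and at most like \(\Lambda _-^{-k}\) backward in time, while sequences in \(\ell _{\Lambda _-,+}\) decay at least like \(\Lambda _-^{k}\).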

Our first result is the construction of the center manifold.

Proposition 7.1

(Center manifold) Fix \(\Lambda _- = e^{-\mu _-}\) in \(\left( \Lambda _s,\Lambda _c \right) \). Let \(\varepsilon _{gap}>0\) be such that

$$\begin{aligned} \Lambda _s+\varepsilon _{gap}<\Lambda _-<\Lambda _c-\varepsilon _{gap}\quad \text { and } \quad \Lambda _{max} +\varepsilon _{gap} < \Lambda _+. \end{aligned}$$
(57)

Choose \(\varepsilon \le \varepsilon ^*\) sufficiently small, such that

$$\begin{aligned} {{\,\textrm{Lip}\,}}\left( R_\varepsilon \right) \le \varepsilon _{gap}. \end{aligned}$$
(58)

(If necessary, choose \(\varepsilon ^0 \le \min \left\{ \varepsilon , \varepsilon _0\right\} \) even smaller according to Lemma 6.5.) Then there exists a function \(\theta _\varepsilon : E_c \rightarrow E_s\) with \(\theta _\varepsilon (0)=0\), that is differentiable at zero with \(D\theta _\varepsilon (0)=0\), and the submanifold

$$\begin{aligned} W_\varepsilon ^c {:}{=}\left\{ w_c + \theta _\varepsilon \left( w_c\right) : w_c \in E_c \right\} \end{aligned}$$

satisfies the following conditions:

  1. The function \(\theta _\varepsilon \) is a contraction with \({{\,\textrm{Lip}\,}}\left( \theta _\varepsilon \right) \lesssim \varepsilon _{gap}\) and \(\left\| \theta _\varepsilon \left( g_c\right) \right\| _{H} \lesssim \left\| g_c\right\| _{H}^{\alpha }\) for all \(g_c \in E_c\) for some \(1<\alpha <\frac{\mu _{K+1}}{\mu _-}\). Moreover, it holds that \(\left\| \theta _{\varepsilon }(g_c)\right\| _{W}\lesssim {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\).

  2. If the semiflow \(\left\{ S_\varepsilon ^t\right\} _{t\ge 0}\) is restricted to \(W_\varepsilon ^c\), it can be extended to an eternal Lipschitz flow on \(W_\varepsilon ^c\). More precisely, it holds that \(S_\varepsilon ^t\left( W_\varepsilon ^c\right) = W_\varepsilon ^c\) for all \(t\ge 0\) and for any \(g \in W_\varepsilon ^c\) there exists a semiflow \(\left\{ w(t)\right\} _{t\le 0}\) in \(W_\varepsilon ^c\) with \(w(0)=g\).

  3. The manifold \(W_\varepsilon ^c\) is characterized as follows: The point g belongs to \(W_\varepsilon ^c\) if and only if there exists a flow \(\left\{ w(t)\right\} _{t\in \mathbb {R}}\) with \(w(0)=g\) and

    $$\begin{aligned} \left\| w(t)\right\| _{H} \le {\left\{ \begin{array}{ll} \Lambda _+^t{\left| \!\left| \!\left| g\right| \!\right| \!\right| }\quad \text { for all } t\ge 0 \\ \Lambda _-^t{\left| \!\left| \!\left| g\right| \!\right| \!\right| }\quad \text { for all }t \le 0. \end{array}\right. } \end{aligned}$$

The Lipschitz constants here and in the following are to be understood for mappings from H to H, where both copies of H are equipped with the \({\left| \!\left| \!\left| \cdot \right| \!\right| \!\right| }\) norm.

Proof

Our proof relies on the construction in [46] in many parts. However, with regard to the subtle regularity issues we have to modify the argument and need to establish additional properties. For this reason, we give here a self-contained presentation.

First, we note that, thanks to Lemma 6.6, the Lipschitz condition (58) on \(R_{\varepsilon }\) can be realized by choosing \(\varepsilon \) sufficiently small. We define \(J:E_c\times \ell _{\Lambda _-,\Lambda _+} \rightarrow \ell _{\Lambda _-,\Lambda _+} \) by

$$\begin{aligned} J_k\left( g_c, \left\{ w_l\right\} _{l\in \mathbb {Z}}\right) = {\left\{ \begin{array}{ll} S_\varepsilon \left( w_{k-1}\right) &{}\text { if } k\ge 1\\ P_sS_\varepsilon \left( w_{-1}\right) +g_c&{} \text { if }k=0\\ P_sS_\varepsilon \left( w_{k-1}\right) +L_c^{-1}P_c \left( w_{k+1}-R_\varepsilon \left( w_k\right) \right) &{}\text { if } k\le -1. \end{array}\right. } \end{aligned}$$

This mapping is well defined, as we will show that

$$\begin{aligned} \left\| J\left( g_c, \left\{ w_l\right\} _{l\in \mathbb {Z}}\right) \right\| _{\Lambda _-,\Lambda _+} \le \max \left\{ {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }, \kappa \left\| \left\{ w_l\right\} _{l\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}\right\} , \end{aligned}$$
(59)

with \(\kappa {:}{=}\max \left\{ \frac{\Lambda _-+\varepsilon _{gap}}{\Lambda _c}, \frac{\Lambda _{max}+\varepsilon _{gap}}{\Lambda _+}, \frac{\Lambda _s+\varepsilon _{gap}}{\Lambda _-} \right\} .\) This quantity \(\kappa \) is strictly smaller than one due to (57). To prove (59) for positive time steps, \(k\ge 1\), we compute with the help of the triangle inequality and properties (56) and (58) of L and \(R_\varepsilon \)

$$\begin{aligned}&\Lambda _+^{-k}{\left| \!\left| \!\left| P_sS_\varepsilon \left( w_{k-1}\right) \right| \!\right| \!\right| } \le \Lambda _+^{-k}\left( {\left| \!\left| \!\left| L_sP_sw_{k-1}\right| \!\right| \!\right| }+{\left| \!\left| \!\left| P_sR_\varepsilon (w_{k-1})\right| \!\right| \!\right| }\right) \\&\quad \le \Lambda _+^{-k}\left( \Lambda _s{\left| \!\left| \!\left| w_{k-1}\right| \!\right| \!\right| }+\varepsilon _{gap} {\left| \!\left| \!\left| w_{k-1}\right| \!\right| \!\right| } \right) \le \frac{\Lambda _s+\varepsilon _{gap}}{\Lambda _+}\left\| \left\{ w_l\right\} _{l\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}. \end{aligned}$$

We have a similar bound on the projection onto the center eigenspace:

$$\begin{aligned} \Lambda _+^{-k}{\left| \!\left| \!\left| P_cS_\varepsilon (w_{k-1})\right| \!\right| \!\right| }\le \frac{\Lambda _{max}+\varepsilon _{gap}}{\Lambda _+}\left\| \left\{ w_l\right\} _{l\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}. \end{aligned}$$

The bound for negative time steps, \(k\le -1\), is verified in the same manner, namely

$$\begin{aligned}&\Lambda _-^{-k}{\left| \!\left| \!\left| P_sS_\varepsilon \left( w_{k-1}\right) +L_c^{-1}P_c \left( w_{k+1}-R_\varepsilon \left( w_k\right) \right) \right| \!\right| \!\right| }\\&\quad \le \max \left\{ \frac{\Lambda _s+\varepsilon _{gap}}{\Lambda _-},\frac{\Lambda _-+\varepsilon _{gap}}{\Lambda _c}\right\} \left\| \left\{ {w_l}\right\} _{l\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}. \end{aligned}$$

Finally, for \(k=0\), the same strategy yields

$$\begin{aligned} {\left| \!\left| \!\left| P_sS_\varepsilon (w_{-1})+g_c\right| \!\right| \!\right| } \le \max \left\{ \frac{\Lambda _s+\varepsilon _{gap}}{\Lambda _-}\left\| \left\{ w_l\right\} _{l\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}, {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\right\} , \end{aligned}$$

which completes the proof of (59).
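For completeness, the claim \(\kappa <1\) can be read off from (57) one term at a time:

$$\begin{aligned} \Lambda _-<\Lambda _c-\varepsilon _{gap}&\ \Longrightarrow \ \frac{\Lambda _-+\varepsilon _{gap}}{\Lambda _c}<1,\\ \Lambda _{max}+\varepsilon _{gap}<\Lambda _+&\ \Longrightarrow \ \frac{\Lambda _{max}+\varepsilon _{gap}}{\Lambda _+}<1,\\ \Lambda _s+\varepsilon _{gap}<\Lambda _-&\ \Longrightarrow \ \frac{\Lambda _s+\varepsilon _{gap}}{\Lambda _-}<1. \end{aligned}$$

The factor \(\frac{\Lambda _s+\varepsilon _{gap}}{\Lambda _+}\) appearing in the stable part for \(k\ge 1\) is not listed separately in \(\kappa \), but it is dominated by \(\frac{\Lambda _{max}+\varepsilon _{gap}}{\Lambda _+}\) because \(\Lambda _s=e^{-\mu _{K+1}}<1=\Lambda _{max}\).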

Making use of the inequalities (56) and (58) again, we derive similarly that \(J(g_c,\cdot )\), for fixed \(g_c \in E_c\), is a contraction on \(\ell _{\Lambda _-,\Lambda _+}\), that is

$$\begin{aligned} \left\| J\left( g_c, \left\{ w_l\right\} _{l\in \mathbb {Z}}\right) - J\left( g_c, \left\{ \tilde{w}_l\right\} _{l\in \mathbb {Z}}\right) \right\| _{\Lambda _-,\Lambda _+}\le \kappa \left\| \left\{ w_l\right\} _{l\in \mathbb {Z}} - \left\{ \tilde{w}_l\right\} _{l\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}, \end{aligned}$$

for every \(\left\{ w_l\right\} _{l\in \mathbb {Z}}\), \(\left\{ \tilde{w}_l\right\} _{l\in \mathbb {Z}}\) in \(\ell _{\Lambda _-,\Lambda _+}\). Hence, by Banach’s fixed point theorem, for every element \(g_c \in E_c\) there exists a unique sequence \(\left\{ w_k\right\} _{k\in \mathbb {Z}} \in \ell _{\Lambda _-,\Lambda _+}\) with \(J\left( g_c,\left\{ w_k\right\} _{k\in \mathbb {Z}}\right) = \left\{ w_k\right\} _{k\in \mathbb {Z}}\). By construction this fixed point sequence is a solution to the discrete semiflow with \(P_cw_0 = g_c\). By virtue of (59), we also know that \(\left\| \left\{ w_k\right\} _{k\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}\le {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\).
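The last bound deserves one extra line. Abbreviating the norm of the fixed point by \(a{:}{=}\left\| \left\{ w_k\right\} _{k\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}\) (a label used only here), estimate (59) evaluated at the fixed point reads

$$\begin{aligned} a = \left\| J\left( g_c,\left\{ w_k\right\} _{k\in \mathbb {Z}}\right) \right\| _{\Lambda _-,\Lambda _+} \le \max \left\{ {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }, \kappa a\right\} . \end{aligned}$$

If we had \(\kappa a>{\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\), then \(a>0\) and \(a\le \kappa a<a\), which is impossible since \(\kappa <1\). Hence the maximum is attained by the first entry, and \(a\le {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\).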

Now, we define the solution mapping \(\hat{\theta }_\varepsilon : E_c \rightarrow \ell _{\Lambda _-,\Lambda _+}\) by \(\hat{\theta }_\varepsilon \left( g_c\right) = \left\{ w_k\right\} _{k\in \mathbb {Z}}\) and consider \(\theta _\varepsilon :E_c\rightarrow E_s\) given by \(\theta _\varepsilon \left( g_c\right) = P_sw_0\). In other words, the initial datum of the solution sequence decomposes into \(w_0=g_c+\theta _{\varepsilon }\left( g_c\right) \). Since \(J(0,0)=0\), we obtain, by the uniqueness of the fixed point, that \(\hat{\theta }_\varepsilon (0)=0\) and thus \(\theta _\varepsilon (0)=0\).

The contraction property, in particular, entails that the solution mapping \(\hat{\theta }_\varepsilon \) is Lipschitz continuous with bound \({{\,\textrm{Lip}\,}}\left( \hat{\theta }_\varepsilon \right) \le \frac{1}{1-\kappa }\). Thus, also its “coordinate” \(\theta _\varepsilon \) is Lipschitz continuous with the same bound. We will need a stronger bound, in fact, a contraction estimate. For any \(g_c\) and \(\tilde{g}_c \in E_c\) we have

$$\begin{aligned} {\left| \!\left| \!\left| \theta _\varepsilon \left( g_c\right) -\theta _\varepsilon \left( \tilde{g}_c\right) \right| \!\right| \!\right| } ={\left| \!\left| \!\left| P_s\left( S_\varepsilon \left( w_{-1}\right) -S_\varepsilon \left( \tilde{w}_{-1}\right) \right) \right| \!\right| \!\right| }, \end{aligned}$$

where \(\left\{ w_k\right\} _{k\in \mathbb {Z}} = \hat{\theta }_\varepsilon \left( g_c\right) \) and \(\left\{ \tilde{w}_k\right\} _{k\in \mathbb {Z}} = \hat{\theta }_\varepsilon \left( \tilde{g}_c\right) \). Using the triangle inequality and the properties of L and \(R_\varepsilon \), we get for any \(k\ge 0\) that

$$\begin{aligned}&\Lambda _-^k{\left| \!\left| \!\left| P_s\left( w_{-k}-\tilde{w}_{-k}\right) \right| \!\right| \!\right| }\\&\quad \le \frac{\Lambda _s}{\Lambda _-}\Lambda _-^{k+1}{\left| \!\left| \!\left| P_s\left( w_{-(k+1)}-\tilde{w}_{-(k+1)}\right) \right| \!\right| \!\right| } + \frac{\varepsilon _{gap}}{\Lambda _-}\Lambda _-^{k+1}{\left| \!\left| \!\left| w_{-(k+1)}-\tilde{w}_{-(k+1)}\right| \!\right| \!\right| }. \end{aligned}$$

Applying this inequality iteratively, we obtain

$$\begin{aligned}&{\left| \!\left| \!\left| \theta _\varepsilon \left( g_c\right) -\theta _\varepsilon \left( \tilde{g}_c\right) \right| \!\right| \!\right| } = {\left| \!\left| \!\left| P_s(w_0-{\tilde{w}}_0)\right| \!\right| \!\right| }\\&\quad \le \left( \frac{\Lambda _s}{\Lambda _-}\right) ^m\left\| \left\{ w_k\right\} _{k\in \mathbb {Z}}- \left\{ \tilde{w}_k\right\} _{k\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}\\&\qquad + \frac{\varepsilon _{gap}}{\Lambda _-}\sum \limits _{l=0}^{m-1}\left( \frac{\Lambda _s}{\Lambda _-}\right) ^l\left\| \left\{ w_k\right\} _{k\in \mathbb {Z}}- \left\{ \tilde{w}_k\right\} _{k\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+} \end{aligned}$$

for every \(m\in \mathbb {N}\). Sending m to infinity and using the Lipschitz bound for \({\hat{\theta }}_{\varepsilon }\) yields

$$\begin{aligned} {\left| \!\left| \!\left| \theta _\varepsilon \left( g_c\right) -\theta _\varepsilon \left( \tilde{g}_c\right) \right| \!\right| \!\right| } \le \frac{\varepsilon _{gap}}{\Lambda _--\Lambda _s} \left\| {\hat{\theta }}_{\varepsilon }(g_c)-{\hat{\theta }}_{\varepsilon }({\tilde{g}}_c)\right\| _{\Lambda _-,\Lambda _+}\le \frac{\varepsilon _{gap}}{\Lambda _--\Lambda _s}\frac{1}{1-\kappa } {\left| \!\left| \!\left| g_c-\tilde{g}_c\right| \!\right| \!\right| }. \end{aligned}$$

This proves that \(\theta _{\varepsilon }\) is Lipschitz with constant \({{\,\textrm{Lip}\,}}(\theta _{\varepsilon })\lesssim \varepsilon _{gap}\).
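Two elementary steps enter the argument above, and both can be spelled out. First, the Lipschitz bound for \({\hat{\theta }}_\varepsilon \): the parameter \(g_c\) enters J only through the component \(k=0\), which carries weight one in the sequence norm. Writing \(w=\left\{ w_k\right\} _{k\in \mathbb {Z}}={\hat{\theta }}_\varepsilon (g_c)\) and \({\tilde{w}}=\left\{ {\tilde{w}}_k\right\} _{k\in \mathbb {Z}}={\hat{\theta }}_\varepsilon ({\tilde{g}}_c)\) for brevity, we find

$$\begin{aligned} \left\| w-{\tilde{w}}\right\| _{\Lambda _-,\Lambda _+}&= \left\| J(g_c,w)-J({\tilde{g}}_c,{\tilde{w}})\right\| _{\Lambda _-,\Lambda _+}\\&\le \left\| J(g_c,w)-J(g_c,{\tilde{w}})\right\| _{\Lambda _-,\Lambda _+} + \left\| J(g_c,{\tilde{w}})-J({\tilde{g}}_c,{\tilde{w}})\right\| _{\Lambda _-,\Lambda _+}\\&\le \kappa \left\| w-{\tilde{w}}\right\| _{\Lambda _-,\Lambda _+} + {\left| \!\left| \!\left| g_c-{\tilde{g}}_c\right| \!\right| \!\right| }, \end{aligned}$$

which rearranges to \(\left\| w-{\tilde{w}}\right\| _{\Lambda _-,\Lambda _+}\le \frac{1}{1-\kappa }{\left| \!\left| \!\left| g_c-{\tilde{g}}_c\right| \!\right| \!\right| }\). Second, in the limit \(m\rightarrow \infty \) the term with \(\left( \Lambda _s/\Lambda _-\right) ^m\) vanishes because \(\Lambda _s<\Lambda _-\), and the geometric series gives

$$\begin{aligned} \frac{\varepsilon _{gap}}{\Lambda _-}\sum \limits _{l=0}^{\infty }\left( \frac{\Lambda _s}{\Lambda _-}\right) ^l = \frac{\varepsilon _{gap}}{\Lambda _-}\cdot \frac{1}{1-\frac{\Lambda _s}{\Lambda _-}} = \frac{\varepsilon _{gap}}{\Lambda _--\Lambda _s}. \end{aligned}$$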

We continue by deriving the superlinear behavior of \(\theta _\varepsilon \) near zero, which eventually implies the differentiability properties stated in the proposition. Using the quadratic bound on \(R_{\varepsilon }\) from Lemma 6.7, we compute

$$\begin{aligned} {\left| \!\left| \!\left| \theta _\varepsilon \left( g_c\right) \right| \!\right| \!\right| } = {\left| \!\left| \!\left| P_sw_0\right| \!\right| \!\right| } \le {\left| \!\left| \!\left| P_s R_\varepsilon \left( S_\varepsilon \left( w_{-2}\right) \right) \right| \!\right| \!\right| } + {\left| \!\left| \!\left| P_sLw_{-1}\right| \!\right| \!\right| } \le C {\left| \!\left| \!\left| w_{-2}\right| \!\right| \!\right| }^2 + \Lambda _s{\left| \!\left| \!\left| P_sw_{-1}\right| \!\right| \!\right| }. \end{aligned}$$

Similarly, we get \({\left| \!\left| \!\left| P_sw_{-k}\right| \!\right| \!\right| } \le C {\left| \!\left| \!\left| w_{-(k+2)}\right| \!\right| \!\right| }^2 + \Lambda _s {\left| \!\left| \!\left| P_sw_{-(k+1)}\right| \!\right| \!\right| }\) for any \(k\in \mathbb {N}_0\) and thus, for any \(m \in \mathbb {N}\),

$$\begin{aligned} {\left| \!\left| \!\left| \theta _\varepsilon \left( g_c\right) \right| \!\right| \!\right| } \le \Lambda _s^m {\left| \!\left| \!\left| w_{-m}\right| \!\right| \!\right| } + C\sum \limits _{l=1}^{m}\Lambda _s^{l-1}{\left| \!\left| \!\left| w_{-(l+1)}\right| \!\right| \!\right| }^2. \end{aligned}$$

Recalling the definition of \(\left\| \cdot \right\| _{\Lambda _-,\Lambda _+}\) and the fact that the solution sequence is bounded via (59), \(\left\| \left\{ w_k\right\} _{k\in \mathbb {Z}}\right\| _{\Lambda _-,\Lambda _+}\le {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\), we obtain

$$\begin{aligned}&\Lambda _s^m {\left| \!\left| \!\left| w_{-m}\right| \!\right| \!\right| } + C\sum \limits _{l=1}^{m}\Lambda _s^{l-1}{\left| \!\left| \!\left| w_{-(l+1)}\right| \!\right| \!\right| }^2 \\&\quad \le \left( \frac{\Lambda _s}{\Lambda _-} \right) ^m {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| } + \frac{C}{\Lambda _s\Lambda _-^2}\sum \limits _{l=1}^m \frac{\Lambda _s^l}{\Lambda _-^{2l}} {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }^2 \\&\quad = \left( \frac{\Lambda _s}{\Lambda _-} \right) ^m {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| } + \frac{C}{\Lambda _s\Lambda _-^2}\sum \limits _{l=1}^m \left( \frac{\Lambda _-}{\Lambda _s}\right) ^{lk}\left( \frac{\Lambda _s^{k+1}}{\Lambda _-^{k+2}}\right) ^l{\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }^2 \end{aligned}$$

for any \(k\in \mathbb {N}\). We recall that \(\Lambda _->\Lambda _s\). Hence, if there exists a \(k \in \mathbb {N}\), such that \(\Lambda _-^{k+2}> \Lambda _s^{k+1}\), it holds that

$$\begin{aligned} {\left| \!\left| \!\left| \theta _\varepsilon \left( g_c\right) \right| \!\right| \!\right| } \le \left( \frac{\Lambda _s}{\Lambda _-} \right) ^m {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| } + \frac{C}{\Lambda _s\Lambda _-^2}\left( \frac{\Lambda _-}{\Lambda _s}\right) ^{km}\frac{\Lambda _s^{k+1}}{\Lambda _-^{k+2} - \Lambda _s^{k+1}}{\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }^2, \end{aligned}$$

and after optimizing in m, this becomes

$$\begin{aligned} {\left| \!\left| \!\left| \theta _\varepsilon \left( g_c\right) \right| \!\right| \!\right| } \lesssim {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }^{1+\frac{1}{k+1}}, \end{aligned}$$

provided that the right-hand side is sufficiently small. (For larger \(g_c\), this bound follows trivially from the linear estimate.) It remains to verify the existence of a suitable k. This, however, follows easily from our choice of \(\Lambda _-\), more precisely, from the assumption \(\mu _-< \mu _{K+1}<2\mu _-\). Indeed, the latter enables us to pick \(k > \frac{2\mu _--\mu _{K+1}}{\mu _{K+1}-\mu _-}\), which implies \(\Lambda _-^{k+2}> \Lambda _s^{k+1}\) as desired. This proves the first statement with \(\alpha = 1+ \frac{1}{k+1}< \frac{\mu _{K+1}}{\mu _-}\).
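The optimization in m and the choice of k can be traced explicitly. The following is a sketch; the abbreviations \(a{:}{=}\Lambda _s/\Lambda _-\in (0,1)\), \(x{:}{=}{\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\) and the constant \(C_k\), which collects all factors independent of m, are introduced only here. The bound before optimization has the form \({\left| \!\left| \!\left| \theta _\varepsilon (g_c)\right| \!\right| \!\right| }\le a^mx+C_ka^{-km}x^2\). For \(0<x<1\) we may pick \(m\in \mathbb {N}\) with \(a^{m}\le x^{\frac{1}{k+1}}<a^{m-1}\), so that

$$\begin{aligned} a^mx \le x^{1+\frac{1}{k+1}} \quad \text { and } \quad a^{-km}x^2 = a^{-k}\left( a^{-(m-1)}\right) ^{k}x^2 < a^{-k}x^{2-\frac{k}{k+1}} = a^{-k}x^{1+\frac{1}{k+1}}. \end{aligned}$$

Both terms are thus of order \(x^{1+\frac{1}{k+1}}\). For the choice of k, taking logarithms with \(\Lambda _-=e^{-\mu _-}\) and \(\Lambda _s=e^{-\mu _{K+1}}\) gives

$$\begin{aligned} \Lambda _-^{k+2}>\Lambda _s^{k+1} \iff (k+2)\mu _-<(k+1)\mu _{K+1} \iff k>\frac{2\mu _--\mu _{K+1}}{\mu _{K+1}-\mu _-} \iff 1+\frac{1}{k+1}<\frac{\mu _{K+1}}{\mu _-}, \end{aligned}$$

where the third statement uses \(\mu _{K+1}>\mu _-\) and the last one is the second inequality divided by \((k+1)\mu _-\).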

We turn to the last inequality of the first statement. By the definition of \(\theta _{\varepsilon }\), the construction of the fixed point and the smoothing estimate from Lemma 6.5, we have that

$$\begin{aligned} \left\| \theta _{\varepsilon }\left( g_c\right) \right\| _{W} \le \left\| S_\varepsilon \left( w_{-1}\right) \right\| _{W} \lesssim \left\| w_{-1}\right\| _{H}. \end{aligned}$$

It remains to notice that \(\left\| w_{-1}\right\| _{H}\lesssim {\left| \!\left| \!\left| w_{-1}\right| \!\right| \!\right| }\le \Lambda _-^{-1}{\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\lesssim {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\) by the equivalence of the norms and the bound (59) applied to the solution sequence.

The second part of the proof covers the properties of the center manifold \(W_\varepsilon ^c\) which is defined as the graph of \(\theta _\varepsilon \). We commence with the invariance of \(W_\varepsilon ^c\). For this we consider an arbitrary point on that manifold \(g=g_c+\theta _\varepsilon \left( g_c\right) \) and consider the evolution \(\left\{ w_k\right\} _{k\in \mathbb {Z}} = S^k_\varepsilon (g) = \hat{\theta }_{\varepsilon }(g_c)\) starting at that point. We have to show that for every time step \(k\in \mathbb {Z}\), the solution \(w_k\) lies in \(W_\varepsilon ^c\), or, equivalently, that \(P_sw_k=\theta _{\varepsilon }\left( P_cw_k\right) \). By iteration, it suffices to show this only for \(k=1\) and \(k=-1\). We set \(\tilde{g}_c=P_cw_1\). Then \(S^k_\varepsilon \left( \tilde{w}_0\right) = \hat{\theta }_\varepsilon \left( \tilde{g}_c\right) \) is the unique flow in \(\ell _{\Lambda _-,\Lambda _+}\) that satisfies \(P_c\tilde{w}_0=\tilde{g}_c\). Since \(P_cw_1 = \tilde{g}_c\), we have by uniqueness that \(w_{k+1}=\tilde{w}_k\) for every \(k\in \mathbb {Z}\). This yields \(P_sw_1 = P_s\tilde{w}_0= \theta _{\varepsilon }\left( \tilde{g}_c\right) =\theta _{\varepsilon }\left( P_cw_1\right) \). The same procedure backwards in time yields the statement for \(k=-1\).

It remains to prove the characterization of the center manifold. First, for a point \(w_0\) on that manifold, that is, \(w_0=g_c+\theta _{\varepsilon }\left( g_c\right) \) for some \(g_c\in E_c\), we already know that \(\left\| \left\{ S^k_\varepsilon \left( w_0\right) \right\} \right\| _{\Lambda _-,\Lambda _+} = \left\| \hat{\theta }_\varepsilon \left( g_c\right) \right\| _{\Lambda _-,\Lambda _+} \le {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\le {\left| \!\left| \!\left| w_0\right| \!\right| \!\right| }\) by virtue of (59). Conversely, if a flow \(\left\{ w_k\right\} _{k\in \mathbb {Z}} = \left\{ S^k_\varepsilon \left( w_0\right) \right\} _{k\in \mathbb {Z}}\) satisfies this bound, it must be a fixed point of \(J\left( P_cw_0,\cdot \right) \). Since this fixed point is unique, we have \(\hat{\theta }_\varepsilon \left( P_cw_0\right) =\left\{ w_k\right\} _{k\in \mathbb {Z}}\) and thus \(\theta _{\varepsilon }\left( P_cw_0\right) = P_sw_0\). This yields \(w_0\in W_\varepsilon ^c\). \(\square \)

The regularity of \(\theta _{\varepsilon }\) allows us to deduce the equivalence of the Hilbert space norm \({\left| \!\left| \!\left| \cdot \right| \!\right| \!\right| }\) and the higher-order norm \(\Vert \cdot \Vert _W\) on the finite-dimensional manifold \(W_{\varepsilon }^c\).

Corollary 7.2

The norms \({\left| \!\left| \!\left| g\right| \!\right| \!\right| }\) and \(\Vert g\Vert _{W}\) are equivalent for any \(g\in W_\varepsilon ^c\).

Proof

Trivially, the embedding \(W\hookrightarrow H\) is continuous on a bounded domain, that is, \({\left| \!\left| \!\left| g\right| \!\right| \!\right| }\lesssim \Vert g\Vert _{W}\) for every \(g \in W\). To show the reverse inequality, we take an element \(g = g_c +\theta _{\varepsilon }(g_c)\) in \(W_\varepsilon ^c\). Now, we notice that on the one hand, thanks to the regularity of \(\theta _{\varepsilon }\) established in Proposition 7.1, we have \(\left\| \theta _{\varepsilon }\left( g_c\right) \right\| _{W} \lesssim {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\). On the other hand, because \(E_c\) is a finite-dimensional space, all norms on \(E_c\) are equivalent, so that \(\Vert g_c\Vert _W\lesssim {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| }\). We combine both insights and find

$$\begin{aligned} \Vert g\Vert _W \le \Vert g_c\Vert _W + \Vert \theta _{\varepsilon }(g_c)\Vert _W \lesssim {\left| \!\left| \!\left| g_c\right| \!\right| \!\right| } \le {\left| \!\left| \!\left| g\right| \!\right| \!\right| }, \end{aligned}$$

as desired. \(\square \)

We will now construct the stable manifolds.

Proposition 7.3

(Stable manifold) Let \(\varepsilon _{gap}>0\) and \(\varepsilon \) be as in Proposition 7.1 such that (57) and (58) hold. Then for every \(g\in H\), there exists a map \(\nu ^\varepsilon _g:E_s\rightarrow E_c\) such that the submanifold

$$\begin{aligned} M_g^\varepsilon {:}{=}g + \left\{ \nu _g^\varepsilon \left( g_s\right) +g_s : g_s\in E_s\right\} \end{aligned}$$

satisfies the following conditions:

  1. For every \(g\in H\), the map \(\nu _g^\varepsilon :E_s\rightarrow E_c\) is Lipschitz continuous with \({{\,\textrm{Lip}\,}}\left( \nu _g^\varepsilon \right) \lesssim \varepsilon _{gap}\).

  2. For every \(t\ge 0\) it holds that \(S^t_\varepsilon \left( M_g^\varepsilon \right) \subseteq M_{S^t_\varepsilon (g)}^\varepsilon \), and \(M_g^\varepsilon \) can be characterized as follows:

    $$\begin{aligned} M_g^\varepsilon = \left\{ \tilde{g}\in H : \sup \limits _{k\in \mathbb {N}_0} \Lambda _-^{-k}{\left| \!\left| \!\left| S^k_\varepsilon \left( g\right) - S^k_\varepsilon \left( \tilde{g}\right) \right| \!\right| \!\right| } \le {\left| \!\left| \!\left| P_s\left( g-\tilde{g}\right) \right| \!\right| \!\right| } \right\} . \end{aligned}$$

  3. If \(\varepsilon _{gap}\) is sufficiently small (and \(\varepsilon ^0\) chosen accordingly), the following holds true: For every \(g\in H\) the intersection \(M_g^\varepsilon \cap W_\varepsilon ^c\) consists of a single point \(\tilde{g}\). This particularly yields that \(\left\{ M_g^\varepsilon \right\} _{g\in H}\) is a foliation of H over \(W_\varepsilon ^c\). Moreover, it holds that

    $$\begin{aligned} \left\| \tilde{g}\right\| _{W} \lesssim {\left| \!\left| \!\left| g\right| \!\right| \!\right| }. \end{aligned}$$

Proof

The existence follows again by a fixed point argument, which is similar to the one of Proposition 7.1. We will thus only sketch it.

We fix a function \(g\in H\) and a positive constant r and define

$$\begin{aligned} \ell _{\Lambda _-,+}^{g,r}{:}{=}\left\{ \left\{ w_l\right\} _{l\in \mathbb {N}_0}\in \left( L^2(\rho )\right) ^{\mathbb {N}_0}: \left\| \left\{ w_l\right\} _{l\in \mathbb {N}_0}-\left\{ S_\varepsilon ^l(g)\right\} _{l\in \mathbb {N}_0}\right\| _{\Lambda _{-},+}\le r\right\} . \end{aligned}$$

Note that \(\ell _{\Lambda _-,+}^{g,r}\) equipped with the metric

$$\begin{aligned} d_g\left( \left\{ w_l\right\} _{l\in \mathbb {N}_0},\left\{ \tilde{w}_l\right\} _{l\in \mathbb {N}_0}\right) = \left\| \left\{ w_l\right\} _{l\in \mathbb {N}_0}-\left\{ \tilde{w}_l\right\} _{l\in \mathbb {N}_0}\right\| _{\Lambda _{-},+} \end{aligned}$$

is a closed subset of \(\left( L^2(\rho )\right) ^{\mathbb {N}_0}\). We consider the map \(I^g:E_s \times \ell _{\Lambda _-,+}^{g,r} \rightarrow \ell _{\Lambda _-,+}^{g,r}\) defined by

$$\begin{aligned} I^g_k\left( g_s,\left\{ w_l\right\} _{l\in \mathbb {N}_0} \right) = {\left\{ \begin{array}{ll} g_s+P_sg+L_c^{-1}P_c\left( w_1-R_\varepsilon \left( w_0\right) \right) &{} \text { if } k=0\\ P_sS_\varepsilon \left( w_{k-1}\right) +L_c^{-1}P_c\left( w_{k+1}-R_\varepsilon \left( w_k\right) \right) &{} \text { if } k\ge 1, \end{array}\right. } \end{aligned}$$

which has the useful property

$$\begin{aligned} I^g_k\left( g_s, \left\{ S_\varepsilon ^l(g)\right\} _{l\in \mathbb {N}_0}\right) - S_\varepsilon ^k(g) = g_s\delta _{0k}. \end{aligned}$$
(60)
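Property (60) can be checked by hand. The sketch below uses the decomposition \(S_\varepsilon =L+R_\varepsilon \) that underlies the estimates of this section, and it reads the branch \(k\ge 1\) of \(I^g\) as \(P_sS_\varepsilon (w_{k-1})+L_c^{-1}P_c(w_{k+1}-R_\varepsilon (w_k))\), with \(P_s\) acting on the first term only. For \(w_l=S^l_\varepsilon (g)\) we have \(w_{k+1}-R_\varepsilon (w_k)=Lw_k\), and since L commutes with \(P_c\), it follows that \(L_c^{-1}P_cLw_k=P_cw_k\). Therefore

$$\begin{aligned} I^g_0\left( g_s,\left\{ S^l_\varepsilon (g)\right\} _{l\in \mathbb {N}_0}\right)&= g_s+P_sg+P_cg = g_s+g,\\ I^g_k\left( g_s,\left\{ S^l_\varepsilon (g)\right\} _{l\in \mathbb {N}_0}\right)&= P_sS_\varepsilon \left( S^{k-1}_\varepsilon (g)\right) +P_cS^k_\varepsilon (g) = S^k_\varepsilon (g) \quad \text { for } k\ge 1, \end{aligned}$$

which is (60).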

Moreover, by similar arguments as for the operator J in the proof of Proposition 7.1, relying on (56) and (58) we compute for a fixed element \(g_s\in E_s\) that

$$\begin{aligned} \left\| I^g\left( g_s, \left\{ w_l \right\} _{l\in \mathbb {N}_0}\right) -I^g\left( g_s, \left\{ \tilde{w}_l\right\} _{l\in \mathbb {N}_0} \right) \right\| _{\Lambda _-,+} \le \kappa \left\| \left\{ w_l \right\} _{l\in \mathbb {N}_0}-\left\{ \tilde{w}_l\right\} _{l\in \mathbb {N}_0}\right\| _{\Lambda _-,+} \end{aligned}$$

and

$$\begin{aligned}&\left\| I^g\left( g_s,\left\{ w_l\right\} _{l\in \mathbb {N}_0}\right) -\left\{ S^l_\varepsilon (g)\right\} \right\| _{\Lambda _-,+}\\&\quad \le \max \left\{ {\left| \!\left| \!\left| g_s\right| \!\right| \!\right| },\kappa \left\| \left\{ w_l\right\} _{l\in \mathbb {N}_0}-\left\{ S^l_\varepsilon (g)\right\} \right\| _{\Lambda _-,+}\right\} , \end{aligned}$$

where \(\kappa = \max \left\{ \frac{\Lambda _- + \varepsilon _{gap}}{\Lambda _c}, \frac{\Lambda _s+\varepsilon _{gap}}{\Lambda _-}\right\} <1\). Notice that in the latter estimate, we made use of the formula (60). Both estimates imply \(I^g(g_s,\cdot )\) is a contraction and a self-mapping on the set \( \ell _{\Lambda _-,+}^{g,r}\), if we choose \(r={\left| \!\left| \!\left| g_s\right| \!\right| \!\right| }\).

Hence, by Banach’s fixed point theorem there exists a unique sequence \(\left\{ w_k\right\} _{k\in \mathbb {N}_0}\) satisfying

$$\begin{aligned} I^g\left( g_s, \left\{ w_k \right\} _{k\in \mathbb {N}_0}\right) = \left\{ w_k\right\} _{k\in \mathbb {N}_0} \quad \text { and } \quad \left\| \left\{ w_k\right\} _{k\in \mathbb {N}_0} - \left\{ S_\varepsilon ^k(g)\right\} _{k\in \mathbb {N}_0}\right\| _{\Lambda _-,+}\le r. \end{aligned}$$

By construction, this sequence \(\left\{ w_k\right\} _{k\in \mathbb {N}_0}\) is a semiflow for the truncated equation with \(P_sw_0= g_s+P_sg\). We may now introduce a solution mapping \(\hat{\nu }^\varepsilon _g:E_s \rightarrow \ell _{\Lambda _-,+}^g\) by \(\hat{\nu }_g^\varepsilon \left( g_s\right) {:}{=}\left\{ w_k\right\} _{k\in \mathbb {N}_0}\), and we define \(\nu _g^\varepsilon \left( g_s\right) =P_c\left( w_0-g\right) \). Due to the construction via a fixed point argument, we deduce that \(\hat{\nu }_g^\varepsilon \) is Lipschitz continuous with \({{\,\textrm{Lip}\,}}\left( \hat{\nu }^\varepsilon _g\right) \le \frac{1}{1-\kappa }\).

We will improve the Lipschitz constant in a similar way as in the previous proof. For this, let \(\hat{\nu }_g^\varepsilon \left( g_s\right) = \left\{ w_k\right\} _{k\in \mathbb {N}_0}\) and \(\hat{\nu }_g^\varepsilon \left( \tilde{g_s}\right) = \left\{ \tilde{w}_k\right\} _{k\in \mathbb {N}_0}\) be two fixed point solution sequences. It holds that \( \nu _g^\varepsilon \left( g_s\right) -\nu _g^\varepsilon \left( \tilde{g_s}\right) = P_c\left( w_0- \tilde{w}_0\right) \), and we compute

$$\begin{aligned} {\left| \!\left| \!\left| P_c\left( w_k-\tilde{w}_k\right) \right| \!\right| \!\right| } \le \frac{1}{\Lambda _c}{\left| \!\left| \!\left| P_c\left( w_{k+1}-\tilde{w}_{k+1}\right) \right| \!\right| \!\right| } + \frac{\varepsilon _{gap}}{\Lambda _c} {\left| \!\left| \!\left| w_k-\tilde{w}_k\right| \!\right| \!\right| } \end{aligned}$$

with the help of the definition of the map I. Therefore, for every \(m\in \mathbb {N}\) it holds

$$\begin{aligned} {\left| \!\left| \!\left| \nu _g^\varepsilon \left( g_s\right) -\nu _g^\varepsilon \left( \tilde{g_s}\right) \right| \!\right| \!\right| }&\le \left( \frac{\Lambda _-}{\Lambda _c}\right) ^m \left\| \left\{ w_k\right\} _{k\in \mathbb {N}_0}-\left\{ \tilde{w}_k\right\} _{k\in \mathbb {N}_0}\right\| _{\Lambda _-,+} \\&\quad + \frac{\varepsilon _{gap}}{\Lambda _c}\sum \limits _{l=0}^{m-1} \left( \frac{\Lambda _-}{\Lambda _c}\right) ^l \left\| \left\{ w_k\right\} _{k\in \mathbb {N}_0}-\left\{ \tilde{w}_k\right\} _{k\in \mathbb {N}_0}\right\| _{\Lambda _-,+}. \end{aligned}$$

Since \(\frac{\Lambda _-}{\Lambda _c}<1\), letting \(m\rightarrow \infty \) and using the Lipschitz bound on \(\hat{\nu }_g^\varepsilon \), this yields

$$\begin{aligned} {\left| \!\left| \!\left| \nu _g^\varepsilon \left( g_s\right) -\nu _g^\varepsilon \left( \tilde{g_s}\right) \right| \!\right| \!\right| } \le \frac{\varepsilon _{gap}}{\left( \Lambda _c-\Lambda _-\right) \left( 1-\kappa \right) }{\left| \!\left| \!\left| g_s-\tilde{g_s}\right| \!\right| \!\right| }. \end{aligned}$$
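For the reader's convenience, the constant in the last estimate can be traced as follows; this is a sketch that uses only the two preceding displays and the bound \({{\,\textrm{Lip}\,}}(\hat{\nu }_g^\varepsilon )\le \frac{1}{1-\kappa }\). In the limit of infinitely many iterations, the first term on the right-hand side vanishes because \(\left( \Lambda _-/\Lambda _c\right) ^m\rightarrow 0\), while the second term is a geometric series:

$$\begin{aligned} \frac{\varepsilon _{gap}}{\Lambda _c}\sum \limits _{l=0}^{\infty } \left( \frac{\Lambda _-}{\Lambda _c}\right) ^l&= \frac{\varepsilon _{gap}}{\Lambda _c}\cdot \frac{1}{1-\frac{\Lambda _-}{\Lambda _c}} = \frac{\varepsilon _{gap}}{\Lambda _c-\Lambda _-},\\ \left\| \left\{ w_k\right\} _{k\in \mathbb {N}_0}-\left\{ \tilde{w}_k\right\} _{k\in \mathbb {N}_0}\right\| _{\Lambda _-,+}&\le \frac{1}{1-\kappa }{\left| \!\left| \!\left| g_s-\tilde{g_s}\right| \!\right| \!\right| }. \end{aligned}$$

Multiplying the two lines gives the stated Lipschitz constant \(\frac{\varepsilon _{gap}}{(\Lambda _c-\Lambda _-)(1-\kappa )}\), which is of order \(\varepsilon _{gap}\).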

The stable manifold \(M_g^\varepsilon \) is defined as the graph of \(\nu _g^\varepsilon \) shifted by g. We first prove its characterization as stated in the second part of the proposition. Let \(\tilde{g}\) be in \(M_g^\varepsilon \), that is, \(\tilde{g}= g+\nu _g^\varepsilon \left( g_s\right) +g_s\) for some \(g_s \in E_s\). We define \(\left\{ w_k\right\} _{k\in \mathbb {N}_0}=\hat{\nu }_g^\varepsilon \left( g_s\right) \) as the unique semiflow with \(\Lambda _-^{-k}{\left| \!\left| \!\left| w_k-S^k_\varepsilon (g)\right| \!\right| \!\right| }\le {\left| \!\left| \!\left| g_s\right| \!\right| \!\right| }={\left| \!\left| \!\left| P_s\left( g-\tilde{g}\right) \right| \!\right| \!\right| }\) and \(P_sw_0=g_s+P_sg\). By definition of \(\nu _g^\varepsilon \), we have

$$\begin{aligned} \tilde{g}=g+\nu _g^\varepsilon \left( g_s\right) +g_s = g+P_c\left( w_0-g\right) +P_sw_0-P_sg =w_0, \end{aligned}$$

and thus, \(w_k= S^k_\varepsilon \left( \tilde{g}\right) \) satisfies the desired bound. Let us now assume that \(S^k_\varepsilon \left( \tilde{g}\right) \) satisfies this bound. We define \(g_s=P_s\left( \tilde{g}-g\right) \). Then \(S^k_\varepsilon \left( \tilde{g}\right) \) is the unique fixed point of \(I^g\left( g_s,\cdot \right) \) with \(\left\| S^k_\varepsilon \left( \tilde{g}\right) -S^k_\varepsilon (g)\right\| _{\Lambda _-,+}\le {\left| \!\left| \!\left| g_s\right| \!\right| \!\right| }\). By definition, this yields \(S^k_\varepsilon \left( \tilde{g}\right) = \hat{\nu }_g^\varepsilon \left( g_s\right) \) and \(\nu _g^\varepsilon \left( g_s\right) =P_c\left( \tilde{g}-g\right) \) and thus

$$\begin{aligned} g+\nu _g^\varepsilon \left( g_s\right) +g_s= g + P_c\left( \tilde{g}-g\right) + P_s\left( \tilde{g}-g\right) =\tilde{g}. \end{aligned}$$

Next, we have to verify that \(M_g^\varepsilon \) is positively invariant. For this, we take an arbitrary point \(w_0\) in \(M_g^\varepsilon \) and define \(\tilde{w}_0=S_\varepsilon (w_0)\). We straightforwardly compute that \(S^k_\varepsilon \left( \tilde{w}_0\right) \) is a fixed point of \(I^{S_\varepsilon (w_0)}\left( 0, \cdot \right) \), which implies the desired property.

To prove that there exists a single intersection point with the center manifold \(W_{\varepsilon }^c\), we consider the mapping \(\chi (g_s) = \theta _{\varepsilon }(\nu ^{\varepsilon }_g(g_s-P_sg)+P_cg)\) on \(E_s\). Since \(\theta _{\varepsilon }\) and \(\nu _g^{\varepsilon }\) are both Lipschitz continuous with constant of order \(\varepsilon _{gap}\), the mapping \(\chi \) itself is Lipschitz with a constant of the order \(\varepsilon _{gap}^2\), and thus, it is a contraction if \(\varepsilon _{gap}\) is sufficiently small. We denote by \({\tilde{g}}_s\) the unique fixed point and set \({\tilde{g}}_c = \nu _g^{\varepsilon }({\tilde{g}}_s - P_sg)+P_cg\). By definition, \({\tilde{g}} = {\tilde{g}}_c+{\tilde{g}}_s\) lies in the intersection of \(W^c_{\varepsilon } \) and \(M_g^{\varepsilon }\). As every point in this intersection is itself a fixed point, the uniqueness follows.

To estimate the intersection point \({\tilde{g}}\) against g, we argue similarly. Indeed, by construction, the Lipschitz property for \(\nu _g^{\varepsilon }\), and the fact that both \(\theta _{\varepsilon }(0)=0\) and \(\nu _g^{\varepsilon }(0)=0\), it holds that

$$\begin{aligned} {\left| \!\left| \!\left| {\tilde{g}}\right| \!\right| \!\right| }&= {\left| \!\left| \!\left| \nu _g^{\varepsilon } ({\tilde{g}}_s -P_s g) +P_c g + \theta _{\varepsilon }(\nu _g^{\varepsilon } ({\tilde{g}}_s -P_s g) +P_c g)\right| \!\right| \!\right| }\\&\lesssim (1+\varepsilon _{gap}) {\left| \!\left| \!\left| \nu _g^{\varepsilon } ({\tilde{g}}_s -P_s g) +P_c g\right| \!\right| \!\right| }\\&\lesssim \varepsilon _{gap} {\left| \!\left| \!\left| {\tilde{g}}_s -P_sg\right| \!\right| \!\right| } + {\left| \!\left| \!\left| g\right| \!\right| \!\right| }\\&\lesssim \varepsilon _{gap} {\left| \!\left| \!\left| {\tilde{g}}\right| \!\right| \!\right| } + {\left| \!\left| \!\left| g\right| \!\right| \!\right| }, \end{aligned}$$

where we have used that \(\varepsilon _{gap}\le 1\) in the third inequality. We arrive at

$$\begin{aligned} {\left| \!\left| \!\left| {\tilde{g}}\right| \!\right| \!\right| } \lesssim {\left| \!\left| \!\left| g\right| \!\right| \!\right| }, \end{aligned}$$

provided that \(\varepsilon _{gap}\) is sufficiently small.
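The absorption step behind this conclusion can be made explicit. In the following sketch, C denotes the implicit constant in the preceding chain of inequalities; the name is introduced here only for illustration:

$$\begin{aligned} {\left| \!\left| \!\left| {\tilde{g}}\right| \!\right| \!\right| } \le C\varepsilon _{gap}{\left| \!\left| \!\left| {\tilde{g}}\right| \!\right| \!\right| }+C{\left| \!\left| \!\left| g\right| \!\right| \!\right| } \quad \Longrightarrow \quad \left( 1-C\varepsilon _{gap}\right) {\left| \!\left| \!\left| {\tilde{g}}\right| \!\right| \!\right| }\le C{\left| \!\left| \!\left| g\right| \!\right| \!\right| } \quad \Longrightarrow \quad {\left| \!\left| \!\left| {\tilde{g}}\right| \!\right| \!\right| }\le 2C{\left| \!\left| \!\left| g\right| \!\right| \!\right| } \text { if } \varepsilon _{gap}\le \frac{1}{2C}. \end{aligned}$$

In particular, "sufficiently small" can be quantified here as \(C\varepsilon _{gap}\le \frac{1}{2}\).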

Because \(\tilde{g}\) lies on the manifold \(W_\varepsilon ^c\), we can make use of Corollary 7.2 to obtain \(\left\| \tilde{g}\right\| _{W} \lesssim {\left| \!\left| \!\left| g\right| \!\right| \!\right| }\). \(\square \)

Finally, we are able to show the existence of a localized invariant manifold as claimed in Theorem 3.2 by combining the two preceding constructions with the previously established regularity properties of the flow map S.

Proof of Theorem 3.2

We choose \(0<\varepsilon _{gap}<\min \left\{ e^{-\mu }-e^{-\mu _{K+1}}, e^{-\mu _K}-e^{-\mu }\right\} \), such that the third statement in Proposition 7.3 applies, and we define \(\Lambda _-=e^{-\mu }\). We furthermore pick \(\varepsilon \le \varepsilon ^*\) and \(\varepsilon ^0 \le \min \left\{ \varepsilon ,\varepsilon _0\right\} \) as in the hypotheses of Propositions 7.1 and 7.3. The construction of \(W_{loc}^c\) then follows directly from Proposition 7.1.
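The role of the smallness condition on \(\varepsilon _{gap}\) can be seen from the contraction constant \(\kappa \) used in the proof of Proposition 7.3. The following sketch assumes the identification \(\Lambda _c=e^{-\mu _K}\) and \(\Lambda _s=e^{-\mu _{K+1}}\), which is not restated in this section, together with \(\mu \in (\mu _K,\mu _{K+1})\), so that \(\Lambda _s<\Lambda _-<\Lambda _c\):

$$\begin{aligned} \kappa = \max \left\{ \frac{\Lambda _-+\varepsilon _{gap}}{\Lambda _c},\frac{\Lambda _s+\varepsilon _{gap}}{\Lambda _-}\right\}<1 \quad&\Longleftrightarrow \quad \varepsilon _{gap}<\Lambda _c-\Lambda _- \ \text { and }\ \varepsilon _{gap}<\Lambda _--\Lambda _s\\&\Longleftrightarrow \quad \varepsilon _{gap}<\min \left\{ e^{-\mu _K}-e^{-\mu },\, e^{-\mu }-e^{-\mu _{K+1}}\right\} . \end{aligned}$$

Both differences are positive because the eigenvalues are ordered increasingly, so an admissible \(\varepsilon _{gap}>0\) exists.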

To prove the first property in the theorem, we consider \(g\in W_{loc}^c\) with \(\Vert g\Vert _H\le \varepsilon _0\) and we notice that by the semi-flow property from Theorem 6.3, it holds \(\left\| S^t_\varepsilon (g)\right\| _{L^\infty \left( \left( 0,\infty \right) ;H\right) } \le \tilde{C} \varepsilon _0\) for some \({\tilde{C}}\ge 1\). Moreover, \(S^t_\varepsilon (g)\in W_{\varepsilon }^c\) by construction, and thus, by the equivalence of norms in Corollary 7.2, it holds that \(\Vert S^t_\varepsilon (g)\Vert _{W}\le C \Vert S^t_\varepsilon (g)\Vert _{H} \le C{\tilde{C}} \varepsilon _0\) for some \(C\ge 1\). Thus, for \(\varepsilon _0\le \frac{1}{C\tilde{C}}\varepsilon \), we find \(S^t(g) = S^t_{\varepsilon }(g)\) by the definition of the truncation and, in particular, \(\Vert S^t(g)\Vert _H\le \varepsilon \), for any \(t\ge 0\).

We turn to the proof of the second property. We know that there exists a unique point \(\tilde{g}\) in \(W_\varepsilon ^c\cap M_g^\varepsilon \) that satisfies \(\left\| \tilde{g}\right\| _{H}\lesssim \left\| \tilde{g}\right\| _{W} \lesssim \left\| g\right\| _{H}\lesssim \Vert g\Vert _{W}\le \varepsilon _0\), see Proposition 7.3. In particular, choosing \(\varepsilon _0 \le \varepsilon \) even smaller, if necessary, it holds that \(S^k_{\varepsilon }(g) = S^k(g)\) and \(S^k_{\varepsilon }({\tilde{g}}) = S^k({\tilde{g}})\). Moreover, the estimate shows that \(\tilde{g}\) actually lies in \(W_{loc}^c\). Now, the characterization of the stable manifold yields

$$\begin{aligned} \left\| S^k(g)-S^k\left( \tilde{g}\right) \right\| _{H} \lesssim \Lambda _-^k. \end{aligned}$$

We are allowed to drop the \(\varepsilon \) at \(S^t_\varepsilon (g)\) and \(S^t_\varepsilon \left( \tilde{g}\right) \). Moreover, the solution to the (truncated) equation depends continuously on the initial datum with respect to the Hilbert space topology, that is, \(\left\| S^t_\varepsilon (g)-S^t_\varepsilon \left( \tilde{g}\right) \right\| _H \lesssim \left\| g-\tilde{g}\right\| _H\) holds for all \(t\in [0,1]\) (see the fixed point construction of solutions in Theorem 6.3). Combining both facts with the previous estimate at integer times, we obtain

$$\begin{aligned} \left\| S^t(g)-S^t\left( \tilde{g}\right) \right\| _H\lesssim e^{-\mu t} \end{aligned}$$

for any \(t\ge 0\). Next, we make use of Lemma 6.8 and obtain

$$\begin{aligned} \left\| S^{t}\left( g\right) -S^{t}\left( \tilde{g}\right) \right\| _{W} \lesssim e^{-\mu t}, \end{aligned}$$

for any \(t\ge \hat{t}\) and some \({\hat{t}}\in (4/5,1)\). The statement follows. \(\square \)
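The passage from integer times to arbitrary times in the proof above can be spelled out as follows. This is a sketch that assumes the semigroup property \(S^{k+s}=S^s\circ S^k\) and that the continuous dependence estimate on \([0,1]\) is applicable at the (small) data \(S^k(g)\) and \(S^k({\tilde{g}})\). For \(t=k+s\) with \(k\in \mathbb {N}_0\) and \(s\in [0,1)\), and with \(\Lambda _-=e^{-\mu }\),

$$\begin{aligned} \left\| S^t(g)-S^t\left( \tilde{g}\right) \right\| _H&= \left\| S^s\left( S^k(g)\right) -S^s\left( S^k\left( \tilde{g}\right) \right) \right\| _H \lesssim \left\| S^k(g)-S^k\left( \tilde{g}\right) \right\| _H\\&\lesssim \Lambda _-^k = e^{-\mu k} = e^{\mu s}e^{-\mu t}\le e^{\mu }e^{-\mu t}. \end{aligned}$$

The factor \(e^{\mu }\) is independent of t and is absorbed into the implicit constant.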

8 Mode-by-Mode Asymptotics for the Perturbation Equation

In this final section, we exploit our invariant manifold theorem, Theorem 3.2, to prove the mode-by-mode asymptotics in Theorem 3.1. We start with a brief comment on the projection of a function \(w \in H\) onto the subspaces spanned by the eigenfunctions of \(\mathcal {L}^2+N\mathcal {L}\). Let \(\psi \) be such an eigenfunction for the eigenvalue \(\lambda ^2 + N\lambda \), or, equivalently, \(\mathcal {L}\psi =\lambda \psi \). We consider the H-projection of w, and find via an integration by parts

$$\begin{aligned} \langle \psi ,w\rangle _{H}&= \int \psi w \rho \text {d}z + \int \nabla \psi \cdot \nabla w \rho ^2\text {d}z \\&= \int \psi w \rho \text {d}z + \int w\mathcal {L}\psi \rho \text {d}z = \left( 1+\lambda \right) \int \psi w \rho \text {d}z = (1+\lambda )\langle \psi ,w\rangle . \end{aligned}$$

This shows that the H-projection coincides, up to a constant, with the \(L^2(\rho )\)-projection, due to the right choice of the weights. Thus, it is enough to consider the projection with respect to \(\langle \cdot ,\cdot \rangle \) in the following.

We notice that the projection of w onto the space spanned by the constant eigenfunction corresponding to the eigenvalue \(\mu _0=0\) is given by

$$\begin{aligned} P_0w= c_{0,N}\int _{B_1(0)} w\rho \text {d}z \end{aligned}$$

and the projection of w onto the eigenspace spanned by the eigenfunctions corresponding to the next eigenvalue \(\mu _1\) is given by

$$\begin{aligned} P_1w = c_{1,N}\int _{B_1(0)} zw\rho \text {d}z, \end{aligned}$$

where \(c_{0,N}\) and \(c_{1,N}\) are two positive constants.

Eventually we will prove Theorem 3.1 by induction and thus commence by proving the case \(K=0\) in the following theorem. We remark that thanks to smoothing effects, see Equation (54) in [65], it holds that

$$\begin{aligned} \Vert w(t)\Vert _{W}\le \Vert w_0\Vert _{W^{1,\infty }}, \end{aligned}$$

for some \(t\gtrsim 1\), and thus, instead of considering Lipschitz initial data, we may impose slightly stronger assumptions.

Theorem 8.1

There exists \(\varepsilon _0>0\) such that the following holds. Let w be a solution to (19) with initial datum \(w_0\). We further assume that \(\Vert w_0\Vert _{W} \le \varepsilon _0\) and

$$\begin{aligned} \lim \limits _{t\rightarrow \infty } \int w(t,z)\rho (z)\text {d}z=0. \end{aligned}$$
(61)

Then we have

$$\begin{aligned} \left\| w(t)\right\| _{W} \lesssim e^{-\mu _{1}t} \quad \text { for all } t\ge 0. \end{aligned}$$

Proof

We will make use of the invariant manifolds we just constructed in the case \(K=0\). In this case, \(E_c\) is one-dimensional and spanned by the constant eigenfunction \(\psi _{0,1}\) corresponding to the eigenvalue \(\mu _0=0\). Thus, we obtain \(E_c \cong \mathbb {R}\). We fix \(\mu \in (0,\mu _{1})\) and accordingly \(\varepsilon \) and \(\varepsilon _0\) as in Theorem 3.2 and claim the equality

$$\begin{aligned} W_{\varepsilon }^c= E_c. \end{aligned}$$
(62)

To see this, we first pick a function \(g \in E_c\), that is, \(g(x)= \alpha \in \mathbb {R}\). The constant function \(w(t,x)\equiv \alpha \) solves equation (45) with initial datum g and satisfies the bounds

$$\begin{aligned} \Vert w(t)\Vert \sim |\alpha | \lesssim {\left\{ \begin{array}{ll} \Lambda _+^t |\alpha |, \quad \text { for } t\ge 0,\\ \Lambda _-^t |\alpha |, \quad \text { for } t\le 0. \end{array}\right. } \end{aligned}$$

By the characterization of the center manifold, we deduce \(g \in W_{\varepsilon }^c\). Now let \(g=g_c+\theta _{\varepsilon }(g_c)\) be a function in \( W_{\varepsilon }^c\). From above we know \(E_c\subset W_{\varepsilon }^c\), and thus \(g_c \in W_{\varepsilon }^c\). This forces \(\theta _{\varepsilon }(g_c)=0\), which proves the claim (62).

Let us now consider an initial datum \(w_0\) with \(\Vert w_0\Vert _{W} \le \varepsilon _0\) and let \(w(t)=S^t(w_0)\) be the corresponding solution to the perturbation equation. The Invariant Manifold Theorem 3.2 combined with the characterization (62) yields the existence of a constant a with

$$\begin{aligned} \Vert w(t)-a\Vert _H\lesssim \left\| w(t)-a\right\| _{W}\lesssim e^{-\mu t},\quad \text { for } t\ge 1. \end{aligned}$$
(63)

In particular, if a(t) denotes the average of w(t) or, in other words, the projection onto \(E_c\), it holds that \(|a(t)-a| \lesssim e^{-\mu t}\). Invoking the hypothesis (61), this estimate entails that \(a=0\).

We want to improve on the decay rate of a(t). Testing the perturbation equation with the constant eigenfunction and integrating by parts, we note that a(t) solves the equation

$$\begin{aligned} \frac{d}{\text {d}t}a(t) = c_{0,N}\int \rho F[w]\,\rho \text {d}z. \end{aligned}$$

The nonlinear term \(\rho F[w]\) is a linear combination of products of two factors, each of the form \(\nabla w\), \(\rho \nabla ^2w\) or \(\rho ^2 \nabla ^3w\), cf. (49). Thus, we obtain the estimate \(|\rho F[w]| \lesssim \Vert w\Vert ^2_{\dot{W}}\), where we consider only the homogeneous part of the norm. From (63) we already know that \(\Vert w(t)\Vert _{\dot{W}}\lesssim e^{-\mu t}\) for \(t\ge 1.\) We conclude

$$\begin{aligned} \left| \frac{d}{\text {d}t}a(t)\right| \lesssim e^{-2\mu t} \end{aligned}$$

for \(t\ge 1\). We integrate over the time interval \((t,\infty )\) and recall the assumption (61) to obtain

$$\begin{aligned} |a(t)| \lesssim e^{-2\mu t} \quad \text { for } t\ge 1. \end{aligned}$$
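The integration step can be written out as follows; in this sketch, C stands for the implicit constant in the bound on \(\frac{d}{\text {d}t}a\) (a name used here only for illustration), and \(a(s)\rightarrow 0\) as \(s\rightarrow \infty \) is taken from the assumption (61):

$$\begin{aligned} |a(t)| = \left| \int _t^\infty \frac{d}{\text {d}s}a(s)\,\text {d}s\right| \le C\int _t^\infty e^{-2\mu s}\,\text {d}s = \frac{C}{2\mu }e^{-2\mu t} \quad \text { for } t\ge 1. \end{aligned}$$

The average thus decays at twice the rate available for w(t) itself at this stage.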

As we may choose \(\mu \) larger than \(\frac{1}{2}\mu _{1}\), it remains to gain suitable control over the projection of w(t) onto \(E_s\cong H/\mathbb {R}\), namely \(P_sw(t) = w(t)-a(t)\). We note that \(P_sw\) solves the equation

$$\begin{aligned} \partial _tP_sw +\left( \mathcal {L}^2+N\mathcal {L} \right) P_sw = P_s\left( \frac{1}{\rho }\nabla \cdot \left( \rho ^2F[w]\right) +\rho F[w] \right) . \end{aligned}$$

Since the eigenfunctions \(\left\{ \psi _i\right\} _{i\in \mathbb {N}_0}\) form an orthogonal basis of H, it holds that \(\langle P_sw,\left( \mathcal {L}^2+N\mathcal {L}\right) P_sw \rangle _H \ge \mu _{1} \Vert P_sw\Vert ^2_{H}\) and thus, arguing similarly as in the proof of Theorem 6.3, we find that

$$\begin{aligned}&\frac{1}{2}\frac{d}{\text {d}t}\left\| P_sw\right\| ^2_{H} +\mu _{1}\left\| P_sw\right\| ^2_{H}\\&\quad \le -\langle \nabla P_sw,\rho F[w]\rangle - \langle \nabla \mathcal {L}P_sw,\rho F[w]\rangle + \langle P_sw,\rho F[w]\rangle +\langle \mathcal {L}P_s w,\rho F[w]\rangle \\&\quad \le \Vert P_sw \Vert _{W} \left( \Vert \rho F[w]\Vert _{L^\infty } + \Vert \rho F[w]\Vert _{L^\infty }\right) \\&\quad \le \left( \Vert w\Vert _W + |a(t)| \right) \left( \Vert \rho F[w]\Vert _{L^\infty } + \Vert \rho F[w]\Vert _{L^\infty }\right) . \end{aligned}$$

Thanks to the uniform estimates on the nonlinearities that we quoted above and the bound in (63), we observe that the right-hand side decays with at least \( e^{-3\mu t}\). Therefore, the latter estimate translates into

$$\begin{aligned} \frac{d}{\text {d}t}\left( e^{2\mu _{1} t} \left\| P_sw\right\| ^2_{H}\right) \lesssim e^{\left( 2\mu _{1}- 3\mu \right) t}, \end{aligned}$$

for any \(t\ge 1\). The right hand side is integrable, provided that we choose \(\mu \) sufficiently close to \(\mu _{1}\), so that \(3\mu >2\mu _1\). Integration in time yields

$$\begin{aligned} \left\| P_sw(t)\right\| ^2_{H} \lesssim e^{-2\mu _1 t} \quad \text { for } t\ge 1. \end{aligned}$$
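The integrating factor argument behind the last two displays reads as follows. In this sketch, \(y(t)=\left\| P_sw(t)\right\| ^2_{H}\) and C is the implicit constant in the bound \(e^{-3\mu t}\) on the right-hand side of the energy inequality; both names are introduced only for illustration:

$$\begin{aligned} \frac{d}{\text {d}t}\left( e^{2\mu _1t}y(t)\right)&= e^{2\mu _1t}\left( \frac{d}{\text {d}t}y(t)+2\mu _1y(t)\right) \le 2Ce^{\left( 2\mu _1-3\mu \right) t},\\ e^{2\mu _1t}y(t)&\le e^{2\mu _1}y(1)+2C\int _1^t e^{-\left( 3\mu -2\mu _1\right) s}\,\text {d}s \le e^{2\mu _1}y(1)+\frac{2C}{3\mu -2\mu _1}. \end{aligned}$$

The right-hand side is bounded uniformly in \(t\ge 1\) precisely because \(3\mu >2\mu _1\), and dividing by \(e^{2\mu _1t}\) gives the claimed decay.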

In combination with our estimate on the average, (44), this bound gives

$$\begin{aligned} \left\| w(t)\right\| _{H} \le \left\| P_sw(t)\right\| _{H}+\left| a(t)\right| \lesssim e^{-\mu _1 t}\quad \text { for } t\ge 1. \end{aligned}$$

We take into account Lemma 6.5 to finally obtain the statement of the theorem, noting that the result is trivial for \(t\lesssim 1\). \(\square \)

Remark 8.2

Using the final result of Theorem 8.1 we are able to improve the convergence rate of a(t) to \(|a(t)|\lesssim e^{-2\mu _1t}\) for all \(t\ge 0\).

Having already proved the part of Theorem 3.1 concerning the smallest eigenvalue, we are now able to deduce the full statement with an analogous approach.

Proof of Theorem 3.1

We prove this theorem by induction. The base case \(K=0\) is proved in Theorem 8.1.

Now, we may assume that (23) holds true and additionally

$$\begin{aligned} \left\| w(t)\right\| _{W}\lesssim e^{-\mu _{K}t} \text { for all } t\ge 0. \end{aligned}$$
(64)

This directly implies \(|\rho F[w]|\lesssim e^{-2\mu _Kt}\). We will again exploit the invariant manifolds in a similar way as in the base case. The center eigenspace takes the form \(E_c = {{\,\textrm{span}\,}}\big \{\psi _{k,n}\, |\, k\in \left\{ 0,\dots ,K\right\} \text { and } n\in \left\{ 1,\dots ,N_k\right\} \big \}.\) We fix \(\mu \in \left( \mu _{K},\mu _{K+1}\right) \) and accordingly \(\varepsilon \) and \(\varepsilon _0\) as in Theorem 3.2. We deduce the existence of \({\tilde{w}}_0 \in W_{loc}^c\) such that \(\tilde{w}(t)=S^t({\tilde{w}}_0 )\in W_{loc}^c\) satisfies

$$\begin{aligned} \left\| w(t)-\tilde{w}(t)\right\| _{W}\lesssim e^{-\mu t}\quad \text { for all } t\ge 1, \end{aligned}$$
(65)

where \(\tilde{w}(t) = P_c\tilde{w}(t)+ \theta _\varepsilon \left( P_c\tilde{w}(t)\right) \) with \(P_c\tilde{w}(t) = \sum \limits _{n,k} \langle \tilde{w}(t),\psi _{k,n}\rangle \psi _{k,n}\).

Now, we fix an arbitrary \(k\in \left\{ 0,\dots ,K\right\} \) and consider the projection of w onto one of the eigenfunctions \(\psi _{k,n}\). We obtain the ordinary differential equation

$$\begin{aligned}&\frac{d}{\text {d}t}\langle \psi _{k,n},w(t)\rangle +\mu _k\langle \psi _{k,n},w(t)\rangle =-\langle \nabla \psi _{k,n},\rho F[w(t)]\rangle +\langle \psi _{k,n},\rho F[w(t)]\rangle \\&\quad \text { for all }t\ge 0, \end{aligned}$$

which implies \(\left| \frac{d}{\text {d}t}e^{\mu _kt}\langle \psi _{k,n},w(t)\rangle \right| \lesssim e^{-\left( 2\mu _K-\mu _k \right) t}\) due to the bound on \(|\rho F[w]|\). We notice that \(\lim \limits _{t\rightarrow \infty } e^{\mu _kt}\langle \psi _{k,n},w(t)\rangle \) exists and vanishes by virtue of assumption (23). We conclude that

$$\begin{aligned} |\langle \psi _{k,n},w\rangle |\lesssim e^{-2\mu _Kt}\quad \text { for all }t\ge 0. \end{aligned}$$
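The conclusion follows by integrating from t to infinity. In the following sketch, \(b(t)=e^{\mu _kt}\langle \psi _{k,n},w(t)\rangle \) and C is the implicit constant in the bound on its derivative (both names are used only for illustration), and it is assumed that \(K\ge 1\), so that \(2\mu _K-\mu _k\ge \mu _K>0\) for every \(k\le K\):

$$\begin{aligned} |b(t)|&= \left| \int _t^\infty \frac{d}{\text {d}s}b(s)\,\text {d}s\right| \le C\int _t^\infty e^{-\left( 2\mu _K-\mu _k\right) s}\,\text {d}s = \frac{C}{2\mu _K-\mu _k}e^{-\left( 2\mu _K-\mu _k\right) t},\\ |\langle \psi _{k,n},w(t)\rangle |&= e^{-\mu _kt}|b(t)| \le \frac{C}{2\mu _K-\mu _k}e^{-2\mu _Kt}. \end{aligned}$$

The vanishing of \(b(s)\) as \(s\rightarrow \infty \) is exactly what assumption (23) provides; without it, only the slower rate \(e^{-\mu _kt}\) would be available.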

This yields \(\Vert P_c w(t)\Vert _W\lesssim e^{-2\mu _Kt} \) and enables us to estimate the center part of \(\tilde{w}(t)\) with help of (65) and the triangle inequality, namely

$$\begin{aligned} \left\| P_c\tilde{w}(t)\right\| _W \le \left\| P_c \left( w(t)-\tilde{w}(t)\right) \right\| _W + \left\| P_cw(t)\right\| _W\lesssim e^{-\min \left\{ 2\mu _K, \mu \right\} t} \end{aligned}$$

for all \(t\ge 1\). Thanks to the regularity property of \(\theta _\varepsilon \) derived in the first part of Proposition 7.1 we deduce

$$\begin{aligned} \left\| \theta _\varepsilon \left( P_c\tilde{w}(t)\right) \right\| _W\lesssim e^{-\min \left\{ 2\mu _K, \mu \right\} t} \quad \text { for all }t\ge 1. \end{aligned}$$

Combining the previous estimates, we have

$$\begin{aligned} \left\| w(t)\right\| _{W}&\le \left\| w(t)-\tilde{w}(t)\right\| _{W} + \left\| P_c\tilde{w}(t)\right\| _W + \left\| \theta _\varepsilon \left( P_c\tilde{w}(t)\right) \right\| _W\nonumber \\&\lesssim e^{-\min \left\{ 2\mu _K, \mu \right\} t} \quad \text { for all }t\ge 1. \end{aligned}$$
(66)

We note that (66) gives a better rate than (64). Due to the structure of the eigenvalues it may happen, depending on K and the space dimension N, that \(2\mu _K <\mu \). In this case, inequality (66) downgrades to \(\left\| w(t)\right\| _W\lesssim e^{-2\mu _K t}\). Similarly, in this case the estimate for the center part of w(t), that is \(P_cw(t)\), is also not good enough, as we want to prove \(|\langle \psi _{k,n},w(t)\rangle | \lesssim e^{-\mu _{K+1}t}\). We overcome this problem by repeating the first step of this proof, now from the starting point (66) instead of (64), which directly yields \(|\rho F[w]|\lesssim e^{-4\mu _Kt}\). If \(2\mu _K \le \mu _{K+1}\), we deduce via iteration that

$$\begin{aligned} \left\| P_cw(t)\right\| _W \lesssim e^{-2^m\mu _Kt} \quad \text { for all } t\ge 1 \end{aligned}$$

and

$$\begin{aligned} \Vert w(t)\Vert _W\lesssim e^{-\mu t} \quad \text { for all }t\ge 1, \end{aligned}$$
(67)

where m is the smallest natural number that satisfies \(\mu _K \le 2^{m-1}\mu _K<\mu <\mu _{K+1} \le 2^m\mu _K\). We remark that we are allowed to choose \(\mu \) sufficiently close to \(\mu _{K+1}\). In the case \(2\mu _K \ge \mu _{K+1}\), we may directly continue from estimate (66), which corresponds to \(m=1\).
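The bookkeeping of the iteration may be summarized as follows. This is only a sketch; the symbol \(r_j\) for the decay rate of \(\Vert w(t)\Vert _W\) available after j rounds is introduced here for illustration. Each round squares the nonlinearity and is capped by the rate \(\mu \) from (65):

$$\begin{aligned} r_0=\mu _K, \qquad \Vert w(t)\Vert _W\lesssim e^{-r_jt} \ \Longrightarrow \ |\rho F[w]|\lesssim e^{-2r_jt} \ \Longrightarrow \ \Vert P_cw(t)\Vert _W\lesssim e^{-2r_jt}, \qquad r_{j+1}=\min \left\{ 2r_j,\mu \right\} . \end{aligned}$$

Hence \(r_j=\min \left\{ 2^j\mu _K,\mu \right\} \). Since \(2^{m-1}\mu _K<\mu \le 2^m\mu _K\), one finds \(r_{m-1}=2^{m-1}\mu _K\), so the m-th round gives \(\Vert P_cw(t)\Vert _W\lesssim e^{-2^m\mu _Kt}\) and \(r_m=\mu \), in agreement with the two displayed estimates.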

To achieve the rate \(\mu _{K+1}\), we investigate the projection of w(t) onto \(E_s\). Similar to the previous proof, testing the equation solved by \(P_sw\) with \(\rho P_sw\) yields

$$\begin{aligned}&\frac{1}{2}\frac{d}{\text {d}t}\left\| P_sw(t)\right\| ^2_{H}+\mu _{K+1}\left\| P_sw(t)\right\| ^2_{H}\\&\quad \le - \langle \nabla P_sw,\rho F[w]\rangle - \langle \nabla \mathcal {L}P_sw,\rho F[w]\rangle + \langle P_sw,\rho F[w]\rangle +\langle \mathcal {L}P_s w,\rho F[w]\rangle \\&\quad \le \Vert P_sw \Vert _{W} \left( \Vert \rho F[w]\Vert _{L^\infty } + \Vert \rho F[w]\Vert _{L^\infty }\right) \lesssim e^{-3\mu t} \quad \text { for all }t\ge 1, \end{aligned}$$

where we used (67) and the quadratic behavior of \(\rho F[w]\). Just like in the previous proofs, choosing \(\mu \) large enough such that \(3\mu >2\mu _{K+1}\) we obtain

$$\begin{aligned} \left\| P_sw(t)\right\| ^2_{H}\lesssim e^{-2\mu _{K+1}t} \quad \text { for all }t\ge 1 \end{aligned}$$

and in total

$$\begin{aligned} \left\| w(t)\right\| _{H}\le \left\| P_cw(t)\right\| _{H}+\left\| P_sw(t)\right\| _{H}\lesssim e^{-2^m\mu _Kt}+e^{-\mu _{K+1}t}\lesssim e^{-\mu _{K+1}t}\quad \text { for all }t\ge 1. \end{aligned}$$

To carry this result over to the W-norm it remains to make use of the smoothing estimate in Lemma 6.5, noting again that the result is trivial for \(t\lesssim 1\). \(\square \)