Asymptotics Near Extinction for Nonlinear Fast Diffusion on a Bounded Domain

On a smooth bounded Euclidean domain, Sobolev-subcritical fast diffusion with vanishing boundary trace is known to lead to finite-time extinction, with a vanishing profile selected by the initial datum. In rescaled variables, we quantify the rate of convergence to this profile uniformly in relative error, showing the rate is either exponentially fast (with a rate constant predicted by the spectral gap), or algebraically slow (which is only possible in the presence of non-integrable zero modes). In the first case, the nonlinear dynamics are well-approximated by exponentially decaying eigenmodes up to at least twice the gap; this refines and confirms a 1980 conjecture of Berryman and Holland. We also improve on a result of Bonforte and Figalli by providing a new and simpler approach which is able to accommodate the presence of zero modes, such as those that occur when the vanishing profile fails to be isolated (and possibly belongs to a continuum of such profiles).


Introduction
Setting m q := 1 − 2 n+q [26], consider the fast diffusion equation with the exponent 0 < m ∈ (m 1− n 2 , 1), integrable non-negative initial data, and Dirichlet boundary conditions, on a smooth bounded domain ⊂ R n :  This equation models heat flow in a material whose thermal conductivity mw m−1 depends inversely on its local temperature w. With m = 1/2 this has also been used to model the diffusion of plasma ions across a magnetic field in simulations [43] and experiments [51]. For such an initial value problem, it is known that the vanishing Dirichlet boundary conditions drive the solution w to become extinct in finite time, e.g., [15,45,46]. To understand the vanishing profile, rescale around the extinction time T > 0, to obtain an equation for v(x, t) with p = 1/m: (1. 2) The rescaled solution v on is known to converge (both subsequentially [15] and sequentially [32]) as t → ∞ to some unique profile V (x) selected by the initial datum; moreover, V is a positive solution to the stationary elliptic problem Solutions to (1.3) represent critical points of the Lyapunov functional [14] (1.4) for the dynamics (1.2). Sobolev subcriticality p −1 > m 1− n 2 = n−2 n+2 implies L p+1coercivity of the energy on the L p+1 -unit sphere, from which the existence of positive steady states (1.3) had earlier been derived by Berger [10]. Brezis and Nirenberg showed such solutions need not be unique however [16]; other uniqueness and nonuniqueness results concerning positive solutions in specific domain geometries may be found in the works of Gidas-Ni-Nirenberg [33] on the ball, Dancer [23,24] on connected approximations to disjoint unions of balls, Damascelli-Grossi-Pacella [25] on domains with symmetry, Zou [52] on rough balls, Akagi-Kajikiya [4] who used instability to show that uniqueness fails on thin annuli, so solutions and their rotations form continuous families, and Akagi [5] who showed the stability of energy minimizers. On the other hand, Feireisl-Simondon [32] showed that the evolution (1.2) selects and converges to one of the positive solutions V of (1.3) (which may depend on the initial datum) as t → ∞. They did not give a result on the convergence rate. Later Bonforte et al. [12] showed convergence in relative error h : (1.5) and provided an exponential rate of convergence in entropy sense, under a nondegeneracy condition which they were able to verify for m close to 1. It has been a problem of considerable interest (a) to quantify the rate of convergence unconditionally, and (b) to predict the higher-order asymptotics of the relative error h. In contrast to the analogous questions set on the full space = R n , resolved in [26] and its references [8,21,22,29,39,44], this challenge is compounded by the fact that the linearized problem has unstable modes (including those corresponding to τ -translations in the original variables, which blow up at different times T ), and can also have zero modes, including modes called integrable that arise e.g. for reasons of symmetry, as for the thin annuli mentioned above. Recently, Bonforte and Figalli overcame some of these challenges to solve (a) for C 2,α -generic smooth domains including the ball [11]. In a suitable Hilbert space, they show the linearized evolution of h is generated by a self-adjoint operator possessing a complete basis of eigenfunctions. To summarize their findings: the unstable (negative) modes cannot be active due to Feireisl and Simondon's convergence, and neutral (zero) modes are absent on generic domains [50], in which case they show that the relative error decays uniformly with an exponential rate λ no smaller than the first positive eigenvalue. On arbitrary smooth domains however, such a result eludes their techniques, which rely on the kernel of the linearized operator being trivial, and hence the limiting profile being isolated. A simpler derivation of the rate of exponential convergence was subsequently obtained under the same restriction by Akagi [6]; although he expresses convergence in terms of an energy rather than the entropy or relative error, these quantities can be compared using the boundary regularity theory of [37]. Finally, Jin and Xiong showed unconditionally that the rate of convergence is at least algebraic in t [36], but with a power that is not explicit. In the present manuscript, we bridge this wide gap (between exponential upper and algebraic lower bounds on the rate) by developing a new approach which yields that h L ∞ either decays exponentially with rate λ or else decays algebraically at a rate 1/t or slower. In the second case, not only must zero modes be present, but they must be non-integrable in a sense made precise below. Moreover, when the decay rate is exponential, we address (b) by showing that the longtime asymptotics of the nonlinear problem are described by the linearized dynamics up to the error e −2λt produced by quadratic corrections. This refines and confirms a conjecture made for the case n = 1 by Berryman and Holland [15].
Besides being more powerful, our approach is also simpler than Bonforte and Figalli's. Instead of augmenting Del Pino and Dolbeault's nonlinear entropy method [29] with approximate orthogonality conditions, we rely on an ODE lemma of Merle and Zaag [42], which implies that the dynamics are eventually dominated either by stable or neutral modes. We must also control the nonlinearity by adapting parabolic regularity estimates to the geometry of the steady state around the domain boundary. Delicately matching these estimates to the ODE argument allows us to estimate the rate of convergence via the dichotomy described above. From there we use Hilbert projection techniques to get an asymptotic expansion up to an e −2λt error. The latter resembles Denzler, Koch and McCann's treatment of higher-order asymptotics in the narrower range m 0 = 1 − 2 n < m < 1 of evolutions on the unbounded domain = R n -though the linearized analysis around the selfsimilar spreading solution described in [26,27] and their references is more subtle than the present problem, and our rate estimate does not require us to establish differentiable dependence of the flow on initial conditions. On the other hand, the whole space problem is not plagued by the multiplicity and continua of limiting profiles that we presently face. In the current setting, finer aspects of the dynamics (beyond rate 2λ) may conceivably be described by constructing invariant manifolds, but we defer the exploration of this possibility to future research. In the porous medium regime (which refers to the complementary range of nonlinearities m > 1) on the whole space = R n , such a construction was completed by one of us [47,48] following earlier work of Angenent [7] for n = 1 and Koch [40] for n > 1.
The remainder of this manuscript is structured as follows: the Section 2, we rewrite the problem in terms of the relative error, for which most of our analysis is conducted, summarize the spectral theory and introduce some notation. Along with the necessary terminology, Section 3 states our two main dichotomy results. After recalling variants of the Merle-Zaag lemma [42] due to K. Choi with Haslhofer and Hershkovits on the one hand [18] and with Sun on the other [20], our first dichotomy-separating fast and slow convergence-is proved in Section 4, apart from the parabolic regularity estimates which yield a quadratic bound in the relative error for the nonlinearity. These are postponed to Section 5. The remaining dichotomy is established in Section 6.

Linearized Dynamics and Relative Error
In terms of the relative error h := v−V V , the dynamics (1.2) take the form where L V is the linear operator relative to V , and N (h) is the nonlinearity, given by Indeed, solving (2.1) for ∂ t h, dividing by the prefactor and shifting the nonlinearities onto the right-hand side yields N (h) = M V (h). This allows us to exchange temporal for spatial derivatives of h in the nonlinearity. Since in most parts of the paper the reference stationary solution is fixed, we will often write L = L V for notational simplicity.
The relative error and the linear operator are best understood when analyzed in suitable weighted Lebesgue and Sobolev spaces. Given σ > 0 and a positive solution V to (1.3), the weighted inner product Hilbert space. We will occasionally be concerned with more general Lebesgue spaces induced by the norm Multiplication by V acts as an isometry between L 2 p+1 and L 2 p−1 . Under this isometry, the linear operator L V + pI is unitarily equivalent to an operator p−1 , whose spectral theory, subject to vanishing Dirichlet boundary conditions, was elucidated by Bonforte and Figalli [11]: the corresponding operator L= L V is a self-adjoint semibounded operator on L 2 p+1 ;the spectrum of L is discrete and the eigenfunctions form a basis of L 2 p+1 . They are critical points for the restriction of the weighted Dirichlet energy to that subset of the L p+1 unit-sphere for which the boundary trace of φV vanishes. Using two nonnegative integers I and K , let us list the eigenvalues with repetition as • The integer I represents the dimension of the unstable modes of L (and coincides with the Morse index of E(v) from (1.4) at V ). • K represents the dimension of kernel of L and any corresponding eigenfunctions are called Jacobi fields. • Let us call the eigenfunctions which correspond to λ −I to λ −1 the unstable modes, those corresponding to λ 0 to λ K −1 the neutral (or central) modes, and the remaining eigenfunctions (starting with eigenvalue λ K )the stable modes.
The corresponding eigenspaces will be denoted by E u , E c and E s , respectively. They are understood as subspaces of L 2 p+1 , so that L 2 p+1 = E u ⊕ E c ⊕ E s .
• λ −I = 1 − p = 1 − 1 m and it is actually simple (a.k.a. multiplicity 1, so λ −I < λ 1−I ) with corresponding eigenfunction 1 (called the ground state). In the original variables it corresponds to time translation of the solution; the signs I > 0 > λ −I account for the fact that τ -translations of a given solution disappear at different times T hence diverge sharply from each other under the rescaling appropriate for one of them.
Notation. We finally comment briefly on some notation that we will frequently use throughout this work: We write a b if there is a constant C such that a Cb. The constant C may depend on the limiting profile V , the domain, and other parameters such as p = 1/m, but this dependence is continuous under small perturbations of V in the relatively-uniform topology generated by (3.6) below, hence C may be regarded as being locally independent of V . We write t 1 to indicate t must be sufficiently large.

Fast Versus Slow Convergence Dichotomies
Recall Bonforte et al. [12] showed if V (x) is the limit solution of v(x, t) (see [32]) then the relative error h(t) = v(t) V − 1 decays uniformly: In this section we describe two dichotomy theorems which establish that a spectral gap gives the sharp rate of exponential convergence, unless the linearized dynamics has a non-integrable kernel in the refined sense of Definition 3.3. When Definition 3.3 fails to be satisfied, we show the convergence occurs either exponentially at the rate of the spectral gap or no faster that O(1/t).
Theorem 3.1. (First dichotomy for asymptotic behavior) For 0 < m ∈ (m 1− n 2 , 1) = ( n−2 n+2 , 1), let ⊂ R n be a smooth bounded domain and v(x, t) 0 on (x, t) ∈ × [0, ∞) be a bounded solution to the evolution problem (1.2). Let V (x) be the classical solution to (1.3) satisfying (1.5). Then there exist positive constants ε = ε(V, p) and C = C(V, p) such that whenever h(t) L ∞ ε for all t 0, exactly one of the following alternatives holds: (2) the relative error h decays exponentially or faster, where λ = λ K > 0 is the first positive eigenvalue of linearized operator L in (2.1). Moreover, whenever (3.3) holds for some λ λ K and any C = C(V, p, λ), if J ∈ N 0 is given such that λ J < 2λ and λ J < λ J +1 =: μ, and ϕ 0 , ϕ 1 , . . . , ϕ J are corresponding eigenfunctions chosen in such a way that they can be extended to a complete L 2 p+1 orthonormal basis, then there are constants C i ∈ R such that t 1 yields lead to solutions which remain bounded after any short time [28]. (3) Due to Bonforte, Grillo and Vazquez result (3.1), the smallness assumption on our initial data is no restriction either. (4) Notice the factor t appearing in the bound (3.4) when μ = 2λ accounts for the possibility of the kind of eigenvalue resonances described in [7].
As remarked above, triviality of the kernel, K = 0, implies exponential decay. However, we can also show that in certain situations, exponential decay also occurs when K > 0. For example, some kernel of L can be obtained as a tangent variation among stationary states (as for the non-rotationally symmetric limit states V found by Akagi and Kajikiya on thin annuli [4]): if there is a one-parameter If all zero modes are accounted for by such continuous symmetries in the sense of Definition 3.3 below, exponential decay occurs in spite of the fact that K > 0. To formulate this more precisely, consider the whole family S ⊂ H 1 ( ) of weak solutions V to the Dirichlet problem (1.3), i.e. Sobolev functions for which In Lemma 5.1, we shall eventually infer S ⊂ C 3,α ( ) for some α ∈ (0, 1) and that V /W ∈ L ∞ ( ) for all V, W ∈ S. The latter implies that the sets form the base {B r (V )} r >0,V ∈S for a topology on S, called the relatively-uniform topology. We call V ∈ S an ordinary limit if S forms a manifold of dimension K = dim ker L V near V , which the error relative to V embeds differentiably into L p+1 . More explicitly, call h a stationary relative error if it is a time-independent solution to (2.1), or equivalently satisfies the nonlinear (singularly) elliptic equation Definition 3.3. (Ordinary limit) We say that V is an ordinary limit if there exists a constant δ ∈ (0, 1), an L 2 p+1 -open neighborhood U of 0 in ker L V and a C 1 diffeomorphism with the properties that In this definition, the identity map is the one on ker L V .
The point of this definition is the following theorem: Here are some preliminary (rather formal) observations. (2) If ψ ∈ ker L V consider h(s) = V (sψ) for s small. Then This remark shows the kernel of L V satisfies the next definition if V ∈ S is an ordinary limit.
In the context of minimal surfaces and geometric evolution equations, analogous concepts of kernel integrability date back at least to Allard-Almgren [3] and Simon [49] respectively. Simon used analyticity to show a converse to Remark 3.5. Inspired by this, it is natural to expect integrability of the kernel of L V to imply that V is an ordinary limit-though we have not verified this in the present context. Note that a limit V can fail to be ordinary in one of three ways: either (a) S can fail to be a manifold nearby; or (b) S can be a manifold locally which the relative error fails to embed differentiably into L 2 (V p+1 ); or (c) locally S can be a differentiably embedded manifold whose dimension is strictly less than that of ker L V . It is natural to expect that limits which fail to be ordinary are rare: To the extent that the behaviour of S mimics the stratification and singularities of an analytic variety, we imagine that (a) occurs only on a set having local codimension one in S. Based on the regularity results we establish below, we are skeptical that (b) ever occurs. And inspired by the genericity results of, e.g., Saut and Teman [50], we conjecture that (c) is non-generic in the sense that there is a dense G δ in S on which the kernel of the linearized operator has the same dimension as S; i.e. the extra zero modes associated with non-ordinary limits are coincidences whose eigenvalues become non-zero upon perturbation. (A weaker conjecture is that a perturbation of as well as V preserving the dimension of S is required to restore ordinariness.) It is these expectations that motivate our choice of the term 'ordinary'.
We finally translate the leading order asymptotics that we found for the relative error back to the original fast diffusion equation (1.1) and its convergence towards the separation-of-variables solution (3.8) (1) the difference w − W decays logarithmically or slower, (2) or the difference w − W decays algebraically or faster, Remark 3.8. Analogous estimates hold true for the relative error w(τ ) W (τ ) − 1. Proof. (Proof, Case 1) We start with the logarithmic decay estimate for which we assume that the first case in Theorem 3.1 applies. Then where we have used in the second inequality that Hence, by the definitions of w and W and the relation between t and τ , the latter estimate is equivalent to We now use the elementary estimate Since V /v = 1/(h + 1) is uniformly bounded by hypothesis, the above analysis gives that as desired. Case 2. We finally convert the exponential decay estimate on the relative error into an estimate for the fast diffusion equation (1.1). From the second case in Theorem 3.1, the definitions of W and w and the relation between t and τ we infer that We shall now invoke the elementary estimate |a − 1| |a m − 1| which holds true for a close to 1 to conclude a control on the relative error, which yields the desired statement in view of the scaling of W and the boundedness of V , This concludes the proof of the theorem.

Proof of First Dichotomy (Apart from Smoothing Estimates)
We start with an L 2 semigroup estimate which is at the heart of our analysis. 2) with initial datum h 0 ∈ L 2 p+1 , and assume that h L ∞ ε for some ε > 0. Then there exists C > 0 such that for ε > 0 small enough and all t > 0, the following holds: Proof. It will be convenient to consider the purely spatial form (2.3) of the nonlinearity N (u) = M V (u) when working with (2.1)-(2.2). Then setting dμ p (x) := V (x) p dx, testing (2.1) with hV p+1 and integrating by parts, we arrive at the identity Because |h| ε < 1, the right-hand side can be controlled by quadratic expressions, more precisely, If ε is sufficiently small, the gradient terms can be absorved into the right hand side, and the resulting estimate can be solved with the help of a standard Gronwall argument. This proves the lemma.
The proof of our first dichotomy, Theorem 3.1, is based on two main ingredients. On the one hand, our argument will rely on a fundamental dynamical systems result that is due to Merle and Zaag [42], and which in turn improves on an earlier related result by Filippas and Kohn [31]. On the other hand, we have to exploit some smoothing properties of the parabolic equation.
The Merle-Zaag lemma is concerned with a system of weakly coupled first order ordinary differential equations featuring stable, neutral and unstable solutions. It states that under the assumption that the unstable modes fail to grow, the long time asymptotics are dominated by precisely one of the other two modes. This lemma provides an effective way to extract the quantized behaviour of the solution prescribed by the discrete spectrum of its limit. It plays a pivotal role in recent progress on classifications of ancient solutions (solutions defined for (−∞, T ]) to parabolic equations [1,2,9,19] and entire solutions to elliptic equations [17] arising from geometry. In classifications of ancient flows, the lemma is applied backward in time (i.e. t = −s). One advantage in backward problems is that there are only finitely many stable eigenfunctions (imagine, for instance, the laplacian has only finitely many positive eigenfunctions); this makes the classifications possible. Meanwhile in the forward problem, there are infinitely many stable eigenfunctions. The lemma is used to investigate the asymptotic behavior of solutions. To obtain the stated dependencies of the constants in our first dichotomy requires a refinement of the Merle-Zaag Lemma due to K. Choi et al.
For later use we also recall a quantitative adaptation of the Merle Zaag lemma to a compact time interval proved by Choi and Sun in [20]. In Sect. 6 their result will be motivated and used to prove our second dichotomy.
The next proposition provides the crucial control that we need to estimate the nonlinear terms in the relative error dynamics (2.1) quadratically.
We postpone the proof and a discussion of this proposition to the next subsection and show first how to deduce our first dichotomy from Lemmas 4.1-4.2 and Proposition 4.4.
Proof of Theorem 3.1. In this proof, for the sake of convention, we omit the sub- . Positive constants ε = ε(V, p) and C = C(V, p) are not fixed yet, ε may become smaller, and C may become larger until they are fixed.
Let us denote by P s , P c and P u the orthogonal projections of L 2 p+1 onto the stable, center, and unstable eigenspaces E s , E c and E u , respectively. Moreover, we write h s = P s h, h c = P c h, and h u = P u h for the projected solutions. A straightforward computation reveals that where we recall λ −1 and λ K are the negative and positive eigenvalues of L= L V closest to zero. As an example, for the case of the evolution on the stable subspace, we observe that and testing with h s V p+1 gives The third estimate in (4.1) now results from the lower bound on the stable eigenvalues and the Cauchy-Schwarz inequality. The first and the second estimates are derived analogously. In order to estimate the nonlinearities in (4.1), we note that for |h| ε, (provided that ε is sufficiently small) there is by Taylor expansion followed by the smoothing estimates of Proposition 4.4 for t 1. By virtue of Bonforte et al. [12] uniform bound on the relative error (3.1), for any smallε > 0 there exists a time t 0 (ε) such that Plugging this estimate into (4.1), we observe that if h = 0, then h s , h c and h u satisfy the hypothesis Lemma 4.2 with s = λt where λ := 1 2 min(|λ −1 |, |λ K |). Therefore, unless h is stationary (and thus trivial), either Note in particular, the second alternative (4. We discuss the implications of each case individually, starting from the latter case (4.6). Case 2. First, by choosing ε = ε(V, p) sufficiently small, we may assume h u + h c 1 2 h s for all t 1. Here, note that t 1 is needed as Proposition 4.4 is applied to estimate N (h). Moreover, by the same argument we used to derive (4.4), Combining the above, the estimate of the nonlinearity (4.4) becomes for t 1. The third estimate in (4.1) thus turns into the differential inequality By another application of (4.6) and the semigroup estimate from Lemma 4.1, this estimate can be translated into the full solution To improve this inequality by removing the ε, fix ε small so that 2C 2 ε < 1 2 λ K , and refine the estimate of the nonlinearity with the help of the quadratic bound (4.3), the smoothing estimates from Proposition 4.4 and the previous estimate (4.7), for t 1. Substitution into the third estimate of (4.1) finally establishes a differential inequality, which yields Since the nonlinearity is quadratic, modes corresponding to eigenvalues in the interval [0, 2λ) are accessible via a simple ODE argument. For any i, the projection y i = h, ϕ i satisfies the differential inequality which can be rewritten as as a consequence of (4.8). Integration in time thus yields for any T t 1 that where we have used the fact that λ i < 2λ. This bounds implies that T → e λ i T y i (T ) is a Cauchy sequence, so that, sending T → ∞, we find for some C i ∈ R.
We now estimate with the help of the decomposition h = h s + h c + h u , the triangle inequality and the fact that the eigenfunctions ϕ 0 , . . . , ϕ J are orthonormal that We have just seen that the first term on the right-hand side is e −2λt . Recalling the spectral decomposition of L and (4.2), the next contribution, Similarly to the argumentation above, we rewrite (4.11) as and deduce that via integration. Finally, regarding the remaining terms in (4.10), the first two estimates in (4.1) imply that and thus, an integration from t to ∞ gives, thanks to the fact that Therefore, substituting (4.9), (4.12), and (4.13) into (4.10), we deduce that ; thus λ i λ as asserted. Case 1. In case the neutral modes are dominating (4.5), an estimate analogously to the one in Case 2 above gives rise to the bound provided that t t c for some t c large enough. By plugging this into the middle equation of (4.1), we find that and thus, via (4.5), for any t t c . With this information at hand, we may reconsider our previous bound on the nonlinearity. This time, making use of the pointwise estimate in (4.3) and the smoothing properties from Proposition 4.4, we have that for any t t c , for some (possibly larger) t c . The second estimate in (4.1) now turns into the growth condition which yields the lower bound if t t c for some t c . This concludes the proof of Theorem 3.1.

Smoothing Estimates
In this section, we use parabolic regularity techniques to prove Proposition 4.4. We remark that optimal (boundary) regularity estimates were derived recently by Jin and Xiong for the rescaled solution v of (1.2) rather than the relative error h [35]. However, from Theorem 5.1 in their paper we easily infer that for any k ∈ N and τ > 0. This insight will simplify the derivation of our regularity estimates substantially.
Smoothing features are typical for parabolic equations and they remain true in the time and tangential directios for the singular parabolic equation under consideration; see (5.1) and Corollary 5.12. In transversal direction, regularity is limited (if p is not an integer) [35]. Indeed, simple scaling arguments for the elliptic problem close to the boundary, and the same behavior can be expected for the parabolic problem (1.2).
For deriving the smoothing estimates in Proposition 4.4, we notice that the leading order contribution in the nonlinearity (2.2) is of the order |h||∂ t h| ε|∂ t h|, and plays thus the role of a perturbation term in regularity estimates. Moreover, the equation is invariant under differentiation in time and in tangential coordinates near the domain boundary (at least with regard to the leading order contributions), and thus, regularity in these variables is propagated and even further increased by parabolicity. We deal with higher-order derivatives of the nonlinearity by applying suitable interpolations, so that eventually, derivatives of the nonlinearity will play the role of perturbations similarly to the nonlinearity itself as discussed above. Of course, smoothing proceeds instantaneously but not uniformly in time. For this reason, the estimates in or behind Proposition 4.4, Equation (5.1) or Corollary 5.12 deteriorate as t → 0.
Before addressing the dynamical problem, we need to estimate the first three derivatives for weak solutions of the nonlinear elliptic problem (1.3), starting from the known result (5.2). Later we'll see that higher-order tangential derivatives of this solution can also be estimated near the domain boundary.
hold. Furthermore, there exists an r 1 such that for any x ∈ with dist(x, ∂ ) r.
Proof. For a fixed V ∈ S, the first estimate is established, for instance, in Theorem 1.1 in [28] (on the level of the evolutionary problem) or Theorem 5.9 in [13]; the form of (5.2) makes it clear that the constants depend continuously on V in the relatively-uniform topology on S. The second and the third estimate follow from maximal regularity estimates and Sobolev embeddings. Indeed, since V ∈ L ∞ ( ) thanks to (5.2), we must have that V ∈ W 2,q ( ) for any q ∈ (1, ∞) based on Calderón-Zygmund estimates for the elliptic problem (1.3), see, e.g., Chapter 11 in [41], and thus V ∈ C 1,α (¯ ) for any α ∈ (0, 1) by Sobolev embeddings. It follows that provided that α p − 1, and thus one spatial derivative of the equation shows V ∈ C 3,α (¯ ) by Schauder estimates, see, e.g., Chapter 6 in [34]. The statement in (5.4) is a consequence of the above. Indeed, according to (5.2), at the boundary V grows linearly in the direction of the inner normal. By the estimates (5.3), we must thus have (5.4) in a neighborhood of the boundary.
Our first step in the derivation of the smoothing estimates is a maximal regularity estimate for the linearization of (2.1). More precisely, we consider the inhomogeneous linear equation , a solution is always understood in the weak sense. A weak solution of (5.5) refers to a function h ∈ L ∞ ((0, T ); L 2 p+1 ) ∩ L 2 ((0, T );Ḣ 1 2 ) in the spaces provided by Lemma 4.1 such that for any ϕ ∈ C 1 c ([0, T ) ×¯ ) of compact support. It should be stressed that we do not impose spatial boundary conditions on h in the parabolic problem (5.5), which turns out to be well-posed (only) in this case. This is a consequence of the observation that by (formally) integrating by parts in the gradient term in the weak formulation (5.6), the boundary term vanishes thanks to the Dirichlet boundary conditions satisfied by V , see (1.3).
Existence of weak solutions can be derived via standard methods, for instance, via Galerkin approximations based on an L 2 p+1 orthonormal basis consisting of eigenfunctions of the linear operator L= L V . Moreover, from standard energy estimates (derived similarly to those in Lemma 4.1), we infer the uniqueness of weak solutions.
What is a crucial tool in our theory is a maximal regularity estimate for the linear equation (5.5), that we consider, for convenience, with L 2 2 p inhomogenity and zero initial data, see Proposition 5.5 below.
In the interior of , maximal regularity for the parabolic problem (5.5) follows by standard theory, see, e.g., Chapter 7.1 in [30], because the diffusivity coefficients are strictly positive in the interior as a consequence of (5.2). We shall thus focus on the boundary from here on and we fix x 0 ∈ ∂ . Let η denote a cut-off function on R n interpolating smoothly between η = 1 in B r (x 0 ) and η = 0 outside of B 2r (x 0 ).
A short computation reveals that the localized solution H = ηh satisfies the problem where F and G are given by The lemma guarantees that G belongs to , which is assumed for our weak solutions.

Lemma 5.2. (Weighted Poincaré/Hardy type inequality)
In particular, it holds that The proof of the first statement is based on an interpolation argument and the properties of the limit V . The latter then follows via Hölder's and Young's estimate.
Proof. We start considering the first estimate. Because C ∞ (¯ ) is dense in L 2 p+1 ∩ H 1 2 , cf. Lemma 2 in [47], it is enough to establish the estimate for smooth functions. We first notice that an integration by parts and the defining properties of V in (1.3) yield Making use of the elementary inequality ab a 2 + b 2 /4 thus gives that The first statement of the lemma is now a consequence of the fact that 1 V p+1 + |∇V |, which holds true thanks to Lemma 5.1.
For the second statement, we apply Hölder's inequality, and the previous bound to estimate The desired result is then a consequence of Young's inequality ab a q + b q for any Hölder conjugates q and q .
We shall now flatten the boundary. Upon a rotation of the coordinate system, we may assume that the boundary inside B r (x 0 ) can be written as a graph of a function γ , for instance, We setx = φ(x) = (x , x n − γ (x )), which defines a diffeomorphism in the support of η, and maps the boundary ∂ into the hypersurface R n−1 × {0}. In terms where Notice that the transformed equation (5.8) has to be considered on the halfspace R n + . The advantage of (5.8) over the (5.5) is that in the new variables, the weight and its tangential derivatives can be estimated by the distance to the flattened boundary.
Proof. We start by noticing that the boundary estimate (5.2) translates intô under the change of variables. Indeed, since on the one hand, by choosing y = x , we immediately deduce that On the other hand, as the minimizer y solves the optimality condition x − y = (x n − γ (y ))∇ γ (y ), we find Thus dist(x, ∂ ) is comparable tox n , which implies (5.10) via (5.2).
We have to show that this estimate remains true for tangential derivatives. Since Lemma 5.1 assertsV ∈ C 3,α , (5.9) follows directly for k ∈ {0, 1, 2} via Taylor expansion because the homogeneous boundary conditions are invariant under differentiation in tangential direction. For larger values of k, we have to transform the elliptic equation (1.3) into a problem on the half-space. In a similar way as we transformed the parabolic equation, we find that −∇ · (A∇(ηV )) =ηV p + B · ∇V + CV , for some smooth and bounded functions B and C on R n + that depend only on the regularity and the shape of the boundary ∂ . Differentiating with respect to tangential variables x i for any i < n, we find that to be the product of a C 3,α function with a ratio in which the C 2,α numerator and C 3,α denominator both vanish linearly at the halfspace boundary. On any smooth bounded subdomain of the halfspace containing φ( ∩ B 2r (x 0 ) ), Schauder theory (e.g., Chapter 6 in [34]), then impliesη∂x iV ∈ C 3,α so that (5.9) holds for k = 3. For larger k, choosing a sequenceη k−1 >η k of nested cutoffs satisfying the same hypotheses asη 3 =η, and a multi-index β ∈ N n−1 Induction on k gives f β ∈ C 1,α henceη kD βV ∈ C 3,α and (5.9) for all k; this induction relies on the decay already established for the derivatives of V which appear in the p-homogeneous nonlinearities, (and the fact that of the k−1 derivatives of V that contribute elsewhere to f β , all but two are in tangential directions).
Finally, the third statement of the lemma follows from Lemma 5.1 and Taylor expansion.
It follows immediately from the preceding lemma that the problem in (5.8) can be further rewritten as for some new ellipticÃ, and whereG is the sum ofĜ and other lower-order terms of the same class. The weak formulation in (5.5) now turns into for anyφ ∈ L 2 (L 2 2 ) ∩ L 2 (Ḣ 1 2 ) ∩Ḣ 1 (L 2 p+1 ) vanishing near the endpoint T . We now prove maximal regularity for the problem in (5.11).
Proof. In order to simplify the notation, we drop the hats and tildes from here on. Moreover, we set G = 0. We will here only give formal arguments. The estimates can be derived rigorously by approximating F smoothly and using suitable finite difference quotient approximations for the test functions, see, e.g., Chapter 6.3 in [30]. In a first step, we use (an approximation with smooth cut-offs in time of) ϕ = −χ (0,T ) ∂ 2 k H for some k ∈ {1, . . . , n − 1} as a test function in (5.12), where χ (0,T ) is the characteristic function for the time interval, and we obtain Multiple integrations by parts then yield 13) and invoking the ellipticity of the matrix A, the Cauchy-Schwarz inequality and recalling that H was assumed to have zero initial data yields Via the elementary estimate ab εa 2 + ε −1 b 2 , we can control the second order term on the right-hand side. We have thus derived the desired control over the second order tangential and mixed derivatives, namely Moreover, an application of Lemma 5.2 provides also the control over the first order tangential derivatives, because p + 1 > 2 implies: Notice that a replacing the time interval (0, T ) in (5.13) by (t, t + ε) shows also the continuity of ∂ k H L 2 p+1 in time. In order to control the transversal derivatives, it is now enough to focus on the x n variable, and thus study the one-dimensional problem because all the other terms that appear in (5.11) are now known to belong to L 2 2 p . In order to simplify the notation further, we drop the subscripted n's in the rest of the proof. Furthermore, the problem becomes more accessible if we freeze the diffusivity function A at an arbitrary point x * . That is, we study the equation where we have set A * = A(x * ). Thanks to the regularity of A, it holds that |A * − A(x)| δ for x and x * in the interval (0, δ). We should thus localize the problem further by smuggling in a cut-off function η satisfying η = 1 in (0, δ) and vanishing outside of (0, 2δ). This way, we are led to considering and we writeF for the right-hand side for brevity and setH = ηH . Now, if we can show that the statement follows if δ is sufficiently small, because by the regularity of A and the properties of the cut-off function. In view of the interpolation Lemma 5.2, the L 2 norm on H can be replaced by the L 2 2 p as in the statement of the lemma.
We have now reduced the multi-dimensional problem with variable coefficients (5.11) to a one-dimension problem with constant coefficient, We have to do one more transformation in order to arrive at a problem that is better behaved. Indeed, if we change variablesx = x p+1 ,ť = A * ( p + 1) 2 t, H (ť,x) =H (t, x) andF(ť,x) =F(t, x) the above equation turns into p+1 . This is precisely the linear version of the parabolic equation that characterizes the porous medium dynamics in a neighborhood of the Barenblatt solution as studied earlier in [38,40,48]. It is well-understood: Calderón-Zygmund and Muckenhoupt theory is available and provides estimates for any q ∈ (−1, 2(σ + 1) − 1), see the proofs of Proposition 3.23 in [38] or Proposition 4.23 in [48]. Our choice is q = p p+1 , which is equivalent to (5.14).
We summarize our findings as follows: This maximal regularity result can be easily combined with the energy estimate for the nonlinear problem.

Lemma 5.6. (Nonlinear smoothing 1) Let h be a solution to the nonlinear equation
(2.1) with initial datum h 0 ∈ L 2 p+1 and assume that h L ∞ ε for some ε > 0 small enough. Then, for any 0 < τ < 1 < T there exists C = C(τ, T, n, p, V ) such that Proof. We denote by ζ a smooth cut-off function that is 1 in the time interval [τ, T ] and zero in [0, τ/2]. We localize the evolution (2.1) with the help of this function and apply the maximal regularity estimate from Proposition 5.5 to the effect that Thanks to the particular structure of the nonlinearity (2.2), the third term on the right hand side can be estimated by Using the fact that L 2 p+1 embeds continuously into L 2 2 p by the virtue of (5.2), the above estimates combine with the pointwise bounds from Lemma 4.1 to give .
Choosing ε small enough and invoking the properties of the cut-off function yields the statement of the lemma.
Before turning to higher-order derivatives, we use integration by parts to establish a class of interpolation inequalities which will allow us to control the effects of the nonlinearity.
Proof. To keep the notation as simple as possible, we perform a rather symbolic calculation. That is, we write ∂ m for some partial derivative of mth order, ∂ m = ∂ α x with |α| = m. In our argument, the precise value of α is not of importance.
We start with an integration by parts to notice that where the last estimates follow from Hölder's inequality. Here, we employ the convention that Applying Young's inequality in the form ab εa 1 θ + C ε,θ b 1 1−θ for some arbitrarily small ε and θ ∈ (0, 1) to the right hand side yields It remains to apply an iteration procedure. For this purpose, we set, for k +m, (k, , 0) and notice that A 0 := A(k, 0, m) = h L ∞ . Upon rescaling h, we may assume from here on that A 0 = 1. (We may always assume A 0 = 0 as otherwise the lemma is vacuously true.) With this notation, the previous estimate becomes (5.18) and the statement of the lemma can be rephrased as , m), (5.19) for every k and 1 k − 1. The proof will be a double-induction on (k, ). We start by noticing that for k = 2 and = 1, our objective (5.19) is nothing but the estimate (5.18) just proven. Suppose k 3 is fixed and that (5.19) holds true for k − 1 and any k − 2. Our goal is to show (5.19) for fixed k and all 1 k − 1. We first need some auxiliary inequalities. Note for ϕ = Dψ, it holds that and since the estimate in (5.18) is independent of the choice ψ, the inductive hypothesis (5.19) allows us to estimatẽ for any k − 2. Plugging this bound into (5.18) gives We claim that this estimate implies for any 1 k − 1. Indeed, the case = 1 follows directly from (5.20) because C(k, 0) = A 0 = 1. The general case follows by induction: We suppose that (5.21) is proved for 1, 2, . . . , − 1 and we aim at establishing it for . For this purpose, we use Young's inequality in (5.20) to the effect that for some arbitrary ε. Invoking the hypothesis that (5.21) holds true for − 1, we then deduce which gives (5.21) for if ε is chosen small enough. It remains to iterate (5.21) to find which is what we aimed to prove, cf. (5.19).
We will now perform an intermediate step towards higher-order regularity estimates by lifting the norms on the left-hand side in (5.16) to the next order in time. Higher-order time derivatives will be considered subsequently simultaneously with suitable higher-order spatial derivatives. The intermediate step that we take in the following lemma is necessary in order to control lower-order error terms that appear later as a result of a transformation of the equation close to the boundary; see (5.22) below.

Lemma 5.8. (Nonlinear smoothing 2) Let h be a solution to the equation (2.1)
with initial datum h 0 ∈ L 2 p+1 and assume that h L ∞ ε for some ε > 0. Then if ε is small enough and 0 < τ < 1 < T , it holds that Proof. Regularity in time was proved already by Jin and Xiong, see (5.1) above. In order to get control over the mixed derivatives, we proceed carefully by considering finite difference quotients d s t h(t) = s −1 (h(t + s) − h(s)). We consider the same cut-off function in time as in the proof of Lemma 5.6. Then localizing the nonlinear equation and "differentiating" (5.17) with respect to time, we obtain Regarding the nonlinear terms, we notice that |N (h)| |h| 2 + |h||∂ t h| and |d s t N (h)| |h||d s t h| + |d s t h||∂ t h| + |h||∂ t d s t h|. Therefore, using |h| ε 1 and making use of the maximal regularity estimate for the linear problem, Proposition 5.5, we find that . If ε is sufficiently small, the last term on the right-hand side can be absorbed into the left-hand side. Moreover, the (remaining) expressions on the right-hand side are bounded uniformly in s by the virtue of Jin and Xiong's regularity statement (5.1). We may thus pass to the limit s → 0 and find . Now, applying the interpolation Lemma 5.7 in the form , and using again that |h| ε 1 leads us to the estimate , provided that ε is sufficiently small. Since μ 2 p μ p+1 by the virtue of Lemma 5.1, we can now apply Lemmas 4.1 and 5.6 and deduce the statement of the lemma by the properties of the cut-off function.
Similarly to the derivation of the maximal regularity estimate in Proposition 5.5, the derivation of higher-order regularity estimates requires attention only in a neighborhood of the boundary. Indeed, in the interior the equation is parabolic with smooth coefficients, and thus, higher-order estimates in the interior just follow by standard iterative arguments based on the maximal regularity estimate from Proposition 5.5. As before, we shall thus focus on the boundary from here on. We choose essentially the same notation as in the proof of Proposition 5.5 and we fix x 0 ∈ ∂ arbitrarily and let η denote a cut-off function on R n interpolating smoothly between η = 1 in B r (x 0 ) and η = 0 outside of B 2r (x 0 ). Moreover, as in the proofs of Lemmas 5.6 and 5.8, we have to introduce a cut-off function ζ defined on [0, ∞) that satisfies ζ = 0 in [0, τ/2] and ζ = 1 in [τ, T ] for some 0 < τ < 1 < T . Treating the nonlinearity N (h(x)) =: f (x) as an inhomogeneity and smuggling ζ η into the equation, we find that H = ζ ηh satisfies We now apply the same diffeomorphism φ that we used in order to transform the elliptic problem (5.7) into the half-space problem (5.11), and arrive at We will now derive control on higher-order derivatives for equation (5.23). As before, we interpret the weighted Lebesgue norms with respect to the simpler weight x n and we consider L r (L 2 q ) = L r ((0, T ); L 2 (μ q )) with measure dμ q =x q n dx and typically r = 2. Moreover, we write z = (t,x ) for the time and flattened tangential variables, whereas∇ denotes the full spatial gradient (tangential and normal) in flattened coordinates. Lemma 5.9. (Tangential smoothing by the linear inhomogeneous evolution) Let H be a solution to the transformed equation (5.23). For any k ∈ N and α ∈ N n 0 with |α | = k, it holds that
We can proceed as in the proof of Proposition 5.5 and show that (5.15) holds true on the half-space, that is, we have Note that the implicit constant in this estimate might be time-dependent and blow up for infinite times. Now, we differentiate (5.23) in time and tangential direction. For m ∈ N 0 and α ∈ N n−1 0 , it holds that provided that ε is sufficiently small and |β| 1. Therefore, summing over any multi-indices α with |α| = k and integrating in space and time, we have that Let's discuss the right-hand side term by term. The first term is exactly of the kind we are looking for. For the second one, we apply Lemma 5.7 and find, for some ν > 0. The remaining terms are bounded by the same quantity.
The treatment of f 1 is similar. This time, derivatives of the nonlinearity are bounded as follows, as can be observed by Young's inequality and an iterative argument. With the help of the Leibniz rule, we thus obtain and absorbing ∂ t into D z and an application of the Hölder inequality furthermore yields We now invoke Lemma 5.7 to the effect that for some ν > 0.
With these preparations, we are in the position to extend the L 2 estimates from Lemmas 4.1, 5.6, and 5.8 to z-derivatives of any order where z = (t,x ) denotes the tangential and time variables and∇ denotes the full spatial gradient (tangential and normal) in flattened coordinates. For our purposes, it is enough to bound the unweighted terms in these estimates. Proposition 5.11. (Tangential nonlinear smoothing in Hilbert norms) LetĤ be the solution to the transformed equation (5.23) and suppose that Ĥ L ∞ ε for some ε small enough. Let 0 < τ < 1 < T be given. Then, for any k ∈ N 0 , it holds that Proof. We start by noting that the estimates from Lemmas 4.1, 5.6 and 5.8 easily translate into the localized setting, so that In order to derive estimates on derivatives of the next order, we invoke Lemma 5.9 with k = 1, Of course, our presentation here is a bit formal: Instead of considering derivatives D z , we should more carefully apply difference quotients to the nonlinear equation.
We have done so in Lemma 5.8 to deal with time derivatives. Because of the known reguarity in time (5.1), passing to the limit in the difference quotients don't cause any problems, also in the nonlinear terms. The strategy remains the same for higher order derivatives in time, and we may generously simplify our presentation here by considering proper time derivatives in the sequel. When it comes to higher order derivatives in tangential direction, a result analogous to (5.1) is missing, but will be derived by us in Corollary 5.12 below. We should thus be a bit more careful in our argumentation. For the sake of a simpler presentation though, we shall keep the notation D z and will tacitly interpret it as difference quotients in the tangential variables. Only in the discussion of the leading order nonlinear terms, we shall recall its actual meaning. For all other terms we will be rather formal.
Let us start considering the lower-order terms. From the definition ofĜ, we deduce that where we have set ψ = ζη.
We will now apply an interpolation to modify the weights on the right-hand side. Let φ be a cut-off function that is 1 in the support of ψ and vanishes outside of a small neighborhood spt ψ of this support. Then it holds for any regular function ξ that An application of the Cauchy-Schwarz inequalities then yields This argument can be repeated by writingx q n = 1 q+1 d dx nx q+1 n and using thatμ q μ 2 for q 2, to derive for any q ∈ 2N. By interpolation, this estimates extends to any q > 0. Making use of this interpolation-type estimate with suitable choices of q in the above estimate forĜ and estimatingμ 2 p μ p+1 μ 2 1 yields We may now invoke (5.26) with suitable choices of τ , T and spatial cutoff r to deduce that It remains to estimate the nonlinear terms in (5.27). We promised to be more careful when considering higher order tangential difference quotients and we should thus briefly discuss the rigorous treatment of the leading order nonlinear terms. Because tangential derivatives leave the limit functionV invariant in terms of scaling, cf. Lemma 5.3, these are terms of the order ψ|d s iĥ ||∂ tĥ | and ψ|ĥ||∂ t d s iĥ |, where d s i is the difference quotient operator in directionx i , see also Lemma 5.8. By using the smallness ofĤ in the assumption, the second of these terms can be absorbed into the left-hand side before passing to the limit s → 0. The other term can be split into the two quadratic terms ψ(d s iĥ ) 2 and ψ(∂ tĥ ) 2 , among which we only have to consider the first one, because time regularity is already settled. Here, we notice that this term is bounded uniformly in s, because ψ(∂ iĥ ) 2 L 2 2 p can be estimated by ĥ L ∞ ψ∂ 2 iĥ L 2 2 plus lower order terms via the interpolation Lemma 5.7 and by usingμ 2 p μ 2 in the support ofη. This term is controlled via (5.26). There are no further regularity issues popping up when considering higher order tangential derivatives. We shall continue with the rather formal discussion and summarize here that Lemma 5.10 implies , where the last inequality is due to (5.26).
Plugging these estimates (for the lower-order terms involvingĜ and the nonlinearities involvingF) into (5.27) thus yields provided that ε ν is chosen small enough that the final terms above which it multiplies can be absorbed into the left hand side. This procedure can be iterated, with a suitable adaption of τ, T and the radii r of the spatial cut-off functions η in each step to prove that inductively. This implies the desired bounds.
Finally, we use generalized Sobolev embeddings to pass from L 2 to L ∞ estimates.
Corollary 5.12. (Tangential nonlinear smoothing in weighted uniform norms) Let H be the solution to the transformed equation (5.23) and suppose that Ĥ L ∞ ε for some ε small enough. Let 0 < τ < 1 < T be given. Then, for any k ∈ N 0 , it holds that Proof. The estimate basically follows from Proposition 5.11 via generalized Sobolev embeddings using compact support in the z = (t, x ) ∈ R n variables, followed by (two) integrations in x n where we have a vanishing boundary condition at one end x n = r only. Indeed, for m ∈ N with m > n/2, it holds that We now use the Hardy inequality whose proof is similar to that of (5.28), to eliminate the zero-order terms on the right-hand side, thus, Finally, we apply Proposition 5.11 to infer the desired control of the first term in the statement of the lemma. The second term is bounded analogously, by applying the same argument tox n D k∇Ĥ in place of D kĤ .
We are now well-prepared to prove Proposition 4.4.
Proofs of Proposition 4.4. As a consequence of Corollary 5.12 and the construction ofĤ , we find for any x 0 ∈ ∂ , 0 < τ < 1 < T and any r > 0 small enough that In particular, covering a small band along the domain boundary with a finite number of balls, the latter extends to the band r = {x ∈ : dist(x, ∂ ) r } for some small r > 0, As mentioned earlier, similar (but simpler, thanks to the strict parabolicity) arguments in the interior of the domain yield analogous estimates on \¯ r . Both together prove the statement of the proposition.

Proof of Second Dichotomy
In this final section, we turn to the proof of Theorem 3.4, which states optimal exponential convergence of the relative error under the assumption that V is an ordinary limit in the sense of Definition 3.3. Thanks to our first dichotomy result-Theorem 3.1-and Proposition 4.4, it is enough to establish convergence at some exponential rate, which is the main result of the present section.
The proof of exponential convergence relies on Choi and Sun's refinement-Lemma 4.3-of the Merle-Zaag dynamical systems result recalled above. It roughly says that if a solution is known to be small on a large time interval, then, up to a possible error caused by the neutral modes, the stable and unstable modes should be much smaller than an exponentially decaying term in the middle of this time interval. (Note the unstable modes tend to decay like the stable ones if time goes backward and this is why we need to go to the middle of time interval.) The main issue to take care of is thus the control of the neutral modes.
The underlying idea for controlling the neutral modes is reducing the amplitude of the neutral modes by changing the reference stationary solution in the direction of the neutral modes. This can be effectively done if the limit V is ordinary. This strategy goes back to the work of Allard and Almgren [3], who gave kernel integrability conditions guaranteeing that minimal surfaces converge to their tangent cones sequentially and exponenentially fast. See also Section 6 of Simon [49], or the recent contributions of Choi, Choi, Kim and Sun in various combinations [17,20].
In order to pursue this strategy, we have to prove that being an ordinary limit is an open property among stationary solutions S. This crucial insight requires some technical preparations. Lemma 6.2. (Lower semicontinuity of kernel dimension at an ordinary limit) Let V ∈ S be an ordinary limit and δ > 0 as in Definition 3.3. LetṼ ∈ S be sufficiently Proof. Let 0 ∈ U ⊂ ker L V the neighbourhood provided by Definition 3.3. We fix another element in the kernel, ζ ∈ ker L V , and choose s 0 small enough such that ψ + sζ ∈ U for any s ∈ (−s 0 , s 0 ). Then h s = V (ψ + sζ ) defines a family of stationary relative errors We may now rewriteh and thanks to the regularity properties of the diffeomorphism V , differentiation in the previous two identities yields . The latter verifies the inclusion (6.1).
The next lemma guarantees that spectral gaps are preserved by nearby stationary solutions.
for some λ ∈ R. Suppose that the sequence of eigenvalues is bounded, |λ | for some > 0. Then there exists a subsequence {φ k } k∈N and a function φ ∈ L 2 (V p+1 dx) such that The limiting function φ is a normalized eigenfunction, i.e, for some λ ∈ R. Moreover, the following hold true: (1) If λ = 0 for all ∈ N, then λ = 0.
The lemma entails, in particular, that if |λ | > 0 for all ∈ N, then lim inf where λ u is the largest negative and λ s is the smallest positive eigenvalue of L V .
Proof. For the compactness assertion, we aim to bound sup φ H 1 . Applying Proposition 5.5 with h/t = φ = f /(λ t + 1) yields the unweighted gradient bound Now the local independence (in the relatively-uniform topology) of constants in on V ∈ S combines with Lemma 5.2 to imply the sequence φ is bounded in the Sobolev space H 1 ( ). Via a Rellich compactness argument, we conclude that {φ } ∈N converges strongly subsequentially in L 2 (and then also in L 2 (V p+1 dx)) and weakly in H 1 (V 2 dx) towards some function φ. Moreover, thanks to the Bolzano-Weierstraß theorem, we find that the sequence of eigenvalues {λ } ∈N converges subsequentially to some λ ∈ R.
Considering a common subsequence (indexed by k ) and passing to the limit in the weak formulation of the eigenvalue equation, where we also use the uniform convergence of the relative error h k , we find that φ is a normalized eigenfunction of L V with eigenvalue λ. It remains to derive the assertion on the sign of the limiting eigenvalues. The first statement is trivial, while the proofs of two others are identical. Let's thus focus on one of them, say the middle one. It is clear that the limiting eigenvalue is nonnegative, λ 0 and we have to rule out that it is in fact zero. For this purpose, we note that the eigenfunctions φ are orthogonal to the kernel, φ ζ V p+1 dx = 0 for any ζ ∈ ker L V . (6.2) We pick ζ ∈ ker L V and define ζ ∈ ker L V according to the inclusion (6.1) derived in Lemma 6.2 by where ψ ∈ ker L V is such that h = V (ψ ). It is straightforward to verify that ζ converges to ζ strongly in L 2 (V p+1 dx) as → ∞. Indeed, because of the imposed convergence of the relative error h and since the diffeomorphism V vanishes only at the origin, we must have that ψ → 0 strongly in L 2 (V p+1 ). Furthermore, by the continuity of the derivative d V , it holds that (d V ) ψ → id. Using once again the uniform convergenve of the relative error, we conclude that ζ → ζ in L 2 (V p+1 dx). We now pass to the limit in the orthogonality condition (6.2) with our particular construction of the ζ 's, which was arbitrary in the choice of ζ , and find φζ V p+1 dx = 0 for any ζ ∈ ker L V .
Hence, φ is a (nontrivial) eigenfunction of L V that is orthogonal to the kernel. We conclude that λ > 0 as desired.
The preceeding analysis allows us to conlude quite easily that the dimension of the kernels of the linear operators remains constant if the reference stationary solution is changed in a neighborhood of an ordinary limit V . Lemma 6.4. (Invariance of the kernel dimension near ordinary limits) Let V ∈ S be an ordinary limit and δ > 0 as in Definition 3.3. LetṼ ∈ S be a stationary solution close to V in the sense that h =Ṽ /V − 1 satisfies h L ∞ δ , for somẽ δ ∈ (0, δ). Thenδ sufficiently small implies dim ker L V = dim ker LṼ .
Proof. As a consequence of Lemma 6.2, and because (d V ) ψ is an isomorphism, it is clear that We argue that both kernels have indeed the same dimension ifδ is sufficiently small. We give an indirect argument and derive a contradiction by assuming that there exists a sequence h = V /V − 1 satisfying h L ∞ 1 and dim ker L V K + 1. We pick K + 1 orthonormal (and thus linearly independent) functions φ ,1 , . . . , φ ,K +1 in ker L V .
By the virtue of Lemma 6.3, there exist subsequences (that we will not relabel) and normalized functions φ 1 , . . . , φ K +1 in ker L V such that In particular, using in addition that {h } ∈N is converging uniformly, we find that φ 1 , . . . , φ K +1 is orthonormal. Thus, the dimension of ker L V is at least K + 1, which contradicts (6.3).
Collecting these technical preparations, we are able to derive the aforementioned key feature: that ordinariness is a relatively-uniformly open property in the set S of stationary limits. Proposition 6.5. (Ordinary limits form an open subset of S) Let V ∈ S be an ordinary limit. IfṼ ∈ S is a stationary solution sufficiently close to V in the relatively-uniform topology, thenṼ is also an ordinary limit.
Proof. By combining the results from Lemmas 6.2 and 6.4, we see that where ψ ∈ ker L V is such that h = V (ψ). It follows that the mapping Ṽ defined by forψ ∈ ker LṼ , is a C 1 diffeomorphism from a neighborhood of 0 ∈ ker LṼ into the set of stationary solutions LṼh = M(h) and satisfies all the properties listed in Definition 3.3, as can be readily verfied. Notice also that we have see the construction of this diffeomorphism already in the proof of Lemma 6.2: If ζ ∈ ker L V is sufficiently small so that h ζ = V (ψ + ζ ) is well-defined, then V ζ = V (h ζ + 1) is a stationary solution and the relative errorh ζ with respect tõ solves the stationary error equation relative to the reference pointṼ .
We will now derive a building block for the proof of Theorem 6.1 which exploits both the previous Proposition 6.5 and the Choi-Sun refinement of the Merle-Zaag Lemma 4.3, to yield a suitable reduction of the neutral modes as announced at the beginning of this section. The actual proof of exponential convergence will follow subsequently by iteration. Proposition 6.6. (Improvement by changing reference stationary solutions) Let V ∈ S be an ordinary limit (1.3). There exist four constants ε 0 , δ 0 ∈ (0, 1) and τ, C ∈ (1, ∞) with the following property: Let V (1 + h(t)) solve the dynamics (1.2) and stay near V , for all s τ . Moreover, for each s 0,ĝ s+τ is close enough to g s that g s −ĝ s+τ L ∞ Cε. (6.8) In the proof of Theorem 6.1 below, the exponential convergence rate γ = log 2 τ is determined by the delay τ provided by the preceding prop.
Proof. To simplify the notation, we write for anyṼ stationary solution (1.3). We start by noting that for two stationary solutionsṼ = V (1 +g) andV = V (1 +ĝ) to (1.3) with g L ∞ , ĝ L ∞ δ 0 < 1, there holds i.e., two norms are equivalent f Ṽ f V provided δ 0 < 1/2. Throughout this proof, f g denotes the inequality f Cg for some constant C which may depend on n, m, and V but uniform in small ε 0 , δ 0 , and large τ .
We now have all the tools at hand to proceed to the proof of the exponential convergence result.
Proof of Theorem 6.1. The idea is to iterate Proposition 6.6. By a translation in time, we may assume that sup t 0 h(t) L ∞ ε 1 for some ε 1 min(δ 0 , ε 0 ). Moreover, we choose ε 1 small so that 2Cε 1 < δ 0 holds, where C < ∞ is the constant in Proposition 6.6.
Moreover, the estimate (6.21) also implies that, for any fixed s, the sequence g k,s converges geometrically in L ∞ to some g ∞,s , g k,s − g ∞,s L ∞ Cε 1 2 k , for any k ∈ N and s 0. We conclude via the triangle inequality and estimate (6.20) that h(t + kτ ) − g ∞,s L 2 p+1 C 1 2 −k for some new constant C 1 , any s 0 and any t ∈ [s, s + 2]. Picking s = t = 0, we deduce exponential convergence with rate γ = (log 2)/τ towards g ∞,0 , which must actually vanish, g ∞,0 = 0, because h(t) is decaying to zero by the virtue of the Bonforte-Grillo-Vázquez theorem [12], cf. (1.5). This finishes the proof of the second dichotomy.
Funding Open Access funding enabled and organized by Projekt DEAL.

Data Availability Statement
Data sharing is not applicable to this article as no data sets were generated or analyzed during the current study.

Conflicts of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/ licenses/by/4.0/.
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.