1 Introduction

In the present work, we study a finite, discrete nonlinear Schrödinger equation with dissipation, which we first considered in [1]. We will need to repeat several equations from that paper, but the aim is now to give a complete proof of the observations and assertions made there. One starts with

$$\begin{aligned} -{\varvec{\mathbf{i}}}\frac{\partial u_j}{\partial \tau } = - (\varDelta u)_j + |u_j|^2 u_j, \ j=1,2, \dots , n, \end{aligned}$$

where we will add dissipation later. Here \((\varDelta u)_j= u_{j-1}-2 u_j +u_{j+1}\), with free end boundary conditions for \(j=1\) or n, i.e., \((\varDelta u)_1= -u_1+u_2\) and \((\varDelta u)_n=-u_n+u_{n-1}\).

For the convenience of the reader, the following introduction repeats the setup from [1].

We will choose initial conditions for this system in which essentially all of the energy is in mode \(u_1\), and will add a weak dissipative term to the last mode as in [1, 2] by adding to Eq. (1.1) a term of the form

$$\begin{aligned} {\varvec{\mathbf{i}}}\gamma \delta _{n,j} u_j, \end{aligned}$$

i.e., we add dissipation to position n, at the opposite end from the energetic mode.

Eventually, this will lead to the energy of the system tending to zero, but we are interested in what happens on intermediate time scales, and in particular, how the energy is transported from one end of the lattice to the other.

If our initial conditions are chosen so that \(u_1(0) = \sqrt{ \varOmega } \), and all other \(u_j(0) = 0\), then we expect that at least initially, the coupling terms between the various modes will play only a small role in the evolution and the system will be largely dominated by the equation for \(u_1\):

$$\begin{aligned} -{\varvec{\mathbf{i}}}\frac{\mathrm{d}u_1}{\mathrm{d}\tau } = |u_1|^2 u_1, \end{aligned}$$

with solution \(u_1(\tau ) = \sqrt{\varOmega } e^{{\varvec{\mathbf{i}}}\varOmega \tau }\)—i.e., we have a very fast rotation with large amplitude. With this in mind, we introduce a rescaled dependent variable and rewrite the equation in a rotating coordinate frame by setting:

$$\begin{aligned} u_j(\tau ) = \sqrt{\varOmega } e^{{\varvec{\mathbf{i}}}\varOmega \tau } {\widetilde{w}}_j(\tau ) . \end{aligned}$$

Then \({\widetilde{w}}_j\) satisfies

$$\begin{aligned} \varOmega {\widetilde{w}}_j -{\varvec{\mathbf{i}}}\frac{\partial {\widetilde{w}}_j }{\partial \tau } = - (\varDelta {\widetilde{w}})_j + \varOmega |{\widetilde{w}}_j |^2 {\widetilde{w}}_j . \end{aligned}$$

We now add dissipation through a term which acts on the last variable, with \(\gamma \ge 0\),

$$\begin{aligned} \varOmega {\widetilde{w}}_j -{\varvec{\mathbf{i}}}\frac{\partial {\widetilde{w}}_j }{\partial \tau } = - (\varDelta {\widetilde{w}})_j + \varOmega |{\widetilde{w}}_j |^2 {\widetilde{w}}_j +{\varvec{\mathbf{i}}}\gamma \delta _{n,j} {\widetilde{w}}_j . \end{aligned}$$

Rearranging, and dividing by \(\varOmega \) gives

$$\begin{aligned} - {\varvec{\mathbf{i}}}\frac{1}{\varOmega } \frac{\partial {{\widetilde{w}}}_j }{\partial \tau } = - \frac{1}{\varOmega } (\varDelta {\widetilde{w}})_j - {\widetilde{w}}_j + |{\widetilde{w}}_j |^2 {\widetilde{w}}_j + {\varvec{\mathbf{i}}}\frac{\gamma }{\varOmega } \delta _{n,j} {\widetilde{w}}_j . \end{aligned}$$

Finally, we define \(\varepsilon = \varOmega ^{-1}\) and rescale time so that \(\tau = \varepsilon t\). Setting \(w(t) = {\widetilde{w}}(\tau )\), we arrive at

$$\begin{aligned} -{\varvec{\mathbf{i}}}\frac{\partial w_j}{\partial t} = -\varepsilon (\varDelta w)_j - w_j + |w_j|^2 w_j+ {\varvec{\mathbf{i}}}\gamma \varepsilon \delta _{n,j} w_j. \end{aligned}$$

Now that we have defined the main equation, Eq. (1.3), we describe the picture that will be proved later:

When \(\gamma =0\) and \(|\varepsilon | \ne 0\) is small, this system possesses a family of breathers, namely solutions which in this rotating coordinate system are stationary states in which most of the energy is localized at site 1. Such solutions take the form

$$\begin{aligned} u_{\varepsilon ,j}^{(0)}\sim (-1)^{j+1}\varepsilon ^{j-1}, \quad j=1,\dots ,n . \end{aligned}$$

In fact, there are many such solutions, with different frequencies, which we will write as

$$\begin{aligned} u_{\varepsilon ,j}^{(\varphi _0)} = e^{{\varvec{\mathbf{i}}}\varphi _0 t} p(\varphi _0)_j, \end{aligned}$$

for \(\varphi _0\) near zero, and \(p(\varphi _0)_j \sim u_{\varepsilon ,j}^{(0)}\). We will demonstrate the existence of these families of solutions in Theorem 1.6, using the implicit function theorem, and also give more accurate asymptotic formulas for them. As Eq. (1.3) is invariant under complex rotations, we actually have a circle of fixed points, with a phase we call \(\vartheta \). As there is one such circle for every small \(\varphi _0 \) we represent, in Fig. 1, these solutions as a (green) cylinder, with the direction along the cylinder corresponding to changing \(\varphi _0\) and motions “around” the cylinder corresponding to changing \(\vartheta \).

When the dissipation is nonzero (i.e., when \(\gamma >0\)) these periodic solutions are destroyed, but they give rise to a family of time-dependent solutions which “wind” along the cylinder, the red curves in Fig. 1. We will prove that one can accurately approximate solutions of the dissipative equations by “modulating” the frequency and phase of the breather, namely we prove that the solutions of the dissipative equation can be written as:

$$\begin{aligned} u_j(t)= e^{{\varvec{\mathbf{i}}}(t\varphi (t)+\vartheta (t))}u_{\varepsilon ,j}^{(0)}+z_j(t) , \quad j=1,\dots ,n, \end{aligned}$$

where

$$\begin{aligned} {{\dot{\varphi }}}(t)&\sim -2\gamma \varepsilon ^{2n-1},\\ t{{\dot{\varphi }}}(t)+{{\dot{\vartheta }}}(t)&\sim 0,\\ {\left| \left| z(t) \right| \right| }&\text { remains bounded by }{\mathcal {O}}(\gamma \varepsilon ^n) . \end{aligned}$$

The higher order terms that we have omitted from these expressions are explicitly estimated in Sect. 7.

We prove that the initial values \(\varphi _0\) and \(\vartheta _0\) can be chosen so that z(t) is normal to the cylinder of breathers at the point \((\varphi _0,\vartheta _0)\), and that its long term boundedness is due to the (somewhat surprising) fact that the linearized dynamics about the family of breathers is uniformly (albeit weakly) damping in these normal directions. This is the main new technical result of the paper, and the proof of this fact takes up Sects. 3–6. That breathers can play an important role in the non-equilibrium evolution of systems of coupled oscillators has also been discussed (non-rigorously) in the physics literature. For two recent examples see [5, 6].

To formulate our results more precisely, we need some notation: Let \(\delta (t)\equiv \varphi (t)-\varphi _0\). Let \(s(t)=\int _0^t \mathrm{d}\tau (\tau {{\dot{\varphi }}}(\tau ) +{{\dot{\vartheta }}} (\tau ))\). Let the initial condition be \(\varphi _0\), \(\vartheta _0\) and \(z_0\), with \(z_0\) perpendicular to the tangent space to the cylinder at \(\varphi _0\), \(\vartheta _0\). Note that \(\delta (0) = s(0) = 0\).

Theorem 1.1

For sufficiently small \(\varepsilon >0\) and \(\gamma >0\) the following holds: Assume

$$\begin{aligned} \Vert z(0)\Vert \le \gamma \varepsilon ^n , \end{aligned}$$

Then there is a constant (depending only on n) such that, at time \(T=\mathrm{const}\,\varepsilon ^{-1}\),

$$\begin{aligned} \Vert z(T)\Vert \le \gamma \varepsilon ^n , \end{aligned}$$

while both \(\delta (T)\) and s(T) have modulus less than 1. For all intermediate t, one has \(\Vert z(t)\Vert \le 2 \gamma \varepsilon ^n \), so the trajectory never moves too far from the cylinder of breathers. Furthermore, one can find \(\varphi _1\), \(\vartheta _1\), \(z_1\), with \(z_1\) in the subspace perpendicular to the tangent space to the cylinder at \(\varphi _1\), \(\vartheta _1 \), with

$$\begin{aligned} e^{{\varvec{\mathbf{i}}}(T\varphi _1+\vartheta _1)}p(\varphi _1) + z_1 = e^{{\varvec{\mathbf{i}}}(T\varphi (T)+\vartheta (T))}p(\varphi (T)) +z(T), \end{aligned}$$

and


$$\begin{aligned} \Vert z_1 \Vert \le \gamma \varepsilon ^n , \end{aligned}$$

and


$$\begin{aligned} \varphi _1-\varphi _0=-2\gamma \varepsilon ^{2n-1}T+ h.o.t.\ . \end{aligned}$$

Remark 1.2

The important consequence of Theorem 1.1 is the observation that the bounds propagate: we can restart the evolution, using initial conditions \((\varphi _1,\vartheta _1, z_1)\) instead of \((\varphi _0,\vartheta _0,z_0)\), and therefore one can move to \(\varphi _2\), \(\vartheta _2\), \(z_2\), and so on, with controlled bounds on \(\varphi _k\), which apply at least as long as \(\varphi _0-\varphi _k\le \gamma \varepsilon ^n\). Also note that the deviation from the cylinder is as shown in Fig. 1: the orbit can leave the region \(\Vert z\Vert \le \gamma \varepsilon ^n\) during the times between the stopping times kT, \(k=1,2,\dots \).

The remainder of the paper is devoted to the proof of Theorem 1.1. After some introductory results, the first important bound is on the linear semigroup with the very weak dissipation, in Sect. 6. The generator is called \({\mathcal {L}}_{\varphi \gamma }\), see Eq. (6.1) and its associated bound (in Corollary 6.2). In Sect. 7, we study in detail the projection onto the complement of the tangent space to the cylinder at \((\varphi ,\vartheta ) \). This allows us, in Sect. 8, to estimate the contraction (after time T) of the z-component, orthogonal to the tangent space. We do this in two steps: first, we evolve z while staying in the basis defined at \((\varphi _0,\vartheta _0) \). Then, in Sect. 9, we re-orthogonalize so that we obtain Eq. (1.5). Finally, Sect. 10 gives some more details about restarting the iterations from \(\varphi _1\), \(\vartheta _1\), \(z_1\) to \(\varphi _2,\dots \).

The precise statement will be formulated and proved as Theorem 8.2.

Fig. 1

Illustration of the results: Since phase space is high-dimensional, we draw the red curve in the same coordinate system as the cylinder, but it really stays in a subspace of \({\mathbb {C}}^n\) which is orthogonal to the 2-dimensional space of the cylinder. When there is no dissipation (\(\gamma =0\)), the system has a cylinder of fixed points in rotating frames (shown in green). This cylinder is parameterized by the values of \(\varepsilon \), i.e., the energy of the fast coordinate \(u_1\). When \(\gamma >0\), the fixed points disappear; instead, the system hovers near the cylinder, spiraling around it with a phase speed of \(2\gamma \varepsilon ^{2n-1}\). We show that the orbit of all such solutions stays within a distance \({\mathcal {O}}(\varepsilon ^n)\), as long as \(\varepsilon \) remains small (it actually increases with time).

Remark 1.3

From the results of [1] and Eq. (1.7) one can also conclude more details about the windings of Fig. 1. The \(m^{\mathrm{th}}\) turn finishes after a time \(t_m \approx \sqrt{\frac{2\pi m}{\gamma \varepsilon ^{2n-1}}}\), and the “horizontal” spacing (in \(\varepsilon \)) between the windings is \(2\sqrt{2\pi \gamma \varepsilon ^{2n-1}}(\sqrt{m+1}-\sqrt{m})\), up to terms of higher order.
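As a sanity check on these formulas (not part of the paper's argument), one can verify numerically that the winding times and spacings quoted above follow from the modulation law \({{\dot{\varphi }}}\sim -2\gamma \varepsilon ^{2n-1}\). The values of \(\gamma \), \(\varepsilon \), n below are arbitrary illustrative choices:

```python
import numpy as np

# With phi(0) = 0 and phi_dot = -2*r, where r = gamma*eps**(2n-1), the
# accumulated angle satisfies |t*phi(t)| = r*t**2; the m-th full turn
# ends when this equals 2*pi*m.
gamma, eps, n = 0.1, 0.3, 3
r = gamma * eps**(2 * n - 1)

def t_m(m):
    # time at which the m-th winding is completed (Remark 1.3)
    return np.sqrt(2 * np.pi * m / r)

def phi_gap(m):
    # |phi(t_{m+1}) - phi(t_m)|, computed directly from phi_dot = -2*r
    return 2 * r * (t_m(m + 1) - t_m(m))

def phi_gap_formula(m):
    # the closed-form spacing quoted in Remark 1.3
    return 2 * np.sqrt(2 * np.pi * r) * (np.sqrt(m + 1) - np.sqrt(m))
```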

We will study Eq. (1.3) for the remainder of this paper. We will also sometimes rewrite this equation in the equivalent real form by defining \(w_j = p_j + {\varvec{\mathbf{i}}}q_j\), which yields the system of equations, for \(j=1,\dots ,n\):

$$\begin{aligned} \dot{q}_j&= -\varepsilon (\varDelta p)_j -p_j+(q_j^2+p_j^2)p_j-\delta _ {j,n}\gamma \varepsilon q_n,\nonumber \\ \dot{p}_j&= \varepsilon (\varDelta q)_j +q_j-(q_j^2+p_j^2)q_j-\delta _ {j,n}\gamma \varepsilon p_n. \end{aligned}$$

Note that if \(\gamma =0\), this is a Hamiltonian system with:

$$\begin{aligned} H&=\frac{\varepsilon }{2} \sum _{j<n} \left( (p_j-p_{j+1})^2+(q_j-q_{j+1})^2\right) \nonumber \\&~~~- \sum _{j=1}^n \left( \frac{1}{2}(p_j^2+q_j^2)-\frac{1}{4} (p_j^2+q_j^2)^2\right) . \end{aligned}$$
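These conservation and decay properties can be probed with a crude numerical experiment. The following sketch is not part of the paper; the parameter values (\(n=3\), \(\varepsilon =0.3\), \(\gamma =0.5\), the step size, and the integration time) are arbitrary illustrative choices. It integrates the real system with a standard RK4 stepper and checks that H and the \(\ell ^2\) norm are conserved when \(\gamma =0\), while the \(\ell ^2\) norm decays when \(\gamma >0\); summing the equations gives \(\frac{\mathrm{d}}{\mathrm{d}t}\sum _j(p_j^2+q_j^2)=-2\gamma \varepsilon (p_n^2+q_n^2)\le 0\).

```python
import numpy as np

def lap(n):
    # free-end discrete Laplacian: (Δu)_1 = -u_1 + u_2, etc.
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D[0, 0] = D[-1, -1] = -1.0
    return D

def rhs(q, p, eps, gamma, D):
    # the real system above, with the dissipative terms acting on site n
    r2 = q**2 + p**2
    dq = -eps * (D @ p) - p + r2 * p
    dp = eps * (D @ q) + q - r2 * q
    dq[-1] -= gamma * eps * q[-1]
    dp[-1] -= gamma * eps * p[-1]
    return dq, dp

def rk4(q, p, eps, gamma, dt, steps):
    D = lap(len(q))
    for _ in range(steps):
        aq, ap = rhs(q, p, eps, gamma, D)
        bq, bp = rhs(q + 0.5 * dt * aq, p + 0.5 * dt * ap, eps, gamma, D)
        cq, cp = rhs(q + 0.5 * dt * bq, p + 0.5 * dt * bp, eps, gamma, D)
        dq, dp = rhs(q + dt * cq, p + dt * cp, eps, gamma, D)
        q = q + dt * (aq + 2 * bq + 2 * cq + dq) / 6
        p = p + dt * (ap + 2 * bp + 2 * cp + dp) / 6
    return q, p

def energy(q, p, eps):
    # the Hamiltonian H displayed above
    r2 = q**2 + p**2
    return (0.5 * eps * np.sum(np.diff(p)**2 + np.diff(q)**2)
            - np.sum(0.5 * r2 - 0.25 * r2**2))

n, eps = 3, 0.3
q0, p0 = np.zeros(n), np.zeros(n)
p0[0] = 1.0                                                # energy on site 1
qc, pc = rk4(q0.copy(), p0.copy(), eps, 0.0, 1e-2, 5000)   # gamma = 0
qd, pd = rk4(q0.copy(), p0.copy(), eps, 0.5, 1e-2, 5000)   # gamma = 0.5
```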

Finding a periodic solution of the form Eq. (1.2) (i.e., a fixed point in the rotating coordinate system) reduces to finding roots of the system of equations

$$\begin{aligned} 0&= -\varepsilon (\varDelta p)_j -p_j+(q_j^2+p_j^2)p_j,\nonumber \\ 0&= \varepsilon (\varDelta q)_j +q_j-(q_j^2+p_j^2)q_j. \end{aligned}$$

Since we are also interested in solutions which rotate (slowly), replacing w by \(e^{{\varvec{\mathbf{i}}}\varphi _0 t}w\), we study instead of Eq. (1.8) (resp. Eq. (1.10)) the related equation

$$\begin{aligned} \dot{q}_j&= -\varepsilon (\varDelta p)_j -(1+\varphi _0)p_j+(q_j^2+p_j^2)p_j-\delta _ {j,n}\gamma \varepsilon q_n,\nonumber \\ \dot{p}_j&= \varepsilon (\varDelta q)_j +(1+\varphi _0)q_j-(q_j^2+p_j^2)q_j-\delta _ {j,n}\gamma \varepsilon p_n , \end{aligned}$$

where the \(\varphi _0\) dependence comes from differentiating the exponential factor \(e^{{\varvec{\mathbf{i}}}\varphi _0 t}\).

Remark 1.4

We use \(\varphi _0\) to designate a constant rotation speed, while later, \(\varphi \) will stand for a time-dependent rotation speed.

Remark 1.5

The reader who is familiar with the paper [1] can jump to Sect. 3, since much of the material in this section and the next is basically repeated from that reference.

Theorem 1.6

Suppose that the damping coefficient \(\gamma \) equals 0 in Eq. (1.11). There exist constants \(\varepsilon _* > 0\), \(\varphi _* >0\), such that for \(|\varepsilon | < \varepsilon _*\) and \( |\varphi _0 | < \varphi _*\), Eq. (1.11) has a periodic solution of the form \(w_j(t;\varphi _0) = e^{{\varvec{\mathbf{i}}}t\varphi _0 } p_j^{}(\varphi _0)\), with \(p_1^{}(\varphi _0) = 1+{{\mathcal {O}}}(\varepsilon , \varphi _0)\), and \(p_j^{}(\varphi _0) = {{\mathcal {O}}}(\varepsilon ^{j-1})\) for \(j = 2, \dots , n\).


Proof

If we insert \(w_j(t;\varphi _0) = e^{{\varvec{\mathbf{i}}}t\varphi _0 } p_j^{}(\varphi _0)\) into Eq. (1.3), and take real and imaginary parts, we find that the amplitudes \(p^{}\in {\mathbb {R}}^n \) of these periodic orbits are (for \(\gamma =0\)) solutions of

$$\begin{aligned} F_j(p;\varphi _0,\varepsilon ) = -\varepsilon (\varDelta p)_j - (1+\varphi _0) p_j + p_j^3 = 0 ,\ \ j = 1, \dots , n . \end{aligned}$$

Setting \(p^0_j = \delta _{j,1}\), we have

$$\begin{aligned} F_j(p^0;0,0) = 0 , \end{aligned}$$

for all j. Furthermore, the Jacobian matrix at this point is the diagonal matrix

$$\begin{aligned} \left( D_{p} F (p^0;0,0) \right) _{i,j} = (3 \delta _{i,1} -1) \delta _{i,j}, \end{aligned}$$

which is obviously invertible.

Thus, by the implicit function theorem, for \((\varphi _0,\varepsilon )\) in some neighborhood of the origin, Eq. (1.12) has a unique solution \(p = p^{}(\varphi _0,\varepsilon )\) near \(p^0\), and since F depends analytically on \((\varphi _0,\varepsilon )\), so does \(p^{}(\varphi _0,\varepsilon )\).

It is easy to compute the first few terms of this fixed point:

$$\begin{aligned} {p_1^{}}&=1+{\frac{1}{2}}( \varphi _0-\varepsilon )+{\mathcal {O}}_2,\nonumber \\ {p_2^{}}&=-\varepsilon +{\mathcal {O}}_2,\nonumber \\ {p_3^{}}&=\varepsilon ^{2}+{\mathcal {O}}_3,\nonumber \\&\dots \nonumber \\ {p_j^{}}&=(-1)^{j-1}\varepsilon ^{j-1}+{\mathcal {O}}_{j}, \end{aligned}$$

where \({\mathcal {O}}_k\) denotes terms of order k in \(\varphi _0,\varepsilon \) together. \(\quad \square \)
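The fixed point and its expansion can be checked numerically. The sketch below is an illustration, not part of the proof: it runs a Newton iteration for Eq. (1.12) starting from \(p^0_j=\delta _{j,1}\) and compares the result with the asymptotic formulas; the parameter values are arbitrary small numbers:

```python
import numpy as np

def lap(n):
    # free-end discrete Laplacian from the introduction
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D[0, 0] = D[-1, -1] = -1.0
    return D

def breather(n, eps, phi0, iters=30):
    # Newton iteration for F_j(p) = -eps*(D p)_j - (1+phi0) p_j + p_j**3 = 0
    D = lap(n)
    p = np.zeros(n)
    p[0] = 1.0                       # start from p^0 = delta_{j,1}
    for _ in range(iters):
        F = -eps * (D @ p) - (1 + phi0) * p + p**3
        J = -eps * D - (1 + phi0) * np.eye(n) + 3 * np.diag(p**2)
        p -= np.linalg.solve(J, F)
    return p

eps, phi0 = 0.05, 0.01
p = breather(4, eps, phi0)
```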

Remark 1.7

Since Eq. (1.3) is invariant under complex rotations \(w_ j \rightarrow e^{{\varvec{\mathbf{i}}}\vartheta _0} w_j\), we actually have a circle of fixed points (when \(\gamma =0\)). However, these are the only fixed points with \(|w_1| \approx 1\). We will continue with \(\vartheta _0=0\), and reintroduce \(\vartheta _0\ne 0\) only in Sect. 3.

2 The Eigenspace of the Eigenvalue 0

Consider the linearization of the system Eq. (1.11) around the periodic orbit (fixed point) we found in Theorem 1.6. Denote by \(Z_*\) this solution,

$$\begin{aligned} Z_* =(p_1^{},p_2^{},\dots ,p_n^{},q_1^{},q_2^{},\dots , q_n^{})^\top , \end{aligned}$$

where \(q_j=0\) and \(p_j=p_j(\varphi _0,\varepsilon )\) as found in Theorem 1.6. In order to avoid overburdening the notation, we will write out the formulas which follow for the case \(n=3\)—the expressions for general (finite) values of n are very similar. We also omit the \(\varepsilon \) dependence from \(p(\varphi _0,\varepsilon )\). The linearization of the evolution Eq. (1.11) at \(Z_*\) leads (for \(\gamma =0\)) to an equation of the form

$$\begin{aligned} \frac{\mathrm{d}x}{\mathrm{d}t} = M_{\varphi _0,\varepsilon } x = \begin{pmatrix} 0&{}A_{\varphi _0,\varepsilon }\\ B_{\varphi _0,\varepsilon }&{}0 \end{pmatrix} x , \end{aligned}$$

and with \(1_{\varphi _0}\equiv (1+\varphi _0)\):

$$\begin{aligned} A_{\varphi _0,\varepsilon }= \begin{pmatrix} 1_{\varphi _0}-\varepsilon -(p^{}_1)^2 &{}\varepsilon &{}0\\ \varepsilon &{} 1_{\varphi _0}-2\varepsilon -(p^{}_2)^2 &{}\varepsilon \\ 0&{}\varepsilon &{}1_{\varphi _0}-\varepsilon -(p^{}_3)^2 \end{pmatrix}, \end{aligned}$$

where the \(p^{}_j=p(\varphi _0)_j\) are the stationary solutions of Eq. (1.11). Similarly,

$$\begin{aligned} B_{\varphi _0,\varepsilon } = \begin{pmatrix} -1_{\varphi _0}+\varepsilon +3(p_1^{})^2 &{}-\varepsilon &{}0\\ -\varepsilon &{} -1_{\varphi _0}+2\varepsilon +3(p_2^{})^2 &{}-\varepsilon \\ 0&{}-\varepsilon &{}-1_{\varphi _0}+\varepsilon +3(p_3^{})^2 \end{pmatrix}. \end{aligned}$$

Similar expressions hold for other values of n.

Among the key facts that we will establish below is that \(M_{\varphi _0,\varepsilon }\) has a two-dimensional zero eigenspace, with an explicitly computable basis, for all values of \(\varepsilon \). Then, in subsequent sections we will show that the remainder of the spectrum lies on the imaginary axis and that all non-zero eigenvalues are simple and separated from the remainder of the spectrum of \(M_{\varphi _0,\varepsilon }\) by a distance at least \(C \varepsilon \). All of these facts turn out to be essential for our subsequent calculations and establishing them is complicated by the extreme degeneracy of the eigenvalues of \(M_{\varphi _0,0}\) about which we wish to perturb.

The following lemma will allow us to simplify the notation:

Lemma 2.1

One has the identity

$$\begin{aligned} \partial _\varphi p(\varphi _0)=B_{\varphi _0,\varepsilon }^{-1} p(\varphi _0). \end{aligned}$$


Proof

This follows by differentiating Eq. (1.12) with respect to \(\varphi _0\) and comparing to the definition of \(B_{\varphi _0,\varepsilon }\) in Eq. (2.2). \(\quad \square \)
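This identity can also be checked by finite differences. The sketch below is illustrative only; `breather` is the Newton solver sketched after Theorem 1.6:

```python
import numpy as np

def lap(n):
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D[0, 0] = D[-1, -1] = -1.0
    return D

def breather(n, eps, phi0, iters=30):
    D = lap(n)
    p = np.zeros(n); p[0] = 1.0
    for _ in range(iters):
        F = -eps * (D @ p) - (1 + phi0) * p + p**3
        J = -eps * D - (1 + phi0) * np.eye(n) + 3 * np.diag(p**2)
        p -= np.linalg.solve(J, F)
    return p

n, eps, phi0, h = 3, 0.05, 0.02, 1e-5
p = breather(n, eps, phi0)
# B is the Jacobian D_p F evaluated at the fixed point
B = -eps * lap(n) - (1 + phi0) * np.eye(n) + 3 * np.diag(p**2)
# central difference for d/dphi of p(phi) at phi = phi0
dp_fd = (breather(n, eps, phi0 + h) - breather(n, eps, phi0 - h)) / (2 * h)
```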

Lemma 2.2

Define \(B_{\varphi _0,\varepsilon }=L-\varphi _0 \mathbb {1}+3(p(\varphi _0))^2\), with \(L=-\varepsilon \varDelta -\mathbb {1}\). That is, we view \(B_{\varphi _0,\varepsilon }\) as a real \(n\times n\) matrix and \((p(\varphi _0))^2\) as the diagonal matrix with components \(((p(\varphi _0)_1)^2,\dots ,(p(\varphi _0)_n)^2)\). Then the zero eigenspace of the matrix \(M_{\varphi _0,\varepsilon }\) is spanned by the 2n-component vectors

$$\begin{aligned} v^{(1)}_{\varphi _0}&=\begin{pmatrix}0\\ p(\varphi _0)\end{pmatrix},\nonumber \\ v^{(2)}_{\varphi _0}&=\begin{pmatrix}B_{\varphi _0,\varepsilon }^{-1}\,p(\varphi _0)\\ 0\end{pmatrix}, \end{aligned}$$


Proof

To see that \(M_{\varphi _0,\varepsilon }v^{(1)}_{\varphi _0}=0\), note that Eq. (1.1) is invariant under \(u\rightarrow e^{{\varvec{\mathbf{i}}}\vartheta } u\). Thus, viewed in \({\mathbb {C}}^n\), the quantity \(e^{{\varvec{\mathbf{i}}}\vartheta }(p(\varphi _0,\varepsilon )+{\varvec{\mathbf{i}}}0)\) is a solution for all \(\vartheta \). Taking the derivative w.r.t. \(\vartheta \), at \(\vartheta =0\) and considering the real and imaginary parts of the resulting equation shows that \(v^{(1)}_{\varphi _0}\) is a solution of \(M_{\varphi _0,\varepsilon }v^{(1)}_{\varphi _0}=0\). From the form of \(M_{\varphi _0,\varepsilon }\) and the invertibility of \(B_{\varphi _0,\varepsilon } \) we see immediately that \(v^{(2)}_{\varphi _0}\) is mapped onto the direction of \(v^{(1)}_{\varphi _0}\). \(\quad \square \)
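Both facts are easy to confirm numerically. In this illustrative sketch (not part of the proof), A and B are assembled from the \(n=3\) formulas above, with p computed by the Newton iteration used earlier:

```python
import numpy as np

def lap(n):
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D[0, 0] = D[-1, -1] = -1.0
    return D

def breather(n, eps, phi0, iters=30):
    D = lap(n)
    p = np.zeros(n); p[0] = 1.0
    for _ in range(iters):
        F = -eps * (D @ p) - (1 + phi0) * p + p**3
        J = -eps * D - (1 + phi0) * np.eye(n) + 3 * np.diag(p**2)
        p -= np.linalg.solve(J, F)
    return p

n, eps, phi0 = 3, 0.05, 0.02
D, I = lap(n), np.eye(n)
p = breather(n, eps, phi0)
A = eps * D + (1 + phi0) * I - np.diag(p**2)        # the block of Eq. (2.1)
B = -eps * D - (1 + phi0) * I + 3 * np.diag(p**2)   # the block of Eq. (2.2)
M = np.block([[np.zeros((n, n)), A], [B, np.zeros((n, n))]])
v1 = np.concatenate([np.zeros(n), p])                       # v^(1)
v2 = np.concatenate([np.linalg.solve(B, p), np.zeros(n)])   # v^(2)
```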

We will also need the adjoint eigenvectors of M:

Lemma 2.3

The adjoint eigenvectors are given by

$$\begin{aligned} n^{(1)}_{\varphi _0}&= (2+{\mathcal {O}}(\varphi _0,\varepsilon )) \cdot (0,B^{-1}_{\varphi _0} p(\varphi _0))^\top , \nonumber \\ n^{(2)}_{\varphi _0}&= (2+{\mathcal {O}}(\varphi _0,\varepsilon ))\cdot (p(\varphi _0),0)^\top . \end{aligned}$$

They are normalized to satisfy

$$\begin{aligned} \langle n^{(1)}_{\varphi _0}|v^{(1)}_{\varphi _0}\rangle&= \langle n^{(2)}_{\varphi _0}|v^{(2)}_{\varphi _0}\rangle =1,\nonumber \\ \langle n^{(2)}_{\varphi _0}|v^{(1)}_{\varphi _0}\rangle&= \langle n^{(1)}_{\varphi _0}|v^{(2)}_{\varphi _0}\rangle =0. \end{aligned}$$

Remark 2.4

The approximate versions are

$$\begin{aligned} n^{(1)}_{\varphi _0}&\sim (0,\dots ,0,1,0,\dots ,0)^\top ,\\ n^{(2)}_{\varphi _0}&\sim (2,0,\dots ,0)^\top . \end{aligned}$$


Proof

Because of the block form of M and the fact that A and B are symmetric, we have

$$\begin{aligned} M^*_{\varphi _0,\varepsilon } = \left( \begin{array}{cc} 0 &{} B_{\varphi _0,\varepsilon } \\ A_{\varphi _0,\varepsilon } &{} 0 \end{array} \right) . \end{aligned}$$

But then, since we know from the computation of the eigenvectors of M that \(A_{\varphi _0,\varepsilon }\, p(\varphi _0)= 0\), we can check immediately that

$$\begin{aligned} {\tilde{n}}^{(2)}_{\varphi _0} = (p(\varphi _0),0)^\top \end{aligned}$$

satisfies \(M^* {\tilde{n}}^{(2)}_{\varphi _0} = 0\). Likewise,

$$\begin{aligned} {\tilde{n}}^{(1)}_{\varphi _0} =(0,B^{-1}_{\varphi _0,\varepsilon }\, p(\varphi _0))^\top \end{aligned}$$

satisfies


$$\begin{aligned} M^*_{\varphi _0,\varepsilon } {\tilde{n}}^{(1)}_{\varphi _0} = (p(\varphi _0),0)^\top = {{\tilde{n}}}^{(2)}_{\varphi _0} . \end{aligned}$$

Thus, \({\tilde{n}}^{(1)}_{\varphi _0}\) and \({\tilde{n}}^{(2)}_{\varphi _0}\) span the zero eigenspace of the adjoint matrix. The normalization is checked from the definitions. \(\quad \square \)
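As an illustration (not part of the proof), the following sketch builds the adjoint pair with the normalization constant computed explicitly, and checks the relations above together with the fact that the constant is \(2+{\mathcal {O}}(\varphi _0,\varepsilon )\):

```python
import numpy as np

def lap(n):
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D[0, 0] = D[-1, -1] = -1.0
    return D

def breather(n, eps, phi0, iters=30):
    D = lap(n)
    p = np.zeros(n); p[0] = 1.0
    for _ in range(iters):
        F = -eps * (D @ p) - (1 + phi0) * p + p**3
        J = -eps * D - (1 + phi0) * np.eye(n) + 3 * np.diag(p**2)
        p -= np.linalg.solve(J, F)
    return p

n, eps, phi0 = 3, 0.05, 0.02
D, I = lap(n), np.eye(n)
p = breather(n, eps, phi0)
A = eps * D + (1 + phi0) * I - np.diag(p**2)
B = -eps * D - (1 + phi0) * I + 3 * np.diag(p**2)
M = np.block([[np.zeros((n, n)), A], [B, np.zeros((n, n))]])
Binv_p = np.linalg.solve(B, p)
v1 = np.concatenate([np.zeros(n), p])
v2 = np.concatenate([Binv_p, np.zeros(n)])
c = 1.0 / (p @ Binv_p)          # normalization constant; equals 2 + O(phi0, eps)
n1 = c * np.concatenate([np.zeros(n), Binv_p])
n2 = c * np.concatenate([p, np.zeros(n)])
```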

3 Evolution Equations for \(\gamma >0\)

Consider Eq. (1.3), with dissipation. Here, \(C_\varGamma \) is not a scalar but a diagonal matrix, whose diagonal will be taken as \((0,0,\dots ,\gamma \varepsilon )\in {\mathbb {C}}^n\). Thus, our evolution equation is

$$\begin{aligned} -{\varvec{\mathbf{i}}}\dot{W} =LW +|W|^2W + {\varvec{\mathbf{i}}}C_\varGamma W, \end{aligned}$$

with \(L=-\varepsilon \varDelta -\mathbb {1}\). We are interested in the time dependence of two real “slow” variables which we call \(\varphi (t)\) and \(\vartheta (t)\), and so we set

$$\begin{aligned} W(t)=e^{{\varvec{\mathbf{i}}}(t\varphi (t)+\vartheta (t))}\bigl (p\bigl (\varphi (t)\bigr )+z(t)\bigr ),\quad W,z\in {\mathbb {C}}^n. \end{aligned}$$

Remark 3.1

Recall that the notation \(\varphi _0\) stands for a constant phase speed, while \(\varphi =\varphi (t)\) will always mean a time-dependent quantity.

Our decomposition is inspired by modulation theory approaches to study the stability of solitary waves and patterns with respect to perturbations [7,8,9,10]. In particular, we will choose the initial decomposition of the solution so that the initial value z(0) lies in the subspace complementary to the zero-eigenspace of the linearization. We then prove (somewhat surprisingly) that all modes orthogonal to the zero subspace are uniformly damped, which allows us to show that the values of z(t) remain bounded for very long times. Omitting the arguments (t), we find

$$\begin{aligned} \dot{W}&={\varvec{\mathbf{i}}}(\varphi +t{{\dot{\varphi }}}+{{\dot{\vartheta }}} )e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}(p(\varphi )+z)\\&\quad + e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}\bigl ( \partial _\varphi p(\varphi )\,{{\dot{\varphi }}}+\dot{z} \bigr ). \end{aligned}$$

Then Eq. (3.1) leads to (using again that powers and products are taken componentwise),

$$\begin{aligned}&(\varphi +t{{\dot{\varphi }}}+{{\dot{\vartheta }}} )e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}(p(\varphi )+z) -{\varvec{\mathbf{i}}}e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}\bigl ( \partial _\varphi p(\varphi ){{\dot{\varphi }}}+\dot{z} \bigr )\\&\quad = e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}L(p(\varphi ))+e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}Lz \\&\qquad +\bigl ( e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}(p(\varphi )+z)\bigr )^2(e^{-{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}(p(\varphi )+{\bar{z}})) +{\varvec{\mathbf{i}}}C_\varGamma e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}(p(\varphi )+z). \end{aligned}$$

The factors of \(e^{{\varvec{\mathbf{i}}}(t\varphi +\vartheta )}\) cancel and we get

$$\begin{aligned}&(\varphi +t{{\dot{\varphi }}}+{{\dot{\vartheta }}} ) (p(\varphi )+z) -{\varvec{\mathbf{i}}}\bigl ( \partial _\varphi p(\varphi )\,{{\dot{\varphi }}}+\dot{z} \bigr ) \\&\quad = L(p(\varphi ))+ Lz + (p(\varphi )+z)^2 (p(\varphi )+{\bar{z}}) +{\varvec{\mathbf{i}}}C_\varGamma (p(\varphi )+z). \end{aligned}$$

We now expand this equation to first order in z and this leads to

$$\begin{aligned}&(\varphi +t{{\dot{\varphi }}}+{{\dot{\vartheta }}} ) (p(\varphi )+z) -{\varvec{\mathbf{i}}}\bigl ( \partial _\varphi p(\varphi )\,{{\dot{\varphi }}}+\dot{z} \bigr )\nonumber \\&\quad = L(p(\varphi ))+ Lz \nonumber \\&\qquad +(p(\varphi ))^3+2(p(\varphi ))^2 z+(p(\varphi ))^2{\bar{z}}\nonumber \\&\qquad +{\varvec{\mathbf{i}}}C_\varGamma (p(\varphi )+z)+{\mathcal {O}}(|z|^2) . \end{aligned}$$

Set now \(z=\xi +{\varvec{\mathbf{i}}}\eta \).

In what follows, we will switch back and forth between the real and complex representations of the solutions, and will refer to \(z= \xi + {\varvec{\mathbf{i}}}\eta \in {\mathbb {C}}^n\) and \(\zeta = (\xi ,\eta ) \in {\mathbb {R}}^{2n}\) interchangeably, allowing the context to distinguish between the two ways of writing the solution. Here, \(\xi =(\xi _1,\dots ,\xi _n)^\top \) and \(\eta =(\eta _1,\dots ,\eta _n)^\top \) are n-dimensional vectors. At various points in the argument, we will use restrictions of our equations to these two spaces, which we call \({{\mathbb {P}}}^\xi _{\varphi _0}\) and \({{\mathbb {P}}}^\eta _{\varphi _0}\).

Taking the real and imaginary components of Eq. (3.3), we obtain the following equations in \({\mathbb {R}}^n\):

$$\begin{aligned}&(t{{\dot{\varphi }}}+{{\dot{\vartheta }}} )(p(\varphi )+\xi )+{{\dot{\eta }}}= (L-\varphi )\xi +3(p(\varphi ))^2\xi -C_\varGamma \eta +{\mathcal {O}}_2,\nonumber \\&(t{{\dot{\varphi }}}+{{\dot{\vartheta }}} )\eta -\partial _\varphi p(\varphi )\,{{\dot{\varphi }}}-{{\dot{\xi }}}= (L-\varphi )\eta +(p(\varphi ))^2\eta +C_\varGamma (p(\varphi )+\xi )+{\mathcal {O}}_2, \end{aligned}$$

where \({\mathcal {O}}_2\) refers to terms that are at least quadratic in \((\xi ,\eta )\).

We next study what happens in the complement of the two-dimensional zero eigenspace identified at the end of the previous section, when one adds dissipation on the coordinate n. In the standard basis, when \(n=3\), the dissipation is given, as before, by

$$\begin{aligned} C_\varGamma = \begin{pmatrix} 0&{}\quad 0&{}\quad 0\\ 0&{}\quad 0&{}\quad 0\\ 0&{}\quad 0&{}\quad \gamma \varepsilon \end{pmatrix}. \end{aligned}$$

In the full space, we have the \(2n\times 2n\) matrix

$$\begin{aligned} \varGamma = \begin{pmatrix} C_\varGamma &{}\quad 0\\ 0&{}\quad C_\varGamma \end{pmatrix}. \end{aligned}$$

We fix \(\varphi (0)=\varphi _0\) and consider the projection

$$\begin{aligned} {\mathbb {P}}={\mathbb {P}}_{\varphi _0} =\mathbb {1}-|v^{(1)}_{\varphi _0} \rangle \langle n^{(1)}_{\varphi _0} | -|v^{(2)}_{\varphi _0} \rangle \langle n^{(2)}_{\varphi _0} |. \end{aligned}$$

This is the projection onto the complement of the zero eigenspace, which is spanned by \(v^{(1)}_{\varphi _0}\) and \(v^{(2)}_{\varphi _0}\).
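One can verify numerically that \({\mathbb {P}}_{\varphi _0}\) is indeed a rank-\((2n-2)\), in general non-orthogonal, projection annihilating \(v^{(1)}_{\varphi _0}\) and \(v^{(2)}_{\varphi _0}\). The sketch below is illustrative only, using the same helper routines as in the earlier sketches:

```python
import numpy as np

def lap(n):
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    D[0, 0] = D[-1, -1] = -1.0
    return D

def breather(n, eps, phi0, iters=30):
    D = lap(n)
    p = np.zeros(n); p[0] = 1.0
    for _ in range(iters):
        F = -eps * (D @ p) - (1 + phi0) * p + p**3
        J = -eps * D - (1 + phi0) * np.eye(n) + 3 * np.diag(p**2)
        p -= np.linalg.solve(J, F)
    return p

n, eps, phi0 = 3, 0.05, 0.02
D, I = lap(n), np.eye(n)
p = breather(n, eps, phi0)
B = -eps * D - (1 + phi0) * I + 3 * np.diag(p**2)
Binv_p = np.linalg.solve(B, p)
v1 = np.concatenate([np.zeros(n), p])
v2 = np.concatenate([Binv_p, np.zeros(n)])
c = 1.0 / (p @ Binv_p)
n1 = c * np.concatenate([np.zeros(n), Binv_p])
n2 = c * np.concatenate([p, np.zeros(n)])
# P = 1 - |v1><n1| - |v2><n2|
P = np.eye(2 * n) - np.outer(v1, n1) - np.outer(v2, n2)
```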

We will require that \(\zeta =(\xi ,\eta )\) remains in the range of \({\mathbb {P}}_{\varphi _0}\). As time passes, the base point \((\varphi ,\vartheta )\) will change, and this will lead to secular growth in \(\zeta \), an issue which we discuss in detail below.

We rearrange Eq. (3.4) as

$$\begin{aligned} {{\dot{\xi }}}&= {{\mathbb {P}}}^\xi _{\varphi _0} \biggl (\bigl (-(L-\varphi ) -(p(\varphi ))^2\bigr ) \eta \nonumber \\&\quad -C_\varGamma \xi +(t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) \eta -\partial _\varphi \bigl (p(\varphi )\bigr ){{\dot{\varphi }}}-C_\varGamma p(\varphi )+{\mathcal {O}}_2\biggr ),\nonumber \\ {{\dot{\eta }}}&= {{\mathbb {P}}}^\eta _{\varphi _0}\left( \bigl ((L-\varphi ) +3(p(\varphi ))^2\bigr )\xi -C_\varGamma \eta -(t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) (p(\varphi )+\xi )+{\mathcal {O}}_2\right) . \end{aligned}$$

We will compute these projections in detail in Sect. 7.

4 Spectral Properties of the Linearization at \(\gamma =0\)

In this section, we consider the action of the matrix \(M_{\varphi _0,\varepsilon }\), when projected (with \({\mathbb {P}}_{\varphi _0}\)) onto the complement of the subspace associated with the 0 eigenspace. Recall from above that:

$$\begin{aligned} {\mathbb {P}}_{\varphi _0} = \mathbb {1}- | v^{(1)}_{\varphi _0}\rangle \,\langle n^{(1)}_{\varphi _0} | - | v^{(2)}_{\varphi _0}\rangle \,\langle n^{(2)}_{\varphi _0} |. \end{aligned}$$

We will see that this projection is very close to the projection onto the complement of the 1st and \((n+1)\)st component of the vectors in \({\mathbb {R}}^{2n}\).

We use perturbation theory, starting from the matrix \( M_{\varphi _0=0,\varepsilon =0}\), and we write the formulas for \(n=4\). For \(\varepsilon =\varphi _0=0\), we have the quantities

$$\begin{aligned} A_{0,0}=A_{\varphi _0=0,\varepsilon =0}= \left( \begin{array}{c|ccc} 0 &{}0 &{}0&{}0\\ \hline 0&{} 1 &{}0&{}0 \\ 0&{}0&{}1&{}0\\ 0&{}0&{}0&{}1 \end{array}\right) = \left( \begin{array}{c|c} 0&{}\\ \hline &{}\mathbb {1}\end{array}\right) , \end{aligned}$$


$$\begin{aligned} B_{0,0}=B_{\varphi _0=0,\varepsilon =0}= \left( \begin{array}{c|rrr} 2 &{}0 &{}0&{}0\\ \hline 0&{} -1 &{}0&{}0 \\ 0&{}0&{}-1&{}0\\ 0&{}0&{}0&{}-1 \end{array}\right) = \left( \begin{array}{c|c} 2&{}\\ \hline &{}-\mathbb {1}\\ \end{array}\right) , \end{aligned}$$

which follow by substitution. We set

$$\begin{aligned} M_{0,0}\equiv \left( \begin{array}{cc} 0&{}A_{0,0}\\ B_{0,0}&{}0 \end{array}\right) , \end{aligned}$$

and study first the spectrum of \(M_{0,0}\). The spectrum (and the eigenspaces) of \(M_{\varphi _0,\varepsilon }\) will then be shown to be close to that of \(M_{0,0}\).

The eigenvalues of \(M_{0,0}\) are a double 0 and \(n-1\) pairs of eigenvalues \(\pm {\varvec{\mathbf{i}}}\). When \(n=4\), the corresponding eigenvectors are:

$$\begin{aligned} e^{(1)}&= (0,0,0,0,1,0,0,0)^\top , \\ e^{(3),(4)}&= (0,\pm {\varvec{\mathbf{i}}},0,0,0,1,0,0)^\top ,\\ e^{(5),(6)}&= (0,0,\pm {\varvec{\mathbf{i}}},0,0,0,1,0)^\top ,\\ e^{(7),(8)}&= (0,0,0,\pm {\varvec{\mathbf{i}}},0,0,0,1)^\top . \end{aligned}$$

Note that \(e^{(2)}\) is missing, but the vector \(e^{(2)}=(1,0,0,0,0,0,0,0)^\top \) is mapped onto \(2e^{(1)}\) and so \(e^{(1)}\) and \(e^{(2)}\) span the 0 eigenspace.
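These spectral facts are immediate to confirm numerically. The following sketch (illustrative, not part of the argument, for \(n=4\)) also checks that \(e^{(2)}\) is mapped onto \(2e^{(1)}\):

```python
import numpy as np

n = 4
A00 = np.diag([0.0, 1.0, 1.0, 1.0])
B00 = np.diag([2.0, -1.0, -1.0, -1.0])
M00 = np.block([[np.zeros((n, n)), A00], [B00, np.zeros((n, n))]])
ev = np.linalg.eigvals(M00)
e1 = np.zeros(2 * n); e1[4] = 1.0    # the vector e^(1)
e2 = np.zeros(2 * n); e2[0] = 1.0    # the vector e^(2)
```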

Since we have a symplectic problem (when \(\gamma =0\)), we need to do the calculations in an appropriate basis. This is inspired by the paper [11]. The coordinate transformation is defined by the following matrix: Let \(s=1/\sqrt{2}\), and define (for the case \(n=4\)),

$$\begin{aligned} X=\left( \begin{array}{cc|ccc|ccc} &{}1&{}&{}&{}&{}&{}&{}\\ \hline &{}&{}s&{}&{}&{}s&{}&{}\\ &{}&{}&{}s&{}&{}&{}s&{}\\ &{}&{}&{}&{}s&{}&{}&{}s\\ \hline 1&{}&{}&{}&{}&{}&{}&{}\\ \hline &{}&{}{\varvec{\mathbf{i}}}s&{}&{}&{}-{\varvec{\mathbf{i}}}s&{}&{}\\ &{}&{}&{}{\varvec{\mathbf{i}}}s&{}&{}&{}-{\varvec{\mathbf{i}}}s&{}\\ &{}&{}&{}&{}{\varvec{\mathbf{i}}}s&{}&{}&{}-{\varvec{\mathbf{i}}}s \end{array}\right) . \end{aligned}$$

The columns are the normalized eigenvectors of our problem, for \(\varepsilon =\varphi _0=0\). Empty positions are “0”s, and the second vector is mapped onto the first (up to a factor of 2). With our choice of s we have \(X^* X=\mathbb {1}\), where \(X^*\) is the Hermitian conjugate of X.

Definition 4.1

If Y is a matrix, we write its transform as \({{\mathcal {X}}}(Y)=X^* Y X\).

In the new basis, we get:

$$\begin{aligned} {{\mathcal {X}}}(M_{0,0}) =-{\varvec{\mathbf{i}}}\left( \begin{array}{cc|ccc|ccc} 0&{}-2{\varvec{\mathbf{i}}}&{}&{}&{}&{}&{}&{}\\ 0&{}0&{}&{}&{}&{}&{}&{}\\ \hline &{}&{}1&{}&{}&{}&{}&{}\\ &{}&{}&{}1&{}&{}&{}&{}\\ &{}&{}&{}&{}1&{}&{}&{}\\ \hline &{}&{}&{}&{}&{}-1&{}&{}\\ &{}&{}&{}&{}&{}&{}-1&{}\\ &{}&{}&{}&{}&{}&{}&{}-1\\ \end{array}\right) = -{\varvec{\mathbf{i}}}\left( \begin{array}{cc|c|c} 0&{}-2{\varvec{\mathbf{i}}}&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}\mathbb {1}&{}\\ &{}&{}&{}-\mathbb {1}\end{array}\right) . \end{aligned}$$

Therefore, we have diagonalized \(M_{0,0}\) up to its nilpotent block, and we also see that the other parts of \({{\mathcal {X}}}( M_{0,0})\) are imaginary, which reflects the symplectic nature of the model.

We now turn to the case of \(M_{\varphi _0,\varepsilon }\) which we view as a perturbation of \(M_{0,0}\) in the following way:

$$\begin{aligned} M_{\varphi _0,\varepsilon }=M_{\varphi _0,0} + E_1 +E_2, \end{aligned}$$


$$\begin{aligned} M_{{\varphi _0},0}=\begin{pmatrix}0&{} A_{{\varphi _0},0}\\ B_{{\varphi _0},0}&{}0\end{pmatrix}, \end{aligned}$$


$$\begin{aligned} A_{{\varphi _0},0}&= \text {diag}( 0,1_{\varphi _0},\dots ,1_{\varphi _0}),\\ B_{{\varphi _0},0}&= \text {diag}( -2\cdot 1_{\varphi _0},-1_{\varphi _0},\dots ,-1_{\varphi _0}). \end{aligned}$$

The matrix \(E_1\) collects the terms in \(M_{{\varphi _0},\varepsilon }\) which are linear in \(\varepsilon \), while \(E_2\) collects all higher order terms. The matrix \(E_1\) is easily derived from Eqs. (1.13) and (4.1)–(4.2):

$$\begin{aligned} E_1= \varepsilon \left( \begin{array}{cccc|cccc} &{}&{}&{}&{}0&{}1 &{}&{}\\ &{}&{}&{}&{}1 &{}-2&{}1&{}\\ &{}&{}&{}&{}&{}1 &{}-2&{}1\\ &{}&{}&{}&{}&{}&{}1 &{}-1\\ \hline -2 &{}-1 &{}&{}&{}&{}&{}&{}\\ -1 &{}2 &{}-1 &{}&{}&{}&{}&{}\\ &{}-1 &{}2 &{}-1 &{}&{}&{}&{}\\ &{}&{}-1 &{}1 &{}&{}&{}&{} \end{array}\right) . \end{aligned}$$

We now apply the coordinate transformation to \(M_{{\varphi _0},0}\) and \(E_1\). We first observe that

$$\begin{aligned} {{\mathcal {X}}}(M_{{\varphi _0},0}) = (1+{\varphi _0}) {{\mathcal {X}}}(M_{0,0}). \end{aligned}$$

Applying the transformation to \(E_1\), one gets (using again \(s=1/\sqrt{2}\)):

$$\begin{aligned} {{\mathcal {X}}}( E_1 ) = {\varvec{\mathbf{i}}}\varepsilon \left( \begin{array}{cc|rrr|rrr} &{}2{\varvec{\mathbf{i}}}&{}{\varvec{\mathbf{i}}}s&{}&{}&{}{\varvec{\mathbf{i}}}s&{}&{}\\ &{}&{} s&{}&{}&{}- s&{}&{}\\ \hline -{\varvec{\mathbf{i}}}s&{} s&{}-\mathbf{2}&{}1&{}&{}&{}&{}\\ &{}&{} 1&{}-2 &{}1&{}&{}&{}\\ &{}&{} &{}1 &{}-1&{}&{}&{}\\ \hline -{\varvec{\mathbf{i}}}s&{}- s&{}&{}&{}&{}\mathbf{2}&{}-1&{}\\ &{}&{}&{}&{}&{} -1&{}2 &{}-1\\ &{}&{} &{}&{}&{}&{}-1 &{}1 \end{array}\right) . \end{aligned}$$

The reader should be aware that the seeming irregularities of the matrix \({{\mathcal {X}}}(E_1)\) are due to the differences between the expansions of \(p_1\) and the other \(p_j\) in powers of \(\varepsilon \).
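As a concrete cross-check of this computation (our own illustration, with the factor \(\varepsilon \) divided out and the matrices transcribed from the two displays above, for \(n=4\)), one can verify numerically that \(X^*E_1X\) reproduces the claimed matrix:

```python
import numpy as np

s = 1 / np.sqrt(2)

# X as displayed earlier (n = 4); empty positions are zeros.
X = np.zeros((8, 8), dtype=complex)
X[4, 0] = 1.0
X[0, 1] = 1.0
for j in range(3):
    X[1 + j, 2 + j], X[5 + j, 2 + j] = s, 1j * s
    X[1 + j, 5 + j], X[5 + j, 5 + j] = s, -1j * s

# E_1 / eps, read off from the block matrix above.
A1 = np.array([[0, 1, 0, 0], [1, -2, 1, 0],
               [0, 1, -2, 1], [0, 0, 1, -1]], dtype=float)
E1 = np.zeros((8, 8), dtype=complex)
E1[:4, 4:] = A1
E1[4:, :4] = -A1
E1[4, 0] = -2.0   # the bottom-left block is -A1 except for its (1,1) entry

# Claimed transform, X(E_1)/(i eps), read off from the display above.
M2 = np.zeros((8, 8), dtype=complex)
M2[0, 1] = 2j
M2[0, 2], M2[0, 5] = 1j * s, 1j * s
M2[1, 2], M2[1, 5] = s, -s
M2[2, 0], M2[2, 1] = -1j * s, s
M2[5, 0], M2[5, 1] = -1j * s, -s
S = np.array([[-2, 1, 0], [1, -2, 1], [0, 1, -1]], dtype=float)
M2[2:5, 2:5] = S
M2[5:8, 5:8] = -S

defect = np.max(np.abs(X.conj().T @ E1 @ X - 1j * M2))
```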

We next split \({{\mathcal {X}}}(E_1)={{\mathcal {X}}}( E_{11})+{{\mathcal {X}}}( E_{12})\), where

$$\begin{aligned} {{\mathcal {X}}}(E_{11})= {\varvec{\mathbf{i}}}\varepsilon \left( \begin{array}{cc|c|c} 0&{}2{\varvec{\mathbf{i}}}&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}S&{}\\ \hline &{}&{}&{}-S \end{array}\right) , \end{aligned}$$

and the \(\pm S\) are the tri-diagonal parts of Eq. (4.6). It is important to observe that S is tri-diagonal and symmetric, with non-zero off-diagonal elements. Clearly, \({{\mathcal {X}}}(E_{12})\) contains only the two top rows and the first two columns of \({{\mathcal {X}}}(E_1)\). It is thus of the form

$$\begin{aligned} {{\mathcal {X}}}(E_{12})={\varvec{\mathbf{i}}}\varepsilon \left( \begin{array}{cc|c|c} 0&{} 0 &{}***&{}***\\ 0&{} 0 &{}***&{}***\\ \hline *&{}*&{}&{}\\ *&{}*&{}&{}\\ *&{}*&{}&{}\\ \hline *&{}*&{}&{}\\ *&{}*&{}&{}\\ *&{}*&{}&{} \end{array}\right) , \end{aligned}$$

where the “*” denote elements of at most \({\mathcal {O}}_1\), cf. Eq. (4.6).

All in all, we have decomposed

$$\begin{aligned} {{\mathcal {X}}}(M_{{\varphi _0},\varepsilon })=(1+{\varphi _0}){{\mathcal {X}}}(M_{0,0})+{{\mathcal {X}}}(E_{11}) +{{\mathcal {X}}}(E_{12})+{{\mathcal {X}}}(E_2). \end{aligned}$$

The term \({{\mathcal {X}}}(E_2)\) contributes to second order, and it only remains to understand the role of \({{\mathcal {X}}}(E_{12})\). Note now that \({{\mathcal {X}}}( E_{12})\), which is of the form of Eq. (4.8), couples the 0 block to S and \(-S\), but only to the first component of these matrices. The following argument from classical perturbation theory shows that this can contribute to the spectrum only at second order in \(\varepsilon \).

The first order shift of an eigenvalue close to \({\varvec{\mathbf{i}}}\) with eigenvector v is simply \(\langle v |{{\mathcal {X}}}( E_{12}) v\rangle \). But, v is of the form

$$\begin{aligned} v= (0,0,v_1,v_2,v_3,0,0,0)^\top , \end{aligned}$$

due to the form of \({{\mathcal {X}}}( M_{{\varphi _0},0} + E_{11})\). Thus, \({{\mathcal {X}}}(E_{12}) v\) is of the form

$$\begin{aligned} {{\mathcal {X}}}( E_{12}) v= (*,*,0,0,0,0,0,0)^\top , \end{aligned}$$

where “\(*\)” denotes possibly non-zero elements. From the form of v, this implies that

$$\begin{aligned} \langle v | {{\mathcal {X}}}(E_{12}) v\rangle =0. \end{aligned}$$

This means that \({{\mathcal {X}}}(E_{12})\) contributes only in second order in \(\varepsilon \) to the eigenvalues.
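The vanishing of the first order term can also be checked directly: take for v a vector supported on the S block, and for \({{\mathcal {X}}}(E_{12})\) any matrix with the sparsity pattern of Eq. (4.8). A minimal sketch (our own illustration, \(n=4\); the “\(*\)” entries are arbitrary placeholder values, since only the sparsity pattern matters):

```python
import numpy as np

# Eigenvector of the S block, embedded as v = (0,0,v1,v2,v3,0,0,0)^T.
S = np.array([[-2.0, 1.0, 0.0], [1.0, -2.0, 1.0], [0.0, 1.0, -1.0]])
_, vecs = np.linalg.eigh(S)
v = np.zeros(8, dtype=complex)
v[2:5] = vecs[:, 0]

# E12-type matrix: couples the first two components to the rest only.
# The concrete values are placeholders; only the zero pattern matters.
E12 = np.zeros((8, 8), dtype=complex)
E12[0, 2:] = [0.3, -0.7, 1.1, 0.2, 0.9, -0.4]
E12[1, 2:] = [1.0, 0.5, -0.2, 0.8, -1.3, 0.6]
E12[2:, 0] = [0.4, -1.1, 0.7, 0.3, -0.6, 1.2]
E12[2:, 1] = [-0.9, 0.1, 0.5, -0.8, 1.4, 0.2]

# E12 maps v into the first two components, on which v itself vanishes,
# so the first order shift <v | E12 v> is exactly zero.
first_order_shift = v.conj() @ (E12 @ v)
```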

5 Complement of the Zero Eigenspace

Recall that our goal is to write the solution of (1.3) as

$$\begin{aligned} w_j(t) = e^{{\varvec{\mathbf{i}}}( \varphi (t) t + \vartheta (t))} ( p_j(\varphi (t)) + z_j(t)), \end{aligned}$$

and then follow the evolution of \({\varphi }\), \(\vartheta \), and \(z =(\xi + {\varvec{\mathbf{i}}}\eta )\). Since \(\zeta (t)= (\xi (t), \eta (t))\) is constructed to lie in the subspace orthogonal to the tangent space of the cylinder of breathers at \((\varphi _0,\vartheta _0)\), we construct the projection \({\mathbb {P}}\) onto the tangent space at \(\varphi _0\), \(\vartheta _0\). We show that, somewhat surprisingly, with the exception of the zero eigenspace, all other eigenvalues of the linearized matrix are simple, lie on the imaginary axis, and are separated by a distance of at least \(C\varepsilon \) from the remainder of the spectrum. It is convenient to work directly with the transformation \({{\mathcal {X}}}(\cdot )\) of Eq. (4.3).

Theorem 5.1

The operator \({{\mathcal {X}}}({\mathbb {P}}M_{{\varphi _0},\varepsilon })\) has two eigenvalues which are within \({\mathcal {O}}(\varepsilon )\) of 0, and \((n-1)\) purely imaginary eigenvalues close to \({\varvec{\mathbf{i}}}\), separated by \(C\varepsilon \), with \(C>0\). The corresponding eigenvectors (in the \({{\mathcal {X}}}(\cdot )\) basis) are orthogonal. Furthermore, these eigenvectors have non-vanishing last component (i.e., the components \(2+(n-1)\) and \(2+2(n-1)\) in the \({{\mathcal {X}}}(\cdot )\) representation). Analogous statements hold for the \(n-1\) eigenvalues near \(-{\varvec{\mathbf{i}}}\).


The remainder of this section is devoted to the proof of this theorem.

A calculation (using our formulas for \(v^{(j)}\), \(n^{(j)}\)) shows that

$$\begin{aligned} {{\mathcal {X}}}({\mathbb {P}})={{\mathcal {X}}}({\mathbb {P}}_0) +{{\mathcal {X}}}({\mathbb {P}}_1)+{{\mathcal {X}}}({\mathbb {P}}_2), \end{aligned}$$


$$\begin{aligned} {{\mathcal {X}}}({\mathbb {P}}_0)=\left( \begin{array}{cc|ccc|ccc} 0&{}0&{}&{}&{}&{}&{}&{}\\ 0&{}0&{}&{}&{}&{}&{}&{}\\ \hline &{}&{}1&{}&{}&{}&{}&{}\\ &{}&{}&{}1&{}&{}&{}&{}\\ &{}&{}&{}&{}1&{}&{}&{}\\ \hline &{}&{}&{}&{}&{}1&{}&{}\\ &{}&{}&{}&{}&{}&{}1&{}\\ &{}&{}&{}&{}&{}&{}&{}1\\ \end{array}\right) ,\qquad {{\mathcal {X}}}({\mathbb {P}}_1)= \left( \begin{array}{cc|c|c} 0&{}0&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{} P_1&{}\\ \hline &{}&{}&{} P_2 \end{array}\right) , \end{aligned}$$

where the orders of the elements of \(P_j\) are

$$\begin{aligned} P_j = \begin{pmatrix} \varepsilon ^2 &{}\quad \varepsilon ^3 &{}\quad \varepsilon ^4\\ \varepsilon ^3 &{}\quad \varepsilon ^4 &{}\quad \varepsilon ^5\\ \varepsilon ^4 &{}\quad \varepsilon ^5 &{}\quad \varepsilon ^6\\ \end{pmatrix}. \end{aligned}$$

Furthermore the \(P_j\) are symmetric. Note that \( (P_j)_{i,k} ={\mathcal {O}}(\varepsilon ^{i+k})\). Finally, showing orders only,

$$\begin{aligned} {{\mathcal {X}}}({\mathbb {P}}_2)= \left( \begin{array}{cc|ccc|ccc} \varepsilon +{\varphi _0}&{}0&{}\varepsilon &{}0&{}0&{}\varepsilon &{}0&{}0\\ 0&{}\varepsilon +{\varphi _0}&{}\varepsilon &{}0&{}0&{}\varepsilon &{}0&{}0\\ \hline \varepsilon &{}\varepsilon &{}&{}&{}&{}&{}\\ 0 &{}0 &{}&{}&{}&{}&{}\\ 0 &{}0 &{}&{}&{}&{}&{}\\ \hline \varepsilon &{}\varepsilon &{}&{}&{}&{}&{}\\ 0 &{}0 &{}&{}&{}&{}&{}\\ 0 &{}0 &{}&{}&{}&{}&{}\\ \end{array}\right) +{\mathcal {O}}_2. \end{aligned}$$

The omitted terms are similar to those in \(P_j\), and the matrix is again symmetric. Since \({{\mathcal {X}}}({\mathbb {P}})\) is symmetric, the eigenvectors are orthogonal. (This actually follows also from the Hamiltonian nature of the problem, but we need more information to control the \(\gamma \)-dependence.)

Remark 5.2

Clearly, \({{\mathcal {X}}}({\mathbb {P}}_0)\) is the projection on the eigenspace spanned by \(\pm {\varvec{\mathbf{i}}}\), when \(\varepsilon =0\). The part \({{\mathcal {X}}}({\mathbb {P}}_1)\) contains the couplings within the subspace of the eigenvalues \(\pm {\varvec{\mathbf{i}}}\), while \({{\mathcal {X}}}({\mathbb {P}}_2)\) describes the coupling between the zero-eigenspace of dimension 2 and its complement.

From Sect. 4 we also have the decomposition Eq. (4.9):

$$\begin{aligned} {{\mathcal {X}}}(M_{{\varphi _0},\varepsilon })=(1+{\varphi _0}){{\mathcal {X}}}(M_{0,0})+{{\mathcal {X}}}(E_{11}) +{{\mathcal {X}}}(E_{12})+{{\mathcal {X}}}(E_2). \end{aligned}$$

Combining with Eq. (5.1), we see that

$$\begin{aligned} {{\mathcal {X}}}({\mathbb {P}}M_{{\varphi _0},\varepsilon })={{\mathcal {X}}}({\mathbb {P}}){{\mathcal {X}}}(M_{{\varphi _0},\varepsilon }) \end{aligned}$$

leads to 12 terms, many of which are of second order in \(\varepsilon \) and \({\varphi _0}\). We start with the dominant ones.

Since \({{\mathcal {X}}}({\mathbb {P}}_0)\) is just the projection onto the complement of the first 2 eigendirections, we get from Eq. (4.4),

$$\begin{aligned}&{{\mathcal {X}}}({\mathbb {P}}_0 M_{\varphi _0,0})={{\mathcal {X}}}({\mathbb {P}}_0){{\mathcal {X}}}(M_{\varphi _0,0})\\&= {\varvec{\mathbf{i}}}1_\varphi \left( \begin{array}{cc|c|c} 0&{}0&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}\mathbb {1}&{}\\ \hline &{}&{}&{}\mathbb {1}\\ \end{array}\right) \left( \begin{array}{cc|c|c} 0&{}-2{\varvec{\mathbf{i}}}&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}\mathbb {1}&{}\\ \hline &{}&{}&{}-\mathbb {1}\\ \end{array}\right) =-{\varvec{\mathbf{i}}}1_\varphi \left( \begin{array}{cc|c|c} 0&{}0&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}\mathbb {1}&{}\\ \hline &{}&{}&{}-\mathbb {1}\end{array}\right) , \end{aligned}$$

where \(1_\varphi \equiv 1+\varphi \). This is clearly the leading constant term.

The next term is at the origin of the \(\varepsilon \)-splitting of Theorem 5.1. Using Eq. (4.7), we get

$$\begin{aligned}&{{\mathcal {X}}}({\mathbb {P}}_0 E_{11})={{\mathcal {X}}}({\mathbb {P}}_0) {{\mathcal {X}}}(E_{11})\\&={\varvec{\mathbf{i}}}\varepsilon \left( \begin{array}{cc|c|c} 0&{}0&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}\mathbb {1}&{}\\ \hline &{}&{}&{}\mathbb {1}\\ \end{array}\right) \left( \begin{array}{cc|c|c} 0&{}2{\varvec{\mathbf{i}}}&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}S&{}\\ \hline &{}&{}&{}-S \end{array}\right) ={\varvec{\mathbf{i}}}\varepsilon \left( \begin{array}{cc|c|c} 0&{}0&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}S&{}\\ \hline &{}&{}&{}-S \end{array}\right) . \end{aligned}$$

Thus, to leading order, we find

$$\begin{aligned} {{\mathcal {X}}}({\mathbb {P}}_0 (M_{\varphi _0,0}+E_{11}))= -{\varvec{\mathbf{i}}}1_\varphi \left( \begin{array}{cc|c|c} 0&{}0&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}\mathbb {1}&{}\\ \hline &{}&{}&{}-\mathbb {1}\\ \end{array}\right) +{\varvec{\mathbf{i}}}\varepsilon \left( \begin{array}{cc|c|c} 0&{}0&{}&{}\\ 0&{}0&{}&{}\\ \hline &{}&{}S&{}\\ \hline &{}&{}&{}-S \end{array}\right) . \end{aligned}$$

We now use

Proposition 5.3

Consider a tri-diagonal matrix U with \(U_{i,i+1}=U_{i+1,i}\ne 0\) for all i and arbitrary elements on the diagonal. Then

1. All eigenfunctions of U have their first and last components non-zero.

2. All eigenvalues of U are simple.

Postponing the proof, we conclude, by applying the proposition to S and \(-S\) separately, that the matrix \({{\mathcal {X}}}({\mathbb {P}}_0 E_{11})\) has a double eigenvalue 0, and \(2(n-1)\) simple, purely imaginary, eigenvalues \(\pm \lambda _1,\dots ,\pm \lambda _{n-1}\) which are different from 0. Since S is symmetric, its spectrum is real, and by the proposition its eigenvalues are simple; it follows that \({{\mathcal {X}}}({\mathbb {P}}_0(M_{\varphi _0,0}+E_{11}))\) has purely imaginary spectrum, with two eigenvalues equal to 0, \(n-1\) simple eigenvalues near \({\varvec{\mathbf{i}}}(1+\varphi )\), and another \(n-1\) near \(-{\varvec{\mathbf{i}}}(1+\varphi )\). Furthermore, since \(E_{11}\) is proportional to \(\varepsilon \), and S has simple eigenvalues separated by \({\mathcal {O}}(1)\), we conclude

Corollary 5.4

The eigenvalues of \({{\mathcal {X}}}({\mathbb {P}}_0 (M_{\varphi _0,0}+E_{11}))\) are purely imaginary and satisfy \(|\mu _i-\mu _j|>C'_n \varepsilon \) when \(i\ne j\). The constant \(C'_n>0\) only depends on n. The eigenfunctions, in the \({{\mathcal {X}}}(\cdot )\) basis, have non-zero components at positions \(2+(n-1)\) and \(2n\).
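This spectral picture is easy to illustrate numerically (our own sketch, for \(n=4\), sample values of \(\varepsilon \) and \(\varphi \), and S taken as the tri-diagonal block of Eq. (4.6)):

```python
import numpy as np

eps, phi = 0.1, 0.05
S = np.array([[-2.0, 1.0, 0.0], [1.0, -2.0, 1.0], [0.0, 1.0, -1.0]])

# Leading-order matrix  -i(1+phi) diag(0, 1, -1) + i eps diag(0, S, -S).
A = np.zeros((8, 8), dtype=complex)
A[2:5, 2:5] = -1j * (1 + phi) * np.eye(3) + 1j * eps * S
A[5:8, 5:8] = 1j * (1 + phi) * np.eye(3) - 1j * eps * S

lam = np.linalg.eigvals(A)
max_real = np.max(np.abs(lam.real))   # spectrum is purely imaginary

# Within each block the eigenvalues are simple and separated by O(eps):
upper = np.sort(lam.imag[lam.imag > 0.5])   # the n-1 eigenvalues near +i
min_gap = np.min(np.diff(upper))
```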

We will now show that the remaining terms of \({{\mathcal {X}}}({\mathbb {P}}M_{\varphi ,\varepsilon })\) only give rise to corrections of order \({\mathcal {O}}_2\), both in the spectrum and in the eigendirections (on the subspace spanned by \(X^*(\xi ,\eta )^\top \)). For the terms which are of order \({\mathcal {O}}_2\), there is nothing to prove, since they perturb a matrix whose spectrum is separated by \({\mathcal {O}}(\varepsilon )\).

But there are terms of order \({\mathcal {O}}(\varepsilon )\). They are of the form \({{\mathcal {X}}}({\mathbb {P}}_2 M_{0,0})\) or \({{\mathcal {X}}}({\mathbb {P}}_0 E_{12})\). Here, the special form of the matrices comes into play, and the following lemma formulates the key point:

Lemma 5.5

Let U, V be \((r+s) \times (r+s)\) matrices of the form

$$\begin{aligned} U= \left( \begin{array}{c|c} U_1&{}\\ \hline &{}U_2 \end{array}\right) ,\quad V= \left( \begin{array}{c|c} &{}V_1\\ \hline V_2&{} \end{array}\right) , \end{aligned}$$

where \(U_1\) is an \(r \times r\) square matrix, \(U_2\) an \(s \times s\) square matrix, and \(V_1\) and \(V_2\) are \(r \times s\) and \(s \times r\) matrices, respectively. Let \(x=(0,x_2)^\top \) be an eigenvector of U (here \(x_2 \in {\mathbb {R}}^s\)). Then

$$\begin{aligned} \langle x | V x\rangle =0. \end{aligned}$$


Obvious. \(\quad \square \)

We apply this lemma to the two matrices \({{\mathcal {X}}}({\mathbb {P}}_2 M_{0,0} )\) and \({{\mathcal {X}}}({\mathbb {P}}_0 E_{12})\), which play the role of V in the lemma. From Proposition 5.3 we conclude that the eigenvectors of Eq. (5.2) are of the form

$$\begin{aligned} x_1=(0,0,*,\dots ,*,0,\dots ,0)^\top \text { or } x_2=(0,0,0,\dots ,0,*,\dots ,*)^\top . \end{aligned}$$

Therefore Eq. (5.3) applies in this case, and thus the first order contributions of \({{\mathcal {X}}}({\mathbb {P}}_2)\) resp. \({{\mathcal {X}}}(E_{12})\) vanish. Thus, as the spectra are \(\varepsilon \)-separated and simple by Corollary 5.4, we see that for small enough \(|\varepsilon |\), the spectrum maintains the splitting properties when the second order perturbations (in \(\varepsilon \)) are added. Note that, since \({{\mathcal {X}}}(M_{\varphi _0,0})=(1+\varphi _0){{\mathcal {X}}}(M_{0,0})\) by Eq. (4.5) and as \({{\mathcal {X}}}(M_{0,0})\) has the form Eq. (4.4), the effect of \(\varphi _0\) is to just shift the spectrum globally, without changing the spacing of the eigenvalues within a block. This completes the proof of Theorem 5.1. \(\quad \square \)

We end the section with the

Proof of Proposition 5.3

Suppose \(x=(x_1,\dots ,x_n)^\top \) is an eigenfunction with eigenvalue \(\lambda \). Then, from the form of U, we have

$$\begin{aligned} (U_{11}x_1+U_{12}x_2)&=\lambda x_1, \text { so that } x_2=\frac{1}{U_{12}}(\lambda -U_{11})x_1,\\ (U_{21}x_1+U_{22}x_2+U_{23}x_3)&=\lambda x_2, \text { so that }\\ x_3&=\frac{1}{U_{23}} \bigl ((\lambda -U_{22})x_2-U_{21}x_1 \bigr )\\&=\frac{1}{U_{23}}\left( (\lambda -U_{22})\frac{1}{U_{12}}(\lambda -U_{11})-U_{21} \right) x_1. \end{aligned}$$

Continuing in this way, we can write \(x_j=C_j x_1\) for some constant \(C_j\) defined in terms of the \(U_{ij}\) and \(\lambda \). But then, if \(x_1=0\) all other \(x_j\) are 0, and we have not found a non-trivial eigenfunction. The same argument rules out the case \(x_n=0\). Note that other \(x_j\) can vanish.

The proof of 2 is by the same argument: If we normalize, say, \(x_1=1\), the inductive steps above show that the eigenfunctions are uniquely determined by the eigenvalues. Hence, since there is a complete set of eigenvectors, all eigenvalues are simple. \(\quad \square \)
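The conclusion of the proposition is straightforward to test numerically for a sample tri-diagonal matrix (our own illustration; the diagonal and off-diagonal entries below are arbitrary, with the off-diagonals non-zero as the proposition requires):

```python
import numpy as np

# Tri-diagonal U: arbitrary diagonal, non-zero symmetric off-diagonals.
diag = np.array([0.3, -1.2, 2.0, 0.7, -0.5])
off = np.array([1.0, -0.5, 0.8, 1.3])          # all non-zero
U = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

lam, vecs = np.linalg.eigh(U)

min_gap = np.min(np.diff(lam))                 # all eigenvalues simple
edge = np.min(np.abs(vecs[[0, -1], :]))        # first/last components non-zero
```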

6 The Effect of the Dissipation on the Semigroup

In order to control the evolution of \(\zeta (t)\), we need to understand precisely the bounds on the evolution of the semigroup generated by the linear part of the equation. We find that all modes in the subspace orthogonal to the zero eigendirections are uniformly contracted with a rate proportional to \(\gamma \varepsilon \). This is somewhat surprising, given the localized nature of the dissipative term in the equation. However, it follows from the facts we have demonstrated above. Namely, we have shown in Theorem 5.1 that the relevant eigenvalues are simple and isolated, and that the corresponding eigenvectors in the \({{\mathcal {X}}}(\cdot )\) representation have non-zero last component; we therefore conclude from standard perturbation theory that adding the dissipation \(\varGamma \) moves these eigenvalues into the left half plane, by an amount proportional to \(\gamma \varepsilon \) (up to higher order terms). We now check that the coefficients of the term proportional to \(\gamma \varepsilon \) are all non-zero (and depend only on n).


$$\begin{aligned} {{\mathcal {L}}_{\varphi _0,\gamma }}={M_{\varphi _0,\varepsilon }} -{\varGamma },\quad {{\mathcal {X}}}({\mathcal {L}}_{\varphi _0,\gamma })={{\mathcal {X}}}(M_{\varphi _0,\varepsilon }) -{{\mathcal {X}}}(\varGamma ). \end{aligned}$$

An explicit calculation shows that

$$\begin{aligned} {{\mathcal {X}}}(\varGamma )= \left( \begin{array}{cc|ccc|ccc} 0&{}0&{}&{}&{}&{}&{}&{}\\ 0&{}0&{}&{}&{}&{}&{}&{}\\ \hline &{}&{}0&{}0&{}0&{}&{}&{}\\ &{}&{}0&{}0&{}0&{}&{}&{}\\ &{}&{}0&{}0&{}\gamma \varepsilon &{}&{}&{}\\ \hline &{}&{}&{}&{}&{}0&{}0&{}0\\ &{}&{}&{}&{}&{}0&{}0&{}0\\ &{}&{}&{}&{}&{}0&{}0&{}\gamma \varepsilon \end{array}\right) . \end{aligned}$$

Proposition 6.1

There is a constant \(\gamma _0>0\), depending only on n such that for \(\gamma \in [0,\gamma _0]\) one has the bound

$$\begin{aligned} \left\| e^{t{\mathcal {L}}_{\varphi _0,\gamma }}X^*\zeta \right\| \le (1+C_n\gamma ) e^{-\varkappa _n\gamma \varepsilon t}\Vert X^* \zeta \Vert , \end{aligned}$$

for some \(\varkappa _n>0\) and \(C_n>0\), depending only on n.


This result follows from the classical perturbation theory for eigenvalues and eigenfunctions. By Theorem 5.1 we know that for \(\gamma =0\) the purely imaginary eigenvalues are simple, and pairwise separated by \(C\varepsilon \), with \(C>0\) depending only on n. We focus on the eigenvalues close to \(+{\varvec{\mathbf{i}}}\); those near \(-{\varvec{\mathbf{i}}}\) are handled by an entirely analogous procedure. From Proposition 5.3 we see that the eigenvectors \(v_j\), \(j=1,\dots ,n-1\), corresponding to these eigenvalues have a non-vanishing last component, and therefore there exists some \(C'>0\), depending only on n, such that \(\langle v_j , \varGamma v_j\rangle >C' \gamma \varepsilon \). Therefore, up to higher order terms, standard perturbation theory for simple eigenvalues tells us that the spectrum of \({{\mathcal {X}}}({\mathcal {L}}_{\varphi ,\gamma })\) has twice \(n-1\) eigenvalues in the negative half plane, at a distance of \({\mathcal {O}}(\gamma \varepsilon )\) from the imaginary axis.

We next show that the eigendirections deviate by an angle of at most \({\mathcal {O}}(\gamma )\) from the mutually orthogonal eigendirections of the symmetric matrix \({{\mathcal {X}}}(M_{\varphi ,\varepsilon })\), thus proving the bound Eq. (6.3). From perturbation theory (see e.g., [12][I.\(\mathsection \)5.3]), the projection onto one of these eigenspaces is given by

$$\begin{aligned} P_j =-\frac{1}{2\pi {\varvec{\mathbf{i}}}}\int _{C_j} R(z) \mathrm{d}z, \end{aligned}$$

where the contour \(C_j\) is a circle of radius \({\mathcal {O}}(\varepsilon )\) around the eigenvalue of the problem for \(\gamma =0\) and R is the resolvent. The perturbed eigenvalue (which moves by \(\sim K_j \gamma \varepsilon \) into the left half plane) lies inside this circle if \(\gamma <\gamma _0 \) is sufficiently small, where \(\gamma _0\) depends only on n but not on \(\varepsilon \), so long as \(|\varepsilon | < \varepsilon _0\), for some fixed \(\varepsilon _0\). Therefore, \({\left| \left| P_j \right| \right| }<1+{\mathcal {O}}(\gamma )\), since the contour integral over the circle leads to a bound \(1/\varepsilon \) which cancels the factor \(\varepsilon \) in \(\gamma \varepsilon \). \(\quad \square \)
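The mechanism can be illustrated numerically. In the sketch below (our own illustration, with sample values of \(\gamma \), \(\varepsilon \), \(\varphi \), for \(n=4\), and S from Eq. (4.6)), we add the dissipation \({{\mathcal {X}}}(\varGamma )\) to the leading-order matrix of Sect. 5 and check that the \(2(n-1)\) non-zero eigenvalues move strictly into the left half plane, by at most \(\gamma \varepsilon \):

```python
import numpy as np

gamma, eps, phi = 0.1, 0.2, 0.05
S = np.array([[-2.0, 1.0, 0.0], [1.0, -2.0, 1.0], [0.0, 1.0, -1.0]])

# Leading-order generator (blocks near +-i) minus the dissipation X(Gamma),
# which acts only on the last component of each block (positions 5 and 8).
L = np.zeros((8, 8), dtype=complex)
L[2:5, 2:5] = -1j * (1 + phi) * np.eye(3) + 1j * eps * S
L[5:8, 5:8] = 1j * (1 + phi) * np.eye(3) - 1j * eps * S
L[4, 4] -= gamma * eps
L[7, 7] -= gamma * eps

lam = np.linalg.eigvals(L)
shifted = lam[np.abs(lam.imag) > 0.5]   # the 2(n-1) eigenvalues near +-i
max_real = np.max(shifted.real)         # strictly negative ...
min_real = np.min(shifted.real)         # ... but at most gamma*eps below the axis
```

For an eigenvector v, \(\mathrm {Re}\,\lambda = -\gamma \varepsilon (|v_5|^2+|v_8|^2)/\Vert v\Vert ^2\), which is why the shift is proportional to the squared last components and never exceeds \(\gamma \varepsilon \) in size.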

Note that since the change of variables matrix X is unitary, these decay estimates also hold in the original coordinates, i.e.,

Corollary 6.2

There is a constant \(C_n\) such that

$$\begin{aligned} \left\| e^{t{\mathcal {L}}_{\varphi _0,\gamma }} \zeta \right\| \le (1+C_n\gamma ) e^{-\varkappa _n\gamma \varepsilon t}\Vert \zeta \Vert . \end{aligned}$$

7 Projecting onto the Complement of the 0 Eigenspace

In this section, we reexamine equations (3.3)–(3.5) to derive carefully, and explicitly, the equations for the evolution of the variables \(\varphi \), \(\vartheta \), and \(\zeta \). In particular, we look at the constraints on these equations imposed by the requirement that \(\zeta \) remains in the range of \({\mathbb {P}}={\mathbb {P}}_{\varphi _0}\). As \(\varphi \) changes with time, the projection will also generate terms involving \(\varphi (t)-\varphi _0\). We will bound these terms carefully, since they lead to secular growth in \(\zeta \).

As we will often have to compare \(p(\varphi (t))\) to \(p(\varphi _0)\), it is useful to bound this difference as \({\mathcal {O}}(\delta )\) with

$$\begin{aligned} \delta =\varphi (t)-\varphi _0. \end{aligned}$$

We will only be interested in small \(\delta \).

We fix a \(\varphi _0\) small enough for Theorem 5.1 to apply. We next analyze the terms on the r.h.s. of Eq. (3.5), one by one, using that \(\zeta \) is orthogonal to the \(n^{(j)}_{\varphi _0}\).

Lemma 7.1

Consider the linear evolution operator

$$\begin{aligned} U=\begin{pmatrix}0&{}-\bigl ((L-\varphi ) +(p(\varphi ))^2\bigr )\\ (L-\varphi ) +3(p(\varphi ))^2&{}0 \end{pmatrix}. \end{aligned}$$


$$\begin{aligned} \langle n^{(2)}_{\varphi _0} | U \zeta \rangle&={\mathcal {O}}(\delta )\Vert \zeta \Vert , \end{aligned}$$
$$\begin{aligned} \langle n^{(1)}_{\varphi _0} | U \zeta \rangle&={\mathcal {O}}(\delta )\Vert \zeta \Vert , \end{aligned}$$
$$\begin{aligned} {\mathbb {P}}_{\varphi _0} U\zeta&= U\zeta + {\mathcal {O}}(\delta )\Vert \zeta \Vert . \end{aligned}$$


Note that

$$\begin{aligned} \left\langle n^{(2)}_{\varphi _0}\big |U\zeta \right\rangle = \left\langle U^*n^{(2)}_{\varphi _0} \big |\zeta \right\rangle , \end{aligned}$$

and so,

$$\begin{aligned} U^*&n^{(2)}_{\varphi _0}=\begin{pmatrix} 0&{}\bigl ((L-\varphi ) +3(p(\varphi ))^2\bigr )\\ -\bigl (L-\varphi +(p(\varphi ))^2\bigr )&{}0\end{pmatrix} \begin{pmatrix}p(\varphi _0)\\ 0\end{pmatrix}=\begin{pmatrix}{\mathcal {O}}(\delta )\\ 0 \end{pmatrix}. \end{aligned}$$

We use here, and throughout, the smoothness of \(p(\varphi )\) and its expansion in powers of \(\varepsilon \). The replacement of \(\varphi \) by \(\varphi _0\) therefore leads to an error term in Eq. (7.1) of the form

$$\begin{aligned} {\mathcal {O}}(\delta ){\left| \left| \xi \right| \right| }\le {\mathcal {O}}(\delta ) {\left| \left| \zeta \right| \right| }. \end{aligned}$$

This proves Eq. (7.1).

We next study \( \left\langle n^{(1)}_{\varphi _0}\big |U\zeta \right\rangle \) and take the adjoint \(\left\langle U^*n^{(1)}_{\varphi _0}\big |\zeta \right\rangle \), using \(n^{(1)}_{\varphi _0}=(2+{\mathcal {O}}(\varepsilon +\varphi _0 ))(0,\partial _{\varphi _0} p(\varphi _0))^\top \). Recall that

$$\begin{aligned} \begin{pmatrix}0&{}B_{\varphi _0}\\ A_{\varphi _0}&{}0\end{pmatrix}n^{(1)}_{\varphi _0}=n^{(2)}_{\varphi _0}, \end{aligned}$$

which is orthogonal to \(\zeta =(\xi ,\eta )^\top \). We write

$$\begin{aligned} U^* =\begin{pmatrix}0&{}B_\varphi -B_{\varphi _0}\\ A_\varphi -A_{\varphi _0}&{}0\end{pmatrix}+\begin{pmatrix}0&{}B_{\varphi _0}\\ A_{\varphi _0}&{}0\end{pmatrix}, \end{aligned}$$

and therefore

$$\begin{aligned} \left\langle n^{(1)}_{\varphi _0}\big |U\zeta \right\rangle&= \left\langle \begin{pmatrix}0&{}B_\varphi -B_{\varphi _0}\\ A_\varphi -A_{\varphi _0}&{}0\end{pmatrix}n^{(1)}_{\varphi _0}\big |\zeta \right\rangle&+\left\langle \begin{pmatrix}0&{}B_{\varphi _0}\\ A_{\varphi _0}&{}0\end{pmatrix}n^{(1)}_{\varphi _0}\big |\zeta \ \right\rangle \\&= {\mathcal {O}}(\delta ) \Vert \zeta \Vert , \end{aligned}$$

since the second term is zero by construction. This proves Eq. (7.2). The identity Eq. (7.3) follows. \(\quad \square \)

Lemma 7.2

The \(\varGamma \)-dependent terms of Eq. (3.5) lead to the bounds

$$\begin{aligned} \left\langle n^{(2)}_{\varphi _0} \big |\begin{pmatrix}-C_\varGamma \xi -C_\varGamma p(\varphi )\\ -C_\varGamma \eta \end{pmatrix}\right\rangle&= {\mathcal {O}}(\gamma \varepsilon ^{n}){\left| \left| \zeta \right| \right| }-2(1+{\mathcal {O}}(\delta ))\gamma \varepsilon ^{2n-1}, \end{aligned}$$
$$\begin{aligned} \left\langle n^{(1)}_{\varphi _0}\big |\begin{pmatrix}-C_\varGamma \xi -C_\varGamma p(\varphi )\\ -C_\varGamma \eta \end{pmatrix} \right\rangle&={\mathcal {O}}(\gamma \varepsilon ^{n}){\left| \left| \zeta \right| \right| }, \end{aligned}$$
$$\begin{aligned} {\Vert {\mathbb {P}}_{\varphi _0} \varGamma \zeta - \varGamma \zeta \Vert }&{ \le C \gamma \varepsilon ^{n} {\left| \left| \zeta \right| \right| }} . \end{aligned}$$


From the definition of \(n^{(2)}_{\varphi _0}\) we get

$$\begin{aligned} 2&\left\langle \begin{pmatrix}p(\varphi _0)\\ 0\end{pmatrix}\big |\begin{pmatrix}-C_\varGamma \xi -C_\varGamma p(\varphi )\\ -C_\varGamma \eta \end{pmatrix} \right\rangle \\&= -2\gamma \varepsilon \xi _n\, p(\varphi _0)_n -2\gamma \varepsilon (p(\varphi _0))_n (p(\varphi ))_n \\&={\mathcal {O}}(1) {\gamma \varepsilon \xi _n\varepsilon ^{n-1}}-2\gamma \varepsilon ^{2n-1}(1+{\mathcal {O}}(\delta )), \end{aligned}$$

using the expansion of \(p(\varphi )\) in powers of \(\varepsilon \), and observing that \(C_\varGamma \) is proportional to \(\gamma \varepsilon \).

Similarly, from the definition of \(n^{(1)}_{\varphi _0}\), we get

$$\begin{aligned} \left\langle n^{(1)}_{\varphi _0}\big |\begin{pmatrix}C_\varGamma \xi +C_\varGamma p(\varphi )\\ C_\varGamma \eta \end{pmatrix} \right\rangle = \partial _{\varphi _0}p(\varphi _0)\cdot C_\varGamma \eta =\gamma \varepsilon (n^{(1)}_{\varphi _0})_n \eta _n={\mathcal {O}}(1) \gamma \varepsilon ^{n}\eta _n, \end{aligned}$$

using the expansion for \((n^{(1)}_{\varphi _0})_j=\partial _{\varphi _0}p_j(\varphi _0)= {\mathcal {O}}( \varepsilon ^{j-1})\). The last equation follows from the first two. \(\quad \square \)

Lemma 7.3

Consider the terms involving \((t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) \). We have, omitting throughout the factor \((t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) \):

$$\begin{aligned} \left\langle n^{(2)}_{\varphi _0} \big |\begin{pmatrix}\eta \\ -\xi -p(\varphi )\end{pmatrix} \right\rangle&= (2+{\mathcal {O}}(\varepsilon +\varphi )){\left| \left| \zeta \right| \right| }, \end{aligned}$$
$$\begin{aligned} \left\langle n^{(1)}_{\varphi _0} \big |\begin{pmatrix}\eta \\ -\xi -p(\varphi )\end{pmatrix} \right\rangle&= -1+{\mathcal {O}}(\delta +\varepsilon )+{\mathcal {O}}(1){\left| \left| \zeta \right| \right| }, \end{aligned}$$
$$\begin{aligned} {\mathbb {P}}_{\varphi _0} \begin{pmatrix}\eta \\ -\xi -p(\varphi )\end{pmatrix}&=\begin{pmatrix}{\mathcal {O}}({\left| \left| \zeta \right| \right| })\\ {\mathcal {O}}(\delta +\varepsilon +{\left| \left| \zeta \right| \right| })\end{pmatrix}. \end{aligned}$$


Equation (7.7) follows by observing that

$$\begin{aligned} \left\langle \begin{pmatrix}p(\varphi _0)\\ 0\end{pmatrix}\big |\begin{pmatrix}\eta \\ -\xi \end{pmatrix} \right\rangle = (p(\varphi _0)\cdot \eta ), \end{aligned}$$


$$\begin{aligned} \left\langle \begin{pmatrix}p(\varphi _0)\\ 0\end{pmatrix}\big |\begin{pmatrix}0\\ -p(\varphi )\end{pmatrix} \right\rangle =0, \end{aligned}$$

and using \({\left| \left| p(\varphi _0) \right| \right| }=1+{\mathcal {O}}(\varepsilon +\varphi )\). To prove Eq. (7.8), observe that, by our normalization,

$$\begin{aligned} \left\langle n^{(1)}_{\varphi _0} \big |\begin{pmatrix}0\\ p(\varphi _0)\end{pmatrix} \right\rangle =\left\langle n^{(1)}_{\varphi _0}\big |v^{(1)}_{\varphi _0} \right\rangle =1, \end{aligned}$$

and therefore,

$$\begin{aligned} \left\langle n^{(1)}_{\varphi _0} \big |\begin{pmatrix}0\\ -p(\varphi )\end{pmatrix}\right\rangle =-1+{\mathcal {O}}(\delta ). \end{aligned}$$

On the other hand,

$$\begin{aligned} \left\langle n^{(1)}_{\varphi _0} \big |\begin{pmatrix}\eta \\ -\xi \end{pmatrix} \right\rangle =-\partial _{\varphi _0}p(\varphi _0)\xi , \end{aligned}$$

and thus Eq. (7.8) follows. Finally Eq. (7.9) follows from Eqs. (7.7) and (7.8);

$$\begin{aligned} {\mathbb {P}}_{\varphi _0} \begin{pmatrix}\eta \\ -\xi -p(\varphi )\end{pmatrix}&=\begin{pmatrix}\eta \\ -\xi -p(\varphi )\end{pmatrix}\\ {}&\quad - (2+{\mathcal {O}}(\varepsilon ))\begin{pmatrix}\partial _{\varphi _0}p(\varphi _0)\\ 0\end{pmatrix}(p(\varphi _0)\cdot \eta ) \\&\quad - \begin{pmatrix}0\\ p(\varphi _0)\end{pmatrix}(-1+{\mathcal {O}}(\delta +\varepsilon +{\left| \left| \zeta \right| \right| })). \end{aligned}$$

The term involving \(\eta \) cancels by the normalization of \(v^{(2)}\) and \(n^{(2)}\). The term \(-p(\varphi _0)\cdot (-1)\) cancels with \(-p(\varphi )\) up to \({\mathcal {O}}(\delta +\varepsilon )\), and thus, Eq. (7.9) follows. \(\quad \square \)

Lemma 7.4

The terms involving \({{\dot{\varphi }}}\) are bounded as follows (omitting the factor \({{\dot{\varphi }}}\)):

$$\begin{aligned} \left\langle n^{(2)}_{\varphi _0} \big |\begin{pmatrix}-\partial _\varphi p(\varphi )\\ 0\end{pmatrix}\right\rangle&=-1+{\mathcal {O}}(\delta ), \end{aligned}$$
$$\begin{aligned} \left\langle n^{(1)}_{\varphi _0} \big |\begin{pmatrix}-\partial _\varphi p(\varphi )\\ 0\end{pmatrix}\right\rangle&=0 , \end{aligned}$$
$$\begin{aligned} {\mathbb {P}}_{\varphi _0} \begin{pmatrix}-\partial _{\varphi }p(\varphi )\\ 0\end{pmatrix}&= \begin{pmatrix}{\mathcal {O}}(\delta )\\ 0\end{pmatrix}. \end{aligned}$$


Recall that

$$\begin{aligned} \begin{pmatrix}\partial _\varphi p(\varphi )\\ 0\end{pmatrix}\sim (1/2,0,\dots ,0)^\top , \end{aligned}$$

and so, since \(n^{(2)}_{\varphi _0}\sim 2 (p(\varphi _0),0)^\top \), we find

$$\begin{aligned} \left\langle n^{(2)}_{\varphi _0} \big |\begin{pmatrix}-\partial _\varphi p(\varphi )\\ 0\end{pmatrix}\right\rangle&=\left\langle 2\begin{pmatrix}p(\varphi _0)\\ 0\end{pmatrix}\big |\begin{pmatrix}-\partial _\varphi p(\varphi )\\ 0\end{pmatrix}\right\rangle \\&=- 2p(\varphi _0)\cdot \partial _\varphi p(\varphi )= -1+{\mathcal {O}}(\delta ) , \end{aligned}$$

which is Eq. (7.10). From the form of \(n^{(1)}\), Eq. (7.11) is obvious. Finally,

$$\begin{aligned} {\mathbb {P}}_{\varphi _0} \begin{pmatrix}-\partial _{\varphi }p(\varphi )\\ 0\end{pmatrix}&= \begin{pmatrix}-\partial _{\varphi }p(\varphi )\\ 0\end{pmatrix}+\begin{pmatrix}\partial _{\varphi _0}p(\varphi _0)\\ 0\end{pmatrix} +\begin{pmatrix}{\mathcal {O}}(\delta )\\ 0\end{pmatrix}. \end{aligned}$$

\(\quad \square \)

We now combine Lemmas 7.1–7.4. Note that \({\mathbb {Q}}_{\varphi _0}\equiv \mathbb {1}-{\mathbb {P}}_{\varphi _0}\) projects on a two-dimensional space. Let \({\mathbb {Q}}^{(j)}=|v^{(j)}_{\varphi _0}\rangle \langle n^{(j)}_{\varphi _0} |\), \(j=1,2\).

For \(j=2\), we get contributions:

From Eq. (7.1), we have \({\mathbb {Q}}^{(2)} U\zeta ={\mathcal {O}}(\delta ) {\left| \left| \zeta \right| \right| }\).

From Eq. (7.4) we get \({\mathbb {Q}}^{(2)}\begin{pmatrix}C_\varGamma (\xi +p(\varphi ))\\ C_\varGamma \eta \end{pmatrix}=-2(1+{\mathcal {O}}(\delta ))\gamma \varepsilon ^{2n-1}+{\mathcal {O}}(\gamma \varepsilon ^n ){\left| \left| \zeta \right| \right| }\).

From Eq. (7.7) we get \((t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) {\mathbb {Q}}^{(2)}\begin{pmatrix}\eta \\ -\xi -p(\varphi )\end{pmatrix}=(t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) {\mathcal {O}}({\left| \left| \zeta \right| \right| })\), and

from Eq. (7.10) we get \({{\dot{\varphi }}}{\mathbb {Q}}^{(2)}\begin{pmatrix}-\partial _\varphi p(\varphi )\\ 0\end{pmatrix}={{\dot{\varphi }}}(-1+{\mathcal {O}}(\delta ))\).

Similarly, for \(j=1\), we get contributions:

From Eq. (7.2), we have \({\mathbb {Q}}^{(1)} U\zeta ={\mathcal {O}}(\delta ) {\left| \left| \zeta \right| \right| }\).

From Eq. (7.5) we get \({\mathbb {Q}}^{(1)} \begin{pmatrix}C_\varGamma (\xi +p(\varphi ))\\ C_\varGamma \eta \end{pmatrix}={\mathcal {O}}(\gamma \varepsilon ^n) {\left| \left| \zeta \right| \right| }\).

From Eq. (7.8) we get

\((t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) {\mathbb {Q}}^{(1)}\begin{pmatrix}\eta \\ -\xi -p(\varphi )\end{pmatrix}=(t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) (-1+{\mathcal {O}}(\delta +\varepsilon +{\left| \left| \zeta \right| \right| }))\), and

from Eq. (7.11) we get \({{\dot{\varphi }}}{\mathbb {Q}}^{(1)}\begin{pmatrix}-\partial _\varphi p(\varphi )\\ 0\end{pmatrix}=0\).

By construction we have that \({\mathbb {Q}}_{\varphi _0}\equiv \mathbb {1}-{\mathbb {P}}_{\varphi _0}\) projects on the null-space, and therefore

$$\begin{aligned} {\mathbb {Q}}_{\varphi _0} {{\dot{\zeta }}} =0. \end{aligned}$$

Since \({\mathbb {Q}}_{\varphi _0}{{\dot{\zeta }}}=0\), we find, upon summing, for the “1” component (recalling the nonlinear terms in (3.5)):

$$\begin{aligned} 0&={\mathcal {O}}(\delta ){\left| \left| \zeta \right| \right| }+(-2+{\mathcal {O}}(\delta ))\gamma \varepsilon ^{2n-1}\nonumber \\ {}&\quad +{\mathcal {O}}(\gamma \varepsilon ^n {\left| \left| \zeta \right| \right| })+ (t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) {\mathcal {O}}({\left| \left| \zeta \right| \right| })+{{\dot{\varphi }}}(-1+{\mathcal {O}}(\delta )) + {{\mathcal {O}}(\Vert \zeta \Vert ^2)} , \end{aligned}$$

and for the “2” component:

$$\begin{aligned} 0= {\mathcal {O}}(\delta +\gamma \varepsilon ^n) {\left| \left| \zeta \right| \right| }- (t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) (1+{\mathcal {O}}(\delta +\varepsilon +{\left| \left| \zeta \right| \right| }))+ {{\mathcal {O}}(\Vert \zeta \Vert ^2)}.\ \ \end{aligned}$$

Finally, using the projection \({\mathbb {P}}_{\varphi _0}\), we find:

From Eq. (7.3), \({\mathbb {P}}_{\varphi _0} U\zeta = U\zeta + {\mathcal {O}}(\delta )\Vert \zeta \Vert \),

from Eq. (7.6), \({\mathbb {P}}_{\varphi _0}\varGamma \zeta =\varGamma \zeta +{\mathcal {O}}(\gamma \varepsilon ^{n} )\Vert \zeta \Vert \),

from Eq. (7.9), \((t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) {\mathbb {P}}_{\varphi _0} \begin{pmatrix}\eta \\ -\xi -p(\varphi )\end{pmatrix} =(t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) \begin{pmatrix}0\\ {\mathcal {O}}(\delta +\varepsilon )\end{pmatrix}\),

and finally from Eq. (7.12), \( {{\dot{\varphi }}}{\mathbb {P}}_{\varphi _0} \begin{pmatrix}-\partial _{\varphi }p(\varphi )\\ 0\end{pmatrix}= {{\dot{\varphi }}}\begin{pmatrix}{\mathcal {O}}(\delta ) \\ 0\end{pmatrix}\).

Summing these terms, we get

$$\begin{aligned} {{\dot{\zeta }}} ={\mathcal {L}}_{\varphi _0,\gamma }\zeta +(t{{\dot{\varphi }}}+{{\dot{\vartheta }}}) \begin{pmatrix}0\\ {\mathcal {O}}(\delta +\varepsilon )\end{pmatrix} +{{\dot{\varphi }}}\begin{pmatrix}{\mathcal {O}}(\delta ) \\ 0\end{pmatrix} + {\mathcal {O}}(\Vert \zeta \Vert ^2) . \end{aligned}$$

Simplifying the notation somewhat, and substituting Eq. (7.14) into Eq. (7.13) we formulate Eqs. (7.13)–(7.15) as a proposition:

Proposition 7.5

One has

$$\begin{aligned} t{{\dot{\varphi }}} +{{\dot{\vartheta }}}= & {} {\mathcal {O}}(\delta +\gamma \varepsilon ^{n}){\left| \left| \zeta \right| \right| }+ {\mathcal {O}}(\Vert \zeta \Vert ^2) ,\nonumber \\ {{\dot{\varphi }}}= & {} -(2+{\mathcal {O}}(\delta ))\gamma \varepsilon ^{2n-1}+{\mathcal {O}}(\delta +\gamma \varepsilon ^n){\left| \left| \zeta \right| \right| }+ {\mathcal {O}}(\Vert \zeta \Vert ^2) , \nonumber \\ {{\dot{\zeta }}}= & {} {\mathcal {L}}_{\varphi _0,\gamma } \zeta +{\mathcal {O}}(\delta +\gamma \varepsilon ^n){\left| \left| \zeta \right| \right| }\begin{pmatrix}0\\ {\mathcal {O}}(\delta +\varepsilon )\end{pmatrix}\nonumber \\&\quad -\begin{pmatrix}(2+{\mathcal {O}}(\delta ))\gamma \varepsilon ^{2n-1}{\mathcal {O}}(\delta )\\ 0 \end{pmatrix} + {\mathcal {O}}(\Vert \zeta \Vert ^2). \end{aligned}$$

Here, as \(\zeta =(\xi ,\eta )^\top \),

$$\begin{aligned} {\mathcal {L}}_ {\varphi _0,\gamma }\zeta = \begin{pmatrix} 0&{}A_{\varphi _0}\\ B_{\varphi _0} &{}0 \end{pmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix} - \begin{pmatrix} C_\varGamma &{}0\\ 0&{}C_\varGamma \end{pmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix}. \end{aligned}$$

8 Bounds on the Evolution of \(\zeta \)

In principle, \(\Vert \zeta \Vert \) can grow as the system evolves, and there are two possible causes. First, for short times, the bound Eq. (6.4)

$$\begin{aligned} \left\| e^{t{\mathcal {L}}_{\varphi _0,\gamma }} \zeta \right\| \le (1+C_n\gamma ) e^{-\varkappa _n\gamma \varepsilon t}\Vert \zeta \Vert , \end{aligned}$$

does not contract. Secondly, \(\zeta (t)\) is orthogonal to the cylinder of breathers at the initial point \((\varphi _0,\vartheta _0)\), but as \(\varphi \) and \(\vartheta \) evolve with time, this is no longer the case. We must periodically reorthogonalize \(\zeta (t)\) by a procedure which we detail in the next section, and which replaces \(\zeta (T)\) by \({{\widehat{\zeta }}}\). As we will show in Eq. (9.4), this leads to a growth which is bounded by

$$\begin{aligned} \Vert {\widehat{\zeta }} \Vert \le (1+ C_R |\varphi (T)-\varphi _0| \gamma \varepsilon ^n) \Vert \zeta (T)\Vert . \end{aligned}$$

In this section we show that the contraction in the semigroup generated by the dissipative terms in the equation is sufficient to overcome these growths, if we wait a sufficiently long time. We will show (up to details spelled out below) that if \({\left| \left| \zeta (0) \right| \right| }\le \gamma \varepsilon ^n\), and \(T=C_T/\varepsilon \), then \((1+C_R |\varphi (T)-\varphi _0|\gamma \varepsilon ^n){\left| \left| \zeta (T) \right| \right| }\le \gamma \varepsilon ^n\). Furthermore, for all \(t\in [0,T]\) one has \({\left| \left| \zeta (t) \right| \right| }\le 2 \gamma \varepsilon ^n\). To prove such statements, we reconsider the equations of Proposition 7.5, which we rewrite in a slightly simplified way: We define

$$\begin{aligned} \delta (t) =\varphi (t)-\varphi _0 , \end{aligned}$$

and then

$$\begin{aligned} \dot{s}&={\mathcal {O}}(\delta +\gamma \varepsilon ^{n}){\left| \left| \zeta \right| \right| }+ {\mathcal {O}}( \Vert \zeta (t) \Vert ^2) , \end{aligned}$$
$$\begin{aligned} {{\dot{\delta }}}&=-(2+{\mathcal {O}}(\delta ))\gamma \varepsilon ^{2n-1}+{\mathcal {O}}(\delta +\gamma \varepsilon ^n){\left| \left| \zeta \right| \right| }+ {\mathcal {O}}( \Vert \zeta (t) \Vert ^2) ~ , \end{aligned}$$
$$\begin{aligned} {{\dot{\zeta }}}&={\mathcal {L}}_{\varphi _0,\gamma }\zeta +\dot{s} \begin{pmatrix}0\\ {\mathcal {O}}(\delta + \varepsilon )\end{pmatrix} +{{\dot{\delta }}} \begin{pmatrix}{\mathcal {O}}(\delta ) \\ 0\end{pmatrix} + {\mathcal {O}}( \Vert \zeta (t) \Vert ^2) . \end{aligned}$$


Definition 8.1

We define the arrival time \(T\) by

$$\begin{aligned} T= \frac{8C_n}{\varkappa _n \varepsilon }\equiv C_T\varepsilon ^{-1} . \end{aligned}$$

This definition ensures that

$$\begin{aligned} (1+\frac{3}{2} C_n\gamma )e^{-\varkappa _n\gamma \varepsilon T/4} \le 1. \end{aligned}$$
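Indeed, substituting \(T=8C_n/(\varkappa _n\varepsilon )\) into the exponent gives \(\varkappa _n\gamma \varepsilon T/4 = 2C_n\gamma \), so the claimed inequality reduces to an elementary one:

$$\begin{aligned} \Bigl (1+\frac{3}{2} C_n\gamma \Bigr ) e^{-2C_n\gamma }\le 1 \quad \Longleftrightarrow \quad 1+\frac{3}{2}x\le e^{2x}\quad (x=C_n\gamma \ge 0), \end{aligned}$$

which holds since \(e^{2x}\ge 1+2x\ge 1+\frac{3}{2}x\) for \(x\ge 0\).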

The remaining factor \(e^{-\varkappa _n \gamma \varepsilon T/4}\) will be used to bound \( (1+C_R|\delta |\gamma \varepsilon ^n)\), while another \(e^{-\varkappa _n \gamma \varepsilon T/2}\) will be used to bound the contributions from the mixed terms in Eqs. (8.2)–(8.4).

Since we have a coupled system, we introduce a norm over times in \([0,T]\). Let \(x=(s,\delta ,\zeta )\), and consider a family of functions

$$\begin{aligned} \{x\}_{T}=\{x(\tau )\}_{\tau \in [0,T]}. \end{aligned}$$

We define

$$\begin{aligned} {\left| \left| \left| \{x\}_t \right| \right| \right| } =\max \bigl ( \sup _{\tau \in [0,t]} |s(\tau )|, \sup _{\tau \in [0,t] }|\delta (\tau )| ,\sup _{\tau \in [0,t] } C_\zeta {\left| \left| \zeta (\tau ) \right| \right| }\bigr ) ,\quad \text {with}\quad C_\zeta ^{-1} =\gamma \varepsilon ^{n}. \end{aligned}$$

Eqs. (8.2)–(8.4) define an evolution \(t\mapsto {\mathcal {F}}^t\).

Theorem 8.2

Let \(t\mapsto x(t)\) be a family of functions (not necessarily a solution of the system above) which satisfies \({\left| \left| \left| \{x\}_{T} \right| \right| \right| }\le 2\). Define \({\mathcal {F}}x=\{{\mathcal {F}}^\tau x\}_{\tau \in [0,T]}\) as the family evolving from x(0). If \({\left| \left| \left| \{x\}_0 \right| \right| \right| }\le 2\) and \(s(0)=\delta (0)=0\), then the solution of the system above satisfies

$$\begin{aligned} {\left| \left| \left| \{{\mathcal {F}}x\}_{t} \right| \right| \right| } \le 4 , \end{aligned}$$

for all \(t\in [0,T]\). In other words, \({\mathcal {F}}\) maps such initial conditions into the ball of radius 4.

Furthermore, when \({\left| \left| \left| \{x\}_0 \right| \right| \right| }\le 1\) (this means in particular \({\left| \left| \zeta (0) \right| \right| }\le \gamma \varepsilon ^n\)) then the solution of the above system satisfies at time \(T\):

$$\begin{aligned} | \delta (T)&+ 2\gamma \varepsilon ^{2n-1} T| \le 2\gamma \varepsilon ^{2n-1/2} T,\\ {\left| \left| \zeta (T) \right| \right| }&\le e^{- \varkappa _n \gamma \varepsilon T/4} e^{- \varkappa _n \gamma \varepsilon T/2}(1-\gamma \varepsilon )^{-1} (1+ \frac{3}{2} C_n \gamma ) \gamma \varepsilon ^n \\ {}&\le e^{- \varkappa _n \gamma \varepsilon T/4}\gamma \varepsilon ^n. \end{aligned}$$

Corollary 8.3

Referring to Eq. (8.1) (i.e., Eq. (9.4)), we get the bound

$$\begin{aligned} \Vert {{\widehat{\zeta }}}\Vert&\le (1+ C_R |\varphi (T)-\varphi _0| \gamma \varepsilon ^n) \Vert \zeta (T)\Vert \\ {}&\le e^{- \varkappa _n \gamma \varepsilon T/4}(1+ C_R |\varphi (T)-\varphi _0| \gamma \varepsilon ^n)\gamma \varepsilon ^n\le \gamma \varepsilon ^n. \end{aligned}$$

In other words, \(\Vert {{\widehat{\zeta }}}\Vert \) (at time T) stays within the ball of radius \(\gamma \varepsilon ^n\).

Remark 8.4

The norm \({\left| \left| \left| \cdot \right| \right| \right| }\) was introduced to allow for an a priori bound on \(\zeta (t)\), which is needed because of the way we estimate the evolution of the coupled system Eqs. (8.2)–(8.4).

Remark 8.5

We assumed \(\delta (0)=0\) since that is the case which interests us. Also, by the gauge invariance, we may assume that \(\vartheta _0=\vartheta (0)=0\).


We first study \(\delta \).

Lemma 8.6

Assume \({\left| \left| \left| \{x\}_{T} \right| \right| \right| }\le 2\). Then, we have, for \(t\le T=C_{T}/\varepsilon \),

$$\begin{aligned} | \delta (t) + 2\gamma \varepsilon ^{2n-1}t | \le 2 \gamma \varepsilon ^{2n-1/2} t. \end{aligned}$$

Remark 8.7

Note that this means that to lowest order in \(\varepsilon \), \(\delta (t) \sim -2 \gamma \varepsilon ^{2n-1} t\) for \(0 \le t \le T\), which is the rate we found in [1].

Proof of Lemma 8.6

It is here that we use the a priori bound; later we will see that the actual orbit \(\zeta (\tau )\) indeed satisfies this bound. By the assumption, we have \({\left| \left| \zeta (\tau ) \right| \right| }\le 2/C_\zeta =2 \gamma \varepsilon ^{n}\) for \(\tau \in [0,T]\). Therefore, we can bound \(\delta (\tau )\) as follows: Eq. (8.3) is of the form below, with local names \(A\), \(B\), \(C\) and finite constants \(C_B\) and \(C_C\):

$$\begin{aligned} {{\dot{\delta }}} (t)&= -A +B(t)\delta (t)+C(t) ,\\ A~~&=2\gamma \varepsilon ^{2n-1} ,\\ |B(t)|&\le C_B(\gamma \varepsilon ^{2n-1}+{\left| \left| \zeta (t) \right| \right| }) ,\\ |C(t)|&\le C_C(\gamma \varepsilon ^n{\left| \left| \zeta (t) \right| \right| } + \Vert \zeta (t) \Vert ^2). \end{aligned}$$

We have \(\delta (0)=0\). The equation for \(u(t)\equiv \delta (t)+At\) reads

$$\begin{aligned} \dot{u} = B(t)u(t) + (C(t)-At\cdot B(t)). \end{aligned}$$
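This follows by substituting \(\delta (t)=u(t)-At\) into Eq. (8.3):

$$\begin{aligned} \dot{u} = {{\dot{\delta }}} + A = B(t)\delta (t)+C(t) = B(t)\bigl (u(t)-At\bigr )+C(t) = B(t)u(t)+\bigl (C(t)-At\cdot B(t)\bigr ). \end{aligned}$$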

Lemma 8.8

If \({\left| \left| \left| \{x\}_{T} \right| \right| \right| }\le 2\), then

$$\begin{aligned} |u(t)| \le At \varepsilon ^{1/2} . \end{aligned}$$

This clearly proves Eq. (8.7) and hence Lemma 8.6. \(\quad \square \)

Proof of Lemma 8.8

The solution of Eq. (8.8) is

$$\begin{aligned} u(t)=\int _0^t\mathrm{d}\tau \,\bigl ( C(\tau )-A\tau B(\tau )\bigr )e^{\int _\tau ^t \mathrm{d}\tau ' B(\tau ')}. \end{aligned}$$
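As a numerical sanity check of this variation-of-constants formula, the following sketch compares it against a direct Runge–Kutta integration of \(\dot{u}=B(t)u+f(t)\), \(u(0)=0\). The coefficients \(B(t)=0.1\cos t\) and \(f(t)=\sin t\) are illustrative stand-ins for \(B(\tau )\) and \(C(\tau )-A\tau B(\tau )\); they are not the functions of the proof.

```python
import math

# Check the variation-of-constants (Duhamel) formula
#   u(t) = int_0^t f(tau) * exp(int_tau^t B) dtau
# for u' = B(t) u + f(t), u(0) = 0, with illustrative coefficients.

def B(t): return 0.1 * math.cos(t)
def f(t): return math.sin(t)

T, N = 2.0, 20000
h = T / N

def rhs(t, u): return B(t) * u + f(t)

# Direct integration by the classical fourth-order Runge-Kutta scheme.
u = 0.0
for k in range(N):
    t = k * h
    k1 = rhs(t, u)
    k2 = rhs(t + h / 2, u + h / 2 * k1)
    k3 = rhs(t + h / 2, u + h / 2 * k2)
    k4 = rhs(t + h, u + h * k3)
    u += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Duhamel formula, with I(t) = int_0^t B; both integrals are evaluated
# by the trapezoidal rule on the same grid.
I = [0.0]
for k in range(N):
    I.append(I[-1] + h / 2 * (B(k * h) + B((k + 1) * h)))
vals = [f(k * h) * math.exp(I[N] - I[k]) for k in range(N + 1)]
u_formula = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

assert abs(u - u_formula) < 1e-6
```

The two answers agree to well within the quadrature error, confirming the formula used in the proof.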

Let \( B_{\max }=\max _{\tau \in [0,t]} |B(\tau )|\), and \(C_{\max }=\max _{\tau \in [0,t]} |C(\tau )|\). From the assumptions, we have, for sufficiently small \(\varepsilon \),

$$\begin{aligned} B_{\max }t&\le C_{T}C_B 2(\gamma \varepsilon ^{2n-1}+C_\zeta ^{-1})/\varepsilon \\&\le C_{T}C_B 2(\gamma \varepsilon ^{2n-2}+\varepsilon ^{n-1/2})\ll 2\varepsilon ,\\ C_{\max }&\le C_C 2(\gamma \varepsilon ^n C_\zeta ^{-1} + 4 C_{\zeta }^{-2} )\le 4 C_C \gamma \varepsilon ^ {2n} , \end{aligned}$$

(where we have assumed that \(\gamma < 1/2\)). The term coming from \(C(\tau )\) in Eq. (8.9) is bounded by

$$\begin{aligned} C_{\max } \frac{|1-e^{B_{\max }t}|}{B_{\max }}\le 2C_{\max }t , \end{aligned}$$

since \( B_{\max }t\ll 1\).

This leads to a bound for \(2C_{\max }t\) of the form

$$\begin{aligned} 2C_{\max }t\le 8 C_C\gamma \varepsilon ^{2n}t , \end{aligned}$$

which is much smaller than \(At=2\gamma \varepsilon ^{2n-1}t\) (when \(\varepsilon \) is small enough). The term in the integral coming from \(At\cdot B(t)\) can be bounded as:

$$\begin{aligned} \int _0^t \mathrm{d}\tau \,&A\tau \cdot B_{\max } e^{(t-\tau )B_{\max } }= A\frac{|B_{\max } t -e^{B_{\max }t}+1|}{B_{\max }}\\ {}&\le At\left( \frac{B_{\max }t}{2}+{\mathcal {O}}((B_{\max }t)^2)\right) \le At B_{\max }t , \end{aligned}$$

since we already showed \( B_{\max }t\ll 2\varepsilon \). Collecting terms, we get, for \(t\le C_{T}/\varepsilon \),

$$\begin{aligned} |u(t)| \le (8C_C\gamma \varepsilon ^{2n} +2 \varepsilon A )t \le A t \varepsilon ^{1/2} , \end{aligned}$$

which completes the proof of Lemma 8.8. \(\quad \square \)
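For completeness, the integral appearing in the last estimate is evaluated by integration by parts (writing \(b=B_{\max }\)):

$$\begin{aligned} \int _0^t \mathrm{d}\tau \, A\tau \, b\, e^{(t-\tau )b} = A\Bigl [-\tau e^{(t-\tau )b}\Bigr ]_0^t + A\int _0^t \mathrm{d}\tau \, e^{(t-\tau )b} = A\,\frac{e^{bt}-1-bt}{b} , \end{aligned}$$

and the expansion \(e^{bt}=1+bt+\frac{1}{2}(bt)^2+{\mathcal {O}}((bt)^3)\) yields the factor \(\frac{B_{\max }t}{2}+{\mathcal {O}}((B_{\max }t)^2)\) used above.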

We continue the proof of Theorem 8.2. The evolution of s is bounded in much the same way as that of \(\delta \); we leave this to the reader. (We actually do not make use of these bounds.) We finally analyze the evolution of \(\zeta \), Eq. (8.4), which controls the distance from the cylinder. By the estimates Eqs. (8.3) and (8.7) on \(\delta \) and \({{\dot{\delta }}} \), we see that

$$\begin{aligned} {\mathcal {O}}(\delta {{\dot{\delta }}})={\mathcal {O}}(\gamma ^2 \varepsilon ^{4n-2}t)+{\mathcal {O}}(\gamma \varepsilon ^{2n-1}t+\gamma \varepsilon ^{n}) {\left| \left| \zeta \right| \right| }. \end{aligned}$$

Therefore, the equation for \({{\dot{\zeta }}} \) takes the form

$$\begin{aligned} {{\dot{\zeta }}} = {\mathcal {L}}_{\varphi ,\gamma } \zeta +{\mathcal {O}}(\delta +\varepsilon )\cdot {\mathcal {O}}(\delta +\gamma \varepsilon ^n){\left| \left| \zeta \right| \right| }+{\mathcal {O}}(\Vert \zeta \Vert ^2) + {\mathcal {O}}(\gamma ^2 \varepsilon ^{4n-2}t). \end{aligned}$$

Using Eq. (8.7), this simplifies to

$$\begin{aligned} {{\dot{\zeta }}} = {\mathcal {L}}_{\varphi ,\gamma } \zeta +{\mathcal {O}}( \gamma \varepsilon ^{n+1}+\gamma \varepsilon ^{2n-1}t ){\left| \left| \zeta \right| \right| }+{\mathcal {O}}(\Vert \zeta \Vert ^2)+{\mathcal {O}}(\gamma ^2\varepsilon ^{4n-2}t). \end{aligned}$$

From the estimates on the semigroup generated by \({\mathcal {L}}_{\varphi ,\gamma }\) from Proposition 6.1 we conclude that

$$\begin{aligned} {\left| \left| \zeta (t) \right| \right| }\le & {} (1+C_n \gamma )e^{-\varkappa _n\gamma \varepsilon t/2}{\left| \left| \zeta _0 \right| \right| } + R_2 (1+C_n\gamma )\int _0^t e^{-\varkappa _n \gamma \varepsilon (t-s)/2} \Vert \zeta (s) \Vert ^2 ds \nonumber \\&\quad + (1+C_n\gamma )\int _0^t \mathrm{d}\tau \, e^{-\varkappa _n\gamma \varepsilon \tau /2 }X , \end{aligned}$$

where \(X={\mathcal {O}}(\gamma ^2\varepsilon ^{4n-2}(t-\tau ))\) bounds the contribution from the last term in Eq. (8.10). We note that the contribution from \({\mathcal {O}}(\gamma \varepsilon ^{n+1} +\gamma \varepsilon ^{2n-1}t)\) which also multiplies \(\zeta \), has been absorbed into half the decay rate \(\varkappa _n\gamma \varepsilon \).


Define

$$\begin{aligned} Z(t) = \sup _{0 \le \tau \le t} e^{ \varkappa _n\gamma \varepsilon \tau /2} \Vert \zeta (\tau ) \Vert . \end{aligned}$$

Then, from (8.11), we see that

$$\begin{aligned} Z(t)&\le (1+ C_n \gamma ) \Vert \zeta _0 \Vert \\ {}&\quad + R_2 (1+ C_n \gamma ) \int _0^t e^{- \varkappa _n \gamma \varepsilon s} ds (Z(t))^2 + C_X ( \gamma ^2 \varepsilon ^{4n-2}t^2 e^{ \varkappa _n\gamma \varepsilon t /2} ) \\&\le (1+ C_n \gamma ) \Vert \zeta _0 \Vert + \frac{2 R_2 (1+ C_n \gamma )}{\varkappa _n \gamma \varepsilon } (e^{\varkappa _n \gamma \varepsilon t/2} -1) (Z(t))^2 \\&\quad + C_X ( \gamma ^2 \varepsilon ^{4n-2}t^2 e^{ \varkappa _n\gamma \varepsilon t /2} ) . \end{aligned}$$

Suppose that \(\Vert \zeta _0 \Vert \le \gamma \varepsilon ^n\). Then, by continuity, for t small, we have

$$\begin{aligned} \frac{2 R_2 (1+ C_n \gamma )}{\varkappa _n \gamma \varepsilon } Z(t) \le \gamma \varepsilon . \end{aligned}$$

Define \(T^*\) to be the largest value such that

$$\begin{aligned} \sup _{0 \le t \le T^*} \frac{2 R_2 (1+ C_n \gamma )}{\varkappa _n \gamma \varepsilon } Z(t) \le \gamma \varepsilon . \end{aligned}$$


Then

$$\begin{aligned} \left( 1 - \frac{2 R_2 (1+ C_n \gamma )}{\varkappa _n \gamma \varepsilon } Z(t)\right) Z(t) \le (1+ C_n \gamma ) \Vert \zeta _0 \Vert + C_X ( \gamma ^2 \varepsilon ^{4n-2}t^2 e^{ \varkappa _n\gamma \varepsilon t /2} ) , \end{aligned}$$


so that

$$\begin{aligned} Z(t) \le (1-\gamma \varepsilon )^{-1} \left[ (1+ C_n \gamma ) \Vert \zeta _0 \Vert + C_X ( \gamma ^2 \varepsilon ^{4n-2}t^2 e^{ \varkappa _n\gamma \varepsilon t /2} ) \right] , \end{aligned}$$

for \(0 \le t \le T^*\).

Since \(T = \frac{8 C_n}{\varkappa _n \varepsilon }\), if \(n > 2\), and if \(\varepsilon \) is sufficiently small, then \(T \le T^*\) and we have for \(0 \le t \le T\),

$$\begin{aligned} Z(t) \le (1-\gamma \varepsilon )^{-1} \left( 1+ \frac{3}{2} C_n \gamma \right) \gamma \varepsilon ^n . \end{aligned}$$

From the definition of Z(t), this implies

$$\begin{aligned} \Vert \zeta (t) \Vert \le e^{- \varkappa _n \gamma \varepsilon t/2} (1-\gamma \varepsilon )^{-1} \left( 1+ \frac{3}{2} C_n \gamma \right) \gamma \varepsilon ^n , \end{aligned}$$


and thus

$$\begin{aligned} \Vert \zeta (T) \Vert \le \gamma \varepsilon ^n , \end{aligned}$$

using the definition of T. \(\quad \square \)
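The mechanism of the proof, a weak exponential contraction dominating slowly growing perturbations, can be illustrated on a scalar caricature of Eq. (8.10). The constants below (\(n=3\), \(\varepsilon =0.1\), \(\gamma =0.1\), \(\varkappa _n=C_n=1\)) are illustrative choices only, not values required by the theorem.

```python
import math

# Scalar caricature of Eq. (8.10):
#   z' = -(kappa_n*gamma*eps/2) z + gamma*eps^(2n-1) t z + z^2
#        + gamma^2 eps^(4n-2) t,
# started on the sphere of radius gamma*eps^n.  Illustrative constants only.
n, eps, gamma, kappa_n, C_n = 3, 0.1, 0.1, 1.0, 1.0
T = 8 * C_n / (kappa_n * eps)        # arrival time of Definition 8.1
z0 = gamma * eps**n

z, t, dt = z0, 0.0, 1e-2             # explicit Euler integration
while t < T:
    rate = -kappa_n * gamma * eps / 2 + gamma * eps**(2 * n - 1) * t
    z += dt * (rate * z + z**2 + gamma**2 * eps**(4 * n - 2) * t)
    t += dt

# The a priori bound holds on [0, T], and at time T the solution has
# re-entered the ball of radius gamma*eps^n, as in Theorem 8.2.
assert 0 < z <= 2 * z0
assert z <= gamma * eps**n
```

For these values the contraction \(e^{-\varkappa _n\gamma \varepsilon T/2}\approx e^{-0.4}\) visibly dominates the slowly growing linear perturbation and the tiny forcing.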

9 Re-orthogonalization

As the system evolves, the solution will remain close (at least for some time) to the cylinder of breather solutions for the \(\gamma = 0\) equations. However, it will drift, so that the base point on the cylinder changes with time, while the vector \(\zeta \) stays orthogonal to the tangent space to the cylinder at the initial base point. In the first subsection we show that we can periodically choose new coordinates in such a way that \(\zeta \) remains small for a very long time, while the frequency \(\varphi \) of the base point in the cylinder changes in a controlled and computable way. The change in the base point manifests itself in the presence of the terms proportional to \(\delta \) in the equation for \(\zeta \). To counteract this secular growth, we will stop the evolution after a long, but finite, interval and “reset” the initial data so that the “new” initial data \({\widehat{\zeta }}\) is again orthogonal to the tangent space at the “new” initial point \(({\widehat{\varphi }},{\widehat{\vartheta }})\) on the cylinder. Our approach in this section is inspired by the work of Promislow [10] on pattern formation in reaction-diffusion equations, but is complicated by the very weak dissipative properties of the semigroup \(e^{t {\mathcal {L}}_{\varphi ,\gamma }}\). In particular, we will not be able to show that the normal component \(\zeta \) of the solution is strongly contracted, but we will prove that it remains small for a very long period, during which the solution evolves close to the cylinder of breathers.

Key to this approach is the fact that in a sufficiently small neighborhood of the cylinder of breathers, the angle and phase of the point on the cylinder and the normal direction at that point provide a smooth coordinate system. More precisely, one has:

Proposition 9.1

Fix \(0 < \varPhi _0 \ll 1\). There exists \(\mu > 0\) such that for any \({\bar{\varphi }} \in [1-\varPhi _0,1+\varPhi _0]\), \({\bar{\vartheta }}\in [0,2\pi )\), \(\Vert {\bar{\zeta }} \Vert < \mu \), there exists \(({\widehat{\varphi }}, {\widehat{\vartheta }}, {\widehat{\zeta }})\) such that

$$\begin{aligned} e^{{\varvec{\mathbf{i}}}{\bar{\vartheta }}} p({\bar{\varphi }}) + {\bar{\zeta }} = e^{{\varvec{\mathbf{i}}}{\widehat{\vartheta }}} p({\widehat{\varphi }}) + {\widehat{\zeta }} , \end{aligned}$$

and \({\widehat{\zeta }}\) is normal to the tangent space of the family of breathers at \(({\widehat{\varphi }},{\widehat{\vartheta }})\).

Remark 9.2

The utility of this proposition is that if we choose any point near our family of breathers, we can find \(({\widehat{\varphi }},{\widehat{\vartheta }},{\widehat{\zeta }})\) to use as initial conditions for our modulation equations (7.16) with \({\widehat{\zeta }} \in {\text {Range}}({\mathbb {P}}_{{\widehat{\varphi }}})\).


Proof

The proof is an application of the implicit function theorem. Begin by rescaling \({\bar{\zeta }} \rightarrow \mu {\bar{\zeta }}\), with \(\Vert {\bar{\zeta }} \Vert = 1\). Then we have \({\widehat{\zeta }} = e^{{\varvec{\mathbf{i}}}{\bar{\vartheta }}} p({\bar{\varphi }}) + \mu {\bar{\zeta }} - e^{{\varvec{\mathbf{i}}}{\widehat{\vartheta }}} p({\widehat{\varphi }})\). We wish to choose \(({\widehat{\varphi }},{\widehat{\vartheta }})\) so that \({\widehat{\zeta }}\) is orthogonal to the tangent space at \(({\widehat{\varphi }},{\widehat{\vartheta }})\). Thus, we define

$$\begin{aligned} F({\widehat{\varphi }},{\widehat{\vartheta }};\mu ) = \left( \begin{array}{c} \langle n^{(1)}_{ {\widehat{\varphi }},{\widehat{\vartheta }} } | {\widehat{\zeta }} \rangle \\ \langle n^{(2)}_{{\widehat{\varphi }},{\widehat{\vartheta }}} |{\widehat{\zeta }} \rangle \end{array}\right) = \left( \begin{array}{c} \langle n^{(1)}_{ {\widehat{\varphi }},{\widehat{\vartheta }} } | (e^{{\varvec{\mathbf{i}}}{\bar{\vartheta }}} p({\bar{\varphi }}) + \mu {\bar{\zeta }} - e^{{\varvec{\mathbf{i}}}{\widehat{\vartheta }}} p({\widehat{\varphi }}) ) \rangle \\ \langle n^{(2)}_{ {\widehat{\varphi }},{\widehat{\vartheta }} } | (e^{{\varvec{\mathbf{i}}}{\bar{\vartheta }}} p({\bar{\varphi }}) + \mu {\bar{\zeta }} - e^{{\varvec{\mathbf{i}}}{\widehat{\vartheta }}} p({\widehat{\varphi }}) ) \rangle \end{array}\right) , \end{aligned}$$

and the proposition follows by finding zeros of this function.

Note that \(F({\bar{\varphi }},{\bar{\vartheta }};0) = 0\). To compute the derivative of F with respect to \(({\widehat{\varphi }},{\widehat{\vartheta }})\) we recall from the previous sections that the derivatives of \(e^{{\varvec{\mathbf{i}}}\vartheta } p(\varphi )\) with respect to \(\varphi \) and \(\vartheta \) give precisely the two vectors \(v^{(j)}_{\varphi ,\vartheta }\) (\( j=1,2\)) which span the zero eigenspace. Thus, by the normalization of the vectors \(n^{(j)}_{{\widehat{\varphi }},{\widehat{\vartheta }}}\), we see that

$$\begin{aligned} D_{\varphi ,\vartheta } F |_{\mu =0} = \left( \begin{array}{cc} 1 &{}\quad 0 \\ 0 &{}\quad 1\end{array} \right) . \end{aligned}$$

Thus, the implicit function theorem implies that there exists \(\mu _0 > 0\) such that for any \( | \mu | < \mu _0\), the equation \(F({\widehat{\varphi }},{\widehat{\vartheta }};\mu ) = 0\) has a solution. \(\square \)
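To make the constructive nature of this argument concrete, the following toy sketch carries out the re-orthogonalization by Newton's method when the family of breathers is replaced by the unit circle \(p(\vartheta )=(\cos \vartheta ,\sin \vartheta )\) in the plane; the point \(w\) and the size of the perturbation are hypothetical choices.

```python
import math

# Toy version of Proposition 9.1: the family of breathers is replaced by
# the unit circle p(theta) = (cos theta, sin theta).  Given a point w near
# the circle, Newton's method (the constructive proof of the implicit
# function theorem) finds theta_hat such that the residual
# zeta_hat = w - p(theta_hat) is orthogonal to the tangent p'(theta_hat).

def p(t):  return (math.cos(t), math.sin(t))
def dp(t): return (-math.sin(t), math.cos(t))

theta_bar = 0.7
w = (p(theta_bar)[0] + 0.03, p(theta_bar)[1] - 0.02)   # perturbed point

def F(t):
    # Orthogonality defect F(t) = <p'(t), w - p(t)>.
    px, py = p(t); tx, ty = dp(t)
    return tx * (w[0] - px) + ty * (w[1] - py)

def dF(t):
    # F'(t) = <p''(t), w - p(t)> - |p'(t)|^2 = -<p(t), w - p(t)> - 1.
    px, py = p(t)
    return -(px * (w[0] - px) + py * (w[1] - py)) - 1.0

theta_hat = theta_bar                    # start Newton at the old base point
for _ in range(20):
    theta_hat -= F(theta_hat) / dF(theta_hat)

assert abs(F(theta_hat)) < 1e-12         # residual is normal to the tangent
assert abs(theta_hat - theta_bar) < 0.1  # base point moves only by O(|zeta|)
```

As in Remark 9.2, the new residual \(w-p({\widehat\vartheta })\) is purely radial, and the shift of the base point is of the order of the perturbation.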

Remark 9.3

Note that the size of the neighborhood \(\mu _0\) on which we have a solution is independent of the base point \((\varphi ,\vartheta )\) — thus we have good coordinates on a uniform neighborhood of our original family of breathers.

Remark 9.4

Note that the constructive nature of the proof of the implicit function theorem also results in good estimates of the size of the solutions of the equation. In particular, for small \(\mu \), there exists a constant \(C>0\) such that the change in the angle and phase can be estimated as:

$$\begin{aligned} | {\bar{\varphi }} - {\widehat{\varphi }} | + | {\bar{\vartheta }} - {\widehat{\vartheta }} | \le C \mu (| \langle n^{(1)}_{{\bar{\varphi }},{\bar{\vartheta }}} | {\bar{\zeta }} \rangle | + | \langle n^{(2)}_{{\bar{\varphi }},{\bar{\vartheta }}} | {\bar{\zeta }} \rangle | ) . \end{aligned}$$

9.1 The intuitive picture

Suppose that we start from a point near our family of breathers, with coordinates \((\varphi _0,\vartheta _0,\zeta _0)\), with \(\zeta _0 \in {\text {Range}}({\mathbb {P}}_{\varphi _0})\). We allow the system to evolve for a time T to be specified below. After this time, we will have reached a point \((\varphi _1 = \varphi (T), \vartheta _1 = \vartheta (T), \zeta _ 1 = \zeta (T))\). In terms of our original variables, this point will be

$$\begin{aligned} w_1 = e^{{\varvec{\mathbf{i}}}(\varphi _1 T + \vartheta _1)} ( p(\varphi _1) + z_1) , \end{aligned}$$

where \(z_1 = (\xi _1 + {\varvec{\mathbf{i}}}\eta _1)\), with \((\xi _1,\eta _1)^\top = \zeta _1\). The point is that \(\zeta _1 \) is no longer orthogonal to the tangent space to the cylinder of breathers at the point \((\varphi _1,\vartheta _1)\). This leads to secular growth in \(\zeta \), and eventually, we would lose control of this evolution. To prevent this, we re-express the point \(w_1\) in terms of new variables \(({\widehat{\varphi }},{\widehat{\vartheta }},{\widehat{\zeta }})\), with \({\widehat{\zeta }}\) orthogonal to the tangent space at \(({\widehat{\varphi }},{\widehat{\vartheta }})\), and restart the evolution of (7.16) with these new initial conditions. The only complication is that we must keep careful track of how much we change the various variables in the course of this re-orthogonalization process. We now explain how this is done.

Without loss of generality assume that we have chosen the “stopping time” T so that the phase \( e^{{\varvec{\mathbf{i}}}(\varphi _1 T + \vartheta _1)} =1\). (If this is not the case, we can always use the phase invariance of the equation to rotate the solution so that this does hold.) Then, after time T, the trajectory of our system will have reached the point

$$\begin{aligned} w_1 = p(\varphi _1) + z_1 . \end{aligned}$$

By Proposition 9.1 we know that there exists \(({\widehat{\varphi }}, {\widehat{\vartheta }}, {\widehat{\zeta }})\) with

$$\begin{aligned} w_1 = p(\varphi _1) + z_1 = e^{{\varvec{\mathbf{i}}}{\widehat{\vartheta }}} p({\widehat{\varphi }}) + {\widehat{\zeta }} , \end{aligned}$$

and \({\widehat{\zeta }}\) is normal to the cylinder of breathers at \(({\widehat{\varphi }}, {\widehat{\vartheta }})\). We now restart the evolution of the modulation equations (7.16) and follow the evolution as before.

The last thing we need to control the long-time evolution of the system is to estimate by how much we change \(\varphi \) and \(\zeta \) in the course of this re-orthogonalization. (The change in \(\vartheta \) is inconsequential, since it does not affect the magnitude of the solution, and since the phase-invariance of the equations of motion allows us to rotate the system back to zero phase whenever needed.) The change from \(\varphi _1\) to \({\widehat{\varphi }}\) is estimated with the aid of the implicit function theorem.

We know that the vectors \(\langle n^{(j)}_{\varphi ,\vartheta } |\) depend smoothly on \(\varphi \) and hence

$$\begin{aligned} | \langle n^{(1)}_{\varphi _1} | \zeta _1 \rangle |&\le | \langle n^{(1)}_{\varphi _1} | \zeta _1 \rangle - \langle n^{(1)}_{\varphi _0} | \zeta _1 \rangle | + | \langle n^{(1)}_{\varphi _0} | \zeta _1 \rangle | \nonumber \\&\le | \langle n^{(1)}_{\varphi _1} | \zeta _1 \rangle - \langle n^{(1)}_{\varphi _0} | \zeta _1 \rangle | \le C |\delta (T)| \Vert \zeta _1 \Vert \le C |\delta (T)|\gamma \varepsilon ^n . \end{aligned}$$

Here, the first inequality just uses the triangle inequality, the second the fact that \(\zeta _1\) is orthogonal to \(n^{(1)}_{\varphi _0} \) by construction, the third uses Cauchy-Schwarz, plus the smooth dependence of the normal vectors on \(\varphi \), and the last, the estimate on \(\zeta _1\) coming from Theorem 8.2. If we combine this estimate with (9.1), we see that the change in \(\varphi \) from \(\varphi _1\) to \({\widehat{\varphi }}\) produced by the re-orthogonalization is extremely small.

It remains to estimate the corresponding change in \(\zeta \) when we replace \(\zeta _1\) by \({\widehat{\zeta }}\). We have

$$\begin{aligned} p(\varphi _1) + z_1 = e^{{\varvec{\mathbf{i}}}{\widehat{\vartheta }}} p({\widehat{\varphi }} ) + {\widehat{z}} , \end{aligned}$$

where as usual \({\widehat{z}} = {\widehat{\xi }} + {\varvec{\mathbf{i}}}{\widehat{\eta }} \), with \({\widehat{\zeta }} = ({\widehat{\xi }},{\widehat{\eta }})^\top \). Again, using the fact that \(p(\varphi )\) depends smoothly on \(\varphi \), plus estimates on the difference in \(\varphi _1\) and \({\widehat{\varphi }}\) given by (9.2) and similar estimates for the \({\widehat{\vartheta }}\), we see that

$$\begin{aligned} \Vert \zeta _1 - {\widehat{\zeta }} \Vert \le C_R |\delta (T)| \gamma \varepsilon ^n , \end{aligned}$$


and hence

$$\begin{aligned} \Vert {\widehat{\zeta }} \Vert \le (1+ C_R |\delta (T)| \gamma \varepsilon ^n) \Vert \zeta _1\Vert , \end{aligned}$$

for some finite \(C_R\).

10 Iterating

The estimates of the previous section show that if we take initial conditions for (1.3) close to the cylinder of breathers for the undamped equations, and if we express that initial point as

$$\begin{aligned} w_0 = p(\varphi _0) + z_0 , \end{aligned}$$

with \(\zeta _0 = (\mathfrak {R}(z_0),\mathfrak {I}(z_0))^\top \in {\text {Range}}({\mathbb {P}}_0)\) and \(\Vert \zeta _0 \Vert \le \gamma \varepsilon ^n\), then \(\varphi \), \(\vartheta \), and \(\zeta \) will evolve via (8.2)–(8.4) and after a time \(T = \frac{8 C_n}{\varkappa _n \varepsilon }\) we will have

$$\begin{aligned} \varphi (T) - \varphi _0= & {} -2 \gamma \varepsilon ^{2n-1} T (1+{\mathcal {O}}(\varepsilon ^{1/2}) ) , \end{aligned}$$
$$\begin{aligned} \Vert \zeta (T) \Vert\le & {} (1-\gamma \varepsilon )^{-1} (1+ \frac{3}{2} C_n \gamma ) \gamma \varepsilon ^n . \end{aligned}$$

As usual we ignore the evolution of \(\vartheta \) since any \(\vartheta \) dependence of the solution can be removed using the phase invariance of the problem.

As discussed in Sect. 9, \(\zeta (T)\) will not lie in \({\text {Range}}({\mathbb {P}}_{\varphi (T)})\). Thus, we now re-orthogonalize. To see what is involved, consider again Fig. 2.

Fig. 2

Illustration of the re-orthogonalization process. At time 0, the orbit starts at a distance \(\Vert \zeta \Vert \) from the base point \(p(\varphi )\), which lies on the cylinder (shown as a line). \(\zeta \) is orthogonal to the tangent at the point \(p(\varphi )\) on the cylinder (this is the 2-dimensional subspace of 0 eigenvalues). At time T, the solution has moved to \(p({{\bar{\varphi }}})+{{\bar{\zeta }}} \), with \({{\bar{\zeta }}}\) still orthogonal to the tangent space at \(p(\varphi )\). The re-orthogonalization consists of finding a new base point \({{\widehat{\varphi }}}\) in such a way that \(p({{\bar{\varphi }}})+{{\bar{\zeta }}} =p({{\widehat{\varphi }}}) +{{\widehat{\zeta }}} \) and \({{\widehat{\zeta }}}\) is orthogonal to the tangent space at \(p(\widehat{\varphi })\). This solution is found by the implicit function theorem. Note that \(\Vert {{\widehat{\zeta }}}\Vert \) might be larger than \(\Vert \zeta \Vert \), but this is compensated by the contraction induced by the semigroup, due to the dissipation

This means we reexpress

$$\begin{aligned} w(T) = e^{{\varvec{\mathbf{i}}}\varphi (T) T } p(\varphi (T)) + z(T) = e^{{\varvec{\mathbf{i}}}({\widehat{\varphi }} T + {\widehat{\vartheta }})} p({\widehat{\varphi }}) + {\widehat{z}} , \end{aligned}$$

where as usual, \({\widehat{z}} = ({\widehat{\xi }} + {\varvec{\mathbf{i}}}{\widehat{\eta }})\), with \(({\widehat{\xi }},{\widehat{\eta }}) ={\widehat{\zeta }} \) and \({\widehat{\zeta }} \in {\text {Range}}({\mathbb {P}}_{{\widehat{\varphi }}})\).
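Schematically (the precise formulation is the one of Sect. 9; the notation \(\partial _\varphi p\) for the derivative of the breather family with respect to its frequency is introduced here only for illustration), the pair \(({\widehat{\varphi }},{\widehat{\vartheta }})\) is fixed by requiring that the remainder be orthogonal to the two neutral directions at the new base point:

$$\begin{aligned} \langle {\widehat{z}} ,\, \partial _\varphi p({\widehat{\varphi }}) \rangle = 0 , \qquad \langle {\widehat{z}} ,\, {\varvec{\mathbf{i}}}\, p({\widehat{\varphi }}) \rangle = 0 , \end{aligned}$$

where \(\langle \cdot ,\cdot \rangle \) is the real inner product (identifying \({\mathbb {C}}^n\) with \({\mathbb {R}}^{2n}\)), and \(\partial _\varphi p({\widehat{\varphi }})\) and \({\varvec{\mathbf{i}}}\, p({\widehat{\varphi }})\) span the 2-dimensional tangent space of Fig. 2. These conditions are satisfied up to a small error at \((\varphi (T),0)\), and the implicit function theorem then yields a solution nearby.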

We now recall the estimates for the change in \(\varphi \) and \(\zeta \) produced by the re-orthogonalization. First, from (9.2), plus the estimate on \(\delta (T)\) from Lemma 8.6, we have

$$\begin{aligned} | \varphi (T) - {\widehat{\varphi }}| \le C \delta (T) \varepsilon ^n , \end{aligned}$$

and hence by the triangle inequality we see that

$$\begin{aligned} | (\varphi _0 - {\widehat{\varphi }}) - 2 \gamma \varepsilon ^{2n-1} T | \le 4 \gamma \varepsilon ^{2n - 1/2} T , \end{aligned}$$

i.e., to leading order \(\varphi _0 - {\widehat{\varphi }} \approx \varphi _0 - \varphi (T)\).

Likewise, from (9.3), we have

$$\begin{aligned} \Vert {\widehat{\zeta }} \Vert&\le \Vert \zeta (T) \Vert + \Vert \zeta (T) - {\widehat{\zeta }} \Vert \\&\le e^{-\varkappa _n \gamma \varepsilon T/2} (1-\gamma \varepsilon )^{-1} \left( 1 + \tfrac{3}{2} C_n \gamma \right) \varepsilon ^n + 4 \gamma \varepsilon ^{2n-1} T \\&\le \gamma \varepsilon ^n , \end{aligned}$$

for \(\varepsilon \) sufficiently small. If we look at the second line above, we see how the contraction and the “waiting” for a time T come in: namely, the first factor contracts because of the estimates on the semigroup (and the dissipation), while the next two factors come from the reprojection and from the prefactor in the bound on the semigroup.
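With \(T = \frac{4 C_n}{\varkappa _n \varepsilon }\), this competition can be made completely explicit; a direct substitution gives

$$\begin{aligned} e^{-\varkappa _n \gamma \varepsilon T/2} = e^{-2 C_n \gamma } , \qquad 4 \gamma \varepsilon ^{2n-1} T = \frac{16 C_n \gamma }{\varkappa _n }\, \varepsilon ^{2n-2} , \end{aligned}$$

so waiting for a time T produces a contraction factor \(e^{-2 C_n \gamma }\) which is independent of \(\varepsilon \), while the reprojection contributes only \({\mathcal {O}}(\varepsilon ^{2n-2})\), which for \(n > 2\) is of higher order than \(\varepsilon ^n\).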

Thus, we can begin to evolve our equation of motion starting from the point w(T), but now expressed as

$$\begin{aligned} w(T) = e^{{\varvec{\mathbf{i}}}(T{\widehat{\varphi }} + {\widehat{\vartheta }})} p({\widehat{\varphi }}) + {\widehat{z}} , \end{aligned}$$

where \({{\widehat{\zeta }}} \in {\text {Range}}({\mathbb {P}}_{{\widehat{\varphi }}})\) and \(\Vert {\widehat{\zeta }} \Vert \le \gamma \varepsilon ^n\). Thus, the new representation of w(T) has the same properties as the representation of \(w_0\) that we started with, and hence we can continue to evolve our trajectory, which will remain close to the cylinder of breathers.

11 Conclusions and Future Directions

We have proven that the presence of breather solutions leads to very slow energy decay in discrete nonlinear Schrödinger equations. There are many other types of lattice dynamical systems that possess breather solutions, such as discrete Klein–Gordon equations or Fermi–Pasta–Ulam–Tsingou models. (For a recent survey of such results, see [13].) It would be interesting to see whether breathers play a similar role in the transport of energy through lattices governed by such equations. In addition, it is clear, at least intuitively, that the slow energy decay induced by the breathers is related to their strong localization properties, which mean that most of the energy of the system is localized far from the region in which the dissipation acts. Thus, it would also be interesting to investigate systems whose breathers are either more or less strongly localized than those of the NLS system studied here [14, 15].