1 Introduction

1.1 The Main Result

In this article, we examine the propagation of solitary waves on the surface of an ocean under ice, regarding the water as a perfect fluid in irrotational flow and the ice sheet as an elastic shell which bends with the surface without stretching and without friction or cavitation between it and the fluid beneath. For this purpose we consider the model derived by Plotnikov and Toland [33] using the Euler equations for inviscid fluid flow and the Cosserat theory of hyperelastic shells (Fig. 1).

Fig. 1: An ice sheet on the free surface of a two-dimensional perfect fluid

We suppose that the fluid occupies the region bounded below by a rigid horizontal bottom \(\{y = 0\}\) and above by the free surface \(\{y = h+\eta (x,t)\}\), where h is the depth of the water in its undisturbed state. Travelling waves move in the \(x\)-direction with constant speed \(c\) and without change of shape, so that \(\eta (x,t) = \eta (x-ct)\), and solitary waves are localised travelling waves, so that \(\eta (x-ct)\rightarrow 0\) as \(x-ct\rightarrow \pm \infty \). In terms of an Eulerian velocity potential \(\phi \), the governing equations for the hydrodynamic problem in dimensionless coordinates and in a coordinate system moving with the wave are

$$\begin{aligned}&\phi _{xx}+\phi _{y y} = 0,{} & {} 0<y<1+\eta \end{aligned}$$
(1.1)

with boundary conditions

$$\begin{aligned}&\phi _{y}\big |_{y=0} = 0, \end{aligned}$$
(1.2)
$$\begin{aligned}&\phi _{y}-\eta _x \phi _x + \eta _x\big |_{y=1+\eta } = 0, \end{aligned}$$
(1.3)
$$\begin{aligned}&\begin{aligned}&-\phi _x+\tfrac{1}{2}(\phi _x^2+\phi _{y}^2)+\alpha \eta \\&\quad \;+\beta \Bigg (\frac{1}{(1+\eta _x^2)^{1/2}}\bigg [\frac{1}{(1+\eta _x^2)^{1/2}}\bigg (\frac{\eta _{xx}}{(1+\eta _x^2)^{3/2}}\bigg )_x\bigg ]_x+\frac{1}{2}\bigg (\frac{\eta _{xx}}{(1+\eta _x^2)^{3/2}}\bigg )^3\Bigg )\Bigg |_{y=1+\eta } = 0 \end{aligned} \end{aligned}$$
(1.4)

and asymptotic conditions \(\eta \rightarrow 0\), \((\phi _x,\phi _{y})\rightarrow (0,0)\) as \(x\rightarrow \pm \infty \) (see Guyenne and Parau [16]). The dimensionless parameters \(\alpha \) and \(\beta \) are given by

$$\begin{aligned} \alpha = \frac{g h}{c^2},\qquad \beta = \frac{D}{\rho h^3c^2},\end{aligned}$$

where \(D\) is the coefficient of flexural rigidity of the ice sheet, g is the acceleration due to gravity, c is the wave speed and \(\rho \) is the constant water density.

This formulation is unfavourable because of the variable fluid domain. It is, therefore, convenient to introduce the change of variable

$$\begin{aligned} \tilde{y} = \frac{y}{1+\eta (x)},\qquad \Phi (x,\tilde{y}) = \phi (x,y), \end{aligned}$$
(1.5)

which maps the variable fluid domain \(\{(x,y): x\in \mathbb {R},\,y\in (0,1+\eta (x))\}\) to the fixed strip \(\mathbb {R}\times (0,1)\). Dropping the tildes for notational simplicity, one obtains the transformed equations

$$\begin{aligned}&\Phi _{xx}+\Phi _{yy}\frac{1+y^2\eta _x^2}{(1+\eta )^2}-2\Phi _{x y}\frac{y \eta _x}{1+\eta } - \Phi _{y}\frac{y \eta _{xx}}{1+\eta }+2\Phi _{y}\frac{y \eta _x^2}{(1+\eta )^2} = 0, \qquad 0< y<1, \end{aligned}$$
(1.6)
$$\begin{aligned}&\Phi _{y}\big |_{y=0} = 0, \end{aligned}$$
(1.7)
$$\begin{aligned}&\frac{\Phi _{y}}{1+\eta } +\eta _x-\eta _x\Big (\Phi _x-\Phi _{y}\frac{y \eta _x}{1+\eta }\Big )\Big |_{y=1}= 0, \end{aligned}$$
(1.8)
$$\begin{aligned}&-\Big (\Phi _x-\Phi _{y}\frac{y\eta _x}{1+\eta }\Big )+\frac{1}{2}\Big (\Big (\Phi _x-\Phi _{y}\frac{y \eta _x}{1+\eta }\Big )^2+\Big (\frac{\Phi _{y}}{1+\eta }\Big )^2\Big ) + \alpha \eta \nonumber \\&\quad +\beta \Bigg (\frac{1}{(1+\eta _x^2)^{1/2}}\bigg [\frac{1}{(1+\eta _x^2)^{1/2}}\bigg (\frac{\eta _{xx}}{(1+\eta _x^2)^{3/2}}\bigg )_x\bigg ]_x+\frac{1}{2}\bigg (\frac{\eta _{xx}}{(1+\eta _x^2)^{3/2}}\bigg )^3\Bigg )\Bigg |_{y=1} = 0 \end{aligned}$$
(1.9)

with asymptotic conditions \(\eta \rightarrow 0\), \((\Phi _x,\Phi _{y})\rightarrow (0,0)\) as \(x\rightarrow \pm \infty \).
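The coefficients in the transformed field equation (1.6) can be verified against the chain rule for the flattening transformation (1.5). The following sketch does this numerically; the particular test pair \(\Phi\), \(\eta\) and the evaluation point are chosen purely for illustration.

```python
import math

# Illustrative test pair (any smooth choice works); these are assumptions
# made for this check only, not quantities from the paper.
def eta(x): return 0.1 * math.sin(x)
def Phi(x, t): return math.exp(0.3 * x) * math.sin(2.0 * t)

# phi(x, y) = Phi(x, y / (1 + eta(x))) inverts the flattening (1.5)
def phi(x, y): return Phi(x, y / (1.0 + eta(x)))

def d1(f, u, h=1e-5): return (f(u + h) - f(u - h)) / (2.0 * h)
def d2(f, u, h=1e-4): return (f(u + h) - 2.0 * f(u) + f(u - h)) / h**2

x0, y0 = 0.7, 0.4                       # arbitrary interior point
e, ex, exx = eta(x0), d1(eta, x0), d2(eta, x0)
t0 = y0 / (1.0 + e)                     # flattened vertical coordinate

# partial derivatives of Phi at (x0, t0), arguments treated as independent
Pxx = d2(lambda x: Phi(x, t0), x0)
Ptt = d2(lambda t: Phi(x0, t), t0)
Pt = d1(lambda t: Phi(x0, t), t0)
h = 1e-5
Pxt = (Phi(x0 + h, t0 + h) - Phi(x0 + h, t0 - h)
       - Phi(x0 - h, t0 + h) + Phi(x0 - h, t0 - h)) / (4 * h * h)

# left-hand side of (1.6) in the flattened variables ...
lhs = (Pxx + Ptt * (1 + t0**2 * ex**2) / (1 + e)**2
       - 2 * Pxt * t0 * ex / (1 + e)
       - Pt * t0 * exx / (1 + e)
       + 2 * Pt * t0 * ex**2 / (1 + e)**2)

# ... should equal the physical Laplacian of phi at (x0, y0)
lap = d2(lambda x: phi(x, y0), x0) + d2(lambda y: phi(x0, y), y0)
print(abs(lhs - lap))  # small (finite-difference error only)
```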

Fig. 2: The linear dispersion relation for a fixed \(\beta _0\)

Let us briefly review the (formal) classical weakly nonlinear theory as it applies to this problem. Figure 2 shows the linear dispersion relation

$$\begin{aligned} \alpha +\beta s^4=s\coth (s)\end{aligned}$$

for a sinusoidal wave train with wave number \(s\). For each fixed value \(\beta _0\) of \(\beta \), the dispersion curve has a unique minimum at \((s,\alpha ^{-1}) = (s_\textrm{min},\alpha _0^{-1})\); the relationship between \(\beta _0\), \(\alpha _0\) and \(s=s_\textrm{min}\) can be expressed in the form

$$\begin{aligned} \beta _0(s)=\frac{1}{4s^3}\coth (s)-\frac{1}{4s^2}{{\,\textrm{cosech}\,}}^2(s), \qquad \alpha _0(s) = \frac{3s}{4}\coth (s)+\frac{s^2}{4}{{\,\textrm{cosech}\,}}^2(s), \end{aligned}$$

which defines a curve \(C\) in the \((\beta ,\alpha )\)-plane parametrised by \(s\in (0,\infty )\). Setting \(\alpha = \alpha _0+\delta ^2\), \(\beta =\beta _0\), and substituting the modulational Ansatz

$$\begin{aligned} \eta (x)= & {} \delta \big (A_1(\delta x)\textrm{e}^{\textrm{i}sx}+\bar{A}_1(\delta x)\textrm{e}^{-\textrm{i}sx}\big ) \\{} & {} \quad +\,\delta ^2\big (A_2(\delta x)\textrm{e}^{2\textrm{i}sx}+\bar{A}_2(\delta x)\textrm{e}^{-2\textrm{i}sx} + A_0(\delta x)\big ) + \cdots \end{aligned}$$

into Eqs. (1.6)–(1.9), one finds that to leading order \(A_1\) satisfies the nonlinear Schrödinger equation

$$\begin{aligned} A_1-b_1A_{1XX}-b_2|A_1|^2A_1 = 0, \end{aligned}$$
(1.10)

where \(X=\delta x\) (see Appendix A for details of the derivation and formulae for the coefficients \(b_1\) and \(b_2\)). One finds that \(b_1\) is positive for all values of \(s\), and there exists a critical value \(s^\star \) (numerically \(s^\star \approx 177.33\)) such that \(b_2>0\) for \(s<s^\star \) and \(b_2<0\) for \(s>s^\star \).
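The parametrisation of \(C\) is easy to check numerically: \((s,\alpha_0(s))\) should lie on the dispersion curve for \(\beta=\beta_0(s)\), and \(\alpha^{-1}\) should be minimised there. A minimal sketch (the sample value \(s=1\) is chosen for illustration only):

```python
import math

def coth(s): return math.cosh(s) / math.sinh(s)
def csch(s): return 1.0 / math.sinh(s)

# parametrisation of the curve C
def beta0(s): return coth(s) / (4 * s**3) - csch(s)**2 / (4 * s**2)
def alpha0(s): return 0.75 * s * coth(s) + 0.25 * s**2 * csch(s)**2

s = 1.0                                  # sample parameter value (illustrative)
b0, a0 = beta0(s), alpha0(s)

# alpha as a function of the wave number along the dispersion relation,
# with beta frozen at beta0(s)
def alpha(sig): return sig * coth(sig) - b0 * sig**4

print(abs(alpha(s) - a0))                                  # ~ 0: (s, alpha0) lies on the curve
print(1 / alpha(s - 0.1) > 1 / a0, 1 / alpha(s + 0.1) > 1 / a0)  # True True: alpha^{-1} minimal
```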

Suppose that \(b_2 > 0\), that is, choose \(s\) sufficiently small, or equivalently \(\beta _0\) sufficiently large (corresponding to sufficiently shallow water in physical variables). Equation (1.10) admits the family

$$\begin{aligned} A_1(X) = \left( \frac{2}{b_2}\right) ^{1/2}{{\,\textrm{sech}\,}}\left( \frac{X}{b_1^{1/2}}\right) \textrm{e}^{\textrm{i}\theta }, \qquad \theta \in [0,2\pi ) \end{aligned}$$

of homoclinic solutions (solutions which decay to zero as \(X\rightarrow \pm \infty \)), which correspond to the solitary waves

$$\begin{aligned} \eta (x) = 2\delta \left( \frac{2}{b_2}\right) ^{1/2}{{\,\textrm{sech}\,}}\left( \frac{\delta x}{b_1^{1/2}}\right) \cos (sx+\theta )+O(\delta ^2). \end{aligned}$$

These waves take the form of periodic wave trains modulated by exponentially decaying envelopes; the wave with \(\theta = 0\) is a symmetric wave of elevation, while the wave with \(\theta =\pi \) is a symmetric wave of depression (see Fig. 3). In this article, we confirm the predictions of the weakly nonlinear theory and prove the following theorem.

Theorem 1.1

Choose \(s\in (0,s^\star )\) and let \((\beta _0,\alpha _0)\) denote the point on the curve \(C\) with this parameter value. For each sufficiently small value of \(\delta >0\) and \(\nu \in (0,1)\), the hydroelastic problem (1.1)–(1.4) with \(\beta =\beta _0\) and \(\alpha = \alpha _0 + \delta ^2\) admits two geometrically distinct, symmetric solitary-wave solutions \((\eta ^\pm ,\phi ^\pm )\) which satisfy the estimate

$$\begin{aligned} \eta ^\pm (x) = \pm 2\delta \left( \frac{2}{b_2}\right) ^{1/2}{{\,\textrm{sech}\,}}\left( \frac{\delta x}{b_1^{1/2}}\right) \cos (sx) + O(\delta ^2\textrm{e}^{-\nu b_1^{-1/2}\delta |x|}) \end{aligned}$$

uniformly over \(x \in {\mathbb R}\).
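The leading-order term in this estimate is precisely the sech envelope solving (1.10), which can be checked by direct computation. In the sketch below the values of \(b_1\) and \(b_2\) are placeholders (the true coefficients are given in Appendix A and depend on \(s\)):

```python
import math

b1, b2 = 0.8, 1.5            # placeholder positive values, for illustration only

def A1(X):                   # leading-order envelope with theta = 0
    return math.sqrt(2.0 / b2) / math.cosh(X / math.sqrt(b1))

def residual(X, h=1e-4):     # A1 - b1*A1'' - b2*|A1|^2*A1 at X, cf. (1.10)
    A1pp = (A1(X + h) - 2 * A1(X) + A1(X - h)) / h**2
    return A1(X) - b1 * A1pp - b2 * A1(X)**3

# the residual vanishes identically; only finite-difference error remains
print(max(abs(residual(X)) for X in [-2.0, -0.5, 0.0, 1.0, 3.0]))
```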

Fig. 3: Symmetric envelope solitary waves (with scaled amplitudes and wavelengths)

1.2 Spatial Dynamics and the Kirchgässner Reduction

We prove Theorem 1.1 using the Kirchgässner reduction: the hydrodynamic problem is formulated as a spatial Hamiltonian system and reduced to a locally equivalent Hamiltonian system with finitely many degrees of freedom; homoclinic solutions of the reduced system correspond to solitary waves. The method was introduced by Kirchgässner [21] and has been used for many problems in fluid mechanics, in particular for water waves (see Dias and Iooss [6] for a review), and more recently for water waves with vorticity (Groves and Wahlén [14, 15], Kozlov et al. [22], Kozlov and Lokharu [23]) and ferrofluids (Groves et al. [11], Groves and Nilsson [12]). In this paper, we review the method as it applies to hydroelastic solitary waves, presenting various refinements and new features.

Our starting point in Sect. 2 is the observation that Eqs. (1.1)–(1.4) follow from the formal variational principle

$$\begin{aligned} \delta \int _{\mathbb {R}}\bigg \{\int _0^{1+\eta (x)}\Big (-\phi _x+\tfrac{1}{2}(\phi _x^2+\phi _{y}^2)\Big )\;\textrm{d}y + \tfrac{1}{2}\alpha \eta ^2+\tfrac{1}{2}\beta \frac{\eta _{xx}^2}{(1+\eta _x^2)^{5/2}}\bigg \}\, \textrm{d}x= 0, \end{aligned}$$
(1.11)

in which the variations are taken over \(\eta \) and \(\phi \) (a modified version of the classical variational principle introduced by Luke [25]); this observation is confirmed by the calculation

$$\begin{aligned} \delta&\int _{\mathbb {R}}\bigg \{\int _0^{1+\eta (x)}\Big (-\phi _x+\tfrac{1}{2}(\phi _x^2+\phi _{y}^2)\Big )\;\textrm{d}y + \tfrac{1}{2}\alpha \eta ^2+\tfrac{1}{2}\beta \frac{\eta _{xx}^2}{(1+\eta _x^2)^{5/2}}\bigg \}\, \textrm{d}x\\&\;=\begin{aligned}&\int _{\mathbb {R}}\bigg \{-\int _0^{1+\eta (x)}(\phi _{xx}+\phi _{y y})\dot{\phi }\;\textrm{d}y+\big ((-\eta _x\phi _x+\phi _{y}+\eta _x)\dot{\phi }\big )\big |_{y=1+\eta }-(\phi _{y}\dot{\phi })\big |_{y=0} \\&\begin{aligned}&\qquad + \bigg (\big (-\phi _x+\tfrac{1}{2}(\phi _x^2+\phi _{y}^2)\big )\big |_{y=1+\eta }+\alpha \eta +\tfrac{1}{2}\beta \bigg (\frac{\eta _{xx}}{(1+\eta _x^2)^{3/2}}\bigg )^3 \\&\qquad +\beta \frac{1}{(1+\eta _x^2)^{1/2}}\bigg [\frac{1}{(1+\eta _x^2)^{1/2}}\bigg (\frac{\eta _{xx}}{(1+\eta _x^2)^{3/2}}\bigg )_x\bigg ]_x\bigg )\dot{\eta }\bigg \}\, \textrm{d}x, \end{aligned} \end{aligned} \end{aligned}$$

where the formal first variations of \(\eta \) and \(\phi \) are denoted by \(\dot{\eta }\) and \(\dot{\phi }\), respectively, and we have used integration by parts and Green’s integral formula.

We proceed using the change of variable (1.5), which transforms (1.11) into the new variational principle

$$\begin{aligned} \delta \int L(\eta ,\eta _x,\eta _{xx},\Phi ,\Phi _x)\, \textrm{d}x=0 \end{aligned}$$

with Lagrangian

$$\begin{aligned}&L(\eta ,\eta _x,\eta _{xx},\Phi ,\Phi _x) \\&\quad := \int _0^1\!\!\bigg (\!\!\!-\!\!\bigg [\Phi _x\!-\! \Phi _y \frac{y\eta _x}{1+\eta }\bigg ]\!+\!\frac{1}{2}\bigg [\Phi _x\!-\! \Phi _y \frac{y\eta _x}{1+\eta }\bigg ]^2\!\!\!+\!\frac{1}{2}\frac{\Phi _y^2}{(1+\eta )^2}\bigg )\!(1+\eta )\, \textrm{d}y\!\\&\qquad \qquad \quad +\! \tfrac{1}{2} \alpha \eta ^2\!+\!\tfrac{1}{2}\beta \frac{\eta _{xx}^2}{(1+\eta _x^2)^{5/2}}; \end{aligned}$$

this variational principle recovers the transformed equations (1.6)–(1.9). The next step is to perform a (formal) Legendre transform to obtain a formulation of the hydrodynamic problem as a spatial Hamiltonian system (in which the variable x plays the role of ‘time’). The presence of second-order derivatives in the Lagrangian, however, necessitates the use of a higher-order Legendre transform (see Lanczos [24, Appendix I]), by means of which one obtains the Hamiltonian system

$$\begin{aligned} \eta _x = \frac{\delta H}{\delta \omega }, \quad \rho _x = \frac{\delta H}{\delta \xi }, \quad \omega _x = -\frac{\delta H}{\delta \eta }, \quad \xi _x = -\frac{\delta H}{\delta \rho }, \quad \Phi _x = \frac{\delta H}{\delta \Psi }, \quad \Psi _x = -\frac{\delta H}{\delta \Phi } \end{aligned}$$
(1.12)

with variables \(\eta \), \(\Phi \) and

$$\begin{aligned} \rho = \eta _x, \qquad \omega = \frac{\delta L}{\delta \eta _x} - \frac{\textrm{d}}{\textrm{d}x}\bigg (\frac{\delta L}{\delta \eta _{xx}}\bigg ), \qquad \xi = \frac{\delta L}{\delta \eta _{xx}}, \qquad \Psi = \frac{\delta L}{\delta \Phi _x}; \end{aligned}$$

these equations are accompanied by the boundary conditions

$$\begin{aligned} -\Phi _y+y\rho \Psi \big |_{y=0,1} = 0, \end{aligned}$$
(1.13)

which emerge when computing the variational derivatives.

Equations (1.12), (1.13) are reversible, that is invariant under the transformation \((\eta , \omega , \rho , \xi , \Phi , \Psi )(x) \mapsto (\eta , -\omega , -\rho , \xi , -\Phi , \Psi )(-x)\); this symmetry is inherited from (1.6)–(1.9), which are invariant under \((\eta (x),\Phi (x,y)) \mapsto (\eta (-x),\Phi (-x,y))\). They are also invariant under the transformation \(\Phi \mapsto \Phi + c\) for any constant c. To eliminate this symmetry, one replaces \((\Phi ,\Psi )\) with new variables \((\bar{\Phi },\Phi _0,\bar{\Psi },\Psi _0)\), where \(\bar{\Phi }=\Phi -\Phi _0\), \(\bar{\Psi }=\Psi -\Psi _0\) and \( \Phi _0 = \int _0^1 \Phi \, \textrm{d}y\), \(\Psi _0 = \int _0^1 \Psi \, \textrm{d}y\), thus obtaining a new canonical Hamiltonian system with Hamiltonian

$$\begin{aligned} \bar{H}(\eta , \omega , \rho , \xi , \bar{\Phi },\bar{\Psi },\Phi _0,\Psi _0)= & {} H(\eta , \omega , \rho , \xi , \bar{\Phi }+\Phi _0, \bar{\Psi }+\Psi _0)\\ {}= & {} H(\eta , \omega , \rho , \xi , \bar{\Phi }, \bar{\Psi }+\Psi _0) \end{aligned}$$

and additional constraints \(\int _0^1 \bar{\Phi }\, \textrm{d}y=0\), \(\int _0^1 \bar{\Psi }\, \textrm{d}y=0\). The variable \(\Phi _0\) is cyclic, so that its conjugate \(\Psi _0\) is a conserved quantity; we proceed in standard fashion by setting \(\Psi _0=-1\), considering the equations for \((\eta , \omega , \rho , \xi , \bar{\Phi },\bar{\Psi })\) and recovering \(\Phi _0\) by quadrature. The nonlinear boundary condition

$$\begin{aligned} -\bar{\Phi }_y+y\rho (\bar{\Psi }-1)\big |_{y=1} = 0 \end{aligned}$$

necessitates a further change of variable, namely

$$\begin{aligned} \bar{\Gamma }= \bar{\Phi } - \rho \int _0^ys(\bar{\Psi }(s)-1)\, \textrm{d}s+\rho \int _0^1\int _0^ys(\bar{\Psi }(s)-1)\, \textrm{d}s\, \textrm{d}y, \end{aligned}$$

in terms of which the boundary conditions take the simple, linear form \(\bar{\Gamma }_y\big |_{y=0,1}=0\).

The formulation of the hydrodynamic problem as a spatial Hamiltonian system is discussed rigorously in Sect. 2, where a precise definition of a Hamiltonian system is given and Hamilton’s equations are derived. Full details of the changes of variable, which are performed explicitly, are also given; the result is a quasilinear evolution equation of the form

$$\begin{aligned} u_x=Lu + N^\varepsilon (u) \end{aligned}$$
(1.14)

for the variable \(u=(\eta ,\rho ,\omega ,\xi ,\bar{\Gamma },\bar{\Psi })\) in the phase space

$$\begin{aligned} X =\{(\eta , \rho , \omega , \xi , \bar{\Gamma }, \bar{\Psi }) \in {\mathbb R} \times {\mathbb R} \times {\mathbb R} \times {\mathbb R} \times \bar{H}^1(0,1) \times \bar{L}^2(0,1)\}, \end{aligned}$$

where the overline denotes the subspace of functions with zero mean value; the domain of the linear operator L is

$$\begin{aligned} \mathcal {D}(L) = \big \{(\eta , \rho , \omega , \xi , \bar{\Gamma }, \bar{\Psi }) \in \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times \bar{H}^2(0,1) \times \bar{H}^1(0,1): \bar{\Gamma }_y\big |_{y=0,1} = 0 \big \} \end{aligned}$$

and the nonlinear term on the right-hand side of (1.14), which satisfies \(N^\varepsilon (u)=O(\Vert (\varepsilon ,u)\Vert \Vert u\Vert )\), maps a neighbourhood of the origin in \({\mathbb R}^2 \times {\mathcal D}(L)\) analytically into X. Here we have written \(\alpha = \alpha _0 + \varepsilon _1\) and \(\beta = \beta _0+\varepsilon _2\), where \(\alpha _0\) and \(\beta _0\) are fixed, and the superscript \(\varepsilon \) denotes the dependence upon this parameter.

In Sect. 3 we show that the spectrum of L is discrete. By reducing the spectral problem to a non-self-adjoint Sturm–Liouville problem, we show that a complex number \(\lambda \) is an eigenvalue of \(L\) if and only if

$$\begin{aligned} \alpha _0+\lambda ^4\beta _0 = \lambda \cot (\lambda ) \end{aligned}$$
(1.15)

and deduce that \(\sigma (L)\) consists of

  1. (a)

    a countably infinite family \(\{\lambda _k\}_{k \in {\mathbb Z}{\setminus }\{0\}}\) of simple real eigenvalues, where \(\{\lambda _k\}_{k=1}^\infty \) are the positive real solutions of equation (1.15), so that \(\lambda _k \in (k\pi , (k+1)\pi )\) for \(k=1,2,\ldots \) and

    $$\begin{aligned} \lambda _k^2 = k^2\pi ^2+\frac{2}{\beta _0\pi ^2k^2}+o\left( \frac{1}{k^2}\right) \end{aligned}$$

    for large k, and \(\lambda _{-k}=-\lambda _k\) for \(k=1,2,\ldots \),

  2. (b)

    four additional eigenvalues (counted according to multiplicity) which are shown in Fig. 4. Note in particular that a Hamiltonian–Hopf bifurcation occurs at each point \((\beta _0(s),\alpha _0(s))\) of the curve C: two pairs of purely imaginary eigenvalues become complex by colliding at the points \(\pm \textrm{i}s\) on the imaginary axis.
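The real eigenvalues are readily computed from (1.15): on each interval \((k\pi,(k+1)\pi)\) the function \(\lambda\cot\lambda\) decreases from \(+\infty\) to \(-\infty\), so a simple bisection suffices. A sketch in pure Python, at the sample point on \(C\) with \(s=1\) (illustrative only):

```python
import math

def coth(s): return math.cosh(s) / math.sinh(s)
def csch(s): return 1.0 / math.sinh(s)

s = 1.0                                            # sample point on C (illustrative)
beta0 = coth(s) / (4 * s**3) - csch(s)**2 / (4 * s**2)
alpha0 = 0.75 * s * coth(s) + 0.25 * s**2 * csch(s)**2

def f(lam):                                        # (1.15), rearranged to f(lam) = 0
    return lam / math.tan(lam) - alpha0 - beta0 * lam**4

def eigenvalue(k, tol=1e-12):
    # f -> +inf at the left endpoint of (k*pi, (k+1)*pi) and -inf at the
    # right endpoint, so the sign change can be located by bisection
    a, b = k * math.pi + 1e-9, (k + 1) * math.pi - 1e-9
    while b - a > tol:
        m = 0.5 * (a + b)
        a, b = (m, b) if f(m) > 0 else (a, m)
    return 0.5 * (a + b)

lams = [eigenvalue(k) for k in range(1, 6)]
print([round(l, 4) for l in lams])
print(all(k * math.pi < l < (k + 1) * math.pi for k, l in enumerate(lams, 1)))  # True
```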

Remarkably, we can treat (1.14) as a dynamical system with countably infinitely many coordinates by showing that L is a Riesz spectral operator, that is, its generalised eigenvectors form a Riesz basis for X (a Schauder basis obtained by an isomorphism from an orthonormal basis). In particular, at a point \((\beta _0(s),\alpha _0(s))\) of the curve C (a ‘Hamiltonian–Hopf point’) we can write

$$\begin{aligned} X = \bigg \{ u=A e + B f + \bar{A} \bar{e} + \bar{B} \bar{f} + \sum _{k \in {\mathbb Z} \setminus \{0\}} \beta _k e_{\lambda _k}: A, B \in {\mathbb C},\ \{\beta _k\} \in \ell ^2\bigg \}, \end{aligned}$$

where ef and \(e_{\lambda _k}\) are suitably normalised generalised eigenvectors with \((L-\textrm{i}sI)e=0\), \((L-\textrm{i}sI)f=e\) and \((L-\lambda _kI)e_{\lambda _k} = 0\). In the above notation,

$$\begin{aligned} Lu=(\textrm{i}s A +B) e + \textrm{i}s B f + (-\textrm{i}s \bar{A} + \bar{B}) \bar{e} -\textrm{i}s \bar{B} \bar{f} + \sum _{k \in {\mathbb Z} \setminus \{0\}} \lambda _k\beta _k e_{\lambda _k} \end{aligned}$$

and \(u \in {\mathcal D}(L)\) whenever \(\{\lambda _k \beta _k\} \in \ell ^2\).

Fig. 4: The shaded region indicates the parameter regime in which homoclinic bifurcation is detected; dots and crosses denote, respectively, simple and algebraically double, geometrically simple eigenvalues

Homoclinic solutions of (1.12) are of particular interest since they correspond to solitary waves. We detect them using centre-manifold reduction (see Mielke [28, 29] for the version of the reduction theorem used here). Denoting the central and hyperbolic subspaces of X at a Hamiltonian–Hopf point by

$$\begin{aligned}{} & {} X_1=\{u_1=A e + B f + \bar{A} \bar{e} + \bar{B} \bar{f}: A, B \in {\mathbb C}\},\\{} & {} X_2=\bigg \{ u_2=\sum _{k \in {\mathbb Z} \setminus \{0\}} \beta _k e_{\lambda _k}: \{\beta _k\} \in \ell ^2\bigg \}, \end{aligned}$$

one finds that all small, globally bounded solutions to (1.14) lie on a centre manifold of the form \(\{u_2= r(u_1;\varepsilon )\}\), where the reduction function \(r: X_1 \rightarrow {\mathcal D}(L)\) satisfies \(r(u_1;\varepsilon )=O(\Vert (\varepsilon ,u_1)\Vert \Vert u_1\Vert )\). The flow on the centre manifold is governed by the reduced system

$$\begin{aligned} u_{1x} = Lu_1+N^\varepsilon (u_1+r(u_1;\varepsilon )), \end{aligned}$$
(1.16)

which is itself a reversible Hamiltonian system (with two degrees of freedom). One of the key requirements in Mielke’s theorem is that the operator \(L_2 = L|_{X_2}\) has \(L^p\)-maximal regularity in the sense that the differential equation

$$\begin{aligned} \partial _x u_2 = L_2 u_2 + h \end{aligned}$$

admits a unique solution \(u_2 \in W^{1,p}({\mathbb R},X_2) \cap L^p({\mathbb R},{\mathcal D}(L_2))\) for each \(h \in L^p({\mathbb R},X_2)\) and \(p>1\). In fact \(L^p\)-maximal regularity for some \(p>1\) implies \(L^p\)-maximal regularity for all \(p>1\) (see Mielke [27]), and an operator has \(L^2\)-maximal regularity if and only if it is bisectorial (see Arendt and Duelli [1, Theorem 2.4]); the theorem is usually stated with bisectoriality as a hypothesis. (Mielke’s theorem actually requires maximal regularity in exponentially weighted spaces, a property which is implied by \(L^p\)-maximal regularity; see Mielke [27, Lemma 2.3]). In Sect. 4, however, we demonstrate directly that a Riesz spectral operator with no imaginary eigenvalues has \(L^2\)-maximal regularity, and stipulate \(L^2\)-maximal regularity as a hypothesis in Mielke’s theorem. This approach is more direct than that taken in the above references to the Kirchgässner reduction, in which central and hyperbolic subspaces of a suitable phase space are defined by Dunford integrals and the bisectoriality condition is verified by a priori estimates.

Writing \((\varepsilon _1,\varepsilon _2)=(\mu ,0)\), so that positive values of \(\mu \) correspond to points on the ‘complex’ side of \(C\) (the shaded region in Fig. 4), one finds after a Darboux and normal-form transformation that the reduced equation (1.16) can be formulated as the Hamiltonian system

$$\begin{aligned}{} & {} A_x = \frac{\partial \tilde{H}^\mu }{\partial \bar{B}}, \qquad B_x = -\frac{\partial \tilde{H}^\mu }{\partial \bar{A}}, \end{aligned}$$
(1.17)
$$\begin{aligned} \tilde{H}^\mu (A,B,\bar{A},\bar{B})= & {} \textrm{i}s(A\bar{B}- \bar{A}B) + |B|^2+\tilde{H}_{\textrm{NF}}^\mu (|A|^2,\textrm{i}(A\bar{B}-\bar{A} B)) \\{} & {} \quad + O(|(A,B)|^2|(\mu ,A,B)|^{n_0}), \end{aligned}$$

where \(\tilde{H}_{\textrm{NF}}^\mu \) is a real polynomial function of its arguments which satisfies

$$\begin{aligned} \tilde{H}_{\textrm{NF}}^\mu (|A|^2,\textrm{i}(A\bar{B}-\bar{A} B)) = O(|(A,B)|^2|(\mu ,A,B)|); \end{aligned}$$

it contains the terms of order \(3,..., n_0+1\) in the Taylor expansion of \(\tilde{H}^\mu (A,B,\bar{A},\bar{B})\). Equations (1.17) inherit the reversibility of (1.12): they are invariant under the transformation \((A,B)(x) \mapsto (\bar{A},-\bar{B})(-x)\). Neglecting the remainder term in the Hamiltonian and introducing the scaled variables

$$\begin{aligned} A(x)=\delta \textrm{e}^{\textrm{i}s x}\tilde{A}(X), \qquad B(x) = \delta ^2 \textrm{e}^{\textrm{i}s x} \tilde{B}(X), \qquad X=\delta x, \end{aligned}$$

where \(\delta =\mu ^{1/2}\), confirms that at leading order the system is equivalent to the nonlinear Schrödinger equation

$$\begin{aligned} \tilde{A}_{XX} = -c_1 \tilde{A}-d_1\tilde{A}|\tilde{A}|^2, \end{aligned}$$

where \(c_1\) and \(d_1\) are the coefficients of \(\mu |A|^2\) and \(|A|^4\), respectively, in the Taylor expansion of \(\tilde{H}_{\textrm{NF}}^\mu \). We compute these coefficients explicitly in Appendix B and find that

$$\begin{aligned} c_1 = -\frac{1}{b_1}, \qquad d_1 = \frac{\sinh ^2(s)}{2\tau _1}\,\frac{b_2}{b_1}, \end{aligned}$$

where \(b_1\), \(b_2\) are the coefficients in Eq. (1.10) and \(\tau _1>0\) is defined in Eq. (4.4).

A rigorous analysis of (1.17) is given in Sect. 5. Returning to real coordinates q, \(p \in {\mathbb R}^2\) given by \(A=\frac{1}{\sqrt{2}}(q_1+\textrm{i}q_2)\), \(B=\frac{1}{\sqrt{2}}(p_1+\textrm{i}p_2)\), eliminating p and introducing the scaled variables

$$\begin{aligned} q(x)=\delta R_{s x} Q(X), \qquad X=\delta x, \end{aligned}$$

where \(\delta ^2=-c_1\mu \) and \(R_\theta \) is the matrix representing a rotation through the angle \(\theta \), transforms (1.17) into

$$\begin{aligned} Q_{XX} = Q-CQ|Q|^2+T_1^\delta (Q,Q_X)+R_{-s X/\delta }T_2^\delta (R_{s X/\delta }Q,R_{s X/\delta }Q_X,R_{s X/\delta }Q_{XX}), \end{aligned}$$
(1.18)

where \(C=-d_1/c_1\) and

$$\begin{aligned} T_1^\delta (Q,Q_X) = O(\delta |(Q,Q_X)|), \qquad T_2^\delta (Q,Q_X,Q_{XX})=O( \delta ^{n_0-2}|(Q,Q_X,Q_{XX})|). \end{aligned}$$

Equation (1.18) is invariant under the transformation \(X \mapsto -X\), \((Q_1(X),Q_2(X)) \mapsto (Q_1(-X),-Q_2(-X))\) and in the limit \(\delta = 0\) has the explicit solution

$$\begin{aligned} Q(X)=\begin{pmatrix} h(X) \\ 0 \end{pmatrix}, \qquad h(X)=\Big (\frac{2}{C}\Big )^{1/2}\textrm{sech}(X), \end{aligned}$$

which is nondegenerate in the class of symmetric functions (see Sect. 5 for a precise statement of this result). This fact allows one to prove the following theorem with an implicit-function theorem argument.

Theorem 1.2

For each \(\nu \in (0,1)\) and each sufficiently small value of \(\delta >0\), Eq. (1.18) has two homoclinic solutions \(Q^{\delta \pm }\) which are symmetric, that is, invariant under the transformation \((Q_1(X),Q_2(X)) \mapsto (Q_1(-X),-Q_2(-X))\), and satisfy the estimate

$$\begin{aligned} Q^{\delta \pm }(X) = \pm \begin{pmatrix} h(X) \\ 0 \end{pmatrix} + O(\delta \textrm{e}^{-\nu |X|}) \end{aligned}$$

for all \(X \in {\mathbb R}\).

Finally, let us briefly mention some related work in the literature. Buffoni and Groves [4] show that (1.17) has an infinite number of geometrically distinct homoclinic solutions which generically resemble multiple copies of one of the ‘primary’ homoclinic solutions found here. In the present context, this result yields the existence of an infinite family of ‘multi-pulse’ hydroelastic solitary waves. A variational existence theory for hydroelastic solitary waves in the present parameter regime has been given by Groves et al. [10], while the Kirchgässner reduction (without the Hamiltonian framework) has also been applied to alternative models in which the ice sheet is modelled as a thin Euler–Bernoulli elastic plate (Parau and Dias [32]) and a Kirchhoff–Love elastic plate with non-zero thickness and inertial effects (Ilichev [17], Ilichev and Tomashpolskii [18]). There are also several numerical studies of hydroelastic solitary waves in deep water (Gao et al. [8], Guyenne and Parau [16], Milewski et al. [30]), and an alternative approach to centre-manifold reduction has been given by Chen et al. [5].

2 Formulation as a Spatial Hamiltonian System

In this section, we formulate the hydrodynamic problem as a spatial Hamiltonian system. Starting with a variational principle for the ‘flattened’ hydrodynamic problem (1.6)–(1.9), we perform a formal Legendre transform to detect its spatial Hamiltonian structure, the correctness of which is confirmed a posteriori.

The ‘flattened’ hydrodynamic problem follows from the variational principle

$$\begin{aligned} \delta \int L(\eta ,\eta _x,\eta _{xx},\Phi ,\Phi _x)\, \textrm{d}x=0 \end{aligned}$$

with Lagrangian

$$\begin{aligned}&L(\eta ,\eta _x,\eta _{xx},\Phi ,\Phi _x) \\&\quad =\int _0^1\!\!\bigg (\!\!\!-\!\!\bigg [\Phi _x\!-\!\Phi _y \frac{y\eta _x}{1+\eta }\bigg ]\!+\!\frac{1}{2}\bigg [\Phi _x\!-\! \Phi _y \frac{y\eta _x}{1+\eta }\bigg ]^2\!\!\!+\!\frac{1}{2}\frac{\Phi _y^2}{(1+\eta )^2}\bigg )\!(1+\eta )\, \textrm{d}y\\&\qquad \qquad \quad + \tfrac{1}{2} \alpha \eta ^2\!+\!\tfrac{1}{2}\beta \frac{\eta _{xx}^2}{(1+\eta _x^2)^{5/2}}. \end{aligned}$$

We perform a formal Legendre transformation (see Lanczos [24, Appendix I]) by defining

$$\begin{aligned} \rho&= \eta _x, \\ \omega&= \frac{\delta L}{\delta \eta _x} - \frac{\textrm{d}}{\textrm{d}x}\bigg (\frac{\delta L}{\delta \eta _{xx}}\bigg ) \\&= \int _0^1 \bigg (y \Phi _y -\bigg [\Phi _x- \Phi _y \frac{y\eta _x}{1+\eta }\bigg ]y \Phi _y\ \bigg )\, \textrm{d}y+\tfrac{5}{2}\beta \frac{\eta _x\eta _{xx}^2}{(1+\eta _x^2)^{7/2}}-\beta \frac{\eta _{xxx}}{(1+\eta _x^2)^{5/2}}, \\ \xi&= \frac{\delta L}{\delta \eta _{xx}} = \beta \frac{\eta _{xx}}{(1+\eta _x^2)^{5/2}}, \\ \Psi&= \frac{\delta L}{\delta \Phi _x} = -(1+\eta )+\bigg (\Phi _x-\Phi _y \frac{y \eta _x}{1+\eta }\bigg )(1+\eta ) \end{aligned}$$

and defining the Hamiltonian function by

$$\begin{aligned} H(\eta ,\rho ,\omega ,\xi ,\Phi ,\Psi )&= \omega \eta _x+ \xi \eta _{xx} + \int _0^1\Psi \Phi _x\, \textrm{d}y-L(\eta ,\rho ,\omega ,\xi ,\Phi ,\Psi ) \\&= \begin{aligned}&\omega \rho - \tfrac{1}{2}\alpha \eta ^2+\frac{\xi ^2}{2\beta }(1+\rho ^2)^{5/2}+\tfrac{1}{2}(1+\eta ) \\&\quad +\int _0^1\left( \frac{1}{2(1+\eta )}(\Psi ^2-\Phi _y^2) +\Psi + \frac{\rho y \Phi _y \Psi }{1+\eta }\right) \, \textrm{d}y. \end{aligned} \end{aligned}$$

Writing \(\alpha = \alpha _0 + \varepsilon _1\) and \(\beta = \beta _0+\varepsilon _2\), where \(\alpha _0\) and \(\beta _0\) are fixed, we find that Hamilton’s equations are given explicitly by

$$\begin{aligned} \eta _x&= \frac{\delta H^\varepsilon }{\delta \omega } = \rho , \end{aligned}$$
(2.1)
$$\begin{aligned} \rho _x&= \frac{\delta H^\varepsilon }{\delta \xi } =\frac{(1+\rho ^2)^{5/2}}{\beta _0+\varepsilon _2} \xi , \end{aligned}$$
(2.2)
$$\begin{aligned} \omega _x&= -\frac{\delta H^\varepsilon }{\delta \eta } = \frac{1}{(1+\eta )^2}\int _0^1\left( \tfrac{1}{2}(\Psi ^2-\Phi _y^2)+\rho y \Phi _y\Psi \right) \, \textrm{d}y+(\alpha _0+\varepsilon _1)\eta -\tfrac{1}{2}, \end{aligned}$$
(2.3)
$$\begin{aligned} \xi _x&= -\frac{\delta H^\varepsilon }{\delta \rho } = -\omega -\frac{5}{2}\frac{\rho }{\beta _0+\varepsilon _2} \xi ^2 (1+\rho ^2)^{3/2}-\frac{1}{1+\eta }\int _0^1y \Phi _y\Psi \, \textrm{d}y, \end{aligned}$$
(2.4)
$$\begin{aligned} \Phi _x&= \frac{\delta H^\varepsilon }{\delta \Psi } = \frac{\Psi +1+\eta }{1+\eta }+\frac{\rho y \Phi _y}{1+\eta }, \end{aligned}$$
(2.5)
$$\begin{aligned} \Psi _x&= -\frac{\delta H^\varepsilon }{\delta \Phi } = \frac{1}{1+\eta }(-\Phi _y+\rho y \Psi )_y, \end{aligned}$$
(2.6)

where the superscript denotes the dependence upon \(\varepsilon = (\varepsilon _1,\varepsilon _2)\), with boundary conditions

$$\begin{aligned} -\Phi _y+y\rho \Psi \big |_{y=0,1} = 0, \end{aligned}$$

which emerge from the integration by parts used to compute (2.6). A straightforward calculation shows that the \(\eta \)- and \(\Phi \)-components of any solution to these equations satisfy (1.6)–(1.9).

Note that Eqs. (2.1)–(2.6) are reversible, that is invariant under the transformation \((\eta , \omega , \rho , \xi , \Phi , \Psi )(x) \mapsto S(\eta , \omega , \rho , \xi , \Phi , \Psi )(-x)\), where the reverser is defined by

$$\begin{aligned} S(\eta , \omega , \rho , \xi , \Phi , \Psi ) = (\eta , -\omega , -\rho , \xi , -\Phi , \Psi ).\end{aligned}$$

They are also invariant under the transformation \(\Phi \mapsto \Phi + c\) for any constant c. To eliminate this symmetry it is convenient to replace \((\Phi ,\Psi )\) with new variables \((\bar{\Phi },\Phi _0,\bar{\Psi },\Psi _0)\), where \(\bar{\Phi }=\Phi -\Phi _0\), \(\bar{\Psi }=\Psi -\Psi _0\) and

$$\begin{aligned} \Phi _0 = \int _0^1 \Phi \, \textrm{d}y, \qquad \Psi _0 = \int _0^1 \Psi \, \textrm{d}y. \end{aligned}$$

This transformation leads to a new canonical Hamiltonian system with Hamiltonian

$$\begin{aligned} \bar{H}(\eta , \omega , \rho , \xi , \bar{\Phi },\bar{\Psi },\Phi _0,\Psi _0)= & {} H(\eta , \omega , \rho , \xi , \bar{\Phi }+\Phi _0, \bar{\Psi }+\Psi _0)\\ {}= & {} H(\eta , \omega , \rho , \xi , \bar{\Phi }, \bar{\Psi }+\Psi _0) \end{aligned}$$

and additional constraints

$$\begin{aligned} \int _0^1 \bar{\Phi }\, \textrm{d}y=0, \qquad \int _0^1 \bar{\Psi }\, \textrm{d}y=0. \end{aligned}$$

Observe that \(\Phi _0\) is a cyclic variable whose conjugate \(\Psi _0\) is a conserved quantity; we proceed in standard fashion by setting \(\Psi _0=-1\), considering the equations for \((\eta , \omega , \rho , \xi , \bar{\Phi },\bar{\Psi })\) and recovering \(\Phi _0\) by quadrature. Dropping the bars for notational simplicity, one finds that Hamilton’s equations for the reduced system are

$$\begin{aligned} \eta _x&= \rho , \end{aligned}$$
(2.7)
$$\begin{aligned} \rho _x&=\frac{(1+\rho ^2)^{5/2}}{\beta _0+\varepsilon _2} \xi , \end{aligned}$$
(2.8)
$$\begin{aligned} \omega _x&= \frac{1}{(1+\eta )^2}\int _0^1\left( \tfrac{1}{2}((\Psi -1)^2-\Phi _y^2)+\rho y \Phi _y(\Psi -1)\right) \, \textrm{d}y-\tfrac{1}{2}+(\alpha _0+\varepsilon _1)\eta , \end{aligned}$$
(2.9)
$$\begin{aligned} \xi _x&= -\omega -\frac{5}{2}\frac{\rho }{\beta _0+\varepsilon _2} \xi ^2 (1+\rho ^2)^{3/2}-\frac{1}{1+\eta }\int _0^1y \Phi _y(\Psi -1)\, \textrm{d}y, \end{aligned}$$
(2.10)
$$\begin{aligned} \Phi _x&= \frac{\Psi }{1+\eta }+\frac{\rho }{1+\eta }\left( y\Phi _y-\int _0^1 y \Phi _y{\, \textrm{d}y}\right) , \end{aligned}$$
(2.11)
$$\begin{aligned} \Psi _x&= \frac{1}{1+\eta }(-\Phi _y+\rho y (\Psi -1))_y, \end{aligned}$$
(2.12)

with constraints

$$\begin{aligned} \int _0^1 \Phi \, \textrm{d}y=0, \qquad \int _0^1 \Psi \, \textrm{d}y=0 \end{aligned}$$
(2.13)

and boundary conditions

$$\begin{aligned} -\Phi _y+y\rho (\Psi -1)\big |_{y=0,1} = 0. \end{aligned}$$
(2.14)

To make this construction rigorous we recall the differential-geometric definitions of a Hamiltonian system and Hamilton’s equations for its associated vector field (see Groves and Toland [13, §1.4]).

Definition 2.1

A Hamiltonian system consists of a triple \((M,\Omega ,H)\), where \(M\) is a manifold, \(\Omega :TM \times TM \rightarrow \mathbb {R}\) is a closed, skew-symmetric, weakly nondegenerate bilinear form (the symplectic 2-form) and the Hamiltonian \(H:M \rightarrow \mathbb {R}\) is a smooth function. Its Hamiltonian vector field \(v_H\) with domain \({\mathcal D}(v_H)\subseteq M\) is defined as follows. The point \(m\in M\) belongs to \({\mathcal D}(v_H)\) with \(v_H|_m :=w\in TM|_m\) if and only if

$$\begin{aligned} \Omega |_m(w,v) = \textbf{d}H|_m(v) \end{aligned}$$

for all tangent vectors \(v \in TM|_m\). Hamilton’s equations for \((M,\Omega ,H)\) are the differential equations

$$\begin{aligned} \dot{u} = v_H|_u \end{aligned}$$

which determine the trajectories \(u \in C^1(\mathbb {R},X) \cap C(\mathbb {R},{\mathcal D}(v_H))\) of its Hamiltonian vector field.

Let

$$\begin{aligned} X =\{(\eta , \rho , \omega , \xi , \Phi , \Psi ) \in {\mathbb R} \times {\mathbb R} \times {\mathbb R} \times {\mathbb R} \times \bar{H}^1(0,1) \times \bar{L}^2(0,1)\}, \end{aligned}$$

where the overline denotes the subspace of functions with zero mean value, and define the manifold

$$\begin{aligned} M=\{(\eta ,\rho ,\omega , \xi , \Phi , \Psi ) \in X: \eta >-1\}. \end{aligned}$$

The 2-form \(\Omega \) on \(M\) defined by

$$\begin{aligned}&\Omega |_m\big ((\eta _1, \rho _1, \omega _1, \xi _1, \Phi _1, \Psi _1),(\eta _2, \rho _2, \omega _2,\xi _2,\Phi _2,\Psi _2)\big ) \\ {}&\quad = \int _0^1 (\Psi _2\Phi _1-\Phi _2\Psi _1)\, \textrm{d}y+ \omega _2\eta _1+\xi _2 \rho _1-\eta _2\omega _1-\rho _2\xi _1 \end{aligned}$$

is skew-symmetric, closed (since it is constant) and weakly nondegenerate at each point of \(M\). The triple \((M,\Omega ,H^\varepsilon )\) is, therefore, a Hamiltonian system in the sense of Definition 2.1.

Theorem 2.2

Consider the Hamiltonian system \((M,\Omega ,H^\varepsilon )\). The domain of the corresponding Hamiltonian vector field \(v_{H^\varepsilon }\) is

$$\begin{aligned} {\mathcal D}(v_{H^\varepsilon })&=\Big \{(\eta ,\rho ,\omega ,\xi ,\Phi ,\Psi ) \in {\mathbb R} \times {\mathbb R} \times {\mathbb R} \times {\mathbb R} \times \bar{H}^2(0,1) \times \bar{H}^1(0,1):\\&\qquad \qquad \eta > -1,\ \Phi _y-y\rho (\Psi -1)\big |_{y=0,1} = 0\Big \}, \end{aligned}$$

upon which it is given by the right-hand sides of equations (2.7)–(2.12).

Proof

Let \(\bar{v} |_m = (\bar{\eta }, \bar{\rho }, \bar{\omega }, \bar{\xi }, \bar{\Phi }, \bar{\Psi }) \in TM|_m\), where \(m = (\eta , \rho , \omega , \xi , \Phi , \Psi ) \in M\). The point \(m\) lies in \({\mathcal D}(v_{H^\varepsilon })\) with \(v_{H^\varepsilon }|_m = \bar{v}|_m\) if and only if

$$\begin{aligned} \Omega |_m(\bar{v}|_m,v_1|_m) = \textbf{d}H^\varepsilon |_m(v_1|_m), \end{aligned}$$

that is

$$\begin{aligned}&\omega _1 \bar{\eta }+ \xi _1 \bar{\rho }- \eta _1 \bar{\omega }- \rho _1 \bar{\xi }+ \int _0^1(\Psi _1\bar{\Phi }- \Phi _1\bar{\Psi })\, \textrm{d}y\nonumber \\&= \left( -(\alpha _0+\varepsilon _1)\eta +\tfrac{1}{2} -\frac{1}{2 (1+\eta )^2} \int _0^1\big ((\Psi -1)^2-\Phi _y^2\big )\, \textrm{d}y\right. \nonumber \\&\qquad \quad \left. -\frac{1}{(1+\eta )^2}\int _0^1\rho y \Phi _y (\Psi -1)\, \textrm{d}y\right) \eta _1 \nonumber \\&\quad \;+\left( \omega + \frac{5}{2}\frac{\xi ^2}{\beta _0+\varepsilon _2}\rho (1+\rho ^2)^{3/2}+\int _0^1\frac{y \Phi _y(\Psi -1)}{1+\eta }\, \textrm{d}y\right) \rho _1 \nonumber \\&\quad \;+ \rho \omega _1+\frac{\xi }{\beta _0+\varepsilon _2}(1+\rho ^2)^{5/2}\xi _1 +\frac{1}{1+\eta }\int _0^1(-\Phi _y+\rho y (\Psi -1) )\Phi _{1y}\, \textrm{d}y\nonumber \\&\quad \;+\frac{1}{1+\eta }\int _0^1(\Psi +\eta + \rho y \Phi _y)\Psi _1\, \textrm{d}y\end{aligned}$$
(2.15)

for all \(v_1|_m = (\eta _1,\rho _1,\omega _1,\xi _1,\Phi _1,\Psi _1) \in TM|_m\).

The four particular choices \((\eta _1,\rho _1,\xi _1,\Phi _1,\Psi _1)=(0,0,0,0,0)\), \((\eta _1,\rho _1,\omega _1,\Phi _1,\Psi _1)=(0,0,0,0,0)\), \((\rho _1,\omega _1,\xi _1,\Phi _1,\Psi _1)=(0,0,0,0,0)\) and \((\eta _1,\omega _1,\xi _1,\Phi _1,\Psi _1)=(0,0,0,0,0)\) yield, respectively,

$$\begin{aligned} \bar{\eta }&= \rho , \\ \bar{\rho }&=(1+\rho ^2)^{5/2}\frac{\xi }{\beta _0+\varepsilon _2},\\ \bar{\omega }&=\frac{1}{(1+\eta )^2}\int _0^1\left( \tfrac{1}{2}((\Psi -1)^2-\Phi _y^2)+\rho y \Phi _y (\Psi -1)\right) \, \textrm{d}y-\tfrac{1}{2} + (\alpha _0+\varepsilon _1)\eta ,\\ \bar{\xi }&= -\omega -\frac{5}{2}\frac{\rho }{\beta _0+\varepsilon _2} \xi ^2 (1+\rho ^2)^{3/2}-\frac{1}{1+\eta }\displaystyle \int _0^1y \Phi _y(\Psi -1)\, \textrm{d}y, \end{aligned}$$

and with these expressions for \(\bar{\omega }\), \(\bar{\eta }\), \(\bar{\rho }\) and \(\bar{\xi }\) equation (2.15) becomes

$$\begin{aligned} \int _0^1\big (\Psi _1\bar{\Phi }- \Phi _1\bar{\Psi }\big )\, \textrm{d}y= \frac{1}{1+\eta }\int _0^1\big ( (-\Phi _y+\rho y (\Psi -1) )\Phi _{1y}+(\Psi +\eta + \rho y \Phi _y)\Psi _1\big )\, \textrm{d}y. \end{aligned}$$

Choosing \(\tilde{\Phi } \in H^1(0,1)\) and \(\tilde{\Psi } \in L^2(0,1)\), and making first the choice \(\Phi _1=0\), \(\Psi _1 = \tilde{\Psi }-\int _0^1 \tilde{\Psi } \, \textrm{d}y\) and then the choice \(\Phi _1 = \tilde{\Phi }-\int _0^1 \tilde{\Phi } \, \textrm{d}y\), \(\Psi _1=0\), we thus find that

$$\begin{aligned} \int _0^1\tilde{\Psi }\bigg (\frac{\Psi }{1+\eta }+\frac{\rho }{1+\eta }\left( y\Phi _y-\int _0^1 y \Phi _y{\, \textrm{d}y}\right) -\bar{\Phi }\bigg )\, \textrm{d}y= 0 \end{aligned}$$

for all \(\tilde{\Psi } \in L^2(0,1)\), and in particular for all \(\tilde{\Psi } \in C_0^\infty (0,1)\), which implies that

$$\begin{aligned} \bar{\Phi }= \frac{\Psi }{1+\eta }+\frac{\rho }{1+\eta }\left( y\Phi _y-\int _0^1 y \Phi _y{\, \textrm{d}y}\right) \in H^1(0,1), \end{aligned}$$
(2.16)

and

$$\begin{aligned} \int _0^1\bigg (\tilde{\Phi }\bar{\Psi }+ \tilde{\Phi }_y\bigg (-\frac{\Phi _y}{1+\eta }+\frac{\rho y (\Psi -1)}{1+\eta }\bigg )\bigg )\, \textrm{d}y= 0 \end{aligned}$$
(2.17)

for all \(\tilde{\Phi } \in H^1(0,1)\), and in particular for all \(\tilde{\Phi } \in C_0^\infty (0,1)\), which implies that

$$\begin{aligned} \bar{\Psi }= \dfrac{1}{1+\eta }(-\Phi _y+\rho y (\Psi -1))_y \in L^2(0,1) \end{aligned}$$
(2.18)

in the weak sense. It follows from (2.16) and (2.18) that \(\Phi _y \in H^1(0,1)\) and \(\Psi _y\in L^2(0,1)\), so that \(\Phi \in H^2(0,1)\) and \(\Psi \in H^1(0,1)\).

Finally, integrating the second term in (2.17) by parts and using (2.18), we find that

$$\begin{aligned} \bigg [\tilde{\Phi }\bigg (-\frac{\Phi _y}{1+\eta }+\frac{\rho y (\Psi -1)}{1+\eta }\bigg )\bigg ]_0^1=0 \end{aligned}$$

for all \(\tilde{\Phi } \in C^\infty [0,1]\), so that

$$\begin{aligned} -\frac{\Phi _y}{1+\eta }+\frac{\rho y (\Psi -1)}{1+\eta }\bigg |_{y=0,1} = 0. \end{aligned}$$

\(\square \)

One cannot work directly with (2.7)–(2.12) because of the nonlinear boundary condition at \(y=1\) in the domain of the Hamiltonian vector field \(v_{H^\varepsilon }\). We overcome this difficulty using the change of variable \((\eta , \rho , \omega , \xi , \Phi , \Psi ) \mapsto (\eta , \rho , \omega , \xi , \Gamma , \Psi )\), where

$$\begin{aligned} \Gamma = \Phi - \rho \int _0^ys(\Psi (s)-1)\, \textrm{d}s+\rho \int _0^1\int _0^ys(\Psi (s)-1)\, \textrm{d}s\, \textrm{d}y, \end{aligned}$$

which is a smooth diffeomorphism \(X\rightarrow X\) and \(M\rightarrow M\) with inverse

$$\begin{aligned} \Phi = \Gamma + \rho \int _0^ys(\Psi (s)-1)\, \textrm{d}s-\rho \int _0^1\int _0^ys(\Psi (s)-1)\, \textrm{d}s\, \textrm{d}y. \end{aligned}$$

This change of variable transforms equations (2.7)–(2.12) into

$$\begin{aligned} \eta _x&= \rho , \end{aligned}$$
(2.19)
$$\begin{aligned} \rho _x&= \frac{(1+\rho ^2)^{5/2}}{\beta _0+\varepsilon _2}\xi , \end{aligned}$$
(2.20)
$$\begin{aligned} \omega _x&= \frac{1}{(1+\eta )^2}\int _0^1\bigg \{\tfrac{1}{2}(\Psi -1)^2-\tfrac{1}{2}\big (\Gamma _y+\rho y (\Psi -1)\big )^2 \nonumber \\&\qquad \qquad \qquad \qquad \quad + \rho y\Gamma _y(\Psi -1) + \rho ^2 y^2 (\Psi -1)^2\bigg \}\, \textrm{d}y\nonumber \\&\qquad -\tfrac{1}{2}+ (\alpha _0+\varepsilon _1) \eta , \end{aligned}$$
(2.21)
$$\begin{aligned} \xi _x&= -\omega -\frac{5}{2}\frac{\rho }{\beta _0+\varepsilon _2}\xi ^2(1+\rho ^2)^{3/2}-\frac{1}{1+\eta }\int _0^1y(\Gamma _y+\rho y (\Psi -1)) (\Psi -1)\, \textrm{d}y, \end{aligned}$$
(2.22)
$$\begin{aligned} \Gamma _x&= \frac{\Psi }{1+\eta } + \frac{\rho y(\Gamma _y + \rho y (\Psi -1))}{1+\eta }- \frac{\rho }{1+\eta }\int _0^1 y(\Gamma _y + \rho y (\Psi -1))\, \textrm{d}y\nonumber \\&\qquad \qquad \quad - \frac{(1+\rho ^2)^{5/2}}{\beta _0+\varepsilon _2}\xi \int _0^ys(\Psi (s)-1)\, \textrm{d}s\nonumber \\&\qquad \qquad \quad + \frac{(1+\rho ^2)^{5/2}}{\beta _0+\varepsilon _2}\xi \int _0^1\int _0^ys(\Psi (s)-1)\, \textrm{d}s\, \textrm{d}y\nonumber \\&\qquad \qquad \quad + \rho \int _0^y\frac{s}{1+\eta }\Gamma _{yy}\, \textrm{d}s- \rho \int _0^1\int _0^y\frac{s}{1+\eta }\Gamma _{yy}\, \textrm{d}s\, \textrm{d}y, \end{aligned}$$
(2.23)
$$\begin{aligned} \Psi _x&= -\frac{1}{1+\eta }\Gamma _{yy} \end{aligned}$$
(2.24)

and the boundary conditions (2.14) into

$$\begin{aligned} \Gamma _y\big |_{y=0,1} = 0 . \end{aligned}$$
(2.25)

Equations (2.19)–(2.24) are Hamilton’s equations for the Hamiltonian system \((M,\Upsilon ,\hat{H}^{\varepsilon })\), where

$$\begin{aligned} \hat{H}^\varepsilon (\eta ,\rho ,\omega ,\xi ,\Gamma ,\Psi )&=\omega \rho - \tfrac{1}{2}(\alpha _0+\varepsilon _1)\eta ^2+\frac{\xi ^2}{2(\beta _0+\varepsilon _2)}(1+\rho ^2)^{5/2}+\tfrac{1}{2}(\eta -1) \\&\qquad \quad +\int _0^1\bigg \{\frac{1}{2(1+\eta )}\big ((\Psi -1)^2-(\Gamma _y+\rho y(\Psi -1))^2\big )\\&\qquad \quad \qquad \qquad \quad + \frac{1}{1+\eta }\big (\rho y\Gamma _y(\Psi -1)+\rho ^2y^2(\Psi -1)^2\big )\bigg \}\, \textrm{d}y\end{aligned}$$

and

$$\begin{aligned}&\Upsilon |_{(\eta , \rho , \omega , \xi , \Gamma , \Psi )}\big ((\tilde{\eta }_1, \tilde{\rho }_1, \tilde{\omega }_1, \tilde{\xi }_1, \tilde{\Gamma }_1, \tilde{\Psi }_1), (\tilde{\eta }_2, \tilde{\rho }_2, \tilde{\omega }_2, \tilde{\xi }_2, \tilde{\Gamma }_2, \tilde{\Psi }_2)\big ) \\&\qquad =\int _0^1 \bigg \{\tilde{\Psi }_2\left( \tilde{\Gamma }_1+\tilde{\rho }_1\int _0^ys\Psi (s)\, \textrm{d}s+\rho \int _0^ys\tilde{\Psi }_1(s)\, \textrm{d}s\right) - \tfrac{1}{2}\tilde{\rho }_1y^2\tilde{\Psi }_2\\&\qquad \qquad \quad \qquad \quad -\tilde{\Psi }_1\left( \tilde{\Gamma }_2 + \tilde{\rho }_2\int _0^ys\Psi (s)\, \textrm{d}s+ \rho \int _0^ys\tilde{\Psi }_2(s)\, \textrm{d}s\right) +\tfrac{1}{2}\tilde{\rho }_2y^2\tilde{\Psi }_1\bigg \} \, \textrm{d}y\\&\qquad \qquad \qquad +\tilde{\omega }_2\tilde{\eta }_1+\tilde{\xi }_2\tilde{\rho }_1-\tilde{\eta }_2\tilde{\omega }_1-\tilde{\rho }_2\tilde{\xi }_1; \end{aligned}$$

furthermore,

$$\begin{aligned} {\mathcal D}(v_{\hat{H}^\varepsilon })=\left\{ (\eta ,\rho ,\omega ,\xi ,\Gamma ,\Psi ) \in {\mathbb R} \times {\mathbb R} \times {\mathbb R} \times {\mathbb R} \times \bar{H}^2(0,1) \times \bar{H}^1(0,1): \eta >-1,\ \Gamma _y\big |_{y=0,1} = 0\right\} . \end{aligned}$$

We write (2.19)–(2.24) as

$$\begin{aligned} u_x = L u + N^\varepsilon (u), \end{aligned}$$

in which \(L = \textrm{d}v_{\hat{H}^0}[0]\) and \(N^\varepsilon (u)=v_{\hat{H}^\varepsilon }(u)-Lu\), so that

$$\begin{aligned} L(\eta ,\rho ,\omega ,\xi ,\Gamma ,\Psi )=\begin{pmatrix} \rho \\ \beta _0^{-1}\xi \\ (\alpha _0-1)\eta \\ -\omega -\tfrac{1}{3}\rho +\int _0^1 y\Gamma _y\, \textrm{d}y\\ \Psi +\beta _0^{-1}\big (\tfrac{1}{2}y^2-\tfrac{1}{6}\big )\xi \\ -\Gamma _{yy} \end{pmatrix}, \end{aligned}$$

with

$$\begin{aligned} \mathcal {D}(L) = \left\{ (\eta , \rho , \omega , \xi , \Gamma , \Psi ) \in \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times \mathbb {R}\times \bar{H}^2(0,1) \times \bar{H}^1(0,1) : \Gamma _y\big |_{y=0,1} = 0 \right\} . \end{aligned}$$

3 Spectral Analysis

In this section, we examine the spectrum of the linear operator \(L:{\mathcal D}(L) \subseteq X \rightarrow X\) in detail. Our first result is obtained by a straightforward calculation.

Proposition 3.1

A complex number \(\lambda \) is an eigenvalue of \(L\) if and only if

$$\begin{aligned} \alpha _0+\lambda ^4\beta _0 = \lambda \cot (\lambda ); \end{aligned}$$
(3.1)

its eigenspace is one-dimensional and spanned by, respectively,

$$\begin{aligned} e_\lambda =\begin{pmatrix} \frac{1}{\lambda }\sin (\lambda ) \\ \sin (\lambda ) \\ \frac{1}{\lambda ^2} (\alpha _0-1)\sin (\lambda ) \\ \frac{1}{\lambda ^2} \cos (\lambda ) - \frac{1}{\lambda ^3} \alpha _0 \sin (\lambda ) \\ \frac{1}{\lambda }\cos (\lambda y) -\frac{1}{\lambda ^2}\sin (\lambda )+\frac{1}{2}(y^2-\frac{1}{3})\sin (\lambda ) \\ \cos (\lambda y)-\frac{1}{\lambda }\sin (\lambda ) \end{pmatrix}, \qquad e_0=\begin{pmatrix} 1 \\ 0 \\ 0\\ 0 \\ 0 \\ 0 \end{pmatrix} \end{aligned}$$

for \(\lambda \ne 0\) and \(\lambda =0\) (which arises only for \(\alpha _0=1\)). All eigenvalues are also algebraically simple, with the exception of the zero eigenvalue at \(\alpha _0=1\) and the purely imaginary eigenvalues \(\pm \textrm{i}s\) at the point \((\beta _0(s),\alpha _0(s))\) of the curve

$$\begin{aligned} C=\left\{ \hspace{-1.5pt}(\beta _0(s),\alpha _0(s))=\left( \frac{1}{4s^3}\coth (s)-\frac{1}{4s^2\sinh ^2(s)},\frac{3s}{4}\coth (s)+\frac{s^2}{4\sinh ^2(s)}\right) : s \in (0,\infty )\hspace{-1.5pt}\right\} \end{aligned}$$

in the parameter plane which are algebraically double.
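These formulae are easily checked numerically: writing \(g(\sigma )=\alpha _0+\beta _0\sigma ^4-\sigma \coth \sigma \), whose zeros \(\sigma =s\) correspond via (3.1) to purely imaginary eigenvalues \(\pm \textrm{i}s\), one finds that at the point of C with parameter s both \(g(s)\) and \(g'(s)\) vanish, reflecting the algebraically double eigenvalues. A minimal Python sketch (illustrative only; the function names and the sample value of s are ours):

```python
import math

def curve_C(s):
    """Point (beta_0(s), alpha_0(s)) on the curve C."""
    coth = math.cosh(s) / math.sinh(s)
    sinh2 = math.sinh(s) ** 2
    beta0 = coth / (4 * s**3) - 1.0 / (4 * s**2 * sinh2)
    alpha0 = 3 * s * coth / 4 + s**2 / (4 * sinh2)
    return beta0, alpha0

def g(sigma, beta0, alpha0):
    # alpha_0 + beta_0*sigma^4 - sigma*coth(sigma); g(s) = 0 means that
    # lambda = i*s satisfies the dispersion relation (3.1)
    return alpha0 + beta0 * sigma**4 - sigma * math.cosh(sigma) / math.sinh(sigma)

s = 1.3                                   # arbitrary sample parameter value
beta0, alpha0 = curve_C(s)
h = 1e-5
root = g(s, beta0, alpha0)                # ~ 0: i*s is an eigenvalue
slope = (g(s + h, beta0, alpha0) - g(s - h, beta0, alpha0)) / (2 * h)  # ~ 0: double root
```

The vanishing of the central difference `slope` is the numerical counterpart of the collision of two pairs of purely imaginary eigenvalues on C.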

The following lemma gives more precise information on the point spectrum of L.

Lemma 3.2

Suppose that \(\beta _0>0\) and \(\alpha _0 \in {\mathbb R}\). The point spectrum of L consists of a countably infinite family \(\{\lambda _k\}_{k \in {\mathbb Z}{\setminus }\{0\}}\) of simple real eigenvalues, where \(\{\lambda _k\}_{k=1}^\infty \) are the positive real solutions of equation (3.1) and \(\lambda _{-k}=-\lambda _k\) for \(k=1,2,\ldots \), together with

  1. (a)

    two plus–minus pairs of simple purely imaginary eigenvalues if \(\alpha _0>1\) and \((\beta _0,\alpha _0)\) lies to the left of the curve C in the parameter plane,

  2. (b)

    a plus–minus pair of algebraically double purely imaginary eigenvalues \(\pm \textrm{i}s\) if \((\beta _0,\alpha _0)\) is the point with parameter value s on the curve C,

  3. (c)

    a plus–minus quartet of genuinely complex eigenvalues if \(\alpha _0>1\) and \((\beta _0,\alpha _0)\) lies to the right of the curve C in the parameter plane,

  4. (d)

    a plus–minus pair of simple purely imaginary eigenvalues and an algebraically double zero eigenvalue if \(\alpha _0=1\),

  5. (e)

    an additional plus–minus pair of simple real eigenvalues and a plus–minus pair of simple purely imaginary eigenvalues if \(\alpha _0<1\).

Furthermore, \(\lambda _k \in (k\pi , (k+1)\pi )\) for \(k=1,2,\ldots \) and

$$\begin{aligned} \lambda _k^2 = k^2\pi ^2+\frac{2}{\beta _0 k^2\pi ^2}+o\left( \frac{1}{k^2}\right) \end{aligned}$$

for large k.

Proof

Observe that \(\lambda \) solves (3.1) if and only if \(\nu =\lambda ^2\) is an eigenvalue of the non-self-adjoint Sturm–Liouville problem

$$\begin{aligned} -v_{yy}= & {} \nu v, \end{aligned}$$
(3.2)
$$\begin{aligned} \frac{v_y(1)}{v(1)}= & {} \alpha _0 + \beta _0 \nu ^2, \end{aligned}$$
(3.3)
$$\begin{aligned} v(0)= & {} 0. \end{aligned}$$
(3.4)

This problem has a countable number of (not necessarily real) eigenvalues \(\{\nu _n\}_{n \in {\mathbb N}_0}\), which, repeated according to algebraic multiplicity and listed in order of increasing absolute value, are given asymptotically for large n by

$$\begin{aligned} \nu _n = (n-1)^2\pi ^2+\frac{2}{\beta _0 (n-1)^2\pi ^2}+o\left( \frac{1}{n^2}\right) \end{aligned}$$
(3.5)

(see Binding et al. [3, Theorem 2.2]). The real eigenvalues of the spectral problem (3.2)–(3.4) correspond to the intersections in the \((\nu ,s)\) plane of the parabola \(s=\alpha _0+\beta _0 \nu ^2\) and the curve \(s=B(\nu )\), where \(B(\nu )=\sqrt{\nu } \cot \sqrt{\nu }\). The function \(B(\nu )\) has poles exactly at the Dirichlet eigenvalues

$$\begin{aligned} \nu _n^\textrm{D} = (n+1)^2 \pi ^2, \qquad n \in {\mathbb N}_0 \end{aligned}$$
(3.6)

of the self-adjoint problem in which (3.3) is replaced by \(v(1)=0\); it is strictly decreasing from \(+\infty \) to \(-\infty \) in each interval \((-\infty , \nu _0^\textrm{D})\) and \((\nu _n^\textrm{D}, \nu _{n+1}^\textrm{D})\), \(n\in {\mathbb N}_0\). It follows that (3.2)–(3.4) has at least one real eigenvalue in each interval \((\nu _n^\textrm{D}, \nu _{n+1}^\textrm{D})\), \(n\in {\mathbb N}_0\) (see Fig. 5).

Fig. 5
figure 5

Geometric characterisation of the eigenvalues \(\nu _n\) as the points of intersection of the curve \(s=B(\nu )\) with the parabola \(s=\alpha _0+\beta _0\nu ^2\); one real eigenvalue lies in each interval \((\nu _n^\textrm{D}, \nu _{n+1}^\textrm{D})\), \(n \in {\mathbb N}_0\). a Two additional complex eigenvalues; b one additional algebraically double negative eigenvalue; ce two additional real eigenvalues

Comparing (3.5) with (3.6) and using the above geometrical characterisation of the real eigenvalues, one concludes that

  1. (1)

    each interval \((\nu _n^\textrm{D}, \nu _{n+1}^\textrm{D})\), \(n \in {\mathbb N}_0\), contains a simple real eigenvalue;

  2. (2)

    there are precisely two additional eigenvalues (counted according to algebraic multiplicity) in the form of either

    1. (a)

      a complex-conjugate pair (with non-vanishing imaginary part) whose absolute value is less than \(\nu _0^\textrm{D}\) (Fig. 5a),

    2. (b)

      one negative, algebraically double eigenvalue (Fig. 5b),

    3. (c)

      two simple real eigenvalues to the left of \(\nu _0^\textrm{D}\), at least one of which is negative (Fig. 5c–e).

The solutions \(\lambda \) of (3.1) are recovered from the above analysis by the formula \(\nu =\lambda ^2\), so that in particular they occur in plus-minus pairs. Clearly, (3.1) has a real solution in each interval \(((\nu _n^\textrm{D})^{1/2}, (\nu _{n+1}^\textrm{D})^{1/2})\) and \((-(\nu _{n+1}^\textrm{D})^{1/2},-(\nu _n^\textrm{D})^{1/2})\), \(n \in {\mathbb N}_0\) (see point (1) above), and it follows from point (2) that there are four additional solutions (counted according to multiplicity). The results in Proposition 3.1 and the fact that \(B(0)=1\) show that these four solutions are described by precisely one of the statements (a)–(e) (according to which of the scenarios in Fig. 5 occurs).

The asymptotic formula for \(\lambda _k\) follows by writing \(k=n+1\). \(\square \)
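The conclusions of the lemma lend themselves to a quick numerical sanity check: the solutions of (3.1) are the zeros of \(\lambda \cos \lambda -(\alpha _0+\beta _0\lambda ^4)\sin \lambda \), which changes sign across each interval \((k\pi ,(k+1)\pi )\), so bisection locates \(\lambda _k\). A sketch (not part of the proof; the parameter values \(\beta _0=1\), \(\alpha _0=1.2\) are an arbitrary choice of ours):

```python
import math

BETA0, ALPHA0 = 1.0, 1.2   # sample parameter values (arbitrary choice)

def f(lam):
    # zeros of f coincide with the solutions of (3.1):
    # lambda*cot(lambda) = alpha_0 + beta_0*lambda^4
    return lam * math.cos(lam) - (ALPHA0 + BETA0 * lam**4) * math.sin(lam)

def lambda_k(k):
    # f has opposite signs at the endpoints of (k*pi, (k+1)*pi); bisection
    a, b = k * math.pi + 1e-9, (k + 1) * math.pi - 1e-9
    for _ in range(100):
        m = 0.5 * (a + b)
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return 0.5 * (a + b)

roots = {k: lambda_k(k) for k in (1, 2, 3, 10)}
```

Each computed root lies in \((k\pi ,(k+1)\pi )\) and approaches \(k\pi \) rapidly as k grows, in line with the asymptotic formula for \(\lambda _k\).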

According to this lemma the purely imaginary eigenvalues of L appear in pairs \(\pm \textrm{i}s\) satisfying the dispersion relation

$$\begin{aligned} \alpha _0+s^4\beta _0=s\coth (s). \end{aligned}$$
(3.7)

Fig. 4 shows the dependence of these eigenvalues upon \(\beta _0\) and \(\alpha _0\). At each point of \(\{\alpha _0=1\}\), two real eigenvalues become purely imaginary by colliding at the origin, while at each point of the curve C two pairs of purely imaginary eigenvalues become complex by colliding at non-zero points \(\pm \textrm{i}s\) on the imaginary axis. For later reference, we note that an eigenvector \(e\) and generalised eigenvector \(f\) with eigenvalue \(\textrm{i}s\) when \((\beta _0,\alpha _0) \in C\) may be taken as \(e=e_{\textrm{i}s}\) and a solution \(f\) of \((L-\textrm{i}sI)f=e\) (the corresponding formulae for the zero eigenvalue at \(\alpha _0=1\) are \(e_0=(1,0,0,0,0,0)^\textrm{T}\), \(f_0=(0,1,-\frac{1}{3},0,0,0)^\textrm{T}\)).

Lemma 3.3

The operator L is regular, that is its spectrum consists entirely of isolated eigenvalues of finite algebraic multiplicity.

Proof

Since \({\mathcal D}(L)\) is compactly embedded in X it suffices to show that \(\rho (L)\) is non-empty, so that L has compact resolvent (Kato [20, Theorem III.6.29]). In the case \(\alpha _0 \ne 1\), a direct calculation shows that L is invertible with bounded inverse, so that \(0 \in \rho (L)\).

To deal with the case \(\alpha _0=1\) note that \(L|_{\alpha _0=1}\) is a compact perturbation of \(L|_{\alpha _0=\frac{1}{2}}\), so that the essential spectrum of these two operators (the set of \(\lambda \) for which \((\lambda I - L)\) is not Fredholm with index zero) is identical (see Schechter [34]). It follows that the spectrum of \(L|_{\alpha _0=1}\) consists of the solution set of (3.1); in particular, its resolvent set is non-empty. \(\square \)

Finally, we show that the set of generalised eigenvectors of L forms a Schauder basis for X, which is henceforth replaced by its complexification. In particular, we show that this set is a Riesz basis, that is, a basis obtained from an orthonormal basis by an isomorphism (see Gohberg and Krein [9, §VI.2]); note that we use the Dirichlet norm for the space \(\bar{H}^1(0,1)\).

Proposition 3.4

The set

$$\begin{aligned} {\mathcal A} = \left\{ \begin{pmatrix} (k\pi )^{-1}\cos (k \pi y) \\ \cos (k \pi y) \end{pmatrix}\right\} _{k \in {\mathbb Z} \setminus \{0\}} \end{aligned}$$

is an orthonormal basis for \(\bar{H}^1(0,1) \times \bar{L}^2(0,1)\).

Proof

Note that \(\{\sqrt{2} \cos (k\pi y)\}_{k=1}^\infty \), \(\{\sqrt{2} (k\pi )^{-1}\cos (k\pi y)\}_{k=1}^\infty \) are orthonormal bases for, respectively, \(\bar{L}^2(0,1)\) and \(\bar{H}^1(0,1)\). It, therefore, follows from

$$\begin{aligned}{} & {} \textrm{sp} \left\{ \begin{pmatrix} (k\pi )^{-1}\cos (k \pi y) \\ \cos (k \pi y) \end{pmatrix}\right\} _{k \in {\mathbb Z} \setminus \{0\}} = \textrm{sp} \left\{ \begin{pmatrix} \sqrt{2}(k\pi )^{-1}\cos (k \pi y) \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \sqrt{2}\cos (k \pi y) \end{pmatrix}\right\} _{k = 1}^\infty \end{aligned}$$

in \(\bar{H}^1(0,1) \times \bar{L}^2(0,1)\) that \({\mathcal A}\) is complete, and it is evidently orthonormal. \(\square \)
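The orthonormality claim can also be confirmed by quadrature: with the Dirichlet inner product on \(\bar{H}^1(0,1)\), the inner product of the elements of \({\mathcal A}\) indexed by k and l reduces to \(\int _0^1 \cos ((k-l)\pi y)\, \textrm{d}y\). A sketch of the computation (the helper names are ours):

```python
import math

N = 2000                       # midpoint-rule quadrature points on (0, 1)
YS = [(j + 0.5) / N for j in range(N)]

def inner(k, l):
    # <(f_k, g_k), (f_l, g_l)> with f_k = (k*pi)^{-1} cos(k*pi*y), g_k = cos(k*pi*y);
    # the Dirichlet inner product uses f_k' = -sin(k*pi*y), so the integrand is
    # sin(k*pi*y)*sin(l*pi*y) + cos(k*pi*y)*cos(l*pi*y) = cos((k-l)*pi*y)
    total = 0.0
    for y in YS:
        total += (math.sin(k * math.pi * y) * math.sin(l * math.pi * y)
                  + math.cos(k * math.pi * y) * math.cos(l * math.pi * y))
    return total / N

KS = [-3, -2, -1, 1, 2, 3]
gram = [[inner(k, l) for l in KS] for k in KS]   # approximately the identity matrix
```

In particular the pairing of the elements indexed by k and \(-k\) vanishes because the sign change in the first component is compensated by the Dirichlet part of the inner product.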

Corollary 3.5

Let P be the spectral projection onto the four-dimensional subspace of X corresponding to the eigenvalues shown in Fig. 4, and let \(\{e_1,e_2,e_3,e_4\}\) be a basis for P[X] consisting of generalised eigenvectors of L. The set

$$\begin{aligned} \{e_1,e_2,e_3,e_4\} \cup \{f_k\}_{k \in {\mathbb Z} \setminus \{0\}}, \qquad f_k=\begin{pmatrix}0 \\ 0 \\ 0 \\ 0 \\ (k\pi )^{-1}\cos (k \pi y) \\ \cos (k \pi y) \end{pmatrix} \end{aligned}$$

is a Riesz basis for X.

Proof

Let \(\{g_1,g_2,g_3,g_4\}\) denote the usual basis for the subset \({\mathbb C}^4 \times \{(0,0)\}\) of X, and note that \(\{g_1,g_2,g_3,g_4\} \cup \{f_k\}_{k \in {\mathbb Z} {\setminus } \{0\}}\) is an orthonormal basis for X. Let \(\pi : X \rightarrow {\mathbb C}^4\) denote the projection \((\eta ,\rho ,\omega ,\xi ,\Phi ,\Psi ) \mapsto (\eta ,\rho ,\omega ,\xi )\), and note that \(\{\pi e_1,\pi e_2, \pi e_3, \pi e_4\}\) spans \({\mathbb C}^4\).

The formula \(S(\eta ,\rho ,\omega ,\xi ,\Phi ,\Psi )=(T(\eta ,\rho ,\omega ,\xi ),(\Phi ,\Psi ))\), where \(T(\eta ,\rho ,\omega ,\xi )\) is the coordinate vector of \((\eta ,\rho ,\omega ,\xi )\) with respect to the basis \(\{\pi e_1,\pi e_2, \pi e_3, \pi e_4\}\) for \({\mathbb C}^4\), defines an isomorphism \(X \rightarrow X\) with

$$\begin{aligned} S[\{g_1,g_2,g_3,g_4\} \cup \{f_k\}_{k \in {\mathbb Z} {\setminus } \{0\}}]=\{e_1,e_2,e_3,e_4\} \cup \{f_k\}_{k \in {\mathbb Z} {\setminus } \{0\}}. \end{aligned}$$

It follows that \(\{e_1,e_2,e_3,e_4\} \cup \{f_k\}_{k \in {\mathbb Z} {\setminus } \{0\}}\) is a Riesz basis for X. \(\square \)

Theorem 3.6

The set \(\{e_1,e_2,e_3,e_4\} \cup \{e_{\lambda _k}\}_{k \in {\mathbb Z}{\setminus }\{0\}}\) is a Riesz basis for X.

Proof

We first note that the set \(\{e_1,e_2,e_3,e_4\}\cup \{e_{\lambda _k}\}_{k \in {\mathbb Z}{\setminus }\{0\}}\) is \(\omega \)-linearly independent since it is the union of bases for the generalised eigenspaces of a regular operator (see Gohberg and Krein [9, p. 329]).

Choose \(\mu ^\star \in (0,\lambda _1)\). The function \(h:(0,\infty ) \rightarrow X\) defined by

$$\begin{aligned} h(\mu )= \begin{pmatrix} \frac{1}{\mu }\sin (\mu ) \\ \sin (\mu ) \\ \frac{1}{\mu ^2} (\alpha _0-1)\sin (\mu ) \\ \frac{1}{\mu ^2} \cos (\mu ) - \frac{1}{\mu ^3} \alpha _0 \sin (\mu ) \\ \frac{1}{\mu }\cos (\mu y) -\frac{1}{\mu ^2}\sin (\mu )+\frac{1}{2}(y^2-\frac{1}{3})\sin (\mu ) \\ \cos (\mu y)-\frac{1}{\mu }\sin (\mu ) \end{pmatrix} \end{aligned}$$

satisfies

$$\begin{aligned} \Vert h(\mu _1)-h(\mu _2)\Vert \le \sup _{\mu \in [\mu ^\star ,\infty )} \Vert h^\prime (\mu )\Vert |\mu _1-\mu _2| \lesssim |\mu _1-\mu _2| \end{aligned}$$

for all \(\mu _1\), \(\mu _2 \in (\mu ^\star ,\infty )\). With \(\mu _1 = \lambda _k\) and \(\mu _2 = k \pi \) this calculation shows in particular that

$$\begin{aligned} \left\| e_{\lambda _k}-\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ (k\pi )^{-1}\cos (k \pi y) \\ \cos (k \pi y) \end{pmatrix}\right\|&= \left\| e_{\lambda _k}-\begin{pmatrix} 0 \\ 0 \\ 0 \\ (k \pi )^{-2}\cos (k \pi ) \\ (k \pi )^{-1}\cos (k \pi y) \\ \cos (k \pi y) \end{pmatrix}+\begin{pmatrix} 0 \\ 0 \\ 0 \\ (k \pi )^{-2}\cos (k \pi ) \\ 0 \\ 0 \end{pmatrix}\right\| \\&\le \left\| e_{\lambda _k}-\begin{pmatrix} 0 \\ 0 \\ 0 \\ (k \pi )^{-2}\cos (k \pi ) \\ (k \pi )^{-1}\cos (k \pi y) \\ \cos (k \pi y) \end{pmatrix} \right\| + \frac{1}{k^2\pi ^2} \\&\lesssim |\lambda _k - k \pi | + \frac{1}{k^2\pi ^2} \\&= O(\tfrac{1}{k}) \end{aligned}$$

as \(k \rightarrow \infty \), and similarly

$$\begin{aligned} \left\| e_{-\lambda _k}-\begin{pmatrix} 0 \\ 0 \\ 0 \\ (k \pi )^{-2}\cos (k \pi ) \\ -(k \pi )^{-1}\cos (k \pi y) \\ \cos (k \pi y) \end{pmatrix}\right\| = \left\| e_{\lambda _k}-\begin{pmatrix} 0 \\ 0 \\ 0 \\ (k \pi )^{-2}\cos (k \pi ) \\ (k \pi )^{-1}\cos (k \pi y) \\ \cos (k \pi y) \end{pmatrix}\right\| =O(\tfrac{1}{k}) \end{aligned}$$

as \(k \rightarrow \infty \). Hence,

$$\begin{aligned} \sum _{j=1}^4\Vert e_j-e_j\Vert ^2 +\sum _{k \in {\mathbb Z} \setminus \{0\}} \left\| e_{\lambda _k}-f_k\right\| ^2 < \infty \end{aligned}$$

and the conclusion now follows by Bari’s theorem (Gohberg and Krein [9, Theorem VI.2.3]). \(\square \)
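The decay \(\Vert e_{\lambda _k}-f_k\Vert = O(\tfrac{1}{k})\) underlying this application of Bari's theorem can be observed numerically by computing \(\lambda _k\) from (3.1) and evaluating the norm of the difference directly (absolute values for the first four components, the Dirichlet norm for the fifth and the \(L^2\)-norm for the sixth). A rough sketch (quadrature-based; the parameter values are an arbitrary choice of ours):

```python
import math

BETA0, ALPHA0 = 1.0, 1.2       # sample parameter values (arbitrary choice)
N = 2000                       # midpoint-rule quadrature points on (0, 1)
YS = [(j + 0.5) / N for j in range(N)]

def lambda_k(k):
    # bisection for the root of (3.1) in (k*pi, (k+1)*pi)
    f = lambda lam: lam * math.cos(lam) - (ALPHA0 + BETA0 * lam**4) * math.sin(lam)
    a, b = k * math.pi + 1e-9, (k + 1) * math.pi - 1e-9
    for _ in range(100):
        m = 0.5 * (a + b)
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return 0.5 * (a + b)

def norm_diff(k):
    # || e_{lambda_k} - f_k ||_X, using the components of e_lambda from Prop. 3.1
    lam = lambda_k(k)
    s, c = math.sin(lam), math.cos(lam)
    scalars = [s / lam, s, (ALPHA0 - 1.0) * s / lam**2,
               c / lam**2 - ALPHA0 * s / lam**3]
    total = sum(x * x for x in scalars)
    for y in YS:
        dgam = -math.sin(lam * y) + y * s + math.sin(k * math.pi * y)  # (5th slot)' diff
        dpsi = math.cos(lam * y) - s / lam - math.cos(k * math.pi * y)  # 6th slot diff
        total += (dgam * dgam + dpsi * dpsi) / N
    return math.sqrt(total)
```

The dominant contribution comes from the fourth (\(\xi \)) component, of size \((k\pi )^{-2}\), so the computed differences in fact decay faster than the \(O(\tfrac{1}{k})\) bound used above.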

Let \(\{e^1,e^2,e^3,e^4\} \cup \{e^{\lambda _k}\}_{k \in {\mathbb Z} {\setminus }\{0\}}\) be the dual Riesz basis to \(\{e_1,e_2,e_3,e_4\} \cup \{e_{\lambda _k}\}_{k \in {\mathbb Z} {\setminus }\{0\}}\) (see Gohberg and Krein [9, §VI.1-2]), so that

$$\begin{aligned} P = \sum _{i=1}^4 \langle \,\cdot ,e^i\rangle e_i, \qquad (I-P) = \sum _{k \in {\mathbb Z} \setminus \{0\}} \langle \,\cdot , e^{\lambda _k}\rangle e_{\lambda _k}, \end{aligned}$$

and define \(X_2 = (I-P)X\), \(L_2=L|_{X_2}\) (with \({\mathcal D}(L_2) = {\mathcal D}(L) \cap X_2\)). Note that

$$\begin{aligned} X_2 = \bigg \{\sum _{k \in {\mathbb Z}\setminus \{0\}}\beta _k e_{\lambda _k}: \{\beta _k\} \in \ell ^2\bigg \}, \qquad {\mathcal D}(L_2) = \bigg \{\sum _{k \in {\mathbb Z}\setminus \{0\}} \beta _k e_{\lambda _k}: \{\lambda _k\beta _k\} \in \ell ^2\bigg \}. \end{aligned}$$

We conclude this section with a maximal regularity result for \(L_2\) which is used in Sect. 4 below.

Lemma 3.7

The operator \(L_2:{\mathcal D}(L_2) \subseteq X_2 \rightarrow X_2\) has \(L^2\)-maximal regularity in the sense that the differential equation

$$\begin{aligned} \dot{w} = L_2 w + h \end{aligned}$$

admits a unique solution \(w \in H^1({\mathbb R},X_2)\cap L^2({\mathbb R},{\mathcal D}(L_2))\) for each \(h \in L^2({\mathbb R},X_2)\).

Proof

Writing

$$\begin{aligned} w = \sum _{k \in {\mathbb Z}\setminus \{0\}} w_k e_{\lambda _k}, \qquad h=\sum _{k \in {\mathbb Z}\setminus \{0\}} h_k e_{\lambda _k} \end{aligned}$$

(where \(w_k = \langle w,e^{\lambda _k}\rangle \), \(h_k = \langle h,e^{\lambda _k}\rangle \)), we find that

$$\begin{aligned} \dot{w}_k = \lambda _k w_k + h_k, \end{aligned}$$
(3.8)

which is solved by

$$\begin{aligned} w_k(t) = -\int _t^\infty h_k(s)\textrm{e}^{\lambda _k(t-s)}\, \textrm{d}s\quad (k>0), \qquad w_k(t) = \int _{-\infty }^t h_k(s)\textrm{e}^{\lambda _k(t-s)}\, \textrm{d}s\quad (k<0). \end{aligned}$$

Note that

$$\begin{aligned} \Vert w_k\Vert _{L^2(\mathbb {R},\mathbb {R})} \le \frac{1}{|\lambda _k|} \Vert h_k\Vert _{L^2(\mathbb {R},\mathbb {R})} \end{aligned}$$

because

$$\begin{aligned} \Vert w_k\Vert _{L^2(\mathbb {R},\mathbb {R})}^2&= \int _{-\infty }^\infty \left| \int _t^\infty h_k(s)\textrm{e}^{\lambda _k(t-s)}\, \textrm{d}s\right| ^2\, \textrm{d}t\\&\le \int _{-\infty }^\infty \int _t^\infty \textrm{e}^{\lambda _k(t-s)}\, \textrm{d}s\int _t^\infty \textrm{e}^{\lambda _k(t-s)}|h_k(s)|^2 \, \textrm{d}s\, \textrm{d}t\\&= \frac{1}{\lambda _k} \int _{-\infty }^\infty \int _t^\infty \textrm{e}^{\lambda _k(t-s)}|h_k(s)|^2 \, \textrm{d}s\, \textrm{d}t\\&= \frac{1}{\lambda _k} \int _{-\infty }^\infty \int _{-\infty }^s \textrm{e}^{\lambda _k(t-s)} \, \textrm{d}t\, | h_k(s)|^2 \, \textrm{d}s\\&= \frac{1}{\lambda _k^2} \Vert h_k\Vert _{L^2(\mathbb {R},\mathbb {R})}^2 \end{aligned}$$

for \(k>0\) with a similar calculation for \(k<0\). It follows that

$$\begin{aligned} \Vert w\Vert _{L^2(\mathbb {R},X_2)}^2= & {} \hspace{-0.5pt} \int _{-\infty }^\infty \sum _{k \in {\mathbb Z} \setminus \{0\}} |w_k(t)|^2\, \textrm{d}t\\= & {} \sum _{k \in {\mathbb Z} \setminus \{0\}} \Vert w_k\Vert _{L^2(\mathbb {R},\mathbb {R})}^2\\\le & {} \sum _{k \in {\mathbb Z} \setminus \{0\}} \Vert h_k\Vert _{L^2(\mathbb {R},\mathbb {R})}^2\\= & {} \Vert h\Vert _{L^2(\mathbb {R},X_2)}^2 \end{aligned}$$

and similarly

$$\begin{aligned} \Vert L_2w\Vert _{L^2(\mathbb {R},X_2)}^2= & {} \int _{-\infty }^\infty \sum _{k \in {\mathbb Z} \setminus \{0\}} \lambda _k^2|w_k(t)|^2\, \textrm{d}t\\= & {} \sum _{k \in {\mathbb Z} \setminus \{0\}} \lambda _k^2\Vert w_k\Vert _{L^2(\mathbb {R},\mathbb {R})}^2\\\le & {} \sum _{k \in {\mathbb Z} \setminus \{0\}} \Vert h_k\Vert _{L^2(\mathbb {R},\mathbb {R})}^2\!\\= & {} \Vert h\Vert _{L^2(\mathbb {R},X_2)}^2, \end{aligned}$$

so that w, \(L_2w \in L^2(\mathbb {R},X_2)\). Equation (3.8) shows that w is differentiable, satisfies \(\dot{w} \in L^2(\mathbb {R},X_2)\) and solves the given differential equation.

The uniqueness of the solution follows by noting that Eq. (3.8) has no nontrivial solution in \(L^2(\mathbb {R},\mathbb {R})\) when \(h_k=0\). \(\square \)
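The scalar estimate at the heart of this proof, namely that the decaying solution of (3.8) satisfies \(\Vert w_k\Vert \le |\lambda _k|^{-1}\Vert h_k\Vert \), can be illustrated by discretising a single mode. In the sketch below the value \(\lambda =2\) and the Gaussian forcing term are arbitrary choices of ours:

```python
import math

LAM = 2.0                      # sample positive eigenvalue (arbitrary choice)
T, N = 10.0, 1200              # truncation of the real line and grid size
DT = 2 * T / N
TS = [-T + j * DT for j in range(N + 1)]
H = [math.exp(-t * t) for t in TS]       # sample forcing term h_k

def w_at(i):
    # trapezoidal rule for w(t_i) = -int_{t_i}^{T} h(s) e^{LAM (t_i - s)} ds
    # (the tail of h beyond T is negligible)
    acc = 0.0
    for j in range(i, N):
        acc += 0.5 * (H[j] * math.exp(LAM * (TS[i] - TS[j]))
                      + H[j + 1] * math.exp(LAM * (TS[i] - TS[j + 1]))) * DT
    return -acc

W = [w_at(i) for i in range(N + 1)]

def l2(f):
    return math.sqrt(sum(v * v for v in f) * DT)

ratio = l2(W) / l2(H)          # predicted to be at most 1/LAM = 0.5
residual = max(abs((W[i + 1] - W[i - 1]) / (2 * DT) - LAM * W[i] - H[i])
               for i in range(1, N))     # discrete check of (3.8)
```

The small residual confirms that the variation-of-constants formula does solve (3.8), and the norm ratio respects the bound \(1/\lambda \) used in the proof.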

4 Centre-Manifold Reduction

Our strategy in finding solutions to Hamilton’s equations (2.19)–(2.24) for \((M,\Upsilon ,\hat{H}^\varepsilon )\) consists in applying a reduction principle which asserts that it is locally equivalent to a finite-dimensional Hamiltonian system. The key result is the following theorem due to Mielke [28, 29].

Theorem 4.1

Consider the differential equation

$$\begin{aligned} \dot{u} = \mathcal {L} u + \mathcal {N}(u;\lambda ), \end{aligned}$$
(4.1)

which represents Hamilton’s equations for the reversible Hamiltonian system \((M,\Omega ^\lambda ,H^\lambda )\). Here \(u\) belongs to a Hilbert space \(\mathcal {X}\), \(\lambda \in \mathbb {R}^l\) is a parameter and \(\mathcal {L}:\mathcal {D}(\mathcal {L}) \subset \mathcal {X} \rightarrow \mathcal {X}\) is a densely defined, closed linear operator. Regarding \(\mathcal {D}(\mathcal {L})\) as a Hilbert space equipped with the graph norm, suppose that \(0\) is an equilibrium point of (4.1) when \(\lambda = 0\) and that

  1. (H1)

    The part of the spectrum \(\sigma (\mathcal {L})\) of \(\mathcal {L}\) which lies on the imaginary axis consists of a finite number of eigenvalues of finite multiplicity and is separated from the rest of \(\sigma (\mathcal {L})\) in the sense of Kato, so that \(\mathcal {X}\) admits the decomposition \(\mathcal {X} = \mathcal {X}_1 \oplus \mathcal {X}_2\), where \(\mathcal {X}_1 = \mathcal {P}(\mathcal {X})\), \(\mathcal {X}_2 = (I-\mathcal {P})(\mathcal {X})\) are the centre and hyperbolic subspaces of \({\mathcal L}\) defined by the spectral projection \({\mathcal P}\) corresponding to the purely imaginary part of \(\sigma (\mathcal {L})\).

  2. (H2)

    The operator \(\mathcal {L}_2 = \mathcal {L}|_{\mathcal {X}_2}\) has \(L^2\)-maximal regularity in the sense that the differential equation

    $$\begin{aligned} \dot{u}_2 = \mathcal {L}_2 u_2 + h \end{aligned}$$

    admits a unique solution \(u_2 \in H^1({\mathbb R},{\mathcal X}_2) \cap L^2({\mathbb R},{\mathcal D}({\mathcal L}_2))\) for each \(h \in L^2({\mathbb R},{\mathcal X}_2)\).

  3. (H3)

    There exist a natural number \(k\) and neighbourhoods \(\Lambda \subset \mathbb {R}^l\) of \(0\) and \(U \subset \mathcal {D}(\mathcal {L})\) of \(0\) such that \(\mathcal {N}\) is \((k+1)\) times continuously differentiable on \(U \times \Lambda \), its derivatives are bounded and uniformly continuous on \(U \times \Lambda \) and \(\mathcal {N}(0,0) = 0, \textrm{d}_1\mathcal {N}[0,0] = 0\).

Under these hypotheses, there exist neighbourhoods \(\tilde{\Lambda }\subset \Lambda \) of \(0\) and \(\tilde{U}_1\subset U \cap \mathcal {X}_1\), \(\tilde{U}_2\subset U \cap \mathcal {X}_2\) of \(0\) and a reduction function \(r:\tilde{U}_1 \times \tilde{\Lambda }\rightarrow \tilde{U}_2\) with the following properties. The reduction function \(r\) is \(k\) times continuously differentiable on \(\tilde{U}_1 \times \tilde{\Lambda }\) and \(r(0;0) = 0\), \(\textrm{d}_1r[0;0]= 0\). The graph \(\tilde{M}^\lambda = \{u_1+r(u_1;\lambda ) \in \mathcal {X}_1 \oplus \mathcal {X}_2: u_1 \in \tilde{U}_1\}\) is a Hamiltonian centre manifold for (4.1), so that

  1. (i)

    \(\tilde{M}^\lambda \) is a locally invariant manifold of (4.1): through every point in \(\tilde{M}^\lambda \), there passes a unique solution of (4.1) that remains on \(\tilde{M}^\lambda \) as long as it remains in \(\tilde{U}_1 \times \tilde{U}_2\).

  2. (ii)

    Every bounded solution \(u(x)\), \(x\in \mathbb {R}\), of (4.1) that satisfies \((u_1(x),u_2(x)) \in \tilde{U}_1 \times \tilde{U}_2\) lies completely in \(\tilde{M}^\lambda \).

  3. (iii)

    Every solution \(u_1:(x_1,x_2) \rightarrow \tilde{U}_1\) of the reduced equation

    $$\begin{aligned} \dot{u}_1 = \mathcal {L} u_1 + \tilde{\mathcal {N}}^\lambda (u_1), \end{aligned}$$
    (4.2)

    where \(\tilde{\mathcal {N}}^\lambda (u_1)={\mathcal P}\mathcal {N}(u_1+r(u_1;\lambda );\lambda )\), generates a solution

    $$\begin{aligned} u(x) = u_1(x) + r(u_1(x);\lambda ) \end{aligned}$$

    of the full equation (4.1).

  4. (iv)

    \(\tilde{M}^\lambda \) is a symplectic submanifold of \(M\) and the flow determined by the Hamiltonian system \((\tilde{M}^\lambda ,\tilde{\Omega }^\lambda ,\tilde{H}^\lambda )\), where the tilde denotes restriction to \(\tilde{M}^\lambda \), coincides with the flow on \(\tilde{M}^\lambda \) determined by \((M,\Omega ^\lambda ,H^\lambda )\). The reduced equation (4.2) is reversible and represents Hamilton’s equations for \((\tilde{M}^\lambda ,\tilde{\Omega }^\lambda ,\tilde{H}^\lambda )\).

Remarks 4.2

  1. (i)

    We find that

    $$\begin{aligned} \tilde{H}^\lambda (u_1)&= H^\lambda (u_1+r(u_1;\lambda )), \nonumber \\ \tilde{\Omega }^\lambda |_{u_1}(v_1,v_2)&= \Omega ^0|_0(v_1 + \textrm{d}_1r[u_1;\lambda ](v_1), v_2 + \textrm{d}_1r[u_1;\lambda ](v_2)) \nonumber \\&= \Omega ^0|_0(v_1,v_2) + O(|(\lambda ,u_1)|) \end{aligned}$$
    (4.3)

    as \((\lambda ,u_1) \rightarrow 0\). Using a parameter-dependent version of Darboux’s theorem (e.g. see Buffoni and Groves [4]), we may assume that the remainder term in (4.3) vanishes identically.

  2. (ii)

    Substituting \(u=u_1+r(u_1;\lambda )\) into (4.1) and eliminating \(\dot{u}_1\) using (4.2) leads to the equation

    $$\begin{aligned} {\mathcal L} r(u_1;\lambda ) - \textrm{d}_1r[u_1;\lambda ]({\mathcal L}u_1) = \tilde{\mathcal {N}}^\lambda (u_1) + \textrm{d}_1r[u_1;\lambda ](\tilde{\mathcal {N}}^\lambda (u_1)) - \mathcal {N}(u_1+r(u_1;\lambda );\lambda ), \end{aligned}$$

    which can be used to recursively determine the terms in the Taylor series of \(r(u_1;\lambda )\) and \(\tilde{\mathcal {N}}^\lambda (u_1)\).
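The recursion in this remark can be illustrated on a toy system (not the hydroelastic problem itself): take a scalar centre direction \(\dot{x}=0\) and a scalar hyperbolic direction \(\dot{y}=-y+x^2+xy\), so that \({\mathcal L}_1=0\), \({\mathcal L}_2=-1\), \(\tilde{\mathcal {N}}=0\) and the homological equation reduces to \(-r(x)=-(x^2+xr(x))\), with exact solution \(r(x)=x^2/(1-x)\). A sketch of the order-by-order computation using sympy (all names here are ours, chosen for illustration):

```python
import sympy as sp

x = sp.symbols('x')

# Toy centre-manifold recursion: centre direction x' = 0, hyperbolic
# direction y' = -y + x^2 + x*y.  On y = r(x) the homological equation
# L r - d r [L_1 x] = -N(x, r) becomes -r(x) = -(x^2 + x*r(x)).
order = 6
coeffs = [sp.symbols(f'a{k}') for k in range(2, order)]
r = sum(a * x**k for a, k in zip(coeffs, range(2, order)))

residual = sp.expand(-r + x**2 + x * r)           # must vanish to O(x^order)
eqs = [residual.coeff(x, k) for k in range(2, order)]
sol = sp.solve(eqs, coeffs)                       # determines a2, a3, ... in turn

r_series = r.subs(sol)
# agrees with the closed form r(x) = x^2/(1 - x) to the computed order
exact = sp.series(x**2 / (1 - x), x, 0, order).removeO()
assert sp.simplify(r_series - exact) == 0
```

Each coefficient is determined by the previously computed ones, exactly as in the Taylor-series computation for \(r(u_1;\lambda )\) described above.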

We proceed by choosing \((\beta _0(s),\alpha _0(s)) \in C\), setting \((\varepsilon _1,\varepsilon _2)=(\mu ,0)\), and applying Theorem 4.1 to \((M,\Upsilon ,H^\varepsilon )\). Hypothesis (H3) is clearly satisfied for any natural number \(k\), and we henceforth refer to functions which are continuously differentiable an arbitrary but fixed number of times as ‘smooth’. The spectral theory in Sect. 3 shows that (H1) and (H2) are also satisfied; indeed, the (complexified) four-dimensional centre subspace of \(L\) is spanned by the generalised eigenvectors

$$\begin{aligned} E = \tau _1^{-1/2}\,e, \quad \bar{E} = \tau _1^{-1/2}\,\bar{e},\quad F = \tau _1^{-1/2}\Big (f-\frac{\textrm{i}\tau _2}{2\tau _1}\, e\Big ), \quad \bar{F}= \tau _1^{-1/2}\Big (\bar{f}+\frac{\textrm{i}\tau _2}{2\tau _1}\, \bar{e}\Big ), \end{aligned}$$

where

$$\begin{aligned} \tau _1&= -s\coth (s)+\frac{3\sinh (s)\cosh (s)}{2s}-\frac{1}{2}>0, \\ \tau _2&= -\frac{\sinh (2s)}{2s^2}+\frac{4s}{3}-\frac{1}{2s}-\frac{3\cosh (2s)}{2s}+\frac{s+\sinh (2s)}{\sinh ^2(s)}, \nonumber \end{aligned}$$
(4.4)

so that the centre and hyperbolic subspaces of L are, respectively,

$$\begin{aligned} X_1=\{A E + B F + \bar{A}\bar{E} + \bar{B}\bar{F}: A, B \in {\mathbb C}\}, \qquad X_2=\bigg \{\sum _{k \in {\mathbb Z}\setminus \{0\}}\beta _k f_k: \{\beta _k\} \in \ell ^2\bigg \}. \end{aligned}$$

The vectors are normalised such that \((L-\textrm{i}sI)E = 0\), \((L-\textrm{i}sI)F = E\) with \(SE=\bar{E}\), \(SF=-\bar{F}\), and

$$\begin{aligned} \Upsilon |_0(E,\bar{F}) = \Upsilon |_0(\bar{E}, F) = 1, \qquad \Upsilon |_0(\bar{F},E) = \Upsilon |_0(F,\bar{E}) = -1 \end{aligned}$$

and the symplectic product of any other combination of the vectors \(E,F,\bar{E}, \bar{F}\) is zero (so that \(\{E,F,\bar{E}, \bar{F}\}\) is a symplectic basis for the centre subspace of \(L\)). Writing

$$\begin{aligned} u_1 = AE + BF + \bar{A} \bar{E} + \bar{B} \bar{F}, \end{aligned}$$

we, therefore, find that \(A,B\) are canonical coordinates for the reduced Hamiltonian system (see Remark 4.2(i)), which can therefore be written as

$$\begin{aligned} A_x&= \frac{\partial \tilde{H}^\mu }{\partial \bar{B}}, \qquad B_x = -\frac{\partial \tilde{H}^\mu }{\partial \bar{A}} \end{aligned}$$

(with a slight abuse of notation we abbreviate \(\tilde{H}^\varepsilon |_{(\varepsilon _1,\varepsilon _2)=(\mu ,0)}\) to \(\tilde{H}^\mu \)); this system is reversible with reverser \(S:(A,B) \rightarrow (\bar{A}, -\bar{B})\). Note that the quadratic, parameter-independent part of the Hamiltonian is

$$\begin{aligned} H_2^0(A,B,\bar{A},\bar{B}) = \textrm{i}s(A\bar{B}-\bar{A}B)+|B|^2, \end{aligned}$$

so that in coordinates

$$\begin{aligned} L\begin{pmatrix}A \\ B \\ \bar{A} \\ \bar{B} \end{pmatrix} = \begin{pmatrix} \textrm{i}s & 1 & 0 & 0 \\ 0 & \textrm{i}s & 0 & 0 \\ 0 & 0 & -\textrm{i}s & 1 \\ 0 & 0 & 0 & -\textrm{i}s \end{pmatrix} \begin{pmatrix}A \\ B \\ \bar{A} \\ \bar{B} \end{pmatrix}. \end{aligned}$$
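This matrix can be recovered directly from \(H_2^0\) via the canonical equations \(A_x = \partial H/\partial \bar{B}\), \(B_x = -\partial H/\partial \bar{A}\); a quick symbolic check with sympy, treating \(A\), \(B\) and their conjugates as independent variables:

```python
import sympy as sp

A, B, Ab, Bb, s = sp.symbols('A B Abar Bbar s')

# quadratic, parameter-independent part of the Hamiltonian, |B|^2 = B*Bbar
H2 = sp.I * s * (A * Bb - Ab * B) + B * Bb

# canonical equations A_x = dH/dBbar, B_x = -dH/dAbar
Ax = sp.diff(H2, Bb)
Bx = -sp.diff(H2, Ab)

# reproduces the first two rows of the matrix representation of L
assert sp.expand(Ax - (sp.I * s * A + B)) == 0
assert sp.expand(Bx - sp.I * s * B) == 0
```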

The next step is to use a normal-form transform to simplify the Hamiltonian. For this purpose we use the following result due to Elphick [7].

Lemma 4.3

Let \(n_0 \ge 2\). There exists a near-identity, canonical change of variables which transforms the Hamiltonian to

$$\begin{aligned} \textrm{i}s(A\bar{B}-\bar{A}B)+|B|^2 + H_\textrm{NF}^\mu (A,B,\bar{A},\bar{B}) + O(|(A,B)|^2|(\mu ,A,B)|^{n_0}), \end{aligned}$$

where the complexification of \(H_\textrm{NF}^\mu \) lies in \(\ker {\mathcal L}_{L^*}\), and \({\mathcal L}_{M^*}: {\mathbb C}[Z] \rightarrow {\mathbb C}[Z]\) is defined by

$$\begin{aligned} ({\mathcal L}_{M^*}p)(Z)=M^*Z\cdot \nabla p(Z) \end{aligned}$$

for \(M \in {\mathbb C}^{4 \times 4}\), where the coefficients of the polynomials in the complex polynomial rings depend upon \(\mu \) and the gradient is taken with respect to \(Z=(A,B,\bar{A},\bar{B})\).

We proceed by characterising \(\ker {\mathcal L}_{L^*}\) using the following lemma, whose statements are obtained, respectively, from results of Murdock [31, Lemma 3.4.8], Malonza [26, Lemma 4, Theorem 9] and Billera et al. [2, Section 4]. Corollary 4.5 takes into account that \(H_\textrm{NF}^\mu \) is real valued.

Lemma 4.4

Let \(S=\textrm{diag}(\textrm{i}s,\textrm{i}s,-\textrm{i}s,-\textrm{i}s)\) and \(N=L-S\).

  1. (i)

    The kernel of \({\mathcal L}_{L^*}: {\mathbb C}[Z] \rightarrow {\mathbb C}[Z]\) is given by \(\ker {\mathcal L}_{L^*} = \ker {\mathcal L}_{N^*} \cap \ker {\mathcal L}_{S^*}\).

  2. (ii)

    The kernel of \({\mathcal L}_{N^*}\) is given by \(\ker {\mathcal L}_{N^*}={\mathbb C}[A,\bar{A},A\bar{B}-\bar{A}B]\).

  3. (iii)

    The kernel of \({\mathcal L}_{S^*}\) is given by \(\ker {\mathcal L}_{S^*}={\mathbb C}[A\bar{A},A\bar{B},B\bar{A},B\bar{B}]\).

Corollary 4.5

The kernel of \({\mathcal L}_{L^*}: {\mathbb C}[Z] \rightarrow {\mathbb C}[Z]\) is given by \({\mathbb C}[|A|^2,\textrm{i}(A\bar{B}-\bar{A}B)]\) and \(H_\textrm{NF}^\mu \in {\mathbb R}[|A|^2,\textrm{i}(A\bar{B}-\bar{A}B)]\).
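Membership of the generators \(|A|^2\) and \(\textrm{i}(A\bar{B}-\bar{A}B)\) in \(\ker {\mathcal L}_{L^*}\) can be verified directly from the definition \(({\mathcal L}_{M^*}p)(Z)=M^*Z\cdot \nabla p(Z)\); a sketch using sympy (the helper name `homological` is ours):

```python
import sympy as sp

s = sp.symbols('s', real=True)
A, B, Ab, Bb = sp.symbols('A B Abar Bbar')
Z = sp.Matrix([A, B, Ab, Bb])

L = sp.Matrix([[sp.I*s, 1, 0, 0],
               [0, sp.I*s, 0, 0],
               [0, 0, -sp.I*s, 1],
               [0, 0, 0, -sp.I*s]])
Lstar = L.H  # conjugate transpose of L

def homological(p):
    """(L_{L*}p)(Z) = L* Z . grad p(Z), with A, B, Abar, Bbar independent."""
    grad = sp.Matrix([sp.diff(p, v) for v in Z])
    return sp.expand(((Lstar * Z).T * grad)[0, 0])

p1 = A * Ab                        # |A|^2
p2 = sp.I * (A * Bb - Ab * B)      # i(A Bbar - Abar B)

assert homological(p1) == 0
assert homological(p2) == 0
# a monomial outside the kernel for comparison:
assert homological(A * Bb) != 0
```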

Writing the transformed reduced system as

$$\begin{aligned} {u_1}_x = Lu_1 + P^\mu (u_1), \end{aligned}$$

where

$$\begin{aligned} u_1&= AE + BF + \bar{A} \bar{E} + \bar{B} \bar{F}, \\ P^\mu (u_1)&=\partial _{\bar{B}} \tilde{H}^\mu (A, B,\bar{A}, \bar{B})E -\partial _{\bar{A}} \tilde{H}^\mu (A, B, \bar{A}, \bar{B})F \\&\quad \qquad +\partial _B \tilde{H}^\mu (A, B, \bar{A}, \bar{B}) \bar{E} - \partial _A \tilde{H}^\mu (A,B,\bar{A}, \bar{B}) \bar{F}, \end{aligned}$$

we can compute the Taylor series of \(r(u_1;\mu )\) and \(P^\mu (u_1)\), and hence of \(\tilde{H}^\mu (A,B,\bar{A},\bar{B})\), recursively using the equation

$$\begin{aligned} Lr(u_1;\mu ) - \textrm{d}_1r[u_1;\mu ](Lu_1) = P^\mu (u_1) + \textrm{d}_1r[u_1;\mu ](P^\mu (u_1)) - N^\mu (u_1+r(u_1;\mu )) \end{aligned}$$
(4.5)

(see Remark 4.2(ii)), where with a slight abuse of notation we have applied the near-identity normal-form transformation to the reduction function. Corollary 4.5 states that there are real constants \(c_1\), \(c_2\), \(d_1\), \(d_2\), \(d_3\) such that

$$\begin{aligned} \tilde{H}_2^1(A,B,\bar{A}, \bar{B},0)&= c_1|A|^2+c_2\textrm{i}(A\bar{B}-\bar{A}B), \\ \tilde{H}_3^0(A,B,\bar{A}, \bar{B},0)&= 0, \\ \tilde{H}_4^0(A,B,\bar{A}, \bar{B},0)&=d_1|A|^4 + d_2\textrm{i}(A\bar{B}-\bar{A}B)|A|^2 -d_3(A\bar{B}-\bar{A}B)^2, \end{aligned}$$

where \(\mu ^j\tilde{H}_k^j(A,B,\bar{A}, \bar{B})\) denotes the part of the Taylor expansion of \(\tilde{H}^\mu (A,B,\bar{A}, \bar{B})\) which is homogeneous of order \(j\) in \(\mu \) and \(k\) in \((A,B,\bar{A}, \bar{B})\). The coefficients \(c_1\) and \(d_1\), whose values are required in Sect. 5 below, are computed in Appendix B; we find that \(c_1<0\) and there exists a critical value \(s^\star \) of s such that \(d_1>0\) for \(s<s^\star \), which we now assume.

5 Homoclinic Solutions

In this section, we examine the reduced Hamiltonian system

$$\begin{aligned} A_x&= \partial _{\bar{B}}\tilde{H}^\mu (A,B,\bar{A}, \bar{B})\nonumber \\&=\textrm{i}sA+B+\partial _{\bar{B}}\tilde{H}^\mu _{\textrm{NF}}(|A|^2,\textrm{i}(A\bar{B}-\bar{A} B),\mu )+\underline{O}(|(A,B)||(\mu ,A,B)|^{n_0}),\nonumber \\ \end{aligned}$$
(5.1)
$$\begin{aligned} B_x&= -\partial _{\bar{A}}\tilde{H}^\mu (A,B,\bar{A}, \bar{B})\nonumber \\&=\textrm{i}s B -\partial _{\bar{A}}\tilde{H}^\mu _{\textrm{NF}}(|A|^2,\textrm{i}(A\bar{B}-\bar{A} B),\mu )+\underline{O}(|(A,B)||(\mu ,A,B)|^{n_0}), \end{aligned}$$
(5.2)

where the underscore indicates that the order-of-magnitude estimate remains valid when formally differentiated with respect to \((A,B)\). The truncated system without the remainder terms was examined in detail by Iooss and Pérouème [19], who also studied the ‘persistence’ of certain solutions as solutions to the full system. Here, we present an alternative, functional-analytic proof of the existence of two reversible homoclinic solutions to (5.1), (5.2).

We begin by returning to real coordinates \(q=(q_1,q_2)^\textrm{T}\), \(p=(p_1,p_2)^\textrm{T}\) given by

$$\begin{aligned} A=\frac{1}{\sqrt{2}}(q_1+\textrm{i}q_2), \qquad B=\frac{1}{\sqrt{2}}(p_1+\textrm{i}p_2) \end{aligned}$$

and, hence, obtaining the real Hamiltonian system

$$\begin{aligned} q_x&= \frac{\partial \tilde{H}^\mu }{\partial p} = p +sR_{\frac{\pi }{2}}q+\overbrace{\partial _2\tilde{H}_{\textrm{NF}}^\mu (\tfrac{1}{2}|q|^2,p\!\cdot \!R_{\frac{\pi }{2}}q)R_{\frac{\pi }{2}}q}^{\displaystyle :=P_1^\mu (q,p)}+R_1^\mu (q,p), \end{aligned}$$
(5.3)
$$\begin{aligned} p_x&= -\frac{\partial \tilde{H}^\mu }{\partial q} = sR_{\frac{\pi }{2}}p\underbrace{-\partial _1\tilde{H}_{\textrm{NF}}^\mu (\tfrac{1}{2}|q|^2,p\!\cdot \! R_{\frac{\pi }{2}}q)q+\partial _2\tilde{H}_{\textrm{NF}}^\mu (\tfrac{1}{2}|q|^2,p\!\cdot \! R_{\frac{\pi }{2}}q)R_{\frac{\pi }{2}}p}_{\displaystyle :=P_2^\mu (q,p)} + R_2^\mu (q,p), \end{aligned}$$
(5.4)

in which

$$\begin{aligned} \tilde{H}^\mu (q,p) = \frac{1}{2}|p|^2+sp\!\cdot \! R_{\frac{\pi }{2}}q+\tilde{H}_{\textrm{NF}}^\mu (\tfrac{1}{2}|q|^2,p\!\cdot \! R_{\frac{\pi }{2}}q,\mu )+O(|(q,p)|^2|(\mu ,q,p)|^{n_0}), \end{aligned}$$

so that \(P_1^\mu (q,p)\), \(P_2^\mu (q,p)\) are polynomials in \(\mu \), q and p and

$$\begin{aligned} R_1^\mu (q,p),\ R_2^\mu (q,p)=\underline{O}(|(q,p)||(\mu ,q,p)|^{n_0}). \end{aligned}$$

Note that this system is reversible with reverser \(S:(q_1,p_1,q_2,p_2)\mapsto (q_1,-p_1,-q_2,p_2)\) and that

$$\begin{aligned} R_\theta P_1^\mu (q,p) = P_1^\mu (R_\theta q, R_\theta p), \qquad R_\theta P_2^\mu (q,p) = P_2^\mu (R_\theta q, R_\theta p) \end{aligned}$$

for all \(\theta \in [0,2\pi )\), where \(R_\theta \) is the matrix representing a rotation through the angle \(\theta \).

The next step is to recast equations (5.3), (5.4) as a single second-order equation. Writing

$$\begin{aligned} p=q_x-s R_{\frac{\pi }{2}} q+v, \end{aligned}$$

we find from equation (5.3) that

$$\begin{aligned} v+P_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v) + R_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v)=0, \end{aligned}$$
(5.5)

and using the implicit-function theorem, we now construct a solution of (5.5) of the form

$$\begin{aligned} v=v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)+v_2^\mu (q,q_x-s R_{\frac{\pi }{2}} q), \end{aligned}$$

where \(v_1^\mu \) solves the truncated equation with \(R_1^\mu =0\) and takes the particular form

$$\begin{aligned} v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)=w_1^\mu (|q|^2,R_{\frac{\pi }{2}}q\! \cdot \!(q_x-s R_{\frac{\pi }{2}} q))R_{\frac{\pi }{2}} q. \end{aligned}$$
(5.6)
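The scalar equation (5.7) satisfied by \(w_1\) can be motivated directly from the ansatz (5.6): since \(R_{\frac{\pi }{2}}q\cdot R_{\frac{\pi }{2}}q=|q|^2\), both terms of the truncated version of (5.5) become multiples of \(R_{\frac{\pi }{2}}q\). A sketch of the substitution:

```latex
% with p = q_x - sR_{\pi/2}q + w_1 R_{\pi/2}q one has
%   p \cdot R_{\pi/2}q = R_{\pi/2}q \cdot (q_x - sR_{\pi/2}q) + w_1|q|^2,
% so that
v_1 + P_1^\mu\big(q,\,q_x - sR_{\frac{\pi}{2}}q + v_1\big)
  = \Big[\,w_1 + \partial_2\tilde{H}_{\mathrm{NF}}^\mu\big(\tfrac{1}{2}|q|^2,\;
      R_{\frac{\pi}{2}}q\cdot(q_x - sR_{\frac{\pi}{2}}q) + w_1|q|^2\big)\Big]
    R_{\frac{\pi}{2}}q,
% and the vanishing of the bracket is precisely equation (5.7) for w_1.
```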

Note that \(w_1^\mu \) necessarily solves

$$\begin{aligned} w_1+\partial _2\tilde{H}_{\textrm{NF}}^\mu \big (\tfrac{1}{2}|q|^2,R_{\frac{\pi }{2}}q\!\cdot \!(q_x-s R_{\frac{\pi }{2}} q)+w_1|q|^2\big )=0, \end{aligned}$$
(5.7)

while \(v_2^\mu \) necessarily solves

$$\begin{aligned}{} & {} v_2+P_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)+v_2)\nonumber \\{} & {} \quad -P_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)) \nonumber \\{} & {} \quad +R_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)+v_2)=0. \end{aligned}$$
(5.8)

Proposition 5.1

  1. (i)

    Equation (5.7) has a unique solution \(w_1=w_1^\mu (|q|^2,R_{\frac{\pi }{2}}q\! \cdot \!(q_x-s R_{\frac{\pi }{2}} q))\) which depends analytically upon \(\mu \), \(|q|^2\) and \(R_{\frac{\pi }{2}}q\! \cdot \!(q_x-s R_{\frac{\pi }{2}} q)\) and satisfies \(w_1^0(0,0)=0\). The function \(v_1^\mu \) defined by (5.6) satisfies

    $$\begin{aligned} v_1^\mu +P_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v_1^\mu )=0 \end{aligned}$$

    and

    $$\begin{aligned} R_\theta v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)= v_1^\mu (R_\theta q,R_\theta (q_x-s R_{\frac{\pi }{2}} q)) \end{aligned}$$

    for all \(\theta \in [0,2\pi )\).

  2. (ii)

    Equation (5.8) has a unique solution \(v_2=v_2^\mu (q,q_x-s R_{\frac{\pi }{2}} q)\) which depends smoothly upon \(\mu \), q and \(q_x-s R_{\frac{\pi }{2}} q\) and satisfies

    $$\begin{aligned} v_2^\mu (q,q_x-s R_{\frac{\pi }{2}} q)=\underline{O}(|(q,q_x-s R_{\frac{\pi }{2}} q)||(\mu ,q,q_x-s R_{\frac{\pi }{2}} q)|^{n_0}). \end{aligned}$$

Substituting

$$\begin{aligned} p=q_x-s R_{\frac{\pi }{2}} q+v_1^\mu +v_2^\mu \end{aligned}$$

into Eq. (5.4), where we have omitted the arguments of \(v_1^\mu \), \(v_2^\mu \) for notational simplicity, shows that

$$\begin{aligned} (\partial _x - s R_{\frac{\pi }{2}})^2 q=-(\partial _x - s R_{\frac{\pi }{2}})(v_1^\mu +v_2^\mu )+\tilde{P}^\mu (q,q_x-s R_{\frac{\pi }{2}} q)+\tilde{R}^\mu (q,q_x-s R_{\frac{\pi }{2}} q), \end{aligned}$$

in which

$$\begin{aligned} \tilde{P}^\mu (q,q_x-s R_{\frac{\pi }{2}} q)&=P_2^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v_1^\mu ), \\ \tilde{R}^\mu (q,q_x-s R_{\frac{\pi }{2}} q)&= P_2^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v_1^\mu +v_2^\mu )-P_2^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v_1^\mu ) \\&\quad \qquad \,+R_2^\mu (q,q_x-s R_{\frac{\pi }{2}} q+v_1^\mu +v_2^\mu ). \end{aligned}$$

It follows that

$$\begin{aligned} \!\!\!\!(\partial _x - s R_{\frac{\pi }{2}})^2 q&=-\partial _1 v_1^\mu (q_x-s R_{\frac{\pi }{2}} q) -\partial _2 v_1^\mu (\partial _x - s R_{\frac{\pi }{2}})^2 q \nonumber \\&\qquad +\tilde{P}^\mu (q,q_x-s R_{\frac{\pi }{2}} q)-\partial _1 v_2^\mu (q_x-s R_{\frac{\pi }{2}} q)\nonumber \\&\qquad -\partial _2 v_2^\mu (\partial _x - s R_{\frac{\pi }{2}})^2 q -\partial _1 v_2^\mu s R_{\frac{\pi }{2}}q\nonumber \\&\qquad -\partial _2 v_2^\mu s R_{\frac{\pi }{2}}(q_x-s R_{\frac{\pi }{2}} q)+ sR_{\frac{\pi }{2}}v_2^\mu +\tilde{R}^\mu (q,q_x-s R_{\frac{\pi }{2}} q), \end{aligned}$$
(5.9)

where \(\partial _j v_k^\mu \) is the matrix \(\textrm{d}_j v_k^\mu [q,q_x-s R_{\frac{\pi }{2}} q]\) and we have used the calculation

$$\begin{aligned} (\partial _x -s&R_{\frac{\pi }{2}}) v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)\\&= (\partial _x -s R_{\frac{\pi }{2}})R_{s x} v_1^\mu (R_{-s x}q,R_{-s x}(q_x-s R_{\frac{\pi }{2}} q)) \\&= R_{s x} \partial _x v_1^\mu (R_{-s x}q,R_{-s x}(q_x-s R_{\frac{\pi }{2}} q))\\&= R_{s x} \partial _1 v_1^\mu (R_{-s x}q,R_{-s x}(q_x-s R_{\frac{\pi }{2}} q)) \partial _x (R_{-s x}q) \\&\qquad + R_{sx} \partial _2 v_1^\mu (R_{-s x}q,R_{-s x}(q_x-s R_{\frac{\pi }{2}} q)) \partial _x(R_{-s x}(q_x-s R_{\frac{\pi }{2}} q)) \\&= R_{s x} \partial _1 v_1^\mu (R_{-s x}q,(q_x-s R_{\frac{\pi }{2}} q)) R_{-s x}(q_x-s R_{\frac{\pi }{2}} q)\\&\qquad +R_{sx}\partial _2 v_1^\mu (R_{-s x}q,(q_x-s R_{\frac{\pi }{2}} q)) R_{-s x}(\partial _x - s R_{\frac{\pi }{2}})^2 q \\&= \partial _1 v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)(q_x-s R_{\frac{\pi }{2}} q) +\partial _2 v_1^\mu (q,q_x-s R_{\frac{\pi }{2}} q)(\partial _x - s R_{\frac{\pi }{2}})^2 q. \end{aligned}$$

Introducing the scaled variables

$$\begin{aligned} q(x)=\delta R_{s x} Q(X), \qquad X=\delta x, \end{aligned}$$

where \(\delta ^2=-c_1\mu \), so that

$$\begin{aligned} q_x-s R_{\frac{\pi }{2}} q = \delta ^2 R_{sx} Q_X(X), \qquad (\partial _x - s R_{\frac{\pi }{2}})^2 q = \delta ^3 R_{sx} Q_{XX}(X), \end{aligned}$$

transforms equation (5.9) into

$$\begin{aligned} Q_{XX} = Q-CQ|Q|^2+T_1^\delta (Q,Q_X)+R_{-s X/\delta }T_2^\delta (R_{s X/\delta }Q,R_{s X/\delta }Q_X,R_{s X/\delta }Q_{XX}), \end{aligned}$$
(5.10)

where \(C=-d_1/c_1\) and

$$\begin{aligned} T_1^\delta (Q,Q_X) = O(\delta |(Q,Q_X)|), \qquad T_2^\delta (Q,Q_X,Q_{XX})=O( \delta ^{n_0-2}|(Q,Q_X,Q_{XX})|). \end{aligned}$$
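The chain-rule identities behind this scaling can be checked symbolically, using \(\frac{\textrm{d}}{\textrm{d}x}R_{sx}=sR_{\frac{\pi }{2}}R_{sx}\); a small sketch with sympy (the helper `rot` is ours):

```python
import sympy as sp

x, s, delta = sp.symbols('x s delta')
f1, f2 = sp.Function('Q1'), sp.Function('Q2')

def rot(t):
    """2x2 rotation through the angle t."""
    return sp.Matrix([[sp.cos(t), -sp.sin(t)], [sp.sin(t), sp.cos(t)]])

Rpi2 = rot(sp.pi / 2)
Q = sp.Matrix([f1(delta * x), f2(delta * x)])   # Q(X) with X = delta*x
q = delta * rot(s * x) * Q                      # q(x) = delta R_{sx} Q(delta x)

# q_x - s R_{pi/2} q = delta^2 R_{sx} Q_X  (note delta*Q.diff(x) = delta^2 Q_X)
first = q.diff(x) - s * Rpi2 * q
assert all(sp.simplify(e) == 0 for e in (first - delta * rot(s * x) * Q.diff(x)))

# (d/dx - s R_{pi/2})^2 q = delta^3 R_{sx} Q_XX
second = first.diff(x) - s * Rpi2 * first
assert all(sp.simplify(e) == 0 for e in (second - delta * rot(s * x) * Q.diff(x, 2)))
```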

Remark 5.2

The various changes of variable preserve the reversibility symmetry, so that equation (5.10) is invariant under the transformation \(X \mapsto -X\), \((Q_1,Q_2) \mapsto (Q_1,-Q_2)\).

Before proving the existence of homoclinic solutions to (5.10) we define the function spaces with which we work and refer to some functional-analytic results which are used in the proof (see Kirchgässner [21, Proposition 5.1]).

Definition 5.3

Suppose that \(k \in \mathbb {N}_{0}\) and \(\nu \ge 0\). Define

$$\begin{aligned} C_{\nu }^{k}(\mathbb {R}) = \{f \in C^{k}(\mathbb {R}) :\Vert f\Vert _{k,\nu } < \infty \}, \qquad \Vert f\Vert _{k,\nu }:= \sup _{t \in \mathbb {R}}\sum _{j=0}^{k}|f^{(j)}(t)|\textrm{e}^{\nu |t|} \end{aligned}$$

and their subspaces

$$\begin{aligned} C_{\nu ,\textrm{e}}^{k}= & {} \{f \in C_{\nu }^{k}(\mathbb {R}) :f(-t) = f(t), \ t \in \mathbb {R}\}, \\ C_{\nu ,\textrm{o}}^{k}= & {} \{f \in C_{\nu }^{k}(\mathbb {R}) :f(-t) = -f(t), \ t \in \mathbb {R}\}. \end{aligned}$$

In the case \(k=0\) we just write \(C_{\nu }(\mathbb {R}), \ C_{\nu ,\textrm{e}}(\mathbb {R})\) and \(C_{\nu ,\textrm{o}}(\mathbb {R})\).

Proposition 5.4

  1. (i)

    The formula

    $$\begin{aligned} K \begin{pmatrix} z_{1} \\ z_{2} \end{pmatrix} = \begin{pmatrix} z_{1XX}-z_{1} \\ z_{2XX}-z_{2} \end{pmatrix} \end{aligned}$$

    defines a bounded linear operator \(C_{\nu }^{2}(\mathbb {R})^{2} \rightarrow C_{\nu }(\mathbb {R})^{2}\) and \(C_{\nu ,\textrm{e}}^{2}(\mathbb {R}) \times C_{\nu ,\textrm{o}}^{2}(\mathbb {R}) \rightarrow C_{\nu ,\textrm{e}}(\mathbb {R}) \times C_{\nu ,\textrm{o}}(\mathbb {R})\) for each \(\nu \ge 0\).

  2. (ii)

    For \(0 \le \nu < 1\) the operator \(K :C_{\nu }^{2}(\mathbb {R})^{2} \rightarrow C_{\nu }(\mathbb {R})^{2}\) is invertible with bounded inverse given by

    $$\begin{aligned} (K^{-1}f)(t) = -\frac{1}{2}\int _{-\infty }^{\infty }\textrm{e}^{-|t-s|}f(s) \; \textrm{d}s, \end{aligned}$$

    where the integration is taken componentwise.

  3. (iii)

    Suppose that \(C>0\), \(h \in C_{1}(\mathbb {R})\) and \(0 \le \nu < 1\). The formula

    $$\begin{aligned} K_{h}z = K^{-1}\begin{pmatrix} -3Ch^{2}z_{1} \\ -Ch^{2}z_{2} \end{pmatrix} \end{aligned}$$

    defines a bounded linear operator \(C_{0}(\mathbb {R})^{2} \rightarrow C_{\nu }^{2}(\mathbb {R})^{2}\) and a compact operator \(C_{\nu }(\mathbb {R})^{2} \rightarrow C_{\nu }(\mathbb {R})^{2}\).
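The formula for \(K^{-1}\) in part (ii) says that \(-\tfrac{1}{2}\textrm{e}^{-|t-s|}\) is the decaying Green's function of \(\partial _t^2-1\); this can be spot-checked symbolically for a particular right-hand side (a sketch, with the Gaussian \(f(s)=\textrm{e}^{-s^2}\) chosen arbitrarily):

```python
import sympy as sp

t, s = sp.symbols('t s', real=True)
f = sp.exp(-s**2)

# u(t) = -(1/2) * integral of e^{-|t-s|} f(s), split at the kink s = t
u = -sp.Rational(1, 2) * (
    sp.integrate(sp.exp(-(t - s)) * f, (s, -sp.oo, t))
    + sp.integrate(sp.exp(-(s - t)) * f, (s, t, sp.oo))
)

# u should satisfy u'' - u = f
residual = sp.simplify(sp.diff(u, t, 2) - u - f.subs(s, t))
assert residual == 0
```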

Theorem 5.5

For each \(\nu \in (0,1)\) and each sufficiently small value of \(\delta >0\) equation (5.10) has two homoclinic solutions \(Q^{\delta \pm }\) which are symmetric, that is invariant under the transformation \((Q_1(X),Q_2(X)) \mapsto (Q_1(-X),-Q_2(-X))\), and satisfy the estimate

$$\begin{aligned} Q^{\delta \pm }(X) = \pm \begin{pmatrix} h(X) \\ 0 \end{pmatrix} + O(\delta \textrm{e}^{-\nu |X|}) \end{aligned}$$

for all \(X \in {\mathbb R}\).

Proof

For \(\delta = 0\) equation (5.10) has the family

$$\begin{aligned} \left\{ (Q_1,Q_2)^\textrm{T} = R_\theta (h(X_{0} + \cdot ),0)^\textrm{T} :\theta \in [0,2\pi ), \ X_{0} \in \mathbb {R}\right\} \end{aligned}$$

of homoclinic solutions, where

$$\begin{aligned} h(X) = \Big (\frac{2}{C}\Big )^{1/2}\textrm{sech}(X). \end{aligned}$$
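That \(h\) solves the truncated equation (5.10) with \(\delta =0\) and \(Q_2=0\) reduces to the elementary identity \(h_{XX}=h-Ch^3\); a sympy check:

```python
import sympy as sp

X, C = sp.symbols('X C', positive=True)
h = sp.sqrt(2 / C) / sp.cosh(X)     # h(X) = (2/C)^{1/2} sech(X)

# verify h'' = h - C h^3
residual = sp.simplify(sp.diff(h, X, 2) - h + C * h**3)
assert residual == 0
```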

Two of these solutions, namely those with \((\theta ,X_0)=(0,0)\) and \((\theta ,X_0)=(\pi ,0)\), which we denote by respectively \(Q^+\) and \(Q^-\), are symmetric. We seek a solution of (5.10) in the form of a perturbation of \(Q^+\) by writing

$$\begin{aligned} Q_{1} = h + z_{1}, \qquad Q_{2} = z_{2}, \end{aligned}$$

so that \(z=(z_1,z_2)^\textrm{T}\) satisfies

$$\begin{aligned} z_{1XX} - z_{1}&= -3Ch^{2}z_{1} + r_{1}^{\delta }(z_{1},z_{2},z_{1X},z_{2X},z_{1XX},z_{2XX},X), \end{aligned}$$
(5.11)
$$\begin{aligned} z_{2XX} - z_{2}&= -Ch^{2}z_{2} + r_{2}^{\delta }(z_{1},z_{2},z_{1X},z_{2X},z_{1XX},z_{2XX},X) \end{aligned}$$
(5.12)

with the obvious definitions of \(r_1^\delta \) and \(r_2^\delta \). We study the system (5.11), (5.12) in the space \(C_{\nu }^{2}(\mathbb {R})^{2}\) with fixed \(\nu \in (0,1)\) and, with a slight abuse of notation, consider the nonlinearity \(r^{\delta }=(r_1^\delta ,r_2^\delta )^\textrm{T}\) as a mapping \(C_{\nu }^{2}(\mathbb {R})^{2} \rightarrow C_{\nu }(\mathbb {R})^{2}\) and \(C_{\nu ,\textrm{e}}^{2}(\mathbb {R})\times C_{\nu ,\textrm{o}}^{2}(\mathbb {R}) \rightarrow C_{\nu ,\textrm{e}}(\mathbb {R})\times C_{\nu ,\textrm{o}}(\mathbb {R})\) with

$$\begin{aligned} \Vert r^{\delta }(z_{1},z_{2})\Vert _{0,\nu } = O(\delta ) + \underline{O}_1(\Vert (z_{1},z_{2})\Vert _{2,\nu }^{2}). \end{aligned}$$

In terms of the operators \(K\) and \(K_h\) defined in Proposition 5.4 equations (5.11), (5.12) can thus be written as

$$\begin{aligned} z = K_{h}z + K^{-1}r^{\delta }(z). \end{aligned}$$
(5.13)

The eigenvalue problem

$$\begin{aligned} K_{h}z = z \end{aligned}$$

is equivalent to the decoupled system

$$\begin{aligned} z_{1XX}&= z_{1} - 3Ch^{2}z_{1}, \end{aligned}$$
(5.14)
$$\begin{aligned} z_{2XX}&= z_{2} - Ch^{2}z_{2} \end{aligned}$$
(5.15)

of ordinary differential equations. Let

$$\begin{aligned}&z_{1}^1(X) = {{\,\textrm{sech}\,}}(X)\tanh (X),{} & {} z_{2}^1(X) = {{\,\textrm{sech}\,}}(X), \\&z_{1}^2(X) = {{\,\textrm{sech}\,}}(X)(-3 + \cosh ^{2}(X) + 3X\tanh (X)),{} & {} z_{2}^2(X) = {{\,\textrm{sech}\,}}(X)(2X + \sinh (2X)), \end{aligned}$$

so that \(\{z_{1}^1, z_{1}^2\}\) and \(\{z_{2}^1, z_{2}^2\}\) are fundamental solution sets for, respectively, (5.14) and (5.15). Since \(z_{1}^1, \ z_{2}^1\) are bounded while \(z_{1}^2, \ z_{2}^2\) are unbounded, we conclude that all bounded solutions of equation (5.14) are multiples of \(z_{1}^1 = -(2/C)^{-1/2}h_X\) and all bounded solutions of equation (5.15) are multiples of \(z_{2}^1 = (2/C)^{-1/2}h\). The eigenspace of \(K_{h} :C_{\nu }(\mathbb {R})^{2} \rightarrow C_{\nu }(\mathbb {R})^{2}\) corresponding to the eigenvalue \(1\) is, therefore,

$$\begin{aligned} \textrm{sp}\left\{ \begin{pmatrix} h_X \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ h \end{pmatrix} \right\} , \end{aligned}$$

which lies in \(C_{\nu ,\textrm{o}}(\mathbb {R}) \times C_{\nu ,\textrm{e}}(\mathbb {R})\). This calculation shows that \(1\) is not an eigenvalue of \(K_{h}|_{C_{\nu ,\textrm{e}}(\mathbb {R}) \times C_{\nu ,\textrm{o}}(\mathbb {R})}\) and since \(K_{h}\) is a compact operator \(C_{\nu }(\mathbb {R})^{2} \rightarrow C_{\nu }(\mathbb {R})^{2}\), one concludes that the spectrum of \(K_{h}|_{C_{\nu ,\textrm{e}}(\mathbb {R}) \times C_{\nu ,\textrm{o}}(\mathbb {R})}\) consists only of eigenvalues, so that \(1\) lies in the resolvent set of \(K_{h}|_{C_{\nu ,\textrm{e}}(\mathbb {R}) \times C_{\nu ,\textrm{o}}(\mathbb {R})}\). It follows that

$$\begin{aligned} I- K_{h} :C_{\nu ,\textrm{e}}(\mathbb {R}) \times C_{\nu ,\textrm{o}}(\mathbb {R}) \rightarrow C_{\nu ,\textrm{e}}(\mathbb {R}) \times C_{\nu ,\textrm{o}}(\mathbb {R}) \end{aligned}$$

is invertible. We can, therefore, solve equation (5.13) for sufficiently small values of \(\delta > 0\) using the implicit-function theorem; the solution \(z_{\star }^\delta \) satisfies \(\Vert z_{\star }^\delta \Vert _{2,\nu } = O(\delta )\).

Returning to equation (5.10), we have found a symmetric solution \(Q^{\delta +}= Q^++z_\star ^\delta \) which satisfies the stated estimate. The second homoclinic solution \(Q^{\delta -}\) is obtained from \(Q^{-}\) by the same procedure. \(\square \)
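The fundamental solution sets quoted in the proof can be verified symbolically (note that \(Ch^2=2\,{{\,\textrm{sech}\,}}^2\)); a sketch:

```python
import sympy as sp

X = sp.symbols('X')
sech, tanh, cosh, sinh = sp.sech, sp.tanh, sp.cosh, sp.sinh

Ch2 = 2 * sech(X)**2     # C h^2 with h = (2/C)^{1/2} sech

z11 = sech(X) * tanh(X)
z12 = sech(X) * (-3 + cosh(X)**2 + 3 * X * tanh(X))
z21 = sech(X)
z22 = sech(X) * (2 * X + sinh(2 * X))

# equation (5.14): z'' = z - 3 C h^2 z
for z in (z11, z12):
    assert sp.simplify((sp.diff(z, X, 2) - z + 3 * Ch2 * z).rewrite(sp.exp)) == 0
# equation (5.15): z'' = z - C h^2 z
for z in (z21, z22):
    assert sp.simplify((sp.diff(z, X, 2) - z + Ch2 * z).rewrite(sp.exp)) == 0
```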

Appendix A: Formal Derivation of the Nonlinear Schrödinger Equation

Writing \(\beta = \beta _0\), \(\alpha = \alpha _0+\delta ^2\) and substituting the formal asymptotic expansions

$$\begin{aligned} \eta (x)&= \delta \eta _1(x,X)+\delta ^2\eta _2(x,X)+\delta ^3\eta _3(x,X) + \cdots , \\ \Psi (x,y)&= \delta \Psi _1(x,X,y) + \delta ^2\Psi _2(x,X,y) + \delta ^3\Psi _3(x,X,y) + \cdots , \end{aligned}$$

where \(X = \delta x\), into Eqs. (1.6)–(1.9) yields the boundary-value problems

$$\begin{aligned}&\Psi _{1xx}+\Psi _{1yy} = 0, \qquad 0<y<1, \end{aligned}$$
(5.16)
$$\begin{aligned}&\Psi _{1y}\big |_{y=0} = 0, \end{aligned}$$
(5.17)
$$\begin{aligned}&\Psi _{1y}+\eta _{1x}\big |_{y=1} = 0, \end{aligned}$$
(5.18)
$$\begin{aligned}&\alpha _0\eta _1-\Psi _{1x}+\beta _0\eta _{1xxxx}\big |_{y=1} = 0 \end{aligned}$$
(5.19)

for \(\Psi _1\),

$$\begin{aligned}&\Psi _{2xx}+\Psi _{2yy} +2\Psi _{1xX} +2\eta _1\Psi _{1xx}-2y\eta _{1x}\Psi _{1xy}-y\Psi _{1y}\eta _{1xx} = 0, \qquad 0<y<1, \nonumber \\ \end{aligned}$$
(5.20)
$$\begin{aligned}&\Psi _{2y}\big |_{y=0} = 0, \end{aligned}$$
(5.21)
$$\begin{aligned}&\Psi _{2y}+\eta _{1X}+\eta _{2x} - \eta _{1x}\Psi _{1x}+\eta _{1x}\eta _1\big |_{y=1} = 0, \end{aligned}$$
(5.22)
$$\begin{aligned}&-\Psi _{2x}-\Psi _{1X}+\alpha _0\eta _2+4\beta _0\eta _{1xxxX} +\beta _0\eta _{2xxxx}+\Psi _{1y}\eta _{1x}+\tfrac{1}{2}\Psi _{1x}^2+\tfrac{1}{2}\Psi _{1y}^2\big |_{y=1} = 0 \end{aligned}$$
(5.23)

for \(\Psi _2\) and

$$\begin{aligned}&\Psi _{3xx}+\Psi _{3yy}+2\Psi _{2xX}+4\eta _1\Psi _{1xX} + 2\eta _1\Psi _{2xx} + 2\eta _2\Psi _{1xx} \nonumber \\&\quad \;-2y\eta _{1x}\Psi _{1Xy}+\Psi _{1XX}-2y\eta _{1x}\Psi _{2xy}-2y\eta _{1X}\Psi _{1xy}\nonumber \\&\quad \;-2y\eta _{2x}\Psi _{1xy}-2y\eta _{1xX}\Psi _{1y}-y\eta _{2xx}\Psi _{1y}-y\eta _{1xx}\Psi _{2y} \nonumber \\&\quad \;+\eta _1^2\Psi _{1xx}+y^2\eta _{1x}^2\Psi _{1yy}-2y\eta _{1}\eta _{1x}\Psi _{1xy}-y\eta _1\eta _{1xx}\Psi _{1y} \nonumber \\&\quad \;+2y\eta _{1x}^2\Psi _{1y} = 0, \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad 0<y<1, \end{aligned}$$
(5.24)
$$\begin{aligned}&\Psi _{3y}\big |_{y=0} = 0, \end{aligned}$$
(5.25)
$$\begin{aligned}&\begin{aligned}&\Psi _{3y}+\eta _{2X}+\eta _{3x} - \eta _{1x}\Psi _{1X} - \eta _{1x}\Psi _{2x}-\eta _{2x}\Psi _{1x} -\eta _{1X}\Psi _{1x}\\&\quad \;+\eta _1\eta _{1X}+\eta _{1x}\eta _2+\eta _1\eta _{2x} -\eta _1\eta _{1x}\Psi _{1x}+y\eta _{1x}^2\Psi _{1y}\big |_{y=1} = 0, \end{aligned} \end{aligned}$$
(5.26)
$$\begin{aligned}&-\Psi _{3x} -\Psi _{2X}+\alpha _0\eta _3+6\beta _0 \eta _{1xxXX}+4\beta _0\eta _{2xxxX}+\beta _0\eta _{3xxxx} \nonumber \\&\quad \;+\eta _{1x}\Psi _{2y}+\eta _{1X}\Psi _{1y}+\eta _{2x}\Psi _{1y} + \Psi _{1x}\Psi _{1X} + \Psi _{1x}\Psi _{2x} \nonumber \\&\quad \;+\Psi _{1y}\Psi _{2y}-\tfrac{5}{2}\beta _0\eta _{1x}^2\eta _{1xxxx} - \eta _1\eta _{1x}\Psi _{1y}-\eta _1\Psi _{1y}^2 \nonumber \\&\quad \;-\eta _{1x}\Psi _{1x}\Psi _{1y}+\eta _1-10\beta _0\eta _{1x}\eta _{1xx}\eta _{1xxx}-\tfrac{5}{2}\beta _0\eta _{1xx}^3\big |_{y=1} = 0 \end{aligned}$$
(5.27)

for \(\Psi _3\). We proceed by making the modulational Ansatz

$$\begin{aligned} \eta _1(x,X)&= A_1(X)\textrm{e}^{\textrm{i}sx}+ \mathrm {c.c.}\,, \\ \eta _2(x,X)&= A_2(X)\textrm{e}^{2\textrm{i}sx}+ \mathrm {c. c.} + A_0(X), \\ \eta _3(x,X)&= A_3(X)\textrm{e}^{3\textrm{i}sx}+A_4(X)\textrm{e}^{2\textrm{i}sx}+ A_5(X)\textrm{e}^{\textrm{i}sx} + \mathrm {c. c.} + A_6(X). \end{aligned}$$
  • From (5.16)–(5.18) it follows that

    $$\begin{aligned} \Psi _{1xx}+\Psi _{1yy}&= 0,{} & {} 0<y<1, \\ \Psi _{1y}\big |_{y=0}&= 0, \\ \Psi _{1y}\big |_{y=1}&= -\textrm{i}sA_1\textrm{e}^{\textrm{i}sx} + \mathrm {c. c.}\;, \end{aligned}$$

    the solution to which is

    $$\begin{aligned} \Psi _1(x,X,y) = -\frac{\textrm{i}\cosh (sy)}{\sinh (s)}A_1\textrm{e}^{\textrm{i}sx} +\mathrm {c. c.} + g_1(X), \end{aligned}$$

    where \(g_1\) is an arbitrary function of a single variable. The equation

    $$\begin{aligned} (\alpha _0+\beta _0s^4)A_1-\Psi _{1x}\big |_{y=1} = 0, \end{aligned}$$

    which follows from (5.19), then recovers the dispersion relation (3.7).

  • From (5.20)–(5.22), it follows that

    $$\begin{aligned}&\Psi _{2xx}+\Psi _{2yy} = \begin{aligned}&-2s\frac{\cosh (sy)}{\sinh (s)}A_{1X}\textrm{e}^{\textrm{i}sx}\\ {}&\qquad \;-\textrm{i}s(3sy\sinh (sy)-2\cosh (sy))A_1^2\textrm{e}^{2\textrm{i}sx} + \mathrm {c. c.}, \qquad 0<y<1,\end{aligned} \\&\Psi _{2y}\big |_{y=0} = 0, \\&\Psi _{2y}\big |_{y=1} = -A_{1X}\textrm{e}^{\textrm{i}sx}+\textrm{i}s\left( \frac{s\cosh (sy)}{\sinh (s)}A_1^2-A_1^2-2A_2\right) \textrm{e}^{2\textrm{i}sx}, \end{aligned}$$

    the solution to which is

    $$\begin{aligned} \Psi _2(x,X,y)&= \left( \frac{\coth (s)}{\sinh (s)}\cosh (sy)-y\frac{\sinh (sy)}{\sinh (s)}\right) A_{1X}\textrm{e}^{\textrm{i}sx} \\&\qquad \quad + \bigg (\textrm{i}s\bigg (\frac{\coth (s)\cosh (2sy)}{\sinh (2s)}-y\frac{\sinh (sy)}{\sinh (s)}\bigg )A_1^2+\frac{\textrm{i}\cosh (2sy)}{\sinh (2s)}A_2\bigg )\textrm{e}^{2\textrm{i}sx} \\&\qquad \quad + \mathrm {c. c.} + g_2(X), \end{aligned}$$

    where \(g_2\) is an arbitrary function of a single variable. Substituting the formulae for \(\Psi _1\), \(\Psi _2\) and the modulational Ansatz into (5.23), and equating the coefficients of \(\textrm{e}^{0\textrm{i}sx}, \textrm{e}^{\textrm{i}sx}, \textrm{e}^{2\textrm{i}sx}\), we then find that

    $$\begin{aligned} g_{1X}&= \frac{s^2}{\sinh ^2(s)}|A_1|^2 + \alpha _0A_0, \\ A_2&= \frac{1}{2}\,\frac{(1-3\coth ^2(s))s^2}{\alpha _0+16s^4\beta _0-s(\coth (s)+(\coth (s))^{-1})}\,A_1^2, \nonumber \\ \beta _0&= \frac{1}{4s^3}\coth (s)-\frac{1}{4s^2}{{\,\textrm{cosech}\,}}^2(s). \nonumber \end{aligned}$$
    (5.28)

    Using the dispersion relation and the above formula for \(\beta _0\), we find that

    $$\begin{aligned} \alpha _0 =\frac{3s}{4}\coth (s)+\frac{s^2}{4}{{\,\textrm{cosech}\,}}^2(s). \end{aligned}$$
  • Similarly, (5.24)–(5.26) yield a Poisson equation for \(\Psi _3\) with boundary conditions at \(y=0\) and \(y=1\), the solution to which is

    $$\begin{aligned} \Psi _3(x,X,y)&= \bigg (\bigg (\frac{3}{2}\,\frac{\textrm{i}s^2\cosh (sy)}{\sinh (s)}-\frac{2\textrm{i}s^2\coth (2s)\cosh (s)\cosh (sy)}{\sinh ^2(s)}-\frac{1}{2}\,\frac{\textrm{i}s^2y^2\cosh (sy)}{\sinh (s)} \\&\qquad \qquad +\textrm{i}s^2y\,\frac{\sinh (2sy)}{\sinh ^2(s)}\bigg )\bar{A}_1 A_1^2 \\&\qquad +\bigg (\frac{(2\textrm{i}s\coth (s)+\textrm{i}s\tanh (s))\cosh (sy)}{\sinh (s)}+\frac{\textrm{i}sy\sinh (sy)}{\sinh (s)}-\frac{2\textrm{i}sy\sinh (2sy)}{\sinh (2s)}\bigg )\bar{A}_1 A_2 \\&\qquad +\bigg (\frac{\textrm{i}s\coth (s)\cosh (sy)}{\sinh (s)}-\frac{\textrm{i}sy\sinh (sy)}{\sinh (s)}\bigg )A_0A_1 \\&\qquad +\bigg (\frac{\textrm{i}y^2\cosh (sy)}{2\sinh (s)}+\frac{\textrm{i}(2\coth ^2(s)-1)\cosh (sy)}{2\sinh (s)}-\frac{\textrm{i}y\coth (s)\sinh (sy)}{\sinh (s)}\bigg )A_{1XX} \\&\qquad -\frac{\textrm{i}\cosh (sy)}{\sinh (s)}A_5+\frac{\textrm{i}\cosh (sy)}{\sinh (s)}A_1g_{1X}\bigg )\textrm{e}^{\textrm{i}sx} + (\cdots )\textrm{e}^{2\textrm{i}sx} + (\cdots )\textrm{e}^{3\textrm{i}sx} +\mathrm {c.c.} \\&\qquad +\frac{\sinh (sy)(sy\coth (s)+\coth (s)-y)-\cosh (sy)(sy^2+\coth (s))}{\sinh (s)}\,\frac{\textrm{d}}{\textrm{d}X}|A_1|^2 \\&\qquad -\tfrac{1}{2}y^2-\tfrac{1}{2}y^2g_{1XX} \end{aligned}$$

    with

    $$\begin{aligned} g_{1XX}-A_{0X}+2s\coth (s)\frac{\textrm{d}}{\textrm{d}X}|A_1|^2 = 0. \end{aligned}$$

    Integrating this equation and substituting the result into (5.28), we find that

    $$\begin{aligned} A_0 = \Big (\frac{s^2}{\alpha _0-1}(1-\coth ^2(s))-\frac{2s\coth (s)}{\alpha _0-1}\Big )|A_1|^2, \end{aligned}$$

    so that

    $$\begin{aligned} g_{1X} = \left( \frac{s^2\alpha _0}{\alpha _0-1}(1-\coth ^2(s))-\frac{2s\alpha _0}{\alpha _0-1}\coth (s)-s^2(1-\coth ^2(s))\right) |A_1|^2. \end{aligned}$$

    Substituting the formulae for \(A_0\), \(A_2\), \(g_{1X}\), \(\Psi _1\), \(\Psi _2\), \(\Psi _3\) and the modulational Ansatz into (5.27), and equating coefficients of \(\textrm{e}^{\textrm{i}sx}\), finally yields the nonlinear Schrödinger equation

    $$\begin{aligned}&A_1-(6\beta _0 s^2-(1-\sigma ^2)(1-s\sigma ))A_{1XX}\\&\quad +\Bigg (\frac{-s^4(1-3\sigma ^2)^2}{2(\alpha _0+16\beta _0 s^4-s(\sigma +\sigma ^{-1}))}+s^3(-5s^3\beta _0+4\sigma -2\sigma ^3)\\&\qquad -\frac{s^4(1-\sigma ^2)^2}{\alpha _0-1}+\frac{4s^3\sigma (1-\sigma ^2)}{\alpha _0-1}-\frac{4\alpha _0 s^2\sigma ^2}{\alpha _0-1}\Bigg )|A_1|^2A_1 = 0, \end{aligned}$$

    where \(\sigma = \coth (s)\).
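The formulae above admit a quick numerical sanity check. The sketch below (Python; the function names are ours, and the reduced dispersion relation \(\alpha _0+\beta _0s^4=s\coth (s)\) is our reading of the relation used to derive the formula for \(\alpha _0\)) verifies that \(\alpha _0\) follows from \(\beta _0\) and the dispersion relation, that the stated coefficients of \(|A_1|^2\) in \(A_0\) and \(g_{1X}\) satisfy both the integrated equation for \(g_{1XX}\) and (5.28), and evaluates the two coefficients of the nonlinear Schrödinger equation at a sample wavenumber.

```python
import math

def coth(s):
    return 1.0 / math.tanh(s)

def beta0(s):
    # beta_0 = coth(s)/(4 s^3) - cosech^2(s)/(4 s^2)
    return coth(s) / (4 * s**3) - 1.0 / (4 * s**2 * math.sinh(s)**2)

def alpha0(s):
    # alpha_0 = (3s/4) coth(s) + (s^2/4) cosech^2(s)
    return 0.75 * s * coth(s) + 0.25 * s**2 / math.sinh(s)**2

def a0_coef(s):
    # coefficient of |A_1|^2 in the formula for A_0
    sg, a0 = coth(s), alpha0(s)
    return s**2 * (1 - sg**2) / (a0 - 1) - 2 * s * sg / (a0 - 1)

def g1x_coef(s):
    # coefficient of |A_1|^2 in the formula for g_{1X}
    sg, a0 = coth(s), alpha0(s)
    return (s**2 * a0 / (a0 - 1) * (1 - sg**2)
            - 2 * s * a0 / (a0 - 1) * sg
            - s**2 * (1 - sg**2))

def nls_coefficients(s):
    # coefficients c2, c3 in A_1 - c2 A_{1XX} + c3 |A_1|^2 A_1 = 0
    sg, a0, b0 = coth(s), alpha0(s), beta0(s)
    c2 = 6 * b0 * s**2 - (1 - sg**2) * (1 - s * sg)
    c3 = (-s**4 * (1 - 3 * sg**2)**2
          / (2 * (a0 + 16 * b0 * s**4 - s * (sg + 1 / sg)))
          + s**3 * (-5 * s**3 * b0 + 4 * sg - 2 * sg**3)
          - s**4 * (1 - sg**2)**2 / (a0 - 1)
          + 4 * s**3 * sg * (1 - sg**2) / (a0 - 1)
          - 4 * a0 * s**2 * sg**2 / (a0 - 1))
    return c2, c3

s = 1.3  # sample dimensionless wavenumber
# reduced dispersion relation (assumed form): alpha_0 + beta_0 s^4 = s coth(s)
assert abs(alpha0(s) + beta0(s) * s**4 - s * coth(s)) < 1e-12
# integrated equation for g_{1XX}: g_{1X} = A_0 - 2 s coth(s) |A_1|^2
assert abs(g1x_coef(s) - (a0_coef(s) - 2 * s * coth(s))) < 1e-12
# equation (5.28): g_{1X} = (s^2 / sinh^2(s)) |A_1|^2 + alpha_0 A_0
assert abs(g1x_coef(s) - (s**2 / math.sinh(s)**2 + alpha0(s) * a0_coef(s))) < 1e-12
c2, c3 = nls_coefficients(s)
assert math.isfinite(c2) and math.isfinite(c3)
```

All three identities are exact, so the assertions hold to machine precision for any \(s>0\) away from zeros of the denominators.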

6 Appendix B: Computation of the Normal-Form Coefficients

To compute the normal-form coefficients we make use of the calculation

$$\begin{aligned} \Upsilon |_0(Lu,v)=H_2^0(u,v)=H_2^0(v,u)=\Upsilon |_0(Lv,u), \end{aligned}$$

and denote the parts of \(H^\mu (w)\), \(g^\mu (w)\) which are homogeneous of order m in \(\mu \) and n in w by \(\mu ^m H_n^m(w)\), \(\mu ^m N_{n}^m(w)\), and the part of \(r(u_1;\mu )\) which is homogeneous of order m in \(\mu \) and n in \(u_1\) by \(r_n^m(u_1;\mu )\). With a slight abuse of notation we use the same symbols for the multilinear operators associated with these quantities.

Write

$$\begin{aligned} r_n^m(u_1;\mu )=\sum _{i+j+k+\ell =n}r_{ijk\ell }^m\mu ^m A^iB^j\bar{A}^k\bar{B}^\ell \end{aligned}$$

and consider the \(\mu A\)-component of (4.5), namely

$$\begin{aligned} (L-\textrm{i}s I)r_{1000}^1=c_2\textrm{i}E-c_1F-N_{1}^1(E). \end{aligned}$$

Taking the symplectic product of this equation with \(\bar{E}\), we find that

$$\begin{aligned} c_1=-\Upsilon |_0(r_{1000}^1,\underbrace{(L+\textrm{i}s I)\bar{E}}_{\displaystyle = 0})+\Upsilon |_0(N_{1}^1(E),\bar{E})=2H_2^1(E,\bar{E})=-\frac{\sinh ^2(s)}{\tau _1}. \end{aligned}$$

To compute \(d_1\) we consider the \(A^2\bar{A}\)-component of (4.5), namely

$$\begin{aligned} (L-\textrm{i}s I)r_{2010}^0&=\textrm{i}d_2E-2d_1F-3N_{3}^0(E,E,\bar{E})\\&\quad -2N_{2}^0(\bar{E},r_{2000}^0)-2N_{2}^0(E,r_{1010}^0), \end{aligned}$$

and again take the symplectic product with \(\bar{E}\), so that

$$\begin{aligned} 2d_1&=-\Upsilon |_0(r^0_{2010},\underbrace{(L+\textrm{i}s I)\bar{E}}_{\displaystyle = 0})+3\Upsilon |_0(N^0_3(E,E,\bar{E}),\bar{E}) \\&\qquad \qquad +2\Upsilon |_0(N_2^0(\bar{E}, r^0_{2000}),\bar{E})+2\Upsilon |_0(N_2^0(E,r^0_{1010}),\bar{E}). \end{aligned}$$

The functions \(r_{2000}^0\) and \(r_{1010}^0\) are obtained from the \(A^2\)- and \(A\bar{A}\)-components of (4.5), which are respectively

$$\begin{aligned} (K-2\textrm{i}sI)r_{2000}^0=-N_2^0(E,E), \\ Kr_{1010}^0=-2N_2^0(E,\bar{E}) \end{aligned}$$

(note that \(r_{1010}^0\) is determined up to addition of a multiple of \(F\)). Altogether we find that

$$\begin{aligned} d_1&= \frac{\sinh ^4(s)}{2\tau _1^2}\left( \frac{s^4(1-3\sigma ^2)^2}{2(\alpha _0+16\beta _0s^4-s(\sigma +\sigma ^{-1}))}-s^3(-5s^3\beta _0+4\sigma -2\sigma ^3)\right. \\&\qquad \left. +\frac{s^4(1-\sigma ^2)^2}{\alpha _0-1}-\frac{4s^3\sigma (1-\sigma ^2)}{\alpha _0-1}+\frac{4\alpha _0 s^2\sigma ^2}{\alpha _0-1}\right) , \end{aligned}$$

where \(\sigma =\coth (s)\).
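Note that the bracket in this formula for \(d_1\) is, term by term, the negative of the cubic coefficient in the nonlinear Schrödinger equation of Appendix A, so that \(d_1\) is \(-\sinh ^4(s)/2\tau _1^2\) times that coefficient. This cross-consistency can be confirmed numerically; in the sketch below the function names are ours, and \(\tau _1\) is not needed since it enters only through the prefactor.

```python
import math

def coth(s):
    return 1.0 / math.tanh(s)

def beta0(s):
    # beta_0 = coth(s)/(4 s^3) - cosech^2(s)/(4 s^2), as in Appendix A
    return coth(s) / (4 * s**3) - 1.0 / (4 * s**2 * math.sinh(s)**2)

def alpha0(s):
    # alpha_0 = (3s/4) coth(s) + (s^2/4) cosech^2(s), as in Appendix A
    return 0.75 * s * coth(s) + 0.25 * s**2 / math.sinh(s)**2

def nls_cubic(s):
    # cubic coefficient of the nonlinear Schrodinger equation (Appendix A)
    sg, a0, b0 = coth(s), alpha0(s), beta0(s)
    return (-s**4 * (1 - 3 * sg**2)**2
            / (2 * (a0 + 16 * b0 * s**4 - s * (sg + 1 / sg)))
            + s**3 * (-5 * s**3 * b0 + 4 * sg - 2 * sg**3)
            - s**4 * (1 - sg**2)**2 / (a0 - 1)
            + 4 * s**3 * sg * (1 - sg**2) / (a0 - 1)
            - 4 * a0 * s**2 * sg**2 / (a0 - 1))

def d1_bracket(s):
    # the bracketed expression in the formula for d_1 above
    sg, a0, b0 = coth(s), alpha0(s), beta0(s)
    return (s**4 * (1 - 3 * sg**2)**2
            / (2 * (a0 + 16 * b0 * s**4 - s * (sg + 1 / sg)))
            - s**3 * (-5 * s**3 * b0 + 4 * sg - 2 * sg**3)
            + s**4 * (1 - sg**2)**2 / (a0 - 1)
            - 4 * s**3 * sg * (1 - sg**2) / (a0 - 1)
            + 4 * a0 * s**2 * sg**2 / (a0 - 1))

# term by term, d1_bracket(s) == -nls_cubic(s)
for s in (0.7, 1.3, 2.5):
    assert abs(d1_bracket(s) + nls_cubic(s)) < 1e-9 * max(1.0, abs(nls_cubic(s)))
```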

For completeness, we record the formulae for \(\tilde{r}^0_{1010}\) and \(\tilde{r}^0_{2000}\), namely

$$\begin{aligned} \tilde{r}^0_{1010} =&\begin{pmatrix}-s(\alpha _0-1)^{-1}(s+\sinh (2s))\\ 0\\ 0\\ 0\\ 0\\ -s\sinh (2s)+2s\sinh (s)(sy\sinh (sy)+\cosh (sy))\end{pmatrix},\\ \tilde{r}^0_{2000} =&\begin{aligned}&\tilde{a}_{2000}^0 \begin{pmatrix}\textrm{i}\sinh (2s) \\ -2s\sinh (2s) \\ \frac{1}{2s}(\alpha _0-1)\sinh (2s)\\ -4 \textrm{i}s^2\beta _0 \sinh (2s)\\ -(\frac{1}{2s}+(y^2-\frac{1}{3})s)\sinh (2s)+\cosh (2sy) \\ 2\textrm{i}s\cosh (2sy)-\textrm{i}\sinh (2s)\end{pmatrix} \\&\quad +\begin{pmatrix}\frac{1}{2}s\sinh (2s) \\ \textrm{i}s^2\sinh (2s) \\ -\frac{\textrm{i}}{8}(2\alpha _0-3)\sinh (2s)-\frac{\textrm{i}}{s}\sinh ^2(s)\\ -2\beta _0s^3\sinh (2s) \\ \textrm{i}\sinh (s)(-2 sy\sinh (sy) + \cosh (s y))\\ \textrm{i}(1+\frac{1}{2}s^2(y^2-\frac{1}{3}))\sinh (2s)+\textrm{i}(-\frac{3}{s}+\frac{1}{2}s(y^2-\frac{1}{3}))\sinh ^2(s)\\ s\sinh (s)(sy\sinh (sy)+\cosh (sy))-\frac{1}{2}s\sinh (2s) \end{pmatrix}, \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \tilde{a}_{2000}^0 = \frac{1}{2}\textrm{i}\left( \frac{s^2(\cosh (2s)+2)}{\sinh (2s)(\alpha _0+16\beta _0s^4)-2s\cosh (2s)}+s\right) . \end{aligned}$$