1 Introduction to Main Results

A classical problem in fluid dynamics, pioneered by the famous work of Stokes [36] in 1847, concerns the spectral stability/instability of periodic traveling waves, called Stokes waves, of the gravity water waves equations in any depth.

Benjamin and Feir [3], Lighthill [30] and Zakharov [40, 42] discovered in the sixties, through experiments and formal arguments, that Stokes waves in deep water are unstable, proposing a heuristic mechanism which leads to the disintegration of wave trains. More precisely, these works predicted unstable eigenvalues of the linearized equations at the Stokes wave, near the origin of the complex plane, corresponding to small Floquet exponents \( \mu \) or, equivalently, to long-wave perturbations. The same phenomenon was later predicted by Whitham [38] and Benjamin [2] for Stokes waves of wavelength \( 2\pi / \kappa \), in finite depth \( \mathtt h \), provided that \( \kappa \mathtt h > 1.363 \) approximately. This phenomenon is nowadays called “Benjamin–Feir”, or modulational, instability, and it is supported by an enormous amount of physical observations and numerical simulations, see e.g. [16, 31]. We refer to [43] for a historical survey.

A serious difficulty for a rigorous mathematical proof of the Benjamin–Feir instability is that the perturbed eigenvalues bifurcate from the eigenvalue zero, which is defective, with multiplicity four. The first rigorous proof of a local branch of unstable eigenvalues close to zero for \( \kappa \mathtt h \) larger than the Whitham-Benjamin threshold \(1.363\ldots \) was obtained by Bridges-Mielke [9] in finite depth (see also the preprint [23]). Their method, based on spatial dynamics and a center manifold reduction, breaks down in deep water. To deal with this case, Nguyen-Strauss [33] have recently developed a new approach, based on a Lyapunov-Schmidt decomposition. Very recently Berti-Maspero-Ventura [6], in deep water, provided a detailed account of the splitting of the four eigenvalues close to zero, as the Floquet exponent is turned on (see also [7] for a review of this result).

The goal of this paper is to completely describe the Benjamin–Feir spectrum at any finite value of the depth \( \texttt{h}> 0 \). This analysis has fundamental physical importance, since real-life experiments are performed in water tanks (for example the original Benjamin and Feir experiments, in Feltham’s National Physical Laboratory, had Stokes waves of wavelength 2.2 m and a water depth of 7.62 m, see [2]). The limits \( \texttt{h}\rightarrow + \infty \) (infinite depth) and \( \mu \rightarrow 0 \) (long waves) do not commute and the emergence of Benjamin–Feir unstable eigenvalues in finite depth is not a direct consequence of the infinite depth case.

Throughout this paper, with no loss of generality, we consider \(2\pi \)-periodic Stokes waves, i.e. with wave number \(\kappa =1\). In Theorems 2.5 and 1.1 we prove the existence of a unique depth \( \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), in perfect agreement with the Benjamin–Feir critical value 1.363..., such that

  • Shallow water case: for any \( 0< \mathtt h < \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\) the eigenvalues close to zero are purely imaginary for Stokes waves of sufficiently small amplitude, see Fig. 2-left;

  • Sufficiently deep water case: for any \( \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}< \mathtt h < \infty \), there exists a pair of non-purely imaginary eigenvalues which traces a complete closed figure “8” (as shown in Fig. 2-right) parameterized by the Floquet exponent \( \mu \). By further increasing \( \mu \), the eigenvalues recollide far from the origin on the imaginary axis, along which they keep moving. As \( {\mathtt h} \rightarrow \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}^{\, +} \) the set of unstable Floquet exponents shrinks to zero and the Benjamin–Feir unstable eigenvalues collapse to the origin, see Fig. 3. This figure “8” was first numerically discovered by Deconinck-Oliveras in [16].

We remark that the present approach provides a necessary and sufficient condition for the existence of unstable eigenvalues.

We encounter several differences between the current proof and the one of the infinite depth case in [6], the main ones of which we anticipate here. In the ideal deep water case it turns out that the “reduced” \(4\times 4\) matrix obtained by the Kato spectral procedure is a small perturbation of a block-diagonal matrix which exhibits the Benjamin–Feir unstable eigenvalues. In finite depth this is not the case; the coupling between the \( 2\times 2\) block-diagonal matrices and the off-diagonal ones is much stronger. The difference arises because, when \( \texttt{h}= + \infty \), the \(4 \times 4\) reduced Kato matrix has two eigenvalues of size \( \mathcal {O}(\mu )\) and the other two have the much bigger size \(\mathcal {O}(\sqrt{\mu }) \), whereas in finite depth all four eigenvalues are \(\mathcal {O}(\mu ) \). In turn, this is due to the different asymptotic expansions of the function

$$\begin{aligned} \sqrt{\mu \tanh (\mu \texttt{h})} = \left\{ \begin{array}{ll} \sqrt{\mu } &{} \qquad \text {if} \quad \texttt{h}= + \infty \,, \\ \sqrt{\texttt{h}} \mu + O(\mu ^3) &{}\qquad \forall \texttt{h}> 0 \ \ \text {as} \ \ \mu \rightarrow 0 \,, \end{array}\right. \end{aligned}$$

appearing in the Floquet operator (see Sect. 2). This significantly increases the complexity of the spectral analysis. In order to rigorously compute the spectrum of the \(4\times 4\) reduced matrix in finite depth (not only providing a formal expansion) we introduce a novel non-perturbative step of block diagonalization, which considerably modifies the block-diagonal matrices (see comments below Theorem 2.5). Such a procedure is uniform in \(\texttt{h}\) only on compact subsets of \((0, + \infty )\) and becomes singular in the deep water limit.
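The two regimes in the display above can be compared numerically. The following is a minimal sketch and not part of the proof; the function name `dispersion_root` and the sample depth are illustrative choices.

```python
import math

def dispersion_root(mu, h):
    """Return sqrt(mu * tanh(mu * h)) for mu > 0; h = math.inf stands for infinite depth."""
    if math.isinf(h):
        return math.sqrt(mu)  # tanh(mu * h) = 1 for every mu > 0
    return math.sqrt(mu * math.tanh(mu * h))

if __name__ == "__main__":
    h = 2.0  # illustrative finite depth
    for mu in (1e-1, 1e-2, 1e-3):
        finite = dispersion_root(mu, h)
        deep = dispersion_root(mu, math.inf)
        # finite / mu approaches sqrt(h), whereas deep / mu = mu ** -0.5 is unbounded
        print(f"mu={mu:g}  finite/mu={finite / mu:.6f}  deep/mu={deep / mu:.3f}")
```

For fixed finite \(\texttt{h}\) the ratio to \(\mu\) settles at \(\sqrt{\texttt{h}}\), while in infinite depth it grows like \(\mu^{-1/2}\), which is the source of the \(\mathcal{O}(\mu)\) versus \(\mathcal{O}(\sqrt{\mu})\) dichotomy described above.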

These differences indicate that the limits \( \texttt{h}\rightarrow + \infty \) (infinite depth) and \( \mu \rightarrow 0 \) (long wave) cannot be simply interchanged, and the connection between the Benjamin–Feir instability in these two cases is far more complex: the modulational instability in infinite depth is not the limit of the finite depth one, nor is the latter a direct consequence of the infinite depth case.

Let us now present, rigorously, our results.

1.1 Benjamin–Feir Instability in Finite Depth

We consider the pure gravity water waves equations for a bidimensional fluid occupying a region with finite depth \( \mathtt h \). With no loss of generality we set the gravity \( g = 1 \), see Remark 2.4. We consider a \(2\pi \)-periodic Stokes wave with amplitude \(0< \epsilon \ll 1\) and speed

$$\begin{aligned} c_\epsilon = {\mathtt c}_{\mathtt h}+ \mathcal {O}(\epsilon ^2) \,, \quad {\mathtt c}_{\mathtt h}:= \sqrt{\tanh (\texttt{h})} \,. \end{aligned}$$

The linearized water waves equations at the Stokes wave are, in the inertial reference frame moving with speed \(c_\epsilon \), a linear time independent system of the form \( h_t = \mathcal {L}_{\epsilon } h \) where \( \mathcal {L}_{\epsilon }:= \mathcal {L}_{\epsilon }({\mathtt h}) \) is a linear operator with \( 2 \pi \)-periodic coefficients, see (2.17) (the operator \( \mathcal {L}_{\epsilon } \) in (2.17) is actually obtained conjugating the linearized water waves equations in the Zakharov formulation at the Stokes wave via the “good unknown of Alinhac” (2.11) and the Levi-Civita (2.16) invertible transformations). The operator \( \mathcal {L}_{\epsilon } \) possesses the eigenvalue 0, which is defective, with multiplicity four, due to symmetries of the water waves equations. The problem is to prove that the linear system \( h_t = \mathcal {L}_{\epsilon } h \) has solutions of the form \(h(t,x) = \text {Re}\left( e^{\lambda t} e^{\textrm{i}\,\mu x} v(x)\right) \) where \(v(x)\) is a \(2\pi \)-periodic function, \(\mu \) in \( {\mathbb {R}}\) is the Floquet exponent and \(\lambda \) has positive real part, thus \(h(t,x)\) grows exponentially in time. By Bloch-Floquet theory, such \(\lambda \) is an eigenvalue of the operator \( \mathcal {L}_{\mu ,\epsilon }:= e^{-\textrm{i}\,\mu x } \,\mathcal {L}_{\epsilon } \, e^{\textrm{i}\,\mu x } \) acting on \(2\pi \)-periodic functions.
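To illustrate the Bloch-Floquet reduction in the simplest situation, the sketch below computes the Floquet spectrum for the flat surface \(\epsilon = 0\). It relies on an assumption not stated in this section, namely the standard fact that the Dirichlet-Neumann operator of the flat surface is the Fourier multiplier \(|D| \tanh (\texttt{h} |D|)\). Under it, linearizing (2.4) at \(\eta = \psi = 0\) with \(c = {\mathtt c}_{\mathtt h}\) gives, on the mode \(e^{\textrm{i}\, k x}\), the eigenvalues \(\textrm{i}\,\big ({\mathtt c}_{\mathtt h}(k+\mu ) \pm \sqrt{|k+\mu | \tanh (\texttt{h}|k+\mu |)}\big )\). The function name and the truncation `kmax` are hypothetical.

```python
import math

def flat_floquet_spectrum(mu, h, kmax=5):
    """Imaginary parts of i*(c_h*j ± omega(j)), j = k + mu, |k| <= kmax, sorted by modulus."""
    c_h = math.sqrt(math.tanh(h))
    values = []
    for k in range(-kmax, kmax + 1):
        j = k + mu  # conjugation by exp(i*mu*x) shifts d/dx to d/dx + i*mu
        omega = math.sqrt(abs(j) * math.tanh(h * abs(j)))
        values.extend([c_h * j + omega, c_h * j - omega])
    return sorted(values, key=abs)

if __name__ == "__main__":
    h = 1.5  # illustrative depth
    print(flat_floquet_spectrum(0.0, h)[:5])   # four zeros, then the first non-zero value
    print(flat_floquet_spectrum(1e-3, h)[:4])  # the four eigenvalues that leave zero, all O(mu)
```

At \(\mu = 0\) exactly four values vanish (two from \(k = 0\), one each from \(k = \pm 1\)), consistent with the multiplicity-four zero eigenvalue; for small \(\mu \ne 0\) they split but remain of size \(\mathcal {O}(\mu )\).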

The main result of this paper proves, for any finite value of the depth \( \mathtt h \), the full splitting of the four eigenvalues close to zero of the operator \( \mathcal {L}_{\mu ,\epsilon }:= \mathcal {L}_{\mu ,\epsilon } (\mathtt h ) \) when \( \epsilon \) and \( \mu \) are small enough, see Theorem 2.5. We first present Theorem 1.1 which focuses on the figure “8" formed by the Benjamin–Feir unstable eigenvalues.

We first need to introduce the “Whitham-Benjamin” function

$$\begin{aligned} \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}:= \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h}):= & {} \frac{1}{{\mathtt c}_{\mathtt h}} \Big [ \frac{9{\mathtt c}_{\mathtt h}^8-10{\mathtt c}_{\mathtt h}^4+9}{8{\mathtt c}_{\mathtt h}^6}\nonumber \\{} & {} - \frac{1}{\texttt{h}- \frac{1}{4}\texttt{e}_{12}^2} \Big (1 + \frac{1-{\mathtt c}_{\mathtt h}^4}{2} + \frac{3}{4} \frac{(1-{\mathtt c}_{\mathtt h}^4)^2}{{\mathtt c}_{\mathtt h}^2}\texttt{h}\Big ) \Big ] \,, \end{aligned}$$
(1.1)

where \({\mathtt c}_{\mathtt h}= \sqrt{\tanh (\texttt{h})} \) is the speed of the linear Stokes wave, and

$$\begin{aligned} \texttt{e}_{12}:= \texttt{e}_{12}(\texttt{h}):= {\mathtt c}_{\mathtt h}+{\mathtt c}_{\mathtt h}^{-1}(1-{\mathtt c}_{\mathtt h}^4)\texttt{h}> 0 \,, \quad \forall \texttt{h}> 0 \,. \end{aligned}$$
(1.2)

The function \( \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h})\) is well defined for any \( \texttt{h}> 0 \) because the denominator \( \texttt{h}- \tfrac{1}{4} \texttt{e}_{12}^2 \) in (1.1) is positive for any \( \texttt{h}> 0 \), see Lemma 5.7. The function (1.1) coincides, up to a non zero factor, with the celebrated function obtained by Whitham [38], Benjamin [2] and Bridges-Mielke [9] which determines the “shallow/sufficiently deep” threshold regime. In particular the Whitham-Benjamin function \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h})\) vanishes at \( \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}= 1.363...\), it is negative for \( 0< \texttt{h}< \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), positive for \( \texttt{h}> \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\) and tends to 1 as \(\texttt{h}\rightarrow +\infty \), see Fig. 1. We also introduce the positive coefficient

$$\begin{aligned} \texttt{e}_{22}:= \texttt{e}_{22}(\texttt{h}):= \dfrac{(1-{\mathtt c}_{\mathtt h}^4)(1+3{\mathtt c}_{\mathtt h}^4) \texttt{h}^2+2 {\mathtt c}_{\mathtt h}^2({\mathtt c}_{\mathtt h}^4-1) \texttt{h}+{\mathtt c}_{\mathtt h}^4}{{\mathtt c}_{\mathtt h}^3}> 0 \,, \quad \forall \texttt{h}> 0 \,. \nonumber \\ \end{aligned}$$
(1.3)

We remark that \(\texttt{e}_{12}(\texttt{h}) > {\texttt{c}}_\texttt{h} > 0\) and \(\texttt{e}_{22}(\texttt{h}) > 0 \) for any \( \texttt{h}> 0 \), and that both functions tend to 0 as \(\texttt{h}\rightarrow 0^+\) and to 1 as \(\texttt{h}\rightarrow +\infty \), see Lemma 4.8.
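The formulas (1.1)-(1.3) are explicit, so the threshold depth can be located numerically. The sketch below transcribes them and finds the sign change of \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\) by plain bisection; the function names and the bracketing interval \([1, 2]\) are illustrative choices, and the bisection is only a numerical check, not the argument used in the paper.

```python
import math

def c_h(h):
    """Speed of the linear Stokes wave, c_h = sqrt(tanh(h))."""
    return math.sqrt(math.tanh(h))

def e12(h):
    """Coefficient (1.2)."""
    c = c_h(h)
    return c + (1.0 - c**4) * h / c

def e22(h):
    """Coefficient (1.3)."""
    c = c_h(h)
    num = (1 - c**4) * (1 + 3 * c**4) * h**2 + 2 * c**2 * (c**4 - 1) * h + c**4
    return num / c**3

def e_wb(h):
    """Whitham-Benjamin function (1.1)."""
    c = c_h(h)
    first = (9 * c**8 - 10 * c**4 + 9) / (8 * c**6)
    bracket = 1 + (1 - c**4) / 2 + 0.75 * (1 - c**4) ** 2 / c**2 * h
    return (first - bracket / (h - 0.25 * e12(h) ** 2)) / c

def bisect_root(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    h_wb = bisect_root(e_wb, 1.0, 2.0)
    print(f"h_WB ~ {h_wb:.4f}")
    for h in (0.5, 1.0, h_wb, 2.0, 10.0):
        print(f"h={h:.4f}  e_WB={e_wb(h):+.5f}  e12={e12(h):.5f}  e22={e22(h):.5f}")
```

The computed root agrees with the value \(1.363\ldots \) quoted above, and the printed table shows the sign change of \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\) across it.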

Fig. 1

Plot of the Whitham-Benjamin function \( \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h}) \). The red dot shows its unique root \(\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}=1.363\dots \), which is the celebrated “shallow/sufficiently deep” water threshold predicted independently by Whitham (cfr. [38] p.49) and Benjamin (cfr. [2] p.68), and recovered in the rigorous proof of Bridges-Mielke [9, p. 183]

Throughout the paper we denote by \(r(\epsilon ^{m_1} \mu ^{n_1}, \ldots , \epsilon ^{m_p} \mu ^{n_p})\) a real analytic function fulfilling, for some \(C >0\) and \(\epsilon , \mu \) sufficiently small, the estimate \(| r(\epsilon ^{m_1} \mu ^{n_1}, \ldots , \epsilon ^{m_p} \mu ^{n_p}) | \le C \sum _{j=1}^p |\epsilon |^{m_j} |\mu |^{n_j} \), where the constant \(C:=C(\texttt{h})\) is uniform for \(\texttt{h}\) in any compact set of \((0, + \infty )\).

Theorem 1.1

(Benjamin–Feir unstable eigenvalues) For any \( \mathtt h > \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), there exist \( \epsilon _1, \mu _0 > 0 \) and an analytic function \({\underline{\mu }}: [0,\epsilon _1)\rightarrow [0,\mu _0)\), of the form

$$\begin{aligned} {\underline{\mu }}(\epsilon ) = \texttt{e}_{\texttt{h}} \epsilon (1+r(\epsilon )) \,, \quad \texttt{e}_{\texttt{h}}:= \sqrt{\frac{8\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h})}{\texttt{e}_{22}(\texttt{h})}} \,, \end{aligned}$$
(1.4)

such that, for any \( \epsilon \in [0, \epsilon _1) \), the operator \(\mathcal {L}_{\mu ,\epsilon }\) has two eigenvalues \(\lambda ^\pm _1 (\mu ,\epsilon )\) of the form

$$\begin{aligned} {\left\{ \begin{array}{ll} \textrm{i}\,\frac{1}{2} \breve{\mathtt c}_\texttt{h}\mu +\textrm{i}\,r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3)\\ \qquad \pm \tfrac{1}{8} \mu \sqrt{\texttt{e}_{22}(\texttt{h})} (1+r(\epsilon ,\mu )) \sqrt{\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h};\mu ,\epsilon ) }, &{} \forall \mu \in [0, {\underline{\mu }} (\epsilon )) \!\!\! \\ \textrm{i}\,\frac{1}{2} \breve{\mathtt c}_\texttt{h}{\underline{\mu }} (\epsilon )+\textrm{i}\,r(\epsilon ^3), &{} \mu = {\underline{\mu }} (\epsilon ) \!\!\! \\ \textrm{i}\,\frac{1}{2} \breve{\mathtt c}_\texttt{h}\mu +\textrm{i}\,r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3)\\ \qquad \pm \textrm{i}\,\tfrac{1}{8} \mu \sqrt{\texttt{e}_{22}(\texttt{h})} (1+r(\epsilon ,\mu )) \sqrt{|\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h};\mu ,\epsilon )|}, &{} \forall \mu \in ( {\underline{\mu }} (\epsilon ), \mu _0) \!\!\! \end{array}\right. }\!\!\! \end{aligned}$$
(1.5)

where \(\breve{\mathtt c}_\texttt{h}:=2 {\mathtt c}_{\mathtt h}- \texttt{e}_{12}(\texttt{h}) >0\) and \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h};\mu ,\epsilon ) \) is the “Benjamin–Feir discriminant" function

$$\begin{aligned} \Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h};\mu ,\epsilon ):= 8\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h}) \epsilon ^2 +r_1(\epsilon ^3,\mu \epsilon ^2)-\texttt{e}_{22} (\texttt{h}) \mu ^2\big (1+r_1''(\epsilon ,\mu )\big ) \,. \end{aligned}$$
(1.6)

Note that, for any \(0<\epsilon <\epsilon _1\) (depending on \(\texttt{h}\)), the function \( \Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h};\mu ,\epsilon ) \) is positive, respectively negative, provided \(0<\mu < \underline{\mu }(\epsilon )\), respectively \(\mu > \underline{\mu }(\epsilon )\).
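Dropping all the remainders \(r, r_1, r_1'', r_2\) in (1.4)-(1.6) yields a leading-order model of the eigenvalues \(\lambda _1^\pm \) that is easy to explore. The sketch below is such a truncation, not the full statement of Theorem 1.1. The coefficients are passed as plain numbers, and the sample values in the usage part are arbitrary positive placeholders, not values computed from a specific depth.

```python
import math

def threshold_mu(e_wb, e22, eps):
    """Leading-order zero of the discriminant (1.6): mu = sqrt(8*e_wb/e22) * eps."""
    return math.sqrt(8.0 * e_wb / e22) * eps

def leading_eigenvalue(mu, eps, e_wb, e22, c_breve):
    """Truncation of (1.5), '+' branch, with every remainder set to zero."""
    disc = 8.0 * e_wb * eps**2 - e22 * mu**2
    centre = 0.5j * c_breve * mu
    if disc >= 0:
        return centre + mu / 8.0 * math.sqrt(e22 * disc)        # unstable branch
    return centre + 1j * mu / 8.0 * math.sqrt(e22 * (-disc))    # purely imaginary branch

if __name__ == "__main__":
    e_wb, e22, c_breve, eps = 0.4, 1.2, 0.7, 0.01  # placeholder coefficients
    mu_bar = threshold_mu(e_wb, e22, eps)
    grid = [mu_bar * i / 1000 for i in range(1001)]
    best = max(grid, key=lambda m: leading_eigenvalue(m, eps, e_wb, e22, c_breve).real)
    # For this truncation the maximum is attained at mu_bar / sqrt(2)
    # and equals e_wb * eps**2 / 2.
    print(best / mu_bar, leading_eigenvalue(best, eps, e_wb, e22, c_breve).real)
```

In this truncated model the real part \(\tfrac{\mu }{8}\sqrt{\texttt{e}_{22}(8 \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\epsilon ^2 - \texttt{e}_{22}\mu ^2)}\) is maximal when \(\texttt{e}_{22}\mu ^2 = 4 \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\epsilon ^2\), where it equals \(\tfrac{1}{2} \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\epsilon ^2\).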

Fig. 2

The picture on the left shows, in the “shallow” water regime \(\texttt{h}< \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), the eigenvalues \(\lambda ^\pm _{1} (\mu ,\epsilon )\) and \(\lambda ^\pm _{0} (\mu ,\epsilon )\) which are purely imaginary. The picture on the right shows, in the “sufficiently deep” water regime \(\texttt{h}> \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), the eigenvalues \(\lambda ^\pm _1 (\mu ,\epsilon )\) in the complex \( \lambda \)-plane at fixed \(|\epsilon | \ll 1 \) as \(\mu \) varies. This figure “8” depends on \(\texttt{h}\) and shrinks to 0 as \(\texttt{h}\rightarrow \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}^+\), see Fig. 3. As \(\texttt{h}\rightarrow +\infty \) the spectrum resembles the one in deep water found in [6]

Let us make some comments.

1. Benjamin–Feir unstable eigenvalues. For \( \mathtt h > \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), according to (1.5), for values of the Floquet parameter \( 0<\mu < {\underline{\mu }} (\epsilon ) \), the eigenvalues \(\lambda ^\pm _1 (\mu , \epsilon ) \) have opposite non-zero real parts. As \( \mu \) tends to \( {\underline{\mu }} (\epsilon )\), the two eigenvalues \(\lambda ^\pm _1 (\mu ,\epsilon ) \) collide on the imaginary axis far from 0 (in the upper half-plane \( \text {Im} (\lambda ) > 0 \)), along which they keep moving for \( \mu > {\underline{\mu }} (\epsilon ) \), see Fig. 2. For \( \mu < 0 \) the operator \( {{\mathcal {L}}}_{\mu ,\epsilon } \) possesses the symmetric eigenvalues \( \overline{\lambda _1^{\pm } (-\mu ,\epsilon )} \) in the half-plane \( \text {Im} (\lambda ) < 0 \). For \( \mu \in [0, {\underline{\mu }}(\epsilon )]\) we obtain the upper part of the figure “8”, which is well approximated by the curves

$$\begin{aligned} \mu \mapsto \Big ( \pm \frac{\mu }{8} \sqrt{\texttt{e}_{22}} \sqrt{8\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\epsilon ^2 - \texttt{e}_{22}\mu ^2}, \ \tfrac{1}{2} \breve{\texttt{c}}_\texttt{h}\mu \Big ) \,, \end{aligned}$$
(1.7)

in accordance with the numerical simulations by Deconinck-Oliveras [16], and the formal expansions in [15]. Note that for \( \mu > 0 \) the imaginary part in (1.7) is positive because \( \breve{\texttt{c}}_\texttt{h}= {\mathtt c}_{\mathtt h}^{-1} ( \tanh (\texttt{h}) - (1- \tanh ^2 (\texttt{h})) \texttt{h})> 0 \) for any \( \texttt{h}> 0 \). The higher order “side-band" corrections of the eigenvalues \( \lambda _1^\pm (\mu ,\epsilon ) \) in (1.5), provided by the analytic functions \(r, r_1, r_1'', r_2 \), are explicitly computable. We finally remark that the eigenvalues (1.5) are not analytic in \((\mu , \epsilon )\) close to the value \((\underline{\mu }(\epsilon ),\epsilon )\) where \( \lambda ^\pm _1 (\mu , \epsilon ) \) collide at the top of the figure “8" far from 0 (clearly they are continuous).

2. Behaviour near the Whitham-Benjamin depth \( \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\). As \( {\mathtt h} \rightarrow \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}^+ \) the constant \( \epsilon _1:= \epsilon _1(\texttt{h}) > 0 \) in Theorem 1.1 tends to zero, the set of unstable Floquet exponents \( (0, {\underline{\mu }}(\epsilon ) ) \) with \( {\underline{\mu }}(\epsilon ) = \texttt{e}_{\mathtt h} \epsilon (1+r(\epsilon )) \) given in (1.4) shrinks to zero and the figure “8” of Benjamin–Feir unstable eigenvalues collapses to zero, see Fig. 3. In particular

$$\begin{aligned} \max _{\mu \in [0,\underline{\mu }(\epsilon )]} \text {Re}\,\lambda _1^+(\mu ,\epsilon ) = \text {Re}\,\lambda _1^+(\mu _{\max },\epsilon ) =\frac{1}{2} {\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}}(\texttt{h}) \epsilon ^2 + r(\epsilon ^3)\ \text { and } \end{aligned}$$
(1.8)

tends to zero as \(\texttt{h}\rightarrow \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}^+\), since \(0<\epsilon <\epsilon _1(\texttt{h})\) and \(\epsilon _1(\texttt{h})\rightarrow 0^+\).

Fig. 3

The Benjamin–Feir eigenvalue \(\lambda ^+_{1}(\mu _{\max },\epsilon ) \) in (1.8) with maximal real part, as well as the whole figure “8”, shrink to zero as \(\texttt{h}\rightarrow \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}^+\)

3. Relation with Bridges-Mielke [9]. Bridges and Mielke describe the unstable eigenvalues very close to the origin, namely the cross amid the “8”. In order to make a precise comparison with our result let us spell out the relation of the functions \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\), \(\texttt{e}_{12}\) and \(\texttt{e}_{22}\) with the coefficients obtained in [9]. The Whitham-Benjamin function \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\) in (4.13) is \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}= ({\mathtt c}_{\mathtt h}\texttt{h})^{-1} \nu (F)\), where \(\nu (F)\) is defined in [9, formula (6.17)] and \(F = {\mathtt c}_{\mathtt h}\texttt{h}^{- \frac{1}{2}} \) is the Froude number, cfr. [9, formula (3.4)]. Moreover the term \(\texttt{e}_{12}\) in (1.2) is \(\texttt{e}_{12} = 2 c_g \), where \( c_g = \frac{1}{2} {\mathtt c}_{\mathtt h}\big (1+ F^{-2} \text {sech}^2(\texttt{h})\big ) \) is the group velocity defined in Bridges-Mielke [9, formula (3.8)]. Finally \(\texttt{e}_{22}(\texttt{h}) \propto \dot{c}_g\) where \( \dot{c}_g \) is the derivative of the group velocity defined in [9, formula (6.15)], which for gravity waves is negative in any depth.

4. Complete spectrum near 0. In Theorem 1.1 we have described just the two unstable eigenvalues of \(\mathcal {L}_{\mu ,\epsilon }\) close to zero for \( \mathtt h > \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\). There are also two larger purely imaginary eigenvalues of order \( \mathcal {O}(\mu ) \), see Theorem 2.5.

5. Shallow water regime. In the shallow water regime \( 0< \mathtt h < \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), we prove in Theorem 2.5 that all the four eigenvalues of \( {{\mathcal {L}}}_{\mu ,\epsilon } \) close to zero remain purely imaginary for \(\epsilon \) sufficiently small. The eigenvalue expansions of Theorem 2.5 become singular as \( \texttt{h}\rightarrow 0^+ \).

6. Behavior at the Whitham-Benjamin threshold \(\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\). The analysis of Theorem 1.1 is not conclusive at the critical depth \(\texttt{h}= \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\). The reason is that \( \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}) = 0 \) and the Benjamin–Feir discriminant function (1.6) reduces to

$$\begin{aligned} \Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}; \mu , \epsilon ) = r(\epsilon ^3) + r(\mu \epsilon ^2) -\texttt{e}_{22}(\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}) \mu ^2 (1+r_1''(\epsilon ,\mu )) \,. \end{aligned}$$
(1.9)

Thus its quadratic expansion is not sufficient anymore to determine the sign of \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}; \mu , \epsilon )\). Note that (1.9) could be positive due to the term \(r(\epsilon ^3) \) for \(\epsilon \) and \(\mu \) small enough. Actually the cubic term in \(r(\epsilon ^3) = \beta \epsilon ^4 + \ldots \) vanishes and the coefficient \( \beta \) could be explicitly computed taking into account the fourth order expansion of the Stokes waves.

7. Unstable Floquet exponents and amplitudes \( (\mu ,\epsilon ) \). In Theorem 2.5 we actually prove that the expansion (1.5) of the eigenvalues of \( \mathcal {L}_{\mu ,\epsilon } \) holds for any value of \((\mu , \epsilon ) \) in a larger rectangle \( [0,\mu _0) \times [0,\epsilon _0 )\), and there exist Benjamin–Feir unstable eigenvalues if and only if the analytic function \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu , \epsilon )\) in (1.6) is positive. The zero set of \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu , \epsilon )\) is an analytic variety which, for \( \texttt{h}> \texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), is, restricted to the rectangle \( [0, \mu _0) \times [0, \epsilon _1)\), the graph of the analytic function \( {\underline{\mu }}(\epsilon ) = \texttt{e}_{\mathtt h}\epsilon (1+r(\epsilon )) \) in (1.4). This function is tangent at \( \epsilon = 0 \) to the straight line \( \mu = \texttt{e}_{\mathtt h} \epsilon \), and divides \( [0,\mu _0) \times [0,\epsilon _1 )\) into the region where \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu ,\epsilon ) > 0 \) (and thus the eigenvalues of \( {\mathcal {L}}_{\mu ,\epsilon }\) have non-trivial real part) and the “stable” one where all the eigenvalues of \( {\mathcal {L}}_{\mu ,\epsilon }\) are purely imaginary, see Fig. 4. In the region \( [0,\mu _0)\times [\epsilon _1,\epsilon _0)\) the higher order polynomial approximations of \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu ,\epsilon )\) (which are computable) will determine the sign of \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu ,\epsilon ) \).

Fig. 4

The solid curve portrays the graph of the real analytic function \(\underline{\mu }(\epsilon )\) in (1.4) for \(\texttt{h}>\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\). For values of \(\mu \) below this curve, the two eigenvalues \(\lambda ^\pm _1(\mu ,\epsilon )\) have non-zero real part. For \(\mu \) above the curve, \(\lambda ^\pm _1(\mu ,\epsilon )\) are purely imaginary. In the region \([\epsilon _1,\epsilon _0)\times [0,\mu _0)\) the eigenvalues are real/purely imaginary depending on the higher order corrections given by Theorem 2.5, which determine the sign of \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu ,\epsilon ) \)

8. Deep water limit. Theorems 1.1 and 2.5 do not pass to the limit as \( \texttt{h}\rightarrow + \infty \) since the remainders in the expansions of the eigenvalues are uniform only on any compact set of \(\texttt{h}\in (0,+\infty )\). From a mathematical point of view, the difference is evident in the asymptotic behavior of \(\tanh (\texttt{h}\mu ) \) (and similar quantities) which, if \(\texttt{h}=+\infty \), is identically equal to 1 for any arbitrarily small Floquet exponent \( \mu \), whereas \( \tanh (\texttt{h}\mu ) = O(\mu \texttt{h}) \) for any \( \texttt{h}\) finite, as \( \mu \rightarrow 0 \). Additional intermediate scaling regimes \( \texttt{h}\mu \sim 1 \), \( \texttt{h}\mu \ll 1 \), \( \texttt{h}\mu \gg 1 \) are possible. It is well-known (e.g. see [14]) that intermediate long-wave regimes of the water-waves equations formally lead to different physically-relevant limit equations such as Boussinesq, KdV, NLS, Benjamin–Ono, etc.

We shall describe in detail the ideas of proof and the differences with the deep water case below the statement of Theorem 2.5.
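Two of the identities used in the comments above are elementary and can be checked directly: the relation \(\texttt{e}_{12} = 2 c_g\) of comment 3, and the closed form of \(\breve{\texttt{c}}_\texttt{h} = 2 {\mathtt c}_{\mathtt h}- \texttt{e}_{12}\) quoted in comment 1. The following sketch is only a numerical sanity check with hypothetical function names; the sample depths are arbitrary.

```python
import math

def e12(h):
    """Coefficient (1.2)."""
    c = math.sqrt(math.tanh(h))
    return c + (1.0 - c**4) * h / c

def group_velocity(h):
    """c_g = (1/2) c_h (1 + F^{-2} sech^2(h)), with Froude number F = c_h / sqrt(h)."""
    c = math.sqrt(math.tanh(h))
    froude = c / math.sqrt(h)
    sech2 = (1.0 / math.cosh(h)) ** 2
    return 0.5 * c * (1.0 + sech2 / froude**2)

def c_breve(h):
    """breve c_h = 2 c_h - e12(h), as defined below (1.5)."""
    return 2.0 * math.sqrt(math.tanh(h)) - e12(h)

def c_breve_closed(h):
    """Closed form c_h^{-1} (tanh(h) - (1 - tanh^2(h)) h) from comment 1."""
    t = math.tanh(h)
    return (t - (1.0 - t * t) * h) / math.sqrt(t)

if __name__ == "__main__":
    for h in (0.3, 1.0, 1.363, 4.0):
        print(h, e12(h) - 2.0 * group_velocity(h), c_breve(h) - c_breve_closed(h))
```

Both printed differences are at the level of rounding error, and \(\breve{\texttt{c}}_\texttt{h}\) is positive because \(\tanh (\texttt{h}) > \texttt{h}\, \text {sech}^2(\texttt{h})\) is equivalent to \(\sinh (2\texttt{h}) > 2\texttt{h}\).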

Further literature. Modulational instability has been studied also for a variety of approximate water waves models, such as KdV, gKdV, NLS and the Whitham equation by, for instance, Whitham [39], Segur, Henderson, Carter and Hammack [35], Gallay and Haragus [18], Haragus and Kapitula [19], Bronski and Johnson [11], Johnson [25], Hur and Johnson [21], Bronski, Hur and Johnson [10], Hur and Pandey [22], Leisman, Bronski, Johnson and Marangell [28]. Also for these approximate models, numerical simulations predict a figure “8” similar to that in Fig. 2 for the bifurcation of the unstable eigenvalues close to zero.

Finally, we mention the nonlinear modulational instability result of Jin, Liao, and Lin [24] for several fluid model equations and the preprint by Chen-Su [12] for Stokes waves in deep water. Nonlinear transversal instability results of traveling solitary water waves in finite depth decaying at infinity on \( {\mathbb {R}}\) have been proved in [34] (in deep water no solitary wave exists [20, 27]).

2 The Complete Benjamin–Feir Spectrum in Finite Depth

In this section we present in detail the complete spectral Theorem 2.5. We first introduce the pure gravity water waves equations and the Stokes wave solutions.

The water waves equations. We consider the Euler equations for a 2-dimensional incompressible, irrotational fluid under the action of gravity. The fluid fills the region

$$\begin{aligned} { {\mathcal {D}}}_\eta := \left\{ (x,y)\in \mathbb {T}\times {\mathbb {R}}:\; -\texttt{h}\le y< \eta (t,x)\right\} \,, \quad \mathbb {T}:={\mathbb {R}}/2\pi \mathbb {Z}, \end{aligned}$$

with finite depth and space periodic boundary conditions. The irrotational velocity field is the gradient of a harmonic scalar potential \(\Phi =\Phi (t,x,y) \) determined by its trace \( \psi (t,x)=\Phi (t,x,\eta (t,x)) \) at the free surface \( y = \eta (t, x ) \). Actually \(\Phi \) is the unique solution of the elliptic equation \( \Delta \Phi = 0 \) in \( {{\mathcal {D}}}_\eta \) with Dirichlet datum \( \Phi (t,x,\eta (t,x)) = \psi (t,x)\) and \( \Phi _y(t,x,y) = 0 \) at \(y = - \texttt{h}\).

The time evolution of the fluid is determined by two boundary conditions at the free surface. The first is that the fluid particles remain, along the evolution, on the free surface (kinematic boundary condition), and the second one is that the pressure of the fluid is equal, at the free surface, to the constant atmospheric pressure (dynamic boundary condition). Then, as shown by Zakharov [41] and Craig-Sulem [13], the time evolution of the fluid is determined by the following equations for the unknowns \( (\eta (t,x), \psi (t,x)) \),

$$\begin{aligned} \eta _t = G(\eta )\psi \,, \quad \psi _t = - g \eta - \dfrac{\psi _x^2}{2} + \dfrac{1}{2(1+\eta _x^2)} \big ( G(\eta ) \psi + \eta _x \psi _x \big )^2 \,, \end{aligned}$$
(2.1)

where \(g > 0 \) is the gravity constant and \(G(\eta ):= G(\eta , \texttt{h})\) denotes the Dirichlet-Neumann operator \( [G(\eta )\psi ](x):= \Phi _y(x,\eta (x)) - \Phi _x(x,\eta (x)) \eta _x(x)\). In the sequel, with no loss of generality, we set the gravity constant \( g = 1 \), see Remark 2.4.
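For the flat surface \(\eta = 0\) the definition of \(G(\eta )\) can be made fully explicit on a single Fourier mode. The sketch below takes \(\psi (x) = \cos (kx)\), writes down the harmonic extension satisfying the Neumann condition at the bottom, and evaluates \(\Phi _y\) at \(y = 0\) by a finite difference. The closed form \(k \tanh (k\texttt{h}) \cos (kx)\) it is compared with is the standard flat-surface multiplier, which is not stated in this section; the function names and the step size are hypothetical.

```python
import math

def potential(x, y, k, h):
    """Harmonic extension of cos(k x) to the strip -h <= y <= 0, with Phi_y = 0 at y = -h."""
    return math.cos(k * x) * math.cosh(k * (y + h)) / math.cosh(k * h)

def dn_flat(x, k, h, step=1e-5):
    """G(0) applied to cos(k x). Since eta_x = 0, it reduces to Phi_y at y = 0.

    Phi_y is approximated by a central difference; the explicit formula for Phi
    extends analytically to small y > 0, so evaluating it there is harmless.
    """
    return (potential(x, step, k, h) - potential(x, -step, k, h)) / (2.0 * step)

def bottom_flux(x, k, h, step=1e-5):
    """Central-difference approximation of Phi_y at the bottom y = -h (should vanish)."""
    return (potential(x, -h + step, k, h) - potential(x, -h - step, k, h)) / (2.0 * step)

if __name__ == "__main__":
    x, k, h = 0.7, 3, 1.2  # arbitrary sample point, wave number and depth
    print(dn_flat(x, k, h), k * math.tanh(k * h) * math.cos(k * x))
    print(bottom_flux(x, k, h))
```

With \(k = 1\) this multiplier equals \(\tanh (\texttt{h}) = {\mathtt c}_{\mathtt h}^2\), which is how the linear speed \({\mathtt c}_{\mathtt h}= \sqrt{\tanh (\texttt{h})}\) arises when \(g = 1\).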

The equations (2.1) are the Hamiltonian system

$$\begin{aligned} \partial _t \begin{bmatrix}\eta \\ \psi \end{bmatrix} = \mathcal {J}\begin{bmatrix}\nabla _\eta \mathcal {H} \\ \nabla _\psi \mathcal {H} \end{bmatrix}, \quad \quad \mathcal {J}:= \begin{bmatrix} 0 &{} \textrm{Id}\\ -\textrm{Id}&{} 0 \end{bmatrix}, \end{aligned}$$
(2.2)

where \( \nabla \) denotes the \( L^2\)-gradient, and the Hamiltonian \( \mathcal {H}(\eta ,\psi ):= \frac{1}{2} \int _{\mathbb {T}} \big ( \psi \,G(\eta )\psi +\eta ^2 \big ) \textrm{d}x \) is the sum of the kinetic and potential energy of the fluid. In addition to being Hamiltonian, the water waves system (2.1) possesses other important symmetries. First of all it is time reversible with respect to the involution

$$\begin{aligned} \rho \begin{bmatrix}\eta (x) \\ \psi (x) \end{bmatrix}:= \begin{bmatrix}\eta (-x) \\ -\psi (-x) \end{bmatrix}, \quad \text {i.e. } \mathcal {H} \circ \rho = \mathcal {H} \,. \end{aligned}$$
(2.3)

Moreover, the system (2.1) is invariant under space translations.

Stokes waves. The Stokes waves are traveling solutions of (2.1) of the form \(\eta (t,x)=\breve{\eta }(x-ct)\) and \(\psi (t,x)=\breve{\psi }(x-ct)\) for some real c and \(2\pi \)-periodic functions \((\breve{\eta } (x), \breve{\psi } (x)) \). In a reference frame in translational motion with constant speed c, the water waves equations (2.1) become

$$\begin{aligned} \eta _t = c\eta _x+G(\eta )\psi \,, \quad \psi _t = c\psi _x - \eta - \dfrac{\psi _x^2}{2} + \dfrac{1}{2(1+\eta _x^2)} \big ( G(\eta ) \psi + \eta _x \psi _x \big )^2\nonumber \\ \end{aligned}$$
(2.4)

and the Stokes waves \((\breve{\eta }, \breve{\psi })\) are equilibrium steady solutions of (2.4).

The bifurcation of small-amplitude Stokes waves is due to Struik [37] in finite depth, and to Levi-Civita [29] and Nekrasov [32] in infinite depth. We denote by \(B(r):= \{ x \in {\mathbb {R}}:\ |x| < r\}\) the real ball with center 0 and radius r.

Theorem 2.1

(Stokes waves) For any \(\texttt{h}>0\) there exist \(\epsilon _*:=\epsilon _*(\texttt{h}) >0\) and a unique family of real analytic solutions \((\eta _\epsilon (x), \psi _\epsilon (x), c_\epsilon )\), parameterized by the amplitude \(|\epsilon | \le \epsilon _*\), of

$$\begin{aligned} c \, \eta _x+G(\eta )\psi = 0 \,, \quad c \, \psi _x - \eta - \dfrac{\psi _x^2}{2} + \dfrac{1}{2(1+\eta _x^2)} \big ( G(\eta ) \psi + \eta _x \psi _x \big )^2 = 0,\qquad \quad \end{aligned}$$
(2.5)

such that \( \eta _\epsilon (x), \psi _\epsilon (x) \) are \(2\pi \)-periodic; \(\eta _\epsilon (x) \) is even and \(\psi _\epsilon (x) \) is odd, of the form

$$\begin{aligned} \begin{aligned}&\eta _\epsilon (x) = \epsilon \cos (x) + \epsilon ^2 (\eta _{2}^{[0]} + \eta _{2}^{[2]} \cos (2x)) + \mathcal {O}(\epsilon ^3), \\&\psi _\epsilon (x) = \epsilon {\mathtt c}_{\mathtt h}^{-1} \sin (x) + \epsilon ^2 \psi _{2}^{[2]} \sin (2x) +\mathcal {O}(\epsilon ^3) \,, \\&c_\epsilon = {\mathtt c}_{\mathtt h}+ \epsilon ^2 c_2 +\mathcal {O}(\epsilon ^3) \quad \text {where} \quad {\mathtt c}_{\mathtt h}= \sqrt{\tanh (\texttt{h})} \,, \end{aligned} \end{aligned}$$
(2.6)

and

$$\begin{aligned} \eta _{2}^{[0]}&:= \frac{{\mathtt c}_{\mathtt h}^4-1}{4{\mathtt c}_{\mathtt h}^2} \, , \qquad \eta _{2}^{[2]} := \frac{3-{\mathtt c}_{\mathtt h}^4}{4{\mathtt c}_{\mathtt h}^6} \, , \qquad \psi _{2}^{[2]} := \frac{3+{\mathtt c}_{\mathtt h}^8}{8{\mathtt c}_{\mathtt h}^7} \, , \qquad \end{aligned}$$
(2.7)
$$\begin{aligned} c_2&:= \frac{9-10{\mathtt c}_{\mathtt h}^4+9{\mathtt c}_{\mathtt h}^8 }{16{\mathtt c}_{\mathtt h}^7} + {\frac{(1-{\mathtt c}_{\mathtt h}^4) }{2{\mathtt c}_{\mathtt h}}} \eta _2^{[0]} = \frac{-2 {\mathtt c}_{\mathtt h}^{12}+13 {\mathtt c}_{\mathtt h}^8-12 {\mathtt c}_{\mathtt h}^4+9}{16 {\mathtt c}_{\mathtt h}^7}\, . \end{aligned}$$
(2.8)

More precisely, for any \( \sigma \ge 0 \) and \( s > \frac{5}{2} \), there exists \( \epsilon _*>0 \) such that \(\epsilon \mapsto (\eta _\epsilon , \psi _\epsilon , c_\epsilon )\) is analytic as a map \(B(\epsilon _*) \rightarrow H^{\sigma ,s}_{\texttt{ev}} (\mathbb {T})\times H^{\sigma ,s}_{\texttt{odd}}(\mathbb {T})\times {\mathbb {R}}\), where \( H^{\sigma ,s}_{\texttt{ev}}(\mathbb {T}) \), respectively \( H^{\sigma ,s}_{\texttt{odd}}(\mathbb {T}) \), denotes the space of even, respectively odd, real valued \( 2 \pi \)-periodic analytic functions \( u(x) = \sum _{k \in \mathbb {Z}} u_k e^{\textrm{i}\,k x} \) such that \( \Vert u \Vert _{\sigma ,s}^2:= \sum _{k \in \mathbb {Z}} |u_k|^2 \langle k \rangle ^{2\,s} e^{2 \sigma |k|} < + \infty \).

The expansions (2.6)-(2.8) are derived in Appendix B for completeness, although they are already present in the literature (they coincide with [39, section 13, chapter 13] and [2, section 2]). Note that in the shallow water regime \(\texttt{h}\rightarrow 0^+\) the expansions (2.6)-(2.8) become singular. For the analyticity properties of the maps stated in Theorem 2.1 we refer to [8].
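As a quick sanity check of (2.7)-(2.8), the following minimal Python sketch evaluates the second order coefficients at a given depth and compares the two expressions for \(c_2\) in (2.8). The function name `stokes_coefficients`, the dictionary keys and the sample depths are illustrative choices, not notation from the paper.

```python
import math

def stokes_coefficients(h):
    """Evaluate the coefficients in (2.6)-(2.8) at depth h > 0."""
    c = math.sqrt(math.tanh(h))          # c_h = sqrt(tanh(h))
    eta20 = (c**4 - 1) / (4 * c**2)      # eta_2^{[0]} in (2.7)
    eta22 = (3 - c**4) / (4 * c**6)      # eta_2^{[2]} in (2.7)
    psi22 = (3 + c**8) / (8 * c**7)      # psi_2^{[2]} in (2.7)
    # First expression for c_2 in (2.8)
    c2_a = (9 - 10 * c**4 + 9 * c**8) / (16 * c**7) + (1 - c**4) / (2 * c) * eta20
    # Second expression for c_2 in (2.8)
    c2_b = (-2 * c**12 + 13 * c**8 - 12 * c**4 + 9) / (16 * c**7)
    return {"c_h": c, "eta20": eta20, "eta22": eta22, "psi22": psi22,
            "c2_a": c2_a, "c2_b": c2_b}

if __name__ == "__main__":
    for depth in (0.5, 1.363, 5.0):      # sample depths, chosen arbitrarily
        co = stokes_coefficients(depth)
        print(depth, co["c2_a"], co["c2_b"])
```

Setting \({\mathtt c}_{\mathtt h} = 1\) in the formulas (2.7)-(2.8) gives \(\eta _2^{[0]} = 0\) and \(\eta _2^{[2]} = \psi _2^{[2]} = c_2 = \frac{1}{2}\), which the sketch approaches for large depth.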

We also mention that more general time quasi-periodic traveling Stokes waves, which are nonlinear superpositions of multiple Stokes waves traveling with rationally independent speeds, have recently been proved to exist for (2.1) in [5] in finite depth, in [17] in infinite depth, and in [4] for capillary-gravity water waves in any depth.

Linearization at the Stokes waves. In order to determine the stability/instability of the Stokes waves given by Theorem 2.1, we linearize the water waves equations (2.4) with \( c = c_\epsilon \) at \((\eta _\epsilon (x), \psi _\epsilon (x))\). In the sequel we closely follow [6], pointing out the differences that arise in the finite depth case.

By using the shape derivative formula for the differential \( \textrm{d}_\eta G(\eta )[{\hat{\eta }} ]\) of the Dirichlet-Neumann operator one obtains the autonomous real linear system

$$\begin{aligned} \begin{bmatrix}{\hat{\eta }}_t \\ {\hat{\psi }}_t \end{bmatrix}= & {} \begin{bmatrix} -G(\eta _\epsilon )B-\partial _x \circ (V-c_\epsilon ) &{} G(\eta _\epsilon ) \\ -1+B(V-c_\epsilon )\partial _x - B \partial _x \circ (V-c_\epsilon ) - BG(\eta _\epsilon )\circ B &{} - (V-c_\epsilon )\partial _x + BG(\eta _\epsilon ) \end{bmatrix}\nonumber \\{} & {} \begin{bmatrix}{\hat{\eta }} \\ {\hat{\psi }} \end{bmatrix}, \end{aligned}$$
(2.9)

where

$$\begin{aligned} V:= & {} V(x):= -B (\eta _\epsilon )_x + (\psi _\epsilon )_x \,, \nonumber \\ B:= & {} B(x):= \frac{G(\eta _\epsilon )\psi _\epsilon + (\psi _\epsilon )_x (\eta _\epsilon )_x}{1+(\eta _\epsilon )_x^2} = \frac{ (\psi _\epsilon )_x- c_\epsilon }{1+(\eta _\epsilon )_x^2}(\eta _\epsilon )_x \,. \end{aligned}$$
(2.10)

The functions \((V, B)\) are the horizontal and vertical components of the velocity field \( (\Phi _x, \Phi _y) \) at the free surface. Moreover \(\epsilon \mapsto (V,B)\) is analytic as a map \(B(\epsilon _0) \rightarrow H^{\sigma , s-1}(\mathbb {T})\times H^{\sigma ,s-1}(\mathbb {T})\). The real system (2.9) is Hamiltonian, i.e. of the form \(\mathcal {J}\mathcal {A}\) with \(\mathcal {A} = \mathcal {A}^\top \), where \(\mathcal {A}^\top \) is the transposed operator with respect to the scalar product of \(L^2(\mathbb {T}, {\mathbb {R}})\times L^2(\mathbb {T}, {\mathbb {R}})\). Moreover the linear operator in (2.9) is reversible, i.e. it anti-commutes with the involution \( \rho \) in (2.3).

Under the time-independent “good unknown of Alinhac” linear transformation

$$\begin{aligned} \begin{bmatrix}{\hat{\eta }} \\ {\hat{\psi }} \end{bmatrix}:= Z \begin{bmatrix}u \\ v \end{bmatrix} \,, \qquad Z = \begin{bmatrix} 1 &{} 0 \\ B &{} 1\end{bmatrix}, \quad Z^{-1} = \begin{bmatrix} 1 &{} 0 \\ -B &{} 1\end{bmatrix}, \end{aligned}$$
(2.11)

the system (2.9) assumes the simpler form

$$\begin{aligned} \begin{bmatrix}u_t \\ v_t \end{bmatrix} = \widetilde{\mathcal {L}}_\epsilon \begin{bmatrix}u \\ v \end{bmatrix}, \qquad \widetilde{\mathcal {L}}_\epsilon := \begin{bmatrix} -\partial _x\circ (V-c_\epsilon ) &{}\quad G(\eta _\epsilon ) \\ -1 - (V-c_\epsilon ) B_x &{}\quad - (V-c_\epsilon )\partial _x \end{bmatrix} \,. \end{aligned}$$
(2.12)

Next, we perform a conformal change of variables to flatten the water surface. Here the finite depth case induces a modification with respect to the deep water case. By [1, Appendix A], there exists a diffeomorphism of \(\mathbb {T}\), \( x\mapsto x+\mathfrak {p}(x)\), with a small \(2\pi \)-periodic function \(\mathfrak {p}(x)\), and a small constant \(\texttt{f}_\epsilon \), such that, by defining the associated composition operator \( (\mathfrak {P}u)(x):= u(x+\mathfrak {p}(x))\), the Dirichlet-Neumann operator can be written as [1, Lemma A.5]

$$\begin{aligned} G(\eta _\epsilon ) = \partial _x \circ \mathfrak {P}^{-1} \circ {{\mathcal {H}}} \circ \tanh \big ((\texttt{h}+\texttt{f}_\epsilon )|D| \big ) \circ \mathfrak {P} \,, \end{aligned}$$
(2.13)

where \( {{\mathcal {H}}} \) is the Hilbert transform, i.e. the Fourier multiplier operator

$$\begin{aligned} {\mathcal {H}}(e^{\textrm{i}\,j x}):= - \textrm{i}\,\textrm{sign} (j) e^{\textrm{i}\,j x} \,, \quad \forall j \in \mathbb {Z}\setminus \{0\} \,, \quad {\mathcal {H}}(1):= 0 \,. \end{aligned}$$

The function \({\mathfrak {p}}(x)\) and the constant \(\texttt{f}_\epsilon \) are determined as a fixed point of (see [1, formula (A.15)])

$$\begin{aligned}{} & {} \mathfrak {p} = \frac{\mathcal {H}}{\tanh \big ((\texttt{h}+ \texttt{f}_\epsilon )|D| \big )}[\eta _\epsilon ( x + \mathfrak {p}(x))] \,, \nonumber \\{} & {} \texttt{f}_\epsilon := \frac{1}{2\pi } \int _\mathbb {T}\eta _\epsilon (x + \mathfrak {p}(x)) \textrm{d}x \,. \end{aligned}$$
(2.14)

By the analyticity of the map \(\epsilon \rightarrow \eta _\epsilon \in H^{\sigma ,s}\), \(\sigma >0\), \(s > 1/2\), the analytic implicit function theorem implies the existence of a solution \(\epsilon \mapsto \mathfrak {p}(x):=\mathfrak {p}_\epsilon (x) \), \( \epsilon \mapsto \texttt{f}_\epsilon \), analytic as a map \(B(\epsilon _0) \rightarrow H^{s}(\mathbb {T}) \times {\mathbb {R}}\). Moreover, since \(\eta _\epsilon \) is even, the function \({\mathfrak {p}}(x)\) is odd. In Appendix B we prove the expansion

$$\begin{aligned} {\mathfrak {p}}(x)= & {} \epsilon {\mathtt c}_{\mathtt h}^{-2} \sin (x)+\epsilon ^2\frac{(1+{\mathtt c}_{\mathtt h}^4)(3+{\mathtt c}_{\mathtt h}^4)}{8{\mathtt c}_{\mathtt h}^8}\sin (2x)+\mathcal {O}(\epsilon ^3) \,,\nonumber \\ \texttt{f}_\epsilon= & {} \epsilon ^2\frac{{\mathtt c}_{\mathtt h}^4-3}{4{\mathtt c}_{\mathtt h}^2} + \mathcal {O}(\epsilon ^3) \,. \end{aligned}$$
(2.15)
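The \(\epsilon ^2\) coefficient of \(\texttt{f}_\epsilon \) in (2.15) can be checked numerically from the second identity in (2.14). The sketch below is only illustrative: it assumes that truncating \(\eta _\epsilon \) at order \(\epsilon ^2\) as in (2.6), and \(\mathfrak {p}\) at order \(\epsilon \) as in (2.15), is enough to capture the \(\epsilon ^2\) term of the mean. The helper names `mean_depth_correction` and `predicted_coefficient` and the grid size are hypothetical choices.

```python
import math

def mean_depth_correction(h, eps, n=256):
    """Approximate f_eps = mean over T of eta_eps(x + p(x)), see (2.14)."""
    c = math.sqrt(math.tanh(h))
    eta20 = (c**4 - 1) / (4 * c**2)          # eta_2^{[0]} in (2.7)
    eta22 = (3 - c**4) / (4 * c**6)          # eta_2^{[2]} in (2.7)
    total = 0.0
    for i in range(n):
        x = 2 * math.pi * i / n
        y = x + eps * math.sin(x) / c**2     # x + p(x), p truncated at order eps
        # eta_eps truncated at order eps^2, evaluated at y
        total += eps * math.cos(y) + eps**2 * (eta20 + eta22 * math.cos(2 * y))
    return total / n                          # periodic trapezoidal rule

def predicted_coefficient(h):
    """Coefficient of eps^2 in the expansion of f_eps in (2.15)."""
    c = math.sqrt(math.tanh(h))
    return (c**4 - 3) / (4 * c**2)

if __name__ == "__main__":
    eps = 1e-3
    for depth in (0.8, 1.363, 3.0):
        print(depth, mean_depth_correction(depth, eps) / eps**2,
              predicted_coefficient(depth))
```

The agreement reflects the elementary computation behind (2.15): the mean of \(\epsilon \cos (x + \mathfrak {p}(x))\) contributes \(-\epsilon ^2/(2 {\mathtt c}_{\mathtt h}^2)\), which is added to \(\epsilon ^2 \eta _2^{[0]}\).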

Under the symplectic and reversibility-preserving map

$$\begin{aligned} \mathcal {P}:= \begin{bmatrix}(1+\mathfrak {p}_x)\mathfrak {P} &{} 0 \\ 0 &{} \mathfrak {P} \end{bmatrix} \,, \end{aligned}$$
(2.16)

the system (2.12) transforms, by (2.13), into the linear system \( h_t = \mathcal {L}_\epsilon h \) where \( \mathcal {L}_\epsilon \) is the Hamiltonian and reversible real operator

$$\begin{aligned} \begin{aligned} \mathcal {L}_\epsilon := \mathcal {P} \, \widetilde{{\mathcal {L}}}_\epsilon \, \mathcal {P}^{-1}&= \begin{bmatrix} \partial _x \circ ({\mathtt c}_{\mathtt h}+p_\epsilon (x)) &{}\quad |D|\tanh ((\texttt{h}+\texttt{f}_\epsilon ) |D|) \\ - (1+a_\epsilon (x)) &{}\quad ({\mathtt c}_{\mathtt h}+p_\epsilon (x))\partial _x \end{bmatrix} \\&= \mathcal {J}\begin{bmatrix} 1+a_\epsilon (x) &{} \quad -({\mathtt c}_{\mathtt h}+p_\epsilon (x)) \partial _x \\ \partial _x \circ ({\mathtt c}_{\mathtt h}+p_\epsilon (x)) &{}\quad |D|\tanh ((\texttt{h}+\texttt{f}_\epsilon ) |D|) \end{bmatrix}, \end{aligned} \end{aligned}$$
(2.17)

where

$$\begin{aligned} {\mathtt c}_{\mathtt h}+p_\epsilon (x):= & {} \displaystyle {\frac{ c_\epsilon -V(x+\mathfrak {p}(x))}{ 1+\mathfrak {p}_x(x)}} \,, \nonumber \\ 1+a_\epsilon (x):= & {} \displaystyle {\frac{1+ (V(x + \mathfrak {p}(x)) - c_\epsilon ) B_x(x + \mathfrak {p}(x)) }{1+\mathfrak {p}_x(x)}} \,. \end{aligned}$$
(2.18)

By the analyticity of the functions \( V, B, \mathfrak {p}(x) \) established above, the functions \(p_\epsilon \) and \(a_\epsilon \) are analytic in \(\epsilon \) as maps \(B(\epsilon _0)\rightarrow H^{s} ({\mathbb {T}})\). In Appendix B we prove the following expansions:

Lemma 2.2

The analytic functions \(p_\epsilon (x) \) and \(a_\epsilon (x) \) in (2.18) are even in \(x\), and

$$\begin{aligned} p_\epsilon (x) = \epsilon p_1 (x) + \epsilon ^2 p_2 (x) + \mathcal {O}(\epsilon ^3) \,, \qquad a_\epsilon (x) = \epsilon a_1(x) +\epsilon ^2 a_2 (x) + \mathcal {O}(\epsilon ^3) \,,\nonumber \\ \end{aligned}$$
(2.19)

where

$$\begin{aligned} p_1(x)&= p_1^{[1]}\cos (x)\, , \qquad \quad \quad p_1^{[1]} := - 2 {\mathtt c}_{\mathtt h}^{-1}\, , \end{aligned}$$
(2.20)
$$\begin{aligned} p_2(x)&= p_2^{[0]}+p_2^{[2]}\cos (2x)\, , \nonumber \\ p_2^{[0]}&:= \frac{9+12 {\mathtt c}_{\mathtt h}^4+ 5{\mathtt c}_{\mathtt h}^8-2 {\mathtt c}_{\mathtt h}^{12}}{16 {\mathtt c}_{\mathtt h}^7}\, , \quad p_2^{[2]}:= - \frac{3+{\mathtt c}_{\mathtt h}^4}{2{\mathtt c}_{\mathtt h}^7}\, , \end{aligned}$$
(2.21)

and

$$\begin{aligned} a_1(x)&= a_1^{[1]} \cos (x)\, , \qquad \qquad a_1^{[1]}:= - ( {\mathtt c}_{\mathtt h}^2 + {\mathtt c}_{\mathtt h}^{-2})\, , \end{aligned}$$
(2.22)
$$\begin{aligned} a_2(x)&= a_2^{[0]}+a_2^{[2]}\cos (2x)\, ,\quad a_2^{[0]}:=\frac{3}{2} + \frac{1}{2{\mathtt c}_{\mathtt h}^4}\, , \quad a_2^{[2]} := \frac{-14{\mathtt c}_{\mathtt h}^4+9{\mathtt c}_{\mathtt h}^8-3}{4{\mathtt c}_{\mathtt h}^8}\, . \end{aligned}$$
(2.23)
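The first order coefficients \(p_1^{[1]}\) in (2.20) and \(a_1^{[1]}\) in (2.22) can be recovered numerically from the definitions (2.10) and (2.18). The following sketch is a rough check under stated assumptions: it uses only the first order truncations of \(\eta _\epsilon , \psi _\epsilon , \mathfrak {p}\) from (2.6) and (2.15), replaces \(c_\epsilon \) by \({\mathtt c}_{\mathtt h}\) (which differ at order \(\epsilon ^2\)), and approximates derivatives by central differences. The function name and step sizes are hypothetical.

```python
import math

def first_order_coefficients(h, x, eps=1e-4, d=1e-5):
    """Central-difference estimate of (d/d eps) p_eps(x) and a_eps(x) at eps = 0."""
    c = math.sqrt(math.tanh(h))

    def p_and_a(e):
        def eta_x(y):                       # derivative of e*cos(y)
            return -e * math.sin(y)

        def psi_x(y):                       # derivative of e*sin(y)/c_h
            return e * math.cos(y) / c

        def B(y):                           # second expression in (2.10)
            return (psi_x(y) - c) * eta_x(y) / (1 + eta_x(y) ** 2)

        def V(y):                           # first line of (2.10)
            return -B(y) * eta_x(y) + psi_x(y)

        def B_x(y):                         # numerical derivative of B
            return (B(y + d) - B(y - d)) / (2 * d)

        y = x + e * math.sin(x) / c**2      # x + p(x) at first order
        jac = 1 + e * math.cos(x) / c**2    # 1 + p_x(x) at first order
        p = (c - V(y)) / jac - c            # first line of (2.18)
        a = (1 + (V(y) - c) * B_x(y)) / jac - 1   # second line of (2.18)
        return p, a

    p_plus, a_plus = p_and_a(eps)
    p_minus, a_minus = p_and_a(-eps)
    return (p_plus - p_minus) / (2 * eps), (a_plus - a_minus) / (2 * eps)

if __name__ == "__main__":
    depth, point = 1.0, 0.7
    c = math.sqrt(math.tanh(depth))
    print(first_order_coefficients(depth, point))
    print(-2 / c * math.cos(point), -(c**2 + c**-2) * math.cos(point))
```

The two printed pairs should agree up to the finite-difference error, consistently with \(p_1(x) = -2 {\mathtt c}_{\mathtt h}^{-1} \cos (x)\) and \(a_1(x) = -({\mathtt c}_{\mathtt h}^2 + {\mathtt c}_{\mathtt h}^{-2}) \cos (x)\).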

Bloch-Floquet expansion. Since the operator \(\mathcal {L}_\epsilon \) in (2.17) has \(2\pi \)-periodic coefficients, Bloch-Floquet theory guarantees that

$$\begin{aligned} \sigma _{L^2({\mathbb {R}})} (\mathcal {L}_\epsilon ) = \bigcup _{\mu \in [- \frac{1}{2}, \frac{1}{2})} \sigma _{L^2(\mathbb {T})} (\mathcal {L}_{\mu , \epsilon }) \qquad \text {where} \quad \qquad \mathcal {L}_{\mu ,\epsilon }:= e^{- \textrm{i}\,\mu x} \, \mathcal {L}_\epsilon \, e^{\textrm{i}\,\mu x} \,. \end{aligned}$$

The domain \( [- \frac{1}{2}, \frac{1}{2}) \) is called, in solid state physics, the “first Brillouin zone”. In particular, if \(\lambda \) is an eigenvalue of \(\mathcal {L}_{\mu ,\epsilon }\) on \(L^2(\mathbb {T}, \mathbb {C}^2)\) with eigenvector \(v(x)\), then \(h (t,x) = e^{\lambda t} e^{\textrm{i}\,\mu x} v(x)\) solves \(h_t = \mathcal {L}_{\epsilon } h\). We remark that: (i) if \(A = \textrm{Op}(a) \) is a pseudo-differential operator with symbol \( a(x, \xi ) \), which is \(2\pi \)-periodic in \(x\), then \( A_\mu := e^{- \textrm{i}\,\mu x}A e^{ \textrm{i}\,\mu x} = \textrm{Op} (a(x, \xi + \mu )) \); (ii) if \(A\) is a real operator then \( \overline{ A_\mu } = A_{- \mu } \). As a consequence \( \sigma (A_{-\mu }) = \overline{ \sigma (A_{\mu }) } \) and it suffices to study \( \sigma (A_{\mu }) \) for \( \mu > 0 \). Furthermore \(\sigma (A_{\mu })\) is a 1-periodic set with respect to \(\mu \), so one can restrict to \(\mu \in [0, \frac{1}{2})\).

By the previous remarks the Floquet operator associated with the real operator \(\mathcal {L}_\epsilon \) in (2.17) is the complex Hamiltonian and reversible operator

$$\begin{aligned} \mathcal {L}_{\mu ,\epsilon } :&= \begin{bmatrix} (\partial _x+\textrm{i}\,\mu )\circ ({\mathtt c}_{\mathtt h}+p_\epsilon (x)) &{}\quad |D+\mu | \tanh \big ((\texttt{h}+ \texttt{f}_\epsilon ) |D+\mu | \big ) \\ -(1+a_\epsilon (x)) &{}\quad ({\mathtt c}_{\mathtt h}+p_\epsilon (x))(\partial _x+\textrm{i}\,\mu ) \end{bmatrix} \nonumber \\&= \underbrace{\begin{bmatrix} 0 &{} \textrm{Id}\\ -\textrm{Id}&{} 0 \end{bmatrix}}_{\displaystyle {=\mathcal {J}}} \underbrace{\begin{bmatrix} 1+a_\epsilon (x) &{}\quad -({\mathtt c}_{\mathtt h}+p_\epsilon (x))(\partial _x+\textrm{i}\,\mu ) \\ (\partial _x+\textrm{i}\,\mu )\circ ({\mathtt c}_{\mathtt h}+p_\epsilon (x)) &{}\quad |D+\mu | \tanh \big ((\texttt{h}+ \texttt{f}_\epsilon ) |D+\mu | \big ) \end{bmatrix}}_{\displaystyle {=:\mathcal {B}_{\mu ,\epsilon }}} \, . \end{aligned}$$
(2.24)

We regard \( \mathcal {L}_{\mu ,\epsilon } \) as an operator with domain \(H^1(\mathbb {T}):= H^1(\mathbb {T},\mathbb {C}^2)\) and range \(L^2(\mathbb {T}):=L^2(\mathbb {T},\mathbb {C}^2)\), equipped with the complex scalar product

$$\begin{aligned} (f,g):= \frac{1}{2\pi } \int _{0}^{2\pi } \left( f_1 \bar{g_1} + f_2 \bar{g_2} \right) \, \text {d} x \,, \quad \forall f= \begin{bmatrix}f_1 \\ f_2 \end{bmatrix}, \ \ g= \begin{bmatrix}g_1 \\ g_2 \end{bmatrix} \in L^2(\mathbb {T}, \mathbb {C}^2) \,.\nonumber \\ \end{aligned}$$
(2.25)

We also denote \( \Vert f \Vert ^2 = (f,f) \).

The complex operator \(\mathcal {L}_{\mu ,\epsilon }\) in (2.24) is Hamiltonian and reversible, in the sense of the following definition.

Definition 2.3

(Complex Hamiltonian/Reversible operator) A complex operator \(\mathcal {L}: H^1(\mathbb {T},\mathbb {C}^2) \rightarrow L^2(\mathbb {T},\mathbb {C}^2) \) is Hamiltonian, if \(\mathcal {L}= \mathcal {J}\mathcal {B}\) where \( \mathcal {B}\) is a self-adjoint operator, namely \( \mathcal {B}= \mathcal {B}^* \), where \(\mathcal {B}^*\) (with domain \(H^1(\mathbb {T})\)) is the adjoint with respect to the complex scalar product (2.25) of \(L^2(\mathbb {T})\); it is reversible if

$$\begin{aligned} \mathcal {L}\circ {\bar{\rho }}=- {\bar{\rho }}\circ \mathcal {L}\,, \end{aligned}$$
(2.26)

where \({\bar{\rho }}\) is the complex involution (cf. (2.3))

$$\begin{aligned} {\bar{\rho }}\begin{bmatrix}\eta (x) \\ \psi (x) \end{bmatrix}:= \begin{bmatrix}{\bar{\eta }}(-x) \\ -{\bar{\psi }}(-x) \end{bmatrix} \,. \end{aligned}$$
(2.27)

The property (2.26) for \( \mathcal {L}_{\mu ,\epsilon } \) follows because \( \mathcal {L}_\epsilon \) is a real operator which is reversible with respect to the involution \( \rho \) in (2.3). Equivalently, since \(\mathcal {J}\circ {\bar{\rho }}= -{\bar{\rho }}\circ \mathcal {J}\), the self-adjoint operator \(\mathcal {B}_{\mu ,\epsilon }\) is reversibility-preserving, i.e.

$$\begin{aligned} \mathcal {B}_{\mu ,\epsilon } \circ {\bar{\rho }}= {\bar{\rho }}\circ \mathcal {B}_{\mu ,\epsilon } \,. \end{aligned}$$
(2.28)

In addition \((\mu , \epsilon ) \mapsto \mathcal {L}_{\mu ,\epsilon } \in \mathcal {L}(H^1(\mathbb {T}), L^2(\mathbb {T}))\) is analytic, since the functions \(\epsilon \mapsto a_\epsilon \), \(p_\epsilon \) defined in (2.18) are analytic as maps \(B(\epsilon _0) \rightarrow H^1(\mathbb {T})\) and \({{\mathcal {L}}}_{\mu ,\epsilon }\) is analytic with respect to \(\mu \), since, for any \( \mu \in [-\frac{1}{2}, \frac{1}{2}) \),

$$\begin{aligned} |D+\mu | \tanh \big ((\texttt{h}+ \texttt{f}_\epsilon ) |D+\mu | \big ) = (D + \mu ) \tanh \big ((\texttt{h}+ \texttt{f}_\epsilon ) (D+\mu ) \big ).\qquad \end{aligned}$$
(2.29)

We also note that (see [33, Section 5.1])

$$\begin{aligned} |D+\mu | = |D| + \mu ({{\,\textrm{sgn}\,}}(D)+\Pi _0) \,, \quad \forall \mu > 0 \,, \end{aligned}$$
(2.30)

where \({{\,\textrm{sgn}\,}}(D)\) is the Fourier multiplier operator, acting on \(2\pi \)-periodic functions, with symbol

$$\begin{aligned} {{\,\textrm{sgn}\,}}(k):= 1\ \forall k > 0 \,, \quad {{\,\textrm{sgn}\,}}(0):=0 \,,\quad {{\,\textrm{sgn}\,}}(k):= -1 \ \forall k < 0 \,, \end{aligned}$$
(2.31)

and \(\Pi _0\) is the projector operator on the zero mode, \(\Pi _0f(x):= \frac{1}{2\pi } \int _\mathbb {T}f(x)\textrm{d}x. \)
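Both (2.29) and (2.30) are identities between Fourier multipliers, so they can be tested mode by mode. The short sketch below does this on the integer frequencies \(k\). The function names are hypothetical, and the argument `depth` stands for \(\texttt{h}+ \texttt{f}_\epsilon \), which is assumed positive.

```python
import math

def sgn(k):
    """Symbol (2.31): +1 for k > 0, 0 for k = 0, -1 for k < 0."""
    return (k > 0) - (k < 0)

def abs_shift_rhs(k, mu):
    """Right-hand side of (2.30) on the k-th Fourier mode."""
    pi0 = 1 if k == 0 else 0            # symbol of the projector Pi_0
    return abs(k) + mu * (sgn(k) + pi0)

def dispersion_abs(k, mu, depth):
    """Left-hand side of (2.29): |k + mu| tanh(depth |k + mu|)."""
    xi = abs(k + mu)
    return xi * math.tanh(depth * xi)

def dispersion_odd(k, mu, depth):
    """Right-hand side of (2.29): (k + mu) tanh(depth (k + mu))."""
    xi = k + mu
    return xi * math.tanh(depth * xi)

if __name__ == "__main__":
    mu, depth = 0.3, 1.2
    for k in range(-3, 4):
        print(k, abs(k + mu), abs_shift_rhs(k, mu),
              dispersion_abs(k, mu, depth), dispersion_odd(k, mu, depth))
```

The identity (2.29) holds because \(\xi \mapsto \xi \tanh (\xi )\) is even, whereas (2.30) uses that \(0< \mu < 1\): for \(k \le -1\) one has \(k + \mu < 0\), hence \(|k+\mu | = |k| - \mu \).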

Remark 2.4

If \( (\eta (x), \psi (x), c ) \) solve the traveling wave equations (2.5) then the rescaled functions \( ({\widetilde{\eta }} (x), {\widetilde{\psi }} (x), {\widetilde{c}} ):= (\eta (x), \sqrt{g} \psi (x), \sqrt{g} c) \) solve the same equations with gravity constant g instead of 1. The eigenvalues of the corresponding linearized operators (2.9) and (2.24) for a general gravity g are those of the \(g = 1\) case multiplied by \(\sqrt{g}\).

Our aim is to prove the existence of eigenvalues of \( \mathcal {L}_{\mu ,\epsilon } \) in (2.24) with nonzero real part. We remark that the Hamiltonian structure of \(\mathcal {L}_{\mu ,\epsilon }\) implies that eigenvalues with nonzero real part may arise only from multiple eigenvalues of \(\mathcal {L}_{\mu ,0}\) (“Krein criterion”), because if \(\lambda \) is an eigenvalue of \(\mathcal {L}_{\mu ,\epsilon }\) then so is \(-{\bar{\lambda }}\), and the total algebraic multiplicity of the eigenvalues is conserved under small perturbation. We now describe the spectrum of \(\mathcal {L}_{\mu ,0}\).

The spectrum of \(\mathcal {L}_{\mu ,0}\). The spectrum of the Fourier multiplier matrix operator

$$\begin{aligned} \mathcal {L}_{\mu ,0} = \begin{bmatrix} {\mathtt c}_{\mathtt h}( \partial _x+\textrm{i}\,\mu ) &{}\quad |D+\mu | \, \tanh \big (\texttt{h}|D+\mu | \big ) \\ -1 &{}\quad {\mathtt c}_{\mathtt h}(\partial _x+\textrm{i}\,\mu ) \end{bmatrix} \end{aligned}$$
(2.32)

consists of the purely imaginary eigenvalues \(\{\lambda _k^\pm (\mu ),\; k\in \mathbb {Z}\} \), where

$$\begin{aligned} \lambda _k^\pm (\mu ):= \textrm{i}\,\big ( {\mathtt c}_{\mathtt h}( \pm k+\mu ) \mp \sqrt{|k \pm \mu |\tanh (\texttt{h}|k \pm \mu |)} \big ) \,. \end{aligned}$$
(2.33)
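Since \(\mathcal {L}_{\mu ,0}\) is a Fourier multiplier, formula (2.33) can be verified mode by mode: on \(e^{\textrm{i}\,j x}\) the operator (2.32) acts as a \(2\times 2\) matrix. The sketch below, with hypothetical function names, compares the eigenvalues of that matrix with (2.33), using the correspondence \(j = \pm k\) for \(\lambda _k^\pm \).

```python
import cmath
import math

def mode_eigenvalues(j, mu, h):
    """Eigenvalues of the 2x2 symbol of L_{mu,0} in (2.32) on the mode e^{ijx}."""
    c = math.sqrt(math.tanh(h))
    xi = j + mu
    a = 1j * c * xi                          # both diagonal entries
    b = abs(xi) * math.tanh(h * abs(xi))     # upper right entry
    root = cmath.sqrt(-b)                    # sqrt of (upper right) * (lower left = -1)
    return a + root, a - root

def lam(k, sign, mu, h):
    """Formula (2.33); sign = +1 or -1 selects lambda_k^+ or lambda_k^-."""
    c = math.sqrt(math.tanh(h))
    xi = sign * k + mu
    return 1j * (c * xi - sign * math.sqrt(abs(xi) * math.tanh(h * abs(xi))))

if __name__ == "__main__":
    h = 1.5
    # At mu = 0 the four eigenvalues with k = 0, 1 collapse to zero.
    print([lam(k, s, 0.0, h) for k in (0, 1) for s in (1, -1)])
    # For small mu > 0 they split along the imaginary axis.
    print([lam(k, s, 0.1, h) for k in (0, 1) for s in (1, -1)])
```

All the values returned by `lam` are purely imaginary, in agreement with the statement that the spectrum of \(\mathcal {L}_{\mu ,0}\) lies on the imaginary axis.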

For \(\mu =0\) the real operator \(\mathcal {L}_{0,0}\) possesses the eigenvalue 0 with algebraic multiplicity 4,

$$\begin{aligned} \lambda _0^+(0) = \lambda _0^-(0) = \lambda _1^+(0) = \lambda _{1}^-(0)=0 \,, \end{aligned}$$

and geometric multiplicity 3. A real basis of the kernel of \(\mathcal {L}_{0,0}\) is

$$\begin{aligned} f_1^+ := \begin{bmatrix} {\mathtt c}_{\mathtt h}^{1/2} \cos (x) \\ {\mathtt c}_{\mathtt h}^{-1/2} \sin (x) \end{bmatrix}, \quad f_1^{-} := \begin{bmatrix}- {\mathtt c}_{\mathtt h}^{1/2} \sin (x) \\ {\mathtt c}_{\mathtt h}^{-1/2}\cos (x) \end{bmatrix},\qquad f_0^-:=\begin{bmatrix}0 \\ 1 \end{bmatrix} \, , \end{aligned}$$
(2.34)

together with the generalized eigenvector

$$\begin{aligned} f_0^+:=\begin{bmatrix}1 \\ 0 \end{bmatrix} , \qquad \mathcal {L}_{0,0}f_0^+ =-f_0^- \, . \end{aligned}$$
(2.35)

Furthermore 0 is an isolated eigenvalue for \(\mathcal {L}_{0,0}\), namely the spectrum \(\sigma \left( \mathcal {L}_{0,0}\right) \) decomposes in two separated parts,

$$\begin{aligned} \sigma \left( \mathcal {L}_{0,0}\right) = \sigma '\left( \mathcal {L}_{0,0}\right) \cup \sigma ''\left( \mathcal {L}_{0,0}\right) , \quad \text {where} \quad \sigma '(\mathcal {L}_{0,0}):=\{0\}, \end{aligned}$$
(2.36)

and \( \sigma ''(\mathcal {L}_{0,0}):= \big \{ \lambda _k^\sigma (0),\ k \ne 0,1 \,, \sigma = \pm \big \} \).

We shall also use that, as proved in Theorem 4.1 in [33], the operator \( {{\mathcal {L}}}_{0,\epsilon } \) possesses, for any sufficiently small \(\epsilon \ne 0\), the eigenvalue 0 with a four dimensional generalized kernel, spanned by \( \epsilon \)-dependent vectors \( U_1, {\tilde{U}}_2, U_3, U_4 \) satisfying, for some real constants \( \alpha _\epsilon , \beta _\epsilon \),

$$\begin{aligned}{} & {} {{\mathcal {L}}}_{0,\epsilon } U_1 = 0 \,, \ \ {{\mathcal {L}}}_{0,\epsilon } {\tilde{U}}_2 = 0 \,, \ \ {{\mathcal {L}}}_{0,\epsilon } U_3 = \alpha _\epsilon \, {\tilde{U}}_2 \,,\nonumber \\{} & {} {{\mathcal {L}}}_{0,\epsilon } U_4 = - U_1- \beta _\epsilon {\tilde{U}}_2 \,, \quad U_1:= \begin{bmatrix}0 \\ 1 \end{bmatrix} \,. \end{aligned}$$
(2.37)

By Kato’s perturbation theory (see Lemma 3.1 below) for any \(\mu , \epsilon \ne 0\) sufficiently small, the perturbed spectrum \(\sigma \left( \mathcal {L}_{\mu ,\epsilon }\right) \) admits a disjoint decomposition as

$$\begin{aligned} \sigma \left( \mathcal {L}_{\mu ,\epsilon }\right) = \sigma '\left( \mathcal {L}_{\mu ,\epsilon }\right) \cup \sigma ''\left( \mathcal {L}_{\mu ,\epsilon }\right) \,, \end{aligned}$$
(2.38)

where \( \sigma '\left( \mathcal {L}_{\mu ,\epsilon }\right) \) consists of 4 eigenvalues close to 0. We denote by \(\mathcal {V}_{\mu , \epsilon }\) the spectral subspace associated with \(\sigma '\left( \mathcal {L}_{\mu ,\epsilon }\right) \), which has dimension 4 and is invariant under \(\mathcal {L}_{\mu , \epsilon }\). Our goal is to prove that, for \( \epsilon \) small, for values of the Floquet exponent \( \mu \) in an interval of order \( \epsilon \), the \(4\times 4\) matrix which represents the operator \( \mathcal {L}_{\mu ,\epsilon }: \mathcal {V}_{\mu ,\epsilon } \rightarrow \mathcal {V}_{\mu ,\epsilon } \) possesses a pair of eigenvalues close to zero with opposite nonzero real parts.

Before stating our main result, let us introduce a notation we shall use throughout the paper.

  • Notation: we denote by \(\mathcal {O}(\mu ^{m_1}\epsilon ^{n_1},\dots ,\mu ^{m_p}\epsilon ^{n_p})\), \( m_j, n_j \in \mathbb {N}\) (for us \(\mathbb {N}:=\{1,2,\dots \} \)), analytic functions of \((\mu ,\epsilon )\) with values in a Banach space \(X\) which satisfy, for some \( C > 0 \) uniform for \(\texttt{h}\) in any compact set of \((0, + \infty )\), the bound \(\Vert \mathcal {O}(\mu ^{m_1}\epsilon ^{n_1},\dots ,\mu ^{m_p}\epsilon ^{n_p})\Vert _X \le C \sum _{j = 1}^p |\mu |^{m_j}|\epsilon |^{n_j}\) for small values of \((\mu , \epsilon )\). Similarly we denote by \(r_k (\mu ^{m_1}\epsilon ^{n_1},\dots ,\mu ^{m_p}\epsilon ^{n_p}) \) scalar functions \(\mathcal {O}(\mu ^{m_1}\epsilon ^{n_1},\dots ,\mu ^{m_p}\epsilon ^{n_p})\) which are also real analytic.

Our complete spectral result is the following:

Theorem 2.5

(Complete Benjamin–Feir spectrum) There exist \( \epsilon _0, \mu _0>0 \), uniformly for the depth \( \texttt{h}\) in any compact set of \( (0,+\infty )\), such that, for any \( 0\,<\, \mu < \mu _0 \) and \( 0\le \epsilon < \epsilon _0 \), the operator \( \mathcal {L}_{\mu ,\epsilon }: \mathcal {V}_{\mu ,\epsilon } \rightarrow \mathcal {V}_{\mu ,\epsilon } \) can be represented by a \(4\times 4\) matrix of the form

$$\begin{aligned} \begin{pmatrix} \texttt{U} & 0 \\ 0 & \texttt{S} \end{pmatrix} \end{aligned}$$
(2.39)

where \( \texttt{U} \) and \( \texttt{S} \) are \( 2 \times 2 \) matrices, with identical diagonal entries each, of the form

$$\begin{aligned}&\texttt{U} = {\begin{pmatrix} \textrm{i}\,\big (({\mathtt c}_{\mathtt h}- \tfrac{1}{2}\texttt{e}_{12})\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) &{}\quad -\texttt{e}_{22}\frac{\mu }{8}(1+r_5(\epsilon ,\mu )) \\ -\mu \epsilon ^2 \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}+ r_1'( \mu \epsilon ^3, \mu ^2\epsilon ^2 ) +\texttt{e}_{22}\frac{\mu ^3}{8} (1+r_1''(\epsilon ,\mu )) &{}\quad \textrm{i}\,\big ( ({\mathtt c}_{\mathtt h}-\tfrac{1}{2}\texttt{e}_{12})\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) \end{pmatrix}}\, , \nonumber \\&\texttt{S} = \begin{pmatrix} \textrm{i}\,{\mathtt c}_{\mathtt h}\mu + \textrm{i}\,{ r_9(\mu \epsilon ^2, \mu ^2\epsilon )} &{} \tanh (\texttt{h}\mu )+ {r_{10}(\mu \epsilon )} \\ -\mu + {r_8(\mu \epsilon ^2, \mu ^3 \epsilon )} &{} \textrm{i}\,{\mathtt c}_{\mathtt h}\mu +\textrm{i}\,{r_9(\mu \epsilon ^2,\mu ^2\epsilon ) } \end{pmatrix}\, , \end{aligned}$$
(2.40)

where \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\), \(\texttt{e}_{12}, \texttt{e}_{22}\) are defined in (1.1), (1.2), (1.3). The eigenvalues of \( \texttt{U} \) have the form

$$\begin{aligned} \begin{aligned} \lambda _1^\pm (\mu ,\epsilon )&= \textrm{i}\,\frac{1}{2}\breve{\mathtt c}_\texttt{h}\mu +\textrm{i}\,r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \\&\quad \pm \tfrac{1}{8} \mu \sqrt{\texttt{e}_{22}(\texttt{h}) (1+r_5(\epsilon ,\mu )) } \sqrt{\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu , \epsilon )} \,, \end{aligned} \end{aligned}$$
(2.41)

where \(\breve{\mathtt c}_\texttt{h}:=2 {\mathtt c}_{\mathtt h}- \texttt{e}_{12}(\texttt{h})\) and \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu , \epsilon )\) is the Benjamin–Feir discriminant function (1.6) (with \(r_1(\epsilon ^3, \mu \epsilon ^2):=-8 r_1'(\epsilon ^3, \mu \epsilon ^2)\)). As \(\texttt{e}_{22}(\texttt{h})>0\), they have non-zero real part if and only if \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h}; \mu , \epsilon )>0\).

The eigenvalues of the matrix \( \texttt{S} \) are a pair of purely imaginary eigenvalues of the form

$$\begin{aligned} \lambda _0^\pm (\mu , \epsilon ) = \textrm{i}\,{\mathtt c}_{\mathtt h}\mu \big (1+{r_9(\epsilon ^2,\mu \epsilon )\big )} \mp \textrm{i}\,\sqrt{\mu \tanh (\texttt{h}\mu )}\big (1+ {r(\epsilon )}\big )\,. \end{aligned}$$
(2.42)

For \( \epsilon = 0\) the eigenvalues \( \lambda _1^\pm (\mu ,0), \lambda _0^\pm (\mu ,0) \) coincide with those in (2.33).
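To see how (2.41) and (2.42) follow from (2.40), one can drop all the remainders \(r_j\) and compute the eigenvalues of the resulting \(2\times 2\) matrices directly. The sketch below does this numerically. It is only an illustration: the coefficients \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}, \texttt{e}_{12}, \texttt{e}_{22}\) are defined in (1.1)-(1.3), so the numbers used here are arbitrary placeholders and not the true values at any depth, and the leading order discriminant \(8 \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\epsilon ^2 - \texttt{e}_{22} \mu ^2\) is simply what the product of the off-diagonal entries of the truncated \(\texttt{U}\) yields.

```python
import cmath
import math

def eig2(a, b, c, d):
    """Eigenvalues of the 2x2 matrix [[a, b], [c, d]]."""
    half_trace = (a + d) / 2
    root = cmath.sqrt(((a - d) / 2) ** 2 + b * c)
    return half_trace + root, half_trace - root

def truncated_U_eigs(mu, eps, c_h, e_wb, e12, e22):
    """Eigenvalues of U in (2.40) with every remainder r_j set to zero."""
    diag = 1j * (c_h - 0.5 * e12) * mu
    upper = -e22 * mu / 8
    lower = -mu * eps**2 * e_wb + e22 * mu**3 / 8
    return eig2(diag, upper, lower, diag)

def leading_order_241(mu, eps, c_h, e_wb, e12, e22):
    """Formula (2.41) without remainders, and the truncated discriminant."""
    delta = 8 * e_wb * eps**2 - e22 * mu**2
    center = 0.5j * (2 * c_h - e12) * mu
    split = mu / 8 * cmath.sqrt(e22 * delta)
    return center + split, center - split, delta

def truncated_S_eigs(mu, h):
    """Eigenvalues of S in (2.40) with every remainder r_j set to zero."""
    c_h = math.sqrt(math.tanh(h))
    return eig2(1j * c_h * mu, math.tanh(h * mu), -mu, 1j * c_h * mu)

if __name__ == "__main__":
    c_h = math.sqrt(math.tanh(2.0))
    # Placeholder coefficients (assumptions, see the text above).
    print(truncated_U_eigs(0.05, 0.1, c_h, 0.5, 0.3, 1.2))    # nonzero real parts
    print(truncated_U_eigs(0.05, 0.1, c_h, -0.5, 0.3, 1.2))   # purely imaginary
    print(truncated_S_eigs(0.05, 2.0))                        # purely imaginary
```

With a positive placeholder for \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\) the real parts are nonzero only while \(\texttt{e}_{22} \mu ^2 < 8 \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\epsilon ^2\), that is for \(\mu \) in an interval of size proportional to \(\epsilon \).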

Remark 2.6

At \(\epsilon = 0\), the eigenvalues in (2.41) have the Taylor expansion

$$\begin{aligned} \lambda ^\pm _1(\mu ,0) = \textrm{i}\,\left( {\mathtt c}_{\mathtt h}- \frac{1}{2} \texttt{e}_{12}(\texttt{h})\right) \mu \pm \textrm{i}\,\frac{\texttt{e}_{22}(\texttt{h})}{8} \mu ^2 + \mathcal {O}(\mu ^3) \,, \end{aligned}$$

which coincides with the one of \(\lambda ^\pm _1(\mu )\) in (2.33), in view of the coefficients \(\texttt{e}_{12}(\texttt{h})\) and \(\texttt{e}_{22}(\texttt{h})\) defined in (1.2), (1.3).

We conclude this section by describing our approach in detail.

Ideas and scheme of proof. The first step is to exploit, as in [6], Kato’s theory to prolong the unperturbed symplectic basis \(\{f_1^\pm , f_0^\pm \}\) of \(\mathcal {V}_{0,0}\) in (2.34)-(2.35) into a symplectic basis \( \{ f^\sigma _k(\mu ,\epsilon ), k = 0,1, \sigma = \pm \} \) of the spectral subspace \(\mathcal {V}_{\mu ,\epsilon }\) associated with \(\sigma '\left( \mathcal {L}_{\mu ,\epsilon }\right) \) in (2.38), depending analytically on \(\mu , \epsilon \).

Its expansion in \(\mu ,\epsilon \) is provided in Lemma 4.2. This procedure reduces our spectral problem to determining the eigenvalues of the \(4\times 4\) Hamiltonian and reversible matrix \(\texttt{L}_{\mu ,\epsilon }\) (Lemma 3.4), representing the action of the operator \( \mathcal {L}_{\mu ,\epsilon }- \textrm{i}\,{\mathtt c}_{\mathtt h}\mu \) on \( \{f_k^\sigma (\mu ,\epsilon )\} \). In Proposition 4.3 we prove that

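$$\begin{aligned} \texttt{L}_{\mu ,\epsilon } = \begin{pmatrix} \texttt{J}_2 E & \texttt{J}_2 F \\ \texttt{J}_2 F^* & \texttt{J}_2 G \end{pmatrix} \end{aligned}$$
(2.43)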

and the \( 2 \times 2 \) matrices \(E, G, F\) have the expansions (4.10)-(4.12). In finite depth this computation is much more involved than in deep water, as we need to track the exact dependence of the matrix entries with respect to \(\texttt{h}\). In particular the matrix \(E\) is

$$\begin{aligned} E = \begin{pmatrix} \texttt{e}_{11} \epsilon ^2(1+r_1'(\epsilon ,\mu \epsilon )) - \texttt{e}_{22}\frac{\mu ^2}{8}(1+r_1''(\epsilon ,\mu )) &{}\quad \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) \\ - \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12} \mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) &{}\quad -\texttt{e}_{22}\frac{\mu ^2}{8}(1+r_5(\epsilon ,\mu )) \end{pmatrix}\nonumber \\ \end{aligned}$$
(2.44)

where the coefficients \(\texttt{e}_{11} \) and \(\texttt{e}_{22}\), defined in (4.13) and (1.3), are strictly positive for any value of \(\texttt{h}>0\). Thus the submatrix \(\texttt{J}_2 E\) has a pair of eigenvalues with nonzero real part, for any value of \(\texttt{h}>0\), provided \(0<\mu < \overline{\mu }(\epsilon ) \sim \epsilon \). On the other hand, the complete \(4\times 4\) matrix \(\texttt{L}_{\mu ,\epsilon }\) must turn out to possess unstable eigenvalues if and only if the depth exceeds the celebrated Whitham-Benjamin threshold \(\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\sim 1.363\ldots \). Indeed the correct eigenvalues of \(\texttt{L}_{\mu ,\epsilon }\) are not a small perturbation of those of \({\begin{pmatrix} \texttt{J}_2 E &{} 0 \\ 0 &{}\texttt{J}_2 G \end{pmatrix}} \) and emerge only after one non-perturbative step of block diagonalization. This was not the case in infinitely deep water [6], where the corresponding submatrix \(\texttt{J}_2 E\) already exhibited the Benjamin–Feir eigenvalues, and we only had to check their stability under perturbation.

Remark 2.7

We underline that (2.44) is not a simple Taylor expansion in \(\mu , \epsilon \): note that the (2, 2)-entry in (2.44) does not have any term \(\mathcal {O}( \epsilon ^m )\) nor \( \mathcal {O}( \mu \epsilon ^m ) \) for any \( m \in \mathbb {N}\). These terms could change the sign of the entry (2, 2) which instead, in (2.44), is always negative (recall that \(\texttt{e}_{22}(\texttt{h}) >0\)). We prove the absence of terms \(\epsilon ^m\) by exploiting the structural information (2.37) concerning the four dimensional generalized kernel of the operator \(\mathcal {L}_{0,\epsilon }\) for any \(\epsilon >0\), see Lemma 4.4. We also note that the \(2 \times 2\) matrices \(\texttt{J}_2 E \) and \(\texttt{J}_2 G\) in (2.43) both have eigenvalues of size \(\mathcal {O}(\mu )\). As already mentioned in the introduction, this is a crucial difference with the deep water case, where the eigenvalues of \(\texttt{J}_2 G\) are \(\mathcal {O}(\sqrt{\mu })\).

In order to determine the spectrum of the matrix \(\texttt{L}_{\mu ,\epsilon }\) in (2.43), we perform a block diagonalization of \(\texttt{L}_{\mu ,\epsilon }\) to eliminate the coupling term \( \texttt{J}_2 F \) (which has size \( \epsilon \), see (4.12)). We proceed, in Sect. 5, in three steps.

1. Symplectic rescaling. We first perform a symplectic rescaling which is singular at \(\mu =0\), see Lemma 5.1, obtaining the matrix \(\texttt{L}_{\mu ,\epsilon }^{(1)}\). The effects are twofold: (i) the diagonal elements of

$$\begin{aligned} E^{(1)} = \begin{pmatrix} \texttt{e}_{11} \mu \epsilon ^2(1+r_1'(\epsilon ,\mu \epsilon ))- \texttt{e}_{22}\frac{\mu ^3}{8}(1+r_1''(\epsilon ,\mu )) &{}\quad \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) \\ - \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) &{}\quad -\texttt{e}_{22}\frac{\mu }{8}(1+r_5(\epsilon ,\mu )) \end{pmatrix}\nonumber \\ \end{aligned}$$
(2.45)

have size \(\mathcal {O}(\mu )\), as well as those of \(G^{(1)}\), and (ii) the matrix \(F^{(1)}\) has the smaller size \( { \mathcal {O}(\mu \epsilon ) } \).

2. Non-perturbative step of block-diagonalization (Section  5.1). Inspired by KAM theory, we perform one step of block decoupling to decrease further the size of the off-diagonal blocks. This step modifies the matrix \(\texttt{J}_2 E^{(1)}\) in a substantial way, by a term \( \mathcal {O}(\mu \epsilon ^2 )\). Let us explain this step in more detail. In order to reduce the size of \(\texttt{J}_2 F^{(1)} \), we conjugate \(\texttt{L}_{\mu ,\epsilon }^{(1)}\) by the symplectic matrix \(\exp (S^{(1)})\), where \(S^{(1)}\) is a Hamiltonian matrix with the same form as \( \texttt{J}_2 F^{(1)} \), see (5.9). The transformed matrix \(\texttt{L}_{\mu ,\epsilon }^{(2)} = \exp (S^{(1)}) \texttt{L}_{\mu ,\epsilon }^{(1)}\exp (-S^{(1)}) \) has the Lie expansion

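$$\begin{aligned} \texttt{L}_{\mu ,\epsilon }^{(2)}&= \begin{pmatrix} \texttt{J}_2 E^{(1)} & 0 \\ 0 & \texttt{J}_2 G^{(1)} \end{pmatrix} \nonumber \\&\quad + \begin{pmatrix} 0 & \texttt{J}_2 F^{(1)} \\ \texttt{J}_2 [F^{(1)}]^* & 0 \end{pmatrix} + \left[ S^{(1)}, \begin{pmatrix} \texttt{J}_2 E^{(1)} & 0 \\ 0 & \texttt{J}_2 G^{(1)} \end{pmatrix} \right] \nonumber \\&\quad + \frac{1}{2} \left[ S^{(1)}, \left[ S^{(1)}, \begin{pmatrix} \texttt{J}_2 E^{(1)} & 0 \\ 0 & \texttt{J}_2 G^{(1)} \end{pmatrix} \right] \right] + \left[ S^{(1)}, \begin{pmatrix} 0 & \texttt{J}_2 F^{(1)} \\ \texttt{J}_2 [F^{(1)}]^* & 0 \end{pmatrix} \right] + \text {h.o.t.} \end{aligned}$$
(2.46)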

The first line on the right hand side of (2.46) is the previous block-diagonal matrix, the second line of (2.46) is a purely off-diagonal matrix, the third line is the sum of two block-diagonal matrices, and “h.o.t.” collects terms of much smaller size. The matrix \(S^{(1)}\) is determined in such a way that the second line of (2.46) vanishes, and therefore the remaining off-diagonal matrices (appearing in the h.o.t. remainder) are smaller in size. Unlike the infinitely deep water case [6], the block-diagonal corrections in the third line of (2.46) are not perturbative, modifying substantially the block-diagonal part. More precisely we obtain that \( \texttt{L}_{\mu ,\epsilon }^{(2)} \) has the form (5.10) with

$$\begin{aligned} E^{(2)}:= {\begin{pmatrix} \mu \epsilon ^2 \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}+ r_1'(\mu \epsilon ^3, \mu ^2 \epsilon ^2 )-\texttt{e}_{22}\frac{\mu ^3}{8}(1+r_1''(\epsilon ,\mu )) &{}\quad \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) \\ - \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) &{}\quad -\texttt{e}_{22}\frac{\mu }{8}(1+r_5(\epsilon ,\mu )) \end{pmatrix}}\,. \end{aligned}$$

Note the appearance of the Whitham-Benjamin function \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h}) \) in the (1,1)-entry of \( E^{(2)} \), which changes sign at the critical depth \(\texttt{h}_{\scriptscriptstyle {\textsc {WB}}}\), see Fig. 1, unlike the coefficient \( \texttt{e}_{11} (\texttt{h})> 0 \) in (2.45). If \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h}) >0\) and \(\epsilon \) and \(\mu \) are sufficiently small, the matrix \(\texttt{J}_2 E^{(2)}\) has eigenvalues with non-zero real part (recall that \(\texttt{e}_{22}(\texttt{h})>0\) for any \( \texttt{h}\)). On the contrary, if \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}(\texttt{h}) <0\), then the eigenvalues of \(\texttt{J}_2 E^{(2)}\) lie on the imaginary axis.

3. Complete block-diagonalization (Section 5.2). In Lemma 5.9 we completely block-diagonalize \(\texttt{L}^{(2)}_{\mu ,\epsilon }\) by means of a standard implicit function theorem, finally proving that \(\texttt{L}_{\mu ,\epsilon }\) is conjugated to the matrix (2.39).
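The dichotomy stated in step 2 for \(\texttt{J}_2 E^{(2)}\) can be illustrated by a leading-order sketch. It assumes \(\texttt{J}_2 = \begin{pmatrix} 0 &{} 1 \\ -1 &{} 0 \end{pmatrix}\), uses real numbers \(a, b, d\) introduced here only for illustration, and drops all the remainders \(r_j\); it is a heuristic, not a substitute for the proof. For a self-adjoint, reversibility-preserving \(2\times 2\) matrix one has

$$\begin{aligned} E = \begin{pmatrix} a &{} \textrm{i}\,b \\ -\textrm{i}\,b &{} d \end{pmatrix}, \quad \texttt{J}_2 E = \begin{pmatrix} -\textrm{i}\,b &{} d \\ -a &{} -\textrm{i}\,b \end{pmatrix}, \quad \det (\texttt{J}_2 E - \lambda ) = (\lambda + \textrm{i}\,b)^2 + a d \,, \quad \lambda ^\pm = -\textrm{i}\,b \pm \sqrt{-ad} \,. \end{aligned}$$

Hence the eigenvalues have non-zero real part exactly when \(ad<0\). For \(E^{(2)}\), at leading order, \(a \approx \mu \big (\epsilon ^2 \texttt{e}_{\scriptscriptstyle {\textsc {WB}}} - \texttt{e}_{22}\tfrac{\mu ^2}{8}\big )\) and \(d \approx -\texttt{e}_{22}\tfrac{\mu }{8}\), so that

$$\begin{aligned} -ad \approx \frac{\mu ^2}{64}\, \texttt{e}_{22} \big ( 8\, \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\, \epsilon ^2 - \texttt{e}_{22}\, \mu ^2 \big ) \,. \end{aligned}$$

Since \(\texttt{e}_{22}>0\), this quantity is positive only if \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}>0\) and, at this order, \(\mu ^2 < 8\, \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\, \epsilon ^2 / \texttt{e}_{22}\), whereas it is negative whenever \(\texttt{e}_{\scriptscriptstyle {\textsc {WB}}}<0\).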

3 Perturbative Approach to the Separated Eigenvalues

We apply Kato’s similarity transformation theory [26, I-§4-6, II-§4] to study the splitting of the eigenvalues of \( \mathcal {L}_{\mu ,\epsilon } \) close to 0 for small values of \( \mu \) and \( \epsilon \), following [6]. First of all, it is convenient to decompose the operator \( \mathcal {L}_{\mu ,\epsilon }\) in (2.24) as

$$\begin{aligned} \mathcal {L}_{\mu ,\epsilon } = \textrm{i}\,{\mathtt c}_{\mathtt h}\mu + \mathscr {L}_{\mu ,\epsilon } \,, \qquad \mu > 0 \,, \end{aligned}$$
(3.1)

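where, using also (2.30), \(\mathscr {L}_{\mu ,\epsilon }\) is the Hamiltonian operator

$$\begin{aligned} \mathscr {L}_{\mu ,\epsilon }&= \begin{bmatrix} \partial _x\circ ({\mathtt c}_{\mathtt h}+p_\epsilon (x)) + \textrm{i}\,\mu \, p_\epsilon (x) &{} |D+\mu | \tanh \big ((\texttt{h}+\texttt{f}_\epsilon )|D+\mu |\big ) \\ -(1+a_\epsilon (x)) &{} ({\mathtt c}_{\mathtt h}+p_\epsilon (x))\partial _x + \textrm{i}\,\mu \, p_\epsilon (x) \end{bmatrix} = \mathcal {J}\, \mathcal {B}_{\mu ,\epsilon } \,, \\ \mathcal {B}_{\mu ,\epsilon }&:= \begin{bmatrix} 1+a_\epsilon (x) &{} -({\mathtt c}_{\mathtt h}+p_\epsilon (x))\partial _x - \textrm{i}\,\mu \, p_\epsilon (x) \\ \partial _x\circ ({\mathtt c}_{\mathtt h}+p_\epsilon (x)) + \textrm{i}\,\mu \, p_\epsilon (x) &{} |D+\mu | \tanh \big ((\texttt{h}+\texttt{f}_\epsilon )|D+\mu |\big ) \end{bmatrix} \,, \end{aligned}$$
(3.2)

with \(\mathcal {B}_{\mu ,\epsilon }\) selfadjoint, and it is also reversible, namely it satisfies, by (2.26),

$$\begin{aligned} \mathscr {L}_{\mu ,\epsilon }\circ {\bar{\rho }}=- {\bar{\rho }}\circ \mathscr {L}_{\mu ,\epsilon } \,, \qquad {\bar{\rho }} \text{ defined } \text{ in } (2.27) \,, \end{aligned}$$
(3.3)

whereas \(\mathcal {B}_{\mu ,\epsilon }\) is reversibility-preserving, i.e. fulfills (2.28). Note also that \(\mathcal {B}_{0,\epsilon }\) is a real operator.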

The scalar operator \( \textrm{i}\,{\mathtt c}_{\mathtt h}\mu \equiv \textrm{i}\,{\mathtt c}_{\mathtt h}\mu \, \text {Id}\) simply translates the spectrum of \( \mathscr {L}_{\mu ,\epsilon }\) along the imaginary axis by the quantity \( \textrm{i}\,{\mathtt c}_{\mathtt h}\mu \), that is, in view of (3.1), \( \sigma ({{\mathcal {L}}}_{\mu ,\epsilon }) = \textrm{i}\,{\mathtt c}_{\mathtt h}\mu + \sigma (\mathscr {L}_{\mu ,\epsilon }) \,. \) Thus in the sequel we focus on studying the spectrum of \( \mathscr {L}_{\mu ,\epsilon }\).

Note also that \(\mathscr {L}_{0,\epsilon } = \mathcal {L}_{0,\epsilon }\) for any \(\epsilon \ge 0\). In particular \(\mathscr {L}_{0,0}\) has zero as isolated eigenvalue with algebraic multiplicity 4, geometric multiplicity 3 and generalized kernel spanned by the vectors \(\{f^+_1, f^-_1, f^+_0, f^-_0\}\) in (2.34), (2.35); furthermore, its spectrum is separated as in (2.36). For any \(\epsilon \ne 0\) small, \(\mathscr {L}_{0,\epsilon }\) has zero as isolated eigenvalue with geometric multiplicity 2, and two generalized eigenvectors satisfying (2.37).

We remark that, in view of (2.30), the operator \(\mathscr {L}_{\mu ,\epsilon }\) is analytic with respect to \(\mu \). The operator \( \mathscr {L}_{\mu ,\epsilon }: Y \subset X \rightarrow X \) has domain \(Y:=H^1(\mathbb {T}):=H^1(\mathbb {T},\mathbb {C}^2)\) and range \(X:=L^2(\mathbb {T}):=L^2(\mathbb {T},\mathbb {C}^2)\).

Lemma 3.1

(Kato theory for separated eigenvalues) Let \(\Gamma \) be a closed, counterclockwise-oriented curve around 0 in the complex plane separating \(\sigma '\left( \mathscr {L}_{0,0}\right) =\{0\}\) and the other part of the spectrum \(\sigma ''\left( \mathscr {L}_{0,0}\right) \) in (2.36). There exist \(\epsilon _0, \mu _0>0\) such that for any \((\mu , \epsilon ) \in B(\mu _0)\times B(\epsilon _0)\) the following statements hold:

  1.

    The curve \(\Gamma \) belongs to the resolvent set of the operator \(\mathscr {L}_{\mu ,\epsilon }: Y \subset X \rightarrow X \) defined in (3.2).

  2.

    The operators

    $$\begin{aligned} P_{\mu ,\epsilon }:= -\frac{1}{2\pi \textrm{i}\,}\oint _\Gamma (\mathscr {L}_{\mu ,\epsilon }-\lambda )^{-1} \textrm{d}\lambda : X \rightarrow Y \end{aligned}$$
    (3.4)

    are well defined projectors commuting with \(\mathscr {L}_{\mu ,\epsilon }\), i.e. \( P_{\mu ,\epsilon }^2 = P_{\mu ,\epsilon } \) and \( P_{\mu ,\epsilon }\mathscr {L}_{\mu ,\epsilon } = \mathscr {L}_{\mu ,\epsilon } P_{\mu ,\epsilon } \). The map \((\mu , \epsilon )\mapsto P_{\mu ,\epsilon }\) is analytic from \(B({\mu _0})\times B({\epsilon _0})\) to \( \mathcal {L}(X, Y)\).

  3.

    The domain Y of the operator \(\mathscr {L}_{\mu ,\epsilon }\) decomposes as the direct sum

    $$\begin{aligned} Y= \mathcal {V}_{\mu ,\epsilon } \oplus \text {Ker}(P_{\mu ,\epsilon }) \,, \quad \mathcal {V}_{\mu ,\epsilon }:=\text {Rg}(P_{\mu ,\epsilon })=\text {Ker}(\textrm{Id}-P_{\mu ,\epsilon }) \,, \end{aligned}$$

    of closed invariant subspaces, namely \( \mathscr {L}_{\mu ,\epsilon }: \mathcal {V}_{\mu ,\epsilon } \rightarrow \mathcal {V}_{\mu ,\epsilon } \), \( \mathscr {L}_{\mu ,\epsilon }: \text {Ker}(P_{\mu ,\epsilon }) \rightarrow \text {Ker}(P_{\mu ,\epsilon }) \). Moreover

    $$\begin{aligned} \begin{aligned}&\sigma (\mathscr {L}_{\mu ,\epsilon })\cap \{ z \in \mathbb {C} \text{ inside } \Gamma \} = \sigma (\mathscr {L}_{\mu ,\epsilon }\vert _{{{\mathcal {V}}}_{\mu ,\epsilon }} ) = \sigma '(\mathscr {L}_{\mu , \epsilon }), \\&\sigma (\mathscr {L}_{\mu ,\epsilon })\cap \{ z \in \mathbb {C} \text{ outside } \Gamma \} = \sigma (\mathscr {L}_{\mu ,\epsilon }\vert _{Ker(P_{\mu ,\epsilon })} ) = \sigma ''( \mathscr {L}_{\mu , \epsilon }) \,. \end{aligned} \end{aligned}$$
  4.

    The projectors \(P_{\mu ,\epsilon }\) are similar to one another; the transformation operators

    $$\begin{aligned} U_{\mu ,\epsilon }:= \big ( \textrm{Id}-(P_{\mu ,\epsilon }-P_{0,0})^2 \big )^{-1/2} \big [ P_{\mu ,\epsilon }P_{0,0} + (\textrm{Id}- P_{\mu ,\epsilon })(\textrm{Id}-P_{0,0}) \big ]\nonumber \\ \end{aligned}$$
    (3.5)

    are bounded and invertible in Y and in X, with inverse

    $$\begin{aligned} U_{\mu ,\epsilon }^{-1} = \big [ P_{0,0} P_{\mu ,\epsilon }+(\textrm{Id}-P_{0,0}) (\textrm{Id}- P_{\mu ,\epsilon }) \big ] \big ( \textrm{Id}-(P_{\mu ,\epsilon }-P_{0,0})^2 \big )^{-1/2} \,, \end{aligned}$$

    and \( U_{\mu ,\epsilon } P_{0,0}U_{\mu ,\epsilon }^{-1} = P_{\mu ,\epsilon } \) as well as \( U_{\mu ,\epsilon }^{-1} P_{\mu ,\epsilon } U_{\mu ,\epsilon } = P_{0,0} \) (see Footnote 2). The map \((\mu , \epsilon )\mapsto U_{\mu ,\epsilon }\) is analytic from \(B(\mu _0)\times B(\epsilon _0)\) to \(\mathcal {L}(Y)\).

  5.

    The subspaces \(\mathcal {V}_{\mu ,\epsilon }=\text {Rg}(P_{\mu ,\epsilon })\) are isomorphic to one another: \( \mathcal {V}_{\mu ,\epsilon }= U_{\mu ,\epsilon }\mathcal {V}_{0,0}. \) In particular \(\dim \mathcal {V}_{\mu ,\epsilon } = \dim \mathcal {V}_{0,0}=4 \), for any \((\mu , \epsilon ) \in B(\mu _0)\times B(\epsilon _0)\).
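
The intertwining relation in item 4 can be checked by a short algebraic computation that uses only the projector property. The sketch below uses the shorthand \(P:=P_{\mu ,\epsilon }\), \(P_0:=P_{0,0}\), \(R:=(P-P_0)^2\), \(W:= PP_0+(\textrm{Id}-P)(\textrm{Id}-P_0)\) and \(W':= P_0P+(\textrm{Id}-P_0)(\textrm{Id}-P)\), all introduced here only for illustration; the inverse square root is understood through its power series, which converges when \(\Vert R\Vert <1\).

$$\begin{aligned}&P R = P - P P_0 P = R P \,, \qquad P_0 R = P_0 - P_0 P P_0 = R P_0 \,, \\&P W = P P_0 = W P_0 \,, \qquad W' W = W W' = \textrm{Id}- R \,, \\&P\, U_{\mu ,\epsilon } = (\textrm{Id}-R)^{-1/2} P W = (\textrm{Id}-R)^{-1/2} W P_0 = U_{\mu ,\epsilon }\, P_0 \,. \end{aligned}$$

Since \(R\) commutes with \(P\) and \(P_0\), it commutes with \(W\) and \(W'\), and the identity \(W'W=WW'=\textrm{Id}-R\) is consistent with the stated formula \(U_{\mu ,\epsilon }^{-1} = W' (\textrm{Id}-R)^{-1/2}\).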

Proof

For any \( \lambda \in \mathbb {C}\) we decompose \(\mathscr {L}_{\mu ,\epsilon }-\lambda = \mathscr {L}_{0,0}-\lambda + {{\mathcal {R}}}_{\mu ,\epsilon } \) where \( \mathscr {L}_{0,0} = \begin{bmatrix} {\mathtt c}_{\mathtt h}\partial _x &{} |D| \tanh (\texttt{h}|D|) \\ -1 &{} {\mathtt c}_{\mathtt h}\partial _x \end{bmatrix}\) and

$$\begin{aligned} {{\mathcal {R}}}_{\mu ,\epsilon }:=\mathscr {L}_{\mu ,\epsilon }-\mathscr {L}_{0,0} = \begin{bmatrix} (\partial _x +\textrm{i}\,\mu ) p_\epsilon (x) &{} f_{\mu ,\epsilon }(D) \\ -a_\epsilon (x) &{} p_\epsilon (x)(\partial _x + \textrm{i}\,\mu ) \end{bmatrix}: Y \rightarrow X \,, \end{aligned}$$

having used also (2.30) and setting

$$\begin{aligned}{} & {} f_{\mu ,\epsilon }(D):= |D+ \mu | \, \tanh \big ((\texttt{h}+\texttt{f}_\epsilon )|D+\mu |\big ) - |D| \tanh (\texttt{h}|D|) \in \mathcal {L}(Y) \,, \\{} & {} {\Vert f_{\mu ,\epsilon }(D) \Vert }_{\mathcal {L}(Y)} = \mathcal {O}(\mu ,\epsilon ) \,. \end{aligned}$$

For any \(\lambda \in \Gamma \), the operator \(\mathscr {L}_{0,0}-\lambda \) is invertible with inverse

$$\begin{aligned}{} & {} (\mathscr {L}_{0,0}-\lambda )^{-1} \\{} & {} \quad = \text {Op}\left( \frac{1}{(\textrm{i}\,{\mathtt c}_{\mathtt h}k-\lambda )^2 + |k| \tanh (\texttt{h}|k|)} \begin{bmatrix} \textrm{i}\,{\mathtt c}_{\mathtt h}k - \lambda &{}\quad -|k|\tanh (\texttt{h}|k|) \\ 1 &{}\quad \textrm{i}\,{\mathtt c}_{\mathtt h}k - \lambda \end{bmatrix} \right) : X \rightarrow Y \,. \end{aligned}$$

Hence, for \(|\epsilon |<\epsilon _0\) and \(|\mu |<\mu _0\) small enough, uniformly on the compact set \(\Gamma \), the operator \((\mathscr {L}_{0,0}-\lambda )^{-1}{{\mathcal {R}}}_{\mu ,\epsilon }:Y\rightarrow Y\) is bounded, with small operator norm. Then \(\mathscr {L}_{\mu ,\epsilon }-\lambda \) is invertible by Neumann series and \(\Gamma \) belongs to the resolvent set of \(\mathscr {L}_{\mu ,\epsilon }\). The remaining part of the proof follows exactly as in Lemma 3.1 in [6]. \(\quad \square \)
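
For the reader's convenience, the two steps of the proof above can be spelled out as follows. This is a sketch of the standard argument; \(M_\lambda (k)\) and \(T_\lambda \) are shorthand introduced here. On the \(k\)-th Fourier mode, \(\mathscr {L}_{0,0}-\lambda \) acts as the matrix

$$\begin{aligned} M_\lambda (k):= \begin{bmatrix} \textrm{i}\,{\mathtt c}_{\mathtt h}k - \lambda &{} |k|\tanh (\texttt{h}|k|) \\ -1 &{} \textrm{i}\,{\mathtt c}_{\mathtt h}k - \lambda \end{bmatrix}, \qquad \det M_\lambda (k) = (\textrm{i}\,{\mathtt c}_{\mathtt h}k-\lambda )^2 + |k| \tanh (\texttt{h}|k|) \,, \end{aligned}$$

whose inverse, whenever the determinant does not vanish, is the symbol displayed in the proof. Setting \(T_\lambda := (\mathscr {L}_{0,0}-\lambda )^{-1}{{\mathcal {R}}}_{\mu ,\epsilon }\), if \(\Vert T_\lambda \Vert _{\mathcal {L}(Y)}<1\) then

$$\begin{aligned} \mathscr {L}_{\mu ,\epsilon }-\lambda = (\mathscr {L}_{0,0}-\lambda )(\textrm{Id}+ T_\lambda ) \,, \qquad (\mathscr {L}_{\mu ,\epsilon }-\lambda )^{-1} = \sum _{n \ge 0} (-1)^n\, T_\lambda ^n\, (\mathscr {L}_{0,0}-\lambda )^{-1} \,. \end{aligned}$$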

The Hamiltonian and reversible nature of the operator \( \mathscr {L}_{\mu ,\epsilon } \), see (3.2) and (3.3), implies additional algebraic properties for the spectral projectors \(P_{\mu ,\epsilon }\) and the transformation operators \(U_{\mu ,\epsilon } \). By Lemma 3.2 in [6] we have that:

Lemma 3.2

For any \((\mu , \epsilon ) \in B(\mu _0)\times B(\epsilon _0)\), the following holds true:

  (i)

    The projectors \(P_{\mu ,\epsilon }\) defined in (3.4) are skew-Hamiltonian, namely \(\mathcal {J}P_{\mu ,\epsilon }=P_{\mu ,\epsilon }^*\mathcal {J}\), and reversibility preserving, i.e. \( {\bar{\rho }}P_{\mu ,\epsilon } = P_{\mu ,\epsilon } {\bar{\rho }}\).

  (ii)

    The transformation operators \(U_{\mu ,\epsilon }\) in (3.5) are symplectic, namely \( U_{\mu ,\epsilon }^* \mathcal {J}U_{\mu ,\epsilon }= \mathcal {J}\), and reversibility preserving.

  (iii)

    \(P_{0,\epsilon }\) and \(U_{0,\epsilon }\) are real operators, i.e. \(\overline{P_{0,\epsilon }}=P_{0,\epsilon }\) and \(\overline{U_{0,\epsilon }}=U_{0,\epsilon }\).

By the previous lemma, the linear involution \({\bar{\rho }}\) commutes with the spectral projectors \(P_{\mu ,\epsilon }\) and then \({\bar{\rho }}\) leaves invariant the subspace \( \mathcal {V}_{\mu ,\epsilon } = \text {Rg}(P_{\mu ,\epsilon }) \).

Symplectic and reversible basis of \(\mathcal {V}_{\mu ,\epsilon }\). It is convenient to represent the Hamiltonian and reversible operator \( \mathscr {L}_{\mu ,\epsilon }: \mathcal {V}_{\mu ,\epsilon } \rightarrow \mathcal {V}_{\mu ,\epsilon } \) in a basis which is symplectic and reversible, according to the following definition:

Definition 3.3

(Symplectic and reversible basis) A basis \(\texttt{F}:=\{\texttt{f}^+_1,\,\texttt{f}^-_1,\,\texttt{f}^+_0,\,\texttt{f}^-_0 \}\) of \(\mathcal {V}_{\mu ,\epsilon }\) is symplectic if, for any \( k, k' = 0,1 \),

$$\begin{aligned}{} & {} \left( \mathcal {J}\texttt{f}_k^-\,,\,\texttt{f}_k^+\right) = 1 \,, \ \ \big ( \mathcal {J}\texttt{f}_k^\sigma , \texttt{f}_k^\sigma \big ) = 0 \,, \ \forall \sigma = \pm \,; \nonumber \\{} & {} \quad \text {if} \ k \ne k' \ \text {then} \ \big ( \mathcal {J}\texttt{f}_k^\sigma , \texttt{f}_{k'}^{\sigma '} \big ) = 0 \,, \ \forall \sigma , \sigma ' = \pm \,. \end{aligned}$$
(3.7)

It is reversible if

$$\begin{aligned}{} & {} {\bar{\rho }} \texttt{f}^+_1 = \texttt{f}^+_1, \quad {\bar{\rho }} \texttt{f}^-_1 = - \texttt{f}^-_1, \quad {\bar{\rho }} \texttt{f}^+_0 = \texttt{f}^+_0, \quad {\bar{\rho }} \texttt{f}^-_0 = - \texttt{f}^-_0,\nonumber \\{} & {} \quad \text {i.e. } {\bar{\rho }}\texttt{f}_k^\sigma = \sigma \texttt{f}_k^\sigma \,, \ \forall \sigma = \pm , k = 0,1 \,. \end{aligned}$$
(3.8)

We use the following notation throughout the paper: we denote by even(x) a real \(2\pi \)-periodic function which is even in x, and by odd(x) a real \(2\pi \)-periodic function which is odd in x.

By the definition of the involution \({\bar{\rho }}\) in (2.27), the real and imaginary parts of a reversible basis \(\texttt{F}=\{\texttt{f}^\pm _k \}\), \(k=0,1\), enjoy the following parity properties (cfr. Lemma 3.4 in [6])

$$\begin{aligned} \texttt{f}_k^+(x) = \begin{bmatrix}even(x)+\textrm{i}\,odd(x) \\ odd(x)+\textrm{i}\,even(x) \end{bmatrix}, \quad \texttt{f}_k^-(x) = \begin{bmatrix}odd(x)+\textrm{i}\,even(x) \\ even(x)+\textrm{i}\,odd(x) \end{bmatrix}. \end{aligned}$$
(3.9)
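
The parity structure (3.9) can be read off directly from (3.8). The sketch below assumes that the involution in (2.27) acts as \({\bar{\rho }} \begin{bmatrix}\eta (x) \\ \psi (x) \end{bmatrix} = \begin{bmatrix}\overline{\eta (-x)} \\ -\overline{\psi (-x)} \end{bmatrix}\); this formula is not restated in this section and is taken here as an assumption consistent with (3.9).

$$\begin{aligned} {\bar{\rho }} \begin{bmatrix}\eta \\ \psi \end{bmatrix} = \begin{bmatrix}\eta \\ \psi \end{bmatrix} \ \Longleftrightarrow \ \overline{\eta (-x)} = \eta (x) \,, \ \overline{\psi (-x)} = -\psi (x) \ \Longleftrightarrow \ {\left\{ \begin{array}{ll} \text {Re}\, \eta \ \text {even} \,, \ \text {Im}\, \eta \ \text {odd} \,, \\ \text {Re}\, \psi \ \text {odd} \,, \ \text {Im}\, \psi \ \text {even} \,. \end{array}\right. } \end{aligned}$$

The case \({\bar{\rho }} \texttt{f}_k^- = - \texttt{f}_k^-\) is analogous and exchanges the roles of even and odd.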

By Lemmata 3.5 and 3.6 in [6] we have

Lemma 3.4

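The \( 4 \times 4 \) matrix that represents the Hamiltonian and reversible operator \( \mathscr {L}_{\mu ,\epsilon } = \mathcal {J}\mathcal {B}_{\mu ,\epsilon }: \mathcal {V}_{\mu ,\epsilon } \rightarrow \mathcal {V}_{\mu ,\epsilon } \) with respect to a symplectic and reversible basis \(\texttt{F}=\{\texttt{f}_1^+,\texttt{f}_1^-,\texttt{f}_0^+,\texttt{f}_0^-\} \) of \(\mathcal {V}_{\mu ,\epsilon }\) is

$$\begin{aligned} \texttt{J}_4 \texttt{B}_{\mu ,\epsilon } \,, \qquad \texttt{J}_4:= \begin{pmatrix} \texttt{J}_2 &{} 0 \\ 0 &{} \texttt{J}_2 \end{pmatrix}, \quad \texttt{J}_2:= \begin{pmatrix} 0 &{} 1 \\ -1 &{} 0 \end{pmatrix}, \end{aligned}$$
(3.10)

where \(\texttt{B}_{\mu ,\epsilon }\) is the self-adjoint matrix

$$\begin{aligned} \texttt{B}_{\mu ,\epsilon } = \begin{pmatrix} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_1^+, \texttt{f}_1^+ ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_1^-, \texttt{f}_1^+ ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_0^+, \texttt{f}_1^+ ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_0^-, \texttt{f}_1^+ ) \\ ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_1^+, \texttt{f}_1^- ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_1^-, \texttt{f}_1^- ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_0^+, \texttt{f}_1^- ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_0^-, \texttt{f}_1^- ) \\ ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_1^+, \texttt{f}_0^+ ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_1^-, \texttt{f}_0^+ ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_0^+, \texttt{f}_0^+ ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_0^-, \texttt{f}_0^+ ) \\ ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_1^+, \texttt{f}_0^- ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_1^-, \texttt{f}_0^- ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_0^+, \texttt{f}_0^- ) &{} ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_0^-, \texttt{f}_0^- ) \end{pmatrix} \,. \end{aligned}$$
(3.11)

The entries of the matrix \(\texttt{B}_{\mu ,\epsilon }\) are alternatively real or purely imaginary: for any \( \sigma = \pm \), \( k, k' = 0, 1 \),

$$\begin{aligned} \big ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_k^\sigma , \texttt{f}_{k'}^{\sigma } \big ) \ \text {is real} \,, \qquad \big ( \mathcal {B}_{\mu ,\epsilon } \texttt{f}_k^\sigma , \texttt{f}_{k'}^{-\sigma } \big ) \ \text {is purely imaginary} \,. \end{aligned}$$
(3.12)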

It is convenient to give a name to the matrices of the form obtained in Lemma 3.4.

Definition 3.5

A \( 2n \times 2n \), \( n = 1,2, \) matrix of the form \(\texttt{L}=\texttt{J}_{2n} \texttt{B}\) is Hamiltonian if \( \texttt{B}\) is a self-adjoint matrix, i.e. \(\texttt{B}=\texttt{B}^*\). It is reversible if \(\texttt{B}\) is reversibility-preserving, i.e. \(\rho _{2n}\circ \texttt{B}= \texttt{B}\circ \rho _{2n} \), where

$$\begin{aligned} \rho _4:= \begin{pmatrix}\rho _2 &{} 0 \\ 0 &{} \rho _2\end{pmatrix}, \qquad \rho _2:= \begin{pmatrix} \mathfrak {c} &{} 0 \\ 0 &{} - \mathfrak {c} \end{pmatrix}, \end{aligned}$$

and \(\mathfrak {c}: z \mapsto {\bar{z}} \) is the conjugation of the complex plane. Equivalently, \(\rho _{2n} \circ \texttt{L}= - \texttt{L}\circ \rho _{2n}\).

The transformations preserving the Hamiltonian structure are called symplectic, and satisfy

$$\begin{aligned} Y^* \texttt{J}_4 Y = \texttt{J}_4 \, . \end{aligned}$$
(3.13)

If Y is symplectic then \(Y^*\) and \(Y^{-1}\) are symplectic as well. A Hamiltonian matrix \(\texttt{L}=\texttt{J}_4 \texttt{B}\), with \(\texttt{B}=\texttt{B}^*\), is conjugated through Y into the new Hamiltonian matrix

$$\begin{aligned} \texttt{L}_1 = Y^{-1} \texttt{L}Y = Y^{-1} \texttt{J}_4 Y^{-*} Y^* \texttt{B}Y = \texttt{J}_4 \texttt{B}_1 \quad \text {where } \quad \texttt{B}_1:= Y^* \texttt{B}Y = \texttt{B}_1^* \,. \nonumber \\ \end{aligned}$$
(3.14)
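
The middle equality in (3.14) rests on the identity \(Y^{-1}\texttt{J}_4 Y^{-*} = \texttt{J}_4\), which follows from (3.13) by inversion. A brief verification, assuming \(\texttt{J}_4^{-1} = -\texttt{J}_4\) (which holds for the standard block matrix built from \(\texttt{J}_2 = \begin{pmatrix} 0 &{} 1 \\ -1 &{} 0 \end{pmatrix}\)):

$$\begin{aligned}&(Y^* \texttt{J}_4 Y)^{-1} = \texttt{J}_4^{-1} \ \Longrightarrow \ Y^{-1} \texttt{J}_4^{-1} Y^{-*} = \texttt{J}_4^{-1} \ \Longrightarrow \ Y^{-1} \texttt{J}_4 Y^{-*} = \texttt{J}_4 \,, \\&\texttt{B}_1^* = (Y^* \texttt{B}Y)^* = Y^* \texttt{B}^* Y = Y^* \texttt{B}Y = \texttt{B}_1 \,. \end{aligned}$$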

A \( 4\times 4\) matrix \(\texttt{B}=(\texttt{B}_{ij})_{i,j=1,\dots ,4}\) is reversibility-preserving if and only if its entries are alternatively real and purely imaginary, namely \(\texttt{B}_{ij}\) is real when \(i+j\) is even and purely imaginary otherwise, as in (3.12). A \(4\times 4\) complex matrix \(\texttt{L}=(\texttt{L}_{ij})_{i,j=1, \ldots , 4}\) is reversible if and only if \(\texttt{L}_{ij}\) is purely imaginary when \(i+j\) is even and real otherwise.
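
The characterization of reversibility-preserving matrices can be verified entrywise. In the sketch below, \(s_i:= (-1)^{i+1}\) is a sign introduced only for illustration, so that \(\rho _4\) acts on \(v \in \mathbb {C}^4\) as \((\rho _4 v)_i = s_i \overline{v_i}\).

$$\begin{aligned}&(\rho _4 \texttt{B}v)_i = s_i \sum _{j} \overline{\texttt{B}_{ij}}\, \overline{v_j} \,, \qquad (\texttt{B}\rho _4 v)_i = \sum _{j} \texttt{B}_{ij}\, s_j\, \overline{v_j} \,, \\&\rho _4 \circ \texttt{B}= \texttt{B}\circ \rho _4 \ \Longleftrightarrow \ \overline{\texttt{B}_{ij}} = s_i s_j\, \texttt{B}_{ij} = (-1)^{i+j}\, \texttt{B}_{ij} \quad \forall \, i, j \,. \end{aligned}$$

Thus \(\texttt{B}_{ij}\) is real when \(i+j\) is even and purely imaginary when \(i+j\) is odd. Assuming \(\texttt{J}_4\) is the block matrix built from \(\texttt{J}_2 = \begin{pmatrix} 0 &{} 1 \\ -1 &{} 0 \end{pmatrix}\), left multiplication by \(\texttt{J}_4\) exchanges, up to sign, rows 1 and 2 and rows 3 and 4, which shifts the parity of \(i+j\) by one and gives the corresponding statement for \(\texttt{L}=\texttt{J}_4\texttt{B}\).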

Finally, we mention that the flow of a Hamiltonian reversibility-preserving matrix is symplectic and reversibility-preserving (see Lemma 3.8 in [6]).

4 Matrix Representation of \({\pmb {\mathscr {L}}}_{{\varvec{\mu }},{\varvec{\epsilon }}}\) on \({\pmb {\mathcal {V}}}_{{\varvec{\mu }},{\varvec{\epsilon }}}\)

Using the transformation operators \(U_{\mu ,\epsilon }\) in (3.5), we construct the basis of \(\mathcal {V}_{\mu ,\epsilon }\)

$$\begin{aligned} \mathcal {F}:= & {} \big \{ f_{1}^+(\mu ,\epsilon ), \ f_{1}^- (\mu ,\epsilon ), \ f_{0}^+(\mu ,\epsilon ),\ f_{0}^-(\mu ,\epsilon ) \big \} \,,\nonumber \\ f_{k}^\sigma (\mu ,\epsilon ):= & {} U_{\mu ,\epsilon } f_{k}^\sigma \,, \ \sigma =\pm \,, \,k=0,1 \,, \end{aligned}$$
(4.1)

where

$$\begin{aligned} f_1^+ = \begin{bmatrix}{\mathtt c}_{\mathtt h}^{1/2} \cos (x) \\ {\mathtt c}_{\mathtt h}^{-1/2} \sin (x) \end{bmatrix}, \quad f_1^- = \begin{bmatrix}- {\mathtt c}_{\mathtt h}^{1/2} \sin (x) \\ {\mathtt c}_{\mathtt h}^{-1/2} \cos (x) \end{bmatrix}, \quad f_0^+ = \begin{bmatrix}1 \\ 0 \end{bmatrix}, \quad f_0^- = \begin{bmatrix}0 \\ 1 \end{bmatrix},\nonumber \\ \end{aligned}$$
(4.2)

form a basis of \( \mathcal {V}_{0,0} =\textrm{Rg} (P_{0,0}) \), cf. (2.34)-(2.35). Note that the real valued vectors \( \{ f_1^\pm , f_0^\pm \} \) form a symplectic and reversible basis for \( \mathcal {V}_{0,0} \), according to Definition 3.3. Then, by Lemmata 3.1 and 3.2 we deduce that (cf. Lemma 4.1 in [6]):

Lemma 4.1

The basis \( \mathcal {F}\) of \(\mathcal {V}_{\mu ,\epsilon }\) defined in (4.1), is symplectic and reversible, i.e. satisfies (3.7) and (3.8). Each map \((\mu , \epsilon ) \mapsto f^\sigma _k(\mu , \epsilon )\) is analytic as a map \(B(\mu _0)\times B(\epsilon _0) \rightarrow H^1(\mathbb {T})\).

In the next lemma we expand the vectors \( f_k^\sigma (\mu ,\epsilon ) \) in \( (\mu , \epsilon ) \). We denote by \(even_0(x)\) a real, even, \(2\pi \)-periodic function with zero space average. In the sequel \(\mathcal {O}(\mu ^{m} \epsilon ^{n}) \begin{bmatrix}even(x) \\ odd(x) \end{bmatrix}\) denotes an analytic map in \((\mu , \epsilon )\) with values in \( H^1(\mathbb {T}, \mathbb {C}^2) \), whose first component is even(x) and the second one odd(x); an analogous meaning is given to \(\mathcal {O}(\mu ^{m} \epsilon ^{n}) \begin{bmatrix}odd(x) \\ even(x) \end{bmatrix}\), etc.

Lemma 4.2

(Expansion of the basis \( \mathcal {F}\)) For small values of \((\mu , \epsilon )\) the basis \( \mathcal {F}\) in (4.1) has the expansion

$$\begin{aligned} f^+_1(\mu , \epsilon )&= \begin{bmatrix}{\mathtt c}_{\mathtt h}^\frac{1}{2} \cos (x) \\ {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\sin (x) \end{bmatrix} + \textrm{i}\,\frac{\mu }{4}\gamma _\texttt{h}\begin{bmatrix}{\mathtt c}_{\mathtt h}^\frac{1}{2}\sin (x) \\ {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\cos (x) \end{bmatrix} + \epsilon \begin{bmatrix}\alpha _\texttt{h}\cos (2x) \\ \beta _\texttt{h}\sin (2x) \end{bmatrix} \nonumber \\&\quad + \mathcal {O}(\mu ^2) \begin{bmatrix}even_0(x) + \textrm{i}\,odd(x) \\ odd(x) + \textrm{i}\,even_0(x) \end{bmatrix} + \mathcal {O}(\epsilon ^2) \begin{bmatrix}even_0(x) \\ odd(x) \end{bmatrix}\nonumber \\&\quad + \textrm{i}\,\mu \epsilon \begin{bmatrix}odd(x) \\ even(x) \end{bmatrix} + \mathcal {O}(\mu ^2\epsilon ,\mu \epsilon ^2) \, , \end{aligned}$$
(4.3)
$$\begin{aligned} f^-_1(\mu , \epsilon )&= \begin{bmatrix}-{\mathtt c}_{\mathtt h}^\frac{1}{2} \sin (x) \\ {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\cos (x) \end{bmatrix} + \textrm{i}\,\frac{\mu }{4} \gamma _\texttt{h}\begin{bmatrix}{\mathtt c}_{\mathtt h}^{\frac{1}{2}}\cos (x) \\ -{\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\sin (x) \end{bmatrix} + \epsilon \begin{bmatrix}-\alpha _{\texttt{h}}\sin (2x) \\ \beta _\texttt{h}\cos (2x) \end{bmatrix}\nonumber \\&+ \mathcal {O}(\mu ^2) \begin{bmatrix}odd(x) + \textrm{i}\,even_0(x) \\ even_0(x) + \textrm{i}\,odd(x) \end{bmatrix} + \mathcal {O}(\epsilon ^2) \begin{bmatrix}odd(x) \\ even(x) \end{bmatrix}\nonumber \\&\quad + \textrm{i}\,\mu \epsilon \begin{bmatrix}even(x) \\ odd(x) \end{bmatrix} + \mathcal {O}(\mu ^2\epsilon ,\mu \epsilon ^2) \, , \end{aligned}$$
(4.4)
$$\begin{aligned} f^+_0(\mu , \epsilon )&= \begin{bmatrix}1 \\ 0 \end{bmatrix}+ \epsilon \delta _\texttt{h}\begin{bmatrix}{\mathtt c}_{\mathtt h}^\frac{1}{2} \cos (x) \\ - {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\sin (x) \end{bmatrix} + \mathcal {O}(\epsilon ^2) \begin{bmatrix}even_0(x) \\ odd(x) \end{bmatrix} \nonumber \\&\quad + \textrm{i}\,\mu \epsilon \begin{bmatrix}odd(x) \\ even_0(x) \end{bmatrix}+ \mathcal {O}(\mu ^2\epsilon ,\mu \epsilon ^2) \, , \end{aligned}$$
(4.5)
$$\begin{aligned} f^-_0(\mu , \epsilon )&= \begin{bmatrix}0 \\ 1 \end{bmatrix} + \textrm{i}\,\mu \epsilon \begin{bmatrix}even_0(x) \\ odd(x) \end{bmatrix}+\mathcal {O}(\mu ^2\epsilon ,\mu \epsilon ^2) \, , \end{aligned}$$
(4.6)

where the remainders \(\mathcal {O}()\) are vectors in \(H^1(\mathbb {T})\) and

$$\begin{aligned} \alpha _\texttt{h}:= & {} \frac{1}{2} {\mathtt c}_{\mathtt h}^{-\frac{11}{2}}(3+{\mathtt c}_{\mathtt h}^4) \,, \quad \beta _\texttt{h}:= \frac{1}{4}{\mathtt c}_{\mathtt h}^{-\frac{13}{2}}(1+{\mathtt c}_{\mathtt h}^4)(3-{\mathtt c}_{\mathtt h}^4) \,,\nonumber \\ \gamma _\texttt{h}:= & {} 1+\frac{\texttt{h}(1-{\mathtt c}_{\mathtt h}^4)}{{\mathtt c}_{\mathtt h}^2} \,, \quad \delta _\texttt{h}:= \frac{3+{\mathtt c}_{\mathtt h}^4}{4 {\mathtt c}_{\mathtt h}^{\frac{5}{2}}} \,. \end{aligned}$$
(4.7)

For \(\mu =0\) the basis \(\{f_k^\pm (0,\epsilon ), k=0,1 \} \) is real and

$$\begin{aligned} f^{+}_1 (0, \epsilon )= & {} \begin{bmatrix}even_0(x) \\ odd(x) \end{bmatrix}, \ f^{-}_1 (0, \epsilon ) = \begin{bmatrix}odd(x) \\ even(x) \end{bmatrix}, \nonumber \\ f^{+}_0 (0, \epsilon )= & {} \begin{bmatrix}1 \\ 0 \end{bmatrix}+ \begin{bmatrix}even_0(x) \\ odd(x) \end{bmatrix} \,, \ f^{-}_0 (0, \epsilon ) = \begin{bmatrix}0 \\ 1 \end{bmatrix} \,. \end{aligned}$$
(4.8)

Proof

The long calculations are given in Appendix A. \(\quad \square \)

We now state the main result of this section.

Proposition 4.3

The matrix that represents the Hamiltonian and reversible operator \( \mathscr {L}_{\mu ,\epsilon }: \mathcal {V}_{\mu ,\epsilon } \rightarrow \mathcal {V}_{\mu ,\epsilon } \) in the symplectic and reversible basis \(\mathcal {F}\) of \(\mathcal {V}_{\mu ,\epsilon }\) defined in (4.1), is a Hamiltonian matrix \(\texttt{L}_{\mu ,\epsilon }=\texttt{J}_4 \texttt{B}_{\mu ,\epsilon }\), where \(\texttt{B}_{\mu ,\epsilon } \) is a self-adjoint and reversibility preserving (i.e. satisfying (3.12)) \( 4 \times 4\) matrix of the form

$$\begin{aligned} \texttt{B}_{\mu ,\epsilon }= \begin{pmatrix} E &{} F \\ F^* &{} G \end{pmatrix}, \qquad E = E^* \,, \ \ G = G^* \,, \end{aligned}$$
(4.9)

where E, F, G are the \( 2 \times 2 \) matrices

$$\begin{aligned}&E := \begin{pmatrix} \texttt{e}_{11} \epsilon ^2(1+r_1'(\epsilon ,\mu \epsilon )) - \texttt{e}_{22}\frac{\mu ^2}{8}(1+r_1''(\epsilon ,\mu )) &{}\quad \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) \\ - \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12} \mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) &{}\quad -\texttt{e}_{22}\frac{\mu ^2}{8}(1+r_5(\epsilon ,\mu )) \end{pmatrix} \end{aligned}$$
(4.10)
$$\begin{aligned}&G := \begin{pmatrix} 1+r_8(\epsilon ^2,\mu ^2\epsilon ) &{} - \textrm{i}\,r_9(\mu \epsilon ^2,\mu ^2\epsilon ) \\ \textrm{i}\,r_9(\mu \epsilon ^2, \mu ^2\epsilon ) &{}\qquad \mu \tanh (\texttt{h}\mu )+ r_{10}(\mu ^2\epsilon ) \end{pmatrix} \end{aligned}$$
(4.11)
$$\begin{aligned}&F := \begin{pmatrix} \texttt{f}_{11}\epsilon + r_3(\epsilon ^3,\mu \epsilon ^2,\mu ^2\epsilon ) &{}\qquad \textrm{i}\,\mu \epsilon {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} +\textrm{i}\,r_4({\mu \epsilon ^2}, \mu ^2 \epsilon ) \\ \textrm{i}\,r_6(\mu \epsilon ) &{} r_7(\mu ^2\epsilon ) \end{pmatrix} \, , \end{aligned}$$
(4.12)

with \(\texttt{e}_{12}\) and \(\texttt{e}_{22}\) given in (1.2) and (1.3) respectively, and

$$\begin{aligned} \texttt{e}_{11}&:= \dfrac{9{\mathtt c}_{\mathtt h}^8-10{\mathtt c}_{\mathtt h}^4+9}{8{\mathtt c}_{\mathtt h}^7} = \dfrac{9 (1-{\mathtt c}_{\mathtt h}^4)^2 +8 {\mathtt c}_{\mathtt h}^4 }{8{\mathtt c}_{\mathtt h}^7} > 0 \, , \qquad \texttt{f}_{11} := \tfrac{1}{2} {\mathtt c}_{\mathtt h}^{-\frac{3}{2}}(1-{\mathtt c}_{\mathtt h}^4) \, . \end{aligned}$$
(4.13)
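
The equality of the two expressions of \(\texttt{e}_{11}\) in (4.13), and hence its positivity, follows from an elementary expansion, recorded here as a quick check:

$$\begin{aligned} 9 (1-{\mathtt c}_{\mathtt h}^4)^2 + 8 {\mathtt c}_{\mathtt h}^4 = 9 - 18\, {\mathtt c}_{\mathtt h}^4 + 9\, {\mathtt c}_{\mathtt h}^8 + 8\, {\mathtt c}_{\mathtt h}^4 = 9{\mathtt c}_{\mathtt h}^8-10{\mathtt c}_{\mathtt h}^4+9 \,. \end{aligned}$$

Both summands on the left are non-negative and the second one is positive for \({\mathtt c}_{\mathtt h}>0\). As a sanity check, setting formally \({\mathtt c}_{\mathtt h}=1\) gives \(\texttt{e}_{11}=1\) and \(\texttt{f}_{11}=0\); under the assumption that \({\mathtt c}_{\mathtt h}^2=\tanh (\texttt{h})\), as suggested by the form of \(\mathscr {L}_{0,0}\) and of the basis (4.2), this corresponds to the deep-water limit \(\texttt{h}\rightarrow +\infty \).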

The rest of this section is devoted to the proof of Proposition 4.3.

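We decompose \(\mathcal {B}_{\mu ,\epsilon }\) in (3.2) as

$$\begin{aligned} \mathcal {B}_{\mu ,\epsilon } = \mathcal {B}_\epsilon + \mathcal {B}^\flat + \mathcal {B}^\sharp \,, \end{aligned}$$

where \(\mathcal {B}_\epsilon \), \(\mathcal {B}^\flat \), \(\mathcal {B}^\sharp \) are the self-adjoint and reversibility preserving operators

$$\begin{aligned} \mathcal {B}_\epsilon&:= \begin{bmatrix} 1+a_\epsilon (x) &{} -({\mathtt c}_{\mathtt h}+p_\epsilon (x))\partial _x \\ \partial _x\circ ({\mathtt c}_{\mathtt h}+p_\epsilon (x)) &{} |D| \tanh \big ((\texttt{h}+\texttt{f}_\epsilon )|D|\big ) \end{bmatrix} \,, \end{aligned}$$
(4.14)
$$\begin{aligned} \mathcal {B}^\flat&:= \begin{bmatrix} 0 &{} 0 \\ 0 &{} |D+\mu | \tanh \big ((\texttt{h}+\texttt{f}_\epsilon )|D+\mu |\big ) - |D| \tanh \big ((\texttt{h}+\texttt{f}_\epsilon )|D|\big ) \end{bmatrix} \,, \end{aligned}$$
(4.15)
$$\begin{aligned} \mathcal {B}^\sharp&:= \mu \begin{bmatrix} 0 &{} -\textrm{i}\,p_\epsilon (x) \\ \textrm{i}\,p_\epsilon (x) &{} 0 \end{bmatrix} \,. \end{aligned}$$
(4.16)

In view of (2.29), the operator \(\mathcal {B}^\flat \) is analytic in \( \mu \).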

Lemma 4.4

(Expansion of \(\texttt{B}_\epsilon \)) The self-adjoint and reversibility preserving matrix \(\texttt{B}_\epsilon := \texttt{B}_\epsilon (\mu )\) associated, as in (3.11), with the self-adjoint and reversibility preserving operator \(\mathcal {B}_\epsilon \) defined in (4.14), with respect to the basis \(\mathcal {F} \) of \( {{\mathcal {V}}}_{\mu ,\epsilon } \) in (4.1), expands as

(4.17)

where \(\texttt{e}_{11}\), \(\texttt{f}_{11}\) are defined in (4.13), and

$$\begin{aligned} \zeta _\texttt{h}:= \tfrac{1}{8}{\mathtt c}_{\mathtt h}\gamma _\texttt{h}^2 \,. \end{aligned}$$
(4.18)

Proof

We expand the matrix \( \texttt{B}_\epsilon (\mu ) \) as

$$\begin{aligned} \texttt{B}_\epsilon (\mu ) = \texttt{B}_\epsilon (0) + \mu (\partial _\mu \texttt{B}_\epsilon )(0) + \frac{\mu ^2 }{2} (\partial _\mu ^2 \texttt{B}_0)(0) + \mathcal {O}(\mu ^2\epsilon ,\mu ^3) \,. \end{aligned}$$
(4.19)

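The matrix \(\texttt{B}_\epsilon (0)\). The main goal of this long paragraph is to prove that the matrix \(\texttt{B}_\epsilon (0)\) has the expansion (4.23). The matrix \(\texttt{B}_\epsilon (0)\) is real, because the operator \(\mathcal {B}_\epsilon \) is real and the basis \( \{ f_k^\pm (0,\epsilon ) \}_{k=0,1}\) is real. Consequently, by (3.12), its matrix elements \((\texttt{B}_\epsilon (0))_{i,j}\) are real whenever \(i+j\) is even and vanish for \(i+j\) odd. In addition \(f^-_0(0,\epsilon ) = \begin{bmatrix}0 \\ 1 \end{bmatrix}\) by (4.8), and, by (4.14), we get \(\mathcal {B}_\epsilon f_0^-(0,\epsilon ) = 0\), for any \( \epsilon \). We deduce that the self-adjoint matrix \( \texttt{B}_\epsilon (0) \) has the form

$$\begin{aligned} \texttt{B}_\epsilon (0) = \begin{pmatrix} E_{11}(0,\epsilon ) &{} 0 &{} F_{11}(0,\epsilon ) &{} 0 \\ 0 &{} E_{22}(0,\epsilon ) &{} 0 &{} 0 \\ F_{11}(0,\epsilon ) &{} 0 &{} G_{11}(0,\epsilon ) &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \end{pmatrix} \end{aligned}$$
(4.20)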

where \(E_{11}(0,\epsilon )\), \(E_{22}(0,\epsilon )\), \(G_{11}(0,\epsilon )\), \(F_{11}(0,\epsilon )\) are real. We claim that \( E_{22}(0,\epsilon ) = 0 \) for any \( \epsilon \). As a first step, following [6], we prove that

$$\begin{aligned} \text { either } \ E_{22}(0,\epsilon )\equiv 0 \,, \qquad \text { or } \ E_{11}(0,\epsilon )\equiv 0 \equiv F_{11}(0,\epsilon ) \,. \end{aligned}$$
(4.21)

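Indeed, by (2.37), the operator \( \mathscr {L}_{0,\epsilon } \equiv {{\mathcal {L}}}_{0,\epsilon }\) possesses, for any sufficiently small \(\epsilon \ne 0\), the eigenvalue 0 with a four dimensional generalized kernel \( \mathcal {W}_\epsilon := \text {span} \{ U_1, {\tilde{U}}_2, U_3, U_4 \} \), spanned by \( \epsilon \)-dependent vectors \( U_1, {\tilde{U}}_2, U_3, U_4 \). By Lemma 3.1 we have \( \mathcal {W}_\epsilon = {{\mathcal {V}}}_{0,\epsilon } = \text {Rg}(P_{0,\epsilon } )\) and by (2.37) we have \( \mathscr {L}_{0,\epsilon }^2 = 0 \) on \( \mathcal {V}_{0,\epsilon } \). Thus the matrix

$$\begin{aligned} \texttt{L}_\epsilon (0):= \texttt{J}_4 \texttt{B}_\epsilon (0) = \begin{pmatrix} 0 &{} E_{22}(0,\epsilon ) &{} 0 &{} 0 \\ -E_{11}(0,\epsilon ) &{} 0 &{} -F_{11}(0,\epsilon ) &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ -F_{11}(0,\epsilon ) &{} 0 &{} -G_{11}(0,\epsilon ) &{} 0 \end{pmatrix} \,, \end{aligned}$$
(4.22)

which represents \( \mathscr {L}_{0,\epsilon }:\mathcal {V}_{0,\epsilon }\rightarrow \mathcal {V}_{0,\epsilon }\), satisfies \( \texttt{L}^2_\epsilon (0) = 0 \), namely

$$\begin{aligned} \texttt{L}^2_\epsilon (0) = - \begin{pmatrix} (E_{11}E_{22})(0,\epsilon ) &{} 0 &{} (E_{22}F_{11})(0,\epsilon ) &{} 0 \\ 0 &{} (E_{11}E_{22})(0,\epsilon ) &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} (E_{22}F_{11})(0,\epsilon ) &{} 0 &{} 0 \end{pmatrix} = 0 \,, \end{aligned}$$

which implies (4.21). We now prove that the matrix \(\texttt{B}_\epsilon (0)\) defined in (4.20) expands as

$$\begin{aligned} \texttt{B}_\epsilon (0) = \begin{pmatrix} \texttt{e}_{11}\epsilon ^2 + r(\epsilon ^3) &{} 0 &{} \texttt{f}_{11}\epsilon + r(\epsilon ^3) &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ \texttt{f}_{11}\epsilon + r(\epsilon ^3) &{} 0 &{} 1 + r(\epsilon ^2) &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \end{pmatrix} \end{aligned}$$
(4.23)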

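where \(\texttt{e}_{11}\) and \(\texttt{f}_{11}\) are in (4.29) and (4.32). We expand the operator \(\mathcal {B}_\epsilon \) in (4.14) as

$$\begin{aligned}&\mathcal {B}_\epsilon = \mathcal {B}_0 + \epsilon \mathcal {B}_1 + \epsilon ^2 \mathcal {B}_2 + \mathcal {O}(\epsilon ^3) \,, \qquad \mathcal {B}_0:= \begin{bmatrix} 1 &{} -{\mathtt c}_{\mathtt h}\partial _x \\ {\mathtt c}_{\mathtt h}\partial _x &{} |D| \tanh (\texttt{h}|D|) \end{bmatrix} \,, \\&\mathcal {B}_1:= \begin{bmatrix} a_1(x) &{} -p_1(x)\partial _x \\ \partial _x\circ p_1(x) &{} 0 \end{bmatrix} \,, \qquad \mathcal {B}_2:= \begin{bmatrix} a_2(x) &{} -p_2(x)\partial _x \\ \partial _x\circ p_2(x) &{} \texttt{f}_2 |D|^2 \big (1-\tanh ^2(\texttt{h}|D|)\big ) \end{bmatrix} \,, \end{aligned}$$
(4.24)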

where the remainder term \(\mathcal {O}(\epsilon ^3) \in \mathcal {L}(Y, X)\), the functions \(a_1\), \(p_1\), \(a_2\), \(p_2\) are given in (2.20)-(2.23) and, in view of (2.15), \(\texttt{f}_2:= \tfrac{1}{4}{\mathtt c}_{\mathtt h}^{-2}({\mathtt c}_{\mathtt h}^4-3)\).

\( \bullet \)  Expansion of  \(E_{11}(0,\epsilon )=\texttt{e}_{11}\epsilon ^2+r(\epsilon ^3)\). By (4.3) we split the real function \(f_1^+(0,\epsilon )\) as

$$\begin{aligned} \begin{aligned}&f_1^+(0,\epsilon ) = f_1^+ + \epsilon f_{1_1}^+ + \epsilon ^2 f_{1_2}^+ + \mathcal {O}(\epsilon ^3) \,, \\&f_1^+ = \begin{bmatrix}{\mathtt c}_{\mathtt h}^\frac{1}{2} \cos (x) \\ {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\sin (x) \end{bmatrix},\ f_{1_1}^+:= \begin{bmatrix}\alpha _\texttt{h}\cos (2x) \\ \beta _\texttt{h}\sin (2x) \end{bmatrix} \,, \ f_{1_2}^+:= \begin{bmatrix}even_0(x) \\ odd(x) \end{bmatrix}, \end{aligned} \end{aligned}$$
(4.25)

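where both \(f_{1_2}^+\) and \(\mathcal {O}(\epsilon ^3)\) are vectors in \(H^1(\mathbb {T})\). Since \(\mathcal {B}_0 f_1^+ = 0\), and both \(\mathcal {B}_0\), \(\mathcal {B}_1\) are self-adjoint real operators, it results

$$\begin{aligned} E_{11}(0,\epsilon )&= \big ( \mathcal {B}_\epsilon f_1^+(0,\epsilon ), f_1^+(0,\epsilon ) \big ) \\&= \epsilon \big ( \mathcal {B}_1 f_1^+, f_1^+ \big ) + \epsilon ^2 \Big [ \big ( \mathcal {B}_2 f_1^+, f_1^+ \big ) + 2 \big ( \mathcal {B}_1 f_1^+, f_{1_1}^+ \big ) + \big ( \mathcal {B}_0 f_{1_1}^+, f_{1_1}^+ \big ) \Big ] + \mathcal {O}(\epsilon ^3) \,. \end{aligned}$$
(4.26)

By (4.24) one has

$$\begin{aligned} \mathcal {B}_1 f_1^+ = \begin{bmatrix} A_1 (1+\cos (2x)) \\ B_1 \sin (2x) \end{bmatrix}, \quad \mathcal {B}_2 f_1^+ = \begin{bmatrix} A_2 \cos (x) + A_3 \cos (3x) \\ B_2 \sin (x) + B_3 \sin (3x) \end{bmatrix}, \quad \mathcal {B}_0 f_{1_1}^+ = \begin{bmatrix} A_4 \cos (2x) \\ B_4 \sin (2x) \end{bmatrix}, \end{aligned}$$
(4.27)

where \(A_3\), \(B_3\) are real constants whose values are not needed, and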

$$\begin{aligned} \begin{aligned}&A_1:= \tfrac{1}{2}(a_1^{[1]}{\mathtt c}_{\mathtt h}^\frac{1}{2} - p_1^{[1]}{\mathtt c}_{\mathtt h}^{-\frac{1}{2}}), \qquad B_1:= -p_1^{[1]} {\mathtt c}_{\mathtt h}^{\frac{1}{2}} \,, \\&A_2:= {\mathtt c}_{\mathtt h}^{\frac{1}{2}} a_2^{[0]} - {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} p_2^{[0]}+\tfrac{1}{2} {\mathtt c}_{\mathtt h}^{\frac{1}{2}} a_2^{[2]} - \tfrac{1}{2} {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}p_2^{[2]} \,, \qquad A_4:=\alpha _\texttt{h}-2\beta _\texttt{h}{\mathtt c}_{\mathtt h}\,, \\&B_2:=-{\mathtt c}_{\mathtt h}^{\frac{1}{2}}p_2^{[0]}-\tfrac{1}{2} {\mathtt c}_{\mathtt h}^{\frac{1}{2}}p_2^{[2]} + {{\mathtt c}_{\mathtt h}^{-\frac{1}{2}}} \texttt{f}_2(1-{\mathtt c}_{\mathtt h}^4) \,, \qquad B_4:= -2\alpha _\texttt{h}{\mathtt c}_{\mathtt h}+ \displaystyle {\frac{4{\mathtt c}_{\mathtt h}^2}{1+{\mathtt c}_{\mathtt h}^4}}{\beta _\texttt{h}}\,. \end{aligned} \end{aligned}$$
(4.28)

By (4.27) and (4.25), we deduce

$$\begin{aligned} E_{11}(0,\epsilon )= & {} \texttt{e}_{11}\epsilon ^2 + r(\epsilon ^3) \,,\nonumber \\ \texttt{e}_{11}:= & {} \frac{1}{2} \big (A_2{\mathtt c}_{\mathtt h}^\frac{1}{2}+ B_2{\mathtt c}_{\mathtt h}^{-\frac{1}{2}}+ 2\alpha _\texttt{h}A_1 +2 B_1 \beta _\texttt{h}+ \alpha _\texttt{h}A_4+\beta _\texttt{h}B_4 \big ).\qquad \quad \end{aligned}$$
(4.29)

By (4.29), (4.28), (4.7), (2.20)-(2.23) we obtain (4.13). Since \(\texttt{e}_{11}>0\) the second alternative in (4.21) is ruled out, implying \(E_{22}(0,\epsilon ) \equiv 0\).
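
The coefficient in (4.29) can be assembled term by term. The sketch below assumes that the scalar product is \((f,g) = \frac{1}{2\pi }\int _{\mathbb {T}} f \cdot {\bar{g}}\, \textrm{d}x\), that \(\mathcal {B}_0, \mathcal {B}_1, \mathcal {B}_2\) denote the Taylor coefficients in \(\epsilon \) of the operator expanded in (4.24), that \(\mathcal {B}_1 f_1^+ = \begin{bmatrix} A_1 (1+\cos (2x)) \\ B_1 \sin (2x) \end{bmatrix}\), \(\mathcal {B}_0 f_{1_1}^+ = \begin{bmatrix} A_4 \cos (2x) \\ B_4 \sin (2x) \end{bmatrix}\), and that the \(\cos (x)\), \(\sin (x)\) components of \(\mathcal {B}_2 f_1^+\) have coefficients \(A_2\), \(B_2\). Only the averages \(\frac{1}{2\pi }\int \cos ^2 = \frac{1}{2\pi }\int \sin ^2 = \tfrac{1}{2}\) and the orthogonality of distinct harmonics are used:

$$\begin{aligned}&\big ( \mathcal {B}_2 f_1^+, f_1^+ \big ) = \tfrac{1}{2} \big ( A_2 {\mathtt c}_{\mathtt h}^{\frac{1}{2}} + B_2 {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \big ) \,, \qquad 2 \big ( \mathcal {B}_1 f_1^+, f_{1_1}^+ \big ) = \alpha _\texttt{h}A_1 + \beta _\texttt{h}B_1 \,, \\&\big ( \mathcal {B}_0 f_{1_1}^+, f_{1_1}^+ \big ) = \tfrac{1}{2} \big ( \alpha _\texttt{h}A_4 + \beta _\texttt{h}B_4 \big ) \,, \qquad \big ( \mathcal {B}_1 f_1^+, f_1^+ \big ) = 0 \,. \end{aligned}$$

Summing the three non-zero contributions gives the expression of \(\texttt{e}_{11}\) in (4.29).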

\( \bullet \) Expansion of \(G_{11} (0,\epsilon )=1+r(\epsilon ^2)\). By (4.5) we split the real-valued function \(f_0^+(0,\epsilon )\) as

$$\begin{aligned} f_0^+(0,\epsilon )= & {} f_0^+ + \epsilon f_{0_1}^+ + \epsilon ^2 f_{0_2}^+ + \mathcal {O}(\epsilon ^3) \,, \ \ f_0^+ = \begin{bmatrix}1 \\ 0 \end{bmatrix} \,, \nonumber \\ f_{0_1}^+:= & {} \delta _\texttt{h}\begin{bmatrix}{\mathtt c}_{\mathtt h}^{\frac{1}{2}}\cos (x) \\ -{\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\sin (x) \end{bmatrix} \,, \ f_{0_2}^+:= \begin{bmatrix}even_0(x) \\ odd(x) \end{bmatrix} \,. \end{aligned}$$
(4.30)

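Since, by (2.34) and (4.24), \(\mathcal {B}_0 f_0^+ = f_0^+\), using that \(\mathcal {B}_0\), \(\mathcal {B}_1\) are self-adjoint real operators, and \(\Vert f_0^+\Vert = 1\), \( (f_0^+, f_{0_1}^+ ) = 0 \), we have \(G_{11}(0,\epsilon ) = 1 + \epsilon \big ( \mathcal {B}_1 f_0^+, f_0^+ \big ) + r(\epsilon ^2)\). By (4.24) and (2.20)-(2.23) one has

$$\begin{aligned} \mathcal {B}_1 f_0^+ = \begin{bmatrix} a_1^{[1]}\cos (x) \\ -p_1^{[1]}\sin (x) \end{bmatrix} \end{aligned}$$
(4.31)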

and, by (4.30), we deduce \( G_{11} (0,\epsilon ) = 1+ r(\epsilon ^2 ) \).

\( \bullet \) Expansion of \(F_{11}(0,\epsilon )=\texttt{f}_{11}\epsilon +r(\epsilon ^3)\). By (4.24), (4.25), (4.30), using that are self-adjoint and real, and , , we obtain

By (4.25), (4.27), (4.28), (4.30), (4.31), all these scalar products vanish but the first one, and then

$$\begin{aligned} F_{11}(0,\epsilon )=\texttt{f}_{11}\epsilon +r(\epsilon ^3) \,,\quad \texttt{f}_{11}:= A_1 = \tfrac{1}{2} \left( a_1^{[1]}{\mathtt c}_{\mathtt h}^{\frac{1}{2}}-p_1^{[1]}{\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\right) , \end{aligned}$$
(4.32)

which, by substituting the expressions of \( a_1^{[1]} \), \( p_1^{[1]} \) in Lemma 2.2, gives the expression in (4.13).

The expansion (4.23) is proved.

Linear terms in \( \mu \). We now compute the terms of \(\texttt{B}_\epsilon (\mu )\) that are linear in \(\mu \). It results

(4.33)

We now prove that

(4.34)

The matrix \( \texttt{L}_\epsilon (0) \) in (4.22), where \(E_{22}(0,\epsilon )=0\), represents the action of the operator \( \mathscr {L}_{0,\epsilon }:\mathcal {V}_{0,\epsilon }\rightarrow \mathcal {V}_{0,\epsilon }\) in the basis \( \{ f^{\sigma }_k (0,\epsilon ) \} \) and then we deduce that \( \mathscr {L}_{0,\epsilon } f_1^-(0,\epsilon ) = 0 \), \( \mathscr {L}_{0,\epsilon } f_0^-(0,\epsilon ) = 0 \). Thus also \(\mathcal {B}_\epsilon f_1^-(0,\epsilon ) = 0\), \(\mathcal {B}_\epsilon f_0^-(0,\epsilon ) = 0\), and the second and the fourth column of the matrix X in (4.34) are zero. To compute the other two columns we use the expansion of the derivatives. In view of (4.3)–(4.6) and by denoting with a dot the derivative w.r.t. \(\mu \), one has

$$\begin{aligned} \begin{aligned}&\dot{f}^{+}_{1}(0,\epsilon ) = \frac{\textrm{i}\,}{4} \gamma _\texttt{h}\begin{bmatrix}{\mathtt c}_{\mathtt h}^{\frac{1}{2}}\sin (x) \\ {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\cos (x) \end{bmatrix}+\textrm{i}\,\epsilon \begin{bmatrix}odd(x) \\ even(x) \end{bmatrix}+\mathcal {O}(\epsilon ^2) \,, \\&\dot{f}^{+}_{0}(0,\epsilon ) = \textrm{i}\,\epsilon \begin{bmatrix}odd(x) \\ even_0(x) \end{bmatrix}+\mathcal {O}(\epsilon ^2) \,,\\&\dot{f}^{-}_{1}(0,\epsilon ) = \frac{\textrm{i}\,}{4}\gamma _\texttt{h}\begin{bmatrix}{\mathtt c}_{\mathtt h}^{\frac{1}{2}}\cos (x) \\ -{\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\sin (x) \end{bmatrix}+\textrm{i}\,\epsilon \begin{bmatrix}even(x) \\ odd(x) \end{bmatrix}+\mathcal {O}(\epsilon ^2) \,,\\&\dot{f}^{-}_{0}(0,\epsilon ) =\textrm{i}\,\epsilon \begin{bmatrix}even_0(x) \\ odd(x) \end{bmatrix} +\mathcal {O}(\epsilon ^2) \,. \end{aligned} \end{aligned}$$
(4.35)

In view of (2.2), (4.3)–(4.6), (4.22), (4.8), (4.29),(4.32), and since , we have

(4.36)

We deduce (4.34) by (4.35) and (4.36).

Quadratic terms in \( \mu \). By denoting with a double dot the double derivative w.r.t. \(\mu \), we have

(4.37)

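We claim that \(Y = 0 \). Indeed, its first, second and fourth column are zero, since \(\mathcal {B}_0 f_k^\sigma = 0\) for \(f_k^\sigma \in \{ f_1^+,f_1^-,f_0^- \} \), where \(\mathcal {B}_0\) denotes the operator \(\mathcal {B}_\epsilon \) at \(\epsilon = 0\). The third column is also zero by noting that \(\mathcal {B}_0 f_0^+ = f_0^+\) and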

$$\begin{aligned} \ddot{f}_{1}^{+}(0,0)= & {} \begin{bmatrix}even_0(x)+\textrm{i}\,odd(x) \\ odd(x) +\textrm{i}\,even_0(x) \end{bmatrix}, \ \ \ddot{f}_{1}^{-}(0,0) = \begin{bmatrix}odd(x) +\textrm{i}\,even_0(x) \\ even_0(x)+\textrm{i}\,odd(x) \end{bmatrix}, \\ \ddot{f}_{0}^{+}(0,0)= & {} \ddot{f}_{0}^{-}(0,0)=0 \,. \end{aligned}$$

We claim that

(4.38)

with \(\zeta _\texttt{h}\) as in (4.18). Indeed, by (4.35), we have \(\dot{f}^+_0(0,0)=\dot{f}^-_0(0,0)= 0\). Therefore the last two columns of Z, and by self-adjointness the last two rows, are zero. By (4.24), (4.35) we obtain the matrix (4.38) with

In conclusion (4.19), (4.33), (4.34), (4.37), the fact that \(Y=0\) and (4.38) imply (4.17), using also the selfadjointness of \(\texttt{B}_\epsilon \) and (3.12). \(\quad \square \)

We now consider \(\texttt{B}^\flat \).

Lemma 4.5

(Expansion of \(\texttt{B}^\flat \)) The self-adjoint and reversibility-preserving matrix \(\texttt{B}^\flat \) associated, as in (3.11), to the self-adjoint and reversibility-preserving operator , defined in (4.15), with respect to the basis \(\mathcal {F}\) of \( {{\mathcal {V}}}_{\mu ,\epsilon } \) in (4.1), admits the expansion

(4.39)

where \(\texttt{e}_{12}\) is defined in (1.2)  and

$$\begin{aligned} {{\textbf {b}}}_{\texttt{h}}:= \gamma _\texttt{h}{\mathtt c}_{\mathtt h}+ {\mathtt c}_{\mathtt h}^{-1}\texttt{h}(1-{\mathtt c}_{\mathtt h}^4) (\gamma _\texttt{h}- 2(1-{\mathtt c}_{\mathtt h}^2\texttt{h})) \, \,. \end{aligned}$$
(4.40)

Proof

We have to compute the expansion of the matrix entries . First, by (4.6), (4.15) and since \(\texttt{f}_\epsilon =O(\epsilon ^2)\) (cfr. (2.15)) we have

Hence, by (4.3)–(4.6), the entries of the last column (and row) of \(\texttt{B}^\flat \) are

in agreement with (4.39).

In order to compute the other matrix entries we expand in (4.15) at \(\mu = 0\), obtaining

(4.41)

We note that

$$\begin{aligned} \mu \big ( {{\mathcal {R}}}^\flat (\epsilon ) f^{\sigma }_k (\mu , \epsilon ), f^{\sigma '}_{k'} (\mu , \epsilon ) \big )= & {} \mu \big ( {{\mathcal {R}}}^\flat f^{\sigma }_k (0, \epsilon ), f^{\sigma '}_{k'} (0, \epsilon ) \big ) + \mathcal {O}(\mu ^2\epsilon ^2) \nonumber \\= & {} {\left\{ \begin{array}{ll} \mathcal {O}(\mu ^2\epsilon ^2) &{} \text{ if } \sigma =\sigma '\,, \\ \mathcal {O}(\mu \epsilon ^2) &{} \text{ if } \sigma \ne \sigma '\,. \end{array}\right. } \end{aligned}$$
(4.42)

Indeed, if \(\sigma =\sigma '\), \( \big ( {{\mathcal {R}}}^\flat f^{\sigma }_k (0, \epsilon ), f^{\sigma '}_{k'} (0, \epsilon ) \big )\) is real by (3.12), but purely imaginary too, since the operator \({{\mathcal {R}}}^\flat \) is purely imaginary (as is) and the basis \( \{ f_k^\pm (0,\epsilon ) \}_{k=0,1}\) is real. The terms (4.42) contribute to \( r_2 (\mu \epsilon ^2) \) and \( r_6 (\epsilon \mu )\) in (4.39).

Next we compute the other scalar products. By (4.3), (4.41), and the identities \( {{\,\textrm{sgn}\,}}(D) \sin (kx) = - \textrm{i}\,\cos (kx) \) and \( {{\,\textrm{sgn}\,}}(D)\cos (kx) = \textrm{i}\,\sin (kx) \) for any \( k \in \mathbb {N}\), we have

where

(4.43)

Similarly , where

(4.44)

Analogously, using (4.4),

and , with , \(j=1,2,3\), defined in (4.43) and (4.44). In addition, by (4.5)–(4.6), we get that

with in (4.43). By taking the scalar products of the above expansions of with the functions \(f^{\sigma '}_{k'}(\mu ,\epsilon ) \) expanded as in (4.3)-(4.6) we obtain that (recall that the scalar product is conjugate-linear in the second component)

and, recalling (4.41), (4.43), (4.44), we deduce the expansion of the entries (1, 1) and (2, 2) of the matrix \(\texttt{B}^\flat \) in (4.39) with in (4.40). Moreover

where is equal to (1.2). Finally we obtain

The expansion (4.39) is proved. \(\quad \square \)

Finally, we consider \(\texttt{B}^\sharp \).

Lemma 4.6

(Expansion of \(\texttt{B}^\sharp \)) The self-adjoint and reversibility-preserving matrix \(\texttt{B}^\sharp \) associated, as in (3.11), to the self-adjoint and reversibility-preserving operators , defined in (4.16), with respect to the basis \(\mathcal {F}\) of \( {{\mathcal {V}}}_{\mu ,\epsilon } \) in (4.1), admits the expansion

(4.45)

Proof

Since and \(p_\epsilon =\mathcal {O}(\epsilon )\) by (2.19), we have the expansion

(4.46)

The matrix entries , \( k, k' = 0,1 \), \( \sigma \in \{ \pm \} \) are zero, because they are simultaneously real by (3.12) and purely imaginary, the operator being purely imaginary and the basis \( \{ f_k^\pm (0,\epsilon ) \}_{k=0,1}\) real. Hence \(\texttt{B}^\sharp \) has the form

(4.47)

and \(\alpha \), \( \beta \), \( \gamma \), \( \delta \) are real numbers. As in \(\mathcal {L}(Y)\), we deduce that \( \gamma =r( \mu \epsilon ) \). Let us compute the expansion of \(\beta \), \(\delta \) and \(\eta \). By (2.20) and (2.2) we write the operator in (4.16) as

(4.48)

with \(\mathcal {O}(\mu \epsilon ^2) \in \mathcal {L}(Y)\). In view of (4.3)–(4.6), \(f_1^\pm (0,\epsilon ) = f_1^\pm + \mathcal {O}(\epsilon )\), \(f_0^+(0,\epsilon )=f_0^+ +\mathcal {O}(\epsilon )\), \(f_0^-(0,\epsilon ) = \begin{bmatrix}0 \\ 1 \end{bmatrix}\), where \( f_k^\sigma \) are in (4.2). By (4.48) we have , and then

This proves (4.45). \(\quad \square \)

Lemmata 4.4, 4.5, 4.6 imply (4.9) where the matrix E has the form (4.10) and

$$\begin{aligned} \texttt{e}_{22}:=2( {\textbf {b}}_{\texttt{h}} - 4 \zeta _\texttt{h}) = 2\gamma _\texttt{h}{\mathtt c}_{\mathtt h}+ 2{\mathtt c}_{\mathtt h}^{-1}\texttt{h}(1-{\mathtt c}_{\mathtt h}^4) (\gamma _\texttt{h}- 2(1-{\mathtt c}_{\mathtt h}^2\texttt{h})) - {\mathtt c}_{\mathtt h}\gamma _\texttt{h}^2 \,, \end{aligned}$$

with \( {\textbf {b}}_{\texttt{h}} \) in (4.40) and \( \zeta _\texttt{h}\) in (4.18). The term \(\texttt{e}_{22}\) has the expansion in (1.3). Moreover

$$\begin{aligned}&G := G(\mu ,\epsilon ) = \begin{pmatrix} 1+r_8(\epsilon ^2,\mu ^2\epsilon , \mu ^3) &{}\quad - \textrm{i}\,r_9(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \\ \textrm{i}\,r_9(\mu \epsilon ^2, \mu ^2\epsilon ,\mu ^3) &{}\quad \mu \tanh (\texttt{h}\mu )+ r_{10}(\mu ^2\epsilon ,\mu ^3) \end{pmatrix} \end{aligned}$$
(4.49)
$$\begin{aligned}&F := F(\mu ,\epsilon ) = \begin{pmatrix} \texttt{f}_{11}\epsilon + r_3(\epsilon ^3,\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) &{}\quad \textrm{i}\,\mu \epsilon {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} +\textrm{i}\,r_4({\mu \epsilon ^2}, \mu ^2 \epsilon , \mu ^3) \\ \textrm{i}\,r_6(\mu \epsilon , \mu ^3) &{} r_7(\mu ^2\epsilon ,\mu ^3) \end{pmatrix} \, . \end{aligned}$$
(4.50)

In order to deduce the expansion (4.11)–(4.12) of the matrices F, G we exploit further information for

(4.51)

We have

Lemma 4.7

At \( \epsilon = 0 \) the matrices are \( F (\mu ,0) = 0 \) and \( G (\mu ,0) = \begin{pmatrix} 1 &{} 0 \\ 0 &{} \mu \tanh ( \texttt{h}\mu ) \end{pmatrix} \).

Proof

By Lemma A.5 and (4.51) we have and , for any \( \mu \). Then the lemma follows recalling (3.11) and the fact that \(f_1^+(\mu ,0)\) and \(f_1^-(\mu ,0)\) have zero space average by Lemma A.5. \(\quad \square \)

In view of Lemma 4.7 we deduce that the matrices (4.49) and (4.50) have the form (4.11) and (4.12). This completes the proof of Proposition 4.3.

We now show that the constant \(\texttt{e}_{22}\) in (1.3) is positive for any depth \(\texttt{h}>0 \).

Lemma 4.8

For any \( \texttt{h}> 0 \) the term \(\texttt{e}_{22} \) in (1.3) is positive, \(\texttt{e}_{22} \rightarrow 0 \) as \(\texttt{h}\rightarrow 0^+\) and \(\texttt{e}_{22} \rightarrow 1 \) as \(\texttt{h}\rightarrow +\infty \). As a consequence for any \(\texttt{h}_0 >0 \) the term \(\texttt{e}_{22}\) is bounded from below uniformly in \(\texttt{h}>\texttt{h}_0\).

Proof

The quantity \( z:= {\mathtt c}_{\mathtt h}^2 = \tanh (\texttt{h}) \) is in (0, 1) for any \( \texttt{h}> 0 \). Then the quadratic polynomial \( (0, + \infty ) \ni \texttt{h}\mapsto (1-z^2)(1+3z^2) \texttt{h}^2+2 z(z^2-1) \texttt{h}+z^2 \) is positive because its discriminant \(- 4z^4(1-z^2) \) is negative as \( 0<z^2<1\). The limits for \( \texttt{h}\rightarrow 0^+\) and \(\texttt{h}\rightarrow +\infty \) follow by inspection. \(\quad \square \)
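As a check of the sign claim in this proof, one may treat \(z\) as a parameter and write the polynomial as \(A \texttt{h}^2 + B \texttt{h} + C\) with \(A = (1-z^2)(1+3z^2)\), \(B = 2z(z^2-1)\), \(C = z^2\) (the names A, B, C are introduced here only for illustration). The quantity quoted in the proof is then the reduced discriminant \((B/2)^2 - AC\), which is one quarter of the usual discriminant and has the same sign:

$$\begin{aligned} (B/2)^2 - AC&= z^2(1-z^2)^2 - z^2(1-z^2)(1+3z^2) \\&= z^2(1-z^2)\big [ (1-z^2) - (1+3z^2) \big ] = -4z^4(1-z^2) \,. \end{aligned}$$

Since \(A > 0\) and this quantity is negative for \(0< z < 1\), the polynomial has no real roots and is positive.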

5 Block-Decoupling and Emergence of the Whitham–Benjamin Function

In this section we block-decouple the \( 4 \times 4 \) Hamiltonian matrix \(\texttt{L}_{\mu ,\epsilon } = \texttt{J}_4 \texttt{B}_{\mu ,\epsilon } \) obtained in Proposition 4.3.

We first perform a singular symplectic and reversibility-preserving change of coordinates.

Lemma 5.1

(Singular symplectic rescaling) The conjugation of the Hamiltonian and reversible matrix \(\texttt{L}_{\mu ,\epsilon } = \texttt{J}_4 \texttt{B}_{\mu ,\epsilon } \) obtained in Proposition 4.3 through the symplectic and reversibility-preserving \( 4 \times 4 \)-matrix

$$\begin{aligned} Y:= \begin{pmatrix} Q &{} 0 \\ 0 &{} Q \end{pmatrix} \quad \text {with} \quad Q:=\begin{pmatrix} \mu ^{\frac{1}{2}} &{} 0 \\ 0 &{} \mu ^{-\frac{1}{2}}\end{pmatrix} \,, \ \ \mu > 0 \,, \end{aligned}$$
(5.1)

yields the Hamiltonian and reversible matrix

$$\begin{aligned}&\texttt{L}_{\mu ,\epsilon }^{(1)} := Y^{-1} \texttt{L}_{\mu ,\epsilon } Y = \texttt{J}_4\texttt{B}^{(1)}_{\mu ,\epsilon } = \begin{pmatrix} \texttt{J}_2 E^{(1)} &{} \texttt{J}_2 F^{(1)} \\ \texttt{J}_2 [F^{(1)}]^* &{} \texttt{J}_2 G^{(1)} \end{pmatrix} \end{aligned}$$
(5.2)

where \( \texttt{B}_{\mu ,\epsilon }^{(1)} \) is a self-adjoint and reversibility-preserving \( 4 \times 4\) matrix

$$\begin{aligned} \texttt{B}_{\mu ,\epsilon }^{(1)} = \begin{pmatrix} E^{(1)} &{} F^{(1)} \\ [F^{(1)}]^* &{} G^{(1)} \end{pmatrix}, \quad E^{(1)} = [E^{(1)}]^* \,, \ G^{(1)} = [G^{(1)}]^* \,, \end{aligned}$$
(5.3)

where the \( 2 \times 2 \) reversibility-preserving matrices \(E^{(1)} \), \( G^{(1)} \) and \( F^{(1)}\) extend analytically at \(\mu =0\) with the following expansion

$$\begin{aligned}&E^{(1)} =\begin{pmatrix} \texttt{e}_{11} \mu \epsilon ^2(1+r_1'(\epsilon ,\mu \epsilon ))- \texttt{e}_{22}\frac{\mu ^3}{8}(1+r_1''(\epsilon ,\mu )) &{}\quad \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) \\ - \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) &{}\quad -\texttt{e}_{22}\frac{\mu }{8}(1+r_5(\epsilon ,\mu )) \end{pmatrix}\, , \end{aligned}$$
(5.4)
$$\begin{aligned}&G^{(1)} = \begin{pmatrix} \mu + r_8(\mu \epsilon ^2, \mu ^3 \epsilon ) &{} - \textrm{i}\,r_9(\mu \epsilon ^2,\mu ^2\epsilon ) \\ \textrm{i}\,r_9(\mu \epsilon ^2, \mu ^2\epsilon ) &{} \tanh (\texttt{h}\mu ) + r_{10}(\mu \epsilon ) \end{pmatrix}\, , \end{aligned}$$
(5.5)
$$\begin{aligned}&F^{(1)} = \begin{pmatrix} \texttt{f}_{11}\mu \epsilon +r_3(\mu \epsilon ^3,\mu ^2\epsilon ^2,\mu ^3\epsilon ) &{}\quad \textrm{i}\,\mu \epsilon {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} + \textrm{i}\,r_4(\mu \epsilon ^2, \mu ^2 \epsilon ) \\ \textrm{i}\,r_6(\mu \epsilon ) &{} r_7(\mu \epsilon ) \end{pmatrix} \end{aligned}$$
(5.6)

where \(\texttt{e}_{11}, \texttt{e}_{12}, \texttt{e}_{22}, \texttt{f}_{11}\) are defined in (4.13), (1.2), (1.3).

Remark 5.2

The matrix \(\texttt{L}_{\mu ,\epsilon }^{(1)}\), a priori defined only for \(\mu \ne 0\), extends analytically to the zero matrix at \(\mu = 0\). For \(\mu \ne 0\) the spectrum of \(\texttt{L}_{\mu ,\epsilon }^{(1)}\) coincides with the spectrum of \(\texttt{L}_{\mu ,\epsilon }\).

Proof

The matrix Y is symplectic, i.e. (3.13) holds, and since \(\mu \) is real, it is reversibility preserving, i.e. satisfies (3.12). By (3.14),

$$\begin{aligned} \texttt{B}_{\mu ,\epsilon }^{(1)} = Y^* \texttt{B}_{\mu ,\epsilon } Y = \begin{pmatrix} E^{(1)} &{} F^{(1)} \\ [F^{(1)}]^* &{} G^{(1)} \end{pmatrix}, \end{aligned}$$

with, Q being self-adjoint, \(E^{(1)}=QEQ = [E^{(1)}]^* \), \(G^{(1)}=QGQ=[G^{(1)}]^*\) and \(F^{(1)}=QFQ\). In view of (4.10)–(4.12), we obtain (5.4)–(5.6). \(\quad \square \)
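To make the effect of the rescaling explicit, here is a short worked computation for a generic \(2\times 2\) matrix with entries \(m_{ij}\) (a placeholder notation used only in this illustration). With Q as in (5.1),

$$\begin{aligned} Q \begin{pmatrix} m_{11} &{} m_{12} \\ m_{21} &{} m_{22} \end{pmatrix} Q = \begin{pmatrix} \mu \, m_{11} &{} m_{12} \\ m_{21} &{} \mu ^{-1} m_{22} \end{pmatrix} \,. \end{aligned}$$

Thus the off-diagonal entries are unchanged, the (1, 1)-entry gains a factor \(\mu \) and the (2, 2)-entry loses one. For instance, the entry \(\mu \tanh (\texttt{h}\mu )\) of G in (4.49) becomes \(\tanh (\texttt{h}\mu )\) in (5.5), and the entry \(\texttt{f}_{11}\epsilon \) of F in (4.50) becomes \(\texttt{f}_{11}\mu \epsilon \) in (5.6). The division by \(\mu \) is harmless for G and F because their (2, 2)-entries in (4.49)–(4.50) vanish at \(\mu = 0\).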

5.1 Non-perturbative Step of Block-Decoupling

We first verify that the quantity \(\texttt{D}_\texttt{h}:=\texttt{h}-\tfrac{1}{4} \texttt{e}_{12}^2\) is nonzero for any \(\texttt{h}> 0 \). In view of comment 3 after Theorem 1.1, we have \(\texttt{D}_\texttt{h}= \texttt{h}-c_g^2\). The non-degeneracy property \( \texttt{D}_\texttt{h}\ne 0 \) corresponds to the one in Bridges-Mielke [9, p.183] and [38, p.409].

Lemma 5.3

For any \( \texttt{h}>0 \) one has

$$\begin{aligned} \texttt{D}_\texttt{h}:=\texttt{h}-\tfrac{1}{4} \texttt{e}_{12}^2> 0 \,,\quad \text {and}\quad \lim _{\texttt{h}\rightarrow 0^+}\texttt{D}_\texttt{h}=0\,. \end{aligned}$$
(5.7)

Proof

We write \( \texttt{D}_\texttt{h}= (\sqrt{\texttt{h}}+\frac{1}{2} \texttt{e}_{12})(\sqrt{\texttt{h}}-\frac{1}{2} \texttt{e}_{12})\) whose first factor is positive for \(\texttt{h}>0\). We claim that also the second factor is positive. In view of (1.2) it is equal to \( \tfrac{1}{2} {\mathtt c}_{\mathtt h}^{-1} f(\texttt{h}) \) with

$$\begin{aligned} f( \texttt{h})&:= \big (\sqrt{\texttt{h}}\tanh (\texttt{h}) - \sqrt{\texttt{h}}+\sqrt{\tanh (\texttt{h})}\big )\big (\sqrt{\texttt{h}}\tanh (\texttt{h}) + \sqrt{\texttt{h}}-\sqrt{\tanh (\texttt{h})}\big )\\&=:q(\texttt{h})p(\texttt{h})\, . \end{aligned}$$

The function \(p(\texttt{h})\) is positive since \( \texttt{h}>\tanh (\texttt{h})\) for any \(\texttt{h}>0\). We claim that also the function \(q(\texttt{h})\) is positive. Indeed its derivative

$$\begin{aligned} q' (\texttt{h}) = \frac{1 - \tanh (\texttt{h})}{2 \sqrt{\texttt{h}} \sqrt{ \tanh (\texttt{h})}} \Big ( - \sqrt{ \tanh (\texttt{h})} + \sqrt{\texttt{h}} + \sqrt{\texttt{h}} \, {\tanh (\texttt{h}) } \Big ) + \sqrt{\texttt{h}} \big ( 1 - \tanh ^2 (\texttt{h}) \big ) > 0 \end{aligned}$$

for any \( \texttt{h}> 0 \). Since \( q(0) = 0 \) we deduce that \( q (\texttt{h}) > 0 \) for any \( \texttt{h}> 0 \). This proves the lemma. \(\quad \square \)
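The formula for \(q'\) and its sign can be checked directly. Differentiating \(q(\texttt{h}) = \sqrt{\texttt{h}}\tanh (\texttt{h}) - \sqrt{\texttt{h}}+\sqrt{\tanh (\texttt{h})}\) term by term gives

$$\begin{aligned} q'(\texttt{h}) = \frac{\tanh (\texttt{h}) - 1}{2\sqrt{\texttt{h}}} + \frac{1-\tanh ^2(\texttt{h})}{2\sqrt{\tanh (\texttt{h})}} + \sqrt{\texttt{h}} \big ( 1 - \tanh ^2 (\texttt{h}) \big ) \,, \end{aligned}$$

and the first two terms combine as

$$\begin{aligned} \frac{\tanh (\texttt{h}) - 1}{2\sqrt{\texttt{h}}} + \frac{(1-\tanh (\texttt{h}))(1+\tanh (\texttt{h}))}{2\sqrt{\tanh (\texttt{h})}} = \frac{1 - \tanh (\texttt{h})}{2 \sqrt{\texttt{h}} \sqrt{ \tanh (\texttt{h})}} \Big ( - \sqrt{ \tanh (\texttt{h})} + \sqrt{\texttt{h}} + \sqrt{\texttt{h}} \, \tanh (\texttt{h}) \Big ) \,. \end{aligned}$$

Each factor is positive for \(\texttt{h}> 0\): one has \(0< \tanh (\texttt{h}) < 1\), and \(\sqrt{\texttt{h}} > \sqrt{\tanh (\texttt{h})}\) because \(\texttt{h}> \tanh (\texttt{h})\).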

We now state the main result of this section.

Lemma 5.4

(Step of block-decoupling) There exists a \(2\times 2\) reversibility-preserving matrix X, analytic in \((\mu , \epsilon ) \), of the form

$$\begin{aligned} X&:= \begin{pmatrix} x_{11} &{} \textrm{i}\,x_{12} \\ \textrm{i}\,x_{21} &{} x_{22} \end{pmatrix} \qquad \qquad \qquad \qquad \text {with} \quad x_{ij}\in {\mathbb {R}}\, , \ i,j=1,2 \, , \nonumber \\&= \begin{pmatrix} r_{11}(\epsilon ) &{} \textrm{i}\,\, r_{12}(\epsilon ) \\ -\textrm{i}\,\frac{1}{2} \texttt{D}_\texttt{h}^{-1} (\texttt{e}_{12} \texttt{f}_{11} + 2{\mathtt c}_{\mathtt h}^{-\frac{1}{2}})\epsilon + \textrm{i}\,r_{21}(\epsilon ^2, \mu \epsilon ) &{}\quad \frac{1}{2} \texttt{D}_\texttt{h}^{-1} ({\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\texttt{e}_{12} +2\texttt{h}\texttt{f}_{11})\epsilon + r_{22}(\epsilon ^2,\mu \epsilon ) \end{pmatrix}\, , \end{aligned}$$
(5.8)

where \(\texttt{e}_{12}\), \(\texttt{f}_{11}\) are defined in (1.2), (4.13) and \( \texttt{D}_\texttt{h}\) is the positive constant in (5.7), such that the following holds true. By conjugating the Hamiltonian and reversible matrix \(\texttt{L}_{\mu ,\epsilon }^{(1)}\), defined in (5.2), with the symplectic and reversibility-preserving \(4\times 4\) matrix

$$\begin{aligned} \exp \left( S^{(1)} \right) \,, \quad \text { where } \qquad S^{(1)}:= \texttt{J}_4 \begin{pmatrix} 0 &{} \Sigma \\ \Sigma ^* &{} 0 \end{pmatrix} \,, \qquad \Sigma := \texttt{J}_2 X \,, \end{aligned}$$
(5.9)

we get the Hamiltonian and reversible matrix

$$\begin{aligned} \texttt{L}_{\mu ,\epsilon }^{(2)}:= \exp \left( S^{(1)} \right) \texttt{L}_{\mu ,\epsilon }^{(1)} \exp \left( -S^{(1)} \right) = \texttt{J}_4 \texttt{B}_{\mu ,\epsilon }^{(2)} = \begin{pmatrix} \texttt{J}_2 E^{(2)} &{} \texttt{J}_2 F^{(2)} \\ \texttt{J}_2 [F^{(2)}]^* &{} \texttt{J}_2 G^{(2)} \end{pmatrix}\,,\nonumber \\ \end{aligned}$$
(5.10)

where the reversibility-preserving \(2\times 2\) self-adjoint matrix \([E^{(2)}]^*=E^{(2)}\) has the form

$$\begin{aligned}&E^{(2)} = \begin{pmatrix} \mu \epsilon ^2 \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}+ r_1'(\mu \epsilon ^3, \mu ^2 \epsilon ^2 )-\texttt{e}_{22}\frac{\mu ^3}{8}(1+r_1''(\epsilon ,\mu )) &{}\qquad \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) \\ - \textrm{i}\,\big ( \frac{1}{2}\texttt{e}_{12}\mu + r_2(\mu \epsilon ^2,\mu ^2\epsilon ,\mu ^3) \big ) &{} -\texttt{e}_{22}\frac{\mu }{8}(1+r_5(\epsilon ,\mu )) \end{pmatrix}\, , \end{aligned}$$
(5.11)

where

$$\begin{aligned} \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}=\texttt{e}_{11} - \texttt{D}_\texttt{h}^{-1} \big ( {\mathtt c}_{\mathtt h}^{-1} + \texttt{h}\texttt{f}_{11}^2 +\texttt{e}_{12}\texttt{f}_{11}{\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \big ) \end{aligned}$$
(5.12)

(with constants \( \texttt{e}_{11}\), \(\texttt{D}_\texttt{h}\), \(\texttt{f}_{11} \), \( \texttt{e}_{12}\), defined in (4.13), (5.7), (1.2)), is the Whitham-Benjamin function defined in (1.1), the reversibility-preserving \(2\times 2\) self-adjoint matrix \([G^{(2)}]^*=G^{(2)}\) has the form

$$\begin{aligned} G^{(2)} = \begin{pmatrix} \mu + r_8(\mu \epsilon ^2, \mu ^3 \epsilon ) &{} - \textrm{i}\,r_9(\mu \epsilon ^2,\mu ^2\epsilon ) \\ \textrm{i}\,r_9(\mu \epsilon ^2, \mu ^2\epsilon ) &{} \tanh (\texttt{h}\mu ) + r_{10}(\mu \epsilon ) \end{pmatrix}\,, \end{aligned}$$
(5.13)

and

$$\begin{aligned} F^{(2)}= \begin{pmatrix} r_3(\mu \epsilon ^3 ) &{} \textrm{i}\,r_4(\mu \epsilon ^3 ) \\ \textrm{i}\,r_6(\mu \epsilon ^3 ) &{} r_7(\mu \epsilon ^3) \end{pmatrix} \,. \end{aligned}$$
(5.14)

The rest of the section is devoted to the proof of Lemma 5.4. For simplicity let \( S = S^{(1)} \).

The matrix \(\text {exp}(S)\) is symplectic and reversibility-preserving because the matrix S in (5.9) is Hamiltonian and reversibility-preserving, cfr. Lemma 3.8 in [6]. Note that S is reversibility preserving, since X has the form (5.8).

We now expand in Lie series the Hamiltonian and reversible matrix \( \texttt{L}_{\mu ,\epsilon }^{(2)} = \exp (S)\texttt{L}_{\mu ,\epsilon }^{(1)} \exp (-S) \).

We split \(\texttt{L}_{\mu ,\epsilon }^{(1)}\) into its \(2\times 2\)-diagonal and off-diagonal Hamiltonian and reversible matrices

$$\begin{aligned}&\texttt{L}_{\mu ,\epsilon }^{(1)} = D^{(1)} + R^{(1)} \, , \nonumber \\&D^{(1)} :=\begin{pmatrix} D_1 &{} 0 \\ 0 &{} D_0 \end{pmatrix} := \begin{pmatrix} \texttt{J}_2 E^{(1)} &{} 0 \\ 0 &{} \texttt{J}_2 G^{(1)} \end{pmatrix}, \quad R^{(1)} := \begin{pmatrix} 0 &{} \texttt{J}_2 F^{(1)} \\ \texttt{J}_2 [F^{(1)}]^* &{} 0 \end{pmatrix} , \end{aligned}$$
(5.15)

and we perform the Lie expansion

$$\begin{aligned} \texttt{L}_{\mu ,\epsilon }^{(2)}&= \exp (S)\texttt{L}_{\mu ,\epsilon }^{(1)} \exp (-S) = D^{(1)} +\left[ S\,,\, D^{(1)}\right] \nonumber \\&\quad + \frac{1}{2} [S, [S, D^{(1)}]] + R^{(1)} + [S, R^{(1)}] \nonumber \\&\quad + \frac{1}{2} \int _0^1 (1-\tau )^2 \exp (\tau S) \text {ad}_S^3( D^{(1)} ) \exp (-\tau S) \, \textrm{d}\tau \nonumber \\&\quad + \int _0^1 (1-\tau ) \, \exp (\tau S) \, \text {ad}_S^2( R^{(1)} ) \, \exp (-\tau S) \, \textrm{d}\tau \end{aligned}$$
(5.16)

where \(\text {ad}_A(B):= [A,B]:= AB - BA \) denotes the commutator between the linear operators A, B.

We look for a \( 4 \times 4 \) matrix S as in (5.9) (which is Hamiltonian, reversibility-preserving and off-diagonal as the term \(R^{(1)}\) we wish to eliminate) that solves the homological equation \( R^{(1)} +\left[ S\,,\, D^{(1)} \right] = 0 \), which, recalling (5.15), reads

$$\begin{aligned} \begin{pmatrix} 0 &{} \texttt{J}_2F^{(1)}+\texttt{J}_2\Sigma D_0 - D_1\texttt{J}_2\Sigma \\ \texttt{J}_2{[F^{(1)}]}^*+\texttt{J}_2\Sigma ^*D_1-D_0\texttt{J}_2\Sigma ^* &{} 0 \end{pmatrix} =0 \,.\qquad \end{aligned}$$
(5.17)

Note that the equation \( \texttt{J}_2F^{(1)}+\texttt{J}_2\Sigma D_0 - D_1\texttt{J}_2\Sigma = 0 \) also implies \( \texttt{J}_2{[F^{(1)}]}^*+\texttt{J}_2\Sigma ^*D_1-D_0\texttt{J}_2\Sigma ^* = 0 \) and vice versa. Thus, writing \( \Sigma =\texttt{J}_2 X \), namely \( X = - \texttt{J}_2 \Sigma \), the equation (5.17) amounts to solving the “Sylvester” equation

$$\begin{aligned} D_1 X - X D_0 = - \texttt{J}_2F^{(1)} \,. \end{aligned}$$
(5.18)
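For the reader's convenience, the passage from (5.17) to (5.18) can be sketched as follows, assuming only that \(\texttt{J}_2^2 = -\textrm{Id}\) (the standard property of the \(2\times 2\) symplectic matrix, taken here as an assumption since its definition is given elsewhere). Substituting \(\Sigma = \texttt{J}_2 X\) in the upper-right block of (5.17) gives

$$\begin{aligned} 0 = \texttt{J}_2F^{(1)}+\texttt{J}_2 \texttt{J}_2 X D_0 - D_1\texttt{J}_2 \texttt{J}_2 X = \texttt{J}_2F^{(1)} - X D_0 + D_1 X \,, \end{aligned}$$

which is (5.18) after moving \(\texttt{J}_2F^{(1)}\) to the right-hand side.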

We write the matrices \( E^{(1)}, F^{(1)}, G^{(1)}\) in (5.2) as

$$\begin{aligned} E^{(1)}= & {} \begin{pmatrix} E_{11}^{(1)} &{} \textrm{i}\,E_{12}^{(1)} \\ - \textrm{i}\,E_{12}^{(1)} &{} E_{22}^{(1)} \end{pmatrix}\,, \quad F^{(1)} = \begin{pmatrix} F_{11}^{(1)} &{} \textrm{i}\,F_{12}^{(1)} \\ \textrm{i}\,F_{21}^{(1)} &{} F_{22}^{(1)} \end{pmatrix} \,, \nonumber \\ G^{(1)}= & {} \begin{pmatrix} G_{11}^{(1)} &{} \textrm{i}\,G_{12}^{(1)} \\ - \textrm{i}\,G_{12}^{(1)} &{} G_{22}^{(1)} \end{pmatrix} \end{aligned}$$
(5.19)

where the real numbers \( E_{ij}^{(1)}, F_{ij}^{(1)}, G_{ij}^{(1)} \), \( i, j = 1,2 \), have the expansion in (5.4), (5.5), (5.6). Thus, by (5.15), (5.8) and (5.19), the equation (5.18) amounts to solving the \(4\times 4\) real linear system

(5.20)

We solve this system using the following result, verified by a direct calculation:

Lemma 5.5

The determinant of the matrix

$$\begin{aligned} A:= \begin{pmatrix} a &{} b &{} c &{} 0 \\ d &{} a &{} 0 &{} - c \\ e &{} 0 &{} a &{} -b \\ 0 &{} - e &{} -d &{} a \end{pmatrix} \end{aligned}$$
(5.21)

where a, b, c, d, e are real numbers, is

$$\begin{aligned} \det A= & {} a^4 -2 a^2 (b d + c e)+(b d - c e)^2\nonumber \\= & {} (bd-a^2)^2 -2ce\big (a^2 +bd-\frac{1}{2} ce\big ) \,. \end{aligned}$$
(5.22)

If \( \det A \ne 0 \) then A is invertible and

$$\begin{aligned} A^{-1} = {\frac{1}{ \det A} \left( \begin{array}{cccc} \! a \left( a^2-b d - c e\right) &{} \! b \left( -a^2+b d - c e\right) &{} -c \left( a^2+b d - c e\right) &{} \! - 2 a b c \\ \! d \left( -a^2+b d - c e\right) &{} \! a \left( a^2-b d - c e\right) &{} 2 a c d &{} \! - c \left( -a^2-b d + c e\right) \\ \! - e \left( a^2+b d - c e\right) &{} \! 2 a b e &{} a \left( a^2-b d - c e\right) &{} \! b \left( a^2-b d + c e\right) \\ \! - 2 a d e &{} \! - e \left( -a^2-b d + c e\right) &{} d \left( a^2-b d + c e\right) &{} \! a \left( a^2-b d - c e\right) \end{array} \right) } \, . \end{aligned}$$
(5.23)
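The equality of the two expressions in (5.22) follows by expanding the second one:

$$\begin{aligned} (bd-a^2)^2 -2ce\big (a^2 +bd-\tfrac{1}{2} ce\big )&= a^4 - 2a^2 bd + b^2d^2 - 2a^2 ce - 2 bd\, ce + c^2e^2 \\&= a^4 - 2a^2(bd+ce) + (bd-ce)^2 \,. \end{aligned}$$

As a partial check of (5.23), the product of the first row of the matrix in (5.23) (before dividing by \(\det A\)) with the first column of A in (5.21) is

$$\begin{aligned} a^2(a^2-bd-ce) + bd(-a^2+bd-ce) - ce(a^2+bd-ce) = a^4 - 2a^2(bd+ce) + (bd-ce)^2 = \det A \,. \end{aligned}$$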

The Sylvester matrix in (5.20) has the form (5.21) where, by (5.4)-(5.6) and since \( \tanh (\texttt{h}\mu ) = \texttt{h}\mu + r(\mu ^3)\),

$$\begin{aligned}&a = G_{12}^{(1)} - E_{12}^{(1)} = - \texttt{e}_{12}\frac{\mu }{2} \big (1 +r(\epsilon ^2, \mu \epsilon , \mu ^2)\big ) \, , \ b = G_{11}^{(1)} =\mu + r_8(\mu \epsilon ^2, \mu ^3 \epsilon ) \, , \nonumber \\&c = E_{22}^{(1)} =-\texttt{e}_{22}\frac{\mu }{8}(1+r_5(\epsilon ,\mu )) \, , \ d = G_{22}^{(1)} = \mu \texttt{h}+ r( \mu \epsilon , \mu ^3 )\, , \nonumber \\&e = E_{11}^{(1)} = r(\mu \epsilon ^2, \mu ^3) \, , \end{aligned}$$
(5.24)

where \(\texttt{e}_{12}\) and \(\texttt{e}_{22}\), defined respectively in (1.2), (1.3), are positive for any \( \texttt{h}> 0 \).

By (5.22), the determinant of the matrix is

(5.25)

where \(\texttt{D}_\texttt{h}\) is defined in (5.7). By (5.23), (5.24), (5.25) and, since \(\texttt{D}_\texttt{h}=\texttt{h}-\frac{1}{4}\texttt{e}_{12}^2\), we obtain

(5.26)

Therefore, for any \(\mu \ne 0\), there exists a unique solution of the linear system (5.20), namely a unique matrix X which solves the Sylvester equation (5.18).
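The non-vanishing of the determinant for \(\mu \ne 0\) can be sketched from (5.22) and (5.24), reading each \(r(\cdot )\) as a remainder bounded by the listed monomials (an assumption on the notation, which is fixed elsewhere in the paper). One has

$$\begin{aligned} bd - a^2&= \mu ^2 \big ( \texttt{h}- \tfrac{1}{4}\texttt{e}_{12}^2 \big ) + r(\mu ^2\epsilon , \mu ^4) = \mu ^2 \texttt{D}_\texttt{h}\big ( 1 + r(\epsilon ,\mu ^2) \big ) \,, \\ ce \big ( a^2 + bd - \tfrac{1}{2} ce \big )&= \mu ^4 \, r(\epsilon ^2, \mu ^2) \,, \end{aligned}$$

so that \(\det A = \mu ^4 \texttt{D}_\texttt{h}^2 \big ( 1 + r(\epsilon ,\mu ^2) \big )\), which is nonzero for \(\mu \ne 0\) and \((\mu ,\epsilon )\) small because \(\texttt{D}_\texttt{h}> 0\) by Lemma 5.3.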

Lemma 5.6

The matrix solution X of the Sylvester equation (5.18) is analytic in \((\mu ,\epsilon ) \), and admits an expansion as in (5.8).

Proof

By (5.20), (5.26), (5.19), (5.6) we obtain, for any \(\mu \ne 0\)

$$\begin{aligned} \begin{pmatrix} x_{11} \\ x_{12} \\ x_{21} \\ x_{22} \end{pmatrix}&= \frac{1}{\texttt{D}^2_\texttt{h}} \begin{pmatrix} \frac{1}{2}{\texttt{e}_{12}}\texttt{D}_\texttt{h}&{} \texttt{D}_\texttt{h}&{} \frac{1}{32} \texttt{e}_{22} (\texttt{e}_{12}^2+4\texttt{h}) &{} -\frac{1}{8}{\texttt{e}_{12}}\, \texttt{e}_{22} \\ \texttt{h}\texttt{D}_\texttt{h}&{} \frac{1}{2}{\texttt{e}_{12}}\texttt{D}_\texttt{h}&{}\frac{1}{8} \texttt{e}_{12}\texttt{e}_{22} \texttt{h}&{} - \frac{1}{32}\texttt{e}_{22} \, (\texttt{e}_{12}^2+4\texttt{h}) \\ r(\epsilon ^2, \mu ^2) \quad &{} r(\epsilon ^2, \mu ^2) &{} \frac{1}{2}{\texttt{e}_{12}} \texttt{D}_\texttt{h}&{} - {\texttt{D}_\texttt{h}} \\ r(\epsilon ^2, \mu ^2) \quad &{} r(\epsilon ^2, \mu ^2) &{} -\texttt{h}\texttt{D}_\texttt{h}&{} \frac{1}{2}\texttt{e}_{12}\texttt{D}_\texttt{h}\end{pmatrix}\\&\quad \begin{pmatrix} r(\epsilon ) \\ r(\epsilon ) \\ -\texttt{f}_{11}\epsilon + r(\epsilon ^3,\mu \epsilon ^2,\mu ^2\epsilon ) \\ {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\epsilon +r(\epsilon ^2, \mu \epsilon ) \end{pmatrix}(1+r(\epsilon ,\mu )) \, , \end{aligned}$$

which proves (5.8). In particular each \(x_{ij}\) admits an analytic extension at \(\mu = 0\). Note that, for \(\mu = 0\), one has \(E^{(1)}=G^{(1)}=F^{(1)}= 0\) and the Sylvester equation reduces to a tautology. \(\quad \square \)
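To see how the \(\epsilon \)-coefficients in the second row of (5.8) arise, one may keep only the leading-order terms in the third and fourth rows of the matrix-vector product in the proof above (the entries \(r(\epsilon ^2,\mu ^2)\) multiply \(r(\epsilon )\) and contribute only at higher order):

$$\begin{aligned} x_{21}&= \frac{1}{\texttt{D}_\texttt{h}^2} \Big ( \tfrac{1}{2} \texttt{e}_{12} \texttt{D}_\texttt{h}\, (-\texttt{f}_{11}\epsilon ) - \texttt{D}_\texttt{h}\, {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \epsilon \Big ) + \ldots = - \tfrac{1}{2} \texttt{D}_\texttt{h}^{-1} \big ( \texttt{e}_{12}\texttt{f}_{11} + 2 {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \big ) \epsilon + \ldots \,, \\ x_{22}&= \frac{1}{\texttt{D}_\texttt{h}^2} \Big ( \texttt{h}\texttt{D}_\texttt{h}\, \texttt{f}_{11}\epsilon + \tfrac{1}{2} \texttt{e}_{12} \texttt{D}_\texttt{h}\, {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \epsilon \Big ) + \ldots = \tfrac{1}{2} \texttt{D}_\texttt{h}^{-1} \big ( {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \texttt{e}_{12} + 2 \texttt{h}\texttt{f}_{11} \big ) \epsilon + \ldots \,, \end{aligned}$$

where the dots denote higher order terms. These agree with the corresponding entries of X in (5.8).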

Since the matrix S solves the homological equation \(\left[ S\,,\, D^{(1)} \right] + R^{(1)} =0\), identity (5.16) simplifies to

$$\begin{aligned} \texttt{L}_{\mu ,\epsilon }^{(2)} = D^{(1)} +\frac{1}{2}\left[ S\,,\, R^{(1)} \right] + \frac{1}{2} \int _0^1 (1-\tau ^2) \, \exp (\tau S) \, \text {ad}_S^2( R^{(1)} ) \, \exp (-\tau S) \textrm{d}\tau \,.\nonumber \\ \end{aligned}$$
(5.27)
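The simplification can be checked term by term. Inserting \([S, D^{(1)}] = - R^{(1)}\) into (5.16) gives \(\frac{1}{2} [S,[S,D^{(1)}]] = -\frac{1}{2} [S, R^{(1)}]\) and \(\text {ad}_S^3(D^{(1)}) = - \text {ad}_S^2(R^{(1)})\), so that

$$\begin{aligned}&D^{(1)} + [S, D^{(1)}] + \tfrac{1}{2} [S,[S,D^{(1)}]] + R^{(1)} + [S, R^{(1)}] = D^{(1)} + \tfrac{1}{2} [S, R^{(1)}] \,, \\&(1-\tau ) - \tfrac{1}{2} (1-\tau )^2 = \tfrac{1}{2} (1-\tau )(1+\tau ) = \tfrac{1}{2} (1-\tau ^2) \,, \end{aligned}$$

where the second line is the weight obtained by combining the two integral remainders of (5.16) into the single integral in (5.27).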

The matrix \(\frac{1}{2} \left[ S\,,\, R^{(1)} \right] \) is, by (5.9), (5.15), the block-diagonal Hamiltonian and reversible matrix

$$\begin{aligned} \begin{aligned}&\frac{1}{2} \left[ S\,,\, R^{(1)} \right] \\&= \begin{pmatrix} \frac{1}{2} \texttt{J}_2 ( \Sigma \texttt{J}_2 [F^{(1)}]^*- F^{(1)} \texttt{J}_2 \Sigma ^*) &{} 0 \\ 0 &{} \frac{1}{2} \texttt{J}_2 ( \Sigma ^* \texttt{J}_2 F^{(1)}- [F^{(1)}]^* \texttt{J}_2 \Sigma ) \end{pmatrix} \\&= \begin{pmatrix} \texttt{J}_2 {\tilde{E}} &{} 0 \\ 0 &{}\texttt{J}_2 {\tilde{G}} \end{pmatrix}, \end{aligned} \end{aligned}$$
(5.28)

where, since \( \Sigma = \texttt{J}_2 X \),

$$\begin{aligned} {\tilde{E}}:= {{\textbf {Sym}}} \big ( \texttt{J}_2 X \texttt{J}_2 [F^{(1)}]^* \big ) \,, \qquad {\tilde{G}}:= {{\textbf {Sym}}} \big ( X^* F^{(1)} \big ) \,, \end{aligned}$$
(5.29)

denoting \( {{\textbf {Sym}}}(A):= \frac{1}{2} (A+ A^* )\).

Lemma 5.7

The self-adjoint and reversibility-preserving matrices \( {\tilde{E}},\ {\tilde{G}} \) in (5.29) have the form

$$\begin{aligned} \begin{aligned}&{\tilde{E}} = \begin{pmatrix} {\tilde{\texttt{e}}}_{11}\mu \epsilon ^2 + {\tilde{r}}_1(\mu \epsilon ^3,\mu ^2\epsilon ^2) &{}\quad \textrm{i}\,{\tilde{r}}_2(\mu \epsilon ^2) \\ - \textrm{i}\,{\tilde{r}}_2(\mu \epsilon ^2) &{} {\tilde{r}}_5(\mu \epsilon ^2) \end{pmatrix} \,, \quad {\tilde{G}} = \begin{pmatrix} {\tilde{r}}_8(\mu \epsilon ^2) &{}\quad \textrm{i}\,{\tilde{r}}_9 (\mu \epsilon ^2) \\ -\textrm{i}\,{\tilde{r}}_9(\mu \epsilon ^2) &{}\quad {\tilde{r}}_{10}(\mu \epsilon ^2) \end{pmatrix} \,, \\&{\tilde{\texttt{e}}}_{11}:= -\texttt{D}_\texttt{h}^{-1} \big ( {\mathtt c}_{\mathtt h}^{-1} + \texttt{h}\texttt{f}_{11}^2 +\texttt{e}_{12}\texttt{f}_{11}{\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \big )\,. \end{aligned} \nonumber \\ \end{aligned}$$
(5.30)

Proof

For simplicity we set \(F=F^{(1)}\). By (5.8), (5.6), one has

$$\begin{aligned} \texttt{J}_2 X \texttt{J}_2 F^*&= \begin{pmatrix} x_{21}F_{12}-x_{22}F_{11} &{} \quad \textrm{i}\,(x_{21}F_{22}+x_{22} F_{21}) \\ \textrm{i}\,(x_{11}F_{12}+x_{12}F_{11}) &{}\quad - x_{11}F_{22} + x_{12}F_{21} \end{pmatrix} \\&= \begin{pmatrix} {\tilde{\texttt{e}}}_{11}\mu \epsilon ^2 + r(\mu \epsilon ^3,\mu ^2\epsilon ^2) &{} \quad \textrm{i}\,r(\mu \epsilon ^2) \\ \textrm{i}\,r(\mu \epsilon ^2) &{}\quad r(\mu \epsilon ^2) \end{pmatrix}, \end{aligned}$$

with \( {\tilde{\texttt{e}}}_{11}\) being defined as in (5.30). The expansion of \({\tilde{E}}\) in (5.30) follows in view of (5.29). Since \(X = \mathcal {O}(\epsilon )\) by (5.8) and \( F = O(\mu \epsilon ) \) by (5.6) we deduce that \( X^* F = \mathcal {O}(\mu \epsilon ^2 )\) and the expansion of \({\tilde{G}} \) in (5.30) follows. \(\quad \square \)
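The value of \({\tilde{\texttt{e}}}_{11}\) can be recovered from the (1, 1)-entry of the product above. Inserting the leading terms of \(x_{21}\), \(x_{22}\) from (5.8) and of \(F_{11} = \texttt{f}_{11}\mu \epsilon + \ldots \), \(F_{12} = \mu \epsilon \, {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} + \ldots \) from (5.6), one finds

$$\begin{aligned} x_{21}F_{12}-x_{22}F_{11}&= -\tfrac{1}{2} \texttt{D}_\texttt{h}^{-1} \Big [ \big ( \texttt{e}_{12}\texttt{f}_{11} + 2{\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \big ) {\mathtt c}_{\mathtt h}^{-\frac{1}{2}} + \big ( {\mathtt c}_{\mathtt h}^{-\frac{1}{2}}\texttt{e}_{12} + 2\texttt{h}\texttt{f}_{11} \big ) \texttt{f}_{11} \Big ] \mu \epsilon ^2 + r(\mu \epsilon ^3,\mu ^2\epsilon ^2) \\&= -\texttt{D}_\texttt{h}^{-1} \big ( {\mathtt c}_{\mathtt h}^{-1} + \texttt{h}\texttt{f}_{11}^2 +\texttt{e}_{12}\texttt{f}_{11}{\mathtt c}_{\mathtt h}^{-\frac{1}{2}} \big ) \mu \epsilon ^2 + r(\mu \epsilon ^3,\mu ^2\epsilon ^2) \,, \end{aligned}$$

in agreement with the definition of \({\tilde{\texttt{e}}}_{11}\) in (5.30).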

Note that the term \( {\tilde{\texttt{e}}}_{11}\mu \epsilon ^2 \) in the matrix \( {\tilde{E}} \) in (5.29)–(5.30) is of the same order as the (1, 1)-entry of \( E^{(1)} \) in (5.4), and thus contributes to the Whitham-Benjamin function \( \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}\) in the (1, 1)-entry of \( E^{(2)} \) in (5.11). Finally we show that the last term in (5.27) is small.

Lemma 5.8

The \( 4 \times 4 \) Hamiltonian and reversible matrix

$$\begin{aligned} \frac{1}{2} \int _0^1 (1-\tau ^2) \, \exp (\tau S) \, ad _S^2( R^{(1)} ) \, \exp (-\tau S) \, \textrm{d}\tau = \begin{pmatrix} \texttt{J}_2 {\widehat{E}} &{} \texttt{J}_2 F^{(2)}\\ \texttt{J}_2 [ F^{(2)}]^* &{} \texttt{J}_2 {\widehat{G}} \end{pmatrix}\qquad \end{aligned}$$
(5.31)

where the \( 2 \times 2 \) self-adjoint and reversible matrices \({\widehat{E}} \), \( {\widehat{G}}\) have entries

$$\begin{aligned} {\widehat{E}}_{ij} \,, \ {\widehat{G}}_{ij} = r(\mu \epsilon ^3) \,, \quad i,j = 1,2 \,, \end{aligned}$$
(5.32)

and the \(2\times 2\) reversible matrix \( F^{(2)}\) admits an expansion as in (5.14).

Proof

Since S and \( R^{(1)} \) are Hamiltonian and reversibility-preserving then \( ad _S R^{(1)} = [S, R^{(1)} ] \) is Hamiltonian and reversibility-preserving as well. Thus each \( \exp (\tau S) \, ad _S^2( R^{(1)} ) \, \exp (-\tau S)\) is Hamiltonian and reversibility-preserving, and formula (5.31) holds. In order to estimate its entries we first compute \(ad _S^2( R^{(1)} )\). Using the form of S in (5.9) and \([S, R^{(1)} ]\) in (5.28) one gets

$$\begin{aligned} ad _S^2(R^{(1)}) = \begin{pmatrix} 0 &{} \texttt{J}_2{\tilde{F}} \\ \texttt{J}_2 {\tilde{F}}^* &{} 0\end{pmatrix}\qquad \text {where} \qquad {\tilde{F}}:= 2\left( \Sigma \texttt{J}_2 {\tilde{G}} - {\tilde{E}} \texttt{J}_2 \Sigma \right) \end{aligned}$$
(5.33)

and \({\tilde{E}}\), \({\tilde{G}}\) are defined in (5.29). Since \( {\tilde{E}}, {\tilde{G}} = \mathcal {O}(\mu \epsilon ^2 )\) by (5.30), and \(\Sigma = \texttt{J}_2 X = \mathcal {O}( \epsilon ) \) by (5.8), we deduce that \({\tilde{F}} = \mathcal {O}(\mu \epsilon ^3) \). Then, for any \( \tau \in [0,1]\), one has \(\exp (\tau S) \, ad _S^2( R^{(1)} ) \, \exp (-\tau S) = ad _S^2( R^{(1)} ) (1 + \mathcal {O}(\mu ,\epsilon ))\). In particular the matrix \(F^{(2)}\) in (5.31) has the same expansion as \({\tilde{F}}\), namely \( F^{(2)} = \mathcal {O}(\mu \epsilon ^3) \), and the matrices \({\widehat{E}}\), \({\widehat{G}}\) have entries as in (5.32). \(\quad \square \)

Proof of Lemma 5.4

It follows by (5.27)–(5.28), (5.15) and Lemmata 5.7 and 5.8. The matrix \(E^{(2)}:= E^{(1)} + {\tilde{E}} + \widehat{ E}\) has the expansion in (5.11), with \( \texttt{e}_{\scriptscriptstyle {\textsc {WB}}}= \texttt{e}_{11} + {\tilde{\texttt{e}}}_{11} \) as in (5.12). Similarly \(G^{(2)}:= G^{(1)} + {\tilde{G}} + \widehat{G} \) has the expansion in (5.13). \(\quad \square \)

5.2 Complete Block-Decoupling and Proof of the Main Results

We now block-diagonalize the \( 4\times 4\) Hamiltonian and reversible matrix \(\texttt{L}_{\mu ,\epsilon }^{(2)}\) in (5.10). First we split it into its \(2\times 2\)-diagonal and off-diagonal Hamiltonian and reversible matrices

$$\begin{aligned}&\texttt{L}_{\mu ,\epsilon }^{(2)} = D^{(2)} + R^{(2)} \, ,\nonumber \\&D^{(2)}:= \begin{pmatrix} \texttt{J}_2 E^{(2)} &{} 0 \\ 0 &{} \texttt{J}_2 G^{(2)} \end{pmatrix}, \quad R^{(2)}:= \begin{pmatrix} 0 &{} \texttt{J}_2 F^{(2)} \\ \texttt{J}_2 [F^{(2)}]^* &{} 0 \end{pmatrix} . \end{aligned}$$
(5.34)

Lemma 5.9

There exist a \(4\times 4\) reversibility-preserving Hamiltonian matrix \(S^{(2)}:=S^{(2)}(\mu ,\epsilon )\) of the form (5.9), analytic in \((\mu , \epsilon )\), of size \(\mathcal {O}(\epsilon ^3)\), and a \(4\times 4\) block-diagonal reversible Hamiltonian matrix \(P:=P(\mu ,\epsilon )\), analytic in \((\mu , \epsilon )\), of size \({ \mathcal {O}(\mu \epsilon ^6)}\) such that

$$\begin{aligned} \exp (S^{(2)})(D^{(2)}+R^{(2)}) \exp (-S^{(2)}) = D^{(2)}+P \,. \end{aligned}$$
(5.35)

Proof

We set for brevity \( S = S^{(2)} \). The equation (5.35) is equivalent to the system

$$\begin{aligned} {\left\{ \begin{array}{ll} \Pi _{D}\big ( e^{ S} \big (D^{(2)}+R^{(2)} \big ) e^{- S} \big ) - D^{(2)} = P \\ \Pi _{\varnothing }\big ( e^{S} \big (D^{(2)}+R^{(2)} \big ) e^{- S}\big ) = 0 \,, \end{array}\right. } \end{aligned}$$
(5.36)

where \(\Pi _D\) is the projector onto the block-diagonal matrices and \(\Pi _\varnothing \) onto the block-off-diagonal ones. The second equation in (5.36) is equivalent, by a Lie expansion, and since \( [S, R^{(2)}] \) is block-diagonal, to

$$\begin{aligned} R^{(2)} + \left[ S\,,\, D^{(2)}\right] + \underbrace{\Pi _\varnothing \int _0^1 (1-\tau ) e^{\tau S} \text {ad}_S^2\big (D^{(2)}+R^{(2)} \big )e^{- \tau S} \textrm{d}\tau }_{=: \mathcal {R}(S)} = 0 \,.\qquad \end{aligned}$$
(5.37)
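
The Lie expansion behind (5.37) can be sketched as follows, assuming the convention \( \text {ad}_S(A):= [S,A] \) and that matrices of the form (5.9) are block-off-diagonal (neither fact is restated in this section). For any \(4\times 4\) matrix \(A\), Taylor's formula with integral remainder applied to \( \tau \mapsto e^{\tau S} A e^{-\tau S} \) between \(\tau = 0\) and \(\tau = 1\) gives

$$\begin{aligned} e^{S} A e^{-S} = A + [S, A] + \int _0^1 (1-\tau ) e^{\tau S} \text {ad}_S^2 (A) e^{-\tau S} \textrm{d}\tau \,. \end{aligned}$$

Take \( A = D^{(2)} + R^{(2)} \). The commutator of a block-off-diagonal matrix with a block-diagonal one is block-off-diagonal, whereas the commutator of two block-off-diagonal matrices is block-diagonal. Hence

$$\begin{aligned}&\Pi _\varnothing \big ( A + [S,A] \big ) = R^{(2)} + [S, D^{(2)}] \,, \\&\Pi _D \big ( A + [S,A] \big ) = D^{(2)} + [S, R^{(2)}] \,. \end{aligned}$$

Adding the \(\Pi _\varnothing \)-projection of the integral remainder to the first identity gives the left-hand side of (5.37); the second identity accounts for the term \([S, R^{(2)}]\) that enters \(P\) through the first equation of (5.36).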

The “nonlinear homological equation” (5.37),

$$\begin{aligned}{}[S,D^{(2)}] = -R^{(2)} - \mathcal {R}(S) \,, \end{aligned}$$
(5.38)

amounts to solving the \(4\times 4\) real linear system

(5.39)

associated, as in (5.20), to (5.38). The vector \( \mu {v}(\mu ,\epsilon ) \) is associated with \( - R^{(2)} \), where \(R^{(2)} \) is in (5.34). The vector \( \mu {g}(\mu ,\epsilon ,{x}) \) is associated with the matrix \( - \mathcal {R}(S) \), which is a Hamiltonian and reversible block-off-diagonal matrix (i.e. of the form (5.15)). The factor \(\mu \) is present in \(D^{(2)}\) and \(R^{(2)}\), see (5.11), (5.13), (5.14), and the analytic function \( {g}(\mu ,\epsilon ,{x}) \) is quadratic in \( {x} \) (because of the presence of \( \text {ad}_S^2 \) in \( \mathcal {R}(S)\)). In view of (5.14) one has

$$\begin{aligned} \mu {v}(\mu ,\epsilon ):= (-F^{(2)}_{21},F^{(2)}_{22},-F^{(2)}_{11},F^{(2)}_{12})^\top , \quad F^{(2)}_{ij} = \, {r(\mu \epsilon ^3)} \,. \end{aligned}$$
(5.40)

System (5.39) is equivalent to and, writing (cfr. (5.26)), to

By the implicit function theorem this equation admits a unique small solution \({x}={x}(\mu ,\epsilon )\), analytic in \( (\mu , \epsilon ) \), with the same size \({\mathcal {O}(\epsilon ^3)} \) as \( {v} \) in (5.40). Then the first equation of (5.36) gives \( P = [S, R^{(2)}] + \Pi _D \int _0^1 (1-\tau ) e^{\tau S} \text {ad}_S^2\big (D^{(2)}+R^{(2)} \big )e^{- \tau S} \textrm{d}\tau \), and its estimate follows from those of \(S\) and \( R^{(2)} \) (see (5.14)). \(\quad \square \)
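
The size of \(P\) can be checked term by term. This is only a sketch, and it assumes \( D^{(2)} = \mathcal {O}(\mu ) \), as suggested by the remark in the proof that the factor \(\mu \) is present in \(D^{(2)}\); the precise expansions are those in (5.11) and (5.13). Using \( S = \mathcal {O}(\epsilon ^3) \) and \( R^{(2)} = \mathcal {O}(\mu \epsilon ^3) \), the latter by (5.34) and (5.40), one gets

$$\begin{aligned}&[S, R^{(2)}] = \mathcal {O}(\epsilon ^3) \cdot \mathcal {O}(\mu \epsilon ^3) = \mathcal {O}(\mu \epsilon ^6) \,, \\&\text {ad}_S^2\big ( D^{(2)} + R^{(2)} \big ) = \mathcal {O}(\epsilon ^6) \cdot \big ( \mathcal {O}(\mu ) + \mathcal {O}(\mu \epsilon ^3) \big ) = \mathcal {O}(\mu \epsilon ^6) \,. \end{aligned}$$

Since \( e^{\pm \tau S} \) is bounded uniformly in \( \tau \in [0,1] \), the integral term in the formula for \(P\) has the same size as \( \text {ad}_S^2 ( D^{(2)} + R^{(2)} ) \), so that \( P = \mathcal {O}(\mu \epsilon ^6) \), in agreement with the statement of Lemma 5.9.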

Proof of Theorems 2.5 and 1.1

By Lemma 5.9 and recalling (3.1) the operator \( \mathcal {L}_{\mu ,\epsilon }: \mathcal {V}_{\mu ,\epsilon } \rightarrow \mathcal {V}_{\mu ,\epsilon } \) is represented by the \(4\times 4\) Hamiltonian and reversible matrix

$$\begin{aligned} \textrm{i}\,{\mathtt c}_{\mathtt h}\mu + \exp ( S^{(2)})\texttt{L}_{\mu ,\epsilon }^{(2)} \exp (- S^{(2)}) = \textrm{i}\,{\mathtt c}_{\mathtt h}\mu + \begin{pmatrix} \texttt{J}_2 E^{(3)} &{} 0 \\ 0 &{} \texttt{J}_2 G^{(3)} \end{pmatrix} =: \begin{pmatrix} \texttt{U} &{} 0 \\ 0 &{} \texttt{S} \end{pmatrix} \,, \end{aligned}$$

where the matrices \(E^{(3)}\) and \(G^{(3)}\) expand as in (5.11), (5.13). Consequently the matrices \(\texttt{U}\) and \(\texttt{S}\) expand as in (2.40). Theorem 2.5 is proved. Theorem 1.1 is a straightforward corollary. The function \(\underline{\mu }(\epsilon ) \) in (1.4) is defined implicitly as the solution of \(\Delta _{\scriptscriptstyle {\textsc {BF}}}(\texttt{h};\mu ,\epsilon ) = 0\), with \(\Delta _{\scriptscriptstyle {\textsc {BF}}}\) as in (1.6), for \(\epsilon \) small enough, depending on \(\texttt{h}\). \(\quad \square \)