1 Introduction and main result

In the last several decades, models of (infinitely) many coupled oscillators have found diverse applications in various fields of science. Examples include the collective dynamics of Josephson junctions [1, 2], lasers [3, 4], relativistic magnetrons [5], chemical reactions [6–9], circadian pacemakers [10, 11], intestinal electrical rhythms [12], and a variety of biological processes [13–15]. The study of these systems has produced outstanding examples of the different types of dynamical behavior that can be induced by the presence of coupling; see [13, 14, 16–30] for more details. Meanwhile, the pendulum equation and its analogues can be used to describe synchronous electric motor models of a single machine infinite bus [31], Josephson junctions [32–34], super-conducting devices [35], the shunted model of an electrical rotator [36], and many other applications. Among these interesting models are the networks of weakly coupled pendulum equations

$$ \frac{d^{2}x_{n}}{dt^{2}}+\lambda^{2}_{n} \sin x_{n}=\epsilon W_{n}(x_{n-1},x_{n},x_{n+1}),\quad n\in\mathbb{Z}, $$
(1.1)

where \(\lambda_{n}, n\in\mathbb{Z}\) are constants and \(W_{n}\) are real analytic functions given in (1.6). In this paper, we show via KAM theory that (1.1) admits normally hyperbolic invariant tori which are infinite dimensional in both the tangent and normal directions.

The classical KAM theory [37–39], founded by Kolmogorov, Arnold, and Moser in the last century, is a milestone in the development of Hamiltonian systems and provided a new method for their study. Established on a 2n-dimensional smooth manifold, it asserts that most of the non-resonant tori of a non-degenerate integrable Hamiltonian system are not destroyed under small perturbations but are only slightly deformed. Over the past half century or so, KAM theory has developed into a very complete theory.

In the 1980s, KAM theory was successfully extended to infinite-dimensional Hamiltonian systems of short range in order to study a class of Hamiltonian networks of weakly coupled oscillators. To describe this more precisely, we consider three cases for the infinite-dimensional Hamiltonian system

$$ H=H(u,v)=\frac{1}{2}\sum_{n\in\mathbb {Z}} \lambda_{n}\bigl(u_{n}^{2}+v_{n}^{2} \bigr)+\epsilon P(u,v), $$
(1.2)

where P is of short range.

(i) Introducing the action-angle variables on \(u_{n},v_{n}\) for all \(n\in\mathbb{Z}\), we find that the Hamiltonian system (1.2) takes the form

$$ H=H(I,\theta)=\sum_{n\in\mathbb {Z}} \omega_{n}I_{n}+\epsilon P(I,\theta), $$
(1.3)

where the tangent frequencies are \(\omega_{n}=\lambda_{n},n\in\mathbb{Z}\). Vittot and Bellissard [40] and Fröhlich et al. [41] showed that there is a full dimensional invariant torus with infinite dimension for (1.3). Pöschel [42] also obtained these results for Hamiltonian systems with a more general spatial structure for P. Thus, any solution starting from the torus is almost-periodic in time. For more results on the existence of almost-periodic solutions, see Bourgain [43] and Cong et al. [44].

(ii) Constructing the action-angle variables on \(u_{n},v_{n}\) for \(n\in\{ 1,2,\ldots,m\}\) and, by abuse of notation, rearranging the subscripts of \(u_{n},v_{n},\lambda_{n}\) at the remaining sites via

$$0\rightarrow0;-n\rightarrow-n;m+n\rightarrow n, \quad\text{for } n\geq1. $$

Then Hamiltonian (1.2) is of the form

$$ H=H(I,\theta)=\sum^{m}_{n=1} \omega _{n}I_{n}+\frac{1}{2}\sum _{n\in\mathbb{Z}}\lambda_{n}\bigl(u_{n}^{2}+v_{n}^{2} \bigr)+\epsilon P(I,\theta,u,v). $$
(1.4)

Here the tangent frequencies ω are m-dimensional and the normal frequencies are infinite dimensional. Kuksin [45], Pöschel [46, 47], and Wayne [48] (in alphabetic order) concluded that, provided each \(\lambda_{n}\), \(n\in\mathbb{Z}\), has multiplicity 1 and some non-resonance conditions (Melnikov conditions) hold, most of the elliptic lower-dimensional invariant tori for (1.4), without the short range assumption, persist under small perturbations. When all the \(\lambda_{n}\) are equal, Yuan [49, 50] obtained similar KAM results for the infinite-dimensional Hamiltonian system (1.4) of short range. On the other hand, the persistence of hyperbolic lower-dimensional invariant tori of finite dimension was first studied by Moser [51]. Graff [52] then generalized Moser’s theory, and Zehnder [53, 54] gave an alternative proof of Graff’s result by an implicit function technique. For further developments in this direction, we refer to [55–61]. By virtue of the KAM theory for this case, there are quasi-periodic solutions for the coupled pendulum equations.

(iii) For the case in which both the tangent frequencies ω and the normal frequencies are infinite dimensional, to our knowledge no KAM theorem is available, either of elliptic type or of hyperbolic type. However, the existence of almost-periodic solutions for the networks of weakly coupled pendulum equations (1.1) needs to be proved. That is the problem we are most concerned with in this paper.

To describe it more accurately, by virtue of the technique of action-angle variables on \(x_{2j+1}, j\in\mathbb{Z}\) and the transformation \((x_{2j}-\pi,\dot{x}_{2j})=(1/\sqrt{2\lambda _{2j}}(u_{j}-v_{j}), \sqrt{\lambda_{2j}/2}(u_{j}+v_{j})),j\in\mathbb{Z}\), we transform the Hamiltonian of Eq. (1.1) into the form

$$ H=H(I,\theta,u,v)=\sum_{j\in\mathbb{Z}}\omega _{j}I_{j}+\sum_{j\in\mathbb{Z}} \lambda_{2j}u_{j}v_{j} +\epsilon P(I,\theta,u,v), $$
(1.5)

where P is of short range. Both the tangent frequencies \(\omega=(\omega_{j})_{j\in\mathbb{Z}}\) and the normal frequencies \(\Omega =(\lambda_{2j})_{j\in\mathbb{Z}}\) are infinite dimensional, so (1.5) belongs to case (iii). Moreover, \((u_{j},v_{j})_{j\in\mathbb{Z}}=(0,0)_{j\in \mathbb{Z}}\) is a hyperbolic equilibrium point in the normal direction. We therefore study the persistence of normally hyperbolic invariant tori which are infinite dimensional in both tangent and normal directions for this Hamiltonian system.

Before stating our theorem, we define the norm

$$\vert x \vert _{\infty}:=\sup_{j\in\mathbb{Z}} \vert x_{j} \vert $$

in \({\mathbb{C}}^{\mathbb{Z}}\) and give the following assumptions:

(A1) \(\lambda_{n}, n\in\mathbb{Z}\) satisfy

$$\lambda_{2j+1}=\lambda>0,\qquad \lambda_{2j}\geq \vert j \vert ^{-N}, \qquad \vert \lambda _{2i}-\lambda_{2j} \vert \geq \bigl\vert \vert i \vert ^{-N}- \vert j \vert ^{-N} \bigr\vert ,\quad i,j\in\mathbb{Z}\backslash\{0\},N>0. $$

(A2) \(W_{n}\) satisfies

$$ W_{n}=W'\bigl(x_{n+1}-x_{n}+(-1)^{n} \pi\bigr)e^{-\frac{3}{4} \vert n \vert ^{1+\alpha}}- W'\bigl(x_{n}-x_{n-1}-(-1)^{n} \pi\bigr)e^{-\frac{3}{4} \vert n-1 \vert ^{1+\alpha}} $$
(1.6)

with \(\alpha>0\) (arbitrarily small), and \(W=O( \vert x \vert ^{3})\) is real analytic in the strip domain \(\{x\in\mathbb{C}: \vert \operatorname{Im} x \vert <\delta_{0}\}\) for some constant \(\delta_{0}>0\).

The expression \(\frac{1}{2}y^{2}+\lambda^{2}(1-\cos x)=h\) with \(\frac {\lambda^{2}}{2}\leq h\leq\lambda^{2}\) denotes a simple closed curve Γ which encloses \((0, 0)\) in the \((x, y)\)-plane. Let \(\rho=\rho(h)\) be the area enclosed by \(\Gamma(h)\), i.e.,

$$\rho(h)= \oint_{\frac{1}{2}y^{2}+\lambda^{2}(1-\cos x)=h}y\,dx. $$

Then we can see that \(\rho'(h)>0,\rho''(h)\neq0\) for any \(h\in [\frac{\lambda^{2}}{2},\lambda^{2}]\).
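The monotonicity of \(\rho\) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes \(\lambda=1\) by default, evaluates the area integral with a midpoint rule after the substitution \(x=x_{\max}\sin t\) (our choice, made only to remove the square-root singularity at the turning points), and approximates \(\rho'(h)\) by a central difference. The function names `rho` and `rho_prime` are hypothetical.

```python
import math

def rho(h, lam=1.0, n=2000):
    """Area enclosed by the level curve y^2/2 + lam^2 (1 - cos x) = h, for 0 < h < 2 lam^2."""
    x_max = math.acos(1.0 - h / lam**2)    # turning point, where y = 0
    dt = math.pi / n                        # t runs over (-pi/2, pi/2)
    total = 0.0
    for i in range(n):
        t = -math.pi / 2 + (i + 0.5) * dt   # midpoint rule node
        x = x_max * math.sin(t)             # substitution x = x_max sin t
        y_sq = 2.0 * (h - lam**2 * (1.0 - math.cos(x)))
        y = math.sqrt(max(y_sq, 0.0))       # guard against tiny negative rounding
        total += y * x_max * math.cos(t) * dt
    return 2.0 * total                      # upper and lower branches of the curve

def rho_prime(h, lam=1.0, dh=1e-4):
    """Central-difference approximation of rho'(h)."""
    return (rho(h + dh, lam) - rho(h - dh, lam)) / (2.0 * dh)
```

For small \(h\) the curve is nearly the ellipse of the linearized oscillator, so `rho(h)` should be close to \(2\pi h/\lambda\); on \([\frac{\lambda^{2}}{2},\lambda^{2}]\) both `rho` and `rho_prime` should increase with \(h\).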

Equation (1.1) can be regarded as a perturbation of the following system:

$$\begin{aligned} &\frac{d^{2}x_{2j+1}}{dt^{2}}+\lambda^{2} \sin x_{2j+1}=0, \quad j \in\mathbb{Z}, \end{aligned}$$
(1.7a)
$$\begin{aligned} &\frac{d^{2}x_{2j}}{dt^{2}}-\lambda^{2}_{2j} (x_{2j}-\pi)=0, \quad j\in\mathbb{Z}. \end{aligned}$$
(1.7b)

Let \(\eta=(h_{j})_{j\in\mathbb{Z}}\) with \(h_{j}\in[\frac{\lambda ^{2}}{2},\lambda^{2}]\). Since \(\frac{1}{2}y^{2}+\lambda ^{2}(1-\cos x)=h_{j},j\in\mathbb{Z}\) is a first integral of (1.7a), the product \(\prod_{j\in\mathbb{Z}}\Gamma(h_{j})\) is an invariant torus of (1.7a) with the frequencies \(\omega(\eta)=(H'_{0}(\rho(h_{j})))_{j\in \mathbb{Z}}\), where \(H_{0}\) is the inverse of \(\rho =\rho(h)\). Observe that \((\pi,0)\) is an equilibrium of (1.7b). Thus,

$$\mathcal{T}(\eta)=\prod_{j\in\mathbb{Z}}\Gamma(h_{j}) \times\prod_{j\in\mathbb{Z}}\bigl\{ (\pi,0)\bigr\} $$

is an invariant torus with the frequencies \(\omega(\eta)\) for (1.7a)–(1.7b). Therefore, any solution of (1.7a)–(1.7b) starting from \(\mathcal{T}(\eta)\) is a trivial breather for (1.7a)–(1.7b). Our goal is to show that the torus \(\mathcal{T}(\eta)\) persists under small perturbations. Our main result states that a large Cantor sub-family of rotational \(\mathbb{Z}\)-tori persists and is only slightly deformed; thus the solutions starting from the persisting tori are almost-periodic breathers of (1.1).

Theorem 1.1

Suppose that Eq. (1.1) satisfies assumptions (A1), (A2). Then, for the set \(\Omega=[\frac{\lambda^{2}}{2},\lambda ^{2}]^{\mathbb{Z}}\), there is a positive constant \(\epsilon^{*}\) sufficiently small such that, when \(0<\epsilon<\epsilon^{*}\), there are a set \(\mathcal{S}\subset\Omega\) with \(\operatorname{Prob}(\mathcal{S})\) arbitrarily close to one (depending on ϵ), a family of \(\mathbb{Z}\)-tori

$$\mathcal{T}[\mathcal{S}]=\bigcup_{\eta\in\mathcal{S}}\mathcal {T}( \eta)\subset\bigcup_{\eta\in\Omega}\mathcal{T}(\eta) $$

over \(\mathcal{S}\), and an analytic embedding

$$\Phi:\mathcal{T}[\mathcal{S}]\hookrightarrow\mathbb{R}^{\mathbb {Z}}\times \mathbb{T}^{\mathbb{Z}}\times\mathbb{R}^{\mathbb {Z}}\times \mathbb{R}^{\mathbb{Z}}, $$

which is a higher order perturbation of the inclusion map \(\Phi _{0}:\bigcup_{\eta\in\Omega}\mathcal{T}(\eta)\hookrightarrow \mathbb{R}^{\mathbb{Z}}\times\mathbb{T}^{\mathbb{Z}}\times\mathbb {R}^{\mathbb{Z}}\times\mathbb{R}^{\mathbb{Z}}\) restricted to \(\mathcal{T}[\mathcal{S}]\), such that the restriction of Φ to each \(\mathcal{T}(\eta)\) in the family is an embedding of a rotational \(\mathbb{Z}\)-torus for (1.1). Moreover, any solution of (1.1) starting from \(\Phi(\mathcal{T}(\eta ))\) is an almost-periodic breather of frequencies \(\omega^{*}\) with \(\vert \omega^{*}-\omega \vert _{\infty}=O(\epsilon^{1/6})\).

This paper is organized as follows. In Sect. 2, Eq. (1.1) is, by the technique of action-angle variables, reduced to a normal form to which a KAM theorem is applicable. In Sect. 3, a KAM theorem and its iterative lemma are given, and the proof for the iterative lemma is finished. Theorem 1.1 and the KAM theorem are proven in Sect. 4.

2 Reduced to normal form

In this section, we find a series of changes of variables that transform Eq. (1.1) into a normal form.

Let \(\dot{x}_{n}=y_{n}\). Then (1.1) is a Hamiltonian system with its Hamiltonian

$$\begin{aligned} H={}&\sum_{j\in\mathbb{Z}}\biggl\{ \frac{1}{2}y^{2}_{2j+1}+ \lambda ^{2}(1-\cos x_{2j+1})\biggr\} +\sum _{j\in\mathbb{Z}}\biggl\{ \frac{1}{2}y^{2}_{2j}-(1+ \cos x_{2j})\lambda^{2}_{2j}\biggr\} \\ &{}+\epsilon\sum_{j\in\mathbb{Z}}\bigl\{ W(x_{2j+2}-x_{2j+1}- \pi )e^{-\frac{3}{2} \vert j+\frac{1}{2} \vert ^{1+\alpha}}+W(x_{2j+1}-x_{2j}+\pi)e^{-\frac{3}{2} \vert j \vert ^{1+\alpha}}\bigr\} \\ ={}&\sum_{j\in\mathbb{Z}}\biggl\{ \frac{1}{2}y^{2}_{2j+1}+ \lambda^{2}(1-\cos x_{2j+1})\biggr\} +\frac{1}{2}\sum _{j\in\mathbb{Z}}\bigl\{ y^{2}_{2j}-\lambda ^{2}_{2j}(x_{2j}-\pi)^{2}\bigr\} +\sum_{j\in\mathbb{Z}}O\bigl( \vert x_{2j}-\pi \vert ^{4}\bigr) \\ &{} +\epsilon\sum_{j\in\mathbb{Z}}\bigl\{ W(x_{2j+2}-x_{2j+1}-\pi )e^{-\frac{3}{2} \vert j +\frac{1}{2}\vert ^{1+\alpha}}+W(x_{2j+1}-x_{2j}+ \pi)e^{-\frac{3}{2} \vert j \vert ^{1+\alpha}}\bigr\} . \end{aligned}$$

We now carry out the standard reduction to action-angle variables. To construct the map \((x, y) \mapsto(\theta, \rho)\), where ρ and θ are action and angle variables, respectively, we let \(H_{0}(\rho)\) be the value of the function \(\frac{1}{2}y^{2}+\lambda^{2}(1-\cos x)\) on the closed curve which encloses area ρ in the \((x, y)\)-plane, i.e., we define \(H_{0}(\rho)\) implicitly by

$$ \oint_{\frac{1}{2}y^{2}+\lambda^{2}(1-\cos x)=H_{0}(\rho)}y\,dx=\rho. $$
(2.1)

We now define a generating function \(S(x, \rho)\) as follows:

$$ S(x,\rho)= \int_{\Gamma^{*}}y\,dx, $$
(2.2)

where \(\Gamma^{*}\) is a part of the closed curve \(\frac{1}{2}y^{2}+\lambda ^{2}(1-\cos x)=H_{0}(\rho)\) connecting the y-axis with point \((x, y)\), oriented clockwise. We define the map \(\psi: (\theta, \rho) \mapsto (x, y)\) via

$$ S_{x}(x, \rho)=y,\qquad S_{\rho}(x, \rho)=\theta. $$
(2.3)

Then

$$\begin{aligned} &dx\wedge \,dy=dx\wedge(S_{xx}\,dx+S_{x\rho}\,d \rho)=S_{x\rho}\,dx\wedge \,d\rho, \\ &d\theta\wedge \,d\rho=(S_{\rho x}\,dx+S_{\rho\rho}\,d\rho)\wedge \,d \rho =S_{\rho x}\,dx\wedge \,d\rho. \end{aligned}$$

Thus,

$$dx\wedge \,dy=d\theta\wedge \,d\rho. $$

Let

$$ \Psi: \textstyle\begin{cases}(x_{2j+1},y_{2j+1})=\psi(\theta_{j},\rho_{j}),\\ (x_{2j}-\pi,y_{2j})=(1/\sqrt{2\lambda_{2j}}(u_{j}-v_{j}), \sqrt{\lambda _{2j}/2}(u_{j}+v_{j})). \end{cases} $$
(2.4)

Then

$$ \sum_{j\in\mathbb{Z}}\,dx_{2j+1}\wedge \,dy_{2j+1}+\sum_{j\in\mathbb{Z}}\,dx_{2j} \wedge \,dy_{2j}=\sum_{j\in\mathbb {Z}}\,d \theta_{j}\wedge \,d\rho_{j}+\sum _{j\in\mathbb{Z}}\,du_{j}\wedge \,dv_{j}. $$

This implies that Ψ is symplectic. Thus, Hamiltonian H is transformed into

$$\begin{aligned} H={}&\sum_{j\in\mathbb{Z}}H_{0}( \rho_{j})+\sum_{j\in \mathbb{Z}}\lambda_{2j}u_{j}v_{j} +\sum_{j\in\mathbb {Z}}O\bigl( \vert u_{j}-v_{j} \vert ^{4}\bigr) \\ &{}+\epsilon\sum_{j\in\mathbb{Z}}\bigl\{ W(x_{2j+2}-x_{2j+1}- \pi)e^{-\frac{3}{2} \vert j+\frac{1}{2} \vert ^{1+\alpha}}+W(x_{2j+1}-x_{2j}+\pi)e^{-\frac{3}{2} \vert j \vert ^{1+\alpha}}\bigr\} . \end{aligned}$$
(2.5)
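The linear part of the map (2.4) can be verified directly. The sketch below is an illustration under our own naming (the helpers `to_xy`, `jacobian_det`, and `quadratic_part` are hypothetical): it checks numerically that the Jacobian determinant of \((u_{j},v_{j})\mapsto(x_{2j}-\pi,y_{2j})\) equals 1, so that \(dx_{2j}\wedge dy_{2j}=du_{j}\wedge dv_{j}\), and that \(\frac{1}{2}\{y_{2j}^{2}-\lambda_{2j}^{2}(x_{2j}-\pi)^{2}\}\) becomes \(\lambda_{2j}u_{j}v_{j}\).

```python
import math

def to_xy(u, v, lam):
    """Linear part of (2.4): returns (x - pi, y) at an even site, with lam = lambda_{2j} > 0."""
    x_shift = (u - v) / math.sqrt(2.0 * lam)
    y = math.sqrt(lam / 2.0) * (u + v)
    return x_shift, y

def jacobian_det(lam):
    """Determinant of d(x - pi, y)/d(u, v); the map is linear, so it is constant."""
    a, c = to_xy(1.0, 0.0, lam)   # image of the u basis vector: (dx/du, dy/du)
    b, d = to_xy(0.0, 1.0, lam)   # image of the v basis vector: (dx/dv, dy/dv)
    return a * d - b * c

def quadratic_part(u, v, lam):
    """(1/2) (y^2 - lam^2 (x - pi)^2), the quadratic normal part before the change of variables."""
    x_shift, y = to_xy(u, v, lam)
    return 0.5 * (y * y - lam**2 * x_shift**2)
```

For any sample values, `jacobian_det(lam)` should return 1 and `quadratic_part(u, v, lam)` should agree with `lam * u * v` up to rounding.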

By assumption (A2), there exists the inverse \(H_{0}^{-1}\) of \(H_{0}\). Let \([\mu,\nu]=H_{0}^{-1}([\frac{\lambda^{2}}{2},\lambda^{2}])\). For any \(\xi=(\xi_{j})_{j\in\mathbb{Z}}\in[\mu,\nu]^{\mathbb {Z}}\), let \(\rho=I+\xi\), where \(I=(I_{j})_{j\in\mathbb{Z}}\). Expand \(H_{0}(\xi_{j}+I_{j})\) at \(\xi_{j}\) by Taylor’s formula:

$$H_{0}(\xi_{j}+I_{j})=H_{0}( \xi_{j})+H'_{0}(\xi_{j})I_{j}+O \bigl( \vert I_{j} \vert ^{2}\bigr), \quad j\in \mathbb{Z}. $$

Let \(\omega=(H'_{0}(\xi_{j}))_{j\in\mathbb{Z}}, \Pi=[\mu,\nu ]^{\mathbb{Z}}\), and from transformation (2.4), we can denote

$$W(x_{2j+2}-x_{2j+1}-\pi)e^{-\frac{3}{2} \vert j +\frac{1}{2}\vert ^{1+\alpha}}+W(x_{2j+1}-x_{2j}+ \pi)e^{-\frac{3}{2} \vert j \vert ^{1+\alpha}}=f_{j}(I_{j}, \theta_{j},u_{j},u_{j+1},v_{j},v_{j+1}, \xi_{j}). $$

Then \(\xi\in\Pi\) and (2.5) can be written as

$$\begin{aligned} H={}&\sum_{j\in\mathbb{Z}} \omega_{j}I_{j}+\sum_{j\in \mathbb{Z}} \lambda_{2j}u_{j}v_{j} +\sum _{j\in\mathbb {Z}}O\bigl( \vert u_{j}-v_{j} \vert ^{4}\bigr)+\sum_{j\in\mathbb{Z}}O\bigl( \vert I_{j} \vert ^{2}\bigr) \\ &{}+\epsilon\sum_{j\in\mathbb{Z}}f_{j}(I_{j}, \theta_{j},u_{j},u_{j+1},v_{j},v_{j+1}, \xi_{j}), \end{aligned}$$
(2.6)

where the constant \(\sum_{j\in\mathbb{Z}}H_{0}(\xi_{j})\) is omitted since it does not affect the dynamics.

Now we need to introduce the domain of definition of the Hamiltonian H. Set

$$\begin{aligned} \mathcal{D}={}&\bigl\{ (I,\theta,u,v,\xi)\in\mathbb {C}^{\mathbb{Z}}\times\mathbb{C}^{\mathbb{Z}}\times\mathbb {C}^{\mathbb{Z}} \times\mathbb{C}^{\mathbb{Z}}\times\mathbb {C}^{\mathbb{Z}}: \vert I_{j} \vert < \rho^{0}_{j}, \vert \operatorname{Im} \theta_{j} \vert < \delta_{0}, \\ &{} \vert u_{j} \vert < \varrho ^{0}_{j}, \vert v_{j} \vert < \varrho^{0}_{j}, \text{ and } \bigl\vert \xi_{j}-\xi'_{j} \bigr\vert < w \text{ for some } \xi'\in\Pi,j\in\mathbb{Z}\bigr\} , \end{aligned}$$
(2.7)

where \(\rho^{0}_{j}=\frac{1}{4}\mu e^{- \vert j \vert ^{1+\alpha}}\) and \(\varrho ^{0}_{j}=\frac{1}{4}\sqrt{\rho^{0}_{j}}\). The functions \(f_{j}(I,\theta,u,v,\xi), j\in\mathbb{Z}\) are real analytic on the domain \(\mathcal{D}\) and satisfy

$$ \sup_{\mathcal{D}} \bigl\vert f_{j}(I,\theta,u,v,\xi) \bigr\vert \leq K\exp \biggl[-\frac{3}{2} \bigl(\vert j \vert -1\bigr) ^{1+\alpha}\biggr],\quad j\in\mathbb{Z}, $$
(2.8)

for some \(K>0\).

Let

$$\begin{aligned} &P(I,\theta,u,v,\xi)=\sum_{j\in\mathbb{Z}}f_{j}(I_{j}, \theta _{j},u_{j},u_{j+1},v_{j},v_{j+1}, \xi_{j}), \\ &Q(I,u,v)=\sum_{j\in\mathbb{Z}}O\bigl( \vert u_{j}-v_{j} \vert ^{4}\bigr)+\sum _{j\in\mathbb {Z}}O\bigl( \vert I_{j} \vert ^{2} \bigr). \end{aligned}$$

Then Hamiltonian (2.6) is of the form

$$ H=H(I,\theta,u,v,\xi)=\sum_{j\in \mathbb{Z}} \omega_{j}I_{j}+\sum_{j\in\mathbb{Z}} \lambda_{2j}u_{j}v_{j} +Q(I,u,v)+\epsilon P(I, \theta,u,v,\xi), $$
(2.9)

where the Hamiltonian H satisfies the following conditions:

(B1) H is real analytic in \(\mathcal{D}\).

(B2) (Non-degenerate) There are constants \(\delta_{b}>\delta_{a}> 0\) such that on some complex neighborhood of Π

$$\delta_{a}\leq \biggl\vert \frac{\partial\omega_{j}}{\partial\xi_{j}} \biggr\vert \leq \delta _{b},\quad j\in\mathbb{Z}. $$

(B3) The inequality

$$\vert P \vert \leq K_{1} $$

holds on the domain \(\mathcal{D}\), where \(K_{1}\) is a positive constant.

3 KAM theorem and its iterative lemma

3.1 Statement of KAM theorem

Let \(\hat{\mathbb{T}}^{\mathbb{Z}}=\mathbb{C}^{\mathbb{Z}}/(2\pi \mathbb{Z})^{\mathbb{Z}}\). Define the phase space

$$\mathcal{P}=\mathbb{C}^{\mathbb{Z}}\times\hat{\mathbb {T}}^{\mathbb{Z}} \times\mathbb{C}^{\mathbb{Z}}\times\mathbb {C}^{\mathbb{Z}}\ni(I,\theta,u,v). $$

We now consider a small perturbation

$$ H=H_{0}+\epsilon P(I,\theta,u,v,\xi),\quad \xi\in\Pi, $$
(3.1)

of an infinite dimensional Hamiltonian in the parameter-dependent normal form

$$ H_{0}=\sum_{j\in\mathbb{Z}}\omega_{j}I_{j}+ \sum_{j\in \mathbb{Z}}\lambda_{2j}u_{j}v_{j},\quad \xi\in\Pi, $$
(3.2)

on the phase space \(\mathcal{P}\) with the symplectic structure

$$ \sum_{j\in\mathbb{Z}}\,d\theta_{j}\wedge \,dI_{j}+\sum_{j\in\mathbb{Z}}\,du_{j} \wedge \,dv_{j}. $$
(3.3)

The Hamiltonian equations of motion of \(H_{0}\) are as follows:

$$ \dot{\theta}=\omega, \qquad \dot{I}=0,\qquad \dot {u}=\Lambda u,\qquad \dot{v}=-\Lambda v, $$

where \(\Lambda=\operatorname{diag}(\lambda_{2j})_{j\in\mathbb{Z}}\). Hence, for each \(\xi\in\Pi\), there is an infinite dimensional invariant torus \(\mathcal{T}^{\mathbb{Z}}_{0}=\mathbb{T}^{\mathbb {Z}}\times\{0\}\times\{0\}\times\{0\}\) for \(H_{0}\).

Our aim is to prove the persistence of the torus \(\mathcal{T}^{\mathbb{Z}}_{0}\) under the small perturbation ϵP for “most” \(\xi\in\Pi\) via a KAM method similar to that in [41].

Theorem 3.1

Suppose that Hamiltonian (2.9) satisfies conditions (B1)–(B3). Then there exists a small constant \(\epsilon^{*}\) such that, if \(0<\epsilon<\epsilon^{*}\), then there are a set \(\Pi_{\infty}\subset\Pi\) with \(\operatorname{Prob}(\Pi_{\infty})\) arbitrarily close to one (depending on ϵ), an analytic torus embedding \(\mathcal{C}^{\infty}: \mathbb{T}^{\mathbb {Z}}\times\Pi_{\infty}\rightarrow\mathcal{P}\), and a map \(\omega ^{\infty}: \Pi_{\infty}\rightarrow\mathbb{R}^{\mathbb{Z}}\) such that, for each \(\xi\in\Pi_{\infty}\), the map \(\mathcal{C}^{\infty }\) restricted to \(\mathbb{T}^{\mathbb{Z}}\times\{\xi\}\) is an analytic embedding of a rotational torus with frequencies \(\omega^{\infty}\) satisfying \(\vert \omega^{\infty}-\omega \vert _{\infty}<\epsilon^{1/6}\) for the Hamiltonian H defined by (2.9).

3.2 Iterative constants and iterative domains

In what follows, we denote by \(C,C_{1}, C_{2},\ldots \) positive constants which arise in estimates, and by \(K,K_{1},K_{2},\ldots \) positive constants which arise in lemmas and theorems. All of them are independent of ϵ and of the iteration number m, and may differ in different parts of the text. Let \(C(m)\) denote a function of m of the form \(C_{1}m^{C_{2}m}\) or \(C_{1}m^{C_{2}m^{2}}\) or \(C_{1}m^{C_{2}m^{4}}\).

As usual, the KAM theorem is proved by the Newton-type iteration procedure which involves an infinite sequence of coordinate changes. In order to make our iteration procedure run, we need the following iterative constants and iterative domains.

1. \(\epsilon_{m}=\epsilon^{(\frac{5}{4})^{m}}\); \(\epsilon_{m}\) bounds the size of the interaction after m iterations.

2. \(\delta_{m+1}=\delta_{m}-b_{m}=\delta_{m}-\delta_{0}/[64(m+1)^{2}]\), \(\delta_{m}\) measures the size of the analyticity domain in the angular variables after m iterations, and \(b_{m}\) is the amount by which the domain shrinks in the \((m+1)\)th step.

3. \(w_{m}=(\epsilon_{m})^{2\gamma}\), \(w_{m}\) measures the size of the analyticity domain in the frequency space. γ is a small positive constant.

4. \(L_{m}=\{2(1+\beta) \vert \ln\epsilon_{m} \vert /3\}^{1/(1+\alpha)}\); \(L_{m}\) determines the size of the region we must consider at the mth iterative step. Here β is a small positive constant, and α is the constant in (1.6).

5. \(M_{m+1}=3 \vert \ln\epsilon_{m} \vert /(2b_{m})\); \(M_{m}\) determines the number of Fourier coefficients we must consider at the mth step of the iteration, and \(b_{m}\) is defined in item 2.

6.

$$\begin{aligned} \rho^{m+1}_{j}={}&2^{-3} \rho^{m}_{j},\quad \text{if } \vert j \vert > L_{m+1}, \\ ={}&2^{-3}\rho^{m}_{L_{m+1}},\quad \text{if } \vert j \vert \leq L_{m+1}, \end{aligned}$$

\(\rho^{m}\) measures the size of the analyticity domain for the action variables.

7.

$$\begin{aligned} \varrho^{m+1}_{j}={}&2^{-3} \varrho^{m}_{j}, \quad\text{if } \vert j \vert > L_{m+1}, \\ ={}&2^{-3}\varrho^{m}_{L_{m+1}},\quad \text{if } \vert j \vert \leq L_{m+1}, \end{aligned}$$

\(\varrho ^{m}\) measures the size of the analyticity domain for variables \(u,v\).

8.

$$\begin{aligned} \hat{\omega}^{m}_{j}={}&0,\quad\text{if } \vert j \vert > L_{m}, \\ ={}&\sum^{m-1}_{n=n(j)}(\epsilon_{n})^{2/9},\quad \text{if } \vert j \vert \leq L_{m} \end{aligned}$$

[Here, \(n(j)\) is defined by \(L_{n(j)}< \vert j \vert \leq L_{n(j)+1}\)].

9.

$$\begin{aligned} \eta^{m}_{ij}={}&\min\Biggl\{ \sum ^{m-1}_{n=n(i)}(\epsilon _{n})^{1/6}, \sum^{m-1}_{n=n(j)}(\epsilon_{n})^{1/6} \Biggr\} \quad \text{if } \vert i \vert \leq L_{m}, \text{ and } \vert j \vert \leq L_{m}, \\ ={}&0, \quad\text{otherwise.} \end{aligned}$$

10. \(\{\Pi_{m}\}^{\infty}_{m=0}\) is a sequence of compact subsets of \(\mathbb{R}^{\mathbb{Z}}_{+}\) with

$$\Pi_{0}\supset\Pi_{1}\supset\cdots\supset\Pi_{m} \supset\Pi _{m+1}\supset\cdots, $$

here \(\Pi_{0}=\Pi\).

11. \(\mathcal{D}^{l}_{m}=\{(I,\theta,u,v)\in\mathbb{C}^{\mathbb {Z}}\times\mathbb{C}^{\mathbb{Z}}\times\mathbb{C}^{\mathbb {Z}}\times\mathbb{C}^{\mathbb{Z}}: \vert I_{j} \vert <\frac{\rho^{m}_{j}}{2^{l}}, \vert \operatorname{Im} \theta_{j} \vert <\delta _{m}-(1-\frac{1}{2^{l}})b_{m}, \vert u_{j} \vert <\frac{\varrho^{m}_{j}}{2^{l}}, \vert v_{j} \vert <\frac {\varrho^{m}_{j}}{2^{l}},j\in\mathbb{Z}\},l=0,1,2; \text{ and denote } \mathcal{D}_{m}=\mathcal{D}^{0}_{m}\).

12. \(\mathcal{O}_{m}=\{\xi\in\mathbb{C}^{\mathbb{Z}}: \vert \xi_{j}-\xi '_{j} \vert < w_{m},j\in\mathbb{Z}, \text{for some } \xi'\in\Pi_{m}\} \).
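To get a feel for how these scales interact, the scalar constants in items 1 to 5 can be tabulated. The sketch below is only illustrative: the helper names are hypothetical, and any values passed for \(\epsilon,\delta_{0},\alpha,\beta,\gamma\) are sample inputs rather than values fixed by the paper. Logarithms of \(\epsilon_{m}\) are computed as \((5/4)^{m}\ln\epsilon\) to avoid floating-point underflow for larger m.

```python
import math

def log_eps_m(eps, m):
    """ln(epsilon_m) for epsilon_m = epsilon^((5/4)^m), computed without underflow."""
    return (1.25 ** m) * math.log(eps)

def eps_m(eps, m):
    """epsilon_m = epsilon^((5/4)^m); may underflow to 0.0 for large m."""
    return math.exp(log_eps_m(eps, m))

def delta_m(delta0, m):
    """delta_m after m shrink steps b_k = delta0 / (64 (k+1)^2), k = 0, ..., m-1."""
    return delta0 - sum(delta0 / (64.0 * (k + 1) ** 2) for k in range(m))

def L_m(eps, m, alpha, beta):
    """Length scale L_m = (2 (1+beta) |ln eps_m| / 3)^(1/(1+alpha))."""
    return (2.0 * (1.0 + beta) * abs(log_eps_m(eps, m)) / 3.0) ** (1.0 / (1.0 + alpha))

def M_next(eps, m, delta0):
    """Fourier cutoff M_{m+1} = 3 |ln eps_m| / (2 b_m)."""
    b_m = delta0 / (64.0 * (m + 1) ** 2)
    return 3.0 * abs(log_eps_m(eps, m)) / (2.0 * b_m)

def w_m(eps, m, gamma):
    """Width of the parameter domain, w_m = eps_m^(2 gamma)."""
    return math.exp(2.0 * gamma * log_eps_m(eps, m))
```

Two features are visible in such a table: \(L_{m}\) grows without bound while \(\epsilon_{m}\) decays super-exponentially, and \(\delta_{m}\) stays above \(\delta_{0}(1-\pi^{2}/384)\), so the angular analyticity strip never closes.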

3.3 Iterative lemma

The proof of Theorem 3.1 uses the KAM method with a novel addition:

We introduce a sequence of length scales, \(L_{m}\nearrow\infty\), and at the mth stage of our iterative procedure, we consider only sites \(j: \vert j \vert \leq L_{m}\).

As a standard way of proving the theorem, we must give the iterative lemma.

Lemma 3.2

Consider a family of Hamiltonians \(\mathcal{H}_{l}\ (0\leq l\leq m)\):

$$ \mathcal{H}_{l}=\omega^{l}\cdot I+\Lambda^{l} u \cdot v+Q+P_{l}+\epsilon\sum_{ \vert j \vert \geq L_{l+1}}f_{j}(I, \theta,u,v,\xi), $$
(3.4)

where \((I,\theta,u,v,\xi)\in\mathcal{D}_{l}, \Lambda^{l} u\cdot v=\sum_{j\in\mathbb{Z}}\Lambda^{l}_{j}u_{j}v_{j}\). Write \(P_{l}:=P'_{2l}+P'_{3l}\), where \(P'_{3l}=P_{l}-P'_{2l}\) and

$$ P'_{2l}=\sum_{2 \vert p \vert + \vert q+\bar{q} \vert \leq 2}R^{l}_{pq\bar{q}}( \theta,\xi)I^{p}(l)u^{q}(l)v^{\bar{q}}(l), $$
(3.5)

in the usual multi-index notation, where

$$I(l)=(I_{j})_{ \vert j \vert \leq L_{l+1}},\qquad u(l)=(u_{j})_{ \vert j \vert \leq L_{l+1}},\qquad v(l)=(v_{j})_{ \vert j \vert \leq L_{l+1}}. $$

Assume that, for \(0 \leq l \leq m\), the following conditions hold true:

\((l.1)\) \(\mathcal{H}_{l}\) is real analytic in the domain \(\mathcal {D}_{l} \times \mathcal{O}_{l}\), \(\mathcal{H}_{0}=H\);

\((l.2)\) \(P_{l}=P^{l}+\epsilon\sum_{L_{l}\leq \vert j \vert < L_{l+1}}f_{j}\) and \(P^{l}\) depends only on \((I_{j},\theta_{j},u_{j},v_{j},\xi_{j})\) with \(\vert j \vert \leq L_{l}\), \(P^{0}=\epsilon\sum_{ \vert j \vert < L_{0}}f_{j}(I,\theta,u,v,\xi)\), and \(\vert P^{l} \vert ^{\mathcal{D}_{l}\times\mathcal{O}_{l}}\leq C(l)\epsilon_{l}\);

\((l.3)\) \(\omega^{l}_{j}=\omega^{l-1}_{j}+R^{l-1}_{j00}(0,\xi),l\geq 1,\omega^{0}=\omega\), and \(R^{l-1}_{j00}(0,\xi)=0\) with \(\vert j \vert > L_{l}\), and \(\vert \omega^{l}_{j}-\omega_{j} \vert ^{\mathcal{O}_{l-1}}\leq\hat {\omega}_{j}, \vert \partial_{\xi_{i}}(\omega^{l}_{j}-\omega_{j}) \vert ^{\mathcal {O}_{l}}\leq\eta^{l}_{ij}\);

\((l.4)\) \(\Lambda^{l}_{j}=\Lambda^{l-1}_{j}+R^{l-1}_{0jj}(0,\xi),l\geq 1,\Lambda^{0}=\Lambda\), and \(R^{l-1}_{0jj}(0,\xi)=0\) with \(\vert j \vert > L_{l}\), \(\vert \Lambda^{l}_{j}-\Lambda_{j} \vert ^{\mathcal{O}_{l}}\leq\hat{\omega}_{j}\);

\((l.5)\) \(\operatorname{Prob}(\Pi_{l})\geq1-\sum^{l}_{j=0}(\epsilon_{j})^{\kappa}\) for some \(\kappa>0\);

\((l.6)\) Writing \(\mathcal{C}^{l}=\mathcal{C}_{1}\circ\cdots\circ \mathcal{C}_{l}=[I+\Phi^{l},\theta+\Psi^{l},u+\phi^{l},v+\psi^{l}]\), we have

$$ \begin{aligned} &\bigl\vert \Phi^{l}_{j} \bigr\vert \leq \sum^{l-1}_{n=n(j)}(\epsilon_{n})^{\frac{8}{9}},\qquad \bigl\vert \Psi^{l}_{j} \bigr\vert \leq\sum ^{l-1}_{n=n(j)}(\epsilon_{n})^{\frac{2}{9}},\\ &\bigl\vert \phi^{l}_{j} \bigr\vert \leq\sum ^{l-1}_{n=n(j)}(\epsilon_{n})^{\frac{5}{9}},\qquad \bigl\vert \psi^{l}_{j} \bigr\vert \leq\sum ^{l-1}_{n=n(j)}(\epsilon_{n})^{\frac{5}{9}}, \end{aligned} $$
(3.6)

and \(\mathcal{C}^{l}\) is the identity at sites j with \(\vert j \vert > L_{l}\).

Then there is a positive constant \(\epsilon^{*}\) small enough such that, if \(0 <\epsilon<\epsilon^{*}\), there are a set \(\Pi_{m+1}\subset\Pi _{m}\) with \(\operatorname{Prob}(\Pi_{m}\setminus\Pi_{m+1})\leq(\epsilon _{m})^{\kappa}\) and a change of variables \(\mathcal{C}_{m+1}:\mathcal {D}_{m+1}\times\mathcal{O}_{m+1}\rightarrow\mathcal{D}_{m}\times \mathcal{O}_{m}\) which is real analytic in \(\mathcal{D}_{m+1}\times\mathcal {O}_{m+1}\). Furthermore, the new Hamiltonian \(\mathcal{H}_{m+1} = \mathcal{H}_{m}\circ\mathcal{C}_{m+1}=\mathcal{H}_{0}\circ\mathcal {C}^{m+1}\) is of the form

$$ \mathcal{H}_{m+1}=\omega^{m+1}\cdot I+ \Lambda^{m+1} u\cdot v+Q+P_{m+1}+\epsilon\sum _{ \vert j \vert \geq L_{m+2}}f_{j}(I,\theta,u,v,\xi) $$
(3.7)

and satisfies all the above conditions \((l.1)\)–\((l.6)\) with l replaced by \(m + 1\).

3.4 Derivation of homological equations

Step 1. Splitting the perturbation. Let us consider the Hamiltonian \(\mathcal{H}_{m}\). Following Kuksin [45] and Yuan [49], we split the perturbation \(P_{m}\) into an “essential” part \(P'_{2m}\) (i.e., \(l = m\) in (3.5)) which is linear in I, quadratic in \(u,v\), and an unessential part \(P'_{3m}\).

Lemma 3.3

If \(0<\epsilon<\epsilon^{*}\ll1\), then the following estimates hold true:

(a)

$$ \begin{aligned} &\bigl\vert R^{m}_{j00}(0,\xi) \bigr\vert ^{\mathcal{O}_{m}}\leq(\epsilon _{m})^{\frac{2}{9}},\qquad \bigl\vert R^{m}_{0jj}(0,\xi) \bigr\vert ^{\mathcal{O}_{m}}\leq( \epsilon _{m})^{\frac{2}{9}},\quad \vert j \vert \leq L_{m+1}, \\ &\bigl\vert \partial_{\xi_{i}} R^{m}_{j00}(0,\xi ) \bigr\vert ^{\mathcal{O}_{m+1}}\leq(\epsilon_{m})^{\frac{1}{6}}, \quad \vert i \vert , \vert j \vert \leq L_{m+1},\\ &\partial_{\xi_{i}} R^{m}_{j00}(0,\xi)=0,\quad \textit{otherwise}; \\ &R^{m}_{j00}(0,\xi)=R^{m}_{0jj}(0, \xi)=0,\quad \vert j \vert >L_{m+1}. \end{aligned} $$
(3.8)

(b) \(\vert P'_{3m} \vert ^{\mathcal{D}_{m+1}\times\mathcal{O}_{m}}\leq C(m+1)\epsilon_{m+1}\);

(c) the functions \(P'_{2m}\) and \(P'_{3m}\) are real analytic and depend only on \((I_{j},\theta_{j},u_{j},v_{j},\xi_{j})\) with \(\vert j \vert \leq L_{m+1}\).

Proof

For (a), we consider \(R^{m}_{j00}(\theta,\xi)\). Since \(\vert P_{m} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}\leq C(m)\epsilon_{m}\), by Assumption \((l.2)\), we get that \(\vert P'_{2m} \vert ^{\mathcal{D}_{m}\times \mathcal{O}_{m}}\leq C(m)\epsilon_{m}\). Therefore,

$$\biggl\vert \sum_{ \vert j \vert \leq L_{m+1}}R^{m}_{j00}( \theta,\xi)I(m)_{j} \biggr\vert ^{\mathcal {D}_{m}\times\mathcal{O}_{m}}\leq C(m) \epsilon_{m}. $$

For any k with \(\vert k \vert \leq L_{m+1}\), let \(I^{*}(m)\) satisfy

$$ I^{*}(m)_{j}= \textstyle\begin{cases}\frac{1}{2}\rho^{m}_{k}&j=k,\\ 0&\text{otherwise}. \end{cases} $$

Then \(\vert \sum_{ \vert j \vert \leq L_{m+1}}R^{m}_{j00}(\theta,\xi )I^{*}(m)_{j} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}\leq C(m)\epsilon_{m}\). At the same time,

$$\biggl\vert \sum_{ \vert j \vert \leq L_{m+1}}R^{m}_{j00}( \theta,\xi)I^{*}(m)_{j} \biggr\vert ^{\mathcal {D}_{m}\times\mathcal{O}_{m}}=\frac{1}{2} \rho^{m}_{k} \bigl\vert R^{m}_{k00}( \theta,\xi ) \bigr\vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}. $$

Thus

$$ \bigl\vert R^{m}_{k00}(\theta,\xi) \bigr\vert ^{\mathcal {D}_{m}\times\mathcal{O}_{m}}\leq\bigl(\rho^{m}_{k} \bigr)^{-1}C(m)\epsilon_{m}. $$
(3.9)

By the definition of \(\rho^{m}\), when \(\vert j \vert \leq L_{m}\), we have

$$ \rho^{m}_{j}=2^{-3}\rho^{m-1}_{L_{m}}=2^{-3m} \rho ^{0}_{L_{m}}=2^{-3m-2}\mu(\epsilon_{m})^{\frac{2+2\beta}{3}}. $$

Hence, if \(\vert k \vert \leq L_{m}\), we have

$$ \bigl\vert R^{m}_{k00}(\theta,\xi) \bigr\vert ^{\mathcal {D}_{m}\times\mathcal{O}_{m}}\leq C(m) (\epsilon_{m})^{\frac{1}{3}-\beta}. $$
(3.10)

For the case \(L_{m}< \vert k \vert \leq L_{m+1}\), by Assumption \((l.2)\), we know that \(P^{m}\) makes no contribution to \(R^{m}_{k00}(\theta,\xi)\). Thus we see that in this case the factor \(\epsilon_{m}\) in (3.9) can be replaced by \(e^{-\frac{3}{2}( \vert k \vert -1)^{1+\alpha}}\) and by the definition of \(\rho^{m}\):

$$\begin{aligned} \bigl\vert R^{m}_{k00}( \theta,\xi) \bigr\vert ^{\mathcal{D}_{m}\times\mathcal {O}_{m}}&\leq\bigl(\rho^{m}_{k} \bigr)^{-1}C(m)\exp\biggl\{ -\frac{3}{2}\bigl( \vert k \vert -1\bigr)^{1+\alpha}\biggr\} \\ &\leq C(m)\exp\biggl\{ \vert k \vert ^{1+\alpha}-\frac{3}{2} \bigl( \vert k \vert -1\bigr)^{1+\alpha}\biggr\} \\ &\leq C(m)\exp\biggl\{ (1+\beta) \bigl( \vert k \vert -1\bigr)^{1+\alpha}- \frac{3}{2}\bigl( \vert k \vert -1\bigr)^{1+\alpha}\biggr\} \\ &\leq C(m)\exp\biggl\{ -\frac{1-2\beta }{2}(L_{m})^{1+\alpha}\biggr\} \leq C(m) (\epsilon_{m})^{\frac{1}{3}-\beta}. \end{aligned}$$
(3.11)

Also, by Assumption \((l.2)\), if \(\vert k \vert >L_{m+1}, R^{m}_{k00}(\theta,\xi )=0\). From (3.10) and (3.11), for β small enough, we have \(\vert R^{m}_{j00}(\theta,\xi) \vert ^{\mathcal {D}_{m}\times\mathcal{O}_{m}}\leq(\epsilon_{m})^{2/9}, \vert j \vert \leq L_{m+1}\). Thus

$$ \bigl\vert R^{m}_{j00}(0,\xi) \bigr\vert ^{\mathcal{O}_{m}}\leq(\epsilon _{m})^{2/9}, \quad \vert j \vert \leq L_{m+1}\quad \text{and}\quad R^{m}_{j00}(0,\xi)=0,\quad \vert j \vert >L_{m+1}. $$
(3.12)

The case of \(R^{m}_{0jj}(0,\xi)\) can be proved in a similar way.

By the Cauchy estimate, we have

$$\bigl\vert \partial_{\xi_{i}} R^{m}_{j00}(0,\xi) \bigr\vert ^{\mathcal{O}_{m+1}}\leq\frac { \vert R^{m}_{j00}(0,\xi) \vert ^{\mathcal{O}_{m}}}{w_{m}-w_{m+1}}\leq2(\epsilon _{m})^{\frac{2}{9}-2\gamma}\leq(\epsilon_{m})^{\frac{1}{6}},\quad \vert i \vert , \vert j \vert \leq L_{m+1}. $$

If \(\vert i \vert > L_{m+1} \text{ or } \vert j \vert > L_{m+1}\), \(\partial_{\xi_{i}} R^{m}_{j00}(0,\xi)=0\) is obvious.
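The two inequalities in the Cauchy estimate rely on \(\epsilon_{m}\) being very small relative to γ: the first needs \(w_{m}-w_{m+1}\geq\frac{1}{2}w_{m}\), and the second needs \(2\leq(\epsilon_{m})^{-(\frac{1}{18}-2\gamma)}\), which in particular requires \(\gamma<\frac{1}{36}\). The following sketch checks the chain for given sample values; the function name and the values of γ and \(\epsilon_{m}\) used with it are illustrative assumptions, since the text only states that γ is small and \(\epsilon<\epsilon^{*}\ll1\).

```python
def cauchy_chain_holds(eps_m, gamma):
    """Check eps_m^(2/9) / (w_m - w_{m+1}) <= 2 eps_m^(2/9 - 2 gamma) <= eps_m^(1/6)."""
    w_m = eps_m ** (2.0 * gamma)                 # w_m = eps_m^(2 gamma)
    w_next = (eps_m ** 1.25) ** (2.0 * gamma)    # w_{m+1}, using eps_{m+1} = eps_m^(5/4)
    lhs = eps_m ** (2.0 / 9.0) / (w_m - w_next)  # Cauchy bound with the assumed sup bound eps_m^(2/9)
    mid = 2.0 * eps_m ** (2.0 / 9.0 - 2.0 * gamma)
    rhs = eps_m ** (1.0 / 6.0)
    return lhs <= mid <= rhs
```

With the sample value \(\gamma=0.02\), the chain holds for \(\epsilon_{m}=10^{-40}\) but fails for \(\epsilon_{m}=10^{-3}\), which illustrates why the smallness threshold \(\epsilon^{*}\) depends on γ.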

For (b). Let \((I,\theta,u,v)\in\mathcal{D}_{m+1}, P_{m}=\epsilon _{m}\mathcal{P}_{m}\), and \(\upsilon=(\epsilon_{m})^{1/12}\). Then \(((\frac {z}{\upsilon})^{2}I,\theta,(\frac{z}{\upsilon})u,(\frac{z}{\upsilon })v)\) \(\in\mathcal{D}_{m}\) for \(z\in\mathbb{C}, \vert z \vert \leq1\). Let us consider the function \(z\mapsto\mathcal{P}_{m}((\frac{z}{\upsilon })^{2}I,\theta,(\frac{z}{\upsilon})u,(\frac{z}{\upsilon})v)\) and its Taylor series at zero:

$$\mathcal{P}_{m}\biggl(\biggl(\frac{z}{\upsilon}\biggr)^{2}I, \theta,\biggl(\frac{z}{\upsilon }\biggr)u,\biggl(\frac{z}{\upsilon}\biggr)v \biggr)=h_{0}+h_{1}z+h_{2}z^{2}+\cdots. $$

From \(\vert P_{m} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}\leq C(m)\epsilon_{m}\), we have \(\vert \mathcal{P}_{m} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}\leq C(m)\). Thus \(\vert h_{k} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}\leq C(m)\) for all k. Since \(P'_{3m}=\epsilon_{m}(h_{3}\upsilon^{3}+h_{4}\upsilon ^{4}+\cdots)\), we obtain

$$\bigl\vert P'_{3m} \bigr\vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}= \epsilon_{m} \bigl\vert h_{3}\upsilon ^{3}+h_{4} \upsilon^{4}+\cdots \bigr\vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}\leq \frac{C(m)(\epsilon_{m})^{5/4}}{1-(\epsilon_{m})^{1/12}} \leq C(m+1)\epsilon_{m+1}. $$
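
As a check on the exponents in the last display, the following sketch spells out the geometric series. It assumes only \(0<\upsilon=(\epsilon_{m})^{1/12}<1\), the bound \(\vert h_{k} \vert \leq C(m)\), and the relation \(\epsilon_{m+1}=(\epsilon_{m})^{5/4}\) used in (3.44). The final step of the display above additionally presumes that the constants are chosen with \(C(m)/(1-(\epsilon_{m})^{1/12})\leq C(m+1)\), which is our reading rather than something stated explicitly here.

$$\epsilon_{m}\sum_{k\geq3} \vert h_{k} \vert \upsilon^{k}\leq C(m)\epsilon_{m}\frac{\upsilon^{3}}{1-\upsilon}=\frac{C(m)(\epsilon_{m})^{1+\frac{3}{12}}}{1-(\epsilon_{m})^{1/12}}=\frac{C(m)(\epsilon_{m})^{5/4}}{1-(\epsilon_{m})^{1/12}}. $$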

For (c). From Assumptions \((l.1)\), \((l.2)\), the proof of (c) is obvious. □

Step 2. Truncation. Let

$$\begin{aligned} &\omega^{m+1}_{j}=\omega^{m}_{j}+R^{m}_{j00}(0, \xi),\quad j\in \mathbb{Z}, \end{aligned}$$
(3.13)
$$\begin{aligned} &\Lambda^{m+1}_{j}=\Lambda^{m}_{j}+R^{m}_{0jj}(0, \xi),\quad j\in \mathbb{Z}. \end{aligned}$$
(3.14)

Then, by Lemma 3.3, the frequencies satisfy assumptions \((l.3)\) and \((l.4)\) with \(l = m + 1\). Write

$$ \begin{aligned}&P_{2m}=\sum _{ \vert k \vert \leq M_{m+1}}\sum_{2 \vert p \vert + \vert q+\bar {q} \vert \leq2}R^{m}_{kpq\bar{q}}( \xi)e^{\sqrt{-1}k\cdot\theta} I^{p}u^{q}v^{\bar{q}}-\sum _{j}\bigl\{ R^{m}_{0j00}I_{j}+R^{m}_{00jj}u_{j}v_{j} \bigr\} \\ &\hat{P}_{2m}=\sum_{ \vert k \vert > M_{m+1}}\sum _{2 \vert p \vert + \vert q+\bar{q} \vert \leq 2}R^{m}_{kpq\bar{q}}(\xi)e^{\sqrt{-1}k\cdot\theta} I^{p}u^{q}v^{\bar {q}},\quad k\in\mathbb{Z}^{2L_{m+1}}, \\ &P_{3m}=\hat{P}_{2m}+P'_{3m} \end{aligned} $$
(3.15)

where \(I=I(m)\), \(u=u(m)\), \(v=v(m)\). In addition, clearly \(R^{m}_{0j00}=R^{m}_{j00}(0,\xi )\) and \(R^{m}_{00jj}=R^{m}_{0jj}(0,\xi)\). Then we can write \(\mathcal{H}_{m}\) as

$$\begin{aligned} \mathcal{H}_{m}={}&\omega^{m+1}\cdot I+\Lambda^{m+1} u \cdot v+Q(I,u,v)+P_{2m}+P_{3m} \\ &{}+\epsilon\sum _{ \vert j \vert \geq L_{m+1}}f_{j}(I,\theta,u,v,\xi), \end{aligned}$$
(3.16)

and the functions \(P_{2m}\) and \(P_{3m}\) are real analytic and depend only on \((I_{j},\theta_{j},u_{j},v_{j},\xi_{j})\) with \(\vert j \vert \leq L_{m+1}\).

Claim

$$ \vert P_{3m} \vert ^{\mathcal{D}_{m+1}\times\mathcal{O}_{m}}\leq C(m+1) \epsilon_{m+1}. $$

Proof

From (3.15) we first consider

$$ \begin{aligned} \vert \hat{P}_{2m} \vert ^{\mathcal{D}_{m+1}\times \mathcal{O}_{m}} \leq{}&\sum_{ \vert k \vert >M_{m+1}} \bigl\vert \widehat{P'}_{2m}(k) \bigr\vert ^{\mathcal{D}_{m}\times \mathcal{O}_{m}}e^{ \vert k \vert \delta _{m+1}} \\ \leq{}&\sum_{ \vert k \vert >M_{m+1}} \bigl\vert P'_{2m} \bigr\vert ^{\mathcal {D}_{m}\times \mathcal{O}_{m}}e^{- \vert k \vert b_{m}} \\ \leq{}&C(m+1) (\epsilon_{m})^{2}\leq\epsilon_{m+1}. \end{aligned} $$

With Lemma 3.3(b), we obtain that

$$ \vert P_{3m} \vert ^{\mathcal{D}_{m+1}\times\mathcal {O}_{m}}\leq C(m+1) \epsilon_{m+1}. $$
(3.17)

Thus the proof of the claim is completed. □

Step 3. Derivation of the homological equations.

Proof

We look for a near-to-the-identity transformation \(\mathcal {C}_{m+1}\) so that (3.7) holds; such a transformation will be determined by a generating function of the form

$$ I'\theta+u'v+S\bigl(I', \theta,u',v\bigr), \qquad \textstyle\begin{cases}I=I'+\frac{\partial S}{\partial\theta},\qquad \theta '=\theta+\frac{\partial S}{\partial I'},\\u=u'+\frac{\partial S}{\partial v}, \qquad v'=v+\frac{\partial S}{\partial u'}, \end{cases} $$
(3.18)

and assume that S is \(O(\epsilon_{m})\).

Inserting \(I=I'+\frac{\partial S}{\partial\theta}, u=u'+\frac {\partial S}{\partial v}\) into \(\mathcal{H}_{m}\), one finds

$$ \begin{aligned}&\mathcal{H}_{m} \biggl(I'+\frac{\partial S}{\partial\theta },\theta,u'+ \frac{\partial S}{\partial v},v\biggr)\\ &\quad=\omega^{m+1}\cdot I'+ \Lambda^{m+1} u'\cdot v'+Q \bigl(I',u',v'\bigr) \\ &\qquad{}+\omega^{m+1}\cdot\frac {\partial S}{\partial\theta}+\Lambda^{m+1}\biggl(v \cdot\frac{\partial S}{\partial v}-u'\cdot\frac{\partial S}{\partial u'} \biggr)+P_{2m}\bigl(I',\theta,u',v\bigr) \\ &\qquad{}+P_{m+1}+\epsilon\sum_{ \vert j \vert \geq L_{m+2}}f_{j} \biggl(I'+\frac{\partial S}{\partial\theta},\theta,u'+ \frac{\partial S}{\partial v},v\biggr), \end{aligned} $$

where \(P_{m+1}=P^{m+1}+\epsilon\sum_{L_{m+1}\leq \vert j \vert < L_{m+2}}f_{j}(I'+\frac{\partial S}{\partial\theta},\theta,u'+\frac {\partial S}{\partial v},v)\) and

$$ \begin{aligned}P^{m+1}={}&Q\biggl(I'+ \frac{\partial S}{\partial\theta },u'+\frac{\partial S}{\partial v},v\biggr)-Q \bigl(I',u',v'\bigr)+P_{3m} \biggl(I'+\frac {\partial S}{\partial\theta},\theta,u'+ \frac{\partial S}{\partial v},v\biggr) \\ &{}+P_{2m}\biggl(I'+\frac{\partial S}{\partial\theta},\theta ,u'+\frac{\partial S}{\partial v},v\biggr)-P_{2m} \bigl(I',\theta,u',v\bigr). \end{aligned} $$

We therefore seek a generating function S satisfying

$$ \omega^{m+1}\cdot\frac{\partial S}{\partial\theta}+\Lambda^{m+1} \cdot\biggl(v\cdot\frac{\partial S}{\partial v}-u'\cdot\frac{\partial S}{\partial u'} \biggr)+P_{2m}\bigl(I',\theta,u',v\bigr)=0, $$
(3.19)

i.e., the homological equation. □

3.5 Solutions to the homological equations and investigation of \(\mathcal{C}_{m+1}\)

We can solve (3.19) by means of Fourier series, and we find

$$ S_{kpq\bar{q}}= \textstyle\begin{cases}\frac{R^{m}_{kpq\bar{q}}}{\sqrt{-1}\langle k,\omega ^{m+1}\rangle+\langle\bar{q}-q,\Lambda^{m+1}\rangle}& \vert k \vert + \vert q-\bar {q} \vert \neq0,\\ 0&\text{otherwise}. \end{cases} $$
(3.20)
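
The divisor in (3.20) can be read off by applying the linear operator on the left-hand side of (3.19) to a single Fourier-Taylor monomial. The computation below is a sketch under the assumption that S is expanded in the same monomials as \(P_{2m}\) in (3.15), with the multi-index notation \(\langle\bar{q}-q,\Lambda^{m+1}\rangle=\sum_{j}(\bar{q}_{j}-q_{j})\Lambda^{m+1}_{j}\):

$$\begin{aligned} &\biggl[\omega^{m+1}\cdot\frac{\partial}{\partial\theta}+\sum_{j}\Lambda^{m+1}_{j}\biggl(v_{j}\frac{\partial}{\partial v_{j}}-u'_{j}\frac{\partial}{\partial u'_{j}}\biggr)\biggr]e^{\sqrt{-1}k\cdot\theta}I^{\prime p}u^{\prime q}v^{\bar{q}} \\ &\quad=\bigl(\sqrt{-1}\bigl\langle k,\omega^{m+1}\bigr\rangle+\bigl\langle\bar{q}-q,\Lambda^{m+1}\bigr\rangle\bigr)e^{\sqrt{-1}k\cdot\theta}I^{\prime p}u^{\prime q}v^{\bar{q}}. \end{aligned}$$

Each monomial is thus an eigenfunction of the operator, so (3.19) decouples into one scalar equation per index \((k,p,q,\bar{q})\). Modes with \(\vert k \vert + \vert q-\bar{q} \vert =0\) have zero eigenvalue and cannot be removed this way; this is consistent with the terms absorbed into \(\omega^{m+1}\) and \(\Lambda^{m+1}\) in (3.13) and (3.14).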

Thus

$$ S=S\bigl(I',\theta,u',v,\xi\bigr)= \sum_{ \vert k \vert + \vert q-\bar{q} \vert \neq0}\frac{R^{m}_{kpq\bar{q}}(\xi)e^{\sqrt {-1}k\cdot\theta} I^{\prime p}u^{\prime q} v^{\bar{q}}}{\sqrt{-1}\langle k,\omega ^{m+1}\rangle+\langle\bar{q}-q,\Lambda^{m+1}\rangle}. $$
(3.21)

In general, the sum in (3.21) will diverge. To cure this problem, we first reduce the infinite sum to a finite one. By the definition of \(P_{2m}\), we can restrict the sum in (3.21) to vectors \(k\in\mathbb{X}^{m+1}\) with

$$ \mathbb{X}^{m+1}\equiv\bigl\{ k\in\mathbb {Z}^{2L_{m+1}}:0< \vert k \vert < M_{m+1}\bigr\} . $$
(3.22)

With these restrictions, the sum in (3.21) contains only a finite number of terms, and a simple estimate shows this number is bounded by \((2M_{m+1})^{(2L_{m+1})}\).
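
The count can be made explicit as follows. This is a sketch assuming that \(M_{m+1}\) is a positive integer and that the norm \(\vert k \vert \) dominates every component \(\vert k_{j} \vert \) (true for both the \(\ell^{1}\) norm and the sup norm). Each of the \(2L_{m+1}\) integer components of \(k\in\mathbb{X}^{m+1}\) then satisfies \(\vert k_{j} \vert < M_{m+1}\) and takes at most \(2M_{m+1}-1\) values, so

$$\#\mathbb{X}^{m+1}\leq(2M_{m+1}-1)^{2L_{m+1}}\leq(2M_{m+1})^{2L_{m+1}}. $$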

To ensure that the sum in (3.21) is well defined, we exclude the resonant set

$$ R^{m}\equiv\bigl\{ \xi\in\Pi_{m}:\exists k\in\mathbb {X}^{m+1} \text{ s.t. } \bigl\vert k\cdot\omega^{m+1} \bigr\vert < (\epsilon_{m})^{\gamma }\bigr\} , $$
(3.23)

and set \(\Pi_{m+1}=\Pi_{m}\setminus R^{m}\).

We now estimate the measure of \(\Pi_{m+1}\). By the definition of \(\mathbb{X}^{m+1}\),

$$R^{m}\cap\biggl(\prod_{ \vert j \vert >L_{m+1}}\mathbb{R}_{+} \biggr)=\emptyset, $$

therefore we just need to consider the finite dimensional situation. Let \(\xi(m)=(\xi_{j})_{ \vert j \vert \leq L_{m+1}}\), \(\omega^{m+1}(m)=\{(\omega _{j}^{m+1})(\xi(m))\}_{ \vert j \vert \leq L_{m+1}}\), \(\Pi_{m}(m)=\Pi_{m}\cap(\prod_{ \vert j \vert \leq L_{m+1}}\mathbb{R}_{+})\), and let \(\mathcal{O}_{m}(m)\) be the complex \(w_{m}\)-neighborhood of \(\Pi_{m}(m)\). In view of estimates \((l.3)\) with \(l=m+1\) and assumption (B2), it is easy to see that, if \(0<\epsilon <\epsilon^{*}<\frac{\delta_{a}}{2}\),

$$\biggl\vert \frac{\partial\omega^{m+1}(m)}{\partial\xi(m)} \biggr\vert ^{\mathcal {O}_{m}(m)}\geq \frac{\delta_{a}}{2}. $$

Moreover, by the inverse function theorem, there exists the inverse \((\omega^{m+1}(m))^{-1}(\omega(m))\) for \(\omega(m)\in\omega ^{m+1}(m)(\mathcal{O}_{m}(m))=^{\text{def}}\Omega_{m}\), and

$$\biggl\vert \frac{\partial(\omega^{m+1}(m))^{-1}}{\partial\omega(m)} \biggr\vert ^{\Omega _{m}}\leq \frac{K_{2}}{\delta_{a}}, \quad\text{here } K_{2}=\max_{\xi\in\Pi } \bigl\vert \omega(\xi) \bigr\vert _{\infty}. $$

Obviously, the Kolmogorov measure satisfies

$$\operatorname{Prob}\bigl\{ \omega(m)\in\bigl(\omega^{m+1}(m) \bigr)^{-1}\bigl(\Pi _{m}(m)\bigr): \bigl\vert k\cdot \omega(m) \bigr\vert \leq(\epsilon_{m})^{\gamma}\bigr\} \leq C( \epsilon _{m})^{\gamma},\quad k\in\mathbb{X}^{m+1}, $$

and therefore

$$\operatorname{Prob}\bigl\{ \xi(m)\in\Pi_{m}(m): \bigl\vert k\cdot \omega^{m+1}(m) \bigr\vert \leq (\epsilon_{m})^{\gamma} \bigr\} \leq CK_{3}(\epsilon_{m})^{\gamma},\quad k\in\mathbb {X}^{m+1}, $$

here \(K_{3}=\max_{\xi(m)\in\Pi_{m}(m)} \vert \frac{\partial\omega ^{m+1}(m)}{\partial\xi(m)} \vert \). Hence

$$\operatorname{Prob}\bigl\{ \xi\in\Pi_{m}: \bigl\vert k\cdot \omega^{m+1} \bigr\vert \leq(\epsilon _{m})^{\gamma} \bigr\} \leq CK_{3}(\epsilon_{m})^{\gamma},\quad k\in \mathbb{X}^{m+1}. $$

Since there are at most \((2M_{m+1})^{2L_{m+1}}\) vectors in \(\mathbb {X}^{m+1}\), we find that \(\operatorname{Prob}( \Pi_{m}\backslash\Pi_{m+1})\) is bounded by \((\epsilon _{m})^{\kappa}\) for some \(0 < \kappa< \gamma\), and the bound on \(\operatorname{Prob}(\Pi_{m+1})\) follows.
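
The passage from the single-vector estimate to the bound on the excluded set is a union bound. The following is a sketch, assuming that \(L_{m+1}\) and \(M_{m+1}\) grow slowly enough in m for the combinatorial factor to be absorbed when ϵ is small:

$$\operatorname{Prob}(\Pi_{m}\setminus\Pi_{m+1})\leq\sum_{k\in\mathbb{X}^{m+1}}\operatorname{Prob}\bigl\{ \xi\in\Pi_{m}: \bigl\vert k\cdot\omega^{m+1} \bigr\vert \leq(\epsilon_{m})^{\gamma}\bigr\} \leq(2M_{m+1})^{2L_{m+1}}CK_{3}(\epsilon_{m})^{\gamma}\leq(\epsilon_{m})^{\kappa}, $$

where the last inequality holds as soon as \((2M_{m+1})^{2L_{m+1}}CK_{3}\leq(\epsilon_{m})^{-(\gamma-\kappa)}\).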

We can bound the denominators in (3.21) only if \(\xi\in\Pi _{m+1}\). However, note that for any \(\xi'\in\mathcal{O}_{m+1}\), we can write

$$\begin{aligned} &k\cdot\omega^{m+1}\bigl(\xi'\bigr) \\ &\quad=k\cdot \omega^{m+1}(\xi)\bigl\{ 1-\bigl(k\cdot\omega^{m+1}(\xi) \bigr)^{-1}\bigl[k\cdot\omega^{m+1}(\xi)-k\cdot \omega^{m+1}\bigl(\xi'\bigr)\bigr]\bigr\} . \end{aligned}$$
(3.24)

Since

$$\begin{aligned} \bigl\vert k\cdot\omega^{m+1}(\xi)-k\cdot \omega^{m+1}\bigl(\xi '\bigr) \bigr\vert &\leq\sum _{ \vert j \vert \leq L_{m+1}} \vert k_{j} \vert \biggl\vert \sum_{ \vert i \vert \leq L_{m+1}}\frac {\partial\omega^{m+1}_{j}}{\partial\xi_{i}}\bigl( \xi_{i}-\xi'_{i}\bigr) \biggr\vert \\ &\leq(\epsilon_{m})^{2\gamma}\sum_{ \vert i \vert , \vert j \vert \leq L_{m+1}} \vert k_{j} \vert \biggl\{ \biggl\vert \frac{\partial(\omega^{m+1}_{j}-\omega_{j})}{\partial\xi_{i}} \biggr\vert + \biggl\vert \frac {\partial\omega_{j}}{\partial\xi_{i}} \biggr\vert \biggr\} \\ &\leq(\epsilon_{m})^{2\gamma}\bigl(\delta_{b}+4L_{m+1} \epsilon^{1/6}\bigr)\sum_{ \vert j \vert \leq L_{m+1}} \vert k_{j} \vert \\ &\leq(\epsilon_{m})^{2\gamma}\bigl(\delta_{b}+4L_{m+1} \epsilon^{1/6}\bigr)M_{m+1} , \end{aligned}$$
(3.25)

thus, for ϵ small enough, one gets

$$ \bigl\vert \bigl(k\cdot\omega^{m+1}(\xi)\bigr)^{-1}\bigl[k \cdot\omega ^{m+1}(\xi)-k\cdot\omega^{m+1}\bigl( \xi'\bigr)\bigr] \bigr\vert \leq(\epsilon_{m})^{\gamma } \bigl(\delta_{b}+4L_{m+1}\epsilon^{1/6} \bigr)M_{m+1}\leq\frac{1}{2}. $$

Therefore

$$ \bigl\vert k\cdot\omega^{m+1}\bigl(\xi'\bigr) \bigr\vert \geq\frac{1}{2}(\epsilon _{m})^{\gamma} $$
(3.26)

remains valid on the domain \(\mathcal{O}_{m+1}\).
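
For completeness, (3.26) follows from (3.24) in one line. For \(\xi\in\Pi_{m+1}\), the definition (3.23) gives \(\vert k\cdot\omega^{m+1}(\xi) \vert \geq(\epsilon_{m})^{\gamma}\). Writing x (our notation) for the quantity \((k\cdot\omega^{m+1}(\xi))^{-1}[k\cdot\omega^{m+1}(\xi)-k\cdot\omega^{m+1}(\xi')]\), which was shown above to satisfy \(\vert x \vert \leq\frac{1}{2}\), one has

$$\bigl\vert k\cdot\omega^{m+1}\bigl(\xi'\bigr) \bigr\vert = \bigl\vert k\cdot\omega^{m+1}(\xi) \bigr\vert \vert 1-x \vert \geq(\epsilon_{m})^{\gamma}\bigl(1- \vert x \vert \bigr)\geq\frac{1}{2}(\epsilon_{m})^{\gamma}. $$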

With these preparations, we can now estimate the transformation \(S=S(I',\theta,u',v)\). For the estimates, we decompose \(P_{2m}=R^{0}+R^{1}+R^{2}\), where \(R^{j}\) comprises the terms with \(\vert q+\bar{q} \vert =j\); furthermore,

$$ \begin{aligned}&R^{0}=R^{00}, \\ &R^{1}=\bigl\langle R^{10},u(m)\bigr\rangle +\bigl\langle R^{01},v(m)\bigr\rangle , \\ &R^{2}=\bigl\langle R^{20}u(m),u(m)\bigr\rangle +\bigl\langle R^{11}u(m),v(m)\bigr\rangle +\bigl\langle R^{02}v(m),v(m)\bigr\rangle , \end{aligned} $$
(3.27)

where \(R^{ij}\) depend on \(\theta, \xi\), and \(R^{00}\) depends in addition on I. With a similar decomposition of S, it suffices to discuss each term individually. In the following we do this for \(\dot{S}=S^{10}\) and \(\ddot{S}=S^{11}\).

Consider the term \(\dot{S}=S^{10}\), whose Fourier coefficients are given by

$$ \dot{S}_{k,j}=\frac{\dot{R}_{k,j}}{\sqrt{-1}k\cdot \omega^{m+1}-\Lambda^{m+1}_{j}}, \quad \vert j \vert \leq L_{m+1}, \vert k \vert \leq M_{m+1}. $$
(3.28)

By the small divisor assumptions, we have

$$\bigl\vert \sqrt{-1}k\cdot\omega^{m+1}-\Lambda^{m+1}_{j} \bigr\vert ^{\mathcal{O}_{m+1}}\geq \min\biggl\{ \frac{(\epsilon_{m})^{\gamma}}{2}, \frac{1}{2j^{N}}\biggr\} \geq \frac{(\epsilon_{m})^{\gamma}}{2} $$

and thus \(\vert \dot{S}_{k,j} \vert ^{\mathcal{O}_{m+1}}\leq2(\epsilon _{m})^{-\gamma} \vert \dot{R}_{k,j} \vert ^{\mathcal{O}_{m}}\). Hence

$$ \begin{aligned} \vert \dot{S}_{j} \vert ^{\mathcal{D}^{1}_{m}\times{\mathcal {O}_{m+1}}}&\leq\sum_{ \vert k \vert \leq M_{m+1}} \vert \dot{S}_{k,j} \vert ^{\mathcal {O}_{m+1}}e^{ \vert k \vert (\delta_{m}-\frac{b_{m}}{2})} \\ &\leq2(\epsilon _{m})^{-\gamma}\sum_{ \vert k \vert \leq M_{m+1}} \vert \dot{R}_{k,j} \vert ^{\mathcal {O}_{m}}e^{ \vert k \vert (\delta_{m}-\frac{b_{m}}{2})} \\ &\leq2(\epsilon_{m})^{-\gamma}\sum_{ \vert k \vert \leq M_{m+1}} \vert \dot {R}_{j} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}e^{- \vert k \vert \frac{b_{m}}{2}} \\ &\leq2\biggl(\frac{2}{b_{m}}\biggr)^{2L_{m+1}}(\epsilon_{m})^{-\gamma} \vert \dot {R}_{j} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}. \end{aligned} $$

Therefore

$$ \begin{aligned} \bigl\vert \bigl\langle S^{10},u'(m+1) \bigr\rangle \bigr\vert ^{\mathcal{D}^{1}_{m}\times {\mathcal{O}_{m+1}}}&\leq\sum_{ \vert j \vert \leq L_{m+1}} \bigl\vert \dot {S}_{j}u'_{j} \bigr\vert ^{\mathcal{D}^{1}_{m}\times{\mathcal{O}_{m+1}}} \\ &\leq2\biggl(\frac{2}{b_{m}}\biggr)^{2L_{m+1}}(\epsilon_{m})^{-\gamma} \sum_{ \vert j \vert \leq L_{m+1}}\delta^{m}_{j} \vert \dot{R}_{j} \vert ^{\mathcal{D}_{m}\times\mathcal {O}_{m}} \\ \text{(by Cauchy estimate)}&\leq C(m)L_{m+1}\biggl(\frac{2}{b_{m}} \biggr)^{2L_{m+1}}(\epsilon_{m})^{1-\gamma}. \end{aligned} $$

Consider now the term \(\ddot{S}=S^{11}\), whose Fourier coefficients are given by

$$ \ddot{S}_{k,ij}=\frac{\ddot{R}_{k,ij}}{\sqrt {-1}k\cdot\omega^{m+1}+\Lambda^{m+1}_{j}-\Lambda ^{m+1}_{i}},\quad \vert i \vert , \vert j \vert \leq L_{m+1}, \vert k \vert \leq M_{m+1}. $$
(3.29)

Without loss of generality, let \(i>j\). From the assumption on the normal frequencies, we get

$$ \bigl\vert \Lambda^{m+1}_{j}-\Lambda^{m+1}_{i} \bigr\vert ^{\mathcal {O}_{m}}\geq j^{-N-1}-2(\epsilon_{n(j)})^{2/9} \geq L_{m+1}^{-N-1}-2(\epsilon_{m})^{2/9}. $$

Choose ϵ small enough such that

$$\bigl[2(\epsilon_{m})^{2/9}+(\epsilon_{m})^{\gamma} \bigr]\biggl(\frac{2}{3}(1+\beta ) \vert \ln\epsilon_{m+1} \vert \biggr)^{\frac{N+1}{1+\alpha}}\leq1. $$

Thus

$$ \bigl\vert \Lambda^{m+1}_{j}-\Lambda^{m+1}_{i} \bigr\vert ^{\mathcal {O}_{m}}\geq(\epsilon_{m})^{\gamma},\quad i\neq j. $$
(3.30)
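
The smallness condition imposed on ϵ above is exactly what is needed to pass to (3.30). A brief sketch, assuming (as the factor in that condition suggests, although the definition is not restated here) that \(L_{m+1}^{1+\alpha}=\frac{2}{3}(1+\beta) \vert \ln\epsilon_{m+1} \vert \), so that the condition reads \([2(\epsilon_{m})^{2/9}+(\epsilon_{m})^{\gamma}]L_{m+1}^{N+1}\leq1\):

$$\bigl[2(\epsilon_{m})^{2/9}+(\epsilon_{m})^{\gamma}\bigr]L_{m+1}^{N+1}\leq1\quad\Longleftrightarrow\quad L_{m+1}^{-N-1}-2(\epsilon_{m})^{2/9}\geq(\epsilon_{m})^{\gamma}. $$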

Since \({\mathcal{O}_{m+1}}\subset\mathcal{O}_{m}\), the small divisor satisfies

$$ \bigl\vert \sqrt{-1}k\cdot\omega^{m+1}+\Lambda ^{m+1}_{j}- \Lambda^{m+1}_{i} \bigr\vert ^{\mathcal{O}_{m+1}}\geq \frac{1}{2}(\epsilon_{m})^{\gamma}. $$
(3.31)

Using this estimate, we see that

$$ \vert \ddot{S}_{k,ij} \vert ^{\mathcal{O}_{m+1}}\leq2(\epsilon _{m})^{-\gamma} \vert \ddot{R}_{k,ij} \vert ^{\mathcal{O}_{m}}, \quad \vert i \vert , \vert j \vert \leq L_{m+1}, \vert k \vert \leq M_{m+1}. $$
(3.32)

Hence

$$ \begin{aligned} \vert \ddot{S}_{ij} \vert ^{\mathcal{D}^{1}_{m}\times{\mathcal {O}_{m+1}}}&\leq\sum_{ \vert k \vert \leq M_{m+1}} \vert \ddot{S}_{k,ij} \vert ^{\mathcal {O}_{m+1}}e^{ \vert k \vert (\delta_{m}-\frac{b_{m}}{2})} \\ &\leq2(\epsilon _{m})^{-\gamma}\sum_{ \vert k \vert \leq M_{m+1}} \vert \ddot{R}_{k,ij} \vert ^{\mathcal {O}_{m}}e^{ \vert k \vert (\delta_{m}-\frac{b_{m}}{2})} \\ &\leq2(\epsilon_{m})^{-\gamma}\sum_{ \vert k \vert \leq M_{m+1}} \vert \ddot {R}_{ij} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}e^{-\frac {b_{m}}{2} \vert k \vert } \\ &\leq2\biggl(\frac{2}{b_{m}}\biggr)^{2L_{m+1}}(\epsilon_{m})^{-\gamma } \vert \ddot{R}_{ij} \vert ^{\mathcal{D}_{m}\times\mathcal{O}_{m}}. \end{aligned} $$

Therefore

$$ \begin{aligned} \bigl\vert \bigl\langle S^{11}u'(m+1),v(m+1) \bigr\rangle \bigr\vert ^{\mathcal {D}^{1}_{m}\times{\mathcal{O}_{m+1}}}&\leq\sum_{ \vert i \vert , \vert j \vert \leq L_{m+1}} \bigl\vert \ddot{S}_{ij}u'_{i}v_{j} \bigr\vert ^{\mathcal{D}^{1}_{m}\times{\mathcal {O}_{m+1}}} \\ &\leq\frac{2(\frac{2}{b_{m}})^{2L_{m+1}}}{(\epsilon_{m})^{\gamma}}\sum_{ \vert i \vert , \vert j \vert \leq L_{m+1}} \delta^{m}_{i}\delta^{m}_{j} \vert \ddot{R}_{ij} \vert ^{\mathcal {D}_{m}\times\mathcal{O}_{m}} \\ \text{(by Cauchy estimate)}&\leq C(m)L^{2}_{m+1}\biggl( \frac{2}{b_{m}}\biggr)^{2L_{m+1}}(\epsilon_{m})^{1-\gamma}. \end{aligned} $$

The remaining terms of S can be estimated along the same lines. Therefore we obtain

$$ \vert S \vert ^{\mathcal{D}^{1}_{m}\times\mathcal{O}_{m+1}}\leq C(m)L^{2}_{m+1}\biggl( \frac{2}{b_{m}}\biggr)^{2L_{m+1}}(\epsilon_{m})^{1-\gamma} \leq (\epsilon_{m})^{1-\gamma}, $$
(3.33)

where we have written 2γ as γ by abuse of notation.

Using a discussion similar to that in the proof of Lemma 3.3(a) together with the Cauchy estimate, on the domain \(\mathcal{D}^{2}_{m}\times{\mathcal {O}_{m+1}}\), one has

$$ \begin{aligned} &\biggl\vert \frac{\partial S}{\partial I'_{j}} \biggr\vert \leq4\bigl(\rho ^{m}_{j}\bigr)^{-1}( \epsilon_{m})^{1-\gamma},\qquad \biggl\vert \frac{\partial S}{\partial\theta _{j}} \biggr\vert \leq4(b_{m})^{-1}(\epsilon_{m})^{1-\gamma}, \\ &\biggl\vert \frac{\partial S}{\partial u'_{j}} \biggr\vert , \biggl\vert \frac{\partial S}{\partial v_{j}} \biggr\vert \leq4\bigl(\varrho ^{m}_{j} \bigr)^{-1}(\epsilon_{m})^{1-\gamma}, \quad \vert j \vert \leq L_{m} \\ &\biggl\vert \frac{\partial S}{\partial I'_{j}} \biggr\vert \leq4\bigl(\rho^{m}_{j} \bigr)^{-1}(\epsilon _{m})^{-\gamma}e^{-\frac{3}{2}( \vert k \vert -1)^{1+\alpha}},\qquad \biggl\vert \frac{\partial S}{\partial\theta_{j}} \biggr\vert \leq4(b_{m})^{-1}( \epsilon_{m})^{-\gamma }e^{-\frac{3}{2}( \vert k \vert -1)^{1+\alpha}}, \\ &\biggl\vert \frac{\partial S}{\partial u'_{j}} \biggr\vert , \biggl\vert \frac{\partial S}{\partial v_{j}} \biggr\vert \leq4\bigl(\varrho ^{m}_{j} \bigr)^{-1}(\epsilon_{m})^{-\gamma}e^{-\frac{3}{2}( \vert k \vert -1)^{1+\alpha }},\quad L_{m}< \vert j \vert \leq L_{m+1}. \end{aligned} $$
(3.34)

Choosing \(\beta,\gamma\) small enough, we obtain

$$ \begin{aligned} &\biggl\vert \frac{\partial S}{\partial I'_{j}} \biggr\vert ^{\mathcal {D}^{2}_{m}\times{\mathcal{O}_{m+1}}}\ll\frac{b_{m}}{8}, \qquad \biggl\vert \frac{\partial S}{\partial\theta_{j}} \biggr\vert ^{\mathcal{D}^{2}_{m}\times{\mathcal {O}_{m+1}}}\ll\frac{\rho^{m}_{j}}{8}, \\ &\biggl\vert \frac{\partial S}{\partial u'_{j}} \biggr\vert ^{\mathcal{D}^{2}_{m}\times{\mathcal{O}_{m+1}}},\qquad \biggl\vert \frac {\partial S}{\partial v_{j}} \biggr\vert ^{\mathcal{D}^{2}_{m}\times{\mathcal {O}_{m+1}}}\ll\frac{\varrho^{m}_{j}}{8},\quad \vert j \vert \leq L_{m+1}. \end{aligned} $$
(3.35)

By the analytic inverse function theorem, the equations

$$ \begin{aligned}&I=I'+\frac{\partial S}{\partial\theta} \bigl(I',\theta ,u',v,\xi\bigr),\qquad \theta'= \theta+\frac{\partial S}{\partial I'}\bigl(I',\theta,u',v,\xi \bigr), \\ &u=u'+\frac{\partial S}{\partial v}\bigl(I', \theta,u',v,\xi\bigr),\qquad v'=v+\frac{\partial S}{\partial u'} \bigl(I',\theta,u',v,\xi\bigr) \end{aligned} $$
(3.36)

can be inverted to yield an analytic and invertible canonical transformation on the domain \(\mathcal{D}_{m+1}\times\mathcal{O}_{m+1}\). More precisely, the transformation

$$\begin{aligned} \mathcal{C}_{m+1}\bigl(I', \theta',u',v'\bigr)={}& \bigl(I'+\Xi\bigl(I',\theta ',u',v', \xi\bigr),\theta'+\Theta\bigl(I',\theta',u',v', \xi\bigr), \\ &{}u'+\Delta \bigl(I',\theta',u',v', \xi\bigr),v'+\Upsilon\bigl(I',\theta',u',v', \xi\bigr)\bigr) \end{aligned}$$
(3.37)

maps \(\mathcal{D}_{m+1}\times\mathcal{O}_{m+1}\) into \(\mathcal {D}_{m}\times\mathcal{O}_{m}\). Furthermore, on the domain \(\mathcal {D}_{m+1}\times\mathcal{O}_{m+1}\), we get

$$ \begin{aligned}& \vert \Xi_{j} \vert \leq(\epsilon_{m})^{1-\gamma}\leq(\epsilon _{m})^{\frac{8}{9}},\qquad \vert \Theta_{j} \vert \leq(\epsilon_{m})^{\frac{1}{3}-\beta-\gamma } \leq(\epsilon_{m})^{\frac{2}{9}},\quad \vert j \vert \leq L_{m} , \\ &\vert \Delta_{j} \vert , \vert \Upsilon_{j} \vert \leq(\epsilon_{m})^{\frac{2}{3}-\beta-\gamma }\leq(\epsilon_{m})^{\frac{5}{9}}, \\ &\vert \Xi_{j} \vert \leq4(b_{m})^{-1}( \epsilon_{m})^{-\gamma}e^{-\frac{3}{2}( \vert k \vert -1)^{1+\alpha}}\leq(\epsilon_{m})^{1-\gamma} \leq(\epsilon _{m})^{\frac{8}{9}},\quad L_{m}< \vert j \vert \leq L_{m+1}, \\ &\vert \Theta_{j} \vert \leq4\bigl(\rho ^{m}_{j} \bigr)^{-1}(\epsilon_{m})^{-\gamma}e^{-\frac{3}{2}( \vert k \vert -1)^{1+\alpha}}\leq ( \epsilon_{m})^{\frac{1}{3}-\beta-\gamma}\leq(\epsilon_{m})^{\frac{2}{9}} , \\ &\vert \Delta_{j} \vert , \vert \Upsilon_{j} \vert \leq4\bigl(\varrho^{m}_{j}\bigr)^{-1}(\epsilon _{m})^{-\gamma}e^{-\frac{3}{2}( \vert k \vert -1)^{1+\alpha}}\leq(\epsilon_{m})^{\frac{2}{3}-\beta-\gamma} \leq(\epsilon_{m})^{\frac{5}{9}}. \end{aligned} $$
(3.38)

Since \(\mathcal{C}^{m+1}=\mathcal{C}^{m}\circ\mathcal{C}_{m+1}\),

$$ \begin{aligned}&\Phi^{m+1}(I,\theta,u,v)=\Xi(I,\theta,u,v)+ \Phi ^{m}(I+\Xi,\theta+\Theta,u+\Delta,v+\Upsilon), \\ &\Psi ^{m+1}(I,\theta,u,v)=\Theta(I,\theta,u,v)+\Psi^{m}(I+ \Xi,\theta +\Theta,u+\Delta,v+\Upsilon), \\ &\phi^{m+1}(I,\theta,u,v)=\Delta(I,\theta,u,v)+\phi^{m}(I+ \Xi ,\theta+\Theta,u+\Delta,v+\Upsilon), \\ &\psi^{m+1}(I,\theta,u,v)=\Upsilon(I,\theta,u,v)+\psi^{m}(I+ \Xi ,\theta+\Theta,u+\Delta,v+\Upsilon), \end{aligned} $$

and (3.38) imply the bounds on \(\mathcal{C}^{m+1}\) stated in (3.6) with \(l=m+1\). In addition, on the domain \(\mathcal{D}_{m+1}\times\mathcal {O}_{m+1}\), by the Cauchy estimate and (3.38), we obtain

$$\begin{aligned} \bigl\vert \Phi_{j}^{m+1}- \Phi_{j}^{m} \bigr\vert ={}& \bigl\vert \Xi_{j}+\Phi^{m}_{j}(I+\Xi ,\theta+\Theta,u+ \Delta,v+\Upsilon)-\Phi_{j}^{m}(I,\theta,u,v) \bigr\vert \\ \leq{}&(\epsilon_{m})^{\frac{8}{9}}+\sum_{ \vert i \vert \leq L_{m+1}} \biggl\{ \biggl\vert \frac {\partial\Phi_{j}^{m}}{\partial I_{i}} \biggr\vert ^{\mathcal{D}^{2}_{m}} \vert \Xi _{i} \vert ^{\mathcal{D}_{m+1}}+ \biggl\vert \frac{\partial\Phi_{j}^{m}}{\partial\theta _{i}} \biggr\vert ^{\mathcal{D}^{2}_{m}} \vert \Theta_{i} \vert ^{\mathcal{D}_{m+1}} \\ &{}+ \biggl\vert \frac {\partial\Phi_{j}^{m}}{\partial u_{i}} \biggr\vert ^{\mathcal{D}^{2}_{m}} \vert \Delta _{i} \vert ^{\mathcal{D}_{m+1}}+ \biggl\vert \frac{\partial\Phi_{j}^{m}}{\partial v_{i}} \biggr\vert ^{\mathcal{D}^{2}_{m}} \vert \Upsilon_{i} \vert ^{\mathcal{D}_{m+1}} \biggr\} , \\ \leq{}&(\epsilon_{m})^{\frac{8}{9}}+ \bigl\vert \Phi_{j}^{m} \bigr\vert ^{\mathcal{D}_{m}}\biggl\{ \sum_{ \vert i \vert \leq L_{m}}C(m) (\epsilon_{m})^{\frac{2}{9}-\beta} \\ &{}+\sum_{L_{m}< \vert i \vert \leq L_{m+1}}C(m) (\epsilon_{m})^{\frac{1}{3}-2\beta-\gamma} \biggr\} \leq(\epsilon_{m})^{\frac{1}{6}},\quad \vert j \vert \leq L_{m+1}. \end{aligned}$$
(3.39)

Applying the same argument to the other three terms, on the domain \(\mathcal{D}_{m+1}\times\mathcal{O}_{m+1}\), one gets

$$ \bigl\vert \Psi_{j}^{m+1}-\Psi_{j}^{m} \bigr\vert , \bigl\vert \phi_{j}^{m+1}-\phi _{j}^{m} \bigr\vert , \bigl\vert \psi_{j}^{m+1}- \psi_{j}^{m} \bigr\vert \leq(\epsilon_{m})^{\frac{1}{6}},\quad \vert j \vert \leq L_{m+1}. $$
(3.40)

By virtue of the fact that S does not depend on \((I_{j},\theta _{j},u_{j},v_{j})\) with \(\vert j \vert >L_{m+1}\), we see that \(\mathcal{C}_{m+1}\), and hence \(\mathcal{C}^{m+1}\) will reduce to the identity at these sites. Then with these bounds we conclude that

$$ \bigl\vert \mathcal {C}^{m+1}-\mathcal{C}^{m} \bigr\vert _{\infty}^{\mathcal{D}_{m+1}\times \mathcal{O}_{m+1}}\leq(\epsilon_{m})^{\frac{1}{6}}. $$
(3.41)

It remains to verify \((l.2)\) with \(l=m+1\). Write \(\mathcal {H}_{m+1}=\mathcal{H}_{m}\circ\mathcal{C}_{m+1}(I',\theta ',u',v')=\mathcal{H}_{m}(I'+\Xi,\theta'+\Theta,u'+\Delta ,v'+\Upsilon)\), which is in turn equal to

$$ \omega^{m+1}\cdot I'+\Lambda^{m+1} u'\cdot v'+Q\bigl(I',u',v' \bigr)+P_{m+1}+\epsilon\sum_{ \vert j \vert \geq L_{m+2}}f_{j} \bigl(I',\theta',u',v' \bigr) $$

with \(P^{m+1}\) having the form

$$\begin{aligned} &\sum_{ \vert j \vert \leq L_{m+1}} \bigl(2 \bigl\vert I'_{j} \bigr\vert \vert \Xi_{j} \vert + \vert \Xi_{j} \vert ^{2} \bigr)+\sum_{ \vert j \vert \leq L_{m+1}}\bigl( \bigl\vert u'_{j}-v'_{j}- \Delta_{j}+\Upsilon _{j} \bigr\vert ^{4}- \bigl\vert u'_{j}-v'_{j} \bigr\vert ^{4}\bigr)+P_{3m} \\ &\quad{}+\sum_{ \vert j \vert \leq L_{m+1}} \int^{1}_{0}\biggl\{ \frac {\partial P_{2m}}{\partial I_{j}} \bigl(I'+t\Xi,\theta,u'+\Delta,v\bigr)\Xi _{j} \\ &\quad{}+\frac{\partial P_{2m}}{\partial u_{j}}\bigl(I'+\Xi, \theta,u'+t\Delta ,v\bigr)\Delta_{j}\biggr\} \,dt. \end{aligned}$$
(3.42)

Now we estimate \(P^{m+1}\). From (3.38), the first term in (3.42) is bounded on the domain \(\mathcal{D}_{m+1}\times \mathcal{O}_{m+1}\) by

$$ 2L_{m+1}\bigl[2^{-3m-5}\tau(\epsilon _{m+1})^{\frac{2+2\beta}{3}}(\epsilon_{m})^{1-\gamma}+(\epsilon _{m})^{2-2\gamma}\bigr]\leq(\epsilon_{m})^{\frac{3}{2}}. $$
(3.43)

The second term in (3.42) can be similarly estimated on \(\mathcal{D}_{m+1}\times\mathcal{O}_{m+1}\) by \((\epsilon _{m})^{\frac{3}{2}}\). Finally, the fourth term in (3.42) is bounded on the domain \(\mathcal{D}_{m+1}\times\mathcal{O}_{m+1}\) by an argument similar to that in the proof of Lemma 3.3(a), and we find that it is less than

$$ 2L_{m+1}\bigl[C(m) (\epsilon_{m})^{\frac{1}{3}-\beta }( \epsilon_{m})^{1-\gamma}+C(m) (\epsilon_{m})^{\frac{2}{3}-\beta }( \epsilon_{m})^{\frac{2}{3}-\beta-\gamma}\bigr]\leq(\epsilon_{m})^{\frac{5}{4}}= \epsilon_{m+1}. $$
(3.44)
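
The exponents in (3.44) can be checked directly. The two products inside the bracket carry the powers

$$\biggl(\frac{1}{3}-\beta\biggr)+(1-\gamma)=\frac{4}{3}-\beta-\gamma,\qquad \biggl(\frac{2}{3}-\beta\biggr)+\biggl(\frac{2}{3}-\beta-\gamma\biggr)=\frac{4}{3}-2\beta-\gamma, $$

and both exceed \(\frac{5}{4}\) provided \(2\beta+\gamma<\frac{1}{12}\). Under the additional assumption (ours, not restated in this section) that the prefactor \(2L_{m+1}C(m)\) grows slowly enough in m, the surplus in the exponent absorbs it for ϵ small, which gives the bound \((\epsilon_{m})^{5/4}\).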

Thus, with (3.17), we obtain that

$$\bigl\vert P^{m+1} \bigr\vert ^{\mathcal{D}_{m+1}\times\mathcal{O}_{m+1}}\leq C(m+1) \epsilon_{m+1} $$

completing the verification of \((l.2)\) with \(l=m+1\) and the proof of Lemma 3.2.

4 Proof of the theorem

Proof of Theorem 3.1

The proof is completed by iterating Lemma 3.2. Clearly, the Hamiltonian H defined by (2.9) satisfies conditions \((l.1)\)–\((l.6)\) with \(l = 0\). Thus the iterative lemma (Lemma 3.2) applies. Inductively, we get the following sequences:

$$ \begin{aligned}&\mathcal{D}_{m+1}\times\Pi_{m+1} \subset\mathcal {D}_{m}\times\Pi_{m}, \\ &\mathcal{C}^{m+1}:\mathcal{D}_{m+1}\times \Pi_{m+1} \rightarrow\mathcal{D}_{0}, \\ &\mathcal{H}_{m+1}=\mathcal {H}_{m}\circ\mathcal{C}_{m+1}= \omega^{m+1}\cdot I+\Lambda^{m+1} u\cdot v+Q+P_{m+1}+ \epsilon\sum_{ \vert j \vert \geq L_{m+2}}f_{j}. \end{aligned} $$

Let

$$\begin{aligned} &\Pi_{\infty}=\bigcap^{\infty}_{m=0}\Pi_{m}, \\ &\mathcal{D}_{\infty}=\biggl\{ (I,\theta,u,v)\in\mathbb{C}^{\mathbb {Z}} \times\mathbb{C}^{\mathbb{Z}}\times\mathbb{C}^{\mathbb {Z}}\times \mathbb{C}^{\mathbb{Z}}: I_{j}=u_{j}=v_{j}=0, \vert \operatorname{Im} \theta_{j} \vert < \frac{\delta_{0}}{2},j\in \mathbb{Z}\biggr\} . \end{aligned}$$

By (3.41), \((l.2)\), and \((l.3)\), we conclude that \(\mathcal{H}_{m}\) and \(\mathcal{C}^{m}\) converge uniformly on the domain \(\mathcal{D}_{\infty}\times\Pi _{\infty}\), and

$$ \begin{aligned} &\mathcal{C}^{\infty}=\lim_{m\rightarrow\infty } \mathcal{C}^{m}, \\ &\mathcal{H}_{\infty}=\omega^{\infty}\cdot I+\Lambda^{\infty} u\cdot v+Q. \end{aligned} $$
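
The uniform convergence of \(\mathcal{C}^{m}\) can be made quantitative. The following is a sketch under the assumptions that \(0<\epsilon_{0}<1\) denotes the initial perturbation size, that \(\epsilon_{m+1}=(\epsilon_{m})^{5/4}\) as in (3.44), so that \(\epsilon_{m}=(\epsilon_{0})^{(5/4)^{m}}\), and that \(\mathcal{D}_{\infty}\times\Pi_{\infty}\) is contained in every \(\mathcal{D}_{l+1}\times\mathcal{O}_{l+1}\). Bernoulli's inequality \((5/4)^{l}\geq1+l/4\) gives \((\epsilon_{l})^{1/6}\leq(\epsilon_{0})^{1/6}(\epsilon_{0})^{l/24}\), and telescoping (3.41) yields, for every \(p\geq1\),

$$\bigl\vert \mathcal{C}^{m+p}-\mathcal{C}^{m} \bigr\vert _{\infty}^{\mathcal{D}_{\infty}\times\Pi_{\infty}}\leq\sum_{l=m}^{m+p-1}(\epsilon_{l})^{\frac{1}{6}}\leq\frac{(\epsilon_{0})^{\frac{1}{6}+\frac{m}{24}}}{1-(\epsilon_{0})^{1/24}}\longrightarrow0\quad(m\rightarrow\infty), $$

so \((\mathcal{C}^{m})\) is a Cauchy sequence in the sup norm.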

Thus, \(\mathbb{T}^{\mathbb{Z}}\times\{0\}\times\{0\}\times\{0\}\) is an embedded torus of the Hamiltonian \(\mathcal{H}_{\infty}\) with rotational frequencies \(\omega^{\infty}\in \Pi_{\infty}\). Returning to the original Hamiltonian H, we see that it has an embedded torus \(\mathcal{C}^{\infty}(\mathbb{T}^{\mathbb{Z}}\times \{0\}\times\{0\}\times\{0\})\) with frequencies \(\omega^{\infty}\). This proves the theorem. □

Proof of Theorem 1.1

By Sect. 2, Theorem 1.1 is just a corollary of Theorem 3.1. □