1 Introduction

The study of normal form theory has a long history, originating with Poincaré. The basic idea is to simplify ordinary differential equations or diffeomorphisms through changes of variables near reference solutions. Nowadays, the theory has been extended to various systems, such as random dynamical systems and control systems. Moreover, it is of great importance to bifurcation theory, stability theory and other fields.

As is well known, the celebrated Poincaré-Dulac scheme ensures the existence of formal normal forms, so the convergence of the formal normalization plays the central role in the whole subject. Let us recall some classical theorems. On the one hand, in the Poincaré domain the system is analytically conjugate to its polynomial normal form, while in the Siegel domain the system can be analytically linearized under some small divisor conditions. However, by the dichotomy method or the result of Yoccoz, there actually exists a large gap between formal and analytic normal forms. On the other hand, in rougher topologies Hartman, Sternberg and Chen proved \(C^0\), \(C^k\) and \(C^\infty \) conjugacy under the hyperbolic condition, respectively. A brief introduction can be found in [1]. These results point to the importance of a proper topology, in which normal forms can inherit common properties from both the analytic and the \(C^\infty \) cases.

Then comes the Gevrey smooth topology, which belongs to the \(C^\infty \) class but in which the derivatives of functions satisfy certain norm controls. It can be regarded as a particular case of the ultra-differentiable classes mentioned by Rudin [7]. With such a magnifying glass, we aim to detect the interaction between the classical Siegel small divisor conditions and the non-vanishing nonlinear resonant terms in the formal normal forms. More precisely, topics related to Gevrey normalization were largely covered in the previous series of works [2, 5, 6, 9]. In particular, we restrict our focus to the classical vector fields \(X= Dx+R(x)\) in the Gevrey smooth category, where D is a diagonal matrix and R contains all higher order nonlinearities. It was proved in [9] (Theorem 1.11) that when \(D=\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\) is hyperbolic and satisfies Siegel type conditions, i.e. it fulfills

$$\begin{aligned} |\langle k,\lambda \rangle -\lambda _j| \ge c |k|^{-\mu }, \end{aligned}$$

on all \((k,j)\in \Omega _{nr}=\{(k,j)~|~|k|\ge 2, \langle k,\lambda \rangle -\lambda _j \ne 0\}\) for some positive constants c and \(\mu \), the Gevrey-\(\alpha \) smooth vector fields can be transformed into their normal forms by Gevrey-\((\alpha +\mu +1)\) smooth coordinate substitutions at the origin. Moreover, if X can in addition be formally linearized, then the conjugacy has no loss of Gevrey smoothness, namely, the change of coordinates remains Gevrey-\(\alpha \) smooth. This result was developed from [5], where it was proved that analytic vector fields can be transformed into their normal forms via formal Gevrey-\((1+\tau )\) transformations, and from [6], which treats more degenerate vector fields and formal Gevrey-\(\alpha \) vector fields. See also Sternberg’s pioneering work [2] and [8] for the hyperbolic smooth and Gevrey linearization, respectively. Therefore, the natural gap between \(\alpha +\mu +1\) and \(\alpha \) suggests the possible existence of fine structures for nonlinear resonant terms.

In this way, we consider the Gevrey-\(\alpha \) smooth system

$$\begin{aligned} \dfrac{dx}{dt} = Dx +Nx +R(x), \end{aligned}$$
(1)

where D is diagonal, N is nilpotent and R contains all higher order nonlinearities. Its formal normal form reads

$$\begin{aligned} \dfrac{dx}{dt} = Dx +Nx + {{\hat{R}}}(x), \end{aligned}$$

where \({{\hat{R}}}\in {\mathbb {C}}^d[[x]]\) is a formal Taylor series, and q denotes the lowest degree of the terms in \(Nx+{{\hat{R}}}\). Of course, formal linearization corresponds to the limit \(q \rightarrow \infty \). The case \(q=1\) is in general the worst one if \(D\ne 0\), since it frequently prevents the convergence of formal transformations. Next we introduce the following two conditions for system (1).

(C1):

There exists a positive constant c such that \(|\langle k,\lambda \rangle -\lambda _j|\ge c |k|\) on \(\Omega _{nr}=\{(k,j)~|~|k|\ge 2, \langle k,\lambda \rangle -\lambda _j \ne 0\}\).

(C2):

The linear part matrix is diagonal, i.e. \(N=0\). And there exist the positive constant c and constant \(\mu >-1\) such that \(|\langle k,\lambda \rangle -\lambda _j|\ge c |k|^{-\mu }\) on \(\Omega _{nr}=\{(k,j)~|~|k|\ge 2, \langle k,\lambda \rangle -\lambda _j \ne 0\}\).

Condition \(\mathrm{(C1)}\) stems from the Poincaré domain, whereas condition \(\mathrm{(C2)}\) implies the restriction \(q\ge 2\). When \(\mu >0\), it agrees with the classical Siegel small divisor condition. When \(\mu = 0\), it is satisfied by the completely integrable systems of [10]. If \(-1\le \mu <0\), we have polynomial formal normal forms in general.
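To make the small divisor conditions concrete, the following minimal sketch estimates the constant c in \(\mathrm{(C2)}\) numerically up to a finite order. It assumes multi-indices k with non-negative integer entries, and the spectra `lam` as well as the helper names `small_divisors` and `siegel_constant` are hypothetical choices made only for illustration. A finite scan cannot prove the condition; it only gives a lower-order sanity check.

```python
from itertools import product


def small_divisors(lam, max_order):
    """Yield (k, j, |<k,lam> - lam_j|) for k in Z_+^d with 2 <= |k| <= max_order."""
    d = len(lam)
    for k in product(range(max_order + 1), repeat=d):
        order = sum(k)
        if order < 2 or order > max_order:
            continue
        inner = sum(ki * li for ki, li in zip(k, lam))
        for j in range(d):
            yield k, j, abs(inner - lam[j])


def siegel_constant(lam, mu, max_order, tol=1e-12):
    """Smallest value of |<k,lam> - lam_j| * |k|^mu over non-resonant (k, j).

    Divisors below `tol` are treated as resonances (an assumption forced by
    floating point arithmetic) and are skipped.
    """
    values = [
        div * sum(k) ** mu
        for k, j, div in small_divisors(lam, max_order)
        if div > tol
    ]
    return min(values)


# Hypothetical hyperbolic spectrum with an irrational ratio.
lam_irrational = (1.0, -(2.0 ** 0.5))
c_irrational = siegel_constant(lam_irrational, mu=1.0, max_order=30)

# Hypothetical resonant spectrum: the non-resonant divisors are non-zero integers.
lam_resonant = (1.0, -1.0)
c_resonant = siegel_constant(lam_resonant, mu=0, max_order=10)

print(c_irrational, c_resonant)
```

For the resonant spectrum the non-resonant divisors are non-zero integers, so the estimate with \(\mu =0\) equals 1 in this scan.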

In this paper, our results can be summarized as follows.

Theorem 1

The following statements hold.

  1. (i)

    (Formal conjugacy) Assume that system (1) is formal Gevrey-\(\alpha \)(\(\alpha \ge 0\)). Then under condition \(\mathrm{(C1)}\), there exists a formal Gevrey-\(\alpha \) coordinate substitution, which turns system (1) into its normal form; under condition \(\mathrm{(C2)}\), there exists a formal Gevrey-\({{\hat{\alpha }}}\) coordinate substitution to do so, where \({{\hat{\alpha }}} =\max \{\alpha ,\frac{\mu +1}{q-1}\}\).

  2. (ii)

    (Smooth conjugacy) Assume that system (1) is Gevrey-\(\alpha \)(\(\alpha \ge 1\)) smooth and hyperbolic, i.e. all eigenvalues of D have non-zero real parts. Then under condition \(\mathrm{(C1)}\), there exists a Gevrey-\(\alpha \) smooth coordinate substitution, which turns system (1) into its normal form; under condition \(\mathrm{(C2)}\), there exists a Gevrey-\({{\hat{\alpha }}}\) smooth coordinate substitution to do so, where \({{\hat{\alpha }}} =\max \{\alpha ,\frac{\mu +q}{q-1}\}\).

  3. (iii)

    (Siegel type) When \(q\rightarrow \infty \), the above results also hold, namely, under condition \(\mathrm{(C2)}\), if system (1) is formal Gevrey-\(\alpha \) and can be formally linearized, then the change of coordinates remains in the formal Gevrey-\(\alpha \) class. If in addition system (1) is hyperbolic, the same result is valid for the Gevrey-\(\alpha \) smooth system.

At present, the study of Gevrey smooth normal forms follows Stolovitch’s two-step scheme. It begins with the search for formal Gevrey normal forms, which provides a necessary target for the further exploration. Then, for the ‘real’ Gevrey smooth system, we only need to deal with the cancellation of Gevrey flat remainders, thanks to Gevrey Whitney type extension theorems. In particular, when the system is hyperbolic, the Gevrey smoothness of the transformation can be checked directly through a complicated but explicit formula, as shown in [9]. In this paper, we mainly improve the results by obtaining an accurate expression of the Gevrey index \({{\hat{\alpha }}}\) at the first step. Compared with other results, on the one hand, since \(\frac{\mu +1}{q-1} \rightarrow 0\) as \(q\rightarrow \infty \), it precisely characterizes the effect of resonant terms on the convergence of the changes of coordinates. On the other hand, it implies that an increase of the Gevrey index is not always necessary in the normalization, even under Siegel small divisor conditions, which strengthens the result in [9]. Above all, we think that these results clearly indicate the effect of different topologies on the normalization. Consider system (1) under Siegel type small divisor conditions. In the analytic topology, that is \(\alpha =0\), there is in general no analytic normalization. The topology becomes weaker as the Gevrey index \(\alpha \) increases. When the index is small, a loss of Gevrey smoothness occurs, i.e. for \(\alpha < \frac{\mu +1}{q-1}\) the transformation has the larger index \(\hat{\alpha }=\frac{\mu +1}{q-1}\). However, when the Gevrey index is large enough, the loss stops, since \({{\hat{\alpha }}}=\alpha \) for \(\alpha \ge \frac{\mu +1}{q-1}\). In the weakest \(C^\infty \) topology, the normalization is always guaranteed in the hyperbolic case. In the limiting case \(q\rightarrow \infty \), this slight difference disappears and the formal convergence is always valid. So we say that the topology of Gevrey smoothness is proper to detect fine structures of the normalization under Siegel type conditions.
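The threshold behaviour of the Gevrey index described above can be tabulated with a few lines of code. The sketch below simply evaluates the two formulas for \({{\hat{\alpha }}}\) stated in Theorem 1 under condition \(\mathrm{(C2)}\); the function names and the sample values of \(\alpha \), \(\mu \) and q are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def formal_gevrey_index(alpha, mu, q):
    """Index max{alpha, (mu + 1)/(q - 1)} from Theorem 1(i), for q >= 2."""
    return max(Fraction(alpha), (Fraction(mu) + 1) / (q - 1))


def smooth_gevrey_index(alpha, mu, q):
    """Index max{alpha, (mu + q)/(q - 1)} from Theorem 1(ii), for q >= 2."""
    return max(Fraction(alpha), (Fraction(mu) + q) / (q - 1))


# Sample values (hypothetical): alpha = 1, mu = 1, increasing q.
for q in (2, 3, 5, 50):
    print(q, formal_gevrey_index(1, 1, q), smooth_gevrey_index(1, 1, q))
```

With these sample values the formal index drops from 2 at \(q=2\) to \(\alpha =1\) as soon as \(q\ge 3\), which is the stopping of the loss discussed above.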

The rest of the paper is organized as follows. In Sect. 2, notations, definitions and basic lemmas are collected. In Sect. 3, we solve the homological equation, which is the linear approximation of the equation given by normal form reductions. In Sect. 4, KAM methods and the Contraction Mapping Principle are applied to obtain the formal Gevrey normalization for \(\mu \ge 0\) and \(\mu <0\), respectively. Finally, together with Stolovitch’s arguments, we obtain our main theorem in the last section.

2 Preliminaries

First of all, we introduce some notations used throughout this paper.

  1. 1.

    Denote by \({\mathbb {Z}}_+\), \({\mathbb {Z}}\) and \({\mathbb {Z}}^d\) the sets of natural numbers, integers and integer vectors, respectively.

  2. 2.

    Set \(\Omega _r = \{(k,j)~|~|k|\ge 1, \langle k,\lambda \rangle -\lambda _j =0\}\) and \(\Omega _{nr}=\{(k,j)~|~|k|\ge 2, \langle k,\lambda \rangle -\lambda _j \ne 0\}\).

  3. 3.

    Use \(\langle f \rangle _{r} = \sum _{(k,j)\in \Omega _r} f_{k,j}x^ke_j\) for the given \(f=\sum _{|k|\ge 1,j} f_{k,j}x^k e_j\).

  4. 4.

    Use \(\langle f \rangle _{nr} = \sum _{(k,j)\in \Omega _{nr}}f_{k,j}x^ke_j\) for the given \(f=\sum _{|k|\ge 2,j} f_{k,j}x^k e_j\).

  5. 5.

    As usual, \(\partial ^kf\) is the k-th order differential operator.

  6. 6.

    \(\langle k ,\lambda \rangle = \sum _{i=1}^d k_i\lambda _i\) and \(|k|=\sum _{i=1}^d |k_i|\) for \(k\in {\mathbb {Z}}^d\).

We now introduce the majorant operator and norms. Let \({\mathbb {C}}[[x]]\) and \({\mathbb {C}}^d[[x]]\) be the spaces of formal scalar and d-dimensional vector power series in the variable x over the complex field, respectively. The classical majorant operator is the nonlinear operator acting on \({\mathbb {C}}^d[[x]]\) by

$$\begin{aligned} {\mathcal {M}}: \sum _{k,j} c_{k,j}x^ke_j \mapsto \sum _{k,j} |c_{k,j}|x^ke_j. \end{aligned}$$

It is associated with the majorant norm \( |f|_r = \sum _{k,j} |c_{k,j}| r^{|k|} \). In the book [4] (Lemma 5.10, p. 51), the following important properties are stated.

Lemma 2

The following statements hold.

  1. (i)

    For any two scalar series f and \(g\in {\mathbb {C}}[[x]]\), we have that \(|fg|_r \le |f|_r\cdot |g|_r\), provided that all norms are finite.

  2. (ii)

    For any two vector series f and \(g\in {\mathbb {C}}^d[[x]]\), we have that \(|f\circ g|_r \le |f|_\sigma \) for \(|g|_r \le \sigma \).

Then we introduce some basic facts about Gevrey type smoothness. Let \(\Omega \) be an open set in \({\mathbb {R}}^d\) and \(\alpha \ge 1\). A smooth complex-valued function f on \(\Omega \) is said to be Gevrey-\(\alpha \) smooth if, for any compact set \(K\subseteq \Omega \), there exist positive constants M and C such that

$$\begin{aligned} \sup _{x\in K}|\partial ^kf(x)|\le M C^{|k|} (|k|!)^\alpha ,\quad \forall k\in {\mathbb {Z}}_+^d. \end{aligned}$$

Similarly, a formal power series \(f=\sum _{k,j} f_{k,j}x^ke_j \in {\mathbb {C}}^d[[x]]\) is said to be formal Gevrey-\(\alpha \) if there exist positive constants M and C such that \(|f_{k,j}|\le M C^{|k|} (|k|!)^\alpha \). Note that, since the Taylor coefficients are the derivatives divided by \(k!\), the Taylor expansion at the origin of a Gevrey-\(\alpha \) smooth function is a formal Gevrey-\((\alpha -1)\) power series. See [9] for more details. Hence we modify the majorant norm. For any formal power series \(f\in {\mathbb {C}}^d[[x]]\) and the fixed \(\alpha \ge 0\), we can set

$$\begin{aligned} |f|_{r,\alpha } = |{\mathscr {J}}_\alpha f|_r = \sum _{k,j}\dfrac{|f_{k,j}|}{(|k|!)^\alpha } r^{|k|} <\infty , \end{aligned}$$

associated with the classical majorant norms \(|\cdot |_r\) and the modified operator

$$\begin{aligned} {\mathscr {J}}_\alpha f =\sum _{k,j} \dfrac{|f_{k,j}|}{(|k|!)^\alpha } x^k e_j. \end{aligned}$$
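As an illustration of the modified majorant norm, the following minimal sketch evaluates \(|f|_{r,\alpha }\) for a finitely supported coefficient table. The dictionary layout `(k, j) -> f_{k,j}` and the truncated series \(\sum _k k!\,x^k\) in one variable are assumptions chosen for the example; the latter is formal Gevrey-1 with \(M=C=1\), so its norm with \(\alpha =1\) reduces to a geometric sum.

```python
from math import factorial


def gevrey_majorant_norm(coeffs, r, alpha):
    """Evaluate sum |f_{k,j}| / (|k|!)^alpha * r^{|k|} over a finite table.

    `coeffs` maps (k, j) to the coefficient f_{k,j}, where k is a tuple
    multi-index and j is the component index.
    """
    total = 0.0
    for (k, j), c in coeffs.items():
        order = sum(k)
        total += abs(c) / factorial(order) ** alpha * r ** order
    return total


# Truncation of sum_k k! x^k in one variable (hypothetical example).
N = 20
euler = {((k,), 0): float(factorial(k)) for k in range(N + 1)}

norm_gevrey = gevrey_majorant_norm(euler, 0.5, 1)    # geometric sum of 0.5^k
norm_classical = gevrey_majorant_norm(euler, 0.5, 0)  # classical majorant norm
print(norm_gevrey, norm_classical)
```

The classical norm of the truncation grows without bound as N increases, while the weighted norm with \(\alpha =1\) stays below 2 for \(r=1/2\).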

We first study the modified majorant norm by means of Lemma 2.

Corollary 3

For \(f\in {\mathbb {C}}^d[[x]]\) satisfying \(f(0)=0\), we have that

$$\begin{aligned} |f^k|_{r,\alpha }=\left| \prod _{i=1}^d f_i^{k_i}\right| _{r,\alpha } \le (|k|!)^{-\alpha } \prod _{i=1}^d |f_i|_{r,\alpha }^{k_i}, \end{aligned}$$

where \(f=(f_1,\ldots ,f_d)\) and \(k=(k_1,\ldots ,k_d)\in {\mathbb {Z}}_+^d\).

Proof

First we show a fact: \(|l|! \ge t! \prod _{i=1}^t (l_{i}!)\) for any \(l=(l_1,\ldots ,l_t)\in {\mathbb {Z}}_+^t\) satisfying \(l_i>0\) for \(i=1,\ldots ,t\). If \(l_{i}=1\) for all \(i=1,\ldots ,t\), or if \(t=1\), a simple computation gives equality. Otherwise, we have \(|l| > \max \{t, l_{i}\}\). The factorial |l|! has \(|l|-1\) factors other than the trivial factor 1, and \(t! \prod _{i=1}^t l_{i}!\) has the same number \(|l|-1 = (t-1)+\sum _{i=1}^t (l_i-1) \) of non-trivial factors. When both lists of factors are sorted increasingly, each factor of the latter is no larger than the corresponding factor of |l|!, and the statement follows.

Now for \(f_j = \sum _{|m|\ge 1}f_{m,j}x^m \in {\mathbb {C}}[[x]]\), we have that

$$\begin{aligned} f_j^{k_j} = \sum \left( f_{m^{(1,j)},j}f_{m^{(2,j)},j}\cdots f_{m^{(k_j,j)},j}\right) x^{m^{(1,j)}+\cdots +m^{(k_j,j)}}, \end{aligned}$$

which leads to

$$\begin{aligned} f^k =\sum \left( \prod _{j=1}^d\prod _{i=1}^{k_j}f_{m^{(i,j)},j}\right) x^{\sum _{j=1}^d\sum _{i=1}^{k_j}m^{(i,j)}}. \end{aligned}$$

Since we have that \(|m^{(i,j)}| \ge 1\) for all i and j by \(f(0)=0\), by the above fact it leads to

$$\begin{aligned} \left| x^{\kappa }\right| _{r,\alpha } = \frac{r^{| \kappa |}}{(| \kappa |!)^{\alpha }} \le (|k|!)^{-\alpha }\frac{r^{|\kappa |}}{\prod _{j=1}^d\prod _{i=1}^{k_j}(|m^{(i,j)}|!)^\alpha }, \end{aligned}$$

where \(\kappa = \sum _{j=1}^d\sum _{i=1}^{k_j}m^{(i,j)}\). Comparing with the expression

$$\begin{aligned} \prod _{j=1}^d ({\mathscr {J}}_\alpha f_j)^{k_j}= \sum \left( \prod _{j=1}^d\prod _{i=1}^{k_j}\frac{|f_{m^{(i,j)},j}|}{(|m^{(i,j)}|!)^\alpha }\right) x^{\sum _{j=1}^d\sum _{i=1}^{k_j}m^{(i,j)}}, \end{aligned}$$

it yields

$$\begin{aligned} |f^k|_{r,\alpha } \le (|k|!)^{-\alpha }\left| \prod _{j=1}^d ({\mathscr {J}}_\alpha f_j)^{k_j} \right| _r \le (|k|!)^{-\alpha } \prod _{j=1}^d |f_j|_{r,\alpha }^{k_j} \end{aligned}$$

by Lemma 2(i). This completes the proof. \(\square \)
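The combinatorial fact \(|l|! \ge t! \prod _{i=1}^t (l_{i}!)\) used at the start of the proof can be sanity-checked by brute force on small tuples. The sketch below is only a finite check under the stated assumption \(l_i\ge 1\); the helper name `factorial_gap` and the search ranges are hypothetical.

```python
from itertools import product
from math import factorial, prod


def factorial_gap(l):
    """Return |l|! - t! * prod(l_i!) for a tuple l of positive integers."""
    t = len(l)
    return factorial(sum(l)) - factorial(t) * prod(factorial(li) for li in l)


# Search for counterexamples with 1 <= t <= 4 and 1 <= l_i <= 5.
violations = [
    l
    for t in range(1, 5)
    for l in product(range(1, 6), repeat=t)
    if factorial_gap(l) < 0
]
print(violations)
```

Equality shows up exactly in the two cases singled out in the proof, for instance `factorial_gap((1, 1, 1))` and `factorial_gap((3,))` both vanish.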

With the aid of the above corollary, the preparation for the study of modified majorant norms is ready.

Proposition 4

The following statements hold for the modified majorant norm \(|\cdot |_{r,\alpha }\).

  1. (i)

    The space \(({\mathscr {X}}_r, |\cdot |_{r,\alpha })\) is a Banach space for the set \({\mathscr {X}}_r=\{f\in {\mathbb {C}}^d[[x]]~|~ |f|_{r,\alpha }<\infty \}\).

  2. (ii)

    For f and \(g\in {\mathbb {C}}[[x]]\), we have that \(|fg|_{r,\alpha } \le |f|_{r,\alpha } |g|_{r,\alpha }\), provided that \(|f|_{r,\alpha }\) and \(|g|_{r,\alpha }\) are both finite.

  3. (iii)

    For f and \(g\in {\mathbb {C}}^d[[x]]\) satisfying \(g(0)=0\), we have that \(|f\circ g|_{r,\alpha } \le |f|_{\sigma ,\alpha }\) with \(|g|_{r,\alpha } \le \sigma <\infty \).

Proof

When \(r=1\), notice that \(|f|_{1,\alpha } =|{\mathscr {J}}_\alpha f|_1 \). Since \(|\cdot |_1\) is in fact the \(l^1\) norm, which is complete, so is \(({\mathscr {X}}_1, |\cdot |_{1,\alpha })\). The general case of an arbitrary r follows from the fact that the correspondence \(f(rx) \leftrightarrow f(x)\) is an isomorphism. This confirms \(\mathrm{(i)}\).

Then we verify results \(\mathrm{(ii)}\) and \(\mathrm{(iii)}\). On the one hand, by simple computations we obtain that

$$\begin{aligned} |fg|_{r,\alpha }= |(f_0+{{\hat{f}}})(g_0+{{\hat{g}}})|_{r,\alpha }\le & {} |f_0 g_0| + |f_0||{{\hat{g}}}|_{r,\alpha }+|g_0||{{\hat{f}}}|_{r,\alpha } + (2!)^{-\alpha } |{{\hat{f}}}|_{r,\alpha }|{{\hat{g}}}|_{r,\alpha } \\\le & {} \left( |f_0|+|\hat{f}|_{r,\alpha }\right) \left( |g_0|+|{{\hat{g}}}|_{r,\alpha }\right) = |f|_{r,\alpha }|g|_{r,\alpha }, \end{aligned}$$

where \(g_0=g(0)\), \(f_0=f(0)\), \({{\hat{f}}}=f-f_0\) and \({{\hat{g}}}=g-g_0\). On the other hand, from Corollary 3 it yields

$$\begin{aligned} |f\circ g|_{r,\alpha } \le&\sum _{k,j} |f_{k,j}| |g^k e_j|_{r,\alpha } \le \sum _{k,j} |f_{k,j}|(|k|!)^{-\alpha } \sigma ^{|k|} = |f|_{\sigma ,\alpha }. \end{aligned}$$

That completes the proof. \(\square \)
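The submultiplicativity in Proposition 4(ii) can be observed numerically for polynomials in one variable. The sketch below is a finite illustration, not a proof; the coefficient lists `f` and `g`, the sample radii and indices, and the helper names are hypothetical.

```python
from math import factorial


def norm(p, r, alpha):
    """Modified majorant norm of a one-variable polynomial p[0] + p[1] x + ..."""
    return sum(abs(c) / factorial(k) ** alpha * r ** k for k, c in enumerate(p))


def multiply(p, q):
    """Coefficient list of the product of two polynomials."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def is_submultiplicative(p, q, r, alpha, slack=1e-12):
    """Check |pq|_{r,alpha} <= |p|_{r,alpha} |q|_{r,alpha} up to rounding."""
    return norm(multiply(p, q), r, alpha) <= norm(p, r, alpha) * norm(q, r, alpha) + slack


f = [1.0, -2.0, 0.5, 3.0]
g = [0.5, 1.0, -4.0]
results = [
    is_submultiplicative(f, g, r, alpha)
    for alpha in (0, 1, 2.5)
    for r in (0.1, 1.0, 3.0)
]
print(all(results))
```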

Next for any \(f\in {\mathbb {C}}^d[[x]]\) we define the power shift operator \({\mathscr {P}}_\mu \) (\(\mu \ge -1\)) given by

$$\begin{aligned} {\mathscr {P}}_\mu f = \sum _{k,j}|k|^\mu |f_{k,j}| x^k e_j. \end{aligned}$$
(2)

Then we study the property of \({\mathscr {P}}_\mu \) acting on the classical differential type operator \(\partial _{x_i}\) with respect to the variable \(x=(x_1,\ldots ,x_d)\), which is the key of the whole proof.

Lemma 5

The following statements hold.

  1. (i)

    Assume that \(f(0)=g(0)=0\), \(0<\delta <1\), \(|f|_{r,\alpha }\) and \(|g|_{re^{-\delta },\alpha }\) are both finite. Then we have that

    $$\begin{aligned} |\partial _{x_i}f\cdot g|_{re^{-\delta },\alpha } \le \delta ^{-1}r^{-1}|f|_{r,\alpha }|g|_{re^{-\delta },\alpha } \end{aligned}$$

    for any \(i=1,\ldots ,d\), f and \(g\in {\mathbb {C}}[[x]]\).

  2. (ii)

    Assume that \(f(0)=g(0)=0\), \(\partial ^s f(0)=\partial ^s g(0)=0\) for \(s=1,\ldots ,q-1\) and \(2\le q \in {\mathbb {Z}}_+\), \(|f|_{r,\alpha }\) and \(|g|_{r,\alpha }\) are both finite. When \(\alpha \ge \frac{\mu +1}{q-1}\) and \(\mu \ge -1\), we have that

    $$\begin{aligned} |{\mathscr {P}}_\mu (\partial _{x_i}f \cdot g)|_{r,\alpha }\le C(\alpha ,q) r^{-1}|f|_{r,\alpha }|g|_{r,\alpha } \end{aligned}$$

    for any \(i=1,\ldots ,d\), f and \(g\in {\mathbb {C}}[[x]]\). Here the positive constant \(C(\alpha ,q)\) is given by

    $$\begin{aligned} C(\alpha ,q)= (q!M)^\alpha \end{aligned}$$
    (3)

    with \(M=\sup _{u\in {\mathbb {Z}}_+} \{\frac{u^{q-1}}{(u-q+2)\cdots u}\}\ge 1\) depending on q only, so that \(C(\alpha ,q)\) depends on \(\alpha \) and q only.

Proof

First we note that \(u! \ge s! t! \) for \(u+1=s+t\), \(s\ge 1\) and \(t\ge 1\). Indeed, \(t\ge 1\) and \(s\ge 1\) imply \(u\ge s\) and \(u\ge t\), and both sides have the same number \(u-1 = (s-1)+(t-1)\) of non-trivial factors (those other than 1), which confirms the claim. Then by simple calculations, we obtain that

$$\begin{aligned} |\partial _{x_i} f\cdot g|_{re^{-\delta },\alpha }&\le \sum _{u=1}^\infty \sum _{|k|+|l|=u+1} \dfrac{ k_i|f_{k,1}||g_{l,1}|}{(u!)^\alpha }r^u e^{-u\delta }\\&\le r^{-1}e^\delta \sum _{u=1}^\infty \sum _{|k|+|l|=u+1} \dfrac{|f_{k,1}|}{(|k|!)^\alpha }r^{|k|} |k|e^{-|k|\delta } \dfrac{|g_{l,1}|}{(|l|!)^\alpha } (re^{-\delta })^{|l|}\\&\le r^{-1} \delta ^{-1} |f|_{r,\alpha } |g|_{r e^{-\delta },\alpha }, \end{aligned}$$

from the fact \(\max _{t\ge 0} \{t e^{-\delta t}\} \le \delta ^{-1} e^{-1}\). This verifies (i).

To confirm \(\mathrm{(ii)}\), we verify that

$$\begin{aligned} s^{q-1} s!t!\le q!u! \end{aligned}$$
(4)

for \(s+t=u+1\), \(s\ge q\) and \(t\ge q\) at first. If \(s\ge t\), then we have that \( s^{q-1} s!t! \le (s+q-1)!(u+1-s)! \). Note that \(s+q-1 \le u+q-t\le u\), which yields

$$\begin{aligned} \dfrac{(s+q-1)!(u+1-s)!}{u!} = \dfrac{(u+1-s)!}{(s+q)\cdots u}\le q ! \dfrac{q+1}{s+q}\cdots \dfrac{u+1-s}{u} \le q!. \end{aligned}$$

For the case \(s<t\), we get that \( s^{q-1} s!t! \le (t+q-1)!(u+1-t)!\) and other arguments are similar. Then for \(\mu \ge 0\), from (4) we can show that

$$\begin{aligned} su^\mu (s!t!)^\alpha \le u^{\mu +1}((u-q+1)!q!)^\alpha \le \left( \dfrac{u^{q-1}}{(u-q+2)\cdots u}(q!u!)\right) ^\alpha \le C(\alpha ,q)(u!)^\alpha , \end{aligned}$$
(5)

where \(M=\sup _{u\in {\mathbb {Z}}_+} \{\frac{u^{q-1}}{(u-q+2)\cdots u}\}\ge 1\) is a constant depending on q only and \(C(\alpha ,q)=(q!M)^\alpha \).

Therefore, by setting \(|k|\ge q, |l|\ge q\) and \(\alpha \ge \frac{\mu +1}{q-1}\), when \(\mu \le 0\), from (4) we obtain that

$$\begin{aligned} |{\mathscr {P}}_\mu (\partial _{x_i} f\cdot g)|_{r,\alpha }\le & {} \sum _{u=2q-1}^\infty \sum _{|k|+|l|=u+1} \dfrac{u^\mu k_i|f_{k,1}||g_{l,1}|}{(u!)^\alpha }r^u\\\le & {} r^{-1}\sum _{u=2q-1}^\infty \sum _{|k|+|l|=u+1} \left( \dfrac{|k|^{(\mu +1)/\alpha }}{u!}\right) ^\alpha |f_{k,1}| |g_{l,1}|r^{u+1} \\\le & {} (q!)^\alpha r^{-1}\sum _{u=2q-1}^\infty \sum _{|k|+|l|=u+1} \dfrac{|f_{k,1}|}{(|k|!)^\alpha }\dfrac{|g_{l,1}|}{(|l|!)^\alpha } r^{u+1} \\\le & {} (q!)^\alpha r^{-1}|f|_{r,\alpha }|g|_{r,\alpha }. \end{aligned}$$

And when \(\mu >0\), from (5) we have that

$$\begin{aligned} |{\mathscr {P}}_\mu (\partial _{x_i} f\cdot g)|_{r,\alpha }\le & {} \sum _{u=2q-1}^\infty \sum _{|k|+|l|=u+1} \dfrac{u^\mu k_i|f_{k,1}||g_{l,1}|}{(u!)^\alpha }r^u \\\le & {} r^{-1} \sum _{u=2q-1}^\infty \sum _{|k|+|l|=u+1} \dfrac{u^{\mu +1} |f_{k,1}||g_{l,1}|}{(u!)^\alpha }r^{u+1} \\\le & {} C(\alpha ,q) r^{-1}\sum _{u=2q-1}^\infty \sum _{|k|+|l|=u+1} \dfrac{|f_{k,1}|}{(|k|!)^\alpha }\dfrac{|g_{l,1}|}{(|l|!)^\alpha } r^{u+1} \\= & {} C(\alpha ,q)r^{-1}|f|_{r,\alpha }|g|_{r,\alpha }. \end{aligned}$$

This completes the proof. \(\square \)
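Inequality (4), which drives the estimate in part (ii), is easy to test by exhaustive search over a finite range. The following sketch uses exact integer arithmetic; the function name `violations_of_inequality_4` and the search bound are hypothetical, and a finite scan is only a consistency check of the statement \(s^{q-1} s!t!\le q!u!\) for \(s+t=u+1\), \(s\ge q\), \(t\ge q\).

```python
from math import factorial


def violations_of_inequality_4(q, max_index):
    """List the pairs (s, t) with q <= s, t <= max_index violating (4)."""
    bad = []
    for s in range(q, max_index + 1):
        for t in range(q, max_index + 1):
            u = s + t - 1
            lhs = s ** (q - 1) * factorial(s) * factorial(t)
            rhs = factorial(q) * factorial(u)
            if lhs > rhs:
                bad.append((s, t))
    return bad


for q in range(2, 6):
    print(q, violations_of_inequality_4(q, 25))
```

Because Python integers are arbitrary precision, the comparison is exact and no rounding tolerance is needed.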

3 The solution of the homological equation

In this part, we discuss the formal and Gevrey smooth solvability of the homological equation

$$\begin{aligned}{}[F,H]=G, \end{aligned}$$
(6)

where \(F = Dx + \widetilde{R}(x) = Dx + \sum _{|k|\ge 1,j}r_{k,j}x^ke_j\) satisfies \([Dx,\widetilde{R}(x)]=0\), or equivalently \(\langle \widetilde{R} \rangle _r=\widetilde{R}\). We begin with the formal solvability.

Proposition 6

For any \(G\in {\mathbb {C}}^d[[x]]\) satisfying \(\langle G \rangle _{nr}=G\), equation (6) has a unique formal solution H such that \(\langle H \rangle _{nr}=H\).

Proof

First of all, if we make \(x=Py\) for \(\det P\ne 0\), naturally by (6) it implies

$$\begin{aligned}{}[P^{-1}F(Py), P^{-1}H(Py)] = P^{-1}[F(Py),H(Py)] = P^{-1}G(Py). \end{aligned}$$

Then without loss of generality, we can set \(R=Nx + O_2\) with N nilpotent and in lower triangular form.

Now set \(H=\sum _{|l|\ge 2, m} h_{l,m}x^l e_m\). By careful computations, we obtain that

$$\begin{aligned}{}[F,H]= & {} [Dx,H]+[R,H] \\= & {} \sum _{|l|\ge 2,m}(\lambda _m-\langle l,\lambda \rangle )h_{l,m}x^le_m \\&+ \sum _{|k|\ge 1, |l|\ge 2,j,m} r_{k,j}h_{l,m}\left( k_m x^{k+l-e_m}e_j - l_j x^{k+l-e_j}e_m\right) . \end{aligned}$$

On the one hand, from the fact

$$\begin{aligned} \langle k+l-e_m, \lambda \rangle -\lambda _j= & {} \langle k+l-e_j, \lambda \rangle -\lambda _m \\= & {} \langle k,\lambda \rangle -\lambda _j+ \langle l,\lambda \rangle -\lambda _m, \end{aligned}$$

we have \(\langle [F,H] \rangle _{nr} = [F,H]\) for H fulfilling \(\langle H \rangle _{nr}=H\), because \(r_{k,j}\ne 0\) implies \(\langle k,\lambda \rangle =\lambda _j\). On the other hand, since the linear nilpotent part N is in lower triangular form, we can define a total order < on \(\Omega _{nr}\), given by \((k,j)<(k',j')\) if \(|k|< |k'|\), or if \(|k|=|k'|\) and either \(j<j'\), or \(j=j'\) and \(k_{s}<k'_{s}\) with \(s=\min \{t~|~k_t\ne k'_t\}\). Choosing an arbitrary monomial \(x^{l}e_m\) with \((l,m)\in \Omega _{nr}\), we get

$$\begin{aligned}{}[F,x^{l}e_m] = \left( \lambda _m-\langle l,\lambda \rangle \right) x^le_m + P_1 + P_2, \end{aligned}$$

where

$$\begin{aligned} P_1=[Nx,x^le_m]= & {} \sum _{|k|= 1,j} r_{k,j}\left( k_m x^{k+l-e_m}e_j - l_j x^{k+l-e_j}e_m\right) \\= & {} \sum _m r_{e_{m},m+1}x^l e_{m+1} - \sum _{j} r_{e_{j-1},j}l_j x^{l+e_{j-1}-e_j}e_m. \end{aligned}$$

and the other monomials are collected in \(P_2\). Therefore, the part \(P_1+P_2\) contains only monomials whose indices are larger than \((l,m)\). Namely, we can solve (6) inductively along this total order on the index set \(\Omega _{nr}\). This completes the proof. \(\square \)

Set \(D ={\mathrm {diag}}\lambda \). Here we restrict our focus to two cases of equation (6), under conditions (C1) and (C2). Write \(ad_F H = [F,H]\). If \(F=Ax\) is linear, we simply use \(ad_A\) instead of \(ad_{Ax}\). Let \(F^{(s)} = \sum _{|k|=s,j} F_{k,j}x^k e_j\) for \(F=\sum _{k,j}F_{k,j}x^ke_j\), and set \(\mathrm{ord}(F) = \min \{|k|~|~F_{k,j}\ne 0, |k|>1,(k,j)\in \Omega _r \}\). Moreover, in result (ii) of the next proposition we can take \(\delta =0\) for \(\mu \le 0\) and \(\delta >0\) for \(\mu >0\). Thus, with the convention \(0^\mu =1\), we get a uniform expression.

Proposition 7

The following statements hold for the solution of equation (6).

  1. (i)

    In the case \(\mathrm{(C1)}\), assume that \(|F|_{r_0,\alpha }<\infty \) for some \(r_0>0\) and \(|G|_{r,\alpha }<\infty \) for \(G\in {\mathbb {C}}^d[[x]]\) satisfying \(\langle G \rangle _{nr}=G\). Then there exists a positive number \(r_1\le r_0\) such that for all \(0<r\le r_1\) we have that \(|ad_F^{-1}G|_{r,\alpha }\le C |G|_{r,\alpha }\), where \(ad_F^{-1}G\) denotes the unique solution H satisfying \(H=\langle H \rangle _{nr}\) to \(ad_F(H)=G\).

  2. (ii)

    In the case \(\mathrm{(C2)}\), assume that \(\alpha \ge \frac{\mu +1}{q-1}\) with \(q=\mathrm{ord}({{\hat{F}}})\) and there exist positive r and \(\delta \) such that \(4c^{-1}C(\alpha ,q) r^{-1}e^{\delta }|{{\hat{F}}}|_{re^{-\delta },\alpha }<1\) and \(|G|_{r,\alpha }<\infty \) for \(G=\sum _{|k|\ge q,j}G_{k,j}x^ke_j\in {\mathbb {C}}^d[[x]]\) satisfying \(\langle G \rangle _{nr}=G\) and \({{\hat{F}}}=F-Dx\), then we have that \(|ad_F^{-1}G|_{re^{-\delta },\alpha }\le C \delta ^{-\mu } |G|_{r,\alpha }\) for some \(C>0\) depending on q. Here \(C(\alpha ,q)\) is the same constant given by (3).

Proof

Note that \(ad_F=ad_D+ad_R=ad_D\circ (\mathrm{Id}+ad_D^{-1}\circ ad_R)\). Here \(ad_D^{-1}(G)\) denotes the unique solution H satisfying \(H=\langle H \rangle _{nr}\) to \(ad_D(H)=G\), which has a clear representation

$$\begin{aligned} ad_D^{-1}: G = \sum _{(k,j)\in \Omega _{nr}} g_{k,j}x^ke_j \mapsto \sum _{(k,j)\in \Omega _{nr}} \dfrac{g_{k,j}}{\lambda _j-\langle k,\lambda \rangle }x^ke_j. \end{aligned}$$
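The explicit representation of \(ad_D^{-1}\) translates directly into a coefficient-wise division. The following minimal sketch assumes a finitely supported coefficient table `(k, j) -> g_{k,j}` and a floating point tolerance `tol` to detect resonances; both are hypothetical implementation choices, and the sample spectrum is made up for illustration.

```python
def inner(k, lam):
    """Return <k, lam> for a multi-index k and a spectrum lam."""
    return sum(ki * li for ki, li in zip(k, lam))


def ad_D(H, lam):
    """Coefficients of [Dx, H]: multiply h_{k,j} by (lam_j - <k, lam>)."""
    return {(k, j): (lam[j] - inner(k, lam)) * h for (k, j), h in H.items()}


def ad_D_inverse(G, lam, tol=1e-12):
    """Divide g_{k,j} by (lam_j - <k, lam>) on non-resonant indices.

    Raises ValueError if a term of G is resonant, since then G does not
    satisfy <G>_nr = G and the formula does not apply.
    """
    H = {}
    for (k, j), g in G.items():
        divisor = lam[j] - inner(k, lam)
        if abs(divisor) < tol:
            raise ValueError(f"resonant index {(k, j)}")
        H[(k, j)] = g / divisor
    return H


lam = (1.0, -2.0)  # hypothetical spectrum
G = {((2, 0), 0): 3.0, ((1, 1), 1): 1.0}
H = ad_D_inverse(G, lam)
print(H, ad_D(H, lam) == G)
```

The round trip `ad_D(ad_D_inverse(G))` returns G here because both divisors equal \(-1\) for the sample data.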

Under condition \(\mathrm{(C1)}\), without loss of generality we can assume that the linear nilpotent part has the form \(\varepsilon N\), where entries of N are 1 or 0. Thus \(ad_R=ad_N+ad_{O_2}\), if we set \(R(x)=Nx+O_2\). On the one hand, by Lemma 5(ii) it implies

$$\begin{aligned}&|ad_D^{-1}\circ ad_{O_2}H|_{r,\alpha } \le c^{-1}|{\mathscr {P}}_{-1}([O_2,H])|_{r,\alpha } \le C_1 r^{-1} |O_2|_{r,\alpha } |H|_{r,\alpha }\\&\quad \le C_1 r_0^{-2} r |O_2|_{r_0,\alpha } |H|_{r,\alpha } \end{aligned}$$

for all \(\alpha \ge 0\), where \({\mathscr {P}}_\mu \) is the operator given by (2). On the other hand, from \(ad_N H = [Nx,H]=NH-\partial HNx\) we obtain that \(|NH|_{r,\alpha } \le \varepsilon |H|_{r,\alpha }\) and

$$\begin{aligned} |ad_D^{-1}(\partial H Nx)|_{r,\alpha } \le c^{-1}\sum _{k,j}\dfrac{\varepsilon |h_{k,j}||k|}{|k|(|k|!)^\alpha }r^{|k|} \le c^{-1} \varepsilon |H|_{r,\alpha }. \end{aligned}$$

When \(\varepsilon \le c/4\) and \(r_1 \le r_0^2/(4C_1|F|_{r_0,\alpha })\), we obtain that \(|ad_D^{-1}\circ ad_R| \le 3/4\). Hence \((\mathrm{Id}+ad_D^{-1}\circ ad_R)\) has an inverse given by the Neumann series, with the bound \(|(\mathrm{Id}+ad_D^{-1}\circ ad_R)^{-1}| \le 4\), which gives \(|ad_F^{-1}| \le C\) for \(C=4c^{-1}\).

Then comes the case \(\mathrm{(C2)}\). Let \(F^{(s)}\) and \(H^{(t)}\) be the homogeneous polynomials of degree s and t, respectively. By Proposition 6, the solution of (6) exists formally, and comparing the terms of lowest degree on both sides yields \(H=\sum _{t\ge q} H^{(t)}\). For more precise estimates, first by Lemma 2 we obtain that

$$\begin{aligned} |\partial F^{(s)}\cdot H^{(t)}|_\sigma \le \sum _{\xi ,\varrho =1}^d |F^{(s)}_{\xi \varrho }H^{(t)}_\varrho |_\sigma \le \sum _{\xi ,\varrho =1}^d s \sigma ^{-1} |F^{(s)}_{\xi }|_\sigma |H^{(t)}_\varrho |_\sigma \le s \sigma ^{-1} |F^{(s)}|_\sigma |H^{(t)}|_\sigma \end{aligned}$$

from the fact that \(|F^{(s)}_{\xi \varrho }|_\sigma \le \sum _{|k|=s} |k| |F_{k\xi }| \sigma ^{s-1} = s\sigma ^{-1}|F_\xi ^{(s)}|_\sigma \), where \(F^{(s)}=(F^{(s)}_1,\ldots ,F^{(s)}_d)\), \(H^{(t)}=(H^{(t)}_1,\ldots ,H^{(t)}_d)\) and \(F^{(s)}_{\xi \varrho } = \partial _{x_\varrho }F^{(s)}_{\xi }\). Moreover, note that

$$\begin{aligned} |F^{(s)}|_r = \sum _{|k|=s,j}|F_{k,j}| r^s = (s!)^{\alpha } |F^{(s)}|_{r,\alpha }. \end{aligned}$$

So for \(u=s+t-1\), \(2\le q \le s \in {\mathbb {Z}}_+\) and \(q \le t \in {\mathbb {Z}}_+\) it yields

$$\begin{aligned} |[F^{(s)}, H^{(t)}]|_\sigma \le (s+t) \sigma ^{-1}|F^{(s)}|_\sigma |H^{(t)}|_\sigma = (u+1)\sigma ^{-1}(s!t!)^\alpha |F^{(s)}|_{\sigma ,\alpha }|H^{(t)}|_{\sigma , \alpha }. \end{aligned}$$
(7)

Thus we can solve (6) by using the expansion \(F=Dx+\sum _{s\ge q}F^{(s)}\) and \(H=\sum _{s\ge q}H^{(s)}\) for \(G=\sum _{s\ge q}G^{(s)}\). Naturally, the solution is governed by

$$\begin{aligned} \left[ Dx, H^{(u)}\right] = G^{(u)} -\left[ F^{(q)},H^{(u+1-q)}\right] -\cdots - \left[ F^{(u+1-q)}, H^{(q)}\right] ,\quad u\ge q. \end{aligned}$$

By the estimation (7) we obtain that

$$\begin{aligned} |H^{(u)}|_\sigma \le&c^{-1}u^\mu \left( |G^{(u)}|_\sigma +(u+1) \sigma ^{-1} (q!(u+1-q)!)^\alpha |F^{(q)}|_{\sigma ,\alpha } |H^{(u+1-q)}|_{\sigma ,\alpha } \right. \\&\left. +\cdots + (u+1) \sigma ^{-1} ((u+1-q)!q!)^\alpha |F^{(u+1-q)}|_{\sigma ,\alpha } |H^{(q)}|_{\sigma ,\alpha }\right) . \end{aligned}$$

Then with the restriction \(\alpha \ge \frac{\mu +1}{q-1}\), for \(\mu > 0\) it admits

$$\begin{aligned} u^{\mu }(u+1)(s!t!)^\alpha (u!)^{-\alpha } =s u^\mu (s!t!)^\alpha (u!)^{-\alpha } + t u^\mu (s!t!)^\alpha (u!)^{-\alpha } \le 2C(\alpha ,q) \end{aligned}$$

from (5) and for \(\mu \le 0\) it leads to

$$\begin{aligned} u^{\mu }(u+1)(s!t!)^\alpha (u!)^{-\alpha }\le & {} s^{\mu +1}(s!t!)^\alpha (u!)^{-\alpha } + t^{\mu +1}(s!t!)^\alpha (u!)^{-\alpha } \\\le & {} \left( s^{q-1}s!t!(u!)^{-1}\right) ^\alpha + \left( t^{q-1}s!t!(u!)^{-1}\right) ^\alpha \le 2(q!)^\alpha \le 2C(\alpha ,q) \end{aligned}$$

from (4). Now set \(\sigma = re^{-\delta }\) and \(\sigma =r\) for \(\mu >0\) and \(-1<\mu \le 0\), respectively. When \(\mu >0\), we obtain that

$$\begin{aligned} (u!)^{-\alpha }|H^{(u)}|_{re^{-\delta }} \le \,&c^{-1} u^\mu (u!)^{-\alpha }|G^{(u)}|_{re^{-\delta }}+ C_3 r^{-1} e^{\delta }|F^{(q)}|_{re^{-\delta },\alpha } |H^{(u+1-q)}|_{re^{-\delta },\alpha } \\&+ \cdots + C_3 r^{-1} e^{\delta } |F^{(u+1-q)}|_{re^{-\delta },\alpha } |H^{(q)}|_{re^{-\delta },\alpha } \\ \le \,&C_2\delta ^{-\mu } (u!)^{-\alpha }|G^{(u)}|_{r} + C_3 r^{-1} e^{\delta }|F^{(q)}|_{re^{-\delta },\alpha } |H^{(u+1-q)}|_{re^{-\delta },\alpha }\\&+ \cdots + C_3 r^{-1} e^{\delta } |F^{(u+1-q)}|_{re^{-\delta },\alpha } |H^{(q)}|_{re^{-\delta },\alpha }. \end{aligned}$$

When \(\mu \le 0\), we have \(u^\mu \le 1\) and it yields

$$\begin{aligned} (u!)^{-\alpha }|H^{(u)}|_{r} \le \,&c^{-1} u^\mu (u!)^{-\alpha }|G^{(u)}|_{r}+ C_3 r^{-1} |F^{(q)}|_{r,\alpha } |H^{(u+1-q)}|_{r,\alpha } \\&+ \cdots + C_3 r^{-1} |F^{(u+1-q)}|_{r,\alpha } |H^{(q)}|_{r,\alpha } \\ \le \,&C_2 (u!)^{-\alpha }|G^{(u)}|_{r} + C_3 r^{-1} |F^{(q)}|_{r,\alpha } |H^{(u+1-q)}|_{r,\alpha } \\&+ \cdots + C_3 r^{-1} |F^{(u+1-q)}|_{r,\alpha } |H^{(q)}|_{r,\alpha }. \end{aligned}$$

Here \(C_2=\max \{c^{-1}\mu ^{\mu },c^{-1}\}\) and \(C_3 = 2c^{-1}C(\alpha ,q)\), where we used the fact \(\max _{x\ge 0}\{x^\mu e^{-\delta x}\}\le \mu ^\mu \delta ^{-\mu }\). Now choose a large N. Summing all inequalities together, we obtain that

$$\begin{aligned} \sum _{u\ge q}^N (u!)^{-\alpha }|H^{(u)}|_{re^{-\delta }} \le&C_2 \delta ^{-\mu } \sum _{u\ge q}^N (u!)^{-\alpha } |G^{(u)}|_{r} \\&+\, C_3r^{-1}e^{\delta }\left( \sum _{u\ge q}^N |F^{(u)}|_{re^{-\delta },\alpha }\right) \left( \sum _{u\ge q}^N |H^{(u)}|_{re^{-\delta },\alpha }\right) \\ \le&C_2 \delta ^{-\mu } \sum _{u\ge q}^N (u!)^{-\alpha } |G^{(u)}|_{r} + C_3r^{-1}e^{\delta }|{{\hat{F}}}|_{re^{-\delta },\alpha }\left( \sum _{u\ge q}^N |H^{(u)}|_{re^{-\delta },\alpha }\right) . \end{aligned}$$

Note that we have \(\delta >0\) for \(\mu >0\) and \(\delta =0\) for \(-1<\mu \le 0\). With the convention \(0^\mu =1\), both cases lead to the same expression. Letting \(N\rightarrow \infty \), we get that \( |H|_{re^{-\delta },\alpha } \le 2C_2 \delta ^{-\mu } |G|_{r,\alpha }\) whenever \(2C_3r^{-1}e^\delta |{{\hat{F}}}|_{re^{-\delta },\alpha }< 1\). This completes the proof. \(\square \)
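For the reader's convenience, we sketch the two elementary facts used in the proof above. The shorthand \(S_N\), A and \(\theta \) is introduced here only for illustration, and we assume, as the notation of the proof suggests, that \(|H^{(u)}|_{\rho ,\alpha }=(u!)^{-\alpha }|H^{(u)}|_{\rho }\) for the homogeneous part of degree u. For \(\mu >0\) and \(\delta >0\), the function \(x\mapsto x^\mu e^{-\delta x}\) has its unique critical point on \((0,\infty )\) at \(x=\mu /\delta \), so that

$$\begin{aligned} \max _{x\ge 0}\{x^\mu e^{-\delta x}\} = \left( \frac{\mu }{\delta }\right) ^{\mu }e^{-\mu } \le \mu ^\mu \delta ^{-\mu }. \end{aligned}$$

Moreover, writing \(S_N=\sum _{u= q}^N (u!)^{-\alpha }|H^{(u)}|_{re^{-\delta }}\), \(A=C_2\delta ^{-\mu }|G|_{r,\alpha }\) and \(\theta =C_3r^{-1}e^{\delta }|{{\hat{F}}}|_{re^{-\delta },\alpha }\), the summed inequality reads \(S_N\le A+\theta S_N\). Since \(S_N\) is a finite sum, the assumption \(2\theta <1\) gives

$$\begin{aligned} S_N \le \frac{A}{1-\theta } < 2A \end{aligned}$$

uniformly in N, which is the stated bound after letting \(N\rightarrow \infty \).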

4 KAM methods and contraction mapping principle in the formal Gevrey normalization

In this part, we use KAM steps for \(\mu \ge 0\) and the contraction mapping principle for \(-1\le \mu <0\) to obtain the formal Gevrey normalization. Here we follow the scheme shown in [3] (pp. 70–72) and [4] (pp. 52–56), with some modifications adapted to our case.

First we take the case \(\mu \ge 0\) into account.

Consider the system

$$\begin{aligned} \dot{x} = Dx+ f(x) = Dx+f_r(x)+f_{nr}(x), \end{aligned}$$
(8)

where \(D=\mathrm{diag}\lambda \) and \(f_{nr}=\langle f\rangle _{nr}\). Without loss of generality, we may assume that the degree of the monomials in \(f_{nr}\) is greater than q; otherwise, we can apply the Poincaré-Dulac formal normal form reduction to cancel the lower-order non-resonant terms. Applying the coordinate substitution \(x=y + h(y)\) with \(\langle h \rangle _{nr}=h\) to system (8) yields

$$\begin{aligned} \dot{y}= Dy + f_r(y) + [Dy+f_r(y),h]+ f_{nr} + {\mathscr {R}}_1 + {\mathscr {R}}_2 + {\mathscr {R}}_3 +{\mathscr {R}}_4, \end{aligned}$$
(9)

where

$$\begin{aligned} {\mathscr {R}}_1= & {} f_r(y+h(y))-f_r(y) - \partial f_r(y)h(y) \\ {\mathscr {R}}_2= & {} f_{nr}(y+h(y))-f_{nr}(y) \end{aligned}$$

and

$$\begin{aligned}&{\mathscr {R}}_3 = -\partial h f_{nr}(y)-\partial h \partial f_{r}(y)h(y)-\partial h Dh -\partial h ({\mathscr {R}}_1+{\mathscr {R}}_2) \\&{\mathscr {R}}_4 = \left( (I+\partial h)^{-1} -(I - \partial h)\right) \left( Dy+f_r(y)+Dh+ f_{nr}(y)+\partial f_r h + {\mathscr {R}}_1+{\mathscr {R}}_2\right) . \end{aligned}$$
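The decomposition (9) can be checked by a direct computation. The following is only a sketch: it assumes the sign convention \([F,h]=\partial F\, h-\partial h\, F\) for the Lie bracket, which is the one consistent with the terms listed above, and the symbol W is a shorthand introduced here for illustration. Substituting \(x=y+h(y)\) into (8) and expanding \(f_r\) and \(f_{nr}\) around y gives

$$\begin{aligned} (I+\partial h)\dot{y} = Dy+Dh+f_r(y)+\partial f_r(y)h+{\mathscr {R}}_1+f_{nr}(y)+{\mathscr {R}}_2 =: W. \end{aligned}$$

Splitting \((I+\partial h)^{-1}=(I-\partial h)+\left( (I+\partial h)^{-1}-(I-\partial h)\right) \), the second part applied to W is exactly \({\mathscr {R}}_4\), while

$$\begin{aligned} (I-\partial h)W =\,&Dy+f_r(y)+\left( Dh+\partial f_r(y)h-\partial h(Dy+f_r(y))\right) +f_{nr}(y)+{\mathscr {R}}_1+{\mathscr {R}}_2 \\&-\partial h f_{nr}(y)-\partial h\partial f_r(y)h-\partial h Dh-\partial h({\mathscr {R}}_1+{\mathscr {R}}_2). \end{aligned}$$

The bracketed term in the first line is \([Dy+f_r(y),h]\) and the second line is \({\mathscr {R}}_3\), which recovers (9).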

First we study the remainder parts \({\mathscr {R}}_1\) and \({\mathscr {R}}_2\).

Proposition 8

Assume that \(f, h\in {\mathbb {C}}^d[[x]]\) with \(h(0)=0\), and let \(\delta >0\). Set \(\rho = r+|h|_{r,\alpha }\). Then we have that

$$\begin{aligned}&|f\circ (\mathrm{id}+h)-f|_{r,\alpha } \le \rho ^{-1} \delta ^{-1} |f|_{\rho e^\delta ,\alpha } |h|_{r,\alpha }, \\&|f(\mathrm{id}+h)-f-\partial f h|_{r,\alpha } \le C \rho ^{-2} \delta ^{-2}|f|_{\rho e^\delta ,\alpha }|h|_{r,\alpha }^2, \end{aligned}$$

where \(C=4\).

Proof

The proof follows the same idea as that of Proposition 4. Let \(k=(k_1,\ldots ,k_d)\) and \(h=(h_1,\ldots ,h_d)\), and fix t. Using the identity

$$\begin{aligned} (x+h)^k-x^k = \sum _{t=1}^d(x_1+h_1)^{k_1}\cdots (x_{t-1}+h_{t-1})^{k_{t-1}}\left( (x_t+h_t)^{k_t}-x_t^{k_t}\right) x_{t+1}^{k_{t+1}}\cdots x_d^{k_d} \end{aligned}$$

and by rough estimations we obtain that

$$\begin{aligned} |(x_t+h_t)^{k_t}- x_t^{k_t}|_{r,\alpha }= & {} \left| \sum _{q=1}^{k_t}C_{k_t}^q x_t^{k_t-q}h^q_{t}\right| _{r,\alpha } \le \sum _{q=1}^{k_t}C_{k_t}^q r^{k_t-q}|h_{t}|_{r,\alpha }^q \\= & {} (r+|h_t|_{r,\alpha })^{k_t} -r^{k_t} \le k_t (r+|h_t|_{r,\alpha })^{k_t-1}|h_t|_{r,\alpha } \end{aligned}$$

by the classical mean value inequality, where \(C_{n}^{m} = \frac{n!}{(n-m)!m!}\) denotes the binomial coefficient. Then from Corollary 3 we obtain that

$$\begin{aligned}&|f\circ (\mathrm{id}+h) - f|_{r,\alpha }\nonumber \\&\quad \le \sum _{|k|,j} |f_{k,j}| |\left( (x+h)^k-x^k\right) e_j |_{r,\alpha } \nonumber \\&\quad \le \sum _{|k|,j}\sum _{t=1}^d |f_{k,j}| |(x_1+h_1)^{k_1}\cdots \left( (x_t+h_t)^{k_t}- x_t^{k_t}\right) x_{t+1}^{k_{t+1}}\cdots x_d^{k_d} e_j |_{r,\alpha } \nonumber \\&\quad \le \sum _{|k|,j}\sum _{t=1}^d |f_{k,j}| (|k|!)^{-\alpha } \left( r+|h_1|_{r,\alpha }\right) ^{k_1}\cdots |(x_t+h_t)^{k_t}- x_t^{k_t}|_{r,\alpha } r^{k_{t+1}+\cdots +k_d} \nonumber \\&\quad \le \sum _{|k|,j}\sum _{t=1}^d |f_{k,j}|(|k|!)^{-\alpha } |k| \rho ^{|k|-1} |h_t|_{r,\alpha } \nonumber \\&\quad \le \sum _{|k|,j} \rho ^{-1} |k| e^{-|k|\delta }|f_{k,j}|(|k|!)^{-\alpha } (\rho e^\delta )^{|k|} |h|_{r,\alpha } \nonumber \\&\quad \le \rho ^{-1} \delta ^{-1}|f|_{\rho e^\delta ,\alpha } |h|_{r,\alpha }\nonumber , \end{aligned}$$

by the fact \(\max _{x\ge 0}\{x e^{-\delta x}\} \le \delta ^{-1}\). Furthermore, by similar arguments we obtain that

$$\begin{aligned} |f(\mathrm{id}+h)-f-\partial f h|_{r,\alpha } \le&\sum _{|k|\ge 2,j} |f_{k,j}||\left( (x + h)^{k} - x^{k} - \sum _{t=1}^d k_t x^{k-e_t} h_t\right) e_j|_{r,\alpha } \\ \le&\sum _{|k|\ge 2,j} |f_{k,j}| (|k|!)^{-\alpha }\left( (r+|h|_{r,\alpha })^{|k|}-r^{|k|} - |k|r^{|k|-1} |h|_{r,\alpha }\right) \\ \le&\sum _{|k|\ge 2,j} |f_{k,j}|(|k|!)^{-\alpha } \left( C_{|k|}^2 r^{|k|-2}|h|_{r,\alpha }^2 \right. \nonumber \\&\left. + \,C_{|k|}^3r^{|k|-3}|h|_{r,\alpha }^3 + \cdots + |h|_{r,\alpha }^{|k|}\right) \\ \le \,&|h|_{r,\alpha }^2 \sum _{|k|\ge 2,j} |f_{k,j}|(|k|!)^{-\alpha } e^{-\delta |k|}|k|^2 (r+|h|_{r,\alpha })^{|k|-2} e^{\delta |k|} \\ \le \,&C \rho ^{-2} \delta ^{-2}|f|_{\rho e^\delta ,\alpha }|h|_{r,\alpha }^2, \end{aligned}$$

where \(C=4\) by the same arguments. That completes the proof. \(\square \)
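The constants in Proposition 8 come from two elementary maxima and one binomial estimate. As a brief sketch, with the shorthand \(a=|h|_{r,\alpha }\) and \(n=|k|\) introduced here only for illustration (so that \(\rho =r+a\)) and \(C_n^m=\frac{n!}{(n-m)!m!}\), we have for \(\delta >0\)

$$\begin{aligned} \max _{x\ge 0}\{x e^{-\delta x}\}&= \frac{1}{e\delta }\le \delta ^{-1},\qquad \max _{x\ge 0}\{x^2 e^{-\delta x}\} = \frac{4}{e^2\delta ^2}\le 4\delta ^{-2}, \\ \sum _{m=2}^{n}C_n^m r^{n-m}a^m&= a^2\sum _{m=2}^{n}\frac{n(n-1)}{m(m-1)}C_{n-2}^{m-2}r^{n-m}a^{m-2} \le \frac{n(n-1)}{2}a^2(r+a)^{n-2}\le n^2a^2\rho ^{n-2}. \end{aligned}$$

The maxima are attained at \(x=1/\delta \) and \(x=2/\delta \), respectively. Combining the second maximum with the binomial estimate and \(\rho ^{n-2}=\rho ^{-2}e^{-\delta n}(\rho e^{\delta })^{n}\) gives the factor \(4\rho ^{-2}\delta ^{-2}\) in the second inequality.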

By Proposition 7, we take \({{\hat{h}}}\) to be the non-resonant solution of \([Dy+f_r,{{\hat{h}}}]=-f_{nr}\) in system (9), which leads to the new system

$$\begin{aligned} \dot{y} = Dy + f_r(y) + f_r^+(y) + f_{nr}^+(y), \end{aligned}$$
(10)

where \(f^+ = \sum _{t=1}^4 {\mathscr {R}}_t\), \(f^+_{nr}= \langle f^+\rangle _{nr}\) and \(f^+_r=f^+-f^+_{nr}\). Note that \(\mathrm{ord}({{\hat{h}}})\ge q\).

Now comes the iterative lemma.

Lemma 9

Assume that conditions

$$\begin{aligned}&e^{-2}<re^{-3\delta }< r\le 1, \quad 0<\delta <1/3,\end{aligned}$$
(11)
$$\begin{aligned}&|f_{nr}|_{r,\alpha }\le \frac{\delta ^{\mu +1}}{e^2C C(\alpha ,q)} \end{aligned}$$
(12)

and

$$\begin{aligned} |f_r|_{r,\alpha }< \frac{c }{4e^2C(\alpha ,q)} \end{aligned}$$
(13)

are satisfied. Then for system (10), under condition \(\mathrm{(C2)}\) with \(\mu \ge 0\) and \(\alpha \ge \frac{\mu +1}{q-1}\), we have that

$$\begin{aligned} |f^+|_{re^{-3\delta },\alpha } \le K \delta ^{-2(\mu +1)} |f_{nr}|_{r,\alpha }^2, \end{aligned}$$
(14)

where K is a constant. Here C is the same positive constant as mentioned in Proposition 7 and \(C(\alpha ,q)\) is given by (3) in Lemma 5.

Proof

First of all, we control \(|\partial {\hat{h}} |_{\rho ,\alpha }\) and \(|(I+\partial {{\hat{h}}})^{-1} -(I - \partial {{\hat{h}}})|_{\rho ,\alpha }\) by Lemma 5(ii), in order to handle \({\mathscr {R}}_3 \) and \({\mathscr {R}}_4\) for \(re^{-3\delta }\le \rho \le r\). Here we regard \(\partial {{\hat{h}}}\) as an operator on \({\mathscr {X}}_\rho \), whose operator norm is also denoted by \(|\partial {{\hat{h}}}|_{\rho ,\alpha }\) for simplicity. Then for any g satisfying the condition of Lemma 5(ii), we have

$$\begin{aligned} |\partial {{\hat{h}}} g |_{\rho ,\alpha } = \sum _{j=1}^d\sum _{i=1}^d |\partial _{x_i}{{\hat{h}}}_j g_i|_{\rho ,\alpha } \le \sum _{j=1}^d\sum _{i=1}^d |{\mathscr {P}}_\mu (\partial _{x_i}{{\hat{h}}}_j g_i)|_{\rho ,\alpha } \le C(\alpha ,q)\rho ^{-1}|{{\hat{h}}}|_{\rho ,\alpha } |g|_{\rho ,\alpha } \end{aligned}$$

with \(\mu \ge 0\) and \(\alpha \ge \frac{\mu +1}{q-1}\), where \({{\hat{h}}}=({{\hat{h}}}_1,\ldots ,{{\hat{h}}}_d)\) and \(g=(g_1,\ldots ,g_d)\) are of order no less than q at \(x=0\). Namely, the operator norm admits

$$\begin{aligned} |\partial {{\hat{h}}}|_{\rho ,\alpha } \le C(\alpha ,q)\rho ^{-1}|{{\hat{h}}}|_{\rho ,\alpha }. \end{aligned}$$
(15)

Then when

$$\begin{aligned} C(\alpha ,q)\rho ^{-1}|{{\hat{h}}}|_{\rho ,\alpha }\le 1/3, \end{aligned}$$
(16)

we obtain that

$$\begin{aligned} |(I+\partial {{\hat{h}}})^{-1} -(I - \partial {{\hat{h}}})|_{\rho ,\alpha } \le \sum _{i\ge 2}|\partial {{\hat{h}}}|^i_{\rho ,\alpha } \le \frac{3}{2}C^2(\alpha ,q)\rho ^{-2}|{{\hat{h}}}|_{\rho ,\alpha }^2 <1 \end{aligned}$$
(17)

by the Neumann series \((I+\partial {{\hat{h}}})^{-1} = \sum _{i\ge 0} (-1)^i (\partial {{\hat{h}}})^i\).
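The constant \(\frac{3}{2}\) in (17) can be read off from the geometric series. As a short sketch, write \(x=|\partial {{\hat{h}}}|_{\rho ,\alpha }\) (a shorthand used only here) and assume that the operator norm is submultiplicative. By (15) and (16) we have \(x\le C(\alpha ,q)\rho ^{-1}|{{\hat{h}}}|_{\rho ,\alpha }\le 1/3\), hence

$$\begin{aligned} \sum _{i\ge 2}x^i = \frac{x^2}{1-x} \le \frac{3}{2}x^2 \le \frac{3}{2}C^2(\alpha ,q)\rho ^{-2}|{{\hat{h}}}|_{\rho ,\alpha }^2 \le \frac{3}{2}\cdot \frac{1}{9} = \frac{1}{6} <1. \end{aligned}$$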

Next, since the degree of the monomials in \(f_{nr}\) is greater than q, so is that of the monomials in \({{\hat{h}}}\) and \({\mathscr {R}}_i\) for \(i=1,2,3,4\). Then, under condition (13), we have

$$\begin{aligned} 4c^{-1}C(\alpha ,q) \rho ^{-1} |f_{r}|_{\rho ,\alpha } \le 4c^{-1}C(\alpha ,q) e^2 |f_{r}|_{r,\alpha }<1 \end{aligned}$$

for any \(\rho \) satisfying \(re^{-3\delta }\le \rho \le r\), which leads to \(|{{\hat{h}}}|_{re^{-\delta },\alpha } \le C \delta ^{-\mu }|f_{nr}|_{r,\alpha }\) from Proposition 7 with \(\rho =re^{-\delta }\). Furthermore, under condition (12) it admits

$$\begin{aligned} |{{\hat{h}}}|_{re^{-\delta },\alpha } \le C\delta ^{-\mu } |f_{nr}|_{r,\alpha } \le \frac{\delta }{e^2C(\alpha ,q)} \le e^{-2} \delta \end{aligned}$$
(18)

from the fact that \(C(\alpha ,q) >1\), which is given by (3) in Lemma 5. Thus, (16) is satisfied. We now apply Proposition 8 with \(re^{-3\delta }\) in place of r. From the above inequality and (11), we obtain that

$$\begin{aligned} \rho e^\delta = re^{-2\delta }+|{{\hat{h}}}|_{re^{-3\delta },\alpha }e^\delta \le re^{-2\delta }\left( 1+ r^{-1} e^{3\delta }e^{-2}\delta \right) \le re^{-2\delta }(1+\delta ) \le re^{-\delta }. \end{aligned}$$
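The three inequalities in the last display use only (11), (18), the monotonicity of \(|\cdot |_{\rho ,\alpha }\) in \(\rho \), and \(1+\delta \le e^{\delta }\). Spelled out as a sketch:

$$\begin{aligned} |{{\hat{h}}}|_{re^{-3\delta },\alpha }e^{\delta }&\le |{{\hat{h}}}|_{re^{-\delta },\alpha }e^{\delta } \le e^{-2}\delta e^{\delta } = re^{-2\delta }\cdot r^{-1}e^{3\delta }e^{-2}\delta , \\ r^{-1}e^{3\delta }e^{-2}&< 1 \quad \text {since } re^{-3\delta }>e^{-2} \text { by (11)}, \\ re^{-2\delta }(1+\delta )&\le re^{-2\delta }e^{\delta } = re^{-\delta }. \end{aligned}$$

In particular \(|f_r|_{\rho e^\delta ,\alpha }\le |f_r|_{r,\alpha }\) and \(|f_{nr}|_{\rho e^\delta ,\alpha }\le |f_{nr}|_{r,\alpha }\), which is what the estimates of \({\mathscr {R}}_1\) and \({\mathscr {R}}_2\) require.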

Thus from Proposition 8 and \(\rho \ge re^{-3\delta }\) we obtain that

$$\begin{aligned}&|{\mathscr {R}}_1|_{re^{-3\delta },\alpha } \le 4 \rho ^{-2} \delta ^{-2} |f_r|_{\rho e^\delta ,\alpha }|{{\hat{h}}}|_{re^{-3\delta },\alpha }^2 \le 4e^4 \delta ^{-2} |f_r|_{r,\alpha }|{{\hat{h}}}|_{re^{-\delta },\alpha }^2 \le K_1 \delta ^{-(2\mu +2)} |f_{nr}|_{r,\alpha }^2 \\&|{\mathscr {R}}_2|_{re^{-3\delta },\alpha } \le \rho ^{-1} \delta ^{-1} |f_{nr}|_{\rho e^\delta ,\alpha } |{{\hat{h}}}|_{re^{-3\delta },\alpha } \le K_2 \delta ^{-(\mu +1)}|f_{nr}|^2_{r,\alpha } \le K_2 \delta ^{-(2\mu +2)}|f_{nr}|_{r,\alpha }^2, \end{aligned}$$

where \(K_1= e^2 c C^2/ C(\alpha ,q)\) and \(K_2=C e^2\). Furthermore, from (15) and (18) we can obtain that

$$\begin{aligned} |\partial {{\hat{h}}} D {{\hat{h}}}|_{re^{-3\delta },\alpha }&\le \overline{\lambda }C(\alpha ,q)r^{-1}e^{3\delta } |{{\hat{h}}}|_{re^{-3\delta },\alpha }^2 \le \overline{\lambda }C(\alpha ,q)e^2 C^2 \delta ^{-2\mu } |f_{nr}|_{r,\alpha }^2,\\ |\partial {{\hat{h}}} f_{nr}|_{re^{-3\delta },\alpha }&\le C(\alpha ,q) r^{-1}e^{3\delta }|{{\hat{h}}}|_{re^{-3\delta },\alpha }|f_{nr}|_{re^{-3\delta },\alpha } \le C(\alpha ,q)e^2 C \delta ^{-\mu } |f_{nr}|_{r,\alpha }^2, \end{aligned}$$

and, together with condition (13),

$$\begin{aligned} |\partial {{\hat{h}}} \partial f_r {{\hat{h}}}|_{re^{-3\delta },\alpha }&\le C^2(\alpha ,q)r^{-2}e^{6\delta }|f_r|_{re^{-3\delta },\alpha } |{{\hat{h}}}|_{re^{-3\delta },\alpha }^2 \\&\le C^2(\alpha ,q)e^4 \frac{c }{4e^2C(\alpha ,q)} C^2 \delta ^{-2\mu }|f_{nr}|_{r, \alpha }^2 \\&\le C(\alpha ,q)e^2 c C^2 \delta ^{-2\mu } |f_{nr}|_{r,\alpha }^2, \end{aligned}$$

where \(\overline{\lambda }=\max _i\{|\lambda _i|\}\). Moreover, from (15) and (18) we obtain that

$$\begin{aligned} |\partial {{\hat{h}}}|_{re^{-3\delta },\alpha } \le C(\alpha ,q)(re^{-3\delta })^{-1}\frac{\delta }{e^2C(\alpha ,q)} \le \delta \le \frac{1}{3}. \end{aligned}$$
(19)

On the one hand, it means that

$$\begin{aligned} |\partial {{\hat{h}}} ({\mathscr {R}}_1+{\mathscr {R}}_2)|_{re^{-3\delta },\alpha } \le \frac{1}{3} |{\mathscr {R}}_1+{\mathscr {R}}_2|_{re^{-3\delta },\alpha } \le |{\mathscr {R}}_1|_{re^{-3\delta },\alpha }+ |{\mathscr {R}}_2|_{re^{-3\delta },\alpha } \end{aligned}$$

So we have that

$$\begin{aligned} |{\mathscr {R}}_3|_{re^{-3\delta },\alpha } \le K_3 \delta ^{-(2\mu +2)}|f_{nr}|_{r,\alpha }^2, \end{aligned}$$

where \(K_3 = C(\alpha ,q)e^2 C+ C(\alpha ,q)e^2 c C^2+\overline{\lambda }C(\alpha ,q)e^2 C^2+K_1+K_2\). On the other hand, inequality (19) also guarantees the validity of (17), which means that we can handle nearly all terms in \({\mathscr {R}}_4\) by similar arguments, except \(((I+\partial {{\hat{h}}})^{-1} -(I - \partial {{\hat{h}}})) Dy\), because Dy is only of degree 1.

Finally, we control the term \(((I+\partial {{\hat{h}}})^{-1} -(I - \partial {{\hat{h}}})) Dy\) by Lemma 5(i) instead. Using \(re^{-3\delta }\) instead of \(re^{-\delta }\) in Lemma 5(i), we obtain

$$\begin{aligned} |\partial {{\hat{h}}}|_{re^{-3\delta },\alpha } \le \delta ^{-1} (re^{-2\delta })^{-1} |{{\hat{h}}}|_{r e^{-2\delta },\alpha } \le \delta ^{-1} e^2 \frac{\delta }{e^2 C(\alpha ,q)} =\frac{1}{ C(\alpha ,q)}<1 \end{aligned}$$

from (18). Then, from the Neumann series again, we have that

$$\begin{aligned} |(I+\partial {{\hat{h}}})^{-1} -(I - \partial {{\hat{h}}})|_{re^{-3\delta },\alpha }&\le \sum _{i\ge 2}|\partial {{\hat{h}}}|^i_{r e^{-3\delta },\alpha } \\&\le \frac{C(\alpha ,q)}{C(\alpha ,q)-1}(re^{-2\delta })^{-2}\delta ^{-2}|{{\hat{h}}}|_{re^{-2\delta },\alpha }^2, \end{aligned}$$

which implies

$$\begin{aligned} |((I+\partial {{\hat{h}}})^{-1} -(I - \partial {{\hat{h}}}))Dy|_{re^{-3\delta },\alpha } \le {{\hat{C}}} \delta ^{-2(\mu +1)}|f_{nr}|_{r,\alpha }^2 \end{aligned}$$

for the constant \({{\hat{C}}} = d\overline{\lambda }C(\alpha ,q)e^4 C^2/(C(\alpha ,q)-1)\) by the simple computation \(|Dy|_{re^{-3\delta },\alpha } \le |Dy|_{1,\alpha } \le d\overline{\lambda }\). Therefore, from (13), (18), (12) and Lemma 5(ii) we obtain the estimate

$$\begin{aligned}&|f_r(y)+D{{\hat{h}}}+ f_{nr}(y)+\partial f_r {{\hat{h}}}|_{\rho ,\alpha } \le |f_r|_{\rho ,\alpha }+\overline{\lambda } |{{\hat{h}}}|_{\rho ,\alpha }\nonumber \\&\quad +\, |f_{nr}|_{\rho ,\alpha }+ \rho ^{-1} C(\alpha ,q)|f_r|_{\rho ,\alpha }|{{\hat{h}}}|_{\rho ,\alpha } \le \widetilde{C} \end{aligned}$$

with the constant

$$\begin{aligned} \widetilde{C} = \frac{c }{4e^2C(\alpha ,q)}+\frac{\overline{\lambda }}{3e^2}+ \frac{1}{3^{\mu +1}e^2C C(\alpha ,q)} + \frac{c}{12e^2} \end{aligned}$$

for \(re^{-3\delta }\le \rho \le re^{-\delta }\). Together with the last two inequalities of (17) and by similar arguments as above it yields

$$\begin{aligned}&|\left( (I+\partial h)^{-1} -(I - \partial h)\right) \left( f_r(y)+Dh+ f_{nr}(y)+\partial f_r h \right) |_{re^{-3\delta },\alpha } \\&\quad \le \frac{3}{2}\widetilde{C} C^2(\alpha ,q)r^{-2}e^{6\delta }|{{\hat{h}}}|_{re^{-3\delta },\alpha }^2 \le \frac{3}{2}\widetilde{C} C^2(\alpha ,q)e^4 C^2\delta ^{-2\mu }| f_{nr}|_{r,\alpha }^2,\\&|\left( (I+\partial h)^{-1} -(I - \partial h)\right) \left( {\mathscr {R}}_1+{\mathscr {R}}_2\right) |_{re^{-3\delta },\alpha } \nonumber \\&\quad \le |{\mathscr {R}}_1+{\mathscr {R}}_2|_{re^{-3\delta },\alpha } \le (K_1+K_2) \delta ^{-(2\mu +2)} |f_{nr}|_{r,\alpha }^2. \end{aligned}$$

So it leads to \(|{\mathscr {R}}_4|_{re^{-3\delta },\alpha }\le K_4 \delta ^{-2(\mu +1)}|f_{nr}|_{r,\alpha }^2\) for another positive constant \(K_4= {{\hat{C}}}+3\widetilde{C} C^2(\alpha ,q)e^4 C^2/2+K_1+K_2\). Thus, we can choose \(K=\sum _{i=1}^4 K_i\), which completes the proof. \(\square \)

Thus the formal coordinates substitution can be found for \(\mu \ge 0\).

Theorem 10

Assume that system (1) is formal Gevrey-\(\alpha \) (\(\alpha \ge 0\)). Then under condition \(\mathrm{(C2)}\) with \(\mu \ge 0\) and \(\alpha \ge \frac{\mu +1}{q-1}\), there exists a formal Gevrey-\(\alpha \) coordinate substitution which turns system (1) into its normal form.

Proof

Since \(N=0\) in system (1) for this case, we set \(f=R\). By the scaling \(x \mapsto \varepsilon _0 x\), we can assume \(|f|_{1,\alpha } = \varepsilon _0\), where \(\varepsilon _0\) can be taken sufficiently small. Now choose \(\delta _n =\delta _0 2^{-n}\) with \(\delta _0 <1/3\), \(r_0=1\) and \(r_n = r_{n-1}e^{-\delta _{n-1}}\), and set \(f^{(0)}=f\). By induction, the n-th step begins with system (8) with \(f^{(n-1)}(x)\) in place of f(x), solves (6) with \(F=Dx+f^{(n-1)}_r(x)\), \(H = {{\hat{h}}}_n\) and \(G = -f^{(n-1)}_{nr}\) in the norm \(|\cdot |_{r_n e^{-\delta _n},\alpha }\), and ends with system (10) for \(f^+ = f^{(n)}\).

Thus, if Lemma 9 can be applied in each step with \(r=r_n\) and \(\delta =\delta _n/3\), so that \(r_{n+1}=r_ne^{-\delta _n}=re^{-3\delta }\), then the control (14) gives

$$\begin{aligned} |f^{(n)}|_{r_{n+1},\alpha }= & {} |f^{(n)}|_{r_n e^{-\delta _n},\alpha } \le {{\hat{K}}} \delta _n^{-2(\mu +1)}|f_{nr}^{(n-1)}|^2_{r_n,\alpha }\\= & {} {\hat{K}} \delta _n^{-2(\mu +1)}|f_{nr}^{(n-1)}|^2_{r_{n-1}e^{-\delta _{n-1}},\alpha } \\\le & {} {\hat{K}}^{1+2} \delta _n^{-2(\mu +1)}\delta _{n-1}^{-4(\mu +1)}|f_{nr}^{(n-2)}|_{r_{n-1},\alpha }^{2^2} \\\le & {} \cdots \le \left( {\hat{K}}\delta _0^{-2(\mu +1)}\right) ^{1+2+2^2+\cdots +2^{n-1}} 2^{2(\mu +1)(n+2(n-1)+\cdots +2^{n-1}\cdot 1)}|f^{(0)}_{nr}|_{1,\alpha }^{2^{n+1}} \\\le & {} \left( {\hat{K}} \delta _0^{-2(\mu +1)}2^{2(\mu +1)}\varepsilon _0\right) ^{2^{n+1}}, \end{aligned}$$

where \({{\hat{K}}}:= 3^{2(\mu +1)}K\). Recall that we have assumed that the degree of all monomials of \(f^{(0)}_{nr}\) is greater than q, and so is that of \(f^{(n)}_{nr}\) for all n, by the form of \({\mathscr {R}}_i\) in system (9). Namely, we always have \(\mathrm{ord}(f^{(n)}_{r})\ge q\), and Proposition 7 can be applied in each step. Now we verify the conditions of Lemma 9 one by one by choosing a proper \(\varepsilon _0\). First, by a simple calculation we have \(r_n = e^{-\delta _0(2^{-(n-1)} +2^{-(n-2)}+ \cdots +1)} \ge e^{-2\delta _0} \ge e^{-2}\), which fulfills (11) with \(r=r_{n}\) and \(\delta =\delta _{n}/3\) for any n. Conditions (12) and (13) are satisfied provided that

$$\begin{aligned} \left( {{\hat{K}}} (\delta _0/2)^{-2(\mu +1)}\varepsilon _0\right) ^{2^{n+1}} \le \dfrac{(\delta _02^{-(n+1)})^{\mu +1}}{3^{\mu +1}e^2C C(\alpha ,q)}, \quad n \in {\mathbb {Z}}_+ \end{aligned}$$

and

$$\begin{aligned} |f^{(n)}_r|_{r_{n+1},\alpha } \le \varepsilon _0 + \sum _{t\ge 1}\left( {{\hat{K}}} (\delta _0/2)^{-2(\mu +1)}\varepsilon _0\right) ^{2^{t+1}} < \frac{c }{4e^2C(\alpha ,q)}, \quad n \in {\mathbb {Z}}_+. \end{aligned}$$

Since we have

$$\begin{aligned} L_0=\inf _{n\in {\mathbb {Z}}_+}\left( \dfrac{(\delta _02^{-(n+1)})^{\mu +1}}{3^{\mu +1}e^2C C(\alpha ,q)}\right) ^{\frac{1}{2^{n+1}}}>0, \end{aligned}$$

we may set \(Q= {{\hat{K}}} (\delta _0/2)^{-2(\mu +1)}\) and see that the choice

$$\begin{aligned} \varepsilon _0 \le \min \left\{ \frac{1}{2Q}, \dfrac{L_0}{Q}, \frac{c }{4e^2C(\alpha ,q)(2Q+1)}\right\} \end{aligned}$$

is enough.
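For completeness, here is a sketch of why \(L_0>0\) and why this choice of \(\varepsilon _0\) suffices. The shorthand \(a_n\) is introduced only for this remark. Put \(a_n=(\delta _02^{-(n+1)})^{\mu +1}/(3^{\mu +1}e^2C C(\alpha ,q))>0\). Then

$$\begin{aligned} \ln \left( a_n^{1/2^{n+1}}\right) = 2^{-(n+1)}\left( (\mu +1)\ln \delta _0-(n+1)(\mu +1)\ln 2-\ln \left( 3^{\mu +1}e^2C C(\alpha ,q)\right) \right) \rightarrow 0 \end{aligned}$$

as \(n\rightarrow \infty \), so \(a_n^{1/2^{n+1}}\rightarrow 1\). A sequence of positive numbers with a positive limit has a positive infimum, which gives \(L_0>0\). If \(Q\varepsilon _0\le L_0\), then \((Q\varepsilon _0)^{2^{n+1}}\le L_0^{2^{n+1}}\le a_n\) for every n. If moreover \(0<Q\varepsilon _0\le 1/2\), then

$$\begin{aligned} \varepsilon _0+\sum _{t\ge 1}(Q\varepsilon _0)^{2^{t+1}} \le \varepsilon _0+\frac{(Q\varepsilon _0)^4}{1-Q\varepsilon _0} \le \varepsilon _0+2(Q\varepsilon _0)^4 < (2Q+1)\varepsilon _0 \le \frac{c}{4e^2C(\alpha ,q)}, \end{aligned}$$

where the last step uses the third entry of the minimum.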

Finally, set \(h_n=\mathrm{id}+{{\hat{h}}}_n\) with \(h_0=\mathrm{id}\), so that \(h^{(n)}=h_n\circ h_{n-1}\circ \cdots \circ h_0\), which implies \(h^{(n)}-h^{(n-1)}={{\hat{h}}}_n\circ h_{n-1}\circ \cdots \circ h_0\). We show that \(h^{(n)}\) converges in the norm \(|\cdot |_{{{\hat{r}}},\alpha }\) on a non-trivial domain with \({{\hat{r}}} = 2e^{-2}/(3d)\), using (18). First we confirm that for this \({{\hat{r}}}\) the compositions are well defined in this region, i.e. \(|h^{(n)}|_{{{\hat{r}}},\alpha }\le e^{-2}\) for any n. Since \({{\hat{r}}} \le e^{-2} \le r_n\) for any n and \(r_{n} = r_{n-1}e^{-\delta _{n-1}}\le r_{n-1}e^{-\delta _{n-1}/3}\le r_{n-1}\), we obtain that \(|{{\hat{h}}}_n|_{{{\hat{r}}},\alpha } \le |{{\hat{h}}}_n|_{r_{n-1} e^{-\delta _{n-1}/3},\alpha }\le e^{-2}\delta _{n-1}/3 = 2^{-(n-1)}e^{-2}\delta _0/3\) by (18). When \(n=0\), we have that \(|h^{(0)}|_{{{\hat{r}}},\alpha }=|h_0|_{{{\hat{r}}},\alpha }=|\mathrm{id}|_{{{\hat{r}}},\alpha } =d{{\hat{r}}} = 2e^{-2}/3\). When \(n=1\), we have that \(|h^{(1)}|_{{{\hat{r}}}, \alpha } =|h_1|_{{{\hat{r}}}, \alpha }= |\mathrm{id} +{{\hat{h}}}_1|_{{{\hat{r}}},\alpha }\le |\mathrm{id}|_{{{\hat{r}}},\alpha } +|{{\hat{h}}}_1|_{{{\hat{r}}},\alpha }\le d{{\hat{r}}} +|{{\hat{h}}}_1|_{r_0e^{-\delta _0/3},\alpha } \le 2e^{-2}/3 + e^{-2}\delta _0/3 <e^{-2}\). Now assume that \(|h^{(k)}|_{{{\hat{r}}}, \alpha } \le 2e^{-2}/3 + e^{-2}\delta _0/3+\cdots + e^{-2}\delta _{k-1}/3 =2e^{-2}/3 + e^{-2}\delta _0(1+2^{-1}+\cdots +2^{-(k-1)})/3\le 2e^{-2}/3 + e^{-2}2\delta _0/3 <e^{-2}\) for \(k\le n-1\). Then

$$\begin{aligned} |h^{(n)}|_{{{\hat{r}}},\alpha }&\le |{{\hat{h}}}_n\circ h^{(n-1)}+h^{(n-1)}|_{{{\hat{r}}},\alpha } \le |{{\hat{h}}}_n\circ h^{(n-1)}|_{{{\hat{r}}},\alpha }+|h^{(n-1)}|_{{{\hat{r}}},\alpha } \\&\le |{{\hat{h}}}_n|_{e^{-2},\alpha }+ 2e^{-2}/3 + e^{-2}\delta _0\left( 1+2^{-1}+\cdots +2^{-(n-2)}\right) /3 \\&\le |{{\hat{h}}}_n|_{r_{n-1}e^{-\delta _{n-1}/3},\alpha }+2e^{-2}/3 + e^{-2}\delta _0\left( 1+2^{-1}+\cdots +2^{-(n-2)}\right) /3 \\&\le 2e^{-2}/3 + e^{-2}\delta _0\left( 1+2^{-1}+\cdots +2^{-(n-1)}\right) /3 \le 2e^{-2}/3 + e^{-2}2\delta _0/3 <e^{-2}. \end{aligned}$$

Therefore, the above a priori estimates imply the convergence, by the control

$$\begin{aligned} |h^{(n)}-h^{(n-1)}|_{{{\hat{r}}},\alpha } = |{{\hat{h}}}_n \circ h^{(n-1)}|_{{{\hat{r}}},\alpha } \le |{{\hat{h}}}_n |_{e^{-2},\alpha } \le |{{\hat{h}}}_n|_{r_{n-1}e^{-\delta _{n-1}/3},\alpha } \le 2^{-(n-1)}e^{-2}\delta _0/3,\quad \forall n. \end{aligned}$$

So \(h^{(n)}\) is a convergent sequence in \(|\cdot |_{{{\hat{r}}},\alpha }\), which completes the proof. \(\square \)
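The convergence claim amounts to a geometric tail estimate. As a sketch, assuming (as is standard for weighted \(\ell ^1\)-type norms) that the space of formal series with finite \(|\cdot |_{{{\hat{r}}},\alpha }\) norm is complete, we have for \(m>n\ge 1\)

$$\begin{aligned} |h^{(m)}-h^{(n)}|_{{{\hat{r}}},\alpha } \le \sum _{j=n+1}^{m}|h^{(j)}-h^{(j-1)}|_{{{\hat{r}}},\alpha } \le \frac{e^{-2}\delta _0}{3}\sum _{j\ge n+1}2^{-(j-1)} = \frac{e^{-2}\delta _0}{3}\,2^{-(n-1)}, \end{aligned}$$

which tends to zero as \(n\rightarrow \infty \). Hence \(h^{(n)}\) is a Cauchy sequence in this norm.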

Next we deal with the case \(-1\le \mu < 0\) by the contraction mapping principle.

Since the formal normal form is a polynomial by Proposition 15, we consider the particular form of system (1) as follows

$$\begin{aligned} \dot{x}= (D+ N)x + P(x) + R(x), \end{aligned}$$
(20)

where P and R are nonlinearities such that P is a polynomial, \(\langle P \rangle _{nr}=0\) and \(\langle R \rangle _{nr}=R\). Here N is a suitably chosen nilpotent linear part fulfilling Proposition 7(i) for \(\mu =-1\), and \(N = 0\) for \(-1<\mu < 0\). Without loss of generality, we can assume that the degree of all nonlinear monomials in R is greater than \({{\hat{q}}} = \deg (P)\), where \(\deg (P)\) is the degree of the polynomial P. If the transformation \(x= y+h(y)\) turns system (20) into its normal form

$$\begin{aligned} \dot{y} = (D+ N)y + P(y), \end{aligned}$$

then h must satisfy

$$\begin{aligned}{}[F,h] = \partial P h - P(y+h)+P(y) - R(y+h), \end{aligned}$$
(21)

where \(F(y)=(D+N)y+P(y)\) and \([\cdot ,\cdot ]\) is the classical Lie bracket with respect to the variable y.
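Equation (21) can be derived in two lines. The sketch below assumes the sign convention \([F,h]=\partial F\, h-\partial h\, F\), which is consistent with the form of (21). Substituting \(x=y+h(y)\) into (20) and using \(\dot{y}=F(y)\) gives

$$\begin{aligned} (I+\partial h)F(y)&= (D+N)(y+h)+P(y+h)+R(y+h), \\ \partial h\, F(y)&= (D+N)h+P(y+h)-P(y)+R(y+h), \end{aligned}$$

where the second line subtracts \(F(y)=(D+N)y+P(y)\) from both sides. Since \(\partial F\, h=(D+N)h+\partial P\, h\), we obtain

$$\begin{aligned}{}[F,h] = (D+N)h+\partial P\, h-\partial h\, F = \partial P\, h-P(y+h)+P(y)-R(y+h). \end{aligned}$$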

Now we restrict our focus on the ball

$$\begin{aligned} B_r = \left\{ h~|~|h|_{r,\alpha }\le r, h= \sum _{|k|\ge s,j}h_{k,j}x^ke_j \in {\mathbb {C}}^d[[x]]~\right\} \subseteq {\mathscr {X}}_r \end{aligned}$$

equipped with the norm \(|\cdot |_{r,\alpha }\), where \(s={{\hat{q}}}+1\ge 2\) for \(\mu =-1\) and \(s={{\hat{q}}}+1\ge q\ge 2\) for \(-1<\mu <0\). Here \({{\hat{q}}}\) and q are as defined before. For an operator \({\mathcal {T}}\) acting on formal vector series h, we say that \({\mathcal {T}}\) is strongly contracting if \(|{\mathcal {T}}(0)|_{r,\alpha } = O(r^2)\) and \({\mathcal {T}}\) is Lipschitz on the ball \(B_r\) under the norm \(|\cdot |_{r,\alpha }\), with Lipschitz constant O(r) as \(r\rightarrow 0\). As usual, O(1) denotes a quantity that remains bounded in the limiting process. In this context, define the operators \({\mathcal {T}}_1\), \({\mathcal {T}}_2\) and \({\mathcal {T}}_3\) by

$$\begin{aligned} {\mathcal {T}}_1: h \mapsto \partial P h,\quad {\mathcal {T}}_2: h \mapsto P(\mathrm{Id}+h)-P,\quad {\mathcal {T}}_3: h\mapsto R(\mathrm{Id}+h). \end{aligned}$$

Hence, in terms of the above operators, equation (21) takes the equivalent form

$$\begin{aligned}{}[F,h]={\mathcal {T}}_1 (h)-{\mathcal {T}}_2 (h)-{\mathcal {T}}_3 (h). \end{aligned}$$
(22)

Next come the properties of \({\mathcal {T}}_i\) for \(i=1, 2\) and 3.

Lemma 11

Set \(f=P+R\) and \(-1\le \mu <0\). The operator \({\mathcal {T}}_i\) is strongly contracting for \(i=1, 2\) and 3, provided that \(|f|_{r_0,\alpha } < \infty \).

Proof

First we note again that

$$\begin{aligned} |g|_{r,\alpha } = \sum _{|k|\ge s, j} \frac{|g_{k,j}|}{(|k|!)^\alpha } r^{|k|} \le \max _{|k|\ge s}\left\{ (rr_0^{-1})^{|k|}\right\} \sum _{|k|\ge s, j} \frac{|g_{k,j}|}{(|k|!)^\alpha } r_0^{|k|}=r^s r_0^{-s} |g|_{r_0,\alpha }, \end{aligned}$$

provided that \(r< r_0\), \(g=\sum _{|k|\ge s,j} g_{k,j}x^ke_j \in {\mathbb {C}}^d[[x]]\) and \(|g|_{r_0,\alpha }<\infty \). Since we have assumed that the degree of all nonlinear monomials in R is greater than \({{\hat{q}}}=\deg (P)\), so is the degree of the monomials appearing in \({\mathcal {T}}_i\) for all i. Take \(s={{\hat{q}}}+1\ge 2\) for \(\mu =-1\) and \(s={{\hat{q}}}+1\ge q\ge 2\) for \(-1<\mu <0\) as above. Then, by Lemma 5 \(\mathrm{(i)}\) and the above fact, the linear operator \({\mathcal {T}}_1\) satisfies

$$\begin{aligned} |{\mathcal {T}}_1(h)|_{r,\alpha }\le |\partial P h|_{r,\alpha } \le (\ln 2)^{-1} r^{-1}|P|_{2r,\alpha }|h|_{r,\alpha } \le C_4 |f|_{r_0,\alpha } r^{s-1} |h|_{r,\alpha }, \end{aligned}$$

where \(C_4 = (\ln 2)^{-1}r_0^{-s}\) from Lemma 5(i) and \(r\le r_0/2\). In either case we have \(s-1\ge 1\), and hence the operator \({\mathcal {T}}_1\) is strongly contracting.

Next, for \({\mathcal {T}}_2\) we have \({\mathcal {T}}_2(0) =0\). Then Proposition 4 yields

$$\begin{aligned} |(y_t+h_t)^{k_t}- (y_t+{{\hat{h}}}_t)^{k_t}|_{r,\alpha }&= \left| \left( \sum _{i=0}^{k_t-1}(y_t+h_t)^{k_t-1-i}(y_t+{{\hat{h}}}_t)^{i}\right) (h_t-{{\hat{h}}}_t)\right| _{r,\alpha } \\&\le \left| \sum _{i=0}^{k_t-1}(y_t+h_t)^{k_t-1-i} (y_t+{{\hat{h}}}_t)^{i}\right| _{r,\alpha }\left| h_t-{{\hat{h}}}_t\right| _{r,\alpha } \\&\le k_t \left( r+\max \left\{ |h_t|_{r,\alpha },|{{\hat{h}}}_t|_{r,\alpha }\right\} \right) ^{k_t-1}|h_t-{{\hat{h}}}_t|_{r,\alpha }, \end{aligned}$$

where \(h=(h_1,\ldots ,h_d)\), \({{\hat{h}}}=({{\hat{h}}}_1,\ldots ,{{\hat{h}}}_d)\), \(k=(k_1,\ldots ,k_d)\) and t is fixed. In this way, we obtain that

$$\begin{aligned}&|{\mathcal {T}}_2(h)-{\mathcal {T}}_2({{\hat{h}}})|_{r,\alpha }\\&\quad \le \sum _{|k|\ge s,j}\sum _{t=1}^d |P_{k,j}| |(y_1+h_1)^{k_1}\cdots \left( (y_t+h_t)^{k_t}\right. \nonumber \\&\qquad \left. -\,(y_t+{{\hat{h}}}_t)^{k_t}\right) (y_{t+1}+{{\hat{h}}}_{t+1})^{k_{t+1}}\cdots (y_d+{{\hat{h}}}_d)^{k_d} e_j |_{r,\alpha } \nonumber \\&\quad \le \sum _{|k|\ge s,j} \sum _{t=1}^d |P_{k,j}|(|k|!)^{-\alpha } k_t (2r)^{k_t-1}|h_t-{{\hat{h}}}_t|_{r,\alpha }(2r)^{k_1+\cdots +k_{t-1}+k_{t+1}+\cdots +k_d} \\&\quad \le \sum _{|k|\ge s,j} |P_{k,j}|(|k|!)^{-\alpha } (2r)^{|k|-1} |k| |h-{{\hat{h}}}|_{r,\alpha } \\&\quad \le C_5 r^{s-1}|f|_{r_0,\alpha } |h-{{\hat{h}}}|_{r,\alpha } \end{aligned}$$

from Corollary 3, where \(h, {{\hat{h}}}\in B_r\), \(r\le r_0/4\) and \(C_5= 2^{s-1}r_0^{-s} \max _{|k|\ge s,j} 2^{-(|k|-s)}|k|\). By similar arguments, \({\mathcal {T}}_3\) is strongly contracting as well. This completes the proof. \(\square \)

With the aid of the above lemma and Proposition 7, we can finally solve (22).

Theorem 12

Assume that system (1) is formal Gevrey-\(\alpha \) (\(\alpha \ge 0\)). Then under condition \(\mathrm{(C1)}\), or under condition \(\mathrm{(C2)}\) with \(-1<\mu <0\) and \(\alpha \ge \frac{\mu +1}{q-1}\), there exists a formal Gevrey-\(\alpha \) coordinate substitution which turns system (1) into its normal form.

Proof

As we have shown, the existence of the change of coordinates is equivalent to the solvability of the operator equation (22). Rewriting (22), we obtain

$$\begin{aligned} h = ad_F^{-1}\left( {\mathcal {T}}_1 (h)-{\mathcal {T}}_2 (h)-{\mathcal {T}}_3 (h)\right) , \end{aligned}$$

where \(ad_F (\cdot ) = [F,\cdot ]\) and \({\mathcal {T}}_i\) is as defined above for \(i=1,2,3\). Notice that no resonance occurs in \(B_r\). Therefore, by Proposition 7(i), the operator \(ad_F^{-1}\) is bounded for \(\mu =-1\). Note that \(|P|_{r,\alpha } = O(r^q)\) as \(r\rightarrow 0\). So the condition of Proposition 7(ii) is also satisfied, which means that \(ad_F^{-1}\) is bounded for \(-1<\mu <0\), provided that we take r small enough. By Lemma 11, the operators \({\mathcal {T}}_i\) are strongly contracting for \(i=1,2,3\), and so is \(ad_F^{-1}\circ ({\mathcal {T}}_1 -{\mathcal {T}}_2 -{\mathcal {T}}_3 )\). Thus, we can choose \({{\hat{r}}}>0\) small enough such that \(ad_F^{-1}\circ ({\mathcal {T}}_1 -{\mathcal {T}}_2 -{\mathcal {T}}_3 )\) maps \(B_{{{\hat{r}}}}\) into itself and the Lipschitz constant of this operator is less than 1. The contraction mapping principle then completes the proof. \(\square \)
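The choice of \({{\hat{r}}}\) can be made explicit. The following is a sketch under stated assumptions: the constants B, M and L are hypothetical names introduced here, with B a bound for the linear operator \(ad_F^{-1}\) that is uniform for small r, and with \({\mathcal {T}}={\mathcal {T}}_1-{\mathcal {T}}_2-{\mathcal {T}}_3\) satisfying \(|{\mathcal {T}}(0)|_{r,\alpha }\le Mr^2\) and having Lipschitz constant at most Lr on \(B_r\). Then for \(h,{{\hat{h}}}\in B_r\),

$$\begin{aligned} |ad_F^{-1}{\mathcal {T}}(h)-ad_F^{-1}{\mathcal {T}}({{\hat{h}}})|_{r,\alpha }&\le BLr\,|h-{{\hat{h}}}|_{r,\alpha }, \\ |ad_F^{-1}{\mathcal {T}}(h)|_{r,\alpha }&\le B\left( |{\mathcal {T}}(0)|_{r,\alpha }+Lr|h|_{r,\alpha }\right) \le B(M+L)r^2. \end{aligned}$$

Hence any \({{\hat{r}}}\) with \(0<{{\hat{r}}}<1/(B(M+L))\) gives \(B(M+L){{\hat{r}}}^2\le {{\hat{r}}}\) and \(BL{{\hat{r}}}<1\), so the operator maps \(B_{{{\hat{r}}}}\) into itself and is a contraction there.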

5 Proof of the main theorem

In this part, we prove the main theorem and discuss some further considerations.

Proof of Theorem 1

Result (i) follows directly from Theorems 10 and 12. Then by Stolovitch’s arguments (Theorem 2.8, p. 252) in [9], we get (ii) and (iii). This completes the proof. \(\square \)

Finally, we consider a known result in our context, concerning Bruno type conditions (Proposition 2.5, p. 248, in [9]), under the assumption that the system can be formally linearized. Replacing the classical Bruno condition by a small divisor condition of the following form, our methods can be applied.

Theorem 13

Assume that system (1) is formal Gevrey-\(\alpha \) (\(\alpha \ge 0\)) and that there exist positive constants c and \(\nu \in (0,1)\) such that

$$\begin{aligned} |\langle k,\lambda \rangle -\lambda _j| \ge ce^{-|k|^\nu },\quad \forall (k,j)\in \Omega _{nr}. \end{aligned}$$

If D is diagonal and system (1) can be formally linearized, then the linearizing transformation can be chosen in the formal Gevrey-\(\alpha \) class.

Proof

By Proposition 4 and Lemma 5(i), we can apply the original proof for the analytic case via KAM methods in the same way, using our norms \(|\cdot |_{r,\alpha }\) instead of the classical majorant norms \(|\cdot |_r\). This completes the proof. \(\square \)

Here we note that it seems hopeless to establish a criterion similar to Lemma 5 in this setting, which suggests that the Gevrey smooth topology may be too fine for Bruno type conditions.

Finally, we give two examples to illustrate the application.

Example 1

Consider the following planar Gevrey-\(\alpha \) smooth vector fields

$$\begin{aligned} \dfrac{dx}{dt}=Ax+f(x), \end{aligned}$$
(23)

where A is hyperbolic. From Theorem 1, and using a possible constant time scaling, a Gevrey-\({{\hat{\alpha }}}\) smooth coordinate substitution yields the following smooth normal forms.

  1. (i)

    If the real parts of the eigenvalues of A are both positive or both negative, then either the normal form is

    $$\begin{aligned} \dfrac{dx_1}{dt}=k x_1+ b_k x_2^k,\quad \dfrac{dx_2}{dt}=x_2, \end{aligned}$$

    for \(b_k\ne 0\), or the system can be linearized. Moreover, in both cases we have \({{\hat{\alpha }}} = \alpha \) because \(\mu =-1\).

  2. (ii)

    If the real parts of the eigenvalues of A have different signs, then

    1. (a)

      either the normal form is

      $$\begin{aligned} \dfrac{dx_1}{dt}=-p x_1+ \sum _{t\ge k}c_t x_1(x_1^qx_2^p)^t, \quad \dfrac{dx_2}{dt}=q x_2+ \sum _{t\ge k}{{\hat{c}}}_t x_2(x_1^qx_2^p)^t, \end{aligned}$$

      for \(c_k\ne 0\) and \(p, q\in {\mathbb {Z}}_+\), in which case \(\hat{\alpha }=\max \{\alpha ,\frac{(q+p)k+1}{(q+p)k}\}\) for \(\mu =0\); or it can be formally linearized, i.e. the normal form is

      $$\begin{aligned} \dfrac{dx_1}{dt}=-p x_1,\quad \dfrac{dx_2}{dt}=q x_2, \end{aligned}$$

      and \({{\hat{\alpha }}}=\alpha \).

    2. (b)

      or the normal form is

      $$\begin{aligned} \dfrac{dx_1}{dt}=-\mu x_1, \quad \dfrac{dx_2}{dt}= x_2, \end{aligned}$$

      where \(\mu >0\) is irrational. Moreover, when \((-\mu ,1)\) fulfils the Bruno condition, we have \(\alpha ={{\hat{\alpha }}}\). In other cases, the transformation is only \(C^\infty \).

Example 2

Now we add one more dimension to case \(\mathrm{(i)}\) of Example 1 by taking \(A=\mathrm{diag}\lambda = \mathrm{diag}(p,1,-\xi )\) in system (23), where \(p\in {\mathbb {Z}}_+{\setminus }\{1\}\) and \(\xi >0\) is irrational and Diophantine, i.e. we have that

$$\begin{aligned} |l_1\xi +l_2| \ge c|l|^{-\mu }, \end{aligned}$$

for \(l=(l_1,l_2)\in {\mathbb {Z}}^2{\setminus }\{0\}\) and \(|l|=|l_1|+|l_2|\). Then for any \(k=(k_1,k_2,k_3)\in {\mathbb {Z}}_+^3\) with \(|k|\ge 2\), a simple computation shows that \( \Omega _{r}=\{(0,p,0)\}\) and that condition (C2) is fulfilled with the same \(\mu \). So we have the smooth normal form

$$\begin{aligned} \dfrac{dx_1}{dt}=p x_1+ b_p x_2^p,\quad \dfrac{dx_2}{dt}=x_2,\quad \dfrac{dx_3}{dt}=-\xi x_3, \end{aligned}$$

by a Gevrey-\({{\hat{\alpha }}}\) smooth change for \(\hat{\alpha }=\max \{\alpha , \frac{\mu +p}{p-1}\}\).
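The resonance set in Example 2 can be verified by hand. The following sketch assumes that \(p\ge 2\) is an integer and that \(\Omega _r\) collects the multi-indices k with \(|k|\ge 2\) for which \(\langle k,\lambda \rangle =\lambda _j\) for some j. With \(\lambda =(p,1,-\xi )\) we have

$$\begin{aligned} \langle k,\lambda \rangle -\lambda _1&= (k_1-1)p+k_2-k_3\xi , \\ \langle k,\lambda \rangle -\lambda _2&= k_1p+(k_2-1)-k_3\xi , \\ \langle k,\lambda \rangle -\lambda _3&= k_1p+k_2-(k_3-1)\xi . \end{aligned}$$

Since \(\xi \) is irrational, each expression vanishes only if the coefficient of \(\xi \) and the integer part vanish separately. For \(j=1\) this forces \(k_3=0\) and \((k_1-1)p+k_2=0\), i.e. \(k=(1,0,0)\) or \(k=(0,p,0)\). For \(j=2\) it forces \(k=(0,1,0)\), and for \(j=3\) it forces \(k=(0,0,1)\). Discarding the indices with \(|k|=1\) leaves only \(k=(0,p,0)\) with \(j=1\), which corresponds to the single resonant term \(b_px_2^p\) in the first equation.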