1 Introduction

In this paper we study the local in time solvability of the Cauchy problem associated to the following quasilinear Schrödinger equation

$$\begin{aligned} \textrm{i}u_{t}+(\partial _{\overline{u}}F)(u,\nabla u)-\sum _{j=1}^{d}\partial _{x_j}\big (\partial _{\overline{u}_{x_{j}}}F\big )(u,\nabla u)=0 \,, \end{aligned}$$
(1.1)

where \(u=u(t,x)\), \(x=(x_1,\ldots ,x_{d})\in \mathbb {T}^{d}:=(\mathbb {R}/2\pi \mathbb {Z})^{d}\), and where \(\partial _{u}:=(\partial _{\textrm{Re}(u)}-\textrm{i}\partial _{\textrm{Im}(u)})/2\) and \(\partial _{\bar{u}}:=(\partial _{\textrm{Re}(u)}+\textrm{i}\partial _{\textrm{Im}(u)})/2\) denote the Wirtinger derivatives. The function \(F(y_0,y_1,\ldots ,y_{d})\) is a real valued polynomial in the complex variables \((y_0,\ldots ,y_d)\in {{\mathbb {C}}}^{d+1}\), and \( \nabla u=(\partial _{x_{1}} u, \ldots , \partial _{x_d}u) \) is the gradient.

Notice that Eq. (1.1) is Hamiltonian, i.e.

$$\begin{aligned} u_{t}=\textrm{i}\nabla _{\bar{u}}H(u,\bar{u})\,, \quad H(u,\bar{u}):=\int _{\mathbb {T}^{d}}F(u,\nabla u) \hbox {d}x\,, \end{aligned}$$
(1.2)

where \(\nabla _{\bar{u}}:=(\nabla _{\textrm{Re}(u)}+\textrm{i}\nabla _{\textrm{Im}(u)})/2\) (consistently with the convention for \(\partial _{\bar{u}}\) above) and \(\nabla \) denotes the \(L^{2}\)-gradient. We assume that the function F, defining the non-linearity, satisfies the following ellipticity condition: there exists \(\texttt {c}>0\) such that for any \(\xi =(\xi _1,\ldots ,\xi _{d})\in \mathbb {R}^{d}\), \(y=(y_0,\ldots ,y_d)\in \mathbb {C}^{d+1}\) one has

$$\begin{aligned} \sum _{j,k=1}^{d}\xi _{j}\xi _{k}\partial _{y_j} \partial _{\overline{y}_{k}}F(y) -\left| \sum _{j,k=1}^{d}\xi _{j}\xi _{k}\partial _{\overline{y}_j}\partial _{\overline{y}_k}F(y) \right| \ge \texttt {c}|\xi |^2\,. \end{aligned}$$
(1.3)
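
For instance, for the quadratic density \(F(u,\nabla u)=|\nabla u|^{2}=\sum _{j=1}^{d}u_{x_j}\overline{u}_{x_j}\) one has \(\partial _{\overline{u}}F=0\) and \(\partial _{\overline{u}_{x_j}}F=u_{x_j}\), so that (1.1) reduces to the linear Schrödinger equation \(\textrm{i}u_{t}-\Delta u=0\); moreover \(\partial _{y_j}\partial _{\overline{y}_k}F=\delta _{jk}\) and \(\partial _{\overline{y}_j}\partial _{\overline{y}_k}F=0\), so that (1.3) holds with \(\texttt {c}=1\).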

Note that this condition implies that

$$\begin{aligned}&\sum _{j,k=1}^{d}\xi _{j}\xi _{k}\partial _{y_j} \partial _{\overline{y}_{k}}F(y)\ge \texttt {c}|\xi |^2,\\&\quad \left( \sum _{j,k=1}^{d}\xi _{j}\xi _{k}\partial _{y_j} \partial _{\overline{y}_{k}}F(y)\right) ^2 -\left| \sum _{j,k=1}^{d}\xi _{j}\xi _{k}\partial _{\overline{y}_j}\partial _{\overline{y}_{k}}F(y) \right| ^2 \ge \texttt {c}^2|\xi |^4, \end{aligned}$$

which are the assumptions made in the paper [9]: the first bound follows by dropping the modulus in (1.3), while the second follows by factorizing its left hand side as the product of the two quantities \(\sum _{j,k}\xi _{j}\xi _{k}\partial _{y_j}\partial _{\overline{y}_{k}}F\mp \big |\sum _{j,k}\xi _{j}\xi _{k}\partial _{\overline{y}_j}\partial _{\overline{y}_{k}}F\big |\), each bounded from below by \(\texttt {c}|\xi |^2\). The main result of this paper is the following local well-posedness theorem on Sobolev spaces (see Sect. 2).

Theorem 1.1

(Local well-posedness) Let \(s\ge \mathfrak {s}_0> d/2+3\), consider the Eq. (1.1) with initial condition \(u(0,x)=u_0(x)\) in \(H^s(\mathbb {T}^d;\mathbb {C})\), and with F satisfying (1.3). There exists a time \(0<T=T(\Vert u_0\Vert _{\mathfrak {s}_0})\) and a unique solution \(u(t,x)\) of (1.1) in \(C^0([0,T),H^s) \cap C^1([0,T),H^{s-2})\). Moreover the solution map \(u_0(x)\mapsto u(t,x)\) is continuous with respect to the \(H^s\) topology for any t in [0, T).

Theorem 1.1 improves the results in [8, 9] in terms of the required regularity on the initial datum. Furthermore the time of existence of the solutions depends only on the low norm \(\Vert \cdot \Vert _{\mathfrak {s}_0}\). The regularity threshold \(d/2+3\) is the same as the one required by Marzuola et al. [15] (which improves the pioneering result by Kenig et al. [14]) for general quasilinear Schrödinger equations on \({{\mathbb {R}}}^d\). Moreover, as in [15], when the function defining the Hamiltonian has the special form \(F(u,\nabla u)=|\nabla h(|u|^2)|^2\) we can improve the result, requiring only \(\mathfrak {s}_0>d/2+2\). We have the following.

Theorem 1.2

If, in addition to the hypotheses of Theorem 1.1, one has \(F(u,\nabla u)=|\nabla h(|u|^2)|^2\) for some polynomial function \(h:{{\mathbb {C}}}\rightarrow {{\mathbb {C}}}\), then the result of Theorem 1.1 holds true for any \(\mathfrak {s}_0>d/2+2\).

The equations satisfying the hypotheses of Theorem 1.2 are the ones considered in [5, 7, 11].

A very efficient way to prove the well-posedness of a quasilinear equation is to paralinearize it and to consider a quasilinear iterative scheme à la Kato [13] to build the solutions. The key step, in order to pass to the limit in the iterative scheme, is to obtain some a priori estimates on the solutions of the paralinearized system. This method has been refined recently, see for instance [2, 8, 9, 15].

In [9] the authors introduce a non sharp paradifferential calculus on \({{\mathbb {T}}}^d\), which has been refined by Berti et al. [2]. Moreover in [9], to obtain a priori estimates, the authors perform some changes of coordinates of \(H^s\) in order to diagonalize the paralinearized system at the positive orders. In [2], instead of changing the unknown, the authors, inspired also by Alazard et al. [1], introduce a modified norm on \(H^s\) which is tailored to the problem and which allows them to obtain the wanted estimates. This is less expensive in terms of regularity; for this reason we follow this approach closely. The main difference with respect to [2] is that the matrix of symbols which diagonalizes the principal order, see (4.5), depends on \(\xi \), u and \(\nabla u\), while in [2] (and in the case that F satisfies the hypotheses of Theorem 1.2) it depends only on u. The \(\xi \)-dependence imposes a lower order correction in the diagonalization process, see (4.6), in order to have a well defined parametrix with a remainder gaining two derivatives. In [2], to show the existence for the paralinearized problem, the authors use a finite rank projection with a cut-off which is tailored to the problem. It is not clear to the author of the present paper if this approach works for the lower order symbolic calculus, which is needed here for the aforementioned reasons for the general Eq. (1.1) (it clearly works in the case that the nonlinearity satisfies the hypotheses of Theorem 1.2). We use, instead, the artificial viscosity approximation to build the solutions. A Gårding type inequality is therefore needed, see Lemma 4.8, to close the energy estimates.

Equation (1.1) is slightly more general than the equation considered in [9], where the presence of the linear term \(\Delta u\) was required. The presence of this linear term was used in an essential way in the proof of the existence of the solutions of the paralinearized problem.

We conclude this introduction by mentioning the recent paper by Jeong and Oh [12] about the ill-posedness in \({{\mathbb {R}}}^d\) of the problem in the case that (1.3) is not satisfied. We also mention the paper by Christ [4], in which the author gives some examples of non Hamiltonian Schrödinger type equations which are ill posed on the circle.

2 Paradifferential calculus

We fix some notation concerning Sobolev spaces. We denote by \(H^{s}(\mathbb {T}^d;\mathbb {C})\) (resp. \(H^{s}(\mathbb {T}^d;\mathbb {C}^{2})\)) the usual Sobolev space of functions \(\mathbb {T}^{d}\ni x \mapsto u(x)\in \mathbb {C}\) (resp. \({{\mathbb {C}}}^{2}\)). We expand a function u(x), \(x\in \mathbb {T}^{d}\), in Fourier series as

$$\begin{aligned} u(x) = {(2\pi )^{-{d}/{2}}} \sum _{n \in \mathbb {Z}^{d} } \hat{u}(n)e^{\textrm{i}n\cdot x } \, , \qquad \hat{u}(n) := {(2\pi )^{-{d}/{2}}} \int _{\mathbb {T}^{d}} u(x) e^{-\textrm{i}n \cdot x } \, \hbox {d}x \, . \end{aligned}$$
(2.1)

We also use the notation \(u_n := \hat{u}(n)\) and \( \overline{u_n} := \overline{\hat{u}(n)} \,. \) We set \(\langle j \rangle :=\sqrt{1+|j|^{2}}\) for \(j\in \mathbb {Z}^{d}\). We endow \(H^{s}(\mathbb {T}^{d};\mathbb {C})\) with the norm

$$\begin{aligned} \Vert u(\cdot )\Vert _{s}^{2}:=\sum _{j\in \mathbb {Z}^{d}}\langle j\rangle ^{2s}|u_{j}|^{2}\,. \end{aligned}$$
(2.2)

For \(U=(u_1,u_2)\in H^{s}(\mathbb {T}^d;\mathbb {C}^{2})\) we just set \(\Vert U\Vert _{{s}}=\Vert u_1\Vert _{{s}}+\Vert u_2\Vert _{{s}}\). We shall also write the norm in (2.2) as

$$\begin{aligned} \Vert u\Vert ^{2}_{{s}}=(\langle D\rangle ^{s}u,\langle D\rangle ^{s} u)_{L^{2}}\,, \qquad \langle D\rangle e^{\textrm{i}j\cdot x}=\langle j\rangle e^{\textrm{i}j\cdot x}\,,\;\;\; \textrm{for}\; \textrm{all}\,\, \, j\in \mathbb {Z}^{d}\,, \end{aligned}$$

where \((\cdot ,\cdot )_{L^{2}}\) denotes the standard complex \(L^{2}\)-scalar product

$$\begin{aligned} (u,v)_{L^{2}}:=\int _{\mathbb {T}^{d}}u\bar{v}\hbox {d}x\,, \qquad \textrm{for}\; \textrm{all}\,\,\, u,v\in L^{2}(\mathbb {T}^{d};\mathbb {C})\,. \end{aligned}$$
(2.3)

We introduce also the product spaces

$$\begin{aligned} \mathcal {H}^s=\{ U=(u^+, u^-)\in H^s\times H^s : u^+=\overline{u^-} \}\,. \end{aligned}$$
(2.4)

With an abuse of notation we shall denote by \(\Vert \cdot \Vert _{{s}}\) the natural norm of the product space \(\mathcal {H}^{s}\). On the space \( \mathcal {H}^s\) we naturally extend the scalar product (2.3) as

$$\begin{aligned} (Z,W)_{L^{2}\times L^{2}}:=\textrm{Re}(z,w)_{L^{2}}=\tfrac{1}{2}\int _{{{\mathbb {T}}}^d} z\cdot \bar{w}+\bar{z}\cdot w \,\hbox {d}x\,, \qquad Z={\bigl [{\begin{matrix}z\\ \bar{z}\end{matrix}}\bigr ]}\,,\; W={\bigl [{\begin{matrix}w\\ \bar{w}\end{matrix}}\bigr ]}\in \mathcal {H}^{0}\,. \end{aligned}$$
(2.5)

We recall the following interpolation estimate

$$\begin{aligned} \Vert u\Vert _{{s}}\le \Vert u\Vert ^{\theta }_{{s_1}}\Vert u\Vert ^{1-\theta }_{{s_2}}\,, \qquad \theta \in [0,1]\,,\;\;0\le s_1\le s_2\,,\quad s=\theta s_1+(1-\theta )s_2\,. \end{aligned}$$
(2.6)

Notation. We shall use the notation \(A\lesssim B\) to denote \(A\le C B\) where C is a positive constant depending on parameters fixed once and for all, for instance d and s. We will emphasize by writing \(\lesssim _{q}\) when the constant C depends on some other parameter q. When we have both \(A\lesssim B\) and \(B\lesssim A\) we shall write \(A\sim B\).

We now recall some results concerning the paradifferential calculus. We follow [2].

Definition 2.1

Given \(m,s\in \mathbb {R}\) we denote by \(\Gamma ^m_s\) the space of functions \(a(x,\xi )\) defined on \({{\mathbb {T}}}^d\times {{\mathbb {R}}}^d\) with values in \({{\mathbb {C}}}\), which are \(C^{\infty }\) with respect to the variable \(\xi \in {{\mathbb {R}}}^d\) and such that for any multi-index \(\beta \in ({{\mathbb {N}}}\cup \{0\})^d\), there exists a constant \(C_{\beta }>0\) such that

$$\begin{aligned} \Vert \partial _{\xi }^{\beta } a(\cdot ,\xi )\Vert _{s} \le C_{\beta }\langle \xi \rangle ^{m-|\beta |}\,, \quad \textrm{for}\; \textrm{all}\,\, \xi \in {{\mathbb {R}}}^d. \end{aligned}$$
(2.7)

The elements of \(\Gamma _s^m\) are called symbols of order m.

We endow the space \(\Gamma ^m_{s}\) with the family of semi-norms

$$\begin{aligned} |a|_{m,s,n}:=\max _{|\beta |\le n}\, \sup _{\xi \in {{\mathbb {R}}}^d}\Vert \langle \xi \rangle ^{|\beta |-m}\partial ^\beta _\xi a(\cdot ,\xi )\Vert _{s}\,. \end{aligned}$$
(2.8)

Consider a function \(\chi \in C^{\infty }({{\mathbb {R}}}^d,[0,1])\) such that

$$\begin{aligned} \chi (\xi )={\left\{ \begin{array}{ll} 1 \qquad \textrm{if}\,\,|\xi |\le 1.1,\\ 0 \qquad \textrm{if}\,\,|\xi |\ge 1.9. \end{array}\right. } \end{aligned}$$

Let \(\epsilon \in (0,1)\) and define \(\chi _{\epsilon }(\xi ):=\chi (\xi /\epsilon ).\) Given a symbol \(a(x,\xi )\) in \(\Gamma ^m_s\) we define its Bony–Weyl quantization

$$\begin{aligned} T_{a}h:={Op^{\textrm{BW}}}(a(x,\xi ))h:=\sum _{j\in \mathbb {Z}^d}e^{\textrm{i}j\cdot x} \sum _{k\in \mathbb {Z}^d} \chi _{\epsilon }\left( \tfrac{|j-k|}{\langle j+k\rangle }\right) \widehat{a}\left( j-k,\tfrac{j+k}{2}\right) \widehat{h}(k). \end{aligned}$$
(2.9)
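
Note that, by the support properties of \(\chi \), the inner sum in (2.9) is restricted to the region \(|j-k|\le 1.9\,\epsilon \langle j+k\rangle \), i.e. the symbol \(a\) enters only through its \(x\)-frequencies which are small with respect to the frequencies of \(h\). This is the classical paradifferential restriction, which guarantees that \(T_{a}\) acts on Sobolev spaces losing exactly m derivatives, as stated in Theorem 2.2 below.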

We list a series of technical results that we shall systematically use throughout the paper. The following is Theorem 2.4 in [2] and concerns the action of paradifferential operators on Sobolev spaces.

Theorem 2.2

(Action on Sobolev spaces) Let \(s_0>d/2\), \(m\in {{\mathbb {R}}}\) and \(a\in \Gamma ^m_{s_0}\). Then \({Op^{\textrm{BW}}}(a)\) extends as a bounded operator from \({H}^{s}({{\mathbb {T}}}^d;{{\mathbb {C}}})\) to \({H}^{s-m}({{\mathbb {T}}}^d;{{\mathbb {C}}})\), for all \(s\in {{\mathbb {R}}}\), with the following estimate

$$\begin{aligned} \Vert {Op^{\textrm{BW}}}(a)u\Vert _{{s-m}}\lesssim |a|_{m,s_0,2(d+1)}\Vert u\Vert _{s} \qquad \textrm{for}\; \textrm{all}\,\, \,u\in {H}^s\,. \end{aligned}$$
(2.10)

Moreover for any \(\rho \ge 0\) we have

$$\begin{aligned} \Vert {Op^{\textrm{BW}}}(a)u\Vert _{{s-m-\rho }}\lesssim |a|_{m,s_0-\rho ,2(d+1)}\Vert u\Vert _{s} \qquad \textrm{for}\; \textrm{all}\,\,\, {u}\in {H}^s\,. \end{aligned}$$
(2.11)

Definition 2.3

Let \(\rho \in (0, 2]\). Given two symbols \(a\in \Gamma ^m_{s_0+\rho }\) and \(b\in \Gamma ^{ m'}_{s_0+\rho }\), we define

$$\begin{aligned} a\#_{\rho } b= {\left\{ \begin{array}{ll} ab \quad \qquad \quad \quad \,\, \rho \in (0,1]\\ ab+\frac{1}{2\textrm{i}}\{a,b\}\quad \rho \in (1,2],\\ \end{array}\right. } \end{aligned}$$
(2.12)

where we denoted by \(\{a,b\}:=\nabla _{\xi }a\cdot \nabla _xb-\nabla _xa\cdot \nabla _{\xi }b\) the Poisson bracket between symbols.

The following is Theorem 2.5 in [2]. This result concerns the symbolic calculus for the composition of Bony–Weyl paradifferential operators.

Theorem 2.4

(Composition) Let \(s_0>d/2\), \(m,m'\in \mathbb {R}\), \(\rho \in (0,2]\) and \(a\in \Gamma ^m_{s_0+\rho }\), \(b\in \Gamma ^{m'}_{s_0+\rho }\). We have

$$\begin{aligned} {Op^{\textrm{BW}}}(a)\circ {Op^{\textrm{BW}}}(b)={Op^{\textrm{BW}}}(a\#_{\rho }b)+R^{c}_{\rho }(a,b)\,, \end{aligned}$$

where the linear operator \(R^{c}_{\rho }\) maps \({H}^s({{\mathbb {T}}}^d)\) into \({H}^{s-(m+m')+\rho }({{\mathbb {T}}}^d)\), for any \(s\in {{\mathbb {R}}}\).

Moreover it satisfies the following estimate, for all \(u\in {H}^s\),

$$\begin{aligned} \Vert R^{c}_{\rho }(a,b)u\Vert _{{s-(m+m')+\rho }} \lesssim _s (|a|_{m,s_0+\rho ,N}|b|_{m',s_0,N}+|a|_{m,s_0,N}|b|_{m',s_0+\rho ,N})\Vert u\Vert _{s}, \end{aligned}$$
(2.13)

where \(N\ge 3d+4\).
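
As a simple illustration of (2.12), which is the mechanism behind the paralinearization of Sect. 3, take \(a=\textrm{i}\xi _j\) and \(b=b(x)\in \Gamma ^{0}_{s_0+2}\). Then \(\{a,b\}=\textrm{i}\,\partial _{x_j}b\), hence

$$\begin{aligned} (\textrm{i}\xi _j)\#_{2}\,b=\textrm{i}\xi _j b+\frac{1}{2\textrm{i}}\{\textrm{i}\xi _j,b\}=\textrm{i}\xi _j b+\frac{1}{2}\partial _{x_j}b\,, \end{aligned}$$

so that \({Op^{\textrm{BW}}}(\textrm{i}\xi _j)\circ {Op^{\textrm{BW}}}(b)={Op^{\textrm{BW}}}\big (\textrm{i}\xi _j b+\tfrac{1}{2}\partial _{x_j}b\big )+R^{c}_{2}(\textrm{i}\xi _j,b)\), where the remainder maps \(H^{s}\) into \(H^{s+1}\); this computation produces the first order corrections in (3.15)–(3.16) below.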

The next result is Lemma 2.7 in [2].

Lemma 2.5

(Paraproduct) Fix \(s_0>d/2\) and let \(f\in H^s\) and \(g\in H^r\) with \(s + r \ge 0\). Then

$$\begin{aligned} fg={Op^{\textrm{BW}}}(f)g+{Op^{\textrm{BW}}}({g})f+{R}^p(f,g)\,, \end{aligned}$$
(2.14)

where the bilinear operator \({R}^p: H^s \times H^r\rightarrow H^{s + r - s_0}\) satisfies the estimate

$$\begin{aligned} \Vert {R}^p(f,g)\Vert _{{s+r - s_0}} \lesssim _s \Vert f\Vert _{s} \Vert g\Vert _{r}\,. \end{aligned}$$
(2.15)

3 Paralinearization

We rewrite Eq. (1.1) as a system of paradifferential equations as done in [8, 9], but we use the paradifferential framework introduced above. For \(u\in H^{{s}}\) with \({s}>d/2+1\) we introduce the symbols

$$\begin{aligned} a_2(x,\xi )&:=a_{2}(U;x,\xi ) :=\sum _{j,k=1}^{d}(\partial _{\overline{u}_{x_k} u_{x_{j}}}F)\, \xi _{j}\xi _{k}\,\in \Gamma _{s}^2, \\ b_2(x,\xi )&:=b_{2}(U;x,\xi ) :=\sum _{j,k=1}^{d}(\partial _{\overline{u}_{x_k}\, \overline{u}_{x_{j}}}F)\, \xi _{j}\xi _{k}\,\in \Gamma _{s}^2,\\ a_1(x,\xi )&:=a_{1}(U;x,\xi ):=\frac{\textrm{i}}{2}\sum _{j=1}^{d}\Big ( (\partial _{\overline{u} u_{x_{j}}}F)-(\partial _{{u} \overline{u}_{x_{j}}}F) \Big )\xi _{j}\,\in \Gamma _{s}^1, \end{aligned}$$
(3.1)

where \(F=F(u,\nabla u)\) in (1.2). Furthermore, since the symbols may contain one derivative of the unknown u, one proves the estimates on the seminorms (recall (2.8))

$$\begin{aligned} |a_2(x,\xi )|_{2,p,\alpha }+ |b_2(x,\xi )|_{2,p,\alpha }+ |a_1(x,\xi )|_{1,p,\alpha }\lesssim _s \texttt {C}(\Vert u\Vert _{p+1}), \quad \textrm{for}\; \textrm{all}\,\,\, \tfrac{d}{2}<p. \end{aligned}$$
(3.2)
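
For instance, in the model case \(F=|\nabla u|^{2}\) the definitions (3.1) give \(a_2(x,\xi )=|\xi |^{2}\), \(b_2(x,\xi )=0\) and \(a_1(x,\xi )=0\), so that the system (3.3) below reduces to \(\dot{U}=\textrm{i}E{Op^{\textrm{BW}}}(|\xi |^{2})U\) with \(R\equiv 0\), i.e. the linear Schrödinger equation written in the variables \((u,\bar{u})\).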

We have the following.

Proposition 3.1

Fix \(s_0>d/2\). The Eq. (1.1) is equivalent to the following system:

$$\begin{aligned} \dot{U}=\textrm{i}E{Op^{\textrm{BW}}}\big (A_{2}(x,\xi )+A_{1}(x,\xi )\big )U+ R(U)U\,, \end{aligned}$$
(3.3)

where

$$\begin{aligned} U:={\bigl [{\begin{matrix}u\\ \bar{u}\end{matrix}}\bigr ]}\,,\quad E:={\bigl [{\begin{matrix}1&{}0\\ 0&{}-1\end{matrix}}\bigr ]}\,, \end{aligned}$$
(3.4)

the matrices \(A_2(x,\xi )=A_2(U;x,\xi )\), \(A_1(x,\xi )=A_1(U;x,\xi )\) have the form

$$\begin{aligned} A_2(x,\xi ):=\left( \begin{matrix}a_2(x,\xi ) &{} \quad b_{2}(x,\xi ) \\ \overline{b_{2}(x,-\xi )} &{} \quad {a_{2}(x,\xi )} \end{matrix}\right) \,, \qquad A_1(x,\xi ):=\left( \begin{matrix}a_1(x,\xi ) &{} \quad 0 \\ 0 &{} \quad \overline{a_{1}(x,-\xi )} \end{matrix}\right) \end{aligned}$$
(3.5)

and \(a_2,a_1,b_2\) are the symbols in (3.1). For any \(s\ge s_0+3\) and any \( U,V\in {{\mathcal {H}}}^{s}(\mathbb {T}^{d};\mathbb {C}^{2}) \), the remainder R satisfies the estimates

$$\begin{aligned}&\Vert R(U)U\Vert _{s}\lesssim \texttt {C}(\Vert U\Vert _{s_0+3})\Vert U\Vert _{s}\,, \end{aligned}$$
(3.6)
$$\begin{aligned}&\Vert R(U)U-R(V)V\Vert _{s}\lesssim \texttt {C}(\Vert U\Vert _{s_0+3},\Vert V\Vert _{s_0+3}) \Vert U-V\Vert _{s}+\texttt {C}(\Vert U\Vert _{s},\Vert V\Vert _{s})\Vert U-V\Vert _{s_0+3}\,, \end{aligned}$$
(3.7)
$$\begin{aligned}&\Vert R(U)U-R(V)V\Vert _{s_0+1}\lesssim \texttt {C}(\Vert U\Vert _{s_0+3},\Vert V\Vert _{s_0+3}) \Vert U-V\Vert _{s_0+1} \end{aligned}$$
(3.8)

where \(\texttt {C}(\cdot ,\cdot ):={C}(\max \{\cdot ,\cdot \})>0\), for some non decreasing function \(C(\cdot )\). Concerning the unbounded part of the equation we have for any \({\sigma }\ge 0\) and \(j=1,2\)

$$\begin{aligned} \Vert {Op^{\textrm{BW}}}(A_j(U;x,\xi ))W\Vert _{{\sigma }}&\le {C}(\Vert U\Vert _{s_0+1})\Vert W\Vert _{{\sigma }+j}, \end{aligned}$$
(3.9)
$$\begin{aligned} \Vert {Op^{\textrm{BW}}}(A_j(U;x,\xi )-A_{j}(V;x,\xi ))W\Vert _{{\sigma }}&\le {C}(\Vert U-V\Vert _{s_0+1})\Vert W\Vert _{{\sigma }+j}, \end{aligned}$$
(3.10)

for any \(U,V,W\) in \({{\mathcal {H}}}^{{\sigma }}\).

In order to prove Proposition 3.1, we first show the following lemma.

Lemma 3.2

Fix \(s_0>d/2\) and \(s\ge s_0\), and consider \(u\in H^{s}(\mathbb {T}^{d};\mathbb {C})\). Then we have that

$$\begin{aligned}&(\partial _{\overline{u}}F)(u,\nabla u)-\sum _{j=1}^{d}\partial _{x_j}\big (\partial _{\overline{u}_{x_{j}}}F\big )(u,\nabla u)=T_{\partial _{u\bar{u}}F}[u]+T_{\partial _{\bar{u}\,\bar{u}}F}[\bar{u}] \end{aligned}$$
(3.11)
$$\begin{aligned}&\quad +\sum _{j=1}^{d}\left( T_{\partial _{\bar{u}u_{x_{j}}}F}[u_{x_j}] +T_{\partial _{\bar{u}\,\overline{u}_{x_{j}}}F}[\overline{u}_{x_j}] \right) -\sum _{j=1}^{d}\partial _{x_j}\left( T_{\partial _{{u}\overline{u}_{x_{j}}}F}[u] +T_{\partial _{\bar{u}\,\overline{u}_{x_{j}}}F}[\bar{u}] \right) \end{aligned}$$
(3.12)
$$\begin{aligned}&\quad -\sum _{j=1}^{d}\partial _{x_{j}} \sum _{k=1}^{d}\left( T_{\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F}[u_{x_k}] +T_{\partial _{\overline{u}_{x_{j}}\, \overline{u}_{x_{k}}}F}[\overline{u}_{x_{k}}] \right) +R(u)u\,, \end{aligned}$$
(3.13)

where R(u)u is a remainder satisfying

$$\begin{aligned} \Vert R(u)u\Vert _{{s}}\lesssim \texttt {C}(\Vert u\Vert _{s_0+1})\Vert u\Vert _{{s}}^{2}\,, \end{aligned}$$
(3.14)

where \(\texttt {C}(\cdot )\) is a non decreasing function.

Proof

Since the nonlinearity is a polynomial, the identities (3.11)–(3.13) follow from a repeated application of the paraproduct Lemma 2.5 (for more general nonlinearities one can look at [16]). \(\square \)
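
For instance, for \(F=|u|^{4}=u^{2}\overline{u}^{2}\) one has \(\partial _{\overline{u}}F=2u^{2}\overline{u}\), and a double application of (2.14), together with Theorem 2.4, gives

$$\begin{aligned} 2u^{2}\overline{u}=T_{4u\bar{u}}[u]+T_{2u^{2}}[\bar{u}]+R(u)u =T_{\partial _{u\bar{u}}F}[u]+T_{\partial _{\bar{u}\,\bar{u}}F}[\bar{u}]+R(u)u\,, \end{aligned}$$

with \(R(u)u\) satisfying (3.14), consistently with the first two terms in (3.11).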

Proof of Proposition 3.1

Consider now the first paradifferential term in (3.13). We have, for any \(j,k=1,\ldots ,d\),

$$\begin{aligned} \partial _{x_{j}} T_{\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F}\partial _{x_k}u= {Op^{\textrm{BW}}}(\textrm{i}\xi _{j})\circ {Op^{\textrm{BW}}}(\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F)\circ {Op^{\textrm{BW}}}(\textrm{i}\xi _k)u\,. \end{aligned}$$

By applying Theorem 2.4 with \(\rho =2\), we obtain

$$\begin{aligned}&{Op^{\textrm{BW}}}(\textrm{i}\xi _{j})\,\circ \,{Op^{\textrm{BW}}}(\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F)\circ {Op^{\textrm{BW}}}(\textrm{i}\xi _k) = {Op^{\textrm{BW}}}\left( -\xi _{j}\xi _{k}\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F \right) \end{aligned}$$
(3.15)
$$\begin{aligned}&\quad +{Op^{\textrm{BW}}}\left( \frac{\textrm{i}}{2 } \xi _{k}\partial _{x_{j}}(\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F) -\frac{\textrm{i}\xi _{j}}{2}\partial _{x_{k}}(\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F)\right) \end{aligned}$$
(3.16)
$$\begin{aligned}&\quad +R(u)u\,, \end{aligned}$$
(3.17)

where \( \Vert {R}(u)u\Vert _{{s}}\lesssim \texttt {C}(\Vert u\Vert _{{s_0+3}})\Vert u\Vert _{{s}}\,,\) for some non decreasing function \(\texttt {C}(\cdot )>0\). Then

$$\begin{aligned} \begin{aligned}&-\sum _{j,k=1}^{d}\partial _{x_j} T_{\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F}\partial _{x_{k}}u= {Op^{\textrm{BW}}}\left( \sum _{j,k=1}^{d}\xi _{j}\xi _{k}\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F\right) + {R}(u) \\ {}&\qquad -\frac{\textrm{i}}{2}{Op^{\textrm{BW}}}\left( \sum _{j,k=1}^{d}\left( -\xi _{j}\partial _{x_{k}}(\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F) + \xi _{k}\partial _{x_{j}}(\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F) \right) \right) \\ {}&\quad {\mathop {=}\limits ^{(3.1)}} {Op^{\textrm{BW}}}(a_2(x,\xi ))+{R}(u) +\frac{\textrm{i}}{2}{Op^{\textrm{BW}}}\left( \sum _{j,k=1}^{d} \xi _{j}\partial _{x_{k}}\left( (\partial _{\overline{u}_{x_{j}} \,{u_{x_k}}}F)-(\partial _{\overline{u}_{x_{k}} \,{u_{x_j}}}F)\right) \right) \\&\quad ={Op^{\textrm{BW}}}(a_2(x,\xi ))+{R}(u)\,, \end{aligned} \end{aligned}$$

where we used the symmetry of the matrix \(\partial _{\overline{\nabla u}\, \nabla u}F\) (recall F is real). By performing similar explicit computations on the other summands in (3.11)–(3.13) we obtain (3.3), (3.5) with the symbols in (3.1). By the discussion above we deduce that the remainder R(U)U in (3.3) satisfies the bound (3.6). The estimates (3.7), (3.8) follow from the fact that the nonlinearity is polynomial, and therefore all the remainders are multilinear. \(\square \)

Remark 3.3

If the nonlinearity satisfies the hypotheses of Theorem 1.2, then the symbols of the paralinearized system are simpler; in particular \(a_2\) and \(b_2\) depend neither on \(\xi \) nor on \(\nabla u\):

$$\begin{aligned} a_2(x)&=\left[ h'(|u|^2)\right] ^2|u|^2\,, \qquad b_2(x)=\left[ h'(|u|^2)\right] ^2u^2,\\ a_1(x,\xi )&=\vec {a}_1(x)\cdot \xi =\left[ h'(|u|^2)\right] ^2 \sum _{j=1}^d\Im (u\bar{u}_{x_j})\xi _j\,. \end{aligned}$$
(3.18)

Since they do not contain derivatives of u, we have the improved estimates on their semi-norms

$$\begin{aligned} |a_2(x)|_{2,p,\alpha }+ |b_2(x)|_{2,p,\alpha }+ |\vec {a}_1(x)\cdot \xi |_{1,p,\alpha }\lesssim _s \texttt {C}(\Vert u\Vert _{p}), \quad \textrm{for}\; \textrm{all}\,\,\, \tfrac{d}{2}<p. \end{aligned}$$

This implies that the result of Proposition 3.1, in the case that F satisfies the hypotheses of Theorem 1.2, holds true changing \(s_0+1\rightsquigarrow s_0\) therein. In other words, the minimal regularity needed on the solution U is \(\mathfrak {s}_0>d/2+2\) instead of \(d/2+3\).

4 Modified energy and linear well posedness

In this section we shall define a norm on \(H^s\) which is almost equivalent to the standard Sobolev one and which is tailored to the system (3.3), in such a way that we shall be able to obtain a priori estimates on the solutions. We consider first a linearized, regularized, homogeneous version of (3.3)

$$\begin{aligned} \partial _t{U^{\epsilon }}=\textrm{i}E{Op^{\textrm{BW}}}\left( A_{2}({{\underline{\varvec{U}}}};x,\xi )+A_{1}({{\underline{\varvec{U}}}};x,\xi )\right) U^{\epsilon }- \epsilon \Delta ^2 U^{\epsilon }, \end{aligned}$$
(4.1)

where \(\Delta ^2U:={Op^{\textrm{BW}}}(|\xi |^4)U \) and \({{\underline{\varvec{U}}}}\) is a fixed function satisfying for \(s_0>d/2\)

$$\begin{aligned} \Vert {{\underline{\varvec{U}}}}\Vert _{L^{\infty }H^{{s}_0+3}}+\Vert \partial _t{{{\underline{\varvec{U}}}}}\Vert _{L^{\infty }H^{{s}_0+1}}\le \Theta ,\quad \Vert {{{\underline{\varvec{U}}}}}\Vert _{L^{\infty }H^{{s}_0+1}}\le r, \end{aligned}$$
(4.2)

for some \(\Theta \ge r>0\). For any \(\epsilon >0\) the Eq. (4.1) admits a unique solution defined on a small interval (depending on \(\epsilon \)); this is the content of the following lemma. This method is called artificial viscosity, or parabolic regularization; it was used directly on the nonlinear problem in [14], while it has been used on the linear problem in [6, 10].

Lemma 4.1

Let \(\sigma \ge 0\) and assume (4.2). For any \(\epsilon >0\) there exists \(T_{\epsilon }>0\) such that the following holds. For any initial condition \(U_0=U(0,x)\in {{\mathcal {H}}}^{{\sigma }}\) there exists a unique solution \(U^{\epsilon }(t,x)\) of (4.1) which belongs to the space \(C^0([0,T_{\epsilon });{{\mathcal {H}}}^{{\sigma }})\cap C^1([0,T_{\epsilon });{{\mathcal {H}}}^{{\sigma }-2})\).

Proof

We consider the operator

$$\begin{aligned} \Gamma U^{\epsilon }:= e^{-\epsilon t\Delta ^2}U_0+\int _0^te^{-\epsilon (t-t')\Delta ^2}E{Op^{\textrm{BW}}}(A_2+A_1)U^{\epsilon }(t')\hbox {d}t'. \end{aligned}$$

We have \(\Vert e^{-\epsilon t\Delta ^2}U_0\Vert _{{{\sigma }}}\le \Vert U_0\Vert _{{\sigma }}\) and \(\Vert \int _0^te^{-\epsilon (t-t')\Delta ^2} f(t',\cdot )\hbox {d}t'\Vert _{{{\sigma }}}\le t^{\frac{1}{2}}\epsilon ^{-\frac{1}{2}}\Vert f\Vert _{L^{\infty }H^{{\sigma }-2}}\). With these estimates, (3.2), (4.2) and Theorem 2.2 one may apply a fixed point argument in a suitable subspace of \(C^{0}([0,T_{\epsilon });{{\mathcal {H}}}^{{\sigma }})\) for a suitable time \(T_{\epsilon }\) (going to zero as \(\epsilon \) goes to zero). Let us prove the second of the above inequalities. We use the Minkowski inequality and the boundedness of the function \(\alpha ^{1/2}e^{-\alpha }\) for \(\alpha \ge 0\), and we get

$$\begin{aligned} \Vert \int _0^te^{-\epsilon (t-t')\Delta ^2} f(t',\cdot )\hbox {d}t'\Vert _{{{\sigma }}}&\le \int _0^t\Vert e^{-\epsilon (t-t')\Delta ^2} f(t',\cdot )\Vert _{{{\sigma }}}\hbox {d}t'\\&\quad \sim \int _0^t\sqrt{\sum _{\xi \in {{\mathbb {Z}}}^d}e^{-2\epsilon (t-t')|\xi |^4}|\xi |^{2{\sigma }}|\hat{f}(t',\xi )|^2}\hbox {d}t'\\&\lesssim \int _0^t\epsilon ^{-\frac{1}{2}}(t-t')^{-\frac{1}{2}}\Vert f(t',\cdot )\Vert _{H^{{\sigma }-2}}\hbox {d}t'\lesssim t^{\frac{1}{2}}\epsilon ^{-\frac{1}{2}}\Vert f\Vert _{L^{\infty }H^{{\sigma }-2}}. \end{aligned}$$

\(\square \)

4.1 Diagonalization at the highest order

Recalling (3.1), (3.4) and (3.5), consider the matrix \(E\widetilde{A}_{2}(x,\xi )\), with

$$\begin{aligned} {\widetilde{A}}_{2}(x,\xi ) := \left( \begin{matrix} {\widetilde{a}}_{2}(x,\xi ) &{} \quad {\widetilde{b}}_{2}(x,\xi ) \\ \overline{{\widetilde{b}}_{2}(x,-\xi )} &{} \quad {{\widetilde{a}}_{2}(x,\xi )} \end{matrix}\right) \,, \quad {\left\{ \begin{array}{ll} {\widetilde{a}}_{2}(x,\xi ):=|\xi |^{-2}a_2(x,\xi )\,,\\ {\widetilde{b}}_{2}(x,\xi ):=|\xi |^{-2}b_2(x,\xi )\,.\end{array}\right. } \end{aligned}$$
(4.3)

Define

$$\begin{aligned} \lambda (x,\xi ):= \sqrt{\widetilde{a}_{2}(x,\xi )^{2} -|\widetilde{b}_{2}(x,\xi )|^{2}}. \end{aligned}$$
(4.4)

Notice that the symbol \(\lambda \) is well-defined thanks to (1.3). The matrix of the normalized eigenvectors associated to the eigenvalues \(\pm \lambda (x,\xi )\) of \(E\widetilde{A}_{2}(x,\xi )\) is

$$\begin{aligned} S(x,\xi )&:=\left( \begin{matrix} {s}_1(x,\xi ) &\quad {s}_2(x,\xi )\\ {\overline{s_2(x,\xi )}} &\quad {{s_1(x,\xi )}} \end{matrix} \right) \,, \qquad S^{-1}(x,\xi ):=\left( \begin{matrix} {s}_1(x,\xi ) &\quad -{s}_2(x,\xi )\\ -{\overline{s_2(x,\xi )}} &\quad {{s_1(x,\xi )}} \end{matrix} \right) \,, \\ s_{1}&:=\frac{\widetilde{a}_{2} +\lambda }{\sqrt{2\lambda \left( \widetilde{a}_{2}+ \lambda \right) }}, \qquad s_{2}:=\frac{-\widetilde{b}_{2}}{\sqrt{2\lambda \left( \widetilde{a}_{2} +\lambda \right) }}\,. \end{aligned}$$
(4.5)
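
A direct computation, using \(|\widetilde{b}_{2}|^{2}=\widetilde{a}_{2}^{2}-\lambda ^{2}\), shows that

$$\begin{aligned} s_1^{2}-|s_2|^{2}=\frac{(\widetilde{a}_{2}+\lambda )^{2}-|\widetilde{b}_{2}|^{2}}{2\lambda (\widetilde{a}_{2}+\lambda )}=1\,, \end{aligned}$$

so that the matrix \(S^{-1}\) in (4.5) is indeed the inverse of \(S\); moreover one verifies that \(S^{-1}(E\widetilde{A}_{2})S=\lambda E\) pointwise in \((x,\xi )\).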

Let us also define the lower order correction

$$\begin{aligned} S_*(x,\xi )&:= \frac{1}{2\textrm{i}}\left( \begin{matrix} \{s_{2},\overline{s_{2}}\}(x,\xi ) &\quad 2\{s_{1},{s_{2}}\}(x,\xi )\\ -{2\{s_{1},\overline{s_{2}}\}(x,-\xi )} &\quad {\{s_{2},\overline{s_{2}}\}(x,-\xi )} \end{matrix} \right) S(x,\xi )\\ &:=\left( \begin{matrix} s_{1}^{*}(x,\xi ) &\quad s_{2}^{*}(x,\xi ) \\ \overline{s_{2}^{*}(x,-\xi ) } &\quad {s_{1}^{*}(x,-\xi )} \end{matrix}\right) . \end{aligned}$$
(4.6)

We needed to introduce such a lower order correction in order to build a parametrix (i.e. an invertible map up to smoothing operators) having a remainder gaining two derivatives, see Lemma 4.2. We have the following estimates on the seminorms of the symbols in the matrices above

$$\begin{aligned}&|s_1(x,\xi )|_{0,p,\alpha }+|s_2(x,\xi )|_{0,p,\alpha }\lesssim _s \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{p+1})\quad \textrm{for}\; \textrm{all}\,\,\, \tfrac{d}{2}<p,\\ &|\{m,n\}(x,\xi )|_{-1,p,\alpha }\lesssim _s \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{p+2})\,\,\,\textrm{for}\; \textrm{all}\,\,\, m,\,n\in \{s_{1},s_2,\bar{s}_2\},\,\,\, \tfrac{d}{2}<p. \end{aligned}$$
(4.7)
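
In fact, expanding the matrix Poisson bracket entrywise, i.e. setting \(\{S,S^{-1}\}_{ik}:=\sum _{l}\{S_{il},(S^{-1})_{lk}\}\), and using that \(s_1,s_2\) are even in \(\xi \), one checks that the matrix in (4.6) is precisely

$$\begin{aligned} S_{*}=-\frac{1}{2\textrm{i}}\{S,S^{-1}\}\,S\,, \end{aligned}$$

which is the correction cancelling the first order term in the composition \(S\#_{2}S^{-1}\); see the proof of Lemma 4.2 below.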

Lemma 4.2

Fix \(s_0>{d}/{2}\), there exists a linear operator \(R_{-2}({{\underline{\varvec{U}}}})[\cdot ]\) satisfying

$$\begin{aligned} \Vert R_{-2}({{\underline{\varvec{U}}}})V\Vert _{s+2}\le \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{{s}_0+3})\Vert V\Vert _{s}, \end{aligned}$$
(4.8)

for any \(s\ge {s}_0+3\) and such that

$$\begin{aligned} {Op^{\textrm{BW}}}(S(x,\xi )+S_{*}(x,\xi )){Op^{\textrm{BW}}}(S^{-1}(x,\xi ))V=V+R_{-2}({{\underline{\varvec{U}}}})V. \end{aligned}$$
(4.9)

Proof

It is a direct consequence of Theorem 2.4 used with \(\rho =2\): since \(SS^{-1}=\mathbbm {1}\) and, by the identity displayed after (4.7), \(S_{*}S^{-1}=-\tfrac{1}{2\textrm{i}}\{S,S^{-1}\}\), the first order term in the expansion \((S+S_{*})\#_{2}S^{-1}=\mathbbm {1}+\tfrac{1}{2\textrm{i}}\{S,S^{-1}\}+S_{*}S^{-1}+\cdots \) cancels, and the remainder gains two derivatives. \(\square \)

We diagonalize the highest order \(E{Op^{\textrm{BW}}}(A_2)\) (recall Proposition 3.1) by means of the parametrix constructed above.

Proposition 4.3

Fix \(s_0>d/2\), there exists a linear operator \(R_{0}({{\underline{\varvec{U}}}})[\cdot ]\), satisfying

$$\begin{aligned} \Vert R_{0}({{\underline{\varvec{U}}}})V\Vert _{s}\le \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{{s}_0+3})\Vert V\Vert _{s}\quad \textrm{for}\; \textrm{all}\,\,\, s\ge {s}_0+3, \end{aligned}$$
(4.10)

such that

$$\begin{aligned} {Op^{\textrm{BW}}}(S^{-1})E{Op^{\textrm{BW}}}({A}_2+{A}_1){Op^{\textrm{BW}}}(S+S_{*})V&=E{Op^{\textrm{BW}}}(\lambda (x,\xi )|\xi |^2)V\\&\quad +E{Op^{\textrm{BW}}}({A}_1^{+}(x,\xi ))V+R_{0}({{\underline{\varvec{U}}}})V, \end{aligned}$$
(4.11)

where \(\lambda (x,\xi )\) is the eigenvalue defined in (4.4) and \({A}_1^+(x,\xi )\) is a matrix of symbols having the form

$$\begin{aligned} \left( \begin{matrix} a_1^+(x,\xi ) &{} b_1^+(x,\xi )\\ \overline{b_1^+}(x,\xi ) &{} {a_1^+}(x,\xi ) \end{matrix}\right) , \end{aligned}$$

with \({a_1^+}\) being real and \(a_1^+, b_1^+\) odd with respect to \(\xi \). Moreover we have the estimates on the seminorms

$$\begin{aligned} |\lambda (x,\xi )|\xi |^2|_{2,p,\alpha }&\lesssim \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{p+1}),\quad {\textrm{for}\; \textrm{all}}\,\,\, \tfrac{d}{2}<p,\\ |{b_1^+}|_{1,p,\alpha }+|{a_1^+}|_{1,p,\alpha }&\lesssim \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{p+2}),\quad \textrm{for}\; \textrm{all}\,\,\, \tfrac{d}{2}<p. \end{aligned}$$
(4.12)

Proof

We set the following notation: given an operator H, we define \(\overline{H}(u):=\overline{H(\overline{u})}\); moreover, \(R_0({{\underline{\varvec{U}}}})\) denotes a remainder satisfying (4.10), which may change from line to line throughout the proof. We have

$$\begin{aligned} \begin{aligned}&{Op^{\textrm{BW}}}(S^{-1}) {Op^{\textrm{BW}}}(E {A}_1) {Op^{\textrm{BW}}}(S)=\textrm{i}E \left( \begin{matrix} C_1 &{} \quad C_2\\ \overline{C_2} &{} \quad \overline{C_1} \end{matrix} \right) \\&C_1:=T_{s_{1}}T_{a_{1}}T_{s_{1}} -T_{s_{2}}T_{a_{1}}T_{\overline{s_{2}}}\,,\qquad C_2:= T_{s_{1}}T_{a_{1}}T_{s_{2}} -T_{s_{2}}T_{a_{1}}T_{{s_{1}}}\,. \end{aligned} \end{aligned}$$

By using Theorem 2.4 with \(\rho =1\) and \(s_1^2-|s_2|^2=1\), we have \( C_1=T_{a_{1}}+R_0({{\underline{\varvec{U}}}})\,\) and \( C_2=R_0({{\underline{\varvec{U}}}}). \) By an explicit computation, using Theorem 2.4 with \(\rho =2\) and (3.2), (4.7) we have

$$\begin{aligned} {Op^{\textrm{BW}}}(S^{-1})E{Op^{\textrm{BW}}}\left( {A}_2(x,\xi )\right) {Op^{\textrm{BW}}}(S+S_*)&=E\left( \begin{matrix} B_1 &\quad B_2\\ \overline{B_2} &\quad \overline{B_1} \end{matrix} \right) \\&\quad +{Op^{\textrm{BW}}}\left( S^{-1}E{A}_2S_*\right) +R_0({{\underline{\varvec{U}}}}) \end{aligned}$$
(4.13)

where

$$\begin{aligned} B_1&:=T_{s_{1}}T_{a_{2}}T_{s_{1}}+T_{s_{1}} T_{b_{2}}T_{\overline{s_{2}}} +T_{s_{2}}T_{\overline{b_{2}}}T_{{s_{1}}} +T_{s_{2}} T_{a_{{2}}} T_{\overline{s_{2}}} \,,\\ B_2&:= T_{s_{1}}T_{a_{2}}T_{s_{2}} +T_{s_{1}}T_{b_{2}}T_{{s_{1}}} +T_{s_{2}}T_{\overline{b_{2}}}T_{{s_{2}}} +T_{s_{2}}T_{a_{2}}T_{{s_{1}}}\,. \end{aligned}$$
(4.14)

We study each term separately. By using symbolic calculus, i.e. Theorem 2.4 with \(\rho =2\), and the estimates on the seminorms (3.2), (4.7), we get

$$\begin{aligned} B_1:=T_{c_2}+T_{c_1}+R_0({{\underline{\varvec{U}}}}) \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} c_2(x,\xi )&:=a_{2}(s_1^{2}+|s_2|^{2}) +b_{2} s_1\overline{s_2}+\overline{b_{2}}s_1s_2 \,,\\ c_1(x,\xi )&:=\frac{1}{2\textrm{i}}\Big (\{s_{1}, a_{2}s_{1}\} +{s_{1}}\{a_{2}, s_{1}\} \\ {}&\quad +\{s_{1},b_{2}\overline{s_{2}}\} +{s_{1}}\{b_{2},\overline{s_{2}}\} +\{s_{2},\overline{b_{2}}s_{1}\} \\ {}&\quad + {s_{2}}\{\overline{b_{2}},s_{1}\} +\{s_{2},a_{2}\overline{s_{2}}\}+ {s_{2}}\{a_{2}, \overline{s_{2}}\}\,\Big ). \end{aligned} \end{aligned}$$

By expanding the Poisson bracket we get that

$$\begin{aligned} c_2(x,\xi )=\overline{c_2(x,\xi )}\,, \qquad c_1(x,\xi )=\overline{c_1(x,\xi )}\,, \qquad c_1(x,-\xi )=-{c_1(x,\xi )}\,\,. \end{aligned}$$

Moreover, recalling the estimates on the seminorms (3.2), (4.7), we have

$$\begin{aligned} |c_1|_{1,p,\alpha }\lesssim \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{H^{p+2}})\quad \textrm{for}\; \textrm{all}\,\, \tfrac{d}{2}<p. \end{aligned}$$

Reasoning similarly, we can develop \(B_2\) in (4.14) as \(B_2=T_{d_2}+T_{d_1}+R_0({{\underline{\varvec{U}}}})\), where

$$\begin{aligned} \begin{aligned} d_2(x,\xi )&:=a_{2}s_1s_2 +b_2 s_1^{2}+\overline{b_{2}}s_2^{2}\,,\\ d_1(x,\xi )&:=\frac{1}{2\textrm{i}}\Big (\{s_{1},a_{2}s_{2}\} +{s_{1}}\{a_{2}, s_{2}\} \\ {}&\quad +\{s_{1},b_{2} s_{1}\} +{s_{1}}\{b_{2}, s_{1}\} +\{s_{2},\overline{b_{2}}s_{2}\} \\ {}&\quad + {s_{2}}\{\overline{b_{2}},s_{2}\} +\{s_{2},a_{2}s_{1}\}+ {s_{2}}\{a_{2}, s_{1}\}\,\Big ). \end{aligned} \end{aligned}$$

By expanding the Poisson brackets we get that \(d_1(x,\xi )\equiv 0\). We now study the second summand in the right hand side of (4.13): we compute the matrix of symbols of order 1

$$\begin{aligned} S^{-1} E(A_2(x,\xi ))S_* = E\left( \begin{matrix}r_1(x,\xi ) &{} \quad r_2(x,\xi ) \\ \overline{r_2(x,-\xi )} &{} \quad \overline{r_1(x,-\xi )}\end{matrix} \right) \,, \end{aligned}$$
(4.15)

where

$$\begin{aligned} r_1(x,\xi )&:=a_{2}s_1s_1^{*}+ b_{2}s_1\overline{s_2^{*}} +\overline{b_{2}}s_2s_1^{*}+ a_{2}s_2\overline{s_2^{*}}\,,\\ r_2(x,\xi )&:= a_{2}s_1s_2^{*}+ b_{2}s_1{s_1^{*}} +\overline{b_{2}}s_2s_2^{*}+ a_{2}s_2{s_1^{*}}\,. \end{aligned}$$
(4.16)

Moreover one can check that the symbols \(r_1,r_2\) satisfy

$$\begin{aligned} r_1(x,\xi )=\overline{r_1(x,\xi )}\,, \qquad r_1(x,-\xi )=-{r_1(x,\xi )}\,\,,\quad r_2(x,-\xi )=-{r_2(x,\xi )}\,, \end{aligned}$$

and, using the estimates on the seminorms (3.2), (4.7), that

$$\begin{aligned} |r_i |_{1,p,\alpha }\lesssim \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{p+2})\,,\quad \tfrac{d}{2}<p\,,\;\;\; i=1,2\,. \end{aligned}$$
(4.17)

One concludes by noticing that \({Op^{\textrm{BW}}}(S^{-1}EA_2S)=E{Op^{\textrm{BW}}}(\lambda (x,\xi )|\xi |^2)\). \(\square \)

Remark 4.4

When the nonlinearity satisfies the hypotheses of Theorem 1.2, the eigenvalues \(\pm \lambda \) in (4.4) and the symbols in (4.5) do not depend on \(\xi \). This implies, by using Theorem 2.4 with \(\rho =2\), that \({Op^{\textrm{BW}}}(S(x))\circ {Op^{\textrm{BW}}}(S^{-1}(x))=\mathbbm {1}+R_{-2}({{\underline{\varvec{U}}}})[\cdot ]\). In other words, in this special case we do not need corrections to obtain a parametrix with a remainder gaining two derivatives; this case is indeed very similar to [2]. As a consequence we also have that \(b_1^+\) in the matrix \(A_1^+\) in (4.11) equals zero: indeed, the off diagonal terms at order one were generated by \({Op^{\textrm{BW}}}\big (S^{-1}E{A}_2S_*\big )\) (see (4.15), (4.16)), which vanishes in this context.

4.2 Diagonalization at the sub-principal order

In this section we eliminate the off diagonal terms in the matrix of symbols of order one \(A^+_1(x,\xi )\) in (4.11); this is possible by conjugating through the operators \(\mathbbm {1}\mp {Op^{\textrm{BW}}}(C(x,\xi ))\), where \(C(x,\xi )\) is a matrix of symbols of order \(-1\).

Let \(\varphi (\xi )\) be an even function in \(C^{\infty }({{\mathbb {R}}}^d;[0,1])\) with \(\varphi \equiv 1\) on \(\{|\xi |\ge 1/2\}\) and \(\varphi \equiv 0\) on \(\{|\xi |\le 1/4\}\). We define the matrix

$$\begin{aligned} C(x,\xi ):=\left( \begin{matrix} 0 &{} \quad c(x,\xi )\\ \overline{c(x,-\xi )}&{} \quad 0 \end{matrix}\right) ,\quad c(x,\xi ):=-\frac{\varphi (\xi )}{2}\frac{b^{+}_1(x,\xi )}{\lambda (x,\xi )|\xi |^2}, \end{aligned}$$
(4.18)

where \(b_1^+(x,\xi )\) is given by Proposition 4.3 and \(\lambda (x,\xi )\) is defined in (4.4). We note that, thanks to the ellipticity (1.3) and the estimates (4.12), \(c(x,\xi )\) is a well defined symbol of order \(-1\) verifying

$$\begin{aligned} |c|_{-1,p,\alpha }\lesssim \texttt {C}(\Vert {{\underline{\varvec{U}}}}\Vert _{p+2}), \quad \textrm{for all}\,\,\, \tfrac{d}{2}<p. \end{aligned}$$
(4.19)

Lemma 4.5

There exists an operator \(R_0({{\underline{\varvec{U}}}})[\cdot ]\), different from the one in Proposition 4.3, satisfying the estimate (4.10), such that

$$\begin{aligned} \begin{aligned} \left( \mathbbm {1}-{Op^{\textrm{BW}}}(C(x,\xi ))\right)&E\left( {Op^{\textrm{BW}}}(\lambda (x,\xi )|\xi |^2)+{Op^{\textrm{BW}}}(A_1^+(x,\xi ))\right) \left( \mathbbm {1}+{Op^{\textrm{BW}}}(C(x,\xi ))\right) \\&=E {Op^{\textrm{BW}}}(\lambda (x,\xi )|\xi |^2)+\mathbbm {1}{Op^{\textrm{BW}}}(a_1^+(x,\xi ))+R_0({{\underline{\varvec{U}}}}), \end{aligned} \end{aligned}$$

where \(a_1^+(x,\xi )\) is the one of Proposition 4.3 and \(\lambda (x,\xi )\) is defined in (4.4).

Proof

We begin by noticing that the operators

$$\begin{aligned} {Op^{\textrm{BW}}}(C(x,\xi ))E{Op^{\textrm{BW}}}(A_1^+(x,\xi )), \quad E{Op^{\textrm{BW}}}(A_1^+(x,\xi )){Op^{\textrm{BW}}}(C(x,\xi )), \end{aligned}$$

are bounded; their contribution may therefore be absorbed in the remainder \(R_0({{\underline{\varvec{U}}}})\) thanks to Theorem 2.2 and the estimates (4.12), (4.19). We use symbolic calculus, i.e. Theorem 2.4 with \(\rho =1\), and we obtain

$$\begin{aligned} \begin{aligned}&{Op^{\textrm{BW}}}(C(x,\xi ))E{Op^{\textrm{BW}}}(\lambda (x,\xi )|\xi |^2)\\&\quad ={Op^{\textrm{BW}}}\left( \begin{matrix} 0 &{} -|\xi |^2\lambda (x,\xi ) c(x,\xi )\\ |\xi |^2\lambda (x,\xi )\overline{c}(x,-\xi )&{}0\end{matrix} \right) +R_0({{\underline{\varvec{U}}}}),\\&E{Op^{\textrm{BW}}}(\lambda (x,\xi )|\xi |^2){Op^{\textrm{BW}}}(C(x,\xi ))\\&\quad ={Op^{\textrm{BW}}}\left( \begin{matrix} 0 &{} |\xi |^2\lambda (x,\xi ) c(x,\xi )\\ -|\xi |^2\lambda (x,\xi )\overline{c}(x,-\xi )&{}0\end{matrix} \right) +R_0({{\underline{\varvec{U}}}}) \end{aligned} \end{aligned}$$

from which, thanks to the choice of \(c(x,\xi )\) in (4.18), the off diagonal terms of order one reduce to \(b_1^+(x,\xi )(1-\varphi (\xi ))\), which is supported in \(\{|\xi |\le 1/2\}\) and hence may be absorbed in \(R_0({{\underline{\varvec{U}}}})\); this concludes the proof. \(\square \)

Remark 4.6

This step is not needed when the nonlinearity satisfies the hypotheses of Theorem 1.2, since in that case \(b_1^+\equiv 0\) by Remark 4.4.

4.3 The modified energy

We define the operator

$$\begin{aligned} {{\mathcal {C}}}:=(\mathbbm {1}-{Op^{\textrm{BW}}}(C(x,\xi ))){Op^{\textrm{BW}}}(S^{-1}(x,\xi )), \end{aligned}$$
(4.20)

where \(C(x,\xi )\) is defined in (4.18) and \(S^{-1}(x,\xi )\) is defined in (4.5). We introduce the following norm

$$\begin{aligned} \Vert U\Vert _{{{\underline{\varvec{U}}}},s}^2:=\langle {Op^{\textrm{BW}}}(\lambda ^s(x,\xi )|\xi |^{2s}){{\mathcal {C}}}U, {{\mathcal {C}}}U \rangle _{L^2}. \end{aligned}$$
(4.21)
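
In the model case \(F=|\nabla u|^{2}\) one has \(\lambda \equiv 1\), \(S=S^{-1}=\mathbbm {1}\), \(S_{*}=0\) and \(C=0\), hence \({{\mathcal {C}}}=\mathbbm {1}\) and (4.21) reduces to \(\Vert U\Vert _{{{\underline{\varvec{U}}}},{\sigma }}^2=\langle {Op^{\textrm{BW}}}(|\xi |^{2{\sigma }})U,U\rangle _{L^2}\), i.e. to the standard (homogeneous) Sobolev norm; in general (4.21) is a genuinely \({{\underline{\varvec{U}}}}\)-dependent deformation of it.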

We prove that this norm is almost equivalent to the standard Sobolev norm \(\Vert \cdot \Vert _{{\sigma }}\).

Lemma 4.7

Fix \(\sigma \ge 0\) and \(r>0\) as in (4.2). There exists a constant \(C_r>0\) such that

$$\begin{aligned} C_r^{-1}\Vert V\Vert _{\sigma }^2-\Vert V\Vert _{-2}^2\le \Vert V\Vert _{{{\underline{\varvec{U}}}},{\sigma }}^2\le C_r\Vert V\Vert _{{\sigma }}^2, \quad \textrm{for}\; \textrm{all}\,\,\, V\in {{\mathcal {H}}}^{{\sigma }}. \end{aligned}$$

Proof

The upper bound follows by Theorem 2.2 and the estimates (4.12), (4.19); indeed

$$\begin{aligned} \Vert V\Vert _{{{\underline{\varvec{U}}}},{\sigma }}^2\le \Vert {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}(x,\xi )|\xi |^{2{\sigma }}){{\mathcal {C}}}V\Vert _{-{\sigma }}\Vert {{\mathcal {C}}}V\Vert _{{\sigma }}\le C_r \Vert V\Vert _{{\sigma }}^2. \end{aligned}$$

We focus on the lower bound. Let \(\delta >0\) be such that \(s_0-\delta >d/2\); we use Theorem 2.4 with \(\rho =\delta \) and obtain

$$\begin{aligned} {Op^{\textrm{BW}}}(\lambda ^{-\frac{{\sigma }}{2}}){Op^{\textrm{BW}}}(S){Op^{\textrm{BW}}}(\lambda ^{\frac{{\sigma }}{2}}){{\mathcal {C}}}=\mathbbm {1}+R_{-\delta }({{\underline{\varvec{U}}}}), \end{aligned}$$
(4.22)

where \(\Vert R_{-\delta }({{\underline{\varvec{U}}}})V\Vert _{\sigma }\lesssim C_{\Theta }\Vert V\Vert _{\sigma -\delta }\). Analogously we obtain

$$\begin{aligned} {Op^{\textrm{BW}}}(\lambda ^{\frac{{\sigma }}{2}}){Op^{\textrm{BW}}}(|\xi |^{2{\sigma }}) {Op^{\textrm{BW}}}(\lambda ^{\frac{{\sigma }}{2}})={Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }})+R_{2{\sigma }-\delta }({{\underline{\varvec{U}}}}), \end{aligned}$$
(4.23)

where \(\Vert R_{2{\sigma }-\delta }({{\underline{\varvec{U}}}})V\Vert _{{\sigma }'-2{\sigma }+\delta }\le C_{\Theta }\Vert V\Vert _{{\sigma }'}\). We have the following chain of inequalities

$$\begin{aligned} \begin{aligned} \Vert V\Vert ^2_{{\sigma }}&{\mathop {\le }\limits ^{{({4.22})}}} \Vert {Op^{\textrm{BW}}}(\lambda ^{-\frac{{\sigma }}{2}}){Op^{\textrm{BW}}}(S){Op^{\textrm{BW}}}(\lambda ^{\frac{{\sigma }}{2}}){{\mathcal {C}}}V\Vert _{{\sigma }}^2+C_{\Theta }\Vert V\Vert _{{\sigma }-\delta }^2\\&\le C_r\Vert {Op^{\textrm{BW}}}(\lambda ^{\frac{{\sigma }}{2}}){{\mathcal {C}}}V\Vert _{{\sigma }}^2+C_{\Theta }\Vert V\Vert _{{\sigma }-\delta }^2\\&=C_{r}\langle {Op^{\textrm{BW}}}(\lambda ^{\frac{{\sigma }}{2}}){Op^{\textrm{BW}}}(|\xi |^{2{\sigma }}){Op^{\textrm{BW}}}(\lambda ^{\frac{{\sigma }}{2}}){{\mathcal {C}}}V,{{\mathcal {C}}}V\rangle +C_{\Theta }\Vert V\Vert _{{\sigma }-\delta }^2\\&{\mathop {\le }\limits ^{{({4.23})}}} C_r\Vert V\Vert _{{{\underline{\varvec{U}}}},{\sigma }}^2+C_{\Theta }\Vert V\Vert _{{\sigma }-{\delta }/{2}}^2. \end{aligned} \end{aligned}$$

By using the interpolation inequality (2.6) and the Young inequality \(ab\le p^{-1}a^p+q^{-1}b^q\) with \(p=\frac{2({\sigma }+2)}{\delta }\), \(q=\frac{2({\sigma }+2)}{2({\sigma }+2)-\delta }\), we obtain \(\Vert V\Vert _{{\sigma }-{\delta }/{2}}^2\le \eta ^{-\frac{2({\sigma }+2)}{\delta }}\Vert V\Vert _{-2}^2+\eta ^{\frac{2({\sigma }+2)}{2({\sigma }+2)-\delta }}\Vert V\Vert _{{\sigma }}^2\) for any \(\eta >0\). We conclude by choosing \(\eta \) small enough. \(\square \)

Lemma 4.8

(Gårding type inequality) Let \({{\underline{\varvec{U}}}}\) be as in (4.2); there exist \(C_{\Theta }, C_r>0\) such that

$$\begin{aligned}&\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}\Delta ^2U,{{\mathcal {C}}}U\rangle \ge C_r\Vert U\Vert _{{\sigma }+2}^2-C_{\Theta }\Vert U\Vert _{{\sigma }}^2,\\&\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U,{{\mathcal {C}}}\Delta ^2U\rangle \ge C_r\Vert U\Vert _{{\sigma }+2}^2-C_{\Theta }\Vert U\Vert _{{\sigma }}^2. \end{aligned}$$

Proof

We prove the first inequality, the second one being similar. By using symbolic calculus, i.e. Theorem 2.4 with \(\rho =1\), we obtain an operator \(R_{2\sigma +3}({{\underline{\varvec{U}}}})\) satisfying \(\Vert R_{2\sigma +3}({{\underline{\varvec{U}}}})V\Vert _{0}\le C_{\Theta }\Vert V\Vert _{2{\sigma }+3}\) such that

$$\begin{aligned} {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}{Op^{\textrm{BW}}}(|\xi |^4)-{Op^{\textrm{BW}}}(|\xi |^2){Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}{Op^{\textrm{BW}}}(|\xi |^2)=R_{2\sigma +3}({{\underline{\varvec{U}}}}). \end{aligned}$$

As a consequence of the above equation we have

$$\begin{aligned} \langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}\Delta ^2U,{{\mathcal {C}}}U\rangle&=\langle {Op^{\textrm{BW}}}(|\xi |^2){Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}{Op^{\textrm{BW}}}(|\xi |^2)U,{{\mathcal {C}}}U\rangle +\langle R_{2{\sigma }+3}({{\underline{\varvec{U}}}})U,{{\mathcal {C}}}U\rangle \\&=\Vert {Op^{\textrm{BW}}}(|\xi |^2)U\Vert ^2_{{{\underline{\varvec{U}}}},\sigma }+\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}{Op^{\textrm{BW}}}(|\xi |^2)U, R_{1}({{\underline{\varvec{U}}}}) U\rangle \\&\quad +\langle R_{2{\sigma }+3}({{\underline{\varvec{U}}}})U,{{\mathcal {C}}}U\rangle , \end{aligned}$$
(4.24)

where in the last equality we used the self-adjointness of \({Op^{\textrm{BW}}}(|\xi |^2)\) and the identity \({Op^{\textrm{BW}}}(|\xi |^2){{\mathcal {C}}}-{{\mathcal {C}}}{Op^{\textrm{BW}}}(|\xi |^2)=R_{1}({{\underline{\varvec{U}}}})\), with \(\Vert R_{1}({{\underline{\varvec{U}}}})U\Vert _{{\sigma }}\le C_{\Theta }\Vert U\Vert _{{\sigma }+1}\) for any \({\sigma }\ge 0\). By means of the duality inequality \(\langle \cdot ,\cdot \rangle \le \Vert \cdot \Vert _{-{\sigma }-1/2}\Vert \cdot \Vert _{{\sigma }+1/2}\), the action Theorem 2.2 and the estimates on the seminorms (4.12), (4.19), we infer that

$$\begin{aligned} \langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}{Op^{\textrm{BW}}}(|\xi |^2)U, R_{1}({{\underline{\varvec{U}}}}) U\rangle \le C_{\Theta }\Vert U\Vert _{\sigma +3/2}^2. \end{aligned}$$
(4.25)

Analogously, by using \(\langle \cdot ,\cdot \rangle \le \Vert \cdot \Vert _{-{\sigma }-3/2}\Vert \cdot \Vert _{{\sigma }+3/2}\) we obtain

$$\begin{aligned} \langle R_{2{\sigma }+3}({{\underline{\varvec{U}}}})U,{{\mathcal {C}}}U\rangle \le C_{\Theta }\Vert U\Vert ^2_{{\sigma }+3/2}. \end{aligned}$$
(4.26)

Recalling (4.24), we use the left inequality in Lemma 4.7 and inequalities (4.25), (4.26) to infer that

$$\begin{aligned} \langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}\Delta ^2U,{{\mathcal {C}}}U\rangle \ge C_r^{-1}\Vert U\Vert _{{\sigma }+2}^2-\Vert U\Vert _{{\sigma }}^2-C_{\Theta }\Vert U\Vert ^2_{{\sigma }+3/2}, \end{aligned}$$

from which one obtains the thesis by using the interpolation inequality \(\Vert U\Vert ^2_{{\sigma }+3/2}\le \eta \Vert U\Vert ^2_{{\sigma }+2}+\eta ^{-1}\Vert U\Vert ^2_{{\sigma }}\) with \(\eta \) small enough. \(\square \)

In the following we prove the energy estimates on the solutions of the linear problem (4.1).

Proposition 4.9

Let \({{\underline{\varvec{U}}}}\) satisfy (4.2). For any \(\sigma >0\) there exist constants \(C_r>0\) and \(C_{\Theta }>0\) such that the unique solution of (4.1) fulfills

$$\begin{aligned} \Vert U^{\epsilon }\Vert _{\sigma }^2\le C_r\Vert U_0\Vert _{\sigma }^2+C_{\Theta }\int _0^t\Vert U^{\epsilon }(\tau )\Vert _{{\sigma }}^2d\tau , \quad \textrm{for}\; \textrm{all} \,\,\,t\in [0,T). \end{aligned}$$

As a consequence one also has

$$\begin{aligned} \Vert U^{\epsilon }\Vert _{\sigma }\le C_re^{C_{\Theta }t}\Vert U_0\Vert _{\sigma }, \quad \textrm{for}\; \textrm{all} \,\,\, t\in [0,T). \end{aligned}$$
(4.27)

Proof

We take the time derivative of the energy (4.21) along the solutions of the Eq. (4.1). We obtain

$$\begin{aligned} \tfrac{\hbox {d}}{\hbox {d}t}\Vert U^{\epsilon }\Vert _{{{\underline{\varvec{U}}}},{\sigma }}^2&=\langle {Op^{\textrm{BW}}}\big (\tfrac{\hbox {d}}{\hbox {d}t}(\lambda (x,\xi )^{{\sigma }})|\xi |^{2{\sigma }}\big ){{\mathcal {C}}}U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }\rangle \\&\quad +\langle {Op^{\textrm{BW}}}(\lambda (x,\xi )^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}_t U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }\rangle +\langle {Op^{\textrm{BW}}}(\lambda (x,\xi )^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon },{{\mathcal {C}}}_t U^{\epsilon }\rangle \\&\quad +\langle {Op^{\textrm{BW}}}(\lambda (x,\xi )^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon }_t,{{\mathcal {C}}}U^{\epsilon }\rangle +\langle {Op^{\textrm{BW}}}(\lambda (x,\xi )^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }_t\rangle . \end{aligned}$$
(4.28)

The first summand in the r.h.s. of (4.28) is bounded from above by \(C_{\Theta }\Vert U^{\epsilon }\Vert _\sigma ^2\): indeed, by using (4.2),

$$\begin{aligned} |\partial _t\lambda (x,\xi )|_{0,s_0,\alpha }\lesssim \texttt {C}(\Vert \partial _{t}{{\underline{\varvec{U}}}}\Vert _{s_0})\le C_{\Theta }, \end{aligned}$$

therefore, thanks also to (4.7) and (4.19), one is in a position to use Theorem 2.2 together with the duality inequality \(\langle \cdot ,\cdot \rangle \le \Vert \cdot \Vert _{-{\sigma }}\Vert \cdot \Vert _{{\sigma }}\). By an analogous reasoning one may bound the second line of (4.28) from above by \(C_{\Theta }\Vert U^{\epsilon }\Vert _\sigma ^2\).

We focus on the third line of (4.28). By using Eq. (4.1) and setting \(A:=A_2({{\underline{\varvec{U}}}};x,\xi )+A_1({{\underline{\varvec{U}}}};x,\xi )\), we get

$$\begin{aligned} \langle&{Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon }_t,{{\mathcal {C}}}U^{\epsilon }\rangle +\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }_t\rangle \nonumber \\&\quad =\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}\textrm{i}E {Op^{\textrm{BW}}}(A)U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }\rangle +\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon },{{\mathcal {C}}}\textrm{i}E {Op^{\textrm{BW}}}(A)U^{\epsilon }\rangle \end{aligned}$$
(4.29)
$$\begin{aligned}&\qquad - \langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}\epsilon \Delta ^2U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }\rangle -\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon },{{\mathcal {C}}}\epsilon \Delta ^2U^{\epsilon }\rangle . \end{aligned}$$
(4.30)

Concerning (4.30) we may use Lemma 4.8 and bound it from above by

$$\begin{aligned} -C_r\Vert U^{\epsilon }\Vert _{{\sigma }+2}^2+C_{\Theta }\Vert U^{\epsilon }\Vert _{{\sigma }}^2\le C_{\Theta }\Vert U^{\epsilon }\Vert _{{\sigma }}^2. \end{aligned}$$

In (4.29) we need to exhibit a cancellation. By Lemma 4.2 we have

$$\begin{aligned} \begin{aligned} ({4.29})&= \langle \textrm{i}{Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}E {Op^{\textrm{BW}}}(A){Op^{\textrm{BW}}}(S+S_{*}){Op^{\textrm{BW}}}(S^{-1})U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }\rangle \\&\quad +\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon },\textrm{i}{{\mathcal {C}}}E {Op^{\textrm{BW}}}(A){Op^{\textrm{BW}}}(S+S_{*}){Op^{\textrm{BW}}}(S^{-1})U^{\epsilon }\rangle \end{aligned} \end{aligned}$$

modulo a remainder which, by means of (4.8), the Cauchy–Schwarz inequality and Theorem 2.2, is bounded from above by \(C_{\Theta }\Vert U^{\epsilon }\Vert _{\sigma }^2\). Furthermore, we plug the operator \((\mathbbm {1}+{Op^{\textrm{BW}}}(C(x,\xi )))(\mathbbm {1}-{Op^{\textrm{BW}}}(C(x,\xi )))\) between \({Op^{\textrm{BW}}}(S+S_{*})\) and \({Op^{\textrm{BW}}}(S^{-1})\) inside the scalar products above; this is possible since \((\mathbbm {1}+{Op^{\textrm{BW}}}(C(x,\xi )))(\mathbbm {1}-{Op^{\textrm{BW}}}(C(x,\xi )))\) equals the identity modulo an operator gaining two derivatives. Recalling (4.20), we are in a position to use Proposition 4.3 and Lemma 4.5 and obtain

$$\begin{aligned} ({4.29})&=\langle \textrm{i}{Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}) {Op^{\textrm{BW}}}\big (E\lambda |\xi |^{2}+\mathbbm {1}a_1^+\big ){{\mathcal {C}}}U^{\epsilon }, {{\mathcal {C}}}U^{\epsilon }\rangle \\&\quad +\langle {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}){{\mathcal {C}}}U^{\epsilon }, \textrm{i}{Op^{\textrm{BW}}}\big (E\lambda |\xi |^{2}+\mathbbm {1}a_1^+\big ){{\mathcal {C}}}U^{\epsilon }\rangle \,, \end{aligned}$$

modulo remainders which are bounded from above by \(C_{\Theta }\Vert U^{\epsilon }\Vert _{\sigma }^2\). Using the skew-adjoint character of the operator \(\textrm{i}{Op^{\textrm{BW}}}\big (E\lambda |\xi |^{2}+\mathbbm {1}a_1^+\big )\), we may write

$$\begin{aligned} ({4.29})=\left\langle \textrm{i}\left[ {Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}), {{\mathcal {A}}}\right] {{\mathcal {C}}}U^{\epsilon }, {{\mathcal {C}}}U^{\epsilon }\right\rangle \end{aligned}$$

where we have defined \({{\mathcal {A}}}:={Op^{\textrm{BW}}}(E\lambda |\xi |^{2}+\mathbbm {1}a_1^+)\), modulo terms bounded by \(C_{\Theta }\Vert U^{\epsilon }\Vert _{\sigma }^2\). By using symbolic calculus, i.e. Theorem 2.4 with \(\rho =2\), we obtain

$$\begin{aligned} \left\langle \left[ \mathbbm {1}{Op^{\textrm{BW}}}(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}), {{\mathcal {A}}}\right] {{\mathcal {C}}}U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }\right\rangle = \left\langle {Op^{\textrm{BW}}}\left( E\left\{ \lambda ^{{\sigma }}|\xi |^{2{\sigma }},\lambda |\xi |^{2}\right\} \right) {{\mathcal {C}}}U^{\epsilon },{{\mathcal {C}}}U^{\epsilon }\right\rangle , \end{aligned}$$

modulo remainders bounded by \(C_{\Theta }\Vert U^{\epsilon }\Vert _{{\sigma }}^2\). We conclude the proof by noticing that the above Poisson bracket vanishes identically, since \(\lambda ^{{\sigma }}|\xi |^{2{\sigma }}=(\lambda |\xi |^{2})^{{\sigma }}\) is a function of the symbol \(\lambda |\xi |^{2}\).

We eventually obtained \(\tfrac{\hbox {d}}{\hbox {d}t}\Vert U^{\epsilon }\Vert _{{{\underline{\varvec{U}}}},{\sigma }}^2\le C_{\Theta }\Vert U^{\epsilon }\Vert ^2_{{\sigma }}\); integrating over the time interval [0, t) we obtain

$$\begin{aligned} \Vert U^{\epsilon }\Vert _{{{\underline{\varvec{U}}}}(t),{\sigma }}^2\le \Vert U^{\epsilon }(0)\Vert _{{{\underline{\varvec{U}}}}(0),{\sigma }}^2+C_{\Theta }\int _0^{t}\Vert U^{\epsilon }(\tau )\Vert _{{\sigma }}^2d\tau \le C_r \Vert U^{\epsilon }(0)\Vert _{{\sigma }}^2+C_{\Theta }\int _0^{t}\Vert U^{\epsilon }(\tau )\Vert _{{\sigma }}^2d\tau . \end{aligned}$$

We now use Lemma 4.7 and the fact that \(\Vert \partial _{t}U^{\epsilon }\Vert _{{-2}}\le C_{\Theta }\Vert U^{\epsilon }\Vert _{{0}}\le C_{\Theta } \Vert U^{\epsilon }\Vert _{{\sigma }}\) since \(\sigma \ge 0.\) \(\square \)

We are now in a position to state and prove the following linear well-posedness result.

Proposition 4.10

Let \(\Theta \ge r>0\) and \({{\underline{\varvec{U}}}}\) be a function in \(C^0([0,T); {{\mathcal {H}}}^{s_0+3})\cap C^1([0,T); {{\mathcal {H}}}^{s_0+1})\), satisfying (4.2) with \(s_0>d/2\). Let \(\sigma \ge 0\) and R(t) be a function in \(C^0([0,T);{{\mathcal {H}}}^{{\sigma }})\). Then there exists a unique solution V in \(C^0([0,T); {{\mathcal {H}}}^{{\sigma }})\cap C^1([0,T); {{\mathcal {H}}}^{{\sigma }-2})\) of the linear inhomogeneous problem

$$\begin{aligned} \partial _t V=\textrm{i}E{Op^{\textrm{BW}}}\left( A_{2}({{\underline{\varvec{U}}}};x,\xi )+A_{1}({{\underline{\varvec{U}}}};x,\xi )\right) V+ R(t)\,, \end{aligned}$$
(4.31)

with initial condition \(V(0,x)=V_0(x)\in {{\mathcal {H}}}^{{\sigma }}\). Furthermore the solution satisfies

$$\begin{aligned} \Vert V\Vert _{L^{\infty }H^{{\sigma }}}\le C_{r,{\sigma }} e^{C_{\Theta ,{\sigma }}T}\Vert V_0\Vert _{{\sigma }}+TC_{\Theta ,{\sigma }}e^{C_{\Theta ,{\sigma }}T}\Vert R(t)\Vert _{L^{\infty }H^{{\sigma }}}. \end{aligned}$$
(4.32)

Proof

We consider the following smoothed version of the initial condition \(V_0\)

$$\begin{aligned} V_0^{\epsilon }:=\chi (\epsilon ^{\frac{1}{8}}|D|)V_0 :=\mathcal {F}^{-1}(\chi (\epsilon ^{\frac{1}{8}}|\xi |)\hat{V}_0(\xi ))\,, \end{aligned}$$

where \(\chi \) is a \(C^{\infty }\) function with compact support, equal to one on \([-1,1]\) and to zero on \({{\mathbb {R}}}\setminus [-2,2]\). We consider moreover the smoothed, homogeneous version of (4.31), i.e. Eq. (4.1). All the solutions \(V^{\epsilon }\) are defined on a common time interval [0, T) with \(T>0\) independent of \(\epsilon \), because of Proposition 4.9. We prove that the sequence \(V^{\epsilon }\) converges to a solution of the problem with \(\epsilon \) equal to zero both in the initial condition and in the equation. Let \(0<\epsilon '<\epsilon \) and set \(V^{\epsilon ,\epsilon '}=V^{\epsilon }-V^{\epsilon '}\); then

$$\begin{aligned} \partial _tV^{\epsilon ,\epsilon '}=\textrm{i}E{Op^{\textrm{BW}}}(A_2({{\underline{\varvec{U}}}};x,\xi )+A_1({{\underline{\varvec{U}}}};x,\xi ))V^{\epsilon ,\epsilon '}-\epsilon {\Delta ^2}V^{\epsilon ,\epsilon '} +{\Delta ^2}V^{\epsilon }(\epsilon -\epsilon ')\,. \end{aligned}$$
(4.33)

Thanks to the discussion above there exists the flow \(\Phi (t)\) of the equation

$$\begin{aligned} \partial _tV^{\epsilon ,\epsilon '}=\textrm{i}E{Op^{\textrm{BW}}}(A_2({{\underline{\varvec{U}}}};x,\xi )+A_1({{\underline{\varvec{U}}}};x,\xi ))V^{\epsilon ,\epsilon '}-\epsilon '{\Delta ^2}V^{\epsilon ,\epsilon '}\,, \end{aligned}$$

and its estimates are independent of \(\epsilon ,\epsilon '\). By means of the Duhamel formula, we may write the solution of (4.33) in the implicit form

$$\begin{aligned} V^{\epsilon ,\epsilon '}(t,x)=\Phi (t)(V_0^{\epsilon }-V_0^{\epsilon '})+(\epsilon '-\epsilon )\Phi (t)\int _0^t\Phi (s)^{-1}{\Delta ^2}V^{\epsilon }(s,x)\hbox {d}s\,, \end{aligned}$$

Using (4.27) we obtain \(\Vert V^{\epsilon ,\epsilon '}\Vert _{L^{\infty }H^{\sigma }}\le C (\epsilon -\epsilon ')\Vert V_0\Vert _{ H^{\sigma }}+C(\epsilon -\epsilon ')\Vert V^{\epsilon }\Vert _{L^{\infty }H^{\sigma +4}}\). We conclude by using (4.27) once more and the smoothing estimate \(\Vert V_0^{\epsilon }\Vert _{H^{\sigma +4}}\le \epsilon ^{-\frac{1}{2}}\Vert V_0\Vert _{H^{\sigma }}\), which bounds the second term by \(C\epsilon ^{\frac{1}{2}}\Vert V_0\Vert _{H^{\sigma }}\); hence \(V^{\epsilon }\) is a Cauchy sequence in \(L^{\infty }H^{{\sigma }}\). This proves that the flow of Eq. (4.31) with \(R(t)=0\) is well defined.
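The inhomogeneous case \(R(t)\ne 0\) then follows by the Duhamel formula. Explicitly, a minimal sketch: denoting by \(\Phi _0(t)\) the flow of (4.31) with \(R=0\) just constructed, and using the homogeneous bound \(\Vert \Phi _0(t)W\Vert _{{\sigma }}\le C_{r,{\sigma }}e^{C_{\Theta ,{\sigma }}t}\Vert W\Vert _{{\sigma }}\) obtained above, one has

$$\begin{aligned} V(t)=\Phi _0(t)V_0+\int _0^t\Phi _0(t)\Phi _0(s)^{-1}R(s)\,\hbox {d}s\,,\qquad \Vert V\Vert _{L^{\infty }H^{{\sigma }}}\le C_{r,{\sigma }}e^{C_{\Theta ,{\sigma }}T}\Vert V_0\Vert _{{\sigma }}+TC_{\Theta ,{\sigma }}e^{C_{\Theta ,{\sigma }}T}\Vert R\Vert _{L^{\infty }H^{{\sigma }}}\,, \end{aligned}$$

which is precisely (4.32). \(\square \)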

Remark 4.11

If the nonlinearity satisfies the hypotheses of Theorem 1.2, the modified energy in (4.20), (4.21) is defined by the simpler operator \({{\mathcal {C}}}:={Op^{\textrm{BW}}}(S^{-1}(x,\xi )).\) This modified energy is similar to the one in [2]. As a consequence of this and of Remarks 3.3, 4.4, 4.6, condition (4.2) may be replaced by

$$\begin{aligned} \Vert {{\underline{\varvec{U}}}}\Vert _{L^{\infty }H^{{s}_0+2}}+\Vert \partial _t{{{\underline{\varvec{U}}}}}\Vert _{L^{\infty }H^{{s}_0}}\le \Theta ,\quad \Vert {{{\underline{\varvec{U}}}}}\Vert _{L^{\infty }H^{{s}_0}}\le r, \end{aligned}$$
(4.34)

so that Proposition 4.10 holds for \({{\underline{\varvec{U}}}}\) satisfying (4.34).

5 Nonlinear well posedness

The material in this section is by now standard; we mostly follow [2], and analogous arguments are developed in [6, 9]. We set \({{\mathcal {A}}}({{\underline{\varvec{U}}}}):=\textrm{i}E{Op^{\textrm{BW}}}(A_2({{\underline{\varvec{U}}}};x,\xi )+A_{1}({{\underline{\varvec{U}}}};x,\xi ))\), \(U^0(t,x):=U_0(x)\), and we define the following iterative scheme (recall (3.3))

$$\begin{aligned} {{\mathcal {P}}}_n:={\left\{ \begin{array}{ll} \partial _t U^n={{\mathcal {A}}}(U^{n-1})U^n+R(U^{n-1})U^{n-1} \\ U^n(0,x)=U_0(x) \end{array}\right. },\quad \textrm{for}\; \textrm{all} \quad n\ge 1. \end{aligned}$$
(5.1)
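Each \({{\mathcal {P}}}_n\) is a linear problem in the unknown \(U^n\): the coefficients of \({{\mathcal {A}}}\) and the forcing term are frozen at the previous iterate. For instance, the first two steps of the scheme read

$$\begin{aligned} \partial _t U^{1}={{\mathcal {A}}}(U_0)U^{1}+R(U_0)U_0\,,\qquad \partial _t U^{2}={{\mathcal {A}}}(U^{1})U^{2}+R(U^{1})U^{1}\,, \end{aligned}$$

so that each step falls within the scope of the linear theory of Proposition 4.10.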

In the following lemma we prove that each problem \({{\mathcal {P}}}_n\) admits a unique solution for any \(n\ge 1\), and that the sequence of such solutions converges.

Lemma 5.1

Let \(U_0\in {{\mathcal {H}}}^{s}\) with \(s>d/2+3\), fix \(s_0>d/2\) and set \(r=2\Vert U_0\Vert _{H^{s_0+1}}\). There exists \(T:=T(s,\Vert U_0\Vert _{s_0+3})>0\) such that for any \(n\ge 1\) the following holds.

\((S1)_n\):

The problem \({{\mathcal {P}}}_n\) admits a unique solution \(U^n\) belonging to the functional space \(C^0([0,T);{{\mathcal {H}}}^{s})\cap C^1([0,T);{{\mathcal {H}}}^{s-2})\).

\((S2)_n\):

There exists a constant \(C_r>0\) such that, setting \(\Theta := 4C_r\Vert U_0\Vert _{s_0+3}\) and \(M:= 4C_r\Vert U_0\Vert _{s}\), we have, for any \(1\le m\le n\),

$$\begin{aligned}&\Vert U^m\Vert _{L^{\infty }H^{s_0+1}}\le r, \end{aligned}$$
(5.2)
$$\begin{aligned}&\Vert U^m\Vert _{L^{\infty }H^{s_0+3}}\le \Theta , \quad \Vert \partial _tU^m\Vert _{L^{\infty }H^{s_0+1}}\le C_r \Theta , \end{aligned}$$
(5.3)
$$\begin{aligned}&\Vert U^m\Vert _{L^{\infty }H^{s}}\le M, \quad \quad \Vert \partial _tU^m\Vert _{L^{\infty }H^{s-2}}\le C_r M. \end{aligned}$$
(5.4)
\((S3)_n\):

For any \(1\le m\le n\) we have \(\Vert U^m-U^{m-1}\Vert _{L^{\infty }H^{s_0+1}}\le 2^{-m}r\).

Proof

We proceed by induction on \(n\ge 1.\) We prove \((S1)_1\) by using Proposition 4.10 with \(R(t):=R(U_0)U_0\). Moreover, by using (4.32), we have

$$\begin{aligned} \Vert U^1\Vert _{L^{\infty }{ H^{{\sigma }}}}\le C_{r,{\sigma }}e^{C_{\Theta }T}\Vert U_0\Vert _{{\sigma }}+TC_{\Theta }e^{TC_{\Theta }}\Vert U_0\Vert _{{\sigma }}, \end{aligned}$$

for any \(\sigma \ge s_0+1\). By using the latter inequality with \({\sigma }= s_0+3\), the choice of \(\Theta \) in the statement and \(T>0\) such that \(TC_{\Theta } e^{TC_{\Theta }}\le 1/4\), we obtain the first inequality in (5.3) at the level \(n=1\). To obtain the second one, we use directly Eq. (5.1) together with the estimates (3.6), (3.9) and the first inequality in (5.3). Estimate (5.4) at the level \(n=1\) is obtained similarly. To prove (5.2) at the level \(n=1\), we note that \(V^1:=U^1-U_0\) solves the equation

$$\begin{aligned} \partial _t V^1={{\mathcal {A}}}(U_0)V^1+R(U_0)U_0+{{\mathcal {A}}}(U_0)U_0, \end{aligned}$$

with initial condition \(V^1(0,x)=0\). Using (4.32), (3.9) and (3.6), we may bound \(\Vert V^1\Vert _{L^{\infty }H^{s_0+1}}\) by \(C_rC_{\Theta }\Theta Te^{C_{\Theta }T}\); therefore we obtain (5.2) at the level \(n=1\), as well as \((S3)_1\), by choosing T in such a way that \(C_rC_{\Theta }\Theta Te^{C_{\Theta }T}\le r/2\).

We now assume that \((S1)_{n-1}\), \((S2)_{n-1}\), \((S3)_{n-1}\) hold true, and we prove them at the level n. Owing to (5.2), (5.3), (5.4) at the level \(n-1\), we can use Proposition 4.10 in order to show \((S1)_n\). We now prove (5.3). By using (4.32) with \(\sigma =s_0+3\), we obtain

$$\begin{aligned} \begin{aligned} \Vert U^n\Vert _{L^{\infty }H^{{\sigma }}}&\le C_{r,{\sigma }} e^{C_{\Theta }T}\Vert U_0\Vert _{{\sigma }}+TC_{\Theta ,{\sigma }}e^{C_{\Theta }T}\Vert R(U^{n-1})U^{n-1}\Vert _{L^{\infty }H^{{\sigma }}}\\&\le e^{C_{\Theta }T}\tfrac{\Theta }{4}+TC_{\Theta } e^{TC_{\Theta }}, \end{aligned} \end{aligned}$$

therefore we obtain the first inequality in (5.3) by choosing \(T\) small enough with respect to \(\Theta \). The second inequality in (5.3) follows by reasoning as in the case \(n=1\). Estimate (5.4) is obtained in a similar way, by using Proposition 4.10 with \({\sigma }=s\). Estimate (5.2) is a consequence of \((S3)_n\), which we now prove. Set \(V^n:=U^{n}-U^{n-1}\); we have

$$\begin{aligned} \begin{aligned}&{\left\{ \begin{array}{ll} \partial _t V^{n}={{\mathcal {A}}}(U^{n-1})V^n+f_n, \\ V^n(0,x)=0 \end{array}\right. } \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} f_n:&=\left[ {{\mathcal {A}}}(U^{n-1})-{{\mathcal {A}}}(U^{n-2})\right] U^{n-1}+R(U^{n-1})U^{n-1}-R(U^{n-2})U^{n-2}. \end{aligned} \end{aligned}$$

We note that, by means of (3.10) and (3.8), we have

$$\begin{aligned} \Vert f_n\Vert _{s_0+1}\lesssim C_{\Theta }\Vert V^{n-1}\Vert _{s_0+1}. \end{aligned}$$

We are now in a position to apply Proposition 4.10 with \({\sigma }=s_0+1\) and \(R(t):= f_{n}\), obtaining

$$\begin{aligned} \Vert V^n\Vert _{L^{\infty }H^{s_0+1}}\le TC_{\Theta ,{\sigma }}e^{C_{\Theta }T}\Vert f_n\Vert _{L^{\infty }H^{s_0+1}}\le e^{C_{\Theta }T}TC_{\Theta }\Vert V^{n-1}\Vert _{s_0+1}, \end{aligned}$$

and we conclude by choosing \(T\) small enough, so that \(TC_{\Theta }e^{C_{\Theta }T}\le 1/2\) and hence, by induction, \(\Vert V^n\Vert _{L^{\infty }H^{s_0+1}}\le 2^{-1}\Vert V^{n-1}\Vert _{L^{\infty }H^{s_0+1}}\le 2^{-n}r\). \(\square \)

Remark 5.2

In view of Remarks 3.3, 4.4, 4.6, 4.11, if the nonlinearity satisfies the hypotheses of Theorem 1.2, we may replace \(s_0+1\rightsquigarrow s_0\) in the statement of Lemma 5.1.

We are now in a position to prove the main results.

Proof of Theorems 1.1 and 1.2

We prove Theorem 1.1. By means of Proposition 3.1 we know that (1.1) is equivalent to (3.3). As a consequence of Lemma 5.1, the sequence \(U^n\) is a Cauchy sequence in \(C^0([0,T),{{\mathcal {H}}}^{s'})\cap C^1([0,T),{{\mathcal {H}}}^{s'-2})\) for \(s>s'\ge s_0+1\). Indeed, for \(s'=s_0+1\) this is exactly \((S3)_n\), while for \(s>s'>s_0+1\) we can interpolate between \((S3)_n\) and (5.4) by means of (2.6), as sketched below.
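Explicitly, a minimal sketch of the interpolation step, assuming (2.6) is the standard inequality \(\Vert u\Vert _{s'}\le \Vert u\Vert _{s_0+1}^{1-\theta }\Vert u\Vert _{s}^{\theta }\) with \(s'=(1-\theta )(s_0+1)+\theta s\) and \(\theta \in (0,1)\): by \((S3)_n\) and (5.4),

$$\begin{aligned} \Vert U^m-U^{m-1}\Vert _{s'}\le \Vert U^m-U^{m-1}\Vert _{s_0+1}^{1-\theta }\,\Vert U^m-U^{m-1}\Vert _{s}^{\theta }\le (2^{-m}r)^{1-\theta }(2M)^{\theta }\,, \end{aligned}$$

which is summable in m since \(\theta <1\); hence \(U^n\) is a Cauchy sequence in \(C^0([0,T),{{\mathcal {H}}}^{s'})\). Analogously one proves that \(\partial _t U^n\) is a Cauchy sequence in \(C^0([0,T),{{\mathcal {H}}}^{s'-2})\). Let U(t) be the limit. In order to show that U(t) solves (3.3), it is enough to prove that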

$$\begin{aligned} {{\mathcal {A}}}(U^{n})U^{n}-{{\mathcal {A}}}(U)U+R(U^{n})U^{n}-R(U)U, \end{aligned}$$

converges to 0 in \(L^{\infty }{{\mathcal {H}}}^{s'-2}\); this is a consequence of Theorem 2.2 and of the contraction estimates (3.10), (3.7). Uniqueness may be proved by contradiction, with computations similar to those performed in Lemma 5.1. Thanks to (5.4), \(U^n\) is a bounded sequence in \(C^0([0,T),{{\mathcal {H}}}^{s})\cap C^1([0,T),{{\mathcal {H}}}^{s-2})\), which implies that \(U\in L^{\infty }([0,T),{{\mathcal {H}}}^{s})\cap Lip([0,T),{{\mathcal {H}}}^{s-2})\). To prove that U is actually continuous in the \({{\mathcal {H}}}^s\) topology, as well as the continuity of the solution map, one uses the Bona–Smith technique [3], as done in [2] or [9]. We do not reproduce the proof here.

Concerning the proof of Theorem 1.2, one reasons exactly in the same way, taking into account Remarks 3.3, 4.4, 4.6, 4.11, 5.2. \(\square \)