1 Introduction

We consider the Cauchy problem

$$\begin{aligned} \left\{ \begin{array}{ll} D_t u - A(t,D_x)u = 0,&{}\quad (t,x) \in [0,T] \times \mathbb {R}^n,\\ \left. u \right| _{t=0} = u_0,&{}\quad x \in \mathbb {R}^n, \end{array} \right. \end{aligned}$$
(1)

where \(D_t=-\mathrm{i}\partial _t\), \(D_x=-\mathrm{i}\partial _x\), \(A(t,D_x)\) is an \(m \times m\) matrix of first-order differential operators with time-dependent coefficients, and u is a column vector with components \(u_1\), \(\dots \), \(u_m\). We assume that (1) is hyperbolic, by which we mean that the matrix \(A(t,\xi )\) has only real eigenvalues. These eigenvalues, rescaled to order 0 by multiplying by \(\langle \xi \rangle ^{-1}\), will be denoted by \(\lambda _1(t,\xi ),\dots ,\lambda _m(t,\xi )\). Following Kinoshita and Spagnolo in [22], we assume throughout this paper that there exists a positive constant C such that

$$\begin{aligned} \lambda _i^2(t,\xi ) + \lambda _j^2(t,\xi ) \le C (\lambda _i(t,\xi ) - \lambda _j(t,\xi ))^2,\quad ~(t,\xi ) \in [0,T] \times \mathbb {R}^n \end{aligned}$$
(2)

for all \(1 \le i < j \le m\).
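To get a feeling for condition (2), the following minimal sketch computes, at a single point, the smallest constant C for which (2) holds. The function name `separation_ratio` and the two sample families \(\lambda =(t,-t)\) and \(\lambda =(t,t+t^2)\) are illustrative assumptions and are not taken from [22]. The first family satisfies (2) uniformly, while for the second the required constant blows up as \(t\rightarrow 0\).

```python
from fractions import Fraction
from itertools import combinations


def separation_ratio(lams):
    """Largest value of (l_i^2 + l_j^2) / (l_i - l_j)^2 over all pairs i < j.

    Returns None when two eigenvalues coincide without both vanishing,
    since then no finite constant C can satisfy condition (2) at that point.
    """
    worst = Fraction(0)
    for a, b in combinations(lams, 2):
        num = a * a + b * b
        den = (a - b) ** 2
        if den == 0:
            if num != 0:
                return None  # multiple non-zero root: (2) fails here
            continue  # both eigenvalues vanish: (2) reads 0 <= 0
        worst = max(worst, Fraction(num) / Fraction(den))
    return worst


# Hypothetical sample points t -> 0 (exact rational arithmetic).
ts = [Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)]
good = [separation_ratio((t, -t)) for t in ts]        # lambda = (t, -t)
bad = [separation_ratio((t, t + t * t)) for t in ts]  # lambda = (t, t + t^2)
print(good)
print(bad)
```

In this sketch the ratio stays equal to 1/2 along the first family, so \(C=1/2\) works there, whereas along the second family it grows like \(2/t^2\).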

As observed in [14], by combining the well-posedness results in [21, 25], we already know that the Cauchy problem (1) is well-posed in the Gevrey class \(\gamma ^s\), with

$$\begin{aligned} 1\le s<1+\frac{1}{m} \end{aligned}$$

as well as in the corresponding spaces of (Gevrey–Beurling) ultradistributions. In this paper we prove that when \(A(t,D_x)\) has smooth coefficients and the condition (2) on the eigenvalues holds, the Gevrey well-posedness result above can be extended to any \(s \ge 1\). Since, by the results of Kajitani and Yuzawa, at least an ultradistributional solution to the Cauchy problem (1) exists when \(s\ge 1+\frac{1}{m}\), we will prove that this solution actually belongs to the Gevrey class \(\gamma ^s\). In the case of analytic coefficients, we will prove instead that the Cauchy problem (1) is \(C^\infty \) well-posed.

We assume that the reader is familiar with the Gevrey classes \(\gamma ^s({\mathbb R}^n)\): these are the spaces of all \(f\in C^\infty (\mathbb {R}^n)\) such that for every compact set \(K\subset \mathbb {R}^n\) there exists a constant \(C>0\) such that for all \(\beta \in \mathbb N_0^n\) we have the estimate

$$\begin{aligned} \sup _{x\in K}|\partial ^\beta f(x)|\le C^{|\beta |+1}(\beta !)^s. \end{aligned}$$

For \(s=1\), we obtain the class of analytic functions. We refer to [11] for a detailed discussion and Fourier characterisations of Gevrey spaces of different types and the definition of the corresponding spaces of ultradistributions.

The well-posedness of hyperbolic equations and systems with multiplicities has been a challenging problem for a long time. In the last decades several results have been obtained for scalar equations with t-dependent coefficients ([24, 6, 7, 11–13, 22], to quote a few), but the research on hyperbolic systems with multiplicities has not been as successful. We mention here the work of D’Ancona, Kinoshita and Spagnolo [8] on weakly hyperbolic systems (i.e. systems with multiplicities) of size \(2\times 2\) and \(3\times 3\) with Hölder dependent coefficients, later generalised to any matrix size by Yuzawa in [25] and to (t, x)-dependent coefficients by Kajitani and Yuzawa in [21]. In all these papers, well-posedness is obtained in Gevrey classes of a certain order depending on the regularity of the coefficients and the system size. Systems of this type have recently also been investigated in [10, 14].

It is natural to ask whether, under stronger assumptions on the regularity of the coefficients, for instance smooth or analytic coefficients, the well-posedness of the corresponding Cauchy problem can be improved, in the sense that one gets well-posedness in every Gevrey class or \(C^\infty \)–well-posedness. It is known that this is possible for scalar equations under suitable assumptions on the multiple roots and Levi conditions on the lower order terms, see [12, 22] for \(C^k\) and \(C^\infty \) coefficients and [12, 17, 22] for analytic coefficients. This paper gives a positive answer to this question by extending the results for scalar equations in [12, 22] to systems with multiplicities. This will require a transformation of the system in (1) into block-diagonal form with Sylvester blocks, which increases the system size from \(m\times m\) to \(m^2\times m^2\) but does not change the eigenvalues, in the sense that every block will have the same eigenvalues as \(A(t,\xi )\). Such a transformation, introduced by D’Ancona and Spagnolo in [9], has the side effect of generating a matrix of lower order terms even when the original system is homogeneous, i.e., (1) will be transformed into a Cauchy problem of the type

$$\begin{aligned} \left\{ \begin{array}{l} D_t U = \mathcal A(t,D_x) U + \mathcal B(t,D_x)U, \\ \left. U \right| _{t=0} = U_0. \end{array} \right. \end{aligned}$$

It therefore becomes crucial to understand how the lower order terms in \(\mathcal {B}(t,\xi )\) are related to the matrix \(\mathcal {A}(t,\xi )\), which is in turn related to \(A(t,\xi )\), and which Levi-type conditions have to be formulated on them to get the desired well-posedness. These Levi-type conditions will then be expressed in terms of the matrix \(A(t,\xi )\). In the next subsection we collect our main results and give a scheme of the proof.

1.1 Results and scheme of the proof

In the sequel, we denote the elementary symmetric polynomials \(\sigma _h^{(m)}(\lambda )\) by

$$\begin{aligned} \sigma _h^{(m)}(\lambda )=(-1)^h\sum _{1\le i_1<\cdots <i_h\le m}\lambda _{i_1}\ldots \lambda _{i_h}, \end{aligned}$$

for \(1 \le h \le m\) and \(\sigma _0^{(m)}(\lambda ) = 1\), where \(\lambda =(\lambda _1,\ldots ,\lambda _m)\) is given by the rescaled eigenvalues \(\lambda _i = \lambda _i(t,\xi )\) of \(A(t,\xi )\) and \(\pi _i\lambda =(\lambda _1,\ldots ,\lambda _{i-1},\lambda _{i+1},\ldots ,\lambda _m)\). Moreover, given \(f=f(t,\xi )\) and \(g=g(t,\xi )\) we use the notation \(f\prec g\) when there exists a constant \(C>0\) such that \(f(t,\xi )\le C g(t,\xi )\) for all \(t\in [0,T]\) and \(\xi \in \mathbb {R}^n\). We will also use \((\cdot )\) in the upper right corner of a symbol, as in \(b_{ij}^{(l)}\). This does not denote a derivative but is used as an index.
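The sign convention in \(\sigma _h^{(m)}(\lambda )\) is chosen so that \(\prod _{i}(\tau -\lambda _i)=\sum _{h=0}^m\sigma _h^{(m)}(\lambda )\tau ^{m-h}\). A minimal sketch of the notation follows; the function names `sigma`, `pi_drop` and `char_poly_at` are hypothetical helpers introduced here for illustration only.

```python
from itertools import combinations
from math import prod


def sigma(h, lams):
    """Signed elementary symmetric polynomial sigma_h^{(m)}(lambda), m = len(lams)."""
    # For h = 0 the only combination is empty, and its product is 1.
    return (-1) ** h * sum(prod(c) for c in combinations(lams, h))


def pi_drop(i, lams):
    """pi_i lambda: remove the i-th entry (1-based, as in the text)."""
    return lams[: i - 1] + lams[i:]


def char_poly_at(tau, lams):
    """Evaluate sum_{h=0}^{m} sigma_h(lambda) * tau^(m-h)."""
    m = len(lams)
    return sum(sigma(h, lams) * tau ** (m - h) for h in range(m + 1))


lams = (2, -1, 3)  # sample real eigenvalues (an arbitrary choice)
print([sigma(h, lams) for h in range(4)])
print(sigma(2, pi_drop(2, lams)))  # sigma_2^{(2)}(pi_2 lambda)
```

With the sample above, `char_poly_at(tau, lams)` agrees with the product \((\tau -2)(\tau +1)(\tau -3)\) for every integer `tau`.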

Theorem 1.1

Let \(A(t,D_x)\), \(t\in [0,T]\), \(x\in \mathbb {R}^n\), be an \(m\times m\) matrix of first order differential operators with \(C^\infty \)-coefficients. Let \(A(t,\xi )\) have real eigenvalues satisfying condition (2). Assume that the Cauchy problem

$$\begin{aligned} \left\{ \begin{array}{l} D_t U = \mathcal A(t,D_x) U + \mathcal B(t,D_x)U, \\ \left. U \right| _{t=0} = U_0, \end{array} \right. \end{aligned}$$

obtained from (1) by block Sylvester transformation has the lower order terms matrix \( \mathcal B(t,\xi )\) with entries \(b_{kj}^{(l)}(t,\xi )\) fulfilling the Levi-type conditions

$$\begin{aligned} \sum _{k=1}^m|b_{kj}^{(l)}(t,\xi )|^2\prec \sum _{i=1}^m |\sigma _{m-l}^{(m-1)}(\pi _i \lambda )|^2, \end{aligned}$$
(3)

for \(l=1,\dots ,m-1\) and \(j=1,\dots ,m\). Then, for all \(s\ge 1\) and for all \(u_0\in \gamma ^s(\mathbb {R}^n)^m\) there exists a unique solution \(u\in C^1([0,T], \gamma ^s(\mathbb {R}^n))^m\) of the Cauchy problem (1).

The formulation of the Levi-type conditions given above requires precise knowledge of the matrix \(\mathcal {B}(t,\xi )\); for this, see Sect. 3.4. It is possible to state the previous well-posedness result completely in terms of the matrix \(A(t,\xi )\) and the Cauchy problem (1). This means introducing an additional hypothesis on the coefficients of \(A(t,\xi )\) which implies the Levi-type conditions on \(\mathcal {B}(t,\xi )\). In the final section of the paper we will prove that in some cases, for instance when \(m=2\), this second formulation is equivalent to the one given in Theorem 1.1.

Theorem 1.2

Let \(A(t,D_x)\), \(t\in [0,T]\), \(x\in \mathbb {R}^n\), be an \(m\times m\) matrix of first order differential operators with \(C^\infty \)-coefficients. Let \(A(t,\xi )\) have real eigenvalues satisfying condition (2) and let \(Q=(q_{ij})\) be the symmetriser of \(A_0=\langle \xi \rangle ^{-1}A\). Assume that

$$\begin{aligned} \max _{k=1,\dots ,m-1}\Vert D_t^k A_0(t,\xi )\Vert ^2 \prec q_{j,j}(t,\xi ) \end{aligned}$$
(4)

for all \(j=1,\dots ,m-1\) and all \((t,\xi ) \in [0,T] \times \mathbb {R}^n\). Then, for all \(s\ge 1\) and for all \(u_0\in \gamma ^s(\mathbb {R}^n)^m\) there exists a unique solution \(u\in C^1([0,T], \gamma ^s(\mathbb {R}^n))^m\) of the Cauchy problem (1). Here, \(\Vert \cdot \Vert \) denotes the standard matrix norm.

Remark 1.3

For some more concrete examples in the cases \(m = 2\) and 3, see the remarks in Sect. 6.

Since the entries of the symmetriser are polynomials depending on the eigenvalues of \(A(t,\xi )\), we require in Theorem 1.2 that the t-derivatives of \(A(t,\xi )\) up to order \(m-1\) are bounded by suitable polynomials of the eigenvalues \(\lambda _1(t,\xi ),\dots ,\lambda _m(t,\xi )\). Note that, as observed already in the appendix of [12], these polynomials can be expressed in terms of the entries of \(A(t,\xi )\).

When the entries of \(A(t,\xi )\) are analytic, then we prove that the Cauchy problem (1) is \(C^\infty \) well-posed. The precise statements can be obtained by replacing \(\gamma ^s\) with \(C^\infty \) in Theorems 1.1 and 1.2.

We conclude this subsection by presenting the scheme of the proof of Theorem 1.1, which combines ideas from [9, 12].

  1. Step 1

    Compute the adjunct matrix \({{{\mathrm{\mathbf {adj}}}}}(I_m\tau - A(t,\xi )) = {{{\mathrm{\mathbf {cof}}}}}(I_m\tau - A^T(t,\xi ))\), where \(I_m\) is the identity matrix of size \(m \times m\). We thus have the relation

    $$\begin{aligned} {{{\mathrm{\mathbf {adj}}}}}(I_m \tau - A(t,\xi ))(I_m \tau - A(t,\xi )) = \sum \limits _{h=0}^{m} c_{h}(t,\xi )I_m \tau ^{m-h}, \end{aligned}$$

    where the \(c_{h}(t,\xi )\) are homogeneous polynomials of order h in \(\xi \) and are given by the coefficients of the characteristic polynomial of \(A(t,\xi )\). See the Appendix.

  2. Step 2

    Apply the operator \({{{\mathrm{\mathbf {adj}}}}}(I_m D_t-A(t,D_x))\), associated to the symbol \({{{\mathrm{\mathbf {adj}}}}}(I_m \tau - A(t,\xi ))\), to the system (1) and obtain a set of scalar equations for \(u_1\) to \(u_m\), where the operator acting on these is associated to \(\det (I_m \tau - A(t,\xi ))\). Additionally, one gets some lower order terms which can be computed explicitly.

  3. Step 3

    Convert the resulting set of equations

    $$\begin{aligned} \det (I_m D_t - A(t,D_x))u + l.o.t. = 0 \end{aligned}$$

    to Sylvester block diagonal form following the method of Taylor in [23], i.e., by setting

    $$\begin{aligned} U= & {} \left( U_1 , U_2, \dots , U_m \right) ^T,\quad \text {where} \nonumber \\ U_{k}= & {} ( \langle D_{x} \rangle ^{m-1} u_k, D_t\langle D_{x} \rangle ^{m-2} u_k, \dots , D_t^{m-1} u_k ) \end{aligned}$$
    (5)

    for \(k=1,\dots ,m\). This transformation maps each equation to a system in Sylvester form and glues those systems in block diagonal form together. Hence, we achieve a block diagonal form with Sylvester blocks associated to the characteristic polynomial of (1). This means that each block will have the same eigenvalues as \(A(t,\xi )\). The initial data will be transformed in the same way to obtain a new set of initial data \(U_0\) for the new system.

  4. Step 4

    Consider the resulting system

    $$\begin{aligned} \left\{ \begin{array}{l} D_t U = \mathcal {A}(t,D_x)U + \mathcal {B}(t,D_x)U \\ \left. U \right| _{t=0} = U_0, \end{array} \right. \end{aligned}$$
    (6)

    where \(\mathcal A(t,D_x)\) and \(\mathcal B(t,D_x)\) are matrices of size \(m^2 \times m^2\) with a special structure. As explained above, \(\mathcal A(t,D_x)\) is a block diagonal matrix with m identical Sylvester blocks having the same eigenvalues as \(A(t,\xi )\), and \(\mathcal B(t,D_x)\) is composed of m blocks of size \(m \times m^2\), each with only the last row not identically zero. Since the original homogeneous system has been transformed into a system with lower order terms, to get well-posedness of the corresponding Cauchy problem (6) we need to find suitable Levi-type conditions. These are obtained by following the ideas for scalar equations in [12].

  5. Step 5

    We apply the partial Fourier transform with respect to x to (6) and we prove an energy estimate from which the assertions of the well-posedness theorems follow in a standard way. A key point is the construction of the quasi-symmetriser of the matrix \(\mathcal {A}(t,\xi )\).

The remainder of the paper is organised as follows. In Sect. 2, we present a short survey on the quasi-symmetriser which will be employed to formulate and prove the energy estimate. The core of Sect. 3 is the transformation of \(A(t,\xi )\) from (1) to block Sylvester form. An explicit description of \({{{\mathrm{\mathbf {adj}}}}}(I_m D_t - A(t,D_x))\) and the resulting lower order terms is also given in Sect. 3, together with a detailed scheme of the proof in the cases \(m=2\) and \(m=3\). Section 4 is devoted to the energy estimate and Sect. 5 to the estimates for the lower order terms. The paper ends with the well-posedness results in Sect. 6 and the Appendix, where we collect some algebraic results concerning \({{{\mathrm{\mathbf {adj}}}}}(I_m \tau - A(t,\xi ))\).

2 The quasi-symmetriser

Here we recall some facts about the quasi-symmetriser that we will need throughout the paper. For more details see [9, 22]. Note that for \(m\times m\) matrices \(A_1\) and \(A_2\) the notation \(A_1\le A_2\) means \((A_1v,v)\le (A_2v,v)\) for all \(v\in \mathbb {C}^m\), with \((\cdot ,\cdot )\) the scalar product in \(\mathbb {C}^m\). Let \(M(\lambda )\) be an \(m\times m\) Sylvester matrix with real eigenvalues \(\lambda _l\), i.e.,

$$\begin{aligned} M(\lambda )=\left( \begin{array}{ccccc} 0 &{} 1 &{} 0 &{} \dots &{} 0\\ 0 &{} 0 &{} 1 &{} \dots &{} 0 \\ \dots &{} \dots &{} \dots &{} \dots &{} 1 \\ -\sigma _m^{(m)}(\lambda ) &{} -\sigma _{m-1}^{(m)}(\lambda ) &{} \dots &{} \dots &{} -\sigma _1^{(m)}(\lambda ) \\ \end{array} \right) , \end{aligned}$$

where the \(\sigma _h^{(m)}(\lambda )\) are defined as

$$\begin{aligned} \sigma _h^{(m)}(\lambda )=(-1)^h\sum _{1\le i_1<\cdots <i_h\le m}\lambda _{i_1}\ldots \lambda _{i_h} \end{aligned}$$
(7)

for all \(1\le h\le m\). We further set \(\sigma _0^{(m)}(\lambda ) = 1\). In the sequel we make use of the following notations: \(\mathcal {P}_m\) for the class of permutations of \(\{1,\ldots ,m\}\), \(\lambda _\rho =(\lambda _{\rho _1},\ldots ,\lambda _{\rho _m})\) with \(\lambda \in \mathbb {R}^m\) and \(\rho \in \mathcal {P}_m\), \(\pi _i\lambda =(\lambda _1,\ldots ,\lambda _{i-1},\lambda _{i+1},\ldots ,\lambda _m)\) and \(\lambda '=\pi _m\lambda =(\lambda _1,\ldots ,\lambda _{m-1})\).
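As a quick sanity check of the definition of \(M(\lambda )\), the following sketch builds the Sylvester matrix from a list of eigenvalues and verifies that \((1,\lambda _i,\ldots ,\lambda _i^{m-1})^T\) is an eigenvector with eigenvalue \(\lambda _i\). The helper names `sylvester` and `apply` and the sample eigenvalues are assumptions made for illustration.

```python
from itertools import combinations
from math import prod


def sigma(h, lams):
    """Signed elementary symmetric polynomial sigma_h^{(m)}(lambda) from (7)."""
    return (-1) ** h * sum(prod(c) for c in combinations(lams, h))


def sylvester(lams):
    """Sylvester (companion) matrix M(lambda) as a list of rows."""
    m = len(lams)
    # First m-1 rows: ones on the superdiagonal.
    rows = [[1 if j == i + 1 else 0 for j in range(m)] for i in range(m - 1)]
    # Last row: -sigma_m, -sigma_{m-1}, ..., -sigma_1.
    rows.append([-sigma(m - j, lams) for j in range(m)])
    return rows


def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]


lams = (2, -1, 3)  # sample real eigenvalues (an arbitrary choice)
M = sylvester(lams)
for lam in lams:
    v = [lam ** k for k in range(len(lams))]
    print(lam, apply(M, v) == [lam * x for x in v])
```

The last component of \(M(\lambda )v\) equals \(\lambda _i^m\) precisely because \(\sum _{h=0}^m\sigma _h^{(m)}(\lambda )\lambda _i^{m-h}=0\).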

To construct the quasi-symmetriser, we follow [22] and define \(P^{(m)}(\lambda )\) inductively by \(P^{(1)}(\lambda )=1\) and

$$\begin{aligned} P^{(m)}(\lambda )=\left( \begin{array}{ccccc} \, &{} \, &{} \, &{} \, &{} 0\\ \, &{} \, &{} P^{(m-1)}(\lambda ') &{} \, &{} \vdots \\ \, &{} \, &{} \, &{} \, &{} 0 \\ \sigma _{m-1}^{(m-1)}(\lambda ') &{} \dots &{} \dots &{} \sigma _1^{(m-1)}(\lambda ') &{} 1 \\ \end{array} \right) . \end{aligned}$$

Further, we set, for \(\varepsilon \in (0,1]\),

$$\begin{aligned} P_\varepsilon ^{(m)}(\lambda )=H^{(m)}_\varepsilon P^{(m)}(\lambda ), \end{aligned}$$

where \(H_\varepsilon ^{(m)}=\mathrm{diag}\{\varepsilon ^{m-1},\ldots ,\varepsilon ,1\}\). We remark that \(P^{(m)}(\lambda )\) depends only on \(\lambda '\). Finally, the quasi-symmetriser is the Hermitian matrix

$$\begin{aligned} Q^{(m)}_\varepsilon (\lambda )=\sum _{\rho \in \mathcal {P}_m} P_\varepsilon ^{(m)}(\lambda _\rho )^*P_\varepsilon ^{(m)}(\lambda _\rho ). \end{aligned}$$

To describe the properties of \(Q^{(m)}_\varepsilon (\lambda )\) in more detail in the next proposition, we denote by \(W^{(m)}_i(\lambda )\) the row vector

$$\begin{aligned} \big (\sigma _{m-1}^{(m-1)}(\pi _i\lambda ),\ldots ,\sigma _1^{(m-1)}(\pi _i\lambda ),1\big ),\quad 1\le i\le m, \end{aligned}$$

and let \(W^{(m)}(\lambda )\) be the matrix with row vectors \(W^{(m)}_i\).

The following proposition collects the main properties of the quasi-symmetriser \(Q^{(m)}_\varepsilon (\lambda )\). For a detailed proof we refer the reader to Propositions 1 and 2 in [22] and to Proposition 1 in [9].

Proposition 2.1

  1. (i)

    The quasi-symmetriser \(Q_\varepsilon ^{(m)}(\lambda )\) can be written as

    $$\begin{aligned} Q_0^{(m)}(\lambda )+\varepsilon ^2 Q_1^{(m)}(\lambda )+\cdots +\varepsilon ^{2(m-1)}Q_{m-1}^{(m)}(\lambda ), \end{aligned}$$

    where the matrices \(Q^{(m)}_i(\lambda )\), \(i=1,\ldots ,m-1,\) are non-negative and Hermitian with entries being symmetric polynomials in \(\lambda _1,\ldots ,\lambda _m\).

  2. (ii)

    There exists a function \(C_m(\lambda )\) bounded for bounded \(|\lambda |\) such that

    $$\begin{aligned} C_m(\lambda )^{-1}\varepsilon ^{2(m-1)}I\le Q^{(m)}_\varepsilon (\lambda )\le C_m(\lambda )I. \end{aligned}$$
  3. (iii)

    We have

    $$\begin{aligned} -C_m(\lambda )\varepsilon Q_\varepsilon ^{(m)}(\lambda )\le Q_\varepsilon ^{(m)}(\lambda ) M(\lambda )- M(\lambda )^*Q_\varepsilon ^{(m)}(\lambda )\le C_m(\lambda )\varepsilon Q_\varepsilon ^{(m)}(\lambda ). \end{aligned}$$
  4. (iv)

    For any \((m-1)\times (m-1)\) matrix T let \(T^\sharp \) denote the \(m\times m\) matrix

    $$\begin{aligned} \left( \begin{array}{cc} T &{} 0\\ 0 &{} 0 \\ \end{array} \right) . \end{aligned}$$

    Then, \(Q_\varepsilon ^{(m)}(\lambda )=Q_0^{(m)}(\lambda )+\varepsilon ^2\sum _{i=1}^m Q_\varepsilon ^{(m-1)}(\pi _i\lambda )^\sharp \).

  5. (v)

    We have

    $$\begin{aligned} Q_0^{(m)}(\lambda )=(m-1)! W^{(m)}(\lambda )^*W^{(m)}(\lambda ). \end{aligned}$$
  6. (vi)

    We have

    $$\begin{aligned} \det Q_0^{(m)}(\lambda )=(m-1)!\prod _{1\le i<j\le m}(\lambda _i-\lambda _j)^2. \end{aligned}$$
  7. (vii)

    There exists a constant \(C_m\) such that

    $$\begin{aligned} q_{0,11}^{(m)}(\lambda )\cdots q_{0,mm}^{(m)}(\lambda )\le C_m\prod _{1\le i<j\le m}(\lambda ^2_i+\lambda ^2_j). \end{aligned}$$

We finally recall that a family \(\{Q_\alpha \}\) of non-negative Hermitian matrices is called nearly diagonal if there exists a positive constant \(c_0\) such that

$$\begin{aligned} Q_\alpha \ge c_0\,\mathrm{diag}\,Q_\alpha \end{aligned}$$

for all \(\alpha \), with \(\mathrm{diag}\,Q_\alpha =\mathrm{diag}\{q_{\alpha ,11},\ldots ,q_{\alpha , mm}\}\). The following linear algebra result is proven in [22, Lemma 1].

Lemma 2.2

Let \(\{Q_\alpha \}\) be a family of non-negative Hermitian \(m\times m\) matrices such that \(\det Q_\alpha >0\) and

$$\begin{aligned} \det Q_\alpha \ge c\, q_{\alpha ,11}q_{\alpha ,22}\ldots q_{\alpha , mm} \end{aligned}$$

for a certain constant \(c>0\) independent of \(\alpha \). Then,

$$\begin{aligned} Q_\alpha \ge c\, m^{1-m}\,\mathrm{diag}\,Q_\alpha \end{aligned}$$

for all \(\alpha \), i.e., the family \(\{Q_\alpha \}\) is nearly diagonal.

Lemma 2.2 is employed to prove that the family \(Q_\varepsilon ^{(m)}(\lambda )\) of quasi-symmetrisers defined above is nearly diagonal when \(\lambda \) belongs to a suitable set. The following statement is proven in [22, Proposition 3].

Proposition 2.3

For any \(M>0\) define the set

$$\begin{aligned} \mathcal {S}_M=\{\lambda \in \mathbb {R}^m:\, \lambda _i^2+\lambda _j^2\le M (\lambda _i-\lambda _j)^2,\quad 1\le i<j\le m\}. \end{aligned}$$

Then the family of matrices \(\{Q_\varepsilon ^{(m)}(\lambda ):\, 0<\varepsilon \le 1, \lambda \in \mathcal {S}_M\}\) is nearly diagonal.
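For \(m=2\) the statement can be tested directly. The sketch below builds \(Q^{(2)}_\varepsilon (\lambda )\) from the definition and checks that \(Q^{(2)}_\varepsilon -c_0\,\mathrm{diag}\,Q^{(2)}_\varepsilon \) is non-negative on sample points of \(\mathcal {S}_M\) with \(M=1\). The value \(c_0=1/(4M)\) is our own assumption: it comes from a short computation suggesting that \(c=1/(2M)\) is admissible in Lemma 2.2 when \(m=2\), and it is not quoted from [22]. All function names are hypothetical.

```python
from fractions import Fraction


def quasi_symmetriser_2(l1, l2, eps):
    """Q_eps^{(2)}: sum over both orderings of P_eps^* P_eps,
    where P_eps^{(2)}(lambda) = [[eps, 0], [-lambda_1, 1]]."""
    q = [[Fraction(0)] * 2 for _ in range(2)]
    for first in (l1, l2):  # P^{(2)} only depends on the first entry
        p = [[eps, 0], [-first, 1]]
        for r in range(2):
            for c in range(2):
                q[r][c] += sum(p[k][r] * p[k][c] for k in range(2))
    return q


def is_psd_2x2(a):
    """A real symmetric 2x2 matrix is non-negative iff its diagonal
    entries and its determinant are all >= 0."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return a[0][0] >= 0 and a[1][1] >= 0 and det >= 0


def gap(l1, l2, eps, c0):
    """Return Q - c0 * diag(Q)."""
    q = quasi_symmetriser_2(l1, l2, eps)
    return [[q[r][c] - (c0 * q[r][c] if r == c else 0) for c in range(2)]
            for r in range(2)]


M_const = 1                      # S_1 = {lambda_1 * lambda_2 <= 0}
c0 = Fraction(1, 4 * M_const)    # assumed constant, see the lead-in
samples = [(1, -1), (2, -1), (0, 3), (3, -2), (0, 0)]
epsilons = [Fraction(1), Fraction(1, 2), Fraction(1, 100)]
results = [is_psd_2x2(gap(a, b, e, c0)) for a, b in samples for e in epsilons]
print(all(results))
```

Outside \(\mathcal {S}_M\) the bound can fail: for \(\lambda =(1,1)\) and small \(\varepsilon \) the matrix `gap(1, 1, eps, c0)` has negative determinant.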

We conclude this section with a result on nearly diagonal matrices depending on three parameters, \(\varepsilon \), t, and \(\xi \), which will be crucial in Sect. 4. Note that this is a straightforward extension of Lemma 2 in [22], which is valid for matrices depending on two parameters, \(\varepsilon \) and t.

Lemma 2.4

Let \(\{ Q^{(m)}_\varepsilon (t,\xi ): 0<\varepsilon \le 1, 0\le t\le T, \xi \in \mathbb {R}^n\}\) be a nearly diagonal family of coercive Hermitian matrices of class \({C}^k\) in t, \(k\ge 1\). Then, there exists a constant \(C_T>0\) such that for any continuous function \(V:[0,T]\times \mathbb {R}^n\rightarrow \mathbb {C}^m\) we have

$$\begin{aligned} \int _{0}^T \frac{|(\partial _t Q^{(m)}_\varepsilon (t,\xi ) V(t,\xi ),V(t,\xi ))|}{(Q^{(m)}_\varepsilon (t,\xi )V(t,\xi ),V(t,\xi ))^{1-1/k} |V(t,\xi )|^{2/k}}\, dt\le C_T \Vert Q^{(m)}_\varepsilon (\cdot ,\xi )\Vert ^{1/k}_{{C}^k([0,T])} \end{aligned}$$

for all \(\xi \in {\mathbb R}^n.\)

Remark 2.5

All results of this section hold true when \(Q_{\varepsilon }^{(m)}(t,\xi )\) is replaced by a block diagonal matrix \(\mathcal {Q}_{\varepsilon }^{(m)}(t,\xi )\) with m identical matrices \(Q_{\varepsilon }^{(m)}(t,\xi )\) on its diagonal. The corresponding block diagonal matrix with \(W^{(m)}(\lambda )\) blocks is denoted by \(\mathcal {W}^{(m)}(\lambda )\). Proofs follow from a block-wise treatment and application of the results above.

2.1 The quasi-symmetriser in the case \(m=2\) and \(m=3\)

For the convenience of the reader, we conclude this section by computing the quasi-symmetrisers \(Q^{(2)}_{\varepsilon }\) and \(Q^{(3)}_\varepsilon \). For \(m=2\), we obtain

$$\begin{aligned} W^{(2)}(\lambda )= & {} \left( \begin{array}{ll} -\lambda _2 &{} 1 \\ -\lambda _1 &{} 1 \end{array}\right) \\ Q_\varepsilon ^{(2)}(\lambda )= & {} \left( \begin{array}{ll} \lambda _1^2 + \lambda _2^2 &{} -(\lambda _1 + \lambda _2) \\ -(\lambda _1 + \lambda _2) &{} 2 \end{array}\right) + 2\varepsilon ^2 \left( \begin{array}{ll} 1 &{} 0 \\ 0 &{} 0 \end{array}\right) . \end{aligned}$$

Similarly, for \(m=3\), we obtain

$$\begin{aligned} W^{(3)}(\lambda )= & {} \left( \begin{array}{lll} \lambda _2 \lambda _3 &{} -(\lambda _2+\lambda _3) &{} 1 \\ \lambda _3 \lambda _1 &{} -(\lambda _3+\lambda _1) &{} 1 \\ \lambda _1 \lambda _2 &{} -(\lambda _1+\lambda _2) &{} 1 \end{array}\right) \\ Q_\varepsilon ^{(3)}(\lambda )= & {} 2 \sum \limits _{1 \le i < j \le 3} \left( \begin{array}{lll} (\lambda _i\lambda _j)^2 &{} -\lambda _i\lambda _j(\lambda _i+\lambda _j) &{} \lambda _i\lambda _j \\ -\lambda _i\lambda _j(\lambda _i+\lambda _j) &{} (\lambda _i + \lambda _j)^2 &{} -(\lambda _i + \lambda _j) \\ \lambda _i\lambda _j &{} -(\lambda _i+\lambda _j) &{} 1 \end{array}\right) \\&+\, 2\varepsilon ^2 \sum \limits _{1 \le i \le 3} \left( \begin{array}{lll} \lambda _i^2 &{} -\lambda _i &{} 0 \\ -\lambda _i &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 \end{array}\right) + 6\varepsilon ^4 \left( \begin{array}{lll} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \end{array}\right) . \end{aligned}$$
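The two closed forms above can be cross-checked against the inductive definition of \(P^{(m)}(\lambda )\) and the sum over permutations. The following is a minimal sketch in exact rational arithmetic; the function names and the sample values of \(\lambda \) and \(\varepsilon \) are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import prod


def sigma(h, lams):
    """Signed elementary symmetric polynomial sigma_h from (7)."""
    return (-1) ** h * sum(prod(c) for c in combinations(lams, h))


def P(lams):
    """P^{(m)}(lambda) via the recursion; it only uses lambda' = lams[:-1]."""
    m = len(lams)
    if m == 1:
        return [[1]]
    prime = lams[:-1]
    top = [row + [0] for row in P(prime)]
    bottom = [sigma(m - 1 - j, prime) for j in range(m - 1)] + [1]
    return top + [bottom]


def quasi_symmetriser(lams, eps):
    """Q_eps^{(m)}(lambda) = sum over permutations of P_eps^* P_eps."""
    m = len(lams)
    q = [[Fraction(0)] * m for _ in range(m)]
    for rho in permutations(lams):
        # H_eps scales row k (0-based) of P by eps^(m-1-k).
        pe = [[eps ** (m - 1 - k) * x for x in row]
              for k, row in enumerate(P(rho))]
        for r in range(m):
            for c in range(m):
                q[r][c] += sum(pe[k][r] * pe[k][c] for k in range(m))
    return q


def closed_form_2(l, eps):
    l1, l2 = l
    return [[l1 ** 2 + l2 ** 2 + 2 * eps ** 2, -(l1 + l2)], [-(l1 + l2), 2]]


def closed_form_3(l, eps):
    q = [[Fraction(0)] * 3 for _ in range(3)]
    for a, b in combinations(l, 2):          # eps^0 term: 2 * w w^T
        w = [a * b, -(a + b), 1]
        for r in range(3):
            for c in range(3):
                q[r][c] += 2 * w[r] * w[c]
    for a in l:                              # eps^2 term: 2 eps^2 * v v^T
        v = [-a, 1, 0]
        for r in range(3):
            for c in range(3):
                q[r][c] += 2 * eps ** 2 * v[r] * v[c]
    q[0][0] += 6 * eps ** 4                  # eps^4 term
    return q


lams, eps = (1, -2, 3), Fraction(1, 3)
print(quasi_symmetriser(lams[:2], eps) == closed_form_2(lams[:2], eps))
print(quasi_symmetriser(lams, eps) == closed_form_3(lams, eps))
```

The same `quasi_symmetriser` function works for any m, although the cost grows like \(m!\) because of the sum over permutations.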

3 Sylvester block diagonal reduction

This section is devoted to the Sylvester block diagonal reduction that will be applied to the system (1). This transformation was introduced by D’Ancona and Spagnolo in [9]. Here we give a detailed description of how this reduction works on the system \(I_m D_t - A(t,D_x)\) and we present explicit formulas for the matrix of lower order terms generated by the procedure. Note that these results are obtained from general linear algebra statements that are collected in the Appendix at the end of the paper, to which we will refer throughout this section. The subsections correspond to the steps of the proof outlined in Sect. 1.1.

3.1 Step 1: The adjunct \({{{\mathrm{\mathbf {adj}}}}}(I_mD_t - A(t,D_x))\)

A straightforward application of Lemma 7.4 leads us to the following proposition.

Proposition 3.1

Let \(I_m D_t - A(t,D_x)\) be the operator in (1). Then,

$$\begin{aligned} {{{\mathrm{\mathbf {adj}}}}}(I_m D_t - A(t,D_x)) = \sum \limits _{h = 0}^{m-1} \mathbf A _{h}(t,D_x) D_t^{m-1-h} \end{aligned}$$

where

$$\begin{aligned} \mathbf A _{h}(t,D_x) = \sum \limits _{h'=0}^{h} \sigma _{h'}^{(m)}(\lambda ) A^{h-h'}(t,D_x), \end{aligned}$$
(8)

\(\lambda = (\lambda _1, \dots , \lambda _m)\) and \(\sigma _h^{(m)}(\lambda )\) is as defined in (7). The differential operator \({{{\mathrm{\mathbf {adj}}}}}(I_m D_t - A(t,D_x))\) is of order \(m-1\) with respect to \(D_t\) and every differential operator \(\mathbf A _h(t,D_x)\), \(1 \le h \le m\), is of order h with respect to \(D_x\). Here we set \(A^0(t,D_x) = I_m\).

Proposition 3.1 completes Step 1 of our proof as outlined in the scheme. We can therefore proceed to Step 2.
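At the level of constant matrices, formula (8) can be checked by comparing coefficients of \(\tau \) in \(\sum _h \mathbf A _h\tau ^{m-1-h}(I_m\tau -A)=\sum _h\sigma _h^{(m)}(\lambda )I_m\tau ^{m-h}\). The sketch below does this for a sample upper triangular integer matrix, whose eigenvalues can be read off the diagonal. The matrix, the function names and the restriction to constant coefficients are all assumptions made for illustration; this is not the general operator statement of Proposition 3.1.

```python
from itertools import combinations
from math import prod


def sigma(h, lams):
    """Signed elementary symmetric polynomial sigma_h from (7)."""
    return (-1) ** h * sum(prod(c) for c in combinations(lams, h))


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, X):
    return [[c * x for x in row] for row in X]


def added(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def bold_A(h, A, lams):
    """A_h = sum_{h'=0}^{h} sigma_{h'}(lambda) A^(h-h'), as in (8)."""
    m = len(A)
    powers = [identity(m)]
    for _ in range(h):
        powers.append(matmul(powers[-1], A))
    total = [[0] * m for _ in range(m)]
    for hp in range(h + 1):
        total = added(total, scaled(sigma(hp, lams), powers[h - hp]))
    return total


def adjugate_relation_holds(A, lams):
    """Check A_0 = I, A_k - A_{k-1} A = sigma_k I (k = 1..m) and A_m = 0."""
    m = len(A)
    ok = bold_A(0, A, lams) == identity(m)
    for k in range(1, m + 1):
        lhs = added(bold_A(k, A, lams),
                    scaled(-1, matmul(bold_A(k - 1, A, lams), A)))
        ok = ok and lhs == scaled(sigma(k, lams), identity(m))
    return ok and bold_A(m, A, lams) == [[0] * m for _ in range(m)]


A = [[2, 1, 0], [0, -1, 4], [0, 0, 3]]  # triangular, eigenvalues 2, -1, 3
print(adjugate_relation_holds(A, (2, -1, 3)))
```

The only non-trivial ingredient is \(\mathbf A _m=0\), which is the Cayley–Hamilton theorem; with a wrong list of eigenvalues this last check fails.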

3.2 Step 2: Computation of the lower order terms

Proposition 3.2

The lower order terms that arise after applying the adjunct \({{{\mathrm{\mathbf {adj}}}}}(I_mD_t-A(t,D_x))\) to the original operator \(I_mD_t -A(t,D_x)\) are given by

$$\begin{aligned} B(t,D_t,D_x)u = - \sum \limits _{h =0}^{m-2} \mathbf A _{h}(t,D_x)\mathbf A '_{h}(t,D_t,D_x) , \end{aligned}$$
(9)

where \(\mathbf A _{h}(t,D_x)\) is defined in (8) and

$$\begin{aligned} \mathbf A '_{h}(t,D_t,D_x) = \sum _{h'=h}^{m-2} \left( {\begin{array}{c}m-1-h\\ h'+1-h\end{array}}\right) (D_t^{h'+1-h}A)(t,D_x)D_t^{m-2-h'}u. \end{aligned}$$
(10)

Proof

From Proposition 3.1 and the Leibniz rule, we have

$$\begin{aligned}&{{{\mathrm{\mathbf {adj}}}}}(I_m D_t - A(t,D_x))(I_m D_t u - A(t,D_x)u) \nonumber \\&= \sum \limits _{h = 0}^{m-1} \mathbf A _{h}(t,D_x) D_t^{m-1-h} (I_m D_t u - A(t,D_x)u) \nonumber \\&= \sum \limits _{h = 0}^{m-1} \mathbf A _{h}(t,D_x) D_t^{m-h} u - \sum \limits _{h =0}^{m-1} \mathbf A _{h}(t,D_x) D_t^{m-1-h} (A(t,D_x)u) \nonumber \\&= \sum \limits _{h =0}^{m-1} \mathbf A _{h}(t,D_x) D_t^{m-h} u \nonumber \\&\quad - \sum \limits _{h =0}^{m-1} \mathbf A _{h}(t,D_x) \sum _{h'=0}^{m-1-h} \left( {\begin{array}{c}m-1-h\\ h'\end{array}}\right) (D_t^{h'}A)(t,D_x)D_t^{m-1-h-h'}u. \end{aligned}$$
(11)

Now we write the second summand in the last equation in (11) as \(Xu+Yu\) where Xu contains all terms with \(h'=0\) and

$$\begin{aligned} Yu&= -\sum \limits _{h =0}^{m-1} \mathbf A _{h}(t,D_x) \sum _{h'=1}^{m-1-h} \left( {\begin{array}{c}m-1-h\\ h'\end{array}}\right) (D_t^{h'}A)(t,D_x)D_t^{m-1-h-h'}u \nonumber \\&= -\sum \limits _{h =0}^{m-2} \mathbf A _{h}(t,D_x) \sum _{h'=1}^{m-1-h} \left( {\begin{array}{c}m-1-h\\ h'\end{array}}\right) (D_t^{h'}A)(t,D_x)D_t^{m-1-h-h'}u. \end{aligned}$$
(12)

The second equality in (12) holds because the inner sum is empty for \(h=m-1\). By replacing \(h'\) with \(h'+1-h\) in the inner sum in (12) we get

$$\begin{aligned} Yu=-\sum \limits _{h =0}^{m-2} \mathbf A _{h}(t,D_x) \sum _{h'=h}^{m-2} \left( {\begin{array}{c}m-1-h\\ h'+1-h\end{array}}\right) (D_t^{h'+1-h}A)(t,D_x)D_t^{m-2-h'}u \end{aligned}$$

and then by (10) we conclude that \(Yu=B(t,D_t,D_x)u\) as desired. It remains to show that

$$\begin{aligned} \sum \limits _{h =0}^{m-1} \mathbf A _{h}(t,D_x) D_t^{m-h} u + Xu = \det (I_mD_t - A(t,D_x))u. \end{aligned}$$
(13)

By (8), we obtain

$$\begin{aligned} \mathbf A _{h}(t,D_x) A(t,D_x) = \mathbf A _{h+1}(t,D_x) - \sigma _{h+1}^{(m)}(\lambda ) I_m \end{aligned}$$

and, thus,

$$\begin{aligned} X= & {} -\sum _{h=0}^{m-1} \mathbf A _h(t,D_x)A(t,D_x)D_t^{m-1-h} \\= & {} -\sum _{h=1}^{m} \mathbf A _{h}(t,D_x)D_t^{m-h} + \underbrace{\sum _{h=1}^{m}\sigma _{h}^{(m)}(\lambda ) I_m D_t^{m-h}}_{=\det (I_m D_t - A(t,D_x))-I_mD_t^m\,(\text {see}\, (57))}. \end{aligned}$$

Using that \(\mathbf A _m =0\) [thanks to the Cayley–Hamilton theorem, see (58)] and \(\mathbf A _0 = I_m\), we obtain (13) which concludes the proof. \(\square \)

It will be convenient for the description of some important matrices in this paper to rewrite the lower order terms in a different way. More precisely, we have the following corollary.

Corollary 3.3

We can write the lower order term in (9) as

$$\begin{aligned} B(t,D_t,D_x) = -\sum _{h=0}^{m-2} \mathbf B _{h+1}(t,D_x)D_t^{h}, \end{aligned}$$
(14)

where

$$\begin{aligned} \mathbf B _{h+1}(t,D_x) = \sum _{h'=0}^{m-2-h} \left( {\begin{array}{c}m-1-h'\\ h\end{array}}\right) \mathbf A _{h'}(t,D_x)(D_t^{m-1-h-h'}A)(t,D_x) \end{aligned}$$
(15)

and \(\mathbf A _{h'}(t,D_x)\) is given by (8).

Proof

Formula (14) follows from (9) by interchanging the order of the sums appropriately. Indeed, we have, using (8) and (10), that

$$\begin{aligned}&B(t,D_t,D_x) \nonumber \\&\quad = - \sum _{h=0}^{m-2} \mathbf A _h(t,D_x) \sum _{h'=h}^{m-2} \left( {\begin{array}{c}m-1-h\\ h'+1-h\end{array}}\right) (D_t^{h'+1-h}A)(t,D_x)D_t^{m-2-h'} \nonumber \\&\quad = -\sum _{h'=0}^{m-2} \underbrace{\sum _{h=0}^{h'} \mathbf A _h(t,D_x) \left( {\begin{array}{c}m-1-h\\ h'+1-h\end{array}}\right) (D_t^{h'+1-h}A)(t,D_x)}_{=: \mathbf B _{m-1-h'}(t,D_x)} D_t^{m-2-h'} \nonumber \\&\quad = - \sum _{h=0}^{m-2} \mathbf B _{h+1}(t,D_x)D_t^h , \end{aligned}$$
(16)

with

$$\begin{aligned} \mathbf B _{h+1}(t,D_x) = \sum _{h'=0}^{m-2-h} \left( {\begin{array}{c}m-1-h'\\ h\end{array}}\right) \mathbf A _{h'}(t,D_x)(D_t^{m-1-h-h'}A)(t,D_x). \end{aligned}$$

Note that in computing \(\mathbf B _{h+1}\) in the last line of (16), we use the binomial identity \(\left( {\begin{array}{c}m-1-h\\ m-1-h-k\end{array}}\right) = \left( {\begin{array}{c}m-1-h\\ k\end{array}}\right) \) and reorder the summation. This completes the proof after relabelling summation indices. \(\square \)
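The interchange of sums in (16) is a purely combinatorial statement and can be checked mechanically. In the sketch below each term \(\mathbf A _a(D_t^dA)D_t^p\) is recorded by the triple \((a,d,p)\) together with its binomial coefficient, once following (9)–(10) and once following (14)–(15); the overall minus sign is omitted in both. The function names are hypothetical.

```python
from math import comb


def terms_from_9_10(m):
    """Coefficients of A_a (D_t^d A) D_t^p read off from (9) and (10)."""
    out = {}
    for h in range(m - 1):            # h = 0, ..., m-2
        for hp in range(h, m - 1):    # h' = h, ..., m-2
            key = (h, hp + 1 - h, m - 2 - hp)
            out[key] = out.get(key, 0) + comb(m - 1 - h, hp + 1 - h)
    return out


def terms_from_14_15(m):
    """Coefficients of A_a (D_t^d A) D_t^p read off from (14) and (15)."""
    out = {}
    for h in range(m - 1):            # power of D_t: h = 0, ..., m-2
        for hp in range(m - 1 - h):   # h' = 0, ..., m-2-h
            key = (hp, m - 1 - h - hp, h)
            out[key] = out.get(key, 0) + comb(m - 1 - hp, h)
    return out


for m in range(2, 8):
    print(m, terms_from_9_10(m) == terms_from_14_15(m))
```

For \(m=3\) both functions return the three terms \(\mathbf A _0(D_t^2A)\), \(\mathbf A _1(D_tA)\) and \(2\mathbf A _0(D_tA)D_t\).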

Note that by rewriting the lower order terms as in Corollary 3.3 we clearly see that \(B(t,D_t,D_x)\) is of order \(m-2\) in \(D_t\) rather than of order \(m-1\). As explanatory examples we take a closer look at the operator \(B(t,D_t,D_x)\) in the cases \(m=2\) and \(m=3\).

Example 3.4

Consider \(m=2\). The sum in (14) has only one term. We have

$$\begin{aligned} \mathbf B _1(t,D_x) = \mathbf A _{0}(t,D_x)(D_tA)(t,D_x) \end{aligned}$$

with \(\mathbf A _{0}(t,D_x) = \sigma _{0}^{(2)}(\lambda )A^0(t,D_x) = I_2\) (see Lemma 7.4).

Example 3.5

Consider \(m=3\). The sum in (14) has two terms. We have

$$\begin{aligned} \mathbf B _1(t,D_x)= & {} \sum _{h'=0}^1 \left( {\begin{array}{c}2-h'\\ 0\end{array}}\right) \mathbf A _{h'}(t,D_x)(D_t^{2-h'}A)(t,D_x), \\= & {} \mathbf A _{0}(t,D_x)(D_t^2A)(t,D_x) + \mathbf A _1(t,D_x)(D_t A)(t,D_x), \\= & {} (D_t^2 A)(t,D_x) + (A(t,D_x)-{{\mathrm{tr}}}(A)(t,D_x)I_3)(D_tA)(t,D_x), \end{aligned}$$

and

$$\begin{aligned} \mathbf B _2(t,D_x) = 2 \mathbf A _0(t,D_x) (D_tA)(t,D_x) = 2 (D_tA)(t,D_x). \end{aligned}$$

Here we used the fact that \(\mathbf A _{0}(t,D_x) = \sigma _{0}^{(3)}(\lambda )A^0(t,D_x) = I_3\) and \(\sigma _1^{(3)}(\lambda ) = -{{\mathrm{tr}}}(A)(t,D_x)\) (see Lemma 7.4).

Corollary 3.3 completes Step 2 of our proof and allows us to transform (1) into

$$\begin{aligned}&{{{\mathrm{\mathbf {adj}}}}}(I_mD_t - A(t,D_x))(I_mD_t - A(t,D_x))u \nonumber \\&\quad = \delta (t,D_t,D_x)I_mu + B(t,D_t,D_x)u=0, \end{aligned}$$
(17)

where \(\delta (t,D_t,D_x)\) has symbol \(\det (I_m\tau - A(t,\xi ))\) and \(B(t,D_t,D_x)\) is given by (14). Note that \(\delta (t,D_t,D_x)\) is the scalar operator

$$\begin{aligned} D_t^m + \sum _{h=1}^{m} c_{h}(t,D_x)D_t^{m-h}, \end{aligned}$$

with \(c_{h}(t,\xi )\) a homogeneous polynomial of order h with respect to \(\xi \). Therefore, \(\delta (t,D_t,D_x)I_m\) is a decoupled system of m identical scalar differential operators of order m, while \(B(t,D_t,D_x)\) is a system of differential operators of order \(m-1\). As mentioned before, the \(c_h(t,\xi )\) are the coefficients of the characteristic polynomial of \(A(t,\xi )\), see the Appendix.

3.3 Step 3: Reduction to a first order system of pseudodifferential equations

We now transform the system in (17) into a system of pseudodifferential equations by following Taylor in [23]. More precisely, we transform each m-th order scalar equation in \(\delta (t,D_t,D_x)I_m\) into a first order pseudodifferential system in Sylvester form. In this way we obtain m systems with identical Sylvester matrix which can be put together in block-diagonal form obtaining a block-diagonal \(m^2\times m^2\) matrix with m identical Sylvester blocks. The precise structure of the lower order terms will be worked out in the next subsection. To carry out this transformation, we set

$$\begin{aligned} U= & {} (U_1,\dots ,U_m)^T \in \mathbb {R}^{m^2} \nonumber \\ U_i:= & {} \left( D_t^{j-1}\langle D_{x} \rangle ^{m-j}u_i \right) _{j=1,\dots ,m} \in \mathbb {R}^m, \quad i=1,\dots ,m, \end{aligned}$$
(18)

where the \(u_i\) are the components of the original vector u in (1). We can rewrite the Cauchy problem for (17) as

$$\begin{aligned} \left\{ \begin{array}{l} D_t U = \mathcal A(t,D_x) U + \mathcal B(t,D_x)U, \\ \left. U \right| _{t=0} = U_0=(U_{0,1}, \ldots , U_{0,m})^T, \end{array} \right. \end{aligned}$$
(19)

where the components \(U_{0,i}\) of the \(m^2\)-column vector \(U_0\) are given by

$$\begin{aligned} U_{0,i}=\left( D_t^{j-1}\langle D_{x} \rangle ^{m-j}u_{i}(0,x)\right) _{j=1,\ldots , m}, \end{aligned}$$

and u is the solution of the Cauchy problem (1) with \(u(0,x)=u_0\). Turning to the matrices \(\mathcal A(t,D_x)\) and \(\mathcal {B}(t,D_x)\), we have that \(\mathcal {A}(t,D_x)\) is an \(m^2 \times m^2\) block diagonal matrix consisting of m identical blocks of size \(m \times m\) of the type

$$\begin{aligned} \langle D_{x} \rangle \left( \begin{array}{ccccc} 0 &{} 1 &{} 0 &{} \cdots &{} 0 \\ 0 &{} 0 &{} 1 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \cdots &{} 1 \\ -c_{m}(t,D_x)\langle D_{x} \rangle ^{-m} &{} -c_{m-1}(t,D_x)\langle D_{x} \rangle ^{-m+1} &{} \dots &{} \dots &{} -c_1(t,D_x)\langle D_{x} \rangle ^{-1} \end{array}\right) .\nonumber \\ \end{aligned}$$
(20)

and the matrix \(\mathcal B(t,D_x)\) is composed of m matrices of size \(m \times m^2\) as follows:

$$\begin{aligned} \left( \begin{array}{cccccc} 0 &{}0 &{} 0 &{} \dots &{} 0 &{} 0 \\ 0 &{}0 &{} 0 &{} \dots &{} 0 &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots \\ l_{i,1}(t,D_x) &{} l_{i,2}(t,D_x) &{} \dots &{} \dots &{} l_{i,m^2-1}(t,D_x) &{} l_{i,m^2}(t,D_x) \end{array}\right) , \end{aligned}$$
(21)

\(i=1,\dots ,m\). Note that the entries of the matrices \(\mathcal {A}(t,D_x)\) and \(\mathcal {B}(t,D_x)\) are pseudodifferential operators of order 1 and 0, respectively.
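The Sylvester structure in (20) can be sanity-checked numerically. The following Python sketch is only an illustration under simplifying assumptions: it works with the normalised block (the factor \(\langle D_x \rangle \) and the scalings are absorbed, so the last row is \((-c_m,\dots ,-c_1)\)), it builds the \(c_h\) from a prescribed list of eigenvalues, and it uses integer eigenvalues to keep the arithmetic exact. The helper names `char_coeffs`, `sylvester_block` and `is_eigenpair` are hypothetical and do not appear in the paper.

```python
def char_coeffs(lams):
    """Coefficients [1, c_1, ..., c_m] of prod_i (tau - lam_i), highest power first."""
    coeffs = [1]
    for lam in lams:
        # Multiply the current polynomial by (tau - lam).
        coeffs = [a - lam * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def sylvester_block(lams):
    """Normalised Sylvester block: shifted identity plus last row (-c_m, ..., -c_1)."""
    m = len(lams)
    c = char_coeffs(lams)
    rows = [[1 if col == row + 1 else 0 for col in range(m)] for row in range(m - 1)]
    rows.append([-c[m - col] for col in range(m)])
    return rows


def is_eigenpair(block, lam):
    """Check block * v == lam * v for v = (1, lam, ..., lam^(m-1))."""
    m = len(block)
    v = [lam ** k for k in range(m)]
    image = [sum(block[r][k] * v[k] for k in range(m)) for r in range(m)]
    return image == [lam * x for x in v]


if __name__ == "__main__":
    lams = [1, 2, 3]  # illustrative eigenvalues, not taken from the paper
    block = sylvester_block(lams)
    print(block)
    print(all(is_eigenpair(block, lam) for lam in lams))
```

The check relies only on the fact that each \(\lambda _i\) is a root of the characteristic polynomial, so it also works for repeated eigenvalues.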

3.4 Step 4: Structure of the matrix \(\mathcal {B}(t,D_x)\) of the lower order terms

To analyse the structure of the \(m^2\times m^2\) matrix \(\mathcal {B}(t,D_x)\) we recall that it is obtained from the \(m\times m\) matrix \(B(t,D_t,D_x)\) in (17) via the transformation (18).

From Corollary 3.3 we have that

$$\begin{aligned} B(t,D_t,D_x)u = \left( - \sum _{h=0}^{m-2} \sum _{j=1}^m b_{ij}^{(h+1)}(t,D_x)D_t^h u_j \right) _{i=1,\dots ,m}, \end{aligned}$$
(22)

where the \(b_{ij}^{(h+1)}(t,D_x)\) denote the (ij)-element of \(\mathbf B _{h+1}(t,D_x)\) in (14). By the previously described transform (18), we obtain that

$$\begin{aligned} D_t^m u_i = -\sum _{h=0}^{m-1} c_{m-h}(t,D_x)D_t^h u_i + \sum _{j=1}^{m} \sum _{h=0}^{m-2} b^{(h+1)}_{ij}(t,D_x)D_t^h u_j \end{aligned}$$

and, thus, see that the coefficients \(b_{ij}^{(1)}(t,D_x)\) in (22) will be associated to \(l_{i,1+(j-1)m}(t,D_x)\) for \(j=1,\dots ,m\), the coefficients \(b_{ij}^{(2)}(t,D_x)\) to \(l_{i,2+(j-1)m}(t,D_x)\) for \(j=1,\dots ,m\) and so forth. In particular, we get that \(l_{i,m+(j-1)m}(t,D_x) \equiv 0\) for \(j=1,\dots ,m\) which is due to the fact that (1) is homogeneous. As a general formula for the non-zero elements of \(\mathcal {B}(t,D_x)\), we can write

$$\begin{aligned} l_{i,h+1+(j-1)m}(t,D_x) = b_{ij}^{(h+1)}(t,D_x)\langle D_{x} \rangle ^{1-m+h} \end{aligned}$$
(23)

for \(j=1,\dots ,m\) and \(h=0,\dots ,m-2\).
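The index bookkeeping in (23) is easy to get wrong, so here is a small Python sketch of it. It only encodes the column position \(h+1+(j-1)m\), the scaling exponent \(1-m+h\) and the vanishing columns \(m+(j-1)m\) stated above; the function names are hypothetical and chosen for illustration.

```python
def column_index(h, j, m):
    """1-based column of l_{i, h+1+(j-1)m} for h = 0..m-2 and j = 1..m, as in (23)."""
    if not (0 <= h <= m - 2 and 1 <= j <= m):
        raise ValueError("need 0 <= h <= m-2 and 1 <= j <= m")
    return h + 1 + (j - 1) * m


def scaling_exponent(h, m):
    """Exponent of <D_x> multiplying b_ij^(h+1) in (23)."""
    return 1 - m + h


def nonzero_columns(m):
    """All columns of the last row of each block that may carry a lower order term."""
    return sorted(column_index(h, j, m) for j in range(1, m + 1) for h in range(m - 1))


def zero_columns(m):
    """Columns m + (j-1)m, which are identically zero."""
    return [m + (j - 1) * m for j in range(1, m + 1)]


if __name__ == "__main__":
    for m in (2, 3):
        print(m, nonzero_columns(m), zero_columns(m))
```

For \(m=3\) this gives the non-zero columns 1, 2, 4, 5, 7, 8 and the zero columns 3, 6, 9, and for \(m=2\) the non-zero columns 1, 3 and the zero columns 2, 4.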

To avoid further complicating the notation, from now on, whenever the \(b_{ij}^{(l)}(t,\xi )\) are referenced as elements of \(\mathcal {B}(t,\xi )\), we understand them as the elements in (23), i.e., scaled by \(\langle \xi \rangle ^{l-m}\).

For the convenience of the reader, we conclude this section by illustrating Steps 1–4 in the cases \(m=2\) and \(m=3\). For simplicity, we take \(x\in \mathbb {R}\).

3.5 Steps 1–4 for \(m=2\)

We consider the system

$$\begin{aligned} D_t u - A(t)D_x u = D_t \left( \begin{array}{l} u_1 \\ u_2 \end{array}\right) - \left( \begin{array}{ll} a_{11}(t) &{} a_{12}(t) \\ a_{21}(t) &{} a_{22}(t) \end{array}\right) D_x \left( \begin{array}{l} u_1 \\ u_2 \end{array}\right) = 0 \end{aligned}$$
(24)

for \((t,x) \in [0,T] \times \mathbb {R}\). Computing the adjugate of \(I_2 \tau - A(t)\xi \) we obtain

$$\begin{aligned} {{{\mathrm{\mathbf {adj}}}}}(I_2 \tau - A(t)\xi ) = \left( \begin{array}{ll} \tau &{} 0 \\ 0 &{} \tau \end{array}\right) - \left( \begin{array}{ll} a_{22}(t) &{} -a_{12}(t) \\ -a_{21}(t) &{} a_{11}(t) \end{array}\right) \xi = I_2 \tau - {{{\mathrm{\mathbf {adj}}}}}(A)(t)\xi . \end{aligned}$$

Applying the corresponding operator to (24), we obtain

$$\begin{aligned} \left( I_2 D_t - {{{\mathrm{\mathbf {adj}}}}}(A)(t)D_x \right) \left( I_2 D_t - A(t)D_x \right) u= & {} \delta (t,D_t,D_x)u - (D_tA)(t)D_x u \nonumber \\= & {} \delta (t,D_t,D_x)u - \mathbf B _1(t,D_x)u, \end{aligned}$$
(25)

where \(\mathbf B _1(t,D_x)\) is given by (15) with \(h=0\).

Now we set

$$\begin{aligned} U= & {} (U_1 , U_2, U_3, U_4)^T = (\langle D_{x} \rangle u_1, D_t u_1 , \langle D_{x} \rangle u_2, D_t u_2)^T, \\ D_t U= & {} ( \langle D_{x} \rangle U_2, D_t^2 u_1 , \langle D_{x} \rangle U_4, D_t^2 u_2)^T \end{aligned}$$

and thus obtain the system

$$\begin{aligned} D_t U = \mathcal A(t,D_x) U + \mathcal B(t,D_x) U, \end{aligned}$$

where \(\mathcal {A}(t,D_x)\) is a \(4 \times 4\) block diagonal matrix, as in (20), with the block

$$\begin{aligned} \langle D_{x} \rangle \left( \begin{array}{cc} 0 &{} 1 \\ -\mathrm{det}(A)(t)D_x^2\langle D_{x} \rangle ^{-2} &{} {{\mathrm{tr}}}(A)(t)D_x\langle D_{x} \rangle ^{-1} \end{array}\right) \end{aligned}$$

and \(\mathcal {B}(t,D_x)\) is a \(4 \times 4\) matrix of two \(2 \times 4\) blocks

$$\begin{aligned} \mathcal {B}_i(t,D_x)= \left( \begin{array}{cccc} 0 &{} 0 &{} 0 &{} 0 \\ D_t a_{1i}(t)D_x\langle D_{x} \rangle ^{-1} &{} 0 &{} D_t a_{2i}(t)D_x\langle D_{x} \rangle ^{-1} &{} 0 \end{array}\right) , \quad i=1,2. \end{aligned}$$

Note that the entries of the matrix \(\mathcal {B}_i(t,D_x)\) can be obtained from (23) by setting \(h=0\) and \(j=1,2\).

3.6 Steps 1–4 for \(m=3\)

We consider

$$\begin{aligned} D_t \left( \begin{array}{l} u_1 \\ u_2 \\ u_3 \end{array}\right) - \left( \begin{array}{lll} a_{11}(t) &{} a_{12}(t) &{} a_{13}(t) \\ a_{21}(t) &{} a_{22}(t) &{} a_{23}(t) \\ a_{31}(t) &{} a_{32}(t) &{} a_{33}(t) \end{array}\right) D_x \left( \begin{array}{l} u_1 \\ u_2 \\ u_3 \end{array}\right) = 0 \end{aligned}$$

for \((t,x) \in [0,T] \times \mathbb {R}\). We have

$$\begin{aligned} {{{\mathrm{\mathbf {adj}}}}}(I_3 \tau - A(t)\xi ) = I_3 \tau ^2 + (A(t)-{{\mathrm{tr}}}(A)(t)I_3)\xi \tau + {{{\mathrm{\mathbf {adj}}}}}(A)(t)\xi ^2 \end{aligned}$$

and therefore

$$\begin{aligned} {{{\mathrm{\mathbf {adj}}}}}(I_3 D_t - A(t)D_x) = I_3 D_t^2 + (A(t)-{{\mathrm{tr}}}(A)(t)I_3) D_tD_x + {{{\mathrm{\mathbf {adj}}}}}(A)(t)D_x^2. \end{aligned}$$
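At the level of symbols, the expansion of the adjugate above can be checked with exact integer arithmetic. The Python sketch below is only a sanity check under the assumption of a fixed (frozen in t) matrix A and fixed values of \(\tau \) and \(\xi \); it does not capture the lower order terms produced by the time derivatives of A. The function names and the sample matrix are hypothetical.

```python
def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def adj3(M):
    """Adjugate of a 3x3 matrix, i.e. the transpose of its cofactor matrix."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]


def matmul3(X, Y):
    return [[sum(X[r][k] * Y[k][c] for k in range(3)) for c in range(3)] for r in range(3)]


def check_adjugate_formula(A, tau, xi):
    """Check adj(I tau - A xi) = I tau^2 + (A - tr(A) I) xi tau + adj(A) xi^2
    and adj(M) M = det(M) I for M = I tau - A xi."""
    I = [[1 if r == c else 0 for c in range(3)] for r in range(3)]
    tr = A[0][0] + A[1][1] + A[2][2]
    adjA = adj3(A)
    M = [[tau * I[r][c] - A[r][c] * xi for c in range(3)] for r in range(3)]
    lhs = adj3(M)
    rhs = [[tau ** 2 * I[r][c] + (A[r][c] - tr * I[r][c]) * xi * tau + adjA[r][c] * xi ** 2
            for c in range(3)] for r in range(3)]
    d = det3(M)
    return lhs == rhs and matmul3(lhs, M) == [[d * I[r][c] for c in range(3)] for r in range(3)]


if __name__ == "__main__":
    A = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]  # arbitrary sample matrix
    print(check_adjugate_formula(A, tau=7, xi=3))
```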

Applying this operator to the original system, we obtain

$$\begin{aligned} {{{\mathrm{\mathbf {adj}}}}}(I_3 D_t - A(t)D_x)(I_3 D_t - A(t)D_x)u = \delta (t,D_t,D_x)u + B(t,D_t,D_x)u, \end{aligned}$$

where we used the fact that \({{{\mathrm{\mathbf {adj}}}}}(A) = A^2 + c_1A + c_2I_3\) (see Example 7.6) and set

$$\begin{aligned} B(t,D_t,D_x)&= - (D_t^2 A)(t)D_x - 2(D_t A)(t)D_xD_t + {{\mathrm{tr}}}(A)(t)(D_t A)(t)D_x^2 \nonumber \\&\qquad - A(t)(D_t A)(t)D_x^2, \\&= -\mathbf B _1(t,D_x) - \mathbf B _2(t,D_x)D_t,\nonumber \end{aligned}$$
(26)

corresponding to (14). Now we introduce

$$\begin{aligned}&U&= (U_1, U_2, U_3)^T \in \mathbb {R}^9 \,\,\text {with} \\&U_j&= ( \langle D_{x} \rangle ^2 u_j, D_t \langle D_{x} \rangle u_j, D_t^2 u_j),\quad j=1,2,3. \end{aligned}$$

Thus, we obtain

$$\begin{aligned} D_t U = \mathcal A(t,D_x) U + \mathcal B(t,D_x)U, \end{aligned}$$

where \(\mathcal {A}(t,D_x)\) is a block diagonal matrix with three blocks of the type

$$\begin{aligned} \langle D_{x} \rangle \left( \begin{array}{ccc} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 1 \\ -c_3(t,D_x)\langle D_x \rangle ^{-3} &{} -c_2(t,D_x) \langle D_x \rangle ^{-2}&{} -c_1(t,D_x)\langle D_x \rangle ^{-1} \end{array}\right) . \end{aligned}$$

By direct computation (see Appendix), we get that \(c_{h}(t,D_x) = \sigma _h^{(3)}(\lambda )\), where

$$\begin{aligned} \sigma _1^{(3)}(\lambda )= & {} - {{\mathrm{tr}}}(A)(t,D_x) \\ \sigma _2^{(3)}(\lambda )= & {} a_{11}(t)a_{22}(t)D_x^2 + a_{11}(t)a_{33}(t)D_x^2 + a_{22}(t)a_{33}(t)D_x^2 \\&- a_{23}(t)a_{32}(t)D_x^2 - a_{12}(t)a_{21}(t)D_x^2 - a_{31}(t)a_{13}(t)D_x^2\\ \sigma _3^{(3)}(\lambda )= & {} -\det (A)(t,D_x). \end{aligned}$$

Indeed, since

$$\begin{aligned} \det (I_3\tau - A) = \prod _{i=1}^3 (\tau -\lambda _i) =\sum _{h=0}^3 \sigma ^{(3)}_h(\lambda ) \tau ^{3-h}, \end{aligned}$$

it follows that

$$\begin{aligned}&\det (I_3\tau - A) \nonumber \\&\quad = \tau ^3 + \underbrace{(-a_{11}-a_{22}-a_{33})}_{\sigma _{1}^{(3)}(\lambda ) = -{{\mathrm{tr}}}(A)} \tau ^2 \\&\qquad + \underbrace{(a_{11}a_{22}-a_{12}a_{21}+a_{11}a_{33}-a_{13}a_{31}+a_{22}a_{33}-a_{23}a_{32})}_{\sigma _2^{(3)}(\lambda )} \tau \\&\qquad + \underbrace{(-a_{11}a_{22}a_{33}+a_{11}a_{23}a_{32}+a_{12}a_{21}a_{33} - a_{12}a_{23}a_{31} - a_{13}a_{21}a_{32} + a_{13}a_{22}a_{31})}_{\sigma _3^{(3)}(\lambda ) = -\det (A)}. \end{aligned}$$

Finally, the matrix \(\mathcal {B}(t,D_x)\) is made of three blocks of \(3 \times 9\) matrices

$$\begin{aligned} \mathcal {B}_k(t,D_x)= \left( \begin{array}{lllllllll} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ b_{k1}^{(1)} &{} b_{k1}^{(2)} &{} 0 &{} b_{k2}^{(1)} &{} b_{k2}^{(2)} &{} 0 &{} b_{k3}^{(1)} &{} b_{k3}^{(2)} &{} 0 \end{array}\right) , \end{aligned}$$

\(k=1,2,3\) which correspond to (21) via formula (23). More precisely, we get

$$\begin{aligned} b^{(1)}_{kj}= & {} (D_t^2 a_{kj} + 2D_t a_{kj} - {{\mathrm{tr}}}(A_0)D_t a_{kj})D_x\langle D_x \rangle ^{-1},\nonumber \\ b^{(2)}_{kj}= & {} (a_{k1}D_t a_{1j} + a_{k2}D_t a_{2j} + a_{k3}D_t a_{3j})D_x^2\langle D_x \rangle ^{-2}, \end{aligned}$$
(27)

for \(k=1,2,3\) and \(j=1,2,3\). The elements \(b^{(1)}_{kj}\) and \(b^{(2)}_{kj}\) are the scaled (kj)-elements of the matrices \(\mathbf B _1(t,D_x)\) and \(\mathbf B _2(t,D_x)\) from (26), respectively.

4 Energy estimate

Now we apply the Fourier transform with respect to x to the Cauchy problem in (19) and set \(\mathcal F_{x \rightarrow \xi }(U)(t,\xi ) =: V(t,\xi )\). We then obtain

$$\begin{aligned} \left\{ \begin{array}{ll} D_t V = \mathcal A(t,\xi ) V + \mathcal B(t,\xi ) V, \\ \left. V\right| _{t=0} = V_0, \end{array} \right. \end{aligned}$$
(28)

where \(V_0=\widehat{U_0}.\) From now on, we will concentrate on (28) and the matrix

$$\begin{aligned} \mathcal A_0(t,\xi ) := \langle \xi \rangle ^{-1} \mathcal A(t,\xi ). \end{aligned}$$

Note that by construction of \(\mathcal A(t,\xi )\), the matrix \(\mathcal A_0(t,\xi )\) is made of m identical Sylvester type blocks with eigenvalues \(\lambda _l(t,\xi )\), \(l=1,\dots ,m\), where the \(\lambda _l(t,\xi )\) are the rescaled eigenvalues introduced in Sect. 1, i.e., \(\lambda _l(t,\xi )\langle \xi \rangle \), \(l=1,\dots , m\), are the eigenvalues of the original matrix \(A(t,\xi )\) in (1).

4.1 Step 5: Computing the energy estimate

Let \(\mathcal {Q}_\varepsilon ^{(m)}(t,\xi )\) be the quasi-symmetriser of the matrix \(\mathcal A_0(t,\xi )\). By Remark 2.5 it is an \(m^2\times m^2\) block diagonal matrix with m identical blocks given by the quasi-symmetriser \(Q_\varepsilon ^{(m)}(t,\xi )\) of the defining block of \(\mathcal A_0(t,\xi )\) (see Sect. 2 for the definition and properties). Hence, we define the energy

$$\begin{aligned} E_\varepsilon (V)(t,\xi ) = \big ( \mathcal {Q}_\varepsilon ^{(m)}(t,\xi )V(t,\xi )| V(t,\xi ) \big ) \end{aligned}$$

where \((\cdot | \cdot )\) denotes the scalar product in \(\mathbb {R}^{m^2}\). To improve the readability, we drop the dependencies on t and \(\xi \) in the following unless we find it important to stress. By direct computations we have

$$\begin{aligned} \partial _t E_\varepsilon&=(\partial _t\mathcal {Q}^{(m)}_\varepsilon V | V)+ i(\mathcal {Q}^{(m)}_\varepsilon D_tV| V)-i(\mathcal {Q}^{(m)}_\varepsilon V | D_tV)\\&=(\partial _t\mathcal {Q}^{(m)}_\varepsilon V | V)+i(\mathcal {Q}^{(m)}_\varepsilon (\mathcal {A} V+\mathcal {B}V) | V)-i(\mathcal {Q}^{(m)}_\varepsilon V | \mathcal {A} V+ \mathcal {B}V)\\&=(\partial _t\mathcal {Q}^{(m)}_\varepsilon V | V)+i\langle \xi \rangle ((\mathcal {Q}^{(m)}_\varepsilon \mathcal {A}_0- \mathcal {A}_0^*\mathcal {Q}^{(m)}_\varepsilon )V | V)\\&\quad +i((\mathcal {Q}^{(m)}_\varepsilon \mathcal {B}- \mathcal {B}^*\mathcal {Q}^{(m)}_\varepsilon )V | V). \end{aligned}$$

It follows that

$$\begin{aligned} \partial _t E_\varepsilon&\le \frac{|(\partial _t\mathcal {Q}^{(m)}_\varepsilon V | V)|E_\varepsilon }{(\mathcal {Q}^{(m)}_\varepsilon V | V)}+|\langle \xi \rangle ((\mathcal {Q}^{(m)}_\varepsilon \mathcal {A}_0- \mathcal {A}_0^*\mathcal {Q}^{(m)}_\varepsilon )V | V)| \nonumber \\&\quad +|((\mathcal {Q}^{(m)}_\varepsilon \mathcal {B}- \mathcal {B}^*\mathcal {Q}^{(m)}_\varepsilon )V | V)|. \end{aligned}$$
(29)

By Proposition 2.1 it follows that \(\mathcal {Q}_\varepsilon ^{(m)}(t,\xi )\) is a family of \(C^\infty \), non-negative Hermitian matrices such that

$$\begin{aligned} \mathcal {Q}_\varepsilon ^{(m)}(t,\xi )=\mathcal {Q}_0^{(m)}(t,\xi )+\varepsilon ^2 \mathcal {Q}_1^{(m)}(t,\xi )+\cdots +\varepsilon ^{2(m-1)}\mathcal {Q}_{m-1}^{(m)}(t,\xi ). \end{aligned}$$

In addition, by the same proposition, there exists a constant \(C_m>0\) such that for all \(t\in [0,T]\), \(\xi \in \mathbb {R}^n\) and \(\varepsilon \in (0,1]\) the following estimates hold uniformly in \(V \in \mathbb {R}^{m^2}\):

$$\begin{aligned} C_m^{-1}\varepsilon ^{2(m-1)}|V|^2&\le (\mathcal {Q}^{(m)}_\varepsilon V|V)\le C_m|V|^2,\end{aligned}$$
(30)
$$\begin{aligned} |((\mathcal {Q}_\varepsilon ^{(m)}\mathcal A_0-\mathcal A_0^*\mathcal {Q}_\varepsilon ^{(m)}) V|V)|&\le C_m\varepsilon (\mathcal {Q}_\varepsilon ^{(m)} V|V) \end{aligned}$$
(31)

Finally, the hypothesis (2) on the eigenvalues and Proposition 2.3 ensure that the family

$$\begin{aligned} \{ \mathcal {Q}_\varepsilon ^{(m)}(t,\xi ):\, \varepsilon \in (0,1],\, t\in [0,T],\, \xi \in \mathbb {R}^n\} \end{aligned}$$

is nearly diagonal.

Note that since the entries of the matrix \(A(t,\xi )\) in (1) are \(C^\infty \) with respect to t, the matrices \(\mathcal A(t,\xi )\) and \(\mathcal B(t,\xi )\) as well as the quasi-symmetriser have the same regularity properties.

We now proceed by estimating the three summands in the right-hand side of (29). Due to the block diagonal structure of the matrices involved we can make use of the proof strategy adopted for the scalar case in [12, Subsections 4.1, 4.2, 4.3].

4.2 First term

Let \(k\ge 1\). We write \(\frac{|(\partial _t\mathcal {Q}^{(m)}_\varepsilon V|V)|}{(\mathcal {Q}^{(m)}_\varepsilon V| V)}\) as

$$\begin{aligned} \frac{|(\partial _t\mathcal {Q}^{(m)}_\varepsilon V | V)|}{(\mathcal {Q}^{(m)}_\varepsilon V | V)^{1-1/k}(\mathcal {Q}^{(m)}_\varepsilon V | V)^{1/k}}. \end{aligned}$$

From (30) we have

$$\begin{aligned} \frac{|(\partial _t\mathcal {Q}^{(m)}_\varepsilon V|V)|}{(\mathcal {Q}^{(m)}_\varepsilon V| V)}\le & {} \frac{|(\partial _t\mathcal {Q}^{(m)}_\varepsilon V|V)|}{(\mathcal {Q}^{(m)}_\varepsilon V | V)^{1-1/k}(C_m^{-1}\varepsilon ^{2(m-1)}|V|^2)^{1/k}}\\\le & {} C_m^{1/k}\varepsilon ^{-2(m-1)/k}\frac{|(\partial _t \mathcal {Q}^{(m)}_\varepsilon V|V)|}{(\mathcal {Q}^{(m)}_\varepsilon V | V)^{1-1/k}|V|^{2/k}}. \end{aligned}$$

A block-wise application of Lemma 2.4 yields the estimate

$$\begin{aligned} \int _{0}^T\frac{|(\partial _t\mathcal {Q}^{(m)}_\varepsilon V|V)|}{(\mathcal {Q}^{(m)}_\varepsilon V| V)}\, dt\le & {} C_m^{1/k}\varepsilon ^{-2(m-1)/k}C_T\sup _{\xi \in \mathbb {R}^n}\Vert \mathcal {Q}^{(m)}_\varepsilon (\cdot ,\xi )\Vert ^{1/k}_{{C}^k([0,T])}\\\le & {} C_1\varepsilon ^{-2(m-1)/k}, \end{aligned}$$

for all \(\varepsilon \in (0,1]\). Setting \(\frac{|(\partial _t\mathcal {Q}^{(m)}_\varepsilon V | V)|}{(\mathcal {Q}^{(m)}_\varepsilon V | V)} =: K_\varepsilon (t,\xi )\), we can conclude that

$$\begin{aligned} \frac{|(\partial _t \mathcal {Q}^{(m)}_\varepsilon V|V)|E_\varepsilon }{(\mathcal {Q}^{(m)}_\varepsilon V | V)}=K_\varepsilon (t,\xi )E_\varepsilon , \end{aligned}$$

with

$$\begin{aligned} \int _{0}^T K_\varepsilon (t,\xi )\, dt\le C_1\varepsilon ^{-2(m-1)/k}. \end{aligned}$$

4.3 Second term

From the property (31) we immediately have that

$$\begin{aligned} |\langle \xi \rangle ((\mathcal {Q}^{(m)}_\varepsilon \mathcal A_0- \mathcal A_0^*\mathcal {Q}^{(m)}_\varepsilon )V|V)|\le C_m\varepsilon \langle \xi \rangle (\mathcal {Q}_\varepsilon ^{(m)}V|V)\le C_2\varepsilon \langle \xi \rangle E_\varepsilon . \end{aligned}$$

4.4 Third term

In this subsection, we treat the third term on the right-hand side of (29). By Proposition 2.1(iv) and the definition of the matrix \(\mathcal B(t,\xi )\) we have that

$$\begin{aligned} ((\mathcal {Q}^{(m)}_\varepsilon \mathcal B- \mathcal B^*\mathcal {Q}^{(m)}_\varepsilon )V|V)&= ((\mathcal {Q}_0^{(m)}\mathcal B- \mathcal B^*\mathcal {Q}_0^{(m)})V|V)\\&\quad +\varepsilon ^2\sum _{i=1}^m((\mathcal {Q}^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp \mathcal B-\mathcal B^*\mathcal {Q}^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp )V|V), \end{aligned}$$

with \(\mathcal {Q}^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp \) block diagonal matrix with m blocks \({Q}^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp \) as defined in Proposition 2.1(iv). Note that

$$\begin{aligned} (\mathcal {Q}^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp \mathcal B - \mathcal B^*\mathcal {Q}^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp )= 0, \end{aligned}$$

for all \(i=1,\dots ,m\), due to the structure of zeros in \(\mathcal B\) and in \(\mathcal {Q}^{(m-1)}_\varepsilon (\pi _i\lambda )^\sharp \). Thus,

$$\begin{aligned} ((\mathcal {Q}^{(m)}_\varepsilon \mathcal B - \mathcal B^*\mathcal {Q}^{(m)}_\varepsilon )V|V)=((\mathcal {Q}_0^{(m)} \mathcal B - \mathcal B^*\mathcal {Q}_0^{(m)})V|V). \end{aligned}$$

Since from Proposition 2.1(i) the quasi-symmetriser is made of non-negative matrices we have that

$$\begin{aligned} (\mathcal {Q}^{(m)}_0 V | V)\le E_\varepsilon . \end{aligned}$$

It is the purpose of the next section to find suitable Levi conditions on \(\mathcal B(t,\xi )\) such that

$$\begin{aligned} |((\mathcal {Q}_0^{(m)} \mathcal B - \mathcal B^*\mathcal {Q}_0^{(m)})V | V)|\le C_3 (\mathcal {Q}^{(m)}_0 V | V)\le C_3 E_\varepsilon \end{aligned}$$
(32)

holds for some constant \(C_3>0\) independent of \(t\in [0,T]\), \(\xi \in \mathbb {R}^n\) and \(V\in \mathbb {C}^{m^2}\). We will then formulate these Levi-type conditions in terms of the matrix A in (1).

5 Estimates for the lower order terms

We remind the reader that the \(b_{ij}^{(l)}(t,\xi )\), when referenced as elements of \(\mathcal B(t,\xi )\), are the (ij)-elements of \(\mathbf B _l(t,\xi )\) in (14) scaled by \(\langle \xi \rangle ^{l-m}\). See also Sect. 3.4 for details.

To start, we rewrite \(((\mathcal {Q}_0^{(m)} \mathcal B - \mathcal B^*\mathcal {Q}_0^{(m)})V | V)\) in terms of the matrix \(\mathcal W^{(m)}\). Recall from Sect. 2 that \(\mathcal W^{(m)}\) is the \(m^2\times m^2\) block diagonal matrix with m identical blocks

$$\begin{aligned} W^{(m)}=\left( \begin{array}{l} W^{(m)}_1(\lambda )\\ \vdots \\ W^{(m)}_m(\lambda ) \end{array}\right) , \end{aligned}$$

with

$$\begin{aligned} W^{(m)}_i(\lambda )=(\sigma _{m-1}^{(m-1)}(\pi _i\lambda ),\ldots ,\sigma _1^{(m-1)}(\pi _i\lambda ),1),\quad 1\le i\le m. \end{aligned}$$

From Proposition 2.1(v) we have

$$\begin{aligned} ((\mathcal {Q}_0^{(m)} \mathcal B - \mathcal B^*\mathcal {Q}_0^{(m)})V | V)= & {} (m-1)!((\mathcal {W}^{(m)} \mathcal BV | \mathcal {W}^{(m)}V)-(\mathcal {W}^{(m)}V |\mathcal {W}^{(m)}\mathcal BV))\\= & {} 2i(m-1)!{{\mathrm{Im}}}(\mathcal {W}^{(m)}\mathcal B V | \mathcal {W}^{(m)}V). \end{aligned}$$

It follows that

$$\begin{aligned} |((\mathcal {Q}_0^{(m)} \mathcal B-\mathcal B^*\mathcal {Q}_0^{(m)})V | V)|\le 2(m-1)!|\mathcal {W}^{(m)}\mathcal BV||\mathcal {W}^{(m)}V|. \end{aligned}$$

Since

$$\begin{aligned} (\mathcal {Q}_0^{(m)} V | V)=(m-1)!|\mathcal {W}^{(m)}V|^2, \end{aligned}$$

we have that if

$$\begin{aligned} |\mathcal {W}^{(m)}\mathcal BV| \le C|\mathcal {W}^{(m)}V| \end{aligned}$$
(33)

holds true for some constant \(C>0\), independent of t, \(\xi \) and V, then estimate (32) will hold true as well.

In the sequel, for the sake of simplicity, we will use the following notation: given two real-valued functions f and g of the variable y, we write \(f(y)\prec g(y)\) if there exists a constant \(C>0\) such that \(f(y)\le C g(y)\) for all y. More precisely, we will set \(y=(t,\xi )\) or \(y=(t,\xi ,V)\). Thus, (33) can be rewritten as

$$\begin{aligned} |\mathcal {W}^{(m)}\mathcal BV| \prec |\mathcal {W}^{(m)}V|. \end{aligned}$$

In analogy with the scalar case in [12], we will now focus on (33). Before proceeding with our general result, for the benefit of the reader we illustrate the main ideas leading to the Levi-type conditions on \(\mathcal B\) in the cases \(m=2\) and \(m=3\).

5.1 The case \(m=2\)

For simplicity we take \(n=1\). From Sects. 3.5 and 2.1 we have that

$$\begin{aligned} \mathcal B(t,\xi )=\left( \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} 0 &{} 0 &{} 0 &{} 0 \\ D_t a_{11}(t) &{} 0 &{} D_t a_{21}(t) &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ D_t a_{12}(t) &{} 0 &{} D_t a_{22}(t) &{} 0 \end{array}\right) \xi \langle \xi \rangle ^{-1} \end{aligned}$$

and

$$\begin{aligned} \mathcal W^{(2)}(t,\xi )=\left( \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} -\lambda _2 &{} 1 &{} 0 &{} 0 \\ -\lambda _1 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} -\lambda _2 &{} 1 \\ 0 &{} 0 &{} -\lambda _1 &{} 1 \\ \end{array}\right) , \end{aligned}$$

respectively. We have

$$\begin{aligned} \mathcal W^{(2)} \mathcal B V= & {} \left( \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} -\lambda _2 &{} 1 &{} 0 &{} 0 \\ -\lambda _1 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} -\lambda _2 &{} 1 \\ 0 &{} 0 &{} -\lambda _1 &{} 1 \\ \end{array}\right) \left( \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} 0 &{} 0 &{} 0 &{} 0 \\ D_t a_{11}(t) &{} 0 &{} D_t a_{21}(t) &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 \\ D_t a_{12}(t) &{} 0 &{} D_t a_{22}(t) &{} 0 \end{array}\right) \xi \langle \xi \rangle ^{-1} \left( \begin{array}{l} V_1 \\ V_2 \\ V_3 \\ V_4 \end{array}\right) \\= & {} \left( \begin{array}{llll} D_t a_{11}(t) &{} 0 &{} D_t a_{21}(t) &{} 0 \\ D_t a_{11}(t) &{} 0 &{} D_t a_{21}(t) &{} 0 \\ D_t a_{12}(t) &{} 0 &{} D_t a_{22}(t) &{} 0 \\ D_t a_{12}(t) &{} 0 &{} D_t a_{22}(t) &{} 0 \end{array}\right) \left( \begin{array}{l} V_1 \\ V_2 \\ V_3 \\ V_4 \end{array}\right) \xi \langle \xi \rangle ^{-1}\nonumber \\= & {} \left( \begin{array}{l} D_t a_{11}(t)V_1 + D_t a_{21}(t) V_3 \\ D_t a_{11}(t)V_1 + D_t a_{21}(t) V_3 \\ D_t a_{12}(t)V_1 + D_t a_{22}(t) V_3 \\ D_t a_{12}(t)V_1 + D_t a_{22}(t) V_3 \end{array}\right) \xi \langle \xi \rangle ^{-1} \end{aligned}$$

and

$$\begin{aligned} \mathcal W ^{(2)}V = \left( \begin{array}{llll} -\lambda _2 &{} 1 &{} 0 &{} 0 \\ -\lambda _1 &{} 1 &{} 0 &{} 0 \\ 0 &{} 0 &{} -\lambda _2 &{} 1 \\ 0 &{} 0 &{} -\lambda _1 &{} 1 \\ \end{array}\right) \left( \begin{array}{l} V_1 \\ V_2 \\ V_3 \\ V_4 \end{array}\right) = \left( \begin{array}{l} -\lambda _2 V_1 + V_2 \\ -\lambda _1 V_1 + V_2 \\ -\lambda _2 V_3 + V_4 \\ -\lambda _1 V_3 + V_4 \end{array}\right) . \end{aligned}$$

Thus, we obtain that \(|\mathcal W^{(2)} \mathcal B V|^ 2\prec |\mathcal W^{(2)}V|^2\) is equivalent to

$$\begin{aligned}&|D_t a_{11}(t)V_1 + D_t a_{21}(t) V_3|^2\xi ^2\langle \xi \rangle ^{-2} + |D_t a_{12}(t)V_1 + D_t a_{22}(t) V_3|^2\xi ^2\langle \xi \rangle ^{-2} \nonumber \\&\quad \prec |-\lambda _2 V_1 + V_2|^2 + |-\lambda _1 V_1 + V_2|^2 + |-\lambda _2 V_3 + V_4|^2 + |-\lambda _1 V_3 + V_4|^2. \end{aligned}$$
(34)

We now estimate the left-hand side of (34) from above and the right-hand side from below. We get

$$\begin{aligned}&|D_t a_{11}(t)V_1 + D_t a_{21}(t) V_3|^2 + |D_t a_{12}(t)V_1 + D_t a_{22}(t) V_3|^2 \\&\quad \prec ( |D_t a_{11}(t)|^2 + |D_t a_{12}(t)|^2 )|V_1|^2 + ( |D_t a_{21}(t)|^2 + |D_t a_{22}(t)|^2 ) |V_3|^2 \end{aligned}$$

and, by using the inequality \(|z_1|^2 + |z_2|^2 \ge \frac{1}{2} |z_1 - z_2|^2\), \(z_1,z_2\in \mathbb {C}\), and the condition (2) on the eigenvalues,

$$\begin{aligned}&|-\lambda _2 V_1 + V_2|^2 + |-\lambda _1 V_1 + V_2|^2 + |-\lambda _2 V_3 + V_4|^2 + |-\lambda _1 V_3 + V_4|^2 \\&\quad \succ (\lambda _2-\lambda _1)^2|V_1|^2 + (\lambda _2-\lambda _1)^2|V_3|^2 \\&\quad \succ (\lambda ^2_1 + \lambda _2^2)|V_1|^2 + (\lambda ^2_1 + \lambda _2^2)|V_3|^2. \end{aligned}$$
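The first step of this lower bound uses only the elementary inequality \(|z_1|^2 + |z_2|^2 \ge \frac{1}{2} |z_1 - z_2|^2\) with \(z_1=-\lambda _2V_1+V_2\) and \(z_2=-\lambda _1V_1+V_2\). The Python sketch below checks this step on random samples and on one case where equality holds; it is a numerical illustration only, and the function names, the sampling ranges and the tolerance are assumptions made for the example.

```python
import random


def lower_bound_gap(lam1, lam2, V1, V2):
    """Left-hand side minus (1/2) (lam2 - lam1)^2 |V1|^2 for one (V1, V2) pair."""
    lhs = abs(-lam2 * V1 + V2) ** 2 + abs(-lam1 * V1 + V2) ** 2
    rhs = 0.5 * (lam2 - lam1) ** 2 * abs(V1) ** 2
    return lhs - rhs


def worst_gap(n_samples=1000, seed=0):
    """Smallest gap over random real eigenvalues and complex V1, V2."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(n_samples):
        lam1, lam2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        V1 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        V2 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        worst = min(worst, lower_bound_gap(lam1, lam2, V1, V2))
    return worst


if __name__ == "__main__":
    # The gap should be non-negative up to rounding errors.
    print(worst_gap() >= -1e-12)
    # Equality case: z1 = -z2, so the constant 1/2 cannot be improved.
    print(lower_bound_gap(0, 1, 1, 0.5))
```

The second step, passing from \((\lambda _2-\lambda _1)^2\) to \(\lambda _1^2+\lambda _2^2\), is where hypothesis (2) enters and is not tested by this sketch.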

Combining the last two inequalities, we finally obtain that \(|\mathcal W^{(2)} \mathcal B V|^2\prec |\mathcal W^{(2)} V|^2\) provided that

$$\begin{aligned}&(|D_t a_{11}(t)|^2 + |D_t a_{21}(t)|^2)\xi ^2\langle \xi \rangle ^{-2} \prec \lambda _1^2(t,\xi ) + \lambda _2^2(t,\xi ), \nonumber \\&(|D_t a_{12}(t)|^2 + |D_t a_{22}(t)|^2)\xi ^2\langle \xi \rangle ^{-2} \prec \lambda _1^2(t,\xi ) + \lambda _2^2(t,\xi ). \end{aligned}$$
(35)

This is a Levi-type condition on the matrix of the lower order terms \(\mathcal B\) written in terms of the entries of the original matrix A in (1). Note that by adopting the notations introduced in Sect. 3.6 for the matrix \(\mathcal B\) in the case \(m=2\) as well, i.e.,

$$\begin{aligned} \mathcal B=\left( \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} 0 &{} 0 &{} 0 &{} 0\\ b_{11}^{(1)}(t) &{} 0 &{} b_{12}^{(1)}(t) &{}0 \\ 0 &{} 0 &{} 0 &{} 0\\ b_{21}^{(1)}(t) &{} 0 &{} b_{22}^{(1)}(t) &{} 0 \end{array}\right) \end{aligned}$$

the Levi-type conditions above can be written as

$$\begin{aligned} |b_{11}^{(1)}|^2 +|b_{21}^{(1)}|^2&\prec \lambda _1^2 +\lambda _2^2\\ |b_{12}^{(1)}|^2 +|b_{22}^{(1)}|^2&\prec \lambda _1^2 +\lambda _2^2, \end{aligned}$$

where \(\lambda _1^2+\lambda _2^2\) is the entry \(q_{11}\) of the symmetriser of the matrix \(A_0=A\langle \xi \rangle ^{-1}\).

5.2 The case \(m=3\)

We begin by recalling that from Sect. 3.6 the \(9\times 9\) matrix \(\mathcal B(t,\xi )\) is given by the \(3\times 9\) matrices \(\mathcal B_k(t,\xi )\), \(k=1,2,3\), as follows:

$$\begin{aligned} \mathcal B=\left( \begin{array}{l} \mathcal B_1\\ \mathcal B_2\\ \mathcal B_3 \end{array}\right) =\left( \begin{array}{lllllllll} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ b_{11}^{(1)}(t) &{} b_{11}^{(2)}(t) &{} 0 &{} b_{12}^{(1)}(t) &{} b_{12}^{(2)}(t) &{} 0 &{} b_{13}^{(1)}(t) &{} b_{13}^{(2)}(t) &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ b_{21}^{(1)}(t) &{} b_{21}^{(2)}(t) &{} 0 &{} b_{22}^{(1)}(t) &{} b_{22}^{(2)}(t) &{} 0 &{} b_{23}^{(1)}(t) &{} b_{23}^{(2)}(t) &{} 0\\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ b_{31}^{(1)}(t) &{} b_{31}^{(2)}(t) &{} 0 &{} b_{32}^{(1)}(t) &{} b_{32}^{(2)}(t) &{} 0 &{} b_{33}^{(1)}(t) &{} b_{33}^{(2)}(t) &{} 0 \end{array}\right) . \end{aligned}$$

Hence,

$$\begin{aligned} \mathcal W^{(3)} \mathcal B = \left( \begin{array}{lllllllll} b_{11}^{(1)} &{} b_{11}^{(2)} &{} 0 &{} b_{12}^{(1)} &{} b_{12}^{(2)} &{} 0 &{} b_{13}^{(1)} &{} b_{13}^{(2)} &{} 0 \\ b_{11}^{(1)} &{} b_{11}^{(2)} &{} 0 &{} b_{12}^{(1)} &{} b_{12}^{(2)} &{} 0 &{} b_{13}^{(1)} &{} b_{13}^{(2)} &{} 0 \\ b_{11}^{(1)} &{} b_{11}^{(2)} &{} 0 &{} b_{12}^{(1)} &{} b_{12}^{(2)} &{} 0 &{} b_{13}^{(1)} &{} b_{13}^{(2)} &{} 0 \\ b_{21}^{(1)} &{} b_{21}^{(2)} &{} 0 &{} b_{22}^{(1)} &{} b_{22}^{(2)} &{} 0 &{} b_{23}^{(1)} &{} b_{23}^{(2)} &{} 0 \\ b_{21}^{(1)} &{} b_{21}^{(2)} &{} 0 &{} b_{22}^{(1)} &{} b_{22}^{(2)} &{} 0 &{} b_{23}^{(1)} &{} b_{23}^{(2)} &{} 0 \\ b_{21}^{(1)} &{} b_{21}^{(2)} &{} 0 &{} b_{22}^{(1)} &{} b_{22}^{(2)} &{} 0 &{} b_{23}^{(1)} &{} b_{23}^{(2)} &{} 0 \\ b_{31}^{(1)} &{} b_{31}^{(2)} &{} 0 &{} b_{32}^{(1)} &{} b_{32}^{(2)} &{} 0 &{} b_{33}^{(1)} &{} b_{33}^{(2)} &{} 0 \\ b_{31}^{(1)} &{} b_{31}^{(2)} &{} 0 &{} b_{32}^{(1)} &{} b_{32}^{(2)} &{} 0 &{} b_{33}^{(1)} &{} b_{33}^{(2)} &{} 0 \\ b_{31}^{(1)} &{} b_{31}^{(2)} &{} 0 &{} b_{32}^{(1)} &{} b_{32}^{(2)} &{} 0 &{} b_{33}^{(1)} &{} b_{33}^{(2)} &{} 0 \end{array}\right) , \end{aligned}$$
(36)

and

$$\begin{aligned} \mathcal W^{(3)} V = \left( \begin{array}{l} \lambda _2\lambda _3 V_1 - (\lambda _2 + \lambda _3)V_2 + V_3 \\ \lambda _3\lambda _1 V_1 - (\lambda _3 + \lambda _1)V_2 + V_3 \\ \lambda _1\lambda _2 V_1 - (\lambda _1 + \lambda _2)V_2 + V_3 \\ \lambda _2\lambda _3 V_4 - (\lambda _2 + \lambda _3)V_5 + V_6 \\ \lambda _3\lambda _1 V_4 - (\lambda _3 + \lambda _1)V_5 + V_6 \\ \lambda _1\lambda _2 V_4 - (\lambda _1 + \lambda _2)V_5 + V_6 \\ \lambda _2\lambda _3 V_7 - (\lambda _2 + \lambda _3)V_8 + V_9 \\ \lambda _3\lambda _1 V_7 - (\lambda _3 + \lambda _1)V_8 + V_9 \\ \lambda _1\lambda _2 V_7 - (\lambda _1 + \lambda _2)V_8 + V_9 \\ \end{array}\right) . \end{aligned}$$
(37)

Note that \(\mathcal W^{(3)} \mathcal B\) is a \(9\times 9\) matrix with three blocks of three identical rows and \(\mathcal W^{(3)} V\) is a \(9\times 1\) matrix with three blocks of rows having the same structure in \(\lambda _1\), \(\lambda _2\) and \(\lambda _3\).
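The rows of the defining block of \(\mathcal W^{(m)}\) can be generated mechanically from the eigenvalues, which gives a quick way to cross-check (37). The Python sketch below is an illustration under the assumption of fixed integer eigenvalues; the names `char_coeffs`, `w_rows` and `apply_block` are hypothetical. Row i is obtained from the coefficients of \(\prod _{l\ne i}(\tau -\lambda _l)\), read from the constant term upwards.

```python
def char_coeffs(lams):
    """Coefficients [1, c_1, ..., c_k] of prod (tau - lam), highest power first."""
    coeffs = [1]
    for lam in lams:
        # Multiply the current polynomial by (tau - lam).
        coeffs = [a - lam * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def w_rows(lams):
    """Rows W_i = (sigma_{m-1}(pi_i lam), ..., sigma_1(pi_i lam), 1), i = 1..m."""
    rows = []
    for i in range(len(lams)):
        others = lams[:i] + lams[i + 1:]  # pi_i lam: drop the i-th eigenvalue
        rows.append(list(reversed(char_coeffs(others))))
    return rows


def apply_block(rows, v):
    """Apply one block of W to a vector with m components."""
    return [sum(r * x for r, x in zip(row, v)) for row in rows]


if __name__ == "__main__":
    lams = [1, 2, 3]  # illustrative eigenvalues
    rows = w_rows(lams)
    print(rows)
    # Sums of squares of the first and second columns of the block.
    print(sum(r[0] ** 2 for r in rows), sum(r[1] ** 2 for r in rows))
```

For eigenvalues 1, 2, 3 the rows are (6, -5, 1), (3, -4, 1), (2, -3, 1), in agreement with the pattern \(\lambda _2\lambda _3\), \(-(\lambda _2+\lambda _3)\), 1 in (37). The column sums of squares are \(\sum _{i<j}\lambda _i^2\lambda _j^2=49\) and \(\sum _{i<j}(\lambda _i+\lambda _j)^2=50\).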

From (36), we deduce that

$$\begin{aligned} |\mathcal W^{(3)} \mathcal B V|^2\prec & {} \left( |b_{11}^{(1)}|^2 + |b_{21}^{(1)}|^2 + |b_{31}^{(1)}|^2 \right) |V_1|^2 + \left( |b_{11}^{(2)}|^2 + |b_{21}^{(2)}|^2 + |b_{31}^{(2)}|^2 \right) |V_2|^2 \\&+ \left( |b_{12}^{(1)}|^2 + |b_{22}^{(1)}|^2 + |b_{32}^{(1)}|^2 \right) |V_4|^2 + \left( |b_{12}^{(2)}|^2 + |b_{22}^{(2)}|^2 + |b_{32}^{(2)}|^2 \right) |V_5|^2 \\&+ \left( |b_{13}^{(1)}|^2 + |b_{23}^{(1)}|^2 + |b_{33}^{(1)}|^2 \right) |V_7|^2 + \left( |b_{13}^{(2)}|^2 + |b_{23}^{(2)}|^2 + |b_{33}^{(2)}|^2 \right) |V_8|^2. \end{aligned}$$

Taking inspiration from the Levi conditions in [12] and in analogy with the case \(m=2\) we set

$$\begin{aligned} |b_{11}^{(1)}|^2 +|b_{21}^{(1)}|^2+|b_{31}^{(1)}|^2\prec & {} \lambda _1^2 \lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2 \lambda _3^2 \nonumber \\ |b_{12}^{(1)}|^2 +|b_{22}^{(1)}|^2+|b_{32}^{(1)}|^2\prec & {} \lambda _1^2 \lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2 \lambda _3^2 \nonumber \\ |b_{13}^{(1)}|^2 +|b_{23}^{(1)}|^2+|b_{33}^{(1)}|^2\prec & {} \lambda _1^2 \lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2 \lambda _3^2 \\ |b_{11}^{(2)}|^2 +|b_{21}^{(2)}|^2+|b_{31}^{(2)}|^2\prec & {} (\lambda _1+\lambda _2)^2 + (\lambda _1+\lambda _3)^2 + (\lambda _2+\lambda _3)^2 \nonumber \\ |b_{12}^{(2)}|^2 +|b_{22}^{(2)}|^2+|b_{32}^{(2)}|^2\prec & {} (\lambda _1+\lambda _2)^2 + (\lambda _1+\lambda _3)^2 + (\lambda _2+\lambda _3)^2 \nonumber \\ |b_{13}^{(2)}|^2 +|b_{23}^{(2)}|^2+|b_{33}^{(2)}|^2\prec & {} (\lambda _1+\lambda _2)^2 + (\lambda _1+\lambda _3)^2 + (\lambda _2+\lambda _3)^2.\nonumber \end{aligned}$$
(38)

Note that \(\lambda _1^2 \lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2 \lambda _3^2 \) and \((\lambda _1+\lambda _2)^2 + (\lambda _1+\lambda _3)^2 + (\lambda _2+\lambda _3)^2\) are the entries \(q_{11}\) and \(q_{22}\) of the symmetriser of \(A_0=\langle \xi \rangle ^{-1}A\), respectively. By imposing these conditions on the lower order terms we have that

$$\begin{aligned} |\mathcal W^{(3)} \mathcal B V|^2&\prec \left( \lambda _1^2 \lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2 \lambda _3^2 \right) (|V_1|^2 + |V_4|^2 + |V_7|^2) \nonumber \\&\quad + \left( (\lambda _1+\lambda _2)^2 + (\lambda _1+\lambda _3)^2 + (\lambda _2+\lambda _3)^2 \right) (|V_2|^2 + |V_5|^2 + |V_8|^2). \end{aligned}$$
(39)

Making a comparison with [12], we observe that \(V_1\), \(V_4\), and \(V_7\) play the role of \(V_1\) in [12] and \(V_2\), \(V_5\) and \(V_8\) play the role of \(V_2\) in [12]. Finally, from (37), we obtain that

$$\begin{aligned} |\mathcal W^{(3)} V|^2= & {} |\lambda _2\lambda _3 V_1 - (\lambda _2 + \lambda _3)V_2 + V_3|^2 + |\lambda _3\lambda _1 V_1 - (\lambda _3 + \lambda _1)V_2 + V_3|^2 \\&+\, |\lambda _1\lambda _2 V_1 - (\lambda _1 + \lambda _2)V_2 + V_3|^2 + |\lambda _2\lambda _3 V_4 - (\lambda _2 + \lambda _3)V_5 + V_6|^2 \\&+\, |\lambda _3\lambda _1 V_4 - (\lambda _3 + \lambda _1)V_5 + V_6|^2 + |\lambda _1\lambda _2 V_4 - (\lambda _1 + \lambda _2)V_5 + V_6|^2 \\&+\, |\lambda _2\lambda _3 V_7 - (\lambda _2 + \lambda _3)V_8 + V_9|^2 + |\lambda _3\lambda _1 V_7 - (\lambda _3 + \lambda _1)V_8 + V_9|^2 \\&+\, |\lambda _1\lambda _2 V_7 - (\lambda _1 + \lambda _2)V_8 + V_9|^2. \end{aligned}$$

It is our aim to prove that \(|\mathcal W^{(3)} \mathcal B V|^2\prec |\mathcal W^{(3)} V|^2\). We do this by estimating \(|\mathcal W^{(3)} \mathcal B V|^2\) and \(|\mathcal W^{(3)} V|^2\) in different zones. More precisely, inspired by [12] we decompose \(\mathbb {R}^9\) as

$$\begin{aligned} \Sigma _1^{\delta _1} \cup (\Sigma _1^{\delta _1})^c, \end{aligned}$$

where

$$\begin{aligned} \Sigma _1^{\delta _1}&:= \Big \{ V \in \mathbb {R}^9 : \sum _{1\le i<j\le 3} (\lambda _i + \lambda _j)^2 (|V_2|^2 + |V_5|^2 + |V_8|^2) \\&\le \delta _1 \sum _{1\le i<j\le 3} \lambda _i^2\lambda _j^2 (|V_1|^2 + |V_4|^2 + |V_7|^2) \Big \} \end{aligned}$$

for some \(\delta _1 >0\).

Estimate on \(\Sigma _1^{\delta _1}\). By definition of the zone, we obtain from (39)

$$\begin{aligned} |\mathcal W^{(3)} \mathcal B V|^2\prec \left( \lambda _1^2\lambda _2^2 +\lambda _2^2\lambda _3^2 + \lambda _1^2\lambda _3^2 \right) (|V_1|^2 + |V_4|^2 + |V_7|^2). \end{aligned}$$

Thanks to the hypothesis (2) on the eigenvalues, we have the following estimates (see Footnote 1):

$$\begin{aligned} |\mathcal W^{(3)} V|^2\succ & {} |(\lambda _2\lambda _3 - \lambda _3\lambda _1)V_1 - (\lambda _2-\lambda _1)V_2 |^2 \\&+ |(\lambda _2\lambda _3-\lambda _1\lambda _2 )V_1 - (\lambda _3-\lambda _1)V_2|^2\\&+ |(\lambda _3\lambda _1 - \lambda _1\lambda _2)V_1 - (\lambda _3-\lambda _2)V_2|^2 \\\succ & {} (\lambda _1^2 + \lambda _2^2)|\lambda _3V_1-V_2|^2 + (\lambda _3^2 + \lambda _1^2)|\lambda _2V_1-V_2|^2 \\&+ (\lambda _2^2 + \lambda _3^2)|\lambda _1V_1-V_2|^2 \\\succ & {} \lambda _1^2|(\lambda _3-\lambda _2)V_1|^2+\lambda _3^2|(\lambda _2-\lambda _1)V_1|^2\\\succ & {} \left( \lambda _1^2\lambda _2^2 +\lambda _2^2\lambda _3^2 + \lambda _1^2\lambda _3^2 \right) |V_1|^2. \end{aligned}$$

Note that in the previous bound from below we have taken into consideration only the terms with \(V_1\), \(V_2\) and \(V_3\). Repeating the same argument for the groups of terms with \(V_4\), \(V_5\), \(V_6\) and \(V_7\), \(V_8\), \(V_9\), respectively, we get that

$$\begin{aligned} |\mathcal W^{(3)} V|^2 \succ \left( \lambda _1^2\lambda _2^2 +\lambda _2^2\lambda _3^2 + \lambda _1^2\lambda _3^2 \right) |V_4|^2 \end{aligned}$$

and

$$\begin{aligned} |\mathcal W^{(3)} V|^2 \succ \left( \lambda _1^2\lambda _2^2 +\lambda _2^2\lambda _3^2 + \lambda _1^2\lambda _3^2 \right) |V_7|^2. \end{aligned}$$

Hence,

$$\begin{aligned} |\mathcal W^{(3)}V|^2 \succ \left( \sum _{1\le i<j\le 3} \lambda _i^2\lambda _j^2 \right) (|V_1|^2 + |V_4|^2 + |V_7|^2). \end{aligned}$$

Thus, combining the last estimate with (39), we obtain \(|\mathcal W^{(3)} \mathcal B V|\prec |\mathcal W^{(3)} V|\) for all \(V \in \Sigma _1^{\delta _1}\). No assumptions have been made on \(\delta _1>0\).

Estimate on \((\Sigma ^{\delta _1}_1)^c\). By definition of the zone \((\Sigma ^{\delta _1}_1)^c\), we obtain from (39) that

$$\begin{aligned} |\mathcal W^{(3)} \mathcal B V|^2\prec \big ( 1+\frac{1}{\delta _1} \big ) \left( \sum _{1\le i<j\le 3} (\lambda _i+\lambda _j)^2 \right) (|V_2|^2 + |V_5|^2 + |V_8|^2). \end{aligned}$$
(40)

Further, taking into consideration only the terms with \(V_1\), \(V_2\) and \(V_3\) in \(|\mathcal W^{(3)} V|^2\), we have

$$\begin{aligned} |\mathcal W^{(3)} V|^2\ge & {} |\lambda _2\lambda _3 V_1 - (\lambda _2 + \lambda _3)V_2 + V_3|^2 + |\lambda _3\lambda _1 V_1 - (\lambda _3 + \lambda _1)V_2 + V_3|^2 \nonumber \\&+ |\lambda _1\lambda _2 V_1 - (\lambda _1 + \lambda _2)V_2 + V_3|^2 \nonumber \\\succ & {} \gamma _1 \big ( |(\lambda _2+\lambda _3)V_2-V_3|^2 + |(\lambda _3+\lambda _1)V_2-V_3|^2 \nonumber \\&+ |(\lambda _1+\lambda _2)V_2-V_3|^2 \big ) - \gamma _2\big ( \lambda _1^2\lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2\lambda _3^2 \big )|V_1|^2 \end{aligned}$$
(41)

for some constants \(\gamma _1, \gamma _2>0\) suitably chosen (see Footnote 2). The hypothesis (2) implies

$$\begin{aligned}&(\lambda _2-\lambda _1)^2 + (\lambda _3-\lambda _2)^2 + (\lambda _3-\lambda _1)^2 \ge \frac{2}{C} (\lambda _1^2 + \lambda _2^2 + \lambda _3^2) \\&\qquad \ge \frac{1}{2C}\big ( (\lambda _1 + \lambda _2)^2 + (\lambda _1 + \lambda _3)^2 + (\lambda _2+\lambda _3)^2 \big ). \end{aligned}$$
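The constants in this chain can be traced step by step. The following is a short worked derivation, offered as a sketch that uses only hypothesis (2) and the elementary bound \((a+b)^2\le 2(a^2+b^2)\). Summing (2) over the three pairs \(i<j\), and noting that each \(\lambda _i^2\) appears in exactly two pairs, gives

$$\begin{aligned} \sum _{1\le i<j\le 3} (\lambda _i-\lambda _j)^2 \ge \frac{1}{C} \sum _{1\le i<j\le 3} (\lambda _i^2+\lambda _j^2) = \frac{2}{C} (\lambda _1^2+\lambda _2^2+\lambda _3^2). \end{aligned}$$

On the other hand,

$$\begin{aligned} \sum _{1\le i<j\le 3} (\lambda _i+\lambda _j)^2 \le 2 \sum _{1\le i<j\le 3} (\lambda _i^2+\lambda _j^2) = 4 (\lambda _1^2+\lambda _2^2+\lambda _3^2), \end{aligned}$$

so that

$$\begin{aligned} \sum _{1\le i<j\le 3} (\lambda _i-\lambda _j)^2 \ge \frac{2}{C} (\lambda _1^2+\lambda _2^2+\lambda _3^2) \ge \frac{1}{2C} \sum _{1\le i<j\le 3} (\lambda _i+\lambda _j)^2 . \end{aligned}$$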

Applying the last inequality to (41), we obtain

$$\begin{aligned} |\mathcal W^{(3)} V|^2\succ & {} \gamma _1 \big ( |(\lambda _2+\lambda _3)V_2-V_3|^2 + |(\lambda _3+\lambda _1)V_2-V_3|^2 \\&+ |(\lambda _1+\lambda _2)V_2-V_3|^2 \big ) - \gamma _2\big ( \lambda _1^2\lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2\lambda _3^2 \big )|V_1|^2 \\\succ & {} \gamma _1 ((\lambda _2-\lambda _1)^2 + (\lambda _3-\lambda _2)^2 + (\lambda _3-\lambda _1)^2) |V_2|^2 \\&- \gamma _2\big ( \lambda _1^2\lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2\lambda _3^2 \big )|V_1|^2 \\\succ & {} \gamma _1' ( (\lambda _1 + \lambda _2)^2 + (\lambda _1 + \lambda _3)^2 + (\lambda _2+\lambda _3)^2)|V_2|^2 \\&- \gamma _2\big ( \lambda _1^2\lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2\lambda _3^2 \big )|V_1|^2. \end{aligned}$$

Now, repeating the same argument for the terms involving \(V_4, V_5, V_6\) and \(V_7, V_8, V_9\), respectively, we get

$$\begin{aligned} |\mathcal W^{(3)} V|^2&\succ \gamma _1' ( (\lambda _1 + \lambda _2)^2 + (\lambda _1 + \lambda _3)^2 + (\lambda _2+\lambda _3)^2)|V_5|^2\\&\quad - \gamma _2\big ( \lambda _1^2\lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2\lambda _3^2 \big )|V_4|^2 \end{aligned}$$

and

$$\begin{aligned} |\mathcal W^{(3)} V|^2&\succ \gamma _1' ( (\lambda _1 + \lambda _2)^2 + (\lambda _1 + \lambda _3)^2 + (\lambda _2+\lambda _3)^2)|V_8|^2\\&\quad - \gamma _2\big ( \lambda _1^2\lambda _2^2 + \lambda _1^2\lambda _3^2 + \lambda _2^2\lambda _3^2 \big )|V_7|^2. \end{aligned}$$

It follows that for all \(V\in (\Sigma _1^{\delta _1})^c\) the bound from below

$$\begin{aligned} |\mathcal W^{(3)} V|^2 \succ \big (\gamma _1' - \frac{\gamma _2}{\delta _1} \big ) \left( \sum _{1\le i<j\le 3} (\lambda _i+\lambda _j)^2 \right) (|V_2|^2 +|V_5|^2 + |V_8|^2) \end{aligned}$$

holds, provided that \(\delta _1\) is chosen large enough. Combining this with (40), we get \(|\mathcal W^{(3)} \mathcal B V|\prec |\mathcal W^{(3)} V|\) on \((\Sigma _1^{\delta _1})^c\) and, thus, on \(\mathbb {R}^9\).
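To see what "large enough" means here, one admissible choice (an illustration only; any \(\delta _1>\gamma _2/\gamma _1'\) makes the coefficient positive) is

$$\begin{aligned} \delta _1 \ge \frac{2\gamma _2}{\gamma _1'} \quad \Longrightarrow \quad \gamma _1' - \frac{\gamma _2}{\delta _1} \ge \frac{\gamma _1'}{2} > 0 . \end{aligned}$$

With this choice, the lower bound on \(|\mathcal W^{(3)} V|^2\) and the upper bound (40) involve the same quantity \(\sum _{i<j}(\lambda _i+\lambda _j)^2 (|V_2|^2+|V_5|^2+|V_8|^2)\), and the implicit constant in \(|\mathcal W^{(3)} \mathcal B V|^2\prec |\mathcal W^{(3)} V|^2\) is controlled by a multiple of \(\big (1+\frac{1}{\delta _1}\big )\frac{2}{\gamma _1'}\).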

5.3 The general case

Recall from Sect. 3.4 that the \(m^2 \times m^2\) matrix \(\mathcal B(t,\xi )\) is made up of m matrices of dimension \(m \times m^2\) whose only non-zero entries lie in the last row, see (21). To avoid complicating the notation further, in what follows we denote \(\mathcal W^{(m)}\) simply by \(\mathcal W\) and assume that the \(b_{ij}^{(l)}(t,\xi )\) in \(\mathcal B(t,\xi )\) are properly scaled by \(\langle \xi \rangle ^{l-m}\); see Sect. 3.4, specifically formula (23). Thus, we have

$$\begin{aligned} \mathcal B(t,\xi ) = \left( \begin{array}{l} \mathcal B_1(t,\xi ) \\ \vdots \\ \mathcal B_m(t,\xi ) \end{array}\right) , \quad \mathcal B_i(t,\xi ) = \left( \begin{array}{l@{\quad }l@{\quad }l@{\quad }l} 0 &{} 0 &{} \cdots &{} 0 \\ \mathcal B^{(1)}_i(t,\xi )&{} \mathcal B^{(2)}_i(t,\xi ) &{} \cdots &{} \mathcal B^{(m)}_i(t,\xi ) \end{array}\right) . \end{aligned}$$

The row vectors \(\mathcal B^{(j)}_i(t,\xi )\) are then given by

$$\begin{aligned} \mathcal B^{(j)}_i(t,\xi ) = \left( b_{ij}^{(1)}(t,\xi ), b_{ij}^{(2)}(t,\xi ), \cdots , b_{ij}^{(m-1)}(t,\xi ), 0 \right) \end{aligned}$$

for \(1 \le i,j \le m\). Thus, we obtain

$$\begin{aligned} \mathcal W \mathcal B = \left( \begin{array}{lllllllll} b_{11}^{(1)} &{} \cdots &{} b_{11}^{(m-1)} &{} 0 &{} \cdots &{} b_{1m}^{(1)} &{} \cdots &{} b_{1m}^{(m-1)} &{} 0 \\ \vdots &{} &{} \vdots &{} \vdots &{} &{} &{} &{} \vdots &{} \vdots \\ b_{11}^{(1)} &{} \cdots &{} b_{11}^{(m-1)} &{} 0 &{} \cdots &{} b_{1m}^{(1)} &{} \cdots &{} b_{1m}^{(m-1)} &{} 0 \\ b_{21}^{(1)} &{} \cdots &{} b_{21}^{(m-1)} &{} 0 &{} \cdots &{} b_{2m}^{(1)} &{} \cdots &{} b_{2m}^{(m-1)} &{} 0 \\ \vdots &{} &{} \vdots &{} \vdots &{} &{} \vdots &{} &{} \vdots &{} \vdots \\ b_{21}^{(1)} &{} \cdots &{} b_{21}^{(m-1)} &{} 0 &{} \cdots &{} b_{2m}^{(1)} &{} \cdots &{} b_{2m}^{(m-1)} &{} 0 \\ \mathbf \vdots &{} &{} \mathbf \vdots &{} \mathbf \vdots &{} &{} \mathbf \vdots &{} &{} \mathbf \vdots &{} \mathbf \vdots \\ b_{m1}^{(1)} &{} \cdots &{} b_{m1}^{(m-1)} &{} 0 &{} \cdots &{} b_{mm}^{(1)} &{} \cdots &{} b_{mm}^{(m-1)} &{} 0 \\ \vdots &{} &{} \vdots &{} \vdots &{} &{} \vdots &{} &{} \vdots &{} \vdots \\ b_{m1}^{(1)} &{} \cdots &{} b_{m1}^{(m-1)} &{} 0 &{} \cdots &{} b_{mm}^{(1)} &{} \cdots &{} b_{mm}^{(m-1)} &{} 0 \\ \end{array}\right) . \end{aligned}$$
(42)

We are now ready to prove the following theorem.

Theorem 5.1

Let the entries of the matrix \(\mathcal B(t,\xi )\) fulfill the conditions

$$\begin{aligned} \sum _{k=1}^m|b_{kj}^{(l)}(t,\xi )|^2\prec \sum _{i=1}^m |\sigma _{m-l}^{(m-1)}(\pi _i \lambda )|^2 \end{aligned}$$
(43)

for any \(l=1,\dots , m-1\) and \(j=1,\dots ,m\). Then we have

$$\begin{aligned} |\mathcal W \mathcal B V|\prec |\mathcal W V| \end{aligned}$$

for all \(V \in \mathbb {C}^{m^2}\). More precisely, we define

$$\begin{aligned} \Sigma _h^{\delta _h}&:= \Big \{ V \in \mathbb {C}^{m^2} : \quad \sum _{j=h+1}^{m-1} \sum _{i=1}^{m} |\sigma _{m-j}^{(m-1)}(\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 \nonumber \\&\le \delta _h\sum _{i=1}^m |\sigma _{m-h}^{(m-1)} (\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2 \Big \} \end{aligned}$$
(44)

for \(h = 1,\dots , m-2\). There exist suitable \(\delta _h\), \(h=1,\dots ,m-2\) such that

$$\begin{aligned} |\mathcal W \mathcal B V|^2&\prec \sum _{i=1}^m |\sigma _{m-1}^{(m-1)}(\pi _i \lambda )|^2 \sum \limits _{l=0}^{m-1} |V_{1+lm}|^2 \\ |\mathcal W V|^2&\succ \sum _{i=1}^m |\sigma _{m-1}^{(m-1)}(\pi _i \lambda )|^2 \sum \limits _{l=0}^{m-1} |V_{1+lm}|^2 \end{aligned}$$

on \(\Sigma _1^{\delta _1}\) and

$$\begin{aligned} |\mathcal W \mathcal B V|^2&\prec \sum \limits _{i=1}^m |\sigma _{m-h}^{(m-1)}(\pi _i \lambda )|^2 \sum \limits _{l=0}^{m-1} |V_{h+lm}|^2 \\ |\mathcal W V|^2&\succ \sum \limits _{i=1}^m |\sigma _{m-h}^{(m-1)}(\pi _i \lambda )|^2 \sum \limits _{l=0}^{m-1} |V_{h+lm}|^2 \end{aligned}$$

on \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{h-1}^{\delta _{h-1}}\big )^\mathrm{{c}} \cap \Sigma _h^{\delta _h}\) for \(2 \le h \le m-2\). Finally,

$$\begin{aligned} |\mathcal W \mathcal B V|^2&\prec \sum _{i=1}^m |\sigma _{1}^{(m-1)}(\pi _i \lambda )|^2 \sum \limits _{l=0}^{m-1} |V_{m-1+lm}|^2 \\ |\mathcal W V|^2&\succ \sum _{i=1}^m |\sigma _{1}^{(m-1)}(\pi _i \lambda )|^2 \sum \limits _{l=0}^{m-1} |V_{m-1+lm}|^2 \end{aligned}$$

on \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{m-2}^{\delta _{m-2}}\big )^\mathrm{{c}}\).

Note that if \(m=2\) no zone argument is needed to prove the theorem above (see Sect. 5.1) and when \(m=3\) just one zone is needed (see Sect. 5.2). The proof of Theorem 5.1 has the same structure as the proof of Theorem 5 in [12] and requires some auxiliary lemmas.
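As a consistency check, the zone (44) can be specialised to \(m=3\), \(h=1\). The computation below assumes the sign convention \(\sigma _h^{(m-1)}(\mu ) = (-1)^h \sum _{i_1<\cdots <i_h} \mu _{i_1}\cdots \mu _{i_h}\), \(\sigma _0^{(m-1)}=1\), which is not restated in this section but agrees with the rows of \(\mathcal W^{(3)}\) displayed in Sect. 5.2. Under this convention,

$$\begin{aligned} \sum _{i=1}^3 |\sigma _{2}^{(2)}(\pi _i \lambda )|^2&= \lambda _2^2\lambda _3^2 + \lambda _1^2\lambda _3^2 + \lambda _1^2\lambda _2^2 , \\ \sum _{i=1}^3 |\sigma _{1}^{(2)}(\pi _i \lambda )|^2&= (\lambda _2+\lambda _3)^2 + (\lambda _1+\lambda _3)^2 + (\lambda _1+\lambda _2)^2 , \end{aligned}$$

and the sum over \(j\) in (44) reduces to the single term \(j=2\), so that

$$\begin{aligned} \Sigma _1^{\delta _1} = \Big \{ V \in \mathbb {C}^{9} : \sum _{1\le i<j\le 3} (\lambda _i+\lambda _j)^2 (|V_2|^2+|V_5|^2+|V_8|^2) \le \delta _1 \sum _{1\le i<j\le 3} \lambda _i^2\lambda _j^2 (|V_1|^2+|V_4|^2+|V_7|^2) \Big \} , \end{aligned}$$

which has the same form as the zone used in Sect. 5.2.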

Lemma 5.2

For all i and j with \(1\le i,j\le m\) and \(k=1,...,m-1,\) one has

$$\begin{aligned}&\sigma _{m-k}^{(m-1)}(\pi _i\lambda )-\sigma _{m-k}^{(m-1)}(\pi _j\lambda ) \nonumber \\&\quad = (-1)^{m-k}(\lambda _j-\lambda _i) \sum \limits _{\begin{array}{c} i_h\ne i,\, i_h\ne j\\ 1\le i_1<i_2<\cdots <i_{m-k-1}\le m \end{array}} \lambda _{i_1}\lambda _{i_2}\cdots \lambda _{i_{m-k-1}} \end{aligned}$$
(45)

Proof

The proof can be found in [12, Lemma 3]. \(\square \)
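For illustration, formula (45) can be verified by hand when \(m=3\), \(i=1\), \(j=2\). The check below assumes the convention \(\sigma _h^{(m-1)}(\mu ) = (-1)^h \sum _{i_1<\cdots <i_h} \mu _{i_1}\cdots \mu _{i_h}\), with the empty product equal to 1; this convention is taken from the form of \(\mathcal W^{(3)}\) in Sect. 5.2 rather than restated here.

$$\begin{aligned} k=1:\quad \sigma _{2}^{(2)}(\pi _1\lambda )-\sigma _{2}^{(2)}(\pi _2\lambda )&= \lambda _2\lambda _3 - \lambda _1\lambda _3 = (-1)^{2}(\lambda _2-\lambda _1)\,\lambda _3 , \\ k=2:\quad \sigma _{1}^{(2)}(\pi _1\lambda )-\sigma _{1}^{(2)}(\pi _2\lambda )&= -(\lambda _2+\lambda _3) + (\lambda _1+\lambda _3) = (-1)^{1}(\lambda _2-\lambda _1) . \end{aligned}$$

In the first line the sum in (45) runs over the single index \(i_1=3\); in the second it is the empty product. These are the differences that appear as coefficients of \(V_1\) and \(V_2\) in the first lower bound for \(|\mathcal W^{(3)} V|^2\) in Sect. 5.2.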

Lemma 5.3

For all \(k=1,\ldots ,m\), we have

$$\begin{aligned} \sum _{l=0}^{m-1} \sum _{i=1}^m\left| \sum _{j=k}^m \sigma _{m-j}^{(m-1)}(\pi _i\lambda ) V_{j+lm} \right| ^2 \succ \sum _{i=1}^m|\sigma _{m-k}^{(m-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1}|V_{k+lm}|^2. \end{aligned}$$
(46)

Proof

The proof of this lemma follows by induction by applying Lemma 5.2 and can also be obtained by repeated application of Lemma 4 in [12] to the respective groups of \(V_i\). \(\square \)

Proof of Theorem 5.1

By the definition of \(\mathcal B\), we have that \(|\mathcal W \mathcal B V|^2\prec |\mathcal WV|^2\) is equivalent to

$$\begin{aligned} \sum _{i=1}^m \left| \sum _{j=1}^{m-1}\sum _{l=1}^m b_{il}^{(j)} V_{j+(l-1)m} \right| ^2\prec \sum _{l=0}^{m-1} \sum _{i=1}^m \left| \sum _{j=1}^m \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm} \right| ^2. \end{aligned}$$
(47)

Making use of the Levi-type conditions (43), we obtain

$$\begin{aligned} \sum _{i=1}^m \left| \sum _{j=1}^{m-1}\sum _{l=1}^m b_{il}^{(j)} V_{j+(l-1)m} \right| ^2\prec & {} \sum _{l=1}^m \sum _{j=1}^{m-1} \left( \sum _{i=1}^m |b_{il}^{(j)}|^2 \right) |V_{j+(l-1)m}|^2 \nonumber \\\prec & {} \sum _{j=1}^{m-1} \sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2. \end{aligned}$$
(48)

On \(\Sigma _1^{\delta _1}\), we further obtain the estimate

$$\begin{aligned} \sum _{i=1}^m \left| \sum _{j=1}^{m-1}\sum _{l=1}^m b_{il}^{(j)} V_{j+(l-1)m} \right| ^2\prec (1+\delta _1) \sum _{i=1}^m |\sigma _{m-1}^{(m-1)}(\pi _i \lambda )|^2 \sum \limits _{l=0}^{m-1} |V_{1+lm}|^2. \end{aligned}$$

Setting \(k=1\) in (46), Lemma 5.3 gives

$$\begin{aligned} \sum _{l=0}^{m-1} \sum _{i=1}^m \left| \sum _{j=1}^m \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm} \right| ^2 \succ \sum _{i=1}^m|\sigma _{m-1}^{(m-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1}|V_{1+lm}|^2. \end{aligned}$$

This proves inequality (47) in \(\Sigma _1^{\delta _1}\). Now, we assume that \(V \in (\Sigma _1^{\delta _1})^c \cap (\Sigma _2^{\delta _2})^c \cap \dots \cap (\Sigma _{h-1}^{\delta _{h-1}})^c \cap \Sigma _h^{\delta _h}\) for \(2 \le h \le m-2\). From the definition of the zones for \(1\le k\le h-1\) and \(\delta _k\ge 1\), we obtain

$$\begin{aligned}&\sum _{i=1}^m |\sigma ^{(m-1)}_{m-(h-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h-1+lm}|^2 \\&\quad <\frac{1}{\delta _{h-1}}\biggl ( \sum _{j=h+1}^{m-1}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1}|V_{j+lm}|^2 \\&\qquad +\sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1}|V_{h+lm}|^2 \biggl ) \\&\quad \le \frac{1}{\delta _{h-1}}(1+\delta _h)\sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2, \end{aligned}$$

as well as

$$\begin{aligned}&\sum _{i=1}^m |\sigma ^{(m-1)}_{m-(h-2)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h-2+lm}|^2\\&\quad <\frac{1}{\delta _{h-2}}\biggl (\sum _{j=h+1}^{m-1}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2\\&\qquad +\sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2 \\&\qquad + \sum _{i=1}^m|\sigma ^{(m-1)}_{m-(h-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h-1+lm}|^2 \biggl )\\&\quad \le \frac{1}{\delta _{h-2}}\big (1+\delta _h+\frac{1}{\delta _{h-1}}(1+\delta _h)\big ) \sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2\\&\quad \le (1+\delta _h)\big (\frac{1}{\delta _{h-1}}+\frac{1}{\delta _{h-2}})\sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2. \end{aligned}$$
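The last step of the previous chain is where the assumption \(\delta _k\ge 1\) enters. Spelled out (a routine computation, included only for the reader's convenience):

$$\begin{aligned} \frac{1}{\delta _{h-2}}\Big (1+\delta _h+\frac{1}{\delta _{h-1}}(1+\delta _h)\Big )&= (1+\delta _h)\Big (\frac{1}{\delta _{h-2}} + \frac{1}{\delta _{h-2}\,\delta _{h-1}}\Big ) \\&\le (1+\delta _h)\Big (\frac{1}{\delta _{h-2}} + \frac{1}{\delta _{h-1}}\Big ) , \end{aligned}$$

since \(\delta _{h-2}\ge 1\) implies \(\frac{1}{\delta _{h-2}\delta _{h-1}} \le \frac{1}{\delta _{h-1}}\).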

Continuing these estimates recursively, we obtain that

$$\begin{aligned}&\sum _{i=1}^m |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2\nonumber \\&\quad \prec (1+\delta _h)\sum _{k=1}^{h-1}\frac{1}{\delta _k}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2 \end{aligned}$$
(49)

for all j with \(1\le j \le h-1\) is valid on the zone \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{h-1}^{\delta _{h-1}}\big )^\mathrm{{c}} \cap \Sigma _h^{\delta _h}\).

From (48), the estimate (49) and the definition of the zone \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{h-1}^{\delta _{h-1}}\big )^\mathrm{{c}} \cap \Sigma _h^{\delta _h}\) we get the following estimate of the left-hand side of (47):

$$\begin{aligned} \sum _{i=1}^m \left| \sum _{j=1}^{m-1}\sum _{l=1}^m b_{il}^{(j)} V_{j+(l-1)m} \right| ^2&\prec \sum _{j=1}^{m-1} \sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 \\&\prec \sum _{j=h+1}^{m-1} \sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 \\&\quad +\sum _{i=1}^m |\sigma _{m-h}^{(m-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2\\&\quad +\sum _{j=1}^{h-1}\sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 \\&\prec \sum _{i=1}^m |\sigma _{m-h}^{(m-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2. \end{aligned}$$

Now, we have to estimate the right-hand side of (47) on \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{h-1}^{\delta _{h-1}}\big )^\mathrm{{c}} \cap \Sigma _h^{\delta _h}\). We make use of Lemma 5.3 and of the bound (49). We obtain

$$\begin{aligned}&\sum _{l=0}^{m-1} \sum _{i=1}^m\left| \sum _{j=1}^m \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm}\right| ^2\\&\quad \succ \gamma _1 \sum _{l=0}^{m-1} \sum _{i=1}^m \left| \sum _{j=h}^{m}\sigma ^{(m-1)}_{m-j}(\pi _i\lambda ) V_{j+lm} \right| ^2 \\&\quad \quad -\gamma _2 \sum _{l=0}^{m-1} \sum _{i=1}^m \left| \sum _{j=1}^{h-1}\sigma ^{(m-1)}_{m-j}(\pi _i\lambda ) V_{j+lm} \right| ^2\\&\quad \succ \gamma _1 \sum _{l=0}^{m-1} \sum _{i=1}^m \left| \sum _{j=h}^{m}\sigma ^{(m-1)}_{m-j}(\pi _i\lambda ) V_{j+lm} \right| ^2 \\&\quad \quad -\gamma _2 \sum _{i=1}^m \sum _{j=1}^{h-1} |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 \\&\quad \succ \gamma _1 \sum _{i=1}^m|\sigma _{m-h}^{(m-1)}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2 \\&\quad \quad -\gamma _2 (1+\delta _h) \sum _{k=1}^{h-1}\frac{1}{\delta _k}\sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2\sum _{l=0}^{m-1}|V_{h+lm}|^2\\&\quad =\left( \gamma _1-\gamma _2(1+\delta _h) \sum _{k=1}^{h-1}\frac{1}{\delta _k}\right) \sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1}|V_{h+lm}|^2, \end{aligned}$$

where the second inequality follows from

$$\begin{aligned} \sum _{l=0}^{m-1} \sum _{i=1}^m \left| \sum _{j=1}^{h-1}\sigma ^{(m-1)}_{m-j}(\pi _i\lambda ) V_{j+lm} \right| ^2 \le (h-1) \sum _{i=1}^m \sum _{j=1}^{h-1} |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 \end{aligned}$$

which in turn follows from \(|z_1 + \cdots + z_k|^2 \le k \sum _{i=1}^k |z_i|^2\). This yields estimate (47) on the zone \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{h-1}^{\delta _{h-1}}\big )^\mathrm{{c}} \cap \Sigma _h^{\delta _h}\) for any \(\delta _h > 0\) provided that \(\delta _1\), \(\dots \), \(\delta _{h-1}\) are chosen large enough.
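For completeness, the two elementary inequalities behind this argument can be written out. The Cauchy–Schwarz inequality gives, for complex numbers \(z_1,\dots ,z_k\),

$$\begin{aligned} |z_1 + \cdots + z_k|^2 \le \Big ( \sum _{i=1}^k 1\cdot |z_i| \Big )^2 \le k \sum _{i=1}^k |z_i|^2 . \end{aligned}$$

Applying this with \(k=2\) to \(b = (a+b) + (-a)\) yields \(|b|^2 \le 2|a+b|^2 + 2|a|^2\), that is,

$$\begin{aligned} |a+b|^2 \ge \frac{1}{2}|b|^2 - |a|^2 . \end{aligned}$$

This is a splitting of the type used in the first inequality of the chain above, with \(a\) the part of the sum over \(1\le j\le h-1\) and \(b\) the part over \(h\le j\le m\). The values \(\gamma _1=\frac{1}{2}\), \(\gamma _2=1\) are one admissible choice, given here as an illustration; the text only requires some positive constants.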

The last step is assuming that \(V \in \big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{m-2}^{\delta _{m-2}}\big )^\mathrm{{c}}\). Thus, from the definition of the zones \(\Sigma _h^{\delta _h}\), we have

$$\begin{aligned}&\sum _{j=h+1}^{m-1}\sum _{i=1}^{m} |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 \nonumber \\&\qquad > \delta _h \sum _{i=1}^m|\sigma ^{(m-1)}_{m-h}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{h+lm}|^2 \end{aligned}$$
(50)

for \(1 \le h \le m-2\). More precisely, from the previous estimate we obtain \(m-2\) inequalities starting with

$$\begin{aligned}&\sum _{i=1}^m|\sigma ^{(m-1)}_{m-1}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{1+lm}|^2 \nonumber \\&\quad < \frac{1}{\delta _1} \sum _{j=2}^{m-1}\sum _{i=1}^{m} |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2, \end{aligned}$$
(51)

(where we put \(h=1\) in (50)) and ending with

$$\begin{aligned}&\sum _{i=1}^m|\sigma ^{(m-1)}_{2}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{(m-2)+lm}|^2 \\&\quad < \frac{1}{\delta _{m-2}} \sum _{i=1}^{m} |\sigma ^{(m-1)}_{1}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{(m-1)+lm}|^2, \end{aligned}$$

(where we put \(h=m-2\) in (50)). Using now the second of these inequalities, i.e. \(h=2\) in (50), on the right-hand side of (51), we get

$$\begin{aligned}&\frac{1}{\delta _1} \sum _{j=3}^{m-1}\sum _{i=1}^{m} |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 + \frac{1}{\delta _1} \sum _{i=1}^m |\sigma ^{(m-1)}_{m-2}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{2+lm}|^2 \\&\quad \le \left( \frac{1}{\delta _1} + \frac{1}{\delta _1} \frac{1}{\delta _2} \right) \sum _{j=3}^{m-1}\sum _{i=1}^{m} |\sigma ^{(m-1)}_{m-j}(\pi _i\lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2. \end{aligned}$$

Then using the remaining estimates for \(h=3\) to \(h=m-2\) recursively, we finally arrive at

$$\begin{aligned}&\sum _{i=1}^m |\sigma _{m-j}^{(m-1)}(\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{j+lm}|^2 \nonumber \\&\quad \le \sum _{h=1}^{m-2} \frac{1}{\delta _h} \sum _{i=1}^{m} |\sigma _{1}^{(m-1)}(\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{m-1+lm}|^2 \end{aligned}$$
(52)

for any \(1 \le j \le m-2\), \(\delta _h\ge 1\). From (52) and the Levi-type conditions we deduce that

$$\begin{aligned} \sum _{i=1}^m \left| \sum _{j=1}^{m-1}\sum _{l=1}^m b_{il}^{(j)} V_{j+(l-1)m} \right| ^2\prec \sum _{i=1}^m |\sigma _1^{(m-1)}(\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{m-1+lm}|^2 \end{aligned}$$

in \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{m-2}^{\delta _{m-2}}\big )^\mathrm{{c}}\).

Using Lemma 5.3, we get

$$\begin{aligned}&\sum _{l=0}^{m-1} \sum _{i=1}^m\left| \sum _{j=1}^m \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm}\right| ^2 \\&\quad = \sum _{l=0}^{m-1} \sum _{i=1}^m\left| \sum _{j=1}^{m-2} \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm} + \sum _{j=m-1}^{m} \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm}\right| ^2 \\&\quad \succ \gamma _1 \sum _{l=0}^{m-1} \sum _{i=1}^m\left| \sum _{j=m-1}^m \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm}\right| ^2 - \gamma _2 \sum _{l=0}^{m-1} \sum _{i=1}^m\left| \sum _{j=1}^{m-2} \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm}\right| ^2 \\&\quad \succ \gamma _1 \sum _{i=1}^m |\sigma _1^{(m-1)}(\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{m-1+lm}|^2 \\&\qquad -\gamma _2 \sum _{l=0}^{m-1} \sum _{i=1}^m\left| \sum _{j=1}^{m-2} \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm}\right| ^2. \end{aligned}$$

The second term on the right-hand side of the last inequality can be estimated with (52) and we obtain

$$\begin{aligned} \sum _{l=0}^{m-1} \sum _{i=1}^m\left| \sum _{j=1}^m \sigma ^{(m-1)}_{m-j}(\pi _i\lambda )V_{j+lm}\right| ^2 \succ \sum _{i=1}^m |\sigma _1^{(m-1)}(\pi _i \lambda )|^2 \sum _{l=0}^{m-1} |V_{m-1+lm}|^2 \end{aligned}$$

provided that the \(\delta _h\), \(1\le h\le m-2\) are chosen large enough. Thus (47) holds on the zone \(\big (\Sigma _1^{\delta _1}\big )^\mathrm{{c}} \cap \big (\Sigma _2^{\delta _2}\big )^\mathrm{{c}} \cap \cdots \cap \big (\Sigma _{m-2}^{\delta _{m-2}}\big )^\mathrm{{c}}\) and the proof of Theorem 5.1 is complete.

6 Well-posedness results

In this section we prove our main result: the well-posedness of the Cauchy problem (1). We formulate the following theorem by adopting the language and the notations of the previous sections concerning the lower order terms. A different formulation will be given in Theorem 6.2. Note that Theorems 6.1 and 6.2 correspond to Theorems 1.1 and 1.2, respectively.

Theorem 6.1

Let \(A(t,D_x)\), \(t\in [0,T]\), \(x\in \mathbb {R}^n\), be an \(m\times m\) matrix of first order differential operators with \(C^\infty \)-coefficients. Let \(A(t,\xi )\) have real eigenvalues satisfying condition (2). Let

$$\begin{aligned} \left\{ \begin{array}{ll} D_t u - A(t,D_x)u = 0,&{}\quad (t,x) \in [0,T] \times \mathbb {R}^n\\ \left. u \right| _{t=0} = u_0,&{}\quad x \in \mathbb {R}^n \end{array} \right. \end{aligned}$$

be the Cauchy problem (1). Assume that the Cauchy problem (19),

$$\begin{aligned} \left\{ \begin{array}{l} D_t U = \mathcal A(t,D_x) U + \mathcal B(t,D_x)U, \\ \left. U \right| _{t=0} = U_0=(U_{0,1}, \ldots , U_{0,m})^T, \end{array} \right. \end{aligned}$$

obtained from (1) by block Sylvester reduction as in Sect. 3 has the lower order terms matrix \( \mathcal B(t,D_x)\) fulfilling the Levi-type conditions (43). Then, for all \(s\ge 1\) and for all \(u_0\in \gamma ^s(\mathbb {R}^n)^m\) there exists a unique solution \(u\in C^1([0,T], \gamma ^s(\mathbb {R}^n))^m\) of the Cauchy problem (1).

Proof

We assume \(s>1\) since the case \(s=1\) is known thanks to [18, 20]. By the finite propagation speed for hyperbolic equations it is not restrictive to take compactly supported initial data and, therefore, to have the solution u compactly supported in x. Note that if \(u_0\in \gamma _c^s(\mathbb {R}^n)^m\) then by differentiating the system in (1) with respect to t we immediately have that \(D_t^j u(0,x)\in \gamma _c^s(\mathbb {R}^n)^m\) for \(j=1,\dots ,m-1\). It follows that if u solves (1) then U defined in (18) solves the Cauchy problem (19) with initial data \(U_0\in \gamma _c^s(\mathbb {R}^n)^{m^2}\). We now prove that \(U\in C^1([0,T],\gamma ^s(\mathbb {R}^n))^{m^2}\). This will allow us to conclude that \(u\in C^1([0,T], \gamma ^s(\mathbb {R}^n))^m\). We recall that the Cauchy problem (19) is given by the system

$$\begin{aligned} D_t U = \mathcal A(t,D_x) U + \mathcal B(t,D_x)U, \end{aligned}$$

where \(\mathcal A(t,\xi )\) is a block Sylvester matrix with m identical blocks having the same eigenvalues as \(A(t,\xi )\). We make use of the energy \(E_\varepsilon \) defined via the quasi-symmetriser in Sect. 4. Combining the energy estimate (29) with the estimates of the first, second and third term in Sects. 4.2, 4.3 and 4.4, respectively, we get

$$\begin{aligned} \partial _t E_\varepsilon (t,\xi )\le (K_\varepsilon (t,\xi )+C_2\varepsilon \langle \xi \rangle +C_3)E_\varepsilon (t,\xi ), \end{aligned}$$
(53)

where \(K_\varepsilon (t,\xi )\) is defined in Sect. 4.2, the bound from above

$$\begin{aligned} \int _0^T K_\varepsilon (t,\xi )\, dt\le C_1\varepsilon ^{-2(m-1)/k}, \end{aligned}$$

holds for all \(k\ge 1\) and \(C_1, C_2, C_3\) are positive constants. Note that in the estimate (53) we have used both the condition (2) on the eigenvalues and the Levi-type conditions (43). Thanks to the reduction to block Sylvester form that we have applied to obtain the Cauchy problem (19), we deal here with the same kind of energy employed in [12] for the scalar weakly hyperbolic equations of order m. The proof therefore continues as the proof of Theorem 6 in [12] with the only difference that k can be taken arbitrary. This is due to the fact that the coefficients of the matrix \(A(t,\xi )\) are \(C^\infty \) with respect to t. It follows, by working on the Fourier transform level, that \(U\in C^1([0,T],\gamma ^s(\mathbb {R}^n))^{m^2}\) and therefore \(u\in C^1([0,T], \gamma ^s(\mathbb {R}^n))^m\). \(\square \)
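The way (53) leads to Gevrey regularity can be outlined as follows. This is only a sketch of the usual balancing argument, under two assumptions that are not restated in this section: that the quasi-symmetriser energy satisfies \(c\,\varepsilon ^{2(m-1)}|V|^2 \le E_\varepsilon \le C|V|^2\), and that the argument of [12] carries over without changes. Gronwall's lemma applied to (53) gives

$$\begin{aligned} E_\varepsilon (t,\xi ) \le E_\varepsilon (0,\xi ) \exp \big ( C_1\varepsilon ^{-2(m-1)/k} + C_2 T \varepsilon \langle \xi \rangle + C_3 T \big ) . \end{aligned}$$

Choosing \(\varepsilon = \langle \xi \rangle ^{-k/(k+2(m-1))}\) balances the first two terms in the exponent:

$$\begin{aligned} \varepsilon ^{-2(m-1)/k} = \varepsilon \langle \xi \rangle = \langle \xi \rangle ^{1/s_k}, \qquad s_k := 1+\frac{k}{2(m-1)} . \end{aligned}$$

With this choice \(\varepsilon ^{-2(m-1)} = \langle \xi \rangle ^{k/s_k}\), and therefore

$$\begin{aligned} |V(t,\xi )|^2 \le C' \langle \xi \rangle ^{k/s_k}\, \mathrm{e}^{C'' \langle \xi \rangle ^{1/s_k}} |V(0,\xi )|^2 . \end{aligned}$$

For compactly supported Gevrey data of order \(s\), the Fourier transform decays like \(\mathrm{e}^{-\delta \langle \xi \rangle ^{1/s}}\), which absorbs the growth above whenever \(s<s_k\). Since \(k\) is arbitrary and \(s_k\rightarrow \infty \) as \(k\rightarrow \infty \), every \(s>1\) is covered.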

We now formulate Theorem 6.1 with an additional condition on the matrix \(A(t,\xi )\) which implies the Levi-type conditions (43).

Theorem 6.2

Let \(A(t,D_x)\), \(t\in [0,T]\), \(x\in \mathbb {R}^n\), be an \(m\times m\) matrix of first order differential operators with \(C^\infty \)-coefficients. Let A have real eigenvalues satisfying condition (2) and let \(Q=(q_{ij})\) be the symmetriser of \(A_0=\langle \xi \rangle ^{-1}A\). Assume that

$$\begin{aligned} \max _{k=1,\ldots ,m-1}\Vert D_t^k A_0(t,\xi )\Vert ^2 \prec q_{j,j}(t,\xi ) \end{aligned}$$
(54)

for all \((t,\xi ) \in [0,T] \times \mathbb {R}^n\) and \(j=1,\dots ,m-1\). Then, for all \(s\ge 1\) and for all \(u_0\in \gamma ^s(\mathbb {R}^n)^m\) there exists a unique solution \(u\in C^1([0,T], \gamma ^s(\mathbb {R}^n))^m\) of the Cauchy problem (1).

Proof

From Proposition 3.2 and Corollary 3.3 we have that

$$\begin{aligned} \sum _{k=1}^m|b_{kj}^{(l)}(t,\xi )|^2 \prec \max _{k=1,\dots ,m-1}\Vert D_t^k A_0(t,\xi )\Vert ^2 \end{aligned}$$

for all \((t,\xi ) \in [0,T] \times \mathbb {R}^n\) and \(l=1,\dots , m-1\) and \(j=1,\dots ,m\). It follows that (54) implies the Levi-type conditions (43) and therefore Theorem 6.2 follows from Theorem 6.1. \(\square \)

It is clear that the hypothesis (54) on the matrix \(A_0=A\langle \xi \rangle ^{-1}\) is in general stronger than the Levi-type conditions (43). However, in some cases (43) and (54) coincide as illustrated by the following examples.

Example 6.3

In the special case \(D_t^2 u - a(t)D_x^2 u = 0\) with \(a(t) \ge 0\) and appropriate Cauchy data, the Levi-type condition is automatically satisfied for \(a \in C^2([0,T])\). Indeed, with \(a_{11}=0\), \(a_{12}=1\), \(a_{21}=a(t)\), and \(a_{22} = 0\), condition (35) becomes \(|D_t a(t)|^2 \le C a(t)\), which is satisfied by Glaeser’s inequality [15].
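For orientation, Glaeser's inequality in its usual form (recalled here for a nonnegative function \(a\in C^2(\mathbb {R})\) with bounded second derivative; the exact formulation in [15] may differ) reads

$$\begin{aligned} |a'(t)|^2 \le 2\, \Vert a''\Vert _{L^\infty }\, a(t) . \end{aligned}$$

A simple test case is \(a(t)=t^2\), for which

$$\begin{aligned} |a'(t)|^2 = 4t^2 = 2\cdot 2\cdot a(t) , \end{aligned}$$

so the inequality holds with equality. The same example shows that the square on the left-hand side cannot be dropped: \(|a'(t)| = 2|t|\) is not bounded by a multiple of \(a(t)=t^2\) near \(t=0\).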

Example 6.4

When \(m=2\), the Levi-type conditions (43) imply (54) (and therefore coincide with it). Indeed, as observed in Sect. 5.1, the Levi-type conditions are formulated as

$$\begin{aligned} (|D_t a_{11}(t)|^2 + |D_t a_{21}(t)|^2)\langle \xi \rangle ^{-2}\prec & {} \lambda _1^2(t,\xi ) + \lambda _2^2(t,\xi ), \\ (|D_t a_{12}(t)|^2 + |D_t a_{22}(t)|^2)\langle \xi \rangle ^{-2}\prec & {} \lambda _1^2(t,\xi ) + \lambda _2^2(t,\xi ). \end{aligned}$$

This implies

$$\begin{aligned} \Vert D_t A_0\Vert ^2 \prec q_{1,1} \end{aligned}$$

which is condition (54).

Example 6.5

Let us now take a \(3\times 3\) matrix A with trace zero. For simplicity let us assume that \(n=1\) and that the eigenvalues of the corresponding \(A_0\) are \(\lambda _1(t,\xi )=-\sqrt{a(t)}\xi \langle \xi \rangle ^{-1}\), \(\lambda _2(t,\xi )=0\) and \(\lambda _3(t,\xi )=\sqrt{a(t)}\xi \langle \xi \rangle ^{-1}\) with \(a(t)\ge 0\) for \(t\in [0,T]\). It follows that the hypothesis (2) on the eigenvalues is satisfied. By direct computations we get

$$\begin{aligned} q_{1,1}= & {} \lambda _1^2\lambda _2^2+\lambda _1^2\lambda _3^2+\lambda _2^2\lambda _3^2=a(t)^2\xi ^4\langle \xi \rangle ^{-4},\\ q_{2,2}= & {} (\lambda _1+\lambda _2)^2+(\lambda _1+\lambda _3)^2+(\lambda _2+\lambda _3)^2=2a(t)\xi ^2\langle \xi \rangle ^{-2}. \end{aligned}$$

Since a is bounded on [0, T] and \(|\xi |\le \langle \xi \rangle \), it follows that both \(q_{1,1}\) and \(q_{2,2}\) are bounded by a constant multiple of \(a(t)\), and therefore combining (38) with (27) we conclude that

$$\begin{aligned} |b^{(1)}_{kj}|^2= & {} |D_t^2 a_{kj} + 2D_t a_{kj}|^2\prec a(t),\\ |b^{(2)}_{kj}|^2= & {} |a_{k1}D_t a_{1j} + a_{k2}D_t a_{2j} + a_{k3}D_t a_{3j}|^2\prec a(t), \end{aligned}$$

for \(k=1,2,3\) and \(j=1,2\). One can easily check on the matrix

$$\begin{aligned} A(t,\xi )=\left( \begin{array}{l@{\quad }l@{\quad }l} 0 &{} a(t) &{} 0\\ 1 &{} 0 &{} 0\\ 0 &{} 1 &{} 0 \end{array}\right) \xi \end{aligned}$$

that the conditions above on the entries of A entail

$$\begin{aligned} |D_t^k a(t)|^2\prec a(t) \end{aligned}$$

for all \(t\in [0,T]\) and \(k=1,2\), i.e. condition (54).

We now assume that the coefficients of the matrix \(A(t,\xi )\) are analytic with respect to t. We will prove that in this case the Cauchy problem (1) with the same Levi-type conditions employed above is \(C^\infty \) well-posed.

The proof of the \(C^\infty \) well-posedness follows very closely the arguments in [12]. Thus, we will only give a sketch highlighting the differences and refer the reader to the cited work for more details. We begin by recalling a lemma on analytic functions whose proof can be found in [12, Lemma 5].

Lemma 6.6

Let \(f(t,\xi )\) be an analytic function in \(t \in [0,T]\), continuous and homogeneous of order 0 in \(\xi \in \mathbb {R}^n\). Then,

  1. (i)

    for all \(\xi \) there exists a finite partition \((\tau _{h(\xi )})\) of the interval [0, T] such that

    $$\begin{aligned} 0 = \tau _0< \tau _1< \dots< \tau _{h(\xi )}< \dots < \tau _{N(\xi )} = T \end{aligned}$$

    with \(\sup _{\xi \ne 0} N(\xi ) < +\infty \), such that \(f(t,\xi ) \ne 0\) in each open interval \((\tau _{h(\xi )}, \tau _{(h+1)(\xi )})\);

  2. (ii)

    there exists a positive constant C such that

    $$\begin{aligned} |\partial _t f(t,\xi )| \le C \left( \frac{1}{t-\tau _{h(\xi )}} + \frac{1}{\tau _{(h+1)(\xi )}-t} \right) |f(t,\xi )| \end{aligned}$$

    for all \(t \in (\tau _{h(\xi )}, \tau _{(h+1)(\xi )})\), \(\xi \in \mathbb {R}^n \setminus \{0\}\) and \(0 \le h(\xi ) \le N(\xi )-1\).
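
A simple instance may help to fix ideas. Take the hypothetical function \(f(t,\xi )=t(T-t)\), which does not depend on \(\xi \) and is therefore trivially homogeneous of order 0. It vanishes only at the endpoints, so one can take \(N(\xi )=1\), \(\tau _0=0\) and \(\tau _1=T\), and on (0, T) we have

$$\begin{aligned} \frac{|\partial _t f(t,\xi )|}{|f(t,\xi )|} = \frac{|T-2t|}{t(T-t)} = \left| \frac{1}{t}-\frac{1}{T-t}\right| \le \frac{1}{t-\tau _0}+\frac{1}{\tau _1-t}, \end{aligned}$$

that is, (ii) holds with \(C=1\) in this case.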

Theorem 6.7

If all entries of \(A(t,D_x)\) in (1) are analytic on [0, T], the eigenvalues satisfy (2) and the entries of the matrix \(\mathcal B(t,\xi )\) in (19) satisfy the Levi conditions (43) for \(\xi \) away from 0, then the Cauchy problem (1) is \(C^\infty \) well-posed, i.e., for all \(u_0\in C^\infty (\mathbb {R}^n)^m\) there exists a unique solution \(u\in C^1([0,T], C^\infty (\mathbb {R}^n))^m\) of the Cauchy problem (1).

Proof

Thanks to the finite propagation speed property it is not restrictive to assume that the initial data have compact support. By Remark 2.5, the entries of the quasi-symmetriser \(\mathcal Q_\varepsilon ^{(m)}(t,\xi )\) are analytic in \(t \in [0,T]\) and, using Proposition 2.1, can be written as

$$\begin{aligned} q_{\varepsilon ,ij}(t,\xi ) = q_{0,ij}(t,\xi ) + \varepsilon ^2 q_{1,ij}(t,\xi ) + \dots + \varepsilon ^{2(m-1)}q_{m-1,ij}(t,\xi ). \end{aligned}$$
(55)

We note that \(q_{\varepsilon ,{(i+hm)(j+hm)}} = q_{\varepsilon ,ij}\), \(h=0,\dots ,m-1\), due to the block-diagonal structure of \(\mathcal Q_\varepsilon ^{(m)}(t,\xi )\). Since all functions on the right-hand side of (55) are analytic, we can apply Lemma 6.6 to each of them. Note that the partition \((\tau _{h(\xi )})\) in Lemma 6.6 can be chosen independently of \(\varepsilon \).

Now, following [12, 22], we use a Kovalevskayan-type energy near the points \(\tau _{h(\xi )}\) and a hyperbolic-type energy on the rest of the interval [0, T] (see also [19]). We start with the interval \([0,\tau _{1}]\) (\(\tau _1 = \tau _{1(\xi )}\)), setting

$$\begin{aligned} E_\varepsilon (t,\xi ) = \left\{ \begin{array}{ll} |V(t,\xi )|^2 &{}\quad \text {for}\, t \in [0,\varepsilon ] \cup [\tau _1-\varepsilon ,\tau _1], \\ \langle \mathcal Q_\varepsilon ^{(m)}(t,\xi )V(t,\xi ) | V(t,\xi )\rangle &{}\quad \text {for}\, t \in [\varepsilon ,\tau _1-\varepsilon ]. \end{array} \right. \end{aligned}$$

The estimate on \([0,\varepsilon ] \cup [\tau _1-\varepsilon ,\tau _1]\) is standard and the details are left to the reader. We obtain, as in [12],

$$\begin{aligned} E_\varepsilon (t,\xi ) \le \left\{ \begin{array}{ll} e^{2C \varepsilon \langle \xi \rangle } E_\varepsilon (0,\xi ) &{}\quad \text {for}\, t \in [0,\varepsilon ], \\ e^{2C\varepsilon \langle \xi \rangle } E_\varepsilon (\tau _1-\varepsilon ,\xi ) &{}\quad \text {for}\, t \in [\tau _1-\varepsilon ,\tau _1]. \end{array} \right. \end{aligned}$$
(56)

On \([\varepsilon , \tau _1-\varepsilon ]\), we get

$$\begin{aligned} \partial _t E_\varepsilon (t,\xi ) \le \left( \frac{|(\partial _t \mathcal Q^{(m)}_\varepsilon V| V)|}{(\mathcal Q_\varepsilon ^{(m)} V| V)} + C_2 \varepsilon \langle \xi \rangle + C_3\right) E_\varepsilon (t,\xi ), \end{aligned}$$

where we used (31) [see (iii) in Proposition 2.1] and the Levi-type conditions (43) for \(|\xi | \ge R\) to ensure that we have

$$\begin{aligned} |((\mathcal {Q}_0^{(m)} B-B^*\mathcal {Q}_0^{(m)})V | V)| \le C |\mathcal W^{(m)} V|^2 = C (\mathcal Q^{(m)}_0 V| V), \end{aligned}$$

see also (32) in Sect. 4.4. Thanks to Proposition 2.3, the family \(\{\mathcal Q_\varepsilon ^{(m)}\}\) is nearly diagonal when the eigenvalues \(\lambda _l\), \(l=1,\dots ,m\), of A satisfy (2). Thus, we have \(\mathcal Q_\varepsilon ^{(m)} \ge c_0 {{\mathrm{diag}}}(\mathcal Q_\varepsilon ^{(m)})\), i.e.,

$$\begin{aligned} (\mathcal Q_\varepsilon ^{(m)} V | V) \ge c_0 \sum _{h=1}^m q_{\varepsilon ,hh} \sum _{l=0}^{m-1} |V_{h+lm}|^2 = c_0\sum _{h=1}^{m^2} q_{\varepsilon ,hh} |V_h|^2. \end{aligned}$$

Using Proposition 2.1 and the Cauchy–Schwarz inequality, we obtain

$$\begin{aligned} |q_{\varepsilon ,ij}||V_i||V_j| \le \sum _{h=1}^{m^2} q_{\varepsilon ,hh}|V_h|^2. \end{aligned}$$
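
For the reader's convenience, the last inequality can be unpacked as follows. This is only a sketch, under the assumption that \(\mathcal Q_\varepsilon ^{(m)}\) is Hermitian and nonnegative, so that \(|q_{\varepsilon ,ij}|^2 \le q_{\varepsilon ,ii}\, q_{\varepsilon ,jj}\) and \(q_{\varepsilon ,hh}\ge 0\) for all indices:

$$\begin{aligned} |q_{\varepsilon ,ij}||V_i||V_j| \le \sqrt{q_{\varepsilon ,ii}}\,|V_i|\,\sqrt{q_{\varepsilon ,jj}}\,|V_j| \le \frac{1}{2}\left( q_{\varepsilon ,ii}|V_i|^2 + q_{\varepsilon ,jj}|V_j|^2\right) \le \sum _{h=1}^{m^2} q_{\varepsilon ,hh}|V_h|^2. \end{aligned}$$

The second step is the elementary inequality \(ab \le \frac{1}{2}(a^2+b^2)\), and the third one uses that every term of the sum is nonnegative.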

Together with Lemma 6.6, using the last two inequalities, we conclude that

$$\begin{aligned} \int \limits _{\varepsilon }^{\tau _1-\varepsilon } \frac{|(\partial _t \mathcal Q^{(m)}_\varepsilon V| V)|}{(\mathcal Q_\varepsilon ^{(m)} V| V)} dt \le \frac{1}{c_0} \int \limits _\varepsilon ^{\tau _1-\varepsilon } \sum _{i,j=1}^{m^2} \frac{|\partial _t q_{\varepsilon ,ij}(t,\xi )|}{|q_{\varepsilon ,ij}(t,\xi )|} dt \le C \log \left( \frac{T}{\varepsilon } \right) \end{aligned}$$

for a certain positive constant C not depending on t and \(\xi \). Thanks to the block diagonal form of the quasi-symmetriser, the proof now continues as the proof of Theorem 7 in [12]. This leads to the inequality

$$\begin{aligned} |V(t,\xi )| \le c \langle \xi \rangle ^{N(\xi )(m-1)} e^{N(\xi )C_T} \langle \xi \rangle ^{N(\xi )C_T} |V(0,\xi )|, \end{aligned}$$

obtained by setting \(\varepsilon =\langle \xi \rangle ^{-1}\). Lemma 6.6 guarantees that the function \(N(\xi )\) is bounded in \(\xi \). Therefore, we can conclude that there exist a \(\kappa \in \mathbb {N}\), depending only on n, m, and T, and a positive constant C such that

$$\begin{aligned} |V(t,\xi )| \le C \langle \xi \rangle ^\kappa |V(0,\xi )| \end{aligned}$$

for all \(t \in [0,T]\) and \(|\xi | \ge R\). Clearly this estimate implies the \(C^\infty \) well-posedness of the Cauchy problem (1). \(\square \)
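
To see where the logarithm and the polynomial growth in \(\langle \xi \rangle \) come from, consider a single function satisfying the estimate of Lemma 6.6 (ii) on \((0,\tau _1)\), with \(\tau _0=0\) and \(0<\varepsilon <\tau _1/2\). This is a sketch of the mechanism only, not a replacement for the argument in [12]. Integrating the right-hand side of (ii) gives

$$\begin{aligned} \int \limits _\varepsilon ^{\tau _1-\varepsilon } \left( \frac{1}{t}+\frac{1}{\tau _1-t}\right) dt = 2\log \left( \frac{\tau _1-\varepsilon }{\varepsilon }\right) \le 2\log \left( \frac{T}{\varepsilon }\right) . \end{aligned}$$

After an application of Gronwall's lemma, a bound of the form \(C\log (T/\varepsilon )\) in the exponent, together with the choice \(\varepsilon =\langle \xi \rangle ^{-1}\), produces only polynomial factors:

$$\begin{aligned} e^{C\log (T/\varepsilon )} = T^C\langle \xi \rangle ^{C},\qquad e^{2C\varepsilon \langle \xi \rangle } = e^{2C}. \end{aligned}$$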

Remark 6.8

Since the entries of the matrix A are at least \(C^\infty \) with respect to t in both Theorems 6.1 and 6.7, we obtain from the system (1) itself that the dependence of the solution u on t is actually not only \(C^1\) but \(C^\infty \).

Remark 6.9

In this paper we have studied homogeneous systems. Our method, described in the previous sections, can be generalised to non-homogeneous systems with some technical work on the lower order terms. The key point is to investigate the relation between the matrix of the lower order terms in the original system and the matrix \(\mathcal B\) obtained after reduction to block Sylvester form.