1 Introduction

We want to study the Cauchy problem for first-order hyperbolic systems of the type

$$\begin{aligned} \begin{aligned}&D_t u-A(t,D_x)u-B(t)u=0,\quad x\in \mathbb {R}^n,\, t\in [0,T],\\&u|_{t=0}=g_0, \end{aligned} \end{aligned}$$
(1)

where A and B are \(m\times m\) matrices of first-order and zero-order differential operators, respectively, with t-dependent coefficients, and u and \(g_0\) are column vectors with m entries. We work under the assumptions that the system matrix is of size \(m\times m\) with real eigenvalues and that the coefficients are of class \(C^{m-1}\) with respect to t. It follows that at the points of highest multiplicity the eigenvalues are of Hölder class \((m-1)/m\). We will therefore assume that the matrix \(A(t,\xi )\) has m real eigenvalues \(\lambda _j(t,\xi )\) of Hölder class \(C^\alpha \), \(0<\alpha \le 1\), with respect to t. Note that it is not restrictive to assume that the eigenvalues \(\lambda _j\), \(j=1,\ldots ,m\), are ordered, because we can always reorder them and Hölder continuity is preserved by such reordering. If \(\alpha =1\), it is sufficient to assume that \(\lambda _j\), \(j=1,\ldots ,m\), are Lipschitz.

In analogy with scalar equations in [3, 7], we work under the hypothesis of the following uniform property: There exists a constant \(c>0\) such that

$$\begin{aligned} |\lambda _i(t,\xi )-\lambda _j(t,\xi )|\le c|\lambda _k(t,\xi )-\lambda _{k-1}(t,\xi )|, \end{aligned}$$
(2)

for all \(1\le i,j,k\le m\), \(t\in [0,T]\) and \(\xi \in \mathbb {R}^n\).
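To illustrate condition (2), here is a minimal sketch with hypothetical eigenvalues that are not taken from the paper. For \(m=2\) the condition holds automatically with \(c=1\) (take \(k=2\)). For \(m=3\), \(n=1\) and \(\xi >0\), assume a function \(a(t)\ge 0\) and \(t\in [0,1]\). Evenly spaced eigenvalues satisfy (2), whereas eigenvalues whose consecutive gaps vanish at different rates do not:

$$\begin{aligned} &\lambda _j=j\,a(t)\xi ,\ j=1,2,3:\quad |\lambda _i-\lambda _j|\le 2a(t)\xi =2|\lambda _k-\lambda _{k-1}|,\quad k=2,3,\\ &(\lambda _1,\lambda _2,\lambda _3)=(0,\,t^2\xi ,\,t\xi ):\quad \frac{|\lambda _3-\lambda _1|}{|\lambda _2-\lambda _1|}=\frac{t\xi }{t^2\xi }=\frac{1}{t}\rightarrow \infty \quad \text {as } t\rightarrow 0^+. \end{aligned}$$

In the second case no constant c can work near \(t=0\), so (2) fails even though the eigenvalues are real, ordered and Lipschitz in t.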

Assumptions of Hölder regularity of this type and the uniform condition (2) are rather natural (see Colombini and Kinoshita [3] and the authors’ paper [7] for a discussion and examples). In particular, Colombini and Kinoshita [3] treated the scalar version of the Cauchy problem (1) with \(n=1\), and the authors extended it to the multidimensional case \(n\ge 1\) in [7], also improving some Gevrey indices.

The research of this paper continues investigations of properties of solutions to Cauchy problems for hyperbolic equations with multiplicities. The case of time-dependent coefficients already presents a number of challenging problems, most importantly in view of the fact that already the scalar wave equation

$$\begin{aligned} \partial _t^2 v-a(t)\Delta v=0,\quad v(0)=v_0,\quad \partial _t v(0)=v_1, \end{aligned}$$
(3)

in dimension \(n=1\) may not be well-posed even for smooth data \(v_0,v_1\in C^\infty \). More precisely, if \(a\in C^\alpha \) is Hölder with \(0<\alpha <1\), even in the strictly hyperbolic case \(a>0\) the Cauchy problem (3) may have nonunique solutions (see Colombini et al. [2]), and in the weakly hyperbolic case \(a\ge 0\), even if a is smooth, \(a\in C^\infty \), the Cauchy problem (3) may have no distributional solutions (see Colombini and Spagnolo [4]). However, the Cauchy problem (3) is well-posed in suitable Gevrey classes (see Colombini et al. [1]). At the moment, scalar higher-order equations with time-dependent coefficients are relatively well understood (see, e.g. [3, 13] and their respective extensions in [7, 8]). Two further extreme cases, analytic coefficients and distributional coefficients, have also been investigated (see, e.g. the authors’ papers [6, 9, 10], respectively, and references therein). Hyperbolic systems of the form (1) have also been investigated (see, e.g. Garetto [6], Kajitani and Yuzawa [14] and Yuzawa [15]).

The main new idea behind this paper enabling us to obtain an improvement in the well-posedness results for the system in (1) is the transformation of the system (1) to a larger system which, however, enjoys the property of being in block Sylvester form. Such a transformation, which can be performed under the assumption that the system coefficients are of class \(C^{m-1}\) with respect to t, is carried out following the method of D’Ancona and Spagnolo [5], leading to the Cauchy problem of the form

$$\begin{aligned} \begin{aligned}&D_tU-\mathcal {A}(t,D_x)U-\mathcal {L}(t,D_x)U=0,\\&U|_{t=0} = U_0. \end{aligned} \end{aligned}$$
(4)

This is a Cauchy problem for a first-order hyperbolic system of pseudo-differential equations of size \(m^2\times m^2\). Despite the increase in the size of the system from \(m\times m\) to \(m^2\times m^2\) and the change from a differential system to a pseudo-differential one, the system (4) has the crucial advantage of being in block Sylvester form (see (9) for a precise formulation). This allows us to implement the ideas developed in [7] for scalar equations, where the reduction of a scalar equation to a system in Sylvester form was performed.

To summarise our result, we first note that, combining the results in [14, 15], we already know that the Cauchy problem (1) is well-posed in the Gevrey class \(\gamma ^s\), with

$$\begin{aligned} 1\le s<1+\frac{\alpha }{m}. \end{aligned}$$
(5)

Arguing by the Fourier characterisation of Gevrey–Beurling ultradistributions, one can easily extend the Gevrey well-posedness above to spaces of ultradistributions. It is our aim in this paper to show that the interval of Gevrey well-posedness in (5) can be enlarged under the uniform property (2) of the eigenvalues. Since, by the results of Kajitani and Yuzawa, at least an ultradistributional solution exists for Gevrey initial data with \(s\ge 1+\frac{\alpha }{m}\), we will prove, for suitable values of s, that this solution is indeed Gevrey, because it solves the reduced Cauchy problem (4). In this sense, the well-posedness of (1) can be determined by studying the well-posedness of the reduced Cauchy problem (4). More precisely, by standard arguments it is sufficient to find an a priori estimate on the Fourier transform with respect to x of the solution U of (4).

We assume that the Gevrey classes \(\gamma ^s({\mathbb R}^n)\) are well known: These are spaces of all \(f\in C^\infty (\mathbb {R}^n)\) such that for every compact set \(K\subset \mathbb {R}^n\) there exists a constant \(C>0\) such that for all \(\beta \in \mathbb N_0^n\) we have the estimate

$$\begin{aligned} \sup _{x\in K}|\partial ^\beta f(x)|\le C^{|\beta |+1}(\beta !)^s. \end{aligned}$$
(6)

For \(s=1\), we obtain the class of analytic functions. We refer to [7] for a detailed discussion and Fourier characterisations of Gevrey spaces of different types. Since we are dealing with vectors in this paper, we will write \(\gamma ^s({\mathbb R}^n)^m\) for m-vectors consisting of functions in \(\gamma ^s({\mathbb R}^n)\). This is our main result:

Theorem 1.1

Assume that coefficients of the \(m\times m\) matrices A and B are of class \(C^{m-1}\) and that the matrix \(A(t,\xi )\) has m real eigenvalues \(\lambda _j(t,\xi )\) of Hölder class \(C^\alpha \), \(0<\alpha \le 1\) with respect to t, that satisfy (2). Let \(T>0\) and \(g_0\in \gamma ^s({\mathbb R}^n)^m\). Then, the Cauchy problem (1) has a unique solution \(u\in C^1([0,T],\gamma ^s({\mathbb R}^n)^m)\) provided that

$$\begin{aligned} 1\le s< 1+\min \biggl \{\alpha ,\frac{1}{m-1}\biggr \}. \end{aligned}$$
(7)

For the proof, we can assume that \(s>1\) since the case \(s=1\) is essentially known (see [11, 12]).

We also note that the proof covers the case \(\alpha =1\), in which case it is enough to assume that the eigenvalues are Lipschitz.

We note that the result of Theorem 1.1 is an improvement of known results in terms of the Gevrey order. For example, this is an improvement of Yuzawa’s and Kajitani’s order (5) from [14, 15]. See Remark 2.3 for more details.

The energy estimates obtained in the proof of Theorem 1.1 also allow one to obtain ultradistributional well-posedness results. First we note that the Gevrey spaces \(\gamma ^s(\mathbb {R}^n)\) considered in (6) are of Gevrey–Roumieu type. At the same time, we denote by \(\gamma ^{(s)}(\mathbb {R}^n)\) the Gevrey spaces of Gevrey–Beurling type, i.e. the space of all \(f\in C^\infty ({\mathbb R}^n)\) such that for every compact set \(K\subset \mathbb {R}^n\) and for every constant \(A>0\) there exists a constant \(C_{A,K}>0\) such that for all \(\beta \in \mathbb N_0^n\) we have the estimate

$$\begin{aligned} \sup _{x\in K}|\partial ^\beta f(x)|\le C_{A,K} A^{|\beta |}(\beta !)^s. \end{aligned}$$

For \(1<s<\infty \), we denote by \(\mathcal D_{(s)}^\prime (\mathbb {R}^n):=(\gamma ^{(s)}_c(\mathbb {R}^n))'\) the topological dual of compactly supported functions in \(\gamma ^{(s)}(\mathbb {R}^n)\) and by \({\mathcal E}'_{(s)}(\mathbb {R}^n)\) the topological dual of \(\gamma ^{(s)}(\mathbb {R}^n)\). Consequently, arguing similarly to [7], the proof of Theorem 1.1 yields the following ultradistributional well-posedness:

Theorem 1.2

Assume that coefficients of the \(m\times m\) matrices A and B are of class \(C^{m-1}\) and that the matrix \(A(t,\xi )\) has m real eigenvalues \(\lambda _j(t,\xi )\) of Hölder class \(C^\alpha \), \(0<\alpha \le 1\) with respect to t, that satisfy (2). Let \(T>0\) and \(g_0\in ({\mathcal E}_{(s)}^\prime (\mathbb {R}^n))^m\). Then, the Cauchy problem (1) has a unique solution \(u\in C^1([0,T],(\mathcal D_{(s)}^\prime (\mathbb {R}^n))^m)\) provided that

$$\begin{aligned} 1<s\le 1+\min \biggl \{\alpha ,\frac{1}{m-1}\biggr \}. \end{aligned}$$

2 Proof of Theorem 1.1

The first step in our new approach to the Cauchy problem (1) is to rewrite the system in a special form, i.e. in block Sylvester form. This is possible thanks to the reduction given by D’Ancona and Spagnolo [5], which is summarised in the following subsection.

2.1 Reduction to block Sylvester form

We begin by considering the cofactor matrix \(L(t,\tau ,\xi )\) of \((\tau I-A(t,\xi ))^T\) where I is the \(m\times m\) identity matrix. By applying the corresponding operator \(L(t,D_t,D_x)\) to (1), we transform the system

$$\begin{aligned} D_t u-A(t,D_x)u-B(t)u=0 \end{aligned}$$

into

$$\begin{aligned} \delta (t,D_t,D_x)Iu-C(t,D_t,D_x)u=0, \end{aligned}$$
(8)

where \(\delta (t,\tau ,\xi )=\mathrm{det}(\tau I-A(t,\xi ))\) and \(C(t,D_t,D_x)\) is the matrix of lower-order terms (differential operators of order \(m-1\)). Since the entries of A and B are of class \(C^{m-1}\) with respect to t, the equation above has continuous t-dependent coefficients. Indeed, the coefficients of the equation \(D_t u-A(t,D_x)u-B(t)u=0\) are of class \(C^{m-1}\) and the operator \(L(t,D_t,D_x)\) is of order \(m-1\), being defined via the cofactor matrix of an \(m\times m\) matrix. Note that \(\delta (t,D_t,D_x)\) is the operator

$$\begin{aligned} D_t^m+\sum _{h=0}^{m-1}b_{m-h}(t,D_x)D_t^h, \end{aligned}$$

with \(b_{m-h}(t,\xi )\) a homogeneous polynomial of degree \(m-h\) in \(\xi \).
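For orientation, the case \(m=2\) can be written out at the level of symbols. The entries \(a_{ij}=a_{ij}(t,\xi )\) of \(A(t,\xi )\) are notation introduced here for illustration only. The cofactor matrix of \((\tau I-A)^T\) is the adjugate of \(\tau I-A\), so that

$$\begin{aligned} L(t,\tau ,\xi )&=\left( \begin{array}{cc} \tau -a_{22} &{}\quad a_{12}\\ a_{21} &{}\quad \tau -a_{11} \end{array} \right) ,\qquad L(t,\tau ,\xi )(\tau I-A(t,\xi ))=\delta (t,\tau ,\xi )I,\\ \delta (t,\tau ,\xi )&=\tau ^2-(a_{11}+a_{22})\tau +(a_{11}a_{22}-a_{12}a_{21}). \end{aligned}$$

In this sketch \(b_1=-\mathrm{tr}\,A\) and \(b_2=\mathrm{det}\,A\), which are homogeneous of degree 1 and 2 in \(\xi \). At the operator level, the t-derivatives falling on the coefficients of A and the terms involving B have order at most \(m-1=1\); these are the terms collected in \(C(t,D_t,D_x)\).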

In this way we obtain a set of scalar equations of order m, which can be transformed into a first-order system of pseudo-differential equations of size \(m^2\times m^2\) by setting

$$\begin{aligned} \begin{aligned} U&= (U_1,\ldots ,U_m)^T \in \mathbb {R}^{m^2} \\ U_i&= \left( D_t^{j-1}\langle D_x \rangle ^{m-j}u_i \right) _{j=1,\ldots ,m} \in \mathbb {R}^m, \quad i=1,\ldots ,m, \end{aligned} \end{aligned}$$

where \(\langle D_x \rangle \) is the pseudo-differential operator with symbol \(\langle \xi \rangle \). More precisely, the equation (8) is now written as

$$\begin{aligned} D_tU-\mathcal {A}(t,D_x)U-\mathcal {L}(t,D_x)U=0, \end{aligned}$$

where \(\mathcal {A}\) is an \(m^2\times m^2\) block diagonal matrix made of m identical blocks of the type

$$\begin{aligned} \langle D_x \rangle \cdot \left( \begin{array}{ccccc} 0 &{}\quad 1 &{}\quad 0 &{}\quad \cdots &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 1 &{}\quad \cdots &{}\quad 0\\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ -b_m(t,D_x)\langle D_x \rangle ^{-m}&{}\quad -b_{m-1}(t,D_x)\langle D_x \rangle ^{-m+1} &{}\quad \cdots &{}\quad \cdots &{}\quad -b_1(t,D_x)\langle D_x \rangle ^{-1}\\ \end{array} \right) , \end{aligned}$$

and the matrix \(\mathcal {L}\) of the lower-order terms is made of m blocks of size \(m\times m^2\) of the type

$$\begin{aligned} \left( \begin{array}{cccccc} 0 &{}\quad 0 &{}\quad 0 &{}\quad \cdots &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad \cdots &{}\quad 0 &{}\quad 0\\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ l_{j,1}(t, D_x)&{}\quad l_{j,2}(t,D_x) &{}\quad \cdots &{}\quad \cdots &{}\quad l_{j,m^2-1}(t,D_x) &{}\quad l_{j,m^2}(t,D_x) \end{array} \right) , \end{aligned}$$

with \(j=1,\ldots ,m\). Note that the entries of the matrices \(\mathcal {A}\) and \(\mathcal {L}\) are pseudo-differential operators of order 1 and 0, respectively.
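As a sanity check on the block structure, the case \(m=2\) can be computed by hand. In this sketch each block of \(\mathcal {A}(t,\xi )\) is a \(2\times 2\) matrix, and its characteristic polynomial in \(\tau \) is exactly \(\delta (t,\tau ,\xi )\):

$$\begin{aligned} \langle \xi \rangle \left( \begin{array}{cc} 0 &{}\quad 1\\ -b_2\langle \xi \rangle ^{-2} &{}\quad -b_1\langle \xi \rangle ^{-1} \end{array} \right)&=\left( \begin{array}{cc} 0 &{}\quad \langle \xi \rangle \\ -b_2\langle \xi \rangle ^{-1} &{}\quad -b_1 \end{array} \right) ,\\ \mathrm{det}\left( \begin{array}{cc} \tau &{}\quad -\langle \xi \rangle \\ b_2\langle \xi \rangle ^{-1} &{}\quad \tau +b_1 \end{array} \right)&=\tau ^2+b_1\tau +b_2=\delta (t,\tau ,\xi ). \end{aligned}$$

Hence the eigenvalues of the block are the roots of \(\delta \), that is, the eigenvalues of \(A(t,\xi )\). The first row encodes the identity \(D_t(\langle D_x\rangle u_i)=\langle D_x\rangle (D_tu_i)\), while the second row is the principal part of the scalar equation of order 2 for \(u_i\).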

Concluding, the Cauchy problem (1) has been transformed into

$$\begin{aligned} \begin{aligned}&D_tU-\mathcal {A}(t,D_x)U-\mathcal {L}(t,D_x)U=0,\\&U|_{t=0} =U_0. \end{aligned} \end{aligned}$$
(9)

This is a Cauchy problem of first-order pseudo-differential equations with principal part in block Sylvester form. The size of the system is increased from \(m\times m\) to \(m^2\times m^2\), but the system is still hyperbolic, since the eigenvalues of any block of \(\mathcal {A}(t,\xi )\) are the eigenvalues of the matrix \(A(t,\xi )\). The initial data \(U_0\) are an \(m^2\)-column vector with entries

$$\begin{aligned} U_{0,i}=\left( D_t^{j-1}\langle D_x \rangle ^{m-j}u_{i}(0,x)\right) _{j=1,\ldots , m}, \end{aligned}$$

\(i=1,\ldots ,m\), where u is the solution of the Cauchy problem (1) with \(u(0,x)=g_0\). As observed in the introduction, we already know that such u exists at least as an ultradistribution. By using the initial condition \(g_0\) and by differentiating the system in (1) \(m-1\) times with respect to t, we obtain that \(D_t^{j-1}\langle D_x \rangle ^{m-j}u(0,x)\) has the same regularity properties as \(g_0\) for \(j=1,\ldots ,m\). It follows that if \(g_0\in \gamma ^s(\mathbb {R}^n)^m\) then \(U_0\in \gamma ^s(\mathbb {R}^n)^{m^2}\).
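For \(m=2\) the initial data can be written explicitly. In this sketch \(g_{0,i}\) denotes the i-th entry of \(g_0\) (notation introduced here), and the second component is read off from the system (1) at \(t=0\):

$$\begin{aligned} U_{0,i}=\left( \begin{array}{c} \langle D_x\rangle g_{0,i}\\ \bigl (A(0,D_x)g_0+B(0)g_0\bigr )_i \end{array} \right) ,\quad i=1,2. \end{aligned}$$

Both entries are obtained from \(g_0\) by applying operators of order at most one in x. For larger m the higher time derivatives at \(t=0\) are computed in the same way, by differentiating (1) in t and substituting the lower derivatives already found.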

2.2 Energy estimates

As in [7] we regularise the eigenvalues \(\lambda _{j}(t,\xi )\) with respect to t and we separate them by adding some power of a parameter \(\varepsilon \rightarrow 0\). In detail, assuming that the \(\lambda _j\)’s are ordered and taking a mollifier \(\varphi \in \mathcal {C}^\infty _{\text {c}}(\mathbb {R})\), \(\varphi \ge 0\) with \(\int \varphi (t)\, dt=1\) we set

$$\begin{aligned} \lambda _{j,\varepsilon }(t,\xi ):=(\lambda _j(\cdot ,\xi )*\varphi _\varepsilon )(t)+j\varepsilon ^\alpha \langle \xi \rangle ,\quad t\in [0,T],\, \xi \in \mathbb {R}^n, \end{aligned}$$

where \(\varphi _\varepsilon (t)=\varepsilon ^{-1}\varphi (t/\varepsilon )\) and \(j=1,\ldots ,m\). The next proposition collects the main properties of these regularised eigenvalues and has been proven in [7] (see Propositions 18 and 19).

Proposition 2.1

Let \(\varphi \in C^\infty _{c}(\mathbb {R})\), \(\varphi \ge 0\) with \(\int _\mathbb {R}\varphi (x)\, dx=1\).

Under the assumptions of Theorem 1.1, let

$$\begin{aligned} \lambda _{j,\varepsilon }(t,\xi ):=(\lambda _j(\cdot ,\xi )*\varphi _\varepsilon )(t)+j\varepsilon ^\alpha \langle \xi \rangle , \end{aligned}$$
(10)

for \(j=1,\ldots ,m\) and \(\varphi _\varepsilon (s)=\varepsilon ^{-1}\varphi (s/\varepsilon )\), \(\varepsilon >0\). Then, there exists a constant \(c>0\) such that

(i) \(|\partial _t\lambda _{j,\varepsilon }(t,\xi )|\le c\,\varepsilon ^{\alpha -1}\langle \xi \rangle \),

(ii) \(|\lambda _{j,\varepsilon }(t,\xi )-\lambda _j(t,\xi )|\le c\,\varepsilon ^{\alpha }\langle \xi \rangle \),

(iii) \(\lambda _{j,\varepsilon }(t,\xi )-\lambda _{i,\varepsilon }(t,\xi )\ge \varepsilon ^\alpha \langle \xi \rangle \) for \(j>i\),

for all \(t\in [0,T']\) with \(T'<T\) and all \(\xi \in \mathbb {R}^n\).
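The separation of the regularised eigenvalues and the approximation bound can be checked directly from (10). The following sketch assumes that the \(\lambda _j\) are ordered, so that \(\lambda _i\le \lambda _j\) for \(i<j\), and that their Hölder seminorm in t is bounded by \(c_0\langle \xi \rangle \) uniformly in \(\xi \); the constant \(c_0\) and the integration variable \(\sigma \) are introduced here for illustration:

$$\begin{aligned} \lambda _{j,\varepsilon }(t,\xi )-\lambda _{i,\varepsilon }(t,\xi )&=\bigl ((\lambda _j-\lambda _i)(\cdot ,\xi )*\varphi _\varepsilon \bigr )(t)+(j-i)\varepsilon ^\alpha \langle \xi \rangle \ge \varepsilon ^\alpha \langle \xi \rangle ,\quad j>i,\\ |(\lambda _j(\cdot ,\xi )*\varphi _\varepsilon )(t)-\lambda _j(t,\xi )|&\le \int |\lambda _j(t-\varepsilon \sigma ,\xi )-\lambda _j(t,\xi )|\varphi (\sigma )\, d\sigma \le c_0\,\varepsilon ^\alpha \langle \xi \rangle \int |\sigma |^\alpha \varphi (\sigma )\, d\sigma . \end{aligned}$$

The first line uses \(\varphi \ge 0\), so that the convolution of the nonnegative function \(\lambda _j-\lambda _i\) is nonnegative. Adding the shift \(j\varepsilon ^\alpha \langle \xi \rangle \) to the second line gives a bound of the form \(c\,\varepsilon ^\alpha \langle \xi \rangle \) for \(|\lambda _{j,\varepsilon }-\lambda _j|\).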

We can now define the \(m^2\times m^2\) block diagonal matrix \(H_\varepsilon \) made of m identical blocks of the type

$$\begin{aligned} \left( \begin{array}{ccccc} 1 &{}\quad 1 &{}\quad 1 &{}\quad \dots &{}\quad 1\\ \lambda _{1,\varepsilon }\langle \xi \rangle ^{-1} &{}\quad \lambda _{2,\varepsilon }\langle \xi \rangle ^{-1} &{}\quad \lambda _{3,\varepsilon }\langle \xi \rangle ^{-1} &{}\quad \dots &{}\quad \lambda _{m,\varepsilon }\langle \xi \rangle ^{-1} \\ \lambda ^2_{1,\varepsilon }\langle \xi \rangle ^{-2} &{}\quad \lambda ^2_{2,\varepsilon }\langle \xi \rangle ^{-2} &{}\quad \lambda ^2_{3,\varepsilon }\langle \xi \rangle ^{-2} &{}\quad \dots &{}\quad \lambda ^2_{m,\varepsilon }\langle \xi \rangle ^{-2} \\ \dots &{}\quad \dots &{}\quad \dots &{}\quad \dots &{}\quad \dots \\ \lambda ^{m-1}_{1,\varepsilon }\langle \xi \rangle ^{-m+1} &{}\quad \lambda ^{m-1}_{2,\varepsilon }\langle \xi \rangle ^{-m+1} &{}\quad \lambda ^{m-1}_{3,\varepsilon }\langle \xi \rangle ^{-m+1} &{}\quad \dots &{}\quad \lambda ^{m-1}_{m,\varepsilon }\langle \xi \rangle ^{-m+1}\\ \end{array} \right) . \end{aligned}$$
(11)

By separation of the regularised eigenvalues, one easily sees that the matrix \(H_\varepsilon \) is invertible. Since weakly hyperbolic equations and systems possess the finite speed of propagation property, we know that if the initial data are compactly supported then the solution will be compactly supported in x as well. Hence, instead of dealing with the Cauchy problem (9) directly we can apply the Fourier transform with respect to x to it and focus on the corresponding Cauchy problem

$$\begin{aligned} \begin{aligned} D_tV-\mathcal {A}(t,\xi )V-\mathcal {L}(t,\xi )V&=0,\\ V|_{t=0}&=\widehat{U_0}. \end{aligned} \end{aligned}$$
(12)

Note that assuming compactly supported initial data in Theorem 1.1 is not restrictive. We look for a solution \(V(t,\xi )\) of the type

$$\begin{aligned} V(t,\xi )=\mathrm {e}^{-\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}(\det H_\varepsilon )^{-1}H_\varepsilon W, \end{aligned}$$
(13)

where \(\rho \in C^1[0,T]\) will be determined in the sequel. By substitution in (12) we obtain

$$\begin{aligned}&\mathrm {e}^{-\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}(\det H_\varepsilon )^{-1}H_\varepsilon D_tW+\mathrm {e}^{-\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}\mathrm{i}\rho '(t)\langle \xi \rangle ^{\frac{1}{s}}(\det H_\varepsilon )^{-1}H_\varepsilon W\\&\quad +\,\mathrm{i}\mathrm {e}^{-\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}\frac{\partial _t\det H_\varepsilon }{(\det H_\varepsilon )^2}H_\varepsilon W +\mathrm {e}^{-\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}(\det H_\varepsilon )^{-1}(D_tH_\varepsilon )W\\&\qquad =\mathrm {e}^{-\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}(\det H_\varepsilon )^{-1}(\mathcal {A}+\mathcal {L})H_\varepsilon W. \end{aligned}$$

Multiplying both sides of the previous equation by \(\mathrm {e}^{\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}(\det H_\varepsilon )H_\varepsilon ^{-1}\) we get

$$\begin{aligned} D_tW+\mathrm{i}\rho '(t)\langle \xi \rangle ^{\frac{1}{s}}W+\mathrm{i}\frac{\partial _t\det H_\varepsilon }{\det H_\varepsilon }W + H_\varepsilon ^{-1}(D_t H_\varepsilon )W= H_\varepsilon ^{-1}(\mathcal {A}+\mathcal {L})H_\varepsilon W. \end{aligned}$$

Thus,

$$\begin{aligned}&\partial _t |W(t,\xi )|^2=2\mathrm{Re} (\partial _t W(t,\xi ),W(t,\xi ))\nonumber \\&\quad =2\rho '(t)\langle \xi \rangle ^{\frac{1}{s}}|W(t,\xi )|^2+2\frac{\partial _t\det H_\varepsilon }{\det H_\varepsilon }|W(t,\xi )|^2-2 \mathrm{Re}(H_\varepsilon ^{-1}\partial _t H_\varepsilon W,W)\nonumber \\&\qquad -2\mathrm{Im} (H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon W,W)-2\mathrm{Im} (H_\varepsilon ^{-1}\mathcal {L}H_\varepsilon W,W). \end{aligned}$$
(14)
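The identity (14) can be recovered from the equation for W in two steps. The sketch below uses the convention \(D_t=-\mathrm{i}\partial _t\), which is the one consistent with the signs above, and the fact that \(\det H_\varepsilon \) is real; M stands for either \(H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon \) or \(H_\varepsilon ^{-1}\mathcal {L}H_\varepsilon \) and is notation introduced here:

$$\begin{aligned} \partial _tW&=\rho '(t)\langle \xi \rangle ^{\frac{1}{s}}W+\frac{\partial _t\det H_\varepsilon }{\det H_\varepsilon }W-H_\varepsilon ^{-1}(\partial _tH_\varepsilon )W+\mathrm{i}H_\varepsilon ^{-1}(\mathcal {A}+\mathcal {L})H_\varepsilon W,\\ 2\mathrm{Re}(\mathrm{i}MW,W)&=-2\mathrm{Im}(MW,W)=\mathrm{i}\bigl ((M-M^*)W,W\bigr ),\qquad |2\mathrm{Im}(MW,W)|\le \Vert M-M^*\Vert \,|W|^2. \end{aligned}$$

Taking \(2\mathrm{Re}(\cdot ,W)\) of the first line gives (14). The second line explains why only the anti-self-adjoint parts \(M-M^*\) of the conjugated matrices need to be estimated.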

Inspired by the treatment of higher-order equations given in [7], we proceed by estimating the terms:

(i) \(\frac{\partial _t\det H_\varepsilon }{\det H_\varepsilon }\),

(ii) \(\Vert H_\varepsilon ^{-1}\partial _t H_\varepsilon \Vert \),

(iii) \(\Vert H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon -(H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon )^*\Vert \),

(iv) \(\Vert H_\varepsilon ^{-1}\mathcal {L}H_\varepsilon -(H_\varepsilon ^{-1}\mathcal {L}H_\varepsilon )^*\Vert \).

2.2.1 Estimate of (i), (ii), (iii) and (iv)

We begin by noting that the m identical blocks of the \(m^2\times m^2\)-matrix \(H_\varepsilon \) are exactly given by the matrix H used in the paper [7] (formula (3.4)). Hence, we can set

$$\begin{aligned} H=\left( \begin{array}{ccccc} 1 &{}\quad 1 &{}\quad 1 &{}\quad \dots &{}\quad 1\\ \lambda _{1,\varepsilon }\langle \xi \rangle ^{-1} &{}\quad \lambda _{2,\varepsilon }\langle \xi \rangle ^{-1} &{}\quad \lambda _{3,\varepsilon }\langle \xi \rangle ^{-1} &{}\quad \dots &{}\quad \lambda _{m,\varepsilon }\langle \xi \rangle ^{-1} \\ \lambda ^2_{1,\varepsilon }\langle \xi \rangle ^{-2} &{}\quad \lambda ^2_{2,\varepsilon }\langle \xi \rangle ^{-2} &{}\quad \lambda ^2_{3,\varepsilon }\langle \xi \rangle ^{-2} &{}\quad \dots &{}\quad \lambda ^2_{m,\varepsilon }\langle \xi \rangle ^{-2} \\ \dots &{}\quad \dots &{}\quad \dots &{}\quad \dots &{}\quad \dots \\ \lambda ^{m-1}_{1,\varepsilon }\langle \xi \rangle ^{-m+1} &{}\quad \lambda ^{m-1}_{2,\varepsilon }\langle \xi \rangle ^{-m+1} &{}\quad \lambda ^{m-1}_{3,\varepsilon }\langle \xi \rangle ^{-m+1} &{}\quad \dots &{}\quad \lambda ^{m-1}_{m,\varepsilon }\langle \xi \rangle ^{-m+1}\\ \end{array} \right) , \end{aligned}$$

and observe that

$$\begin{aligned} \frac{\partial _t\det H_\varepsilon }{\det H_\varepsilon }=m\frac{\partial _t\det H}{\det H}. \end{aligned}$$

By arguing as in (4.3) in [7] we immediately have that

$$\begin{aligned} \biggl |\frac{\partial _t\det H_\varepsilon (t,\xi )}{\det H_\varepsilon (t,\xi )}\biggr |\le c_1\varepsilon ^{-1}, \end{aligned}$$
(15)

for all \(t\in [0,T]\), \(\xi \in \mathbb {R}^n\) and \(\varepsilon \in (0,1]\).
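The bound (15) can be traced back to the Vandermonde structure of H. The following is a sketch, using the bound on \(\partial _t\lambda _{j,\varepsilon }\) from Proposition 2.1 with constant c and the separation \(\lambda _{j,\varepsilon }-\lambda _{i,\varepsilon }\ge \varepsilon ^\alpha \langle \xi \rangle \) of the regularised eigenvalues for \(j>i\):

$$\begin{aligned} \det H&=\prod _{1\le i<j\le m}(\lambda _{j,\varepsilon }-\lambda _{i,\varepsilon })\langle \xi \rangle ^{-1},\qquad \frac{\partial _t\det H}{\det H}=\sum _{1\le i<j\le m}\frac{\partial _t\lambda _{j,\varepsilon }-\partial _t\lambda _{i,\varepsilon }}{\lambda _{j,\varepsilon }-\lambda _{i,\varepsilon }},\\ \biggl |\frac{\partial _t\det H}{\det H}\biggr |&\le \frac{m(m-1)}{2}\cdot \frac{2c\,\varepsilon ^{\alpha -1}\langle \xi \rangle }{\varepsilon ^{\alpha }\langle \xi \rangle }=c\,m(m-1)\,\varepsilon ^{-1}. \end{aligned}$$

Multiplying by m, which accounts for the m identical blocks of \(H_\varepsilon \), gives (15) with \(c_1=c\,m^2(m-1)\) in this sketch.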

Since \(H_\varepsilon \) is block diagonal, its inverse will be block diagonal as well and precisely given by m identical blocks \(H^{-1}\) as defined in Proposition 17(ii) in [7]. It follows that to estimate \(\Vert H_\varepsilon ^{-1}\partial _t H_\varepsilon \Vert \) it is enough to estimate the norm of the corresponding block \(H^{-1}\partial _t H\). This has been done in Subsection 4.2 in [7] and leads to

$$\begin{aligned} \Vert H_\varepsilon ^{-1}\partial _t H_\varepsilon \Vert \le c_2\varepsilon ^{-1}. \end{aligned}$$
(16)

Note that to obtain (16) one uses the uniform property (2) of the eigenvalues and of the corresponding regularisations.

The same block argument applies to \(\Vert H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon -(H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon )^*\Vert \). Indeed, the matrix \(H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon -(H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon )^*\) is block diagonal with m blocks of the type \(H^{-1}\mathcal {A}H-(H^{-1}\mathcal {A}H)^*\). This is the type of matrix which has been estimated in Subsection 4.3 in [7]. In detail, \(\Vert H^{-1}\mathcal {A}H-(H^{-1}\mathcal {A}H)^*\Vert \le c_3\varepsilon ^\alpha \langle \xi \rangle \) and therefore

$$\begin{aligned} \Vert H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon -(H_\varepsilon ^{-1}\mathcal {A}H_\varepsilon )^*\Vert \le c_3\varepsilon ^\alpha \langle \xi \rangle . \end{aligned}$$
(17)

Finally, if we now consider the matrix of the lower-order terms \(H_\varepsilon ^{-1}\mathcal {L}H_\varepsilon -(H_\varepsilon ^{-1}\mathcal {L}H_\varepsilon )^*\), we easily see that it is made of m blocks of the type \((\mathrm{det}H)^{-1}\) times a matrix with 0-order symbols bounded with respect to \(\varepsilon \) (see Subsection 4.4 in [7]). More precisely, by following the arguments of Proposition 17(iv) in [7], we get the estimate

$$\begin{aligned} \Vert H_\varepsilon ^{-1}\mathcal {L}H_\varepsilon -(H_\varepsilon ^{-1}\mathcal {L}H_\varepsilon )^*\Vert \le c_4\varepsilon ^{\alpha (1-m)}. \end{aligned}$$
(18)

We now insert (15), (16), (17) and (18) in the energy estimate (14). We obtain

$$\begin{aligned} \partial _t |W(t,\xi )|^2&\le 2(\rho '(t)\langle \xi \rangle ^{\frac{1}{s}}+c_1\varepsilon ^{-1}+c_2\varepsilon ^{-1}+c_3\varepsilon ^{\alpha }\langle \xi \rangle +c_4\varepsilon ^{\alpha (1-m)})|W(t,\xi )|^2\nonumber \\&\le (2\rho '(t)\langle \xi \rangle ^{\frac{1}{s}}+C_1\varepsilon ^{-1}+C_2\varepsilon ^{\alpha }\langle \xi \rangle +C_3\varepsilon ^{\alpha (1-m)})|W(t,\xi )|^2. \end{aligned}$$
(19)

We now set \(\varepsilon =\langle \xi \rangle ^{-\gamma }\) in (19), and we compare the terms

$$\begin{aligned} \langle \xi \rangle ^{\gamma },\quad \langle \xi \rangle ^{1-\gamma \alpha },\quad \langle \xi \rangle ^{\gamma \alpha (m-1)}. \end{aligned}$$

For \(\gamma =\min \{\frac{1}{1+\alpha },\frac{1}{\alpha m}\}\) one has that

$$\begin{aligned} \max \{\gamma ,\gamma \alpha (m-1)\}\le 1-\gamma \alpha \end{aligned}$$

and therefore

$$\begin{aligned} \partial _t |W(t,\xi )|^2\le (2\rho '(t)\langle \xi \rangle ^{\frac{1}{s}}+C\langle \xi \rangle ^{1-\gamma \alpha })|W(t,\xi )|^2. \end{aligned}$$
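The choice of \(\gamma \) can be read off from two elementary equivalences. The numerical instances below are chosen here purely for illustration and show which of the three exponents \(\gamma \), \(1-\gamma \alpha \), \(\gamma \alpha (m-1)\) is dominant:

$$\begin{aligned}&\gamma \le 1-\gamma \alpha \iff \gamma \le \frac{1}{1+\alpha },\qquad \gamma \alpha (m-1)\le 1-\gamma \alpha \iff \gamma \le \frac{1}{\alpha m},\\&\alpha =\tfrac{1}{2},\ m=2:\quad \gamma =\min \{\tfrac{2}{3},1\}=\tfrac{2}{3},\quad (\gamma ,\,1-\gamma \alpha ,\,\gamma \alpha (m-1))=(\tfrac{2}{3},\tfrac{2}{3},\tfrac{1}{3}),\\&\alpha =1,\ m=3:\quad \gamma =\min \{\tfrac{1}{2},\tfrac{1}{3}\}=\tfrac{1}{3},\quad (\gamma ,\,1-\gamma \alpha ,\,\gamma \alpha (m-1))=(\tfrac{1}{3},\tfrac{2}{3},\tfrac{2}{3}). \end{aligned}$$

In both instances the dominant exponent is \(1-\gamma \alpha =\tfrac{2}{3}\), so the term with \(\rho '\) can absorb the others as soon as \(\frac{1}{s}>\tfrac{2}{3}\), i.e. \(s<\tfrac{3}{2}\).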

2.3 Conclusion of the proof of Theorem 1.1

Let \(\rho (t)=\rho (0)-\kappa t\), where \(\kappa >0\). If

$$\begin{aligned} \frac{1}{s}>1-\gamma \alpha =1-\min \biggl \{\frac{\alpha }{1+\alpha },\frac{1}{m}\biggr \}=\max \biggl \{\frac{1}{1+\alpha },\frac{m-1}{m}\biggr \}, \end{aligned}$$
(20)

for \(|\xi |\) large enough we have that \(\partial _t |W(t,\xi )|^2\le 0\), i.e. \(|W(t,\xi )|\le |W(0,\xi )|\). Therefore,

$$\begin{aligned} |V(t,\xi )|\le & {} \mathrm {e}^{-\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}\frac{1}{\det H_\varepsilon (t,\xi )}|H_\varepsilon (t,\xi )||W(t,\xi )| \nonumber \\\le & {} \mathrm {e}^{-\rho (t)\langle \xi \rangle ^{\frac{1}{s}}}\frac{1}{\det H_\varepsilon (t,\xi )}|H_\varepsilon (t,\xi )||W(0,\xi )|\nonumber \\\le & {} \mathrm {e}^{(-\rho (t)+\rho (0))\langle \xi \rangle ^{\frac{1}{s}}}\frac{\det H_\varepsilon (0,\xi )}{\det H_\varepsilon (t,\xi )}|H_\varepsilon (t,\xi )||H_\varepsilon ^{-1}(0,\xi )||V(0,\xi )|, \end{aligned}$$
(21)

where, arguing on the block level for \(\gamma \) as above, we have

$$\begin{aligned} \frac{\det H_\varepsilon (0,\xi )}{\det H_\varepsilon (t,\xi )}|H_\varepsilon (t,\xi )||H_\varepsilon ^{-1}(0,\xi )|\le c\,\varepsilon ^{-\alpha \frac{(m-1)m}{2}}=c\langle \xi \rangle ^{\gamma \alpha \frac{(m-1)m}{2}}. \end{aligned}$$

It follows that

$$\begin{aligned} |V(t,\xi )|\le c\mathrm {e}^{\kappa T\langle \xi \rangle ^{\frac{1}{s}}}\langle \xi \rangle ^{\gamma \alpha \frac{(m-1)m}{2}}|V(0,\xi )|. \end{aligned}$$

By choosing \(\kappa \) small enough we can conclude that \(|V(t,\xi )|\le c'\mathrm {e}^{-\delta \langle \xi \rangle ^{\frac{1}{s}}}\) for some \(c',\delta >0\). By the Paley–Wiener characterisation of Gevrey functions this yields the existence and uniqueness of the solution \(U\in C^1([0,T];\gamma ^s(\mathbb {R}^n)^{m^2})\) of the Cauchy problem (9) and therefore the Gevrey well-posedness of the original Cauchy problem (1).
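The choice of \(\kappa \) can be made explicit. Assume, as a hypothesis on the data stated here for illustration, that \(|V(0,\xi )|\le c_0\mathrm {e}^{-\delta _0\langle \xi \rangle ^{\frac{1}{s}}}\) for some \(c_0,\delta _0>0\), which is the kind of decay expressed by the Fourier characterisation for compactly supported Gevrey data. Writing \(N=\gamma \alpha \frac{(m-1)m}{2}\) and taking \(\kappa =\frac{\delta _0}{2T}\), one gets

$$\begin{aligned} |V(t,\xi )|\le c\,c_0\langle \xi \rangle ^{N}\mathrm {e}^{(\kappa T-\delta _0)\langle \xi \rangle ^{\frac{1}{s}}}=c\,c_0\langle \xi \rangle ^{N}\mathrm {e}^{-\frac{\delta _0}{2}\langle \xi \rangle ^{\frac{1}{s}}}\le c'\mathrm {e}^{-\frac{\delta _0}{4}\langle \xi \rangle ^{\frac{1}{s}}}, \end{aligned}$$

where \(c'=c\,c_0\sup _{\xi }\langle \xi \rangle ^{N}\mathrm {e}^{-\frac{\delta _0}{4}\langle \xi \rangle ^{\frac{1}{s}}}<\infty \). In this sketch the polynomial factor is absorbed by half of the remaining exponential decay, and one may take \(\delta =\frac{\delta _0}{4}\).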

Remark 2.2

We have proven that the solution u of the Cauchy problem (1) is of class \(C^1\) with respect to t. Since the coefficients of the matrices A and B are of class \(C^{m-1}\), it actually follows that u belongs to \(C^m([0,T];\gamma ^s(\mathbb {R}^n)^m)\).

Remark 2.3

Note that (20) implies

$$\begin{aligned} s<1+\min \biggl \{\alpha ,\frac{1}{m-1}\biggr \}. \end{aligned}$$

This is an improvement in terms of Gevrey order of Yuzawa’s and Kajitani’s result in [14, 15]. Indeed, Yuzawa first for t-dependent systems (without lower-order terms) in [15] and later Yuzawa and Kajitani for (t, x)-dependent systems in [14] have proven well-posedness in the Gevrey class \(\gamma ^s\), with

$$\begin{aligned} 1\le s<1+\frac{\alpha }{m}. \end{aligned}$$

It is easy to see that

$$\begin{aligned} \frac{\alpha }{m}\le \min \biggl \{\alpha ,\frac{1}{m-1}\biggr \}. \end{aligned}$$
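The inequality follows from \(\frac{\alpha }{m}\le \alpha \) and, since \(\alpha \le 1\) and \(m\ge 2\), from \(\frac{\alpha }{m}\le \frac{1}{m}<\frac{1}{m-1}\). Two numerical instances, chosen here for illustration, show the size of the gain:

$$\begin{aligned}&\alpha =\tfrac{1}{2},\ m=3:\quad 1+\frac{\alpha }{m}=\frac{7}{6},\qquad 1+\min \biggl \{\alpha ,\frac{1}{m-1}\biggr \}=\frac{3}{2},\\&\alpha =1,\ m=2:\quad 1+\frac{\alpha }{m}=\frac{3}{2},\qquad 1+\min \biggl \{\alpha ,\frac{1}{m-1}\biggr \}=2. \end{aligned}$$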

Remark 2.4

The strategy adopted in the proof of Theorem 1.1 shows how the energy estimate used for scalar equations in [7] can be directly applied to systems after reduction to block Sylvester form to obtain Gevrey well-posedness. In the same way, one can get well-posedness in spaces of ultradistributions. In other words, Theorem 1.2 is proven by arguing on the reduced Cauchy problem (9) as in Subsection 4.5 from the aforementioned paper.