1 Introduction

The object of this work is the large time analysis of the Goldstein-Taylor equations on the one-dimensional torus \({\mathbb {T}}\), i.e. on \([0,2\pi ]\) with periodic boundary conditions, and for \(t\in (0,\infty )\):

$$\begin{aligned} \begin{aligned} \partial _t f_+(x,t) + \partial _x f_+(x,t)&= \frac{\sigma (x)}{2}(f_-(x,t)-f_+(x,t)),\\ \partial _t f_-(x,t) - \partial _x f_-(x,t)&= - \frac{\sigma (x)}{2}(f_-(x,t)-f_+(x,t)),\\ f_{\pm }(x,0)&= f_{\pm ,0}(x), \end{aligned} \end{aligned}$$
(1)

where \(f_{\pm }(x,t)\) are the density functions for finding an element with velocity \(\pm 1\) at position \(x\in {\mathbb {T}}\) at time \(t>0\). The function \(\sigma \in L^{\infty }_+\left( {\mathbb {T}} \right) :=\big \{f\in L^\infty ({\mathbb {T}})\;\big |\;\text{ essmin } f>0\big \}\) is the relaxation coefficient, and \(f_{\pm ,0}\) are the initial conditions. Since (1) is mass conserving, its steady state is of the form

$$\begin{aligned} f_{\pm ,\infty }(x) := f_\infty \ ,\quad x\in {\mathbb {T}}\ ;\qquad f_\infty :=\frac{1}{2} (f_{+,0}+f_{-,0})_{\mathrm {avg}}, \end{aligned}$$

with the notation

$$\begin{aligned} h_{\mathrm {avg}}:=\frac{1}{2\pi }\int _{0}^{2\pi } h(x)dx. \end{aligned}$$
(2)

The Goldstein-Taylor model was originally considered as a diffusion process, arising as the limit of a discontinuous random migration in 1D, where particles may change direction with rate \(\sigma \). It appeared in the context of turbulent fluid motion and the telegrapher’s equation, see [15, 22], respectively. (1) can also be seen as a special 1D case of a BGK-model (named after the three physicists Bhatnagar, Gross, and Krook [9]) with a discrete set of velocities. Such equations commonly appear in applications like gas and fluid dynamics as velocity discretisations of various kinetic models (e.g. the Boltzmann equation). The mathematical analysis of such discrete velocity models has a long-standing tradition, see [10, 18] and the references therein.

Although the Goldstein-Taylor equation is very simple, it still exhibits an interesting and mathematically rich structure, and it has hence attracted continuous interest over the last 20 years. Most of its mathematical analysis has been devoted to the following three topics: scaling limits, asymptotic preserving (AP) numerical schemes, and large time behaviour. In a diffusive scaling, the Goldstein-Taylor model can be viewed as a hyperbolic approximation to the heat equation [21]. Various AP-schemes for this model in the stiff relaxation regime (i.e. for \(\sigma \rightarrow \infty \)) were constructed and analyzed in [4, 16, 17]. Since the large time convergence of solutions to (1) towards its unique steady state is also the topic of this work, we review the related literature in more detail:

Analytically, the main difficulty of (1) lies in its hypocoercivity, as defined in [24]: the relaxation operator on the r.h.s. is not coercive on \(L^2({\mathbb {T}})^{\otimes 2}\). For each fixed x, the r.h.s. by itself would drive the system to its local equilibrium, generated by the kernel of the relaxation operator, \({\text {span}}\{\left( {\begin{array}{c}1\\ 1\end{array}}\right) \}\), but the local mass (density) might differ between positions. Convergence to the global equilibrium \((f_\infty ,f_\infty )^T\) only arises due to the interplay between the local relaxation and the transport operator on the l.h.s. of (1).

The Goldstein-Taylor model was also considered in the analysis of [5], if one chooses the velocity matrix to be \(V=\text{ diag }(1,-1)\) and the relaxation matrix \({\varvec{A}}(x)\) to be

$$\begin{aligned} {\varvec{A}}(x) = \frac{\sigma (x)}{2} \begin{pmatrix} 1&{} -1\\ -1&{} 1 \end{pmatrix}\ge 0. \end{aligned}$$

Exponential convergence to the steady state is then proved in the aforementioned work for the system (1) with inflow boundary conditions. Such boundary conditions make the problem significantly easier than the periodic set-up envisioned here; in particular, they allow \(\sigma (x)\) to vanish on a subset of \({\mathbb {T}}\), an issue that proves far more difficult in our setting.

In [12] the authors proved polynomial decay towards the equilibrium, allowing \(\sigma (x)\) to vanish at finitely many points.

In [23] the author proved exponential decay for solutions to (1) for a more general \(\sigma (x)\ge 0\). That work is based on a (non-local in time) weak coercive estimate on the damping.

None of the papers mentioned so far focused on the optimality of the (exponential) decay rate. Using the equivalence between (1) and the telegrapher’s equation, the authors of [8] have shown that this optimal decay rate, \(\mu (\sigma )\), is the minimum of \(\sigma _{\mathrm {avg}}\) and the spectral gap of the telegrapher’s equation (excluding the case when some of those eigenvalues with real part equal to \(\mu (\sigma )\) are defective). The precise value of this spectral gap, however, is hardly accessible, even for simple non-constant relaxation functions \(\sigma (x)\) (see e.g. Appendix A). Moreover, the approach of [8] is based on the restrictive requirement \(f_{\pm ,0}\in H^1({\mathbb {T}})\), and cannot be extended to other discrete velocity models in 1D, since it heavily relies on the equivalence of (1) to the telegrapher’s equation.

The issues above motivated our subsequent analysis: We introduce a method for \(L^2\)–initial data that can be extended to other discrete velocity BGK-models (as illustrated below for a \(3\)-velocities system), and that yields sharp rates for constant \(\sigma \). Moreover, and most importantly, it is applicable in the general non-homogeneous \(\sigma \in L^{\infty }_+\left( {\mathbb {T}} \right) \) case and yields in these cases an explicit, quantitative lower bound for the decay rate. In this case, however, it will not achieve an optimal rate of convergence to the appropriate equilibrium of the system. The method to be derived here will use a Lyapunov function technique in the spirit of the earlier works [1, 2, 13, 24].

This paper is structured as follows: In §2 we give the analytical setting of the problem and present our main convergence result (Theorem 1). In §3 we recall some analytical results which will be needed in the analysis that follows, and explore some properties of the entropy functional \(E_\theta \) and the anti-derivative of functions on \({\mathbb {T}}\), defined in (4) and (5), respectively. §4 is devoted to the case where \(\sigma (x)=\sigma \) is constant, which will motivate our more general approach: Based on a modal decomposition of the Goldstein-Taylor system and its spectral analysis we derive the entropy functional \(E_\theta \), first on a modal level and then as a pseudo-differential operator in physical space. We conclude by proving part (a) of our main theorem. Continuing to §5, we prove, using a perturbative approach to the problem, part (b) of our main theorem. The robustness of our method is shown in §6, where we use it to obtain an explicit rate of convergence for a \(3\)-velocities Goldstein-Taylor model. Finally, in Appendix A we discuss a potential way to improve the technique from §5, and explicitly show its lack of optimality in a particular case of \(\sigma (x)\).

2 The Setting of the Problem and Main Results

To better understand the Goldstein-Taylor system, (1), one starts by recasting it in the macroscopic variables

$$\begin{aligned} u: = f_+ + f_-,\quad \quad v:=f_+-f_-, \end{aligned}$$

representing the spatial (mass) density and the flux density, respectively. The macroscopic variables yield the following system of equations on \({\mathbb {T}}\times (0,\infty )\):

$$\begin{aligned} \begin{aligned}&\partial _t u(x,t) +\partial _x v(x,t) =0,\\&\partial _t v(x,t) +\partial _x u(x,t) =- \sigma (x)v(x,t),\\&u(.,0)=u_0:=f_{+,0}+f_{-,0},\quad v(.,0)=v_0:=f_{+,0}-f_{-,0}\ , \end{aligned} \end{aligned}$$
(3)

whose theory of existence and uniqueness is straightforward (since the r.h.s. is a bounded perturbation of the transport operator; see §2 in [12] or, more generally, [20]). Moreover, when one tries to understand the qualitative behaviour of (3), one notices that the equation for u encodes “total mass conservation” (upon integration over the spatial interval \((0,2\pi )\)), while the equation for v predicts a strong decay of v to zero. This means, at least intuitively, that the difference between \(f_+\) and \(f_-\) should go to zero, while their sum retains its mass. As the main driving force of the equation is transport on the torus, it is not surprising that the large time behaviour of u (and, since v goes to zero, of \(f_+\) and \(f_-\) as well) is convergence to a constant. All of this has been verified in several cases, most generally in [8].

We now set the framework that will assist us in the investigation of the large time behaviour of (3), in a relatively general case. The natural Hilbert space to consider this problem is \(L^2({\mathbb {T}})^{\otimes 2}\), with the standard inner product for each component:

$$\begin{aligned} \left\langle f_1,f_2\right\rangle :=\frac{1}{2\pi }\int _{0}^{2\pi } f_1(x)\overline{f_2(x)}dx, \end{aligned}$$

where the bar denotes complex conjugation. Since (1) and (3) are (only) hypocoercive, the symmetric part of their generators (i.e. of the operators on their r.h.s.) is not coercive on \(L^2({\mathbb {T}})^{\otimes 2}\). Hence, the standard \(L^2\)–norm cannot serve as a usable Lyapunov functional. As is typical for hypocoercive equations (see [1, 13, 24]), a possible remedy is to consider a “twisted” norm (often also referred to as an entropy functional), constructed so that this functional strictly decays along each trajectory \((u(t), v(t))\).

The following functional, which will be our entropy functional, is not an ad hoc ansatz: its origin will be derived in §4. Moreover, we will show that it yields the sharp exponential decay rate for constant \(\sigma \), when \(\theta =\theta (\sigma )\) is chosen appropriately.

Definition 1

Let \(f,g\in L^2\left( {\mathbb {T}} \right) \) and let \(\theta >0\) be given. Then we define the entropy \(E_\theta (f,g)\) as

$$\begin{aligned} E_\theta (f,g):=\left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 -\frac{\theta }{2\pi }\int _{0}^{2\pi } {\text {Re}}\left( \partial _x^{-1}f (x)\overline{g(x)} \right) dx. \end{aligned}$$
(4)

Here, the anti-derivative of f is defined as

$$\begin{aligned} \partial _x^{-1}f (x):=\int _{0}^{x}f(y)dy - \left( \int _{0}^{x}f(y)dy \right) _{\mathrm {avg}}\ , \end{aligned}$$
(5)

with the average defined in (2). The normalization constant in (5) is chosen such that \((\partial _x^{-1}f)_{\mathrm {avg}}=0\).
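As a quick numerical sanity check of Definition 1, the following Python sketch (the helper name `antiderivative` is ours, purely illustrative) approximates (5) by a cumulative trapezoid rule and verifies the normalization \((\partial _x^{-1}f)_{\mathrm {avg}}=0\), as well as the exact identity \(\partial _x^{-1}\cos =\sin \) on \({\mathbb {T}}\).

```python
import numpy as np

def antiderivative(f_vals, x):
    """Discrete version of (5): cumulative trapezoid integral, shifted to zero average."""
    F = np.concatenate(([0.0], np.cumsum(0.5 * (f_vals[1:] + f_vals[:-1]) * np.diff(x))))
    return F - np.mean(F)

x = np.linspace(0.0, 2.0 * np.pi, 2001)
f = np.cos(x)                                 # f_avg = 0
F = antiderivative(f, x)

print(np.isclose(np.mean(F), 0.0))            # True: normalization of (5)
print(np.allclose(F, np.sin(x), atol=1e-5))   # True: here d_x^{-1} cos = sin
```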

Several recent studies (like [1, 13]) considered the Goldstein-Taylor system with constant \(\sigma \). This case can be investigated fairly easily, as one can utilise Fourier analysis in this setting and construct a Lyapunov functional as a sum of quadratic functionals of the Fourier modes. However, the moment we change \(\sigma (x)\) to a non-constant function, even one that is natural in the Fourier setting, such as sine or cosine, the modal analysis becomes all but intractable.

The main idea that guided our approach was to re-examine the case of constant \(\sigma \) and to recast the modal Fourier norm as a pseudo-differential operator, without needing its modal decomposition. This functional, which is exactly \(E_\theta \) for particular choices of \(\theta =\theta (\sigma )\), can then be extended to the case where \(\sigma (x)\) is not constant, yielding quantitative convergence estimates. As the nature of this approach is perturbative, our decay rates are not optimal. The methodology itself, however, is fairly robust and viable in other cases, such as the multi-velocity Goldstein-Taylor model (as we shall see).

The main theorem we will show in this paper, with the use of the vector notation

$$\begin{aligned} f(t):=\left( {\begin{array}{c}f_+(t)\\ f_-(t)\end{array}}\right) \ ,\quad f_0:=\left( {\begin{array}{c}f_{+,0}\\ f_{-,0}\end{array}}\right) \ , \end{aligned}$$
(6)

is the following:

Theorem 1

Let \(u,v\in C([0,\infty );L^2\left( {\mathbb {T}} \right) )\) be mild, real valued solutions to (3) with initial datum \(u_0,\, v_0\in L^2\left( {\mathbb {T}} \right) \). Denoting \(u_{\mathrm {avg}}:=\left( u_0 \right) _{\mathrm {avg}}\) we have:

  a)

    \(\underline{\text {If }\sigma (x)=\sigma \text { is constant}}\) we have that: If \(\sigma \not =2\) then

    $$\begin{aligned} E_{\theta (\sigma )}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\theta (\sigma )}\left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-2\mu \left( \sigma \right) t} \end{aligned}$$

    where

    $$\begin{aligned} \theta \left( \sigma \right) :={\left\{ \begin{array}{ll} \sigma , &{} 0<\sigma<2 \\ \frac{4}{\sigma }, &{} \sigma>2 \end{array}\right. }, \quad \mu \left( \sigma \right) :={\left\{ \begin{array}{ll} \frac{\sigma }{2}, &{} 0<\sigma <2 \\ \frac{\sigma }{2}-\sqrt{\frac{\sigma ^2}{4}-1}, &{} \sigma >2 \end{array}\right. }, \end{aligned}$$

    and if \(\sigma =2\) then for any \(0<\epsilon <1\)

    $$\begin{aligned} E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-2\left( 1-\epsilon \right) t}.\end{aligned}$$

    Consequently if \(\sigma \not =2\)

    $$\begin{aligned} \begin{aligned} \Big \Vert f(t)-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert \le C_{\sigma } \Big \Vert f_0-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert e^{-\mu \left( \sigma \right) t}, \end{aligned} \end{aligned}$$
    (7)

    where

    $$\begin{aligned} C_{\sigma }:={\left\{ \begin{array}{ll} \sqrt{\frac{2+\sigma }{2-\sigma }}, &{} 0<\sigma <2 \\ \sqrt{ \frac{\sigma +2}{\sigma -2}}, &{} \sigma >2 \end{array}\right. }, \quad f_{\infty }=\frac{u_{\mathrm {avg}}}{2}\ , \end{aligned}$$
    (8)

    and the decay rate \(\mu (\sigma )\) is sharp.

    For \(\sigma =2\) we have that

    $$\begin{aligned} \begin{aligned} \Big \Vert f(t)-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert \le \frac{\sqrt{2}}{\epsilon } \Big \Vert f_0-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert e^{-\left( 1-\epsilon \right) t}\ . \end{aligned} \end{aligned}$$
    (9)
  b)

    \({\text {If }\sigma (x)\text { is non-constant}}\) such that

    $$\begin{aligned} 0<\sigma _{\mathrm {min}}:=\inf _{x\in {\mathbb {T}}}\sigma (x)< \sup _{x\in {\mathbb {T}}}\sigma (x)=:\sigma _{\mathrm {max}}<\infty , \end{aligned}$$

    then by defining

    $$\begin{aligned} \theta ^*:=\min \left( \sigma _{\mathrm {min}},\frac{4}{\sigma _{\mathrm {max}}} \right) \end{aligned}$$
    (10)

    and

    $$\begin{aligned} \alpha ^*:=\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) :={\left\{ \begin{array}{ll} \frac{\sigma _{\mathrm {min}}\left( 4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}\sigma _{\mathrm {max}} \right) }{4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}^2}, &{} \sigma _{\mathrm {min}}< \frac{4}{\sigma _{\mathrm {max}}}\\ \sigma _{\mathrm {max}}- \sqrt{\sigma _{\mathrm {max}}^2-4}, &{} \sigma _{\mathrm {min}}\ge \frac{4}{\sigma _{\mathrm {max}}} \end{array}\right. } \end{aligned}$$
    (11)

    we have that

    $$\begin{aligned} E_{\theta ^*}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\theta ^*}\left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-\alpha ^*t}, \end{aligned}$$

    and as a result

    $$\begin{aligned} \begin{aligned} \Big \Vert f(t)-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert \le \sqrt{\frac{2+\theta ^*}{2-\theta ^*}} \Big \Vert f_0-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert e^{-\frac{\alpha ^*}{2}t}\ , \end{aligned} \end{aligned}$$
    (12)

    with \(f_\infty \) defined in (8).

Part (a) of this theorem will be proved in §4.4, and Part (b) in §5. In many of the proofs which will eventually lead to the proof of this theorem we will assume that (uv) is a classical solution, pertaining to \(u_0,\; v_0\) in the periodic Sobolev space \(H^1({\mathbb {T}})\). The general result will follow by a simple density argument.
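For concreteness, the constants of part (a) are easy to evaluate numerically. The following Python sketch (the helper names are ours, not from the analysis) tabulates \(\theta (\sigma )\), \(\mu (\sigma )\) and \(C_\sigma \) for \(\sigma \ne 2\), checking e.g. that \(\mu (5)=(5-\sqrt{21})/2\).

```python
import numpy as np

def theta(sigma):
    """theta(sigma) from part (a); sigma = 2 is the defective case, excluded."""
    assert sigma > 0 and sigma != 2
    return sigma if sigma < 2 else 4.0 / sigma

def mu(sigma):
    """Sharp decay rate mu(sigma) of part (a)."""
    return sigma / 2.0 if sigma < 2 else sigma / 2.0 - np.sqrt(sigma**2 / 4.0 - 1.0)

def C(sigma):
    """Multiplicative constant C_sigma from (8)."""
    return np.sqrt((2 + sigma) / (2 - sigma)) if sigma < 2 else np.sqrt((sigma + 2) / (sigma - 2))

print(mu(1.0))        # 0.5 = sigma/2
print(mu(5.0))        # (5 - sqrt(21))/2, approx. 0.2087
print(C(1.0))         # sqrt(3), approx. 1.732
```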

Remark 1

It is simple to see that if \(\sigma (x)\) satisfies the conditions of (b), then, as \(\sigma _{\mathrm {min}}\) and \(\sigma _{\mathrm {max}}\) approach a positive constant \(\sigma \not =2\), we find that

$$\begin{aligned} \theta ^*\rightarrow \min \left( \sigma ,\frac{4}{\sigma } \right) ,\quad \text {and}\quad \alpha ^*\rightarrow {\left\{ \begin{array}{ll} \sigma -\sqrt{\sigma ^2-4}, &{} \sigma >2 \\ \sigma , &{} \sigma <2\end{array}\right. }\ , \end{aligned}$$

recovering the results of part (a) of the above theorem.

In addition, one should note that when \(\sigma _{\mathrm {min}}> \frac{4}{\sigma _{\mathrm {max}}} \) we have that

$$\begin{aligned} \alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) =2\mu \left( \sigma _{\mathrm {max}} \right) , \end{aligned}$$

where \(\mu \left( \sigma \right) \) was defined in part (a) of the Theorem. This validates the intuition that, if \(\sigma _{\mathrm {max}}\) is “large enough”, the convergence rate of the solution can be estimated using the “worst convergence rate”, corresponding to \(\mu \left( \sigma _{\mathrm {max}} \right) \) of the \(\sigma (x)=\sigma \) case.

Lastly, one notices that when \(\sigma _{\mathrm {min}}=\frac{4}{\sigma _{\mathrm {max}}}\)

$$\begin{aligned} \frac{\sigma _{\mathrm {min}}\left( 4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}\sigma _{\mathrm {max}} \right) }{4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}^2} =\sigma _{\mathrm {max}}- \sqrt{\sigma _{\mathrm {max}}^2-4}, \end{aligned}$$

which shows the continuity of \(\alpha ^{*}\) on the curve that stitches the two formulas in (11).
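This continuity can also be confirmed numerically; the following Python sketch (the function name `alpha_star` is ours) evaluates both branches of (11) on the stitching curve \(\sigma _{\mathrm {min}}=4/\sigma _{\mathrm {max}}\) and checks that they coincide.

```python
import numpy as np

def alpha_star(s_min, s_max):
    """alpha*(sigma_min, sigma_max) as defined in (11)."""
    if s_min < 4.0 / s_max:
        r = 4.0 + 2.0 * np.sqrt(4.0 - s_min**2)
        return s_min * (r - s_min * s_max) / (r - s_min**2)
    return s_max - np.sqrt(s_max**2 - 4.0)

# On the stitching curve sigma_min = 4/sigma_max the two formulas coincide:
for s_max in (2.5, 3.0, 10.0):
    s_min = 4.0 / s_max
    r = 4.0 + 2.0 * np.sqrt(4.0 - s_min**2)
    branch1 = s_min * (r - s_min * s_max) / (r - s_min**2)
    branch2 = s_max - np.sqrt(s_max**2 - 4.0)
    print(np.isclose(branch1, branch2))   # True in each case
```

(Algebraically, with \(q:=\sqrt{4-\sigma _{\mathrm {min}}^2}\) and \(\sigma _{\mathrm {min}}\sigma _{\mathrm {max}}=4\), both branches reduce to \(2\sigma _{\mathrm {min}}/(2+q)\).)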

3 Preliminaries

In this short section we will remind the reader of a few simple properties of functions on the torus, as well as explore properties of the anti-derivative function, \(\partial _x^{-1} f\), and our functional \(E_{\theta }(f,g)\). Most of the simple proofs of this section will be deferred to Appendix B.

We begin with the well known Poincaré inequality:

Lemma 1

(Poincaré Inequality) Let \(f\in H^1_{per}\left( {\mathbb {T}} \right) \) with \(f_{\mathrm {avg}}=0\). Then

$$\begin{aligned} \left\Vert f\right\Vert \le \left\Vert f^\prime \right\Vert . \end{aligned}$$
(13)

Next we focus our attention on some simple, yet crucial, properties of the anti-derivative function which was defined in (5).

Lemma 2

Let \(f\in L^1\left( {\mathbb {T}} \right) \). Then:

  i)

    \(\left( \partial ^{-1}_x f \right) _{\mathrm {avg}}=0\).

  ii)

    \(\partial _x^{-1}f\) is differentiable a.e. on \([0,2\pi ]\) and \(\partial _x\left( \partial ^{-1}_x f \right) (x)=f(x)\) a.e.

  iii)

    If in addition f is differentiable we have that \(\partial ^{-1}_x\left( \partial _x f \right) (x)=f(x)-f_{\mathrm {avg}}\).

  iv)

    If, in addition, we have that \(f_{\mathrm {avg}}=0\), then \(\partial _x^{-1}f\) is a continuous function on the torus, and

    $$\begin{aligned} \widehat{\partial ^{-1}_x f}\left( k \right) = \left\{ \begin{array}{ll} \frac{{\widehat{f}}(k)}{ik}, &{} \, k\ne 0 \\ 0, &{} \, k=0 \\ \end{array} \right. \ . \end{aligned}$$
    (14)
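The Fourier representation (14) can be cross-checked numerically against a direct computation: the following Python sketch (our own illustration) builds \(\partial _x^{-1}f\) spectrally via (14) for a mean-free trigonometric polynomial and compares it with the exact anti-derivative from (5).

```python
import numpy as np

N = 256
x = 2.0 * np.pi * np.arange(N) / N
f = np.cos(x) + np.sin(3.0 * x)           # f_avg = 0

# Spectral anti-derivative via (14): hat(F)(k) = hat(f)(k)/(ik), hat(F)(0) = 0
fh = np.fft.fft(f)
k = np.fft.fftfreq(N, d=1.0 / N)          # integer frequencies
Fh = np.zeros_like(fh)
Fh[k != 0] = fh[k != 0] / (1j * k[k != 0])
F_spec = np.fft.ifft(Fh).real

# Exact anti-derivative with zero average, cf. (5): sin(x) - cos(3x)/3
F_exact = np.sin(x) - np.cos(3.0 * x) / 3.0
print(np.allclose(F_spec, F_exact))       # True
```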

Remark 2

(ii), (iv), and the fact that f is a function on the torus, imply that if \(f_{\mathrm {avg}}=0\) we are allowed to use integration by parts with \(\partial _x^{-1}f(x)\) on this boundaryless manifold without qualms.

The last simple lemma in this section revolves around our newly defined functional, \(E_{\theta }\).

Lemma 3

Let \(f,g\in L^2\left( {\mathbb {T}} \right) \) be such that \(f_{\mathrm {avg}}=0\) and let \(\theta \in {\mathbb {R}}\) be given. Then the entropy \(E_\theta (f,g)\), defined in (4), satisfies

$$\begin{aligned} E_\theta \left( f,g \right) \le \left( 1+\frac{\left|\theta \right|}{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \ . \end{aligned}$$
(15)

If in addition \(\left|\theta \right|<2\) we have that

$$\begin{aligned} E_\theta \left( f,g \right) \ge \left( 1-\frac{\left|\theta \right|}{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \ . \end{aligned}$$
(16)

In particular, if \(0\le \theta <2\) we have that

$$\begin{aligned} \left( 1-\frac{\theta }{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \le E_\theta \left( f,g \right) \le \left( 1+\frac{\theta }{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \ . \end{aligned}$$
(17)
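The equivalence (17) of \(E_\theta \) with the squared \(L^2\)–norm is easy to observe numerically. In the following Python sketch (helper names ours; grid means approximate the \(\frac{1}{2\pi }\)-normalized integrals), \(E_\theta \) is evaluated via the spectral anti-derivative for random mean-free data and checked against the bounds of (17).

```python
import numpy as np

def antideriv(f):
    """Spectral anti-derivative (5) of a mean-free grid function, via (14)."""
    N = len(f)
    k = np.fft.fftfreq(N, d=1.0 / N)
    fh = np.fft.fft(f)
    Fh = np.zeros_like(fh)
    Fh[k != 0] = fh[k != 0] / (1j * k[k != 0])
    return np.fft.ifft(Fh).real

def E(theta, f, g):
    """Entropy (4) for real data; norms and integrals are grid means."""
    return np.mean(f**2) + np.mean(g**2) - theta * np.mean(antideriv(f) * g)

rng = np.random.default_rng(0)
N = 128
f = rng.standard_normal(N); f -= f.mean()      # enforce f_avg = 0
g = rng.standard_normal(N)
for theta in (0.5, 1.0, 1.9):
    lo = (1 - theta / 2) * (np.mean(f**2) + np.mean(g**2))
    hi = (1 + theta / 2) * (np.mean(f**2) + np.mean(g**2))
    print(lo <= E(theta, f, g) <= hi)          # True, as claimed in (17)
```

The discrete bounds hold for the same reason as in the continuous proof: by Parseval, the spectral anti-derivative satisfies the Poincaré inequality (13) exactly.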

Lastly, we shall prove the following proposition, which (finally) brings the system (3) into play, and on which we will rely frequently in our subsequent estimates.

Proposition 1

Let \(u,v\in C([0,\infty );L^2\left( {\mathbb {T}} \right) )\) be (real valued) mild solutions to (3) with initial datum \(u_0,\; v_0\in L^2\left( {\mathbb {T}} \right) \). Then for any \(\theta \in {\mathbb {R}}\)

$$\begin{aligned} \frac{d}{dt}E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right)= & {} -\theta \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 + \frac{1}{2\pi } \int _0^{2\pi }(\theta -2\sigma (x))v(x,t)^2 dx\nonumber \\&+\frac{\theta }{2\pi }\int _0^{2\pi }\sigma (x) \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx-\theta \left( v(t)_{\mathrm {avg}} \right) ^2\ ,\nonumber \\ \end{aligned}$$
(18)

where

$$\begin{aligned} u_{\mathrm {avg}}=\frac{1}{2\pi }\int _{0}^{2\pi }u_0(x)dx=\frac{1}{2\pi } \int _{0}^{2\pi }u(x,t)dx,\quad \forall t>0. \end{aligned}$$
(19)

Proof

We begin by noticing that the validity of (19) follows immediately from the fact that u is a mild solution and the conservation of mass property of the system (3). Moreover, one can see that replacing \(\left( u(t),v(t) \right) \) by \(\left( u(t)-u_{\mathrm {avg}},v(t) \right) \) yields an equivalent solution (up to a constant shift in the initial data) to the system of equations, with the additional condition that the average of the first component is zero for all \(t\ge 0\). With this observation in mind, we can assume without loss of generality that \(u_{\mathrm {avg}}=0\).

Using the Goldstein-Taylor equations we see that

$$\begin{aligned} \frac{d}{dt}\left\Vert u(t)\right\Vert ^2= & {} 2\left\langle u,\partial _t u\right\rangle =-2\left\langle u,\partial _x v\right\rangle .\\ \frac{d}{dt}\left\Vert v(t)\right\Vert ^2= & {} 2\left\langle v,\partial _t v\right\rangle =-2\left\langle v,\partial _x u+\sigma v\right\rangle . \end{aligned}$$

Since

$$\begin{aligned} \left\langle u,\partial _x v\right\rangle +\left\langle v,\partial _x u\right\rangle =\frac{1}{2\pi }\int _{0}^{2\pi }\partial _x\left( uv \right) (x,t)dx=0\ ,\end{aligned}$$

we see that

$$\begin{aligned} \frac{d}{dt}\left( \left\Vert u(t)\right\Vert ^2+\left\Vert v(t)\right\Vert ^2 \right) =-\frac{1}{\pi }\int _{0}^{2\pi } \sigma (x)v(x,t)^2dx. \end{aligned}$$
(20)

We now turn our attention to the mixed term of \(E_\theta (u,v)\):

$$\begin{aligned}&\frac{d}{dt}\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}u (x,t)v(x,t)dx\\&\quad =\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}\left( \partial _t u \right) (x,t)v(x,t)dx +\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}u (x,t)\partial _t v(x,t)dx\\&\quad =-\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}\left( \partial _x v \right) (x,t)v(x,t)dx\\&\quad -\frac{\theta }{2\pi }\int _{0}^{2\pi } \partial _x^{-1}u (x,t)[\partial _x u(x,t)+\sigma (x)v(x,t)]dx. \end{aligned}$$

Using points (ii) and (iii) of Lemma 2, together with Remark 2, we find that the above equals

$$\begin{aligned}&-\frac{\theta }{2\pi }\int _{0}^{2\pi } \left( v(x,t)-v(t)_{\mathrm {avg}} \right) v(x,t) dx + \frac{\theta }{2\pi }\int _{0}^{2\pi } u(x,t)^2 dx \\&\quad -\frac{\theta }{2\pi }\int _{0}^{2\pi } \sigma (x) \partial _x^{-1}u(x,t)v(x,t)dx. \end{aligned}$$

Subtracting this from (20) (as there is a minus in definition (4)) yields (18). \(\square \)

4 Constant Relaxation Function

In recent years, the investigation of the Goldstein-Taylor model on \({\mathbb {T}}\) with constant relaxation function \(\sigma \) was frequently tackled with a modal decomposition in Fourier space w.r.t. x. This approach allows for an extension to other discrete velocity models and even to some continuous-velocity models [1], but is not suitable for the non-homogeneous case.

Before beginning with our investigation we review a few recent results:

In [13, §1.4] exponential convergence to equilibrium was shown, but without the sharp rate. In [1, §4.1] a hypocoercive decay estimate of the form

$$\begin{aligned} \Big \Vert f(t)-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert _{L^2} \le c\,e^{-\mu t} \Big \Vert f_0-\left( {\begin{array}{c}f_\infty \\ f_\infty \end{array}}\right) \Big \Vert _{L^2}\ , \end{aligned}$$

with the vector notation from (6) and the sharp rate

$$\begin{aligned} \mu (\sigma )= \left\{ \begin{array}{ll} \frac{\sigma }{2}, &{} \,0< \sigma <2 \\ \frac{\sigma }{2}-\sqrt{\frac{\sigma ^2}{4}-1}, &{} \,\sigma >2 \\ \end{array} \right. \end{aligned}$$

was obtained (see also Fig. 1 below). A further study on the minimal constant c in the above was provided in [3, Th. 1.1].

Fig. 1

The exponential decay rate, \(\mu \left( \sigma \right) \), of the solution pair \((u(t)-u_{\mathrm {avg}},\,v(t))\) grows linearly until \(\sigma =2\), where defectiveness appears (hence the circle). From that point onwards the decay rate decreases, and is of order \(O\left( \frac{1}{\sigma } \right) \).

With these results in mind, we turn our attention to the following (recast) Goldstein-Taylor equation with a constant relaxation rate:

$$\begin{aligned} \begin{aligned} \partial _t u(x,t)&= -\partial _x v(x,t),\\ \partial _t v(x,t)&= -\partial _x u(x,t) - \sigma v(x,t) \ . \end{aligned} \end{aligned}$$
(21)

In order to discover our entropy functional, we shall consider the straightforward modal analysis in detail. This will allow us to obtain not only explicit decay rates for each Fourier mode, but also an “optimal Lyapunov functional” for each such mode, with which we will then construct a non-modal entropy functional in terms of a pseudo-differential operator, as defined in (4).

As was mentioned in §2, this will give us intuition for the large time behaviour of the equation in several cases, even when \(\sigma (x)\) is not constant.

4.1 Fourier Analysis and the Spectral Gap

One natural way to understand the large time behaviour of (21) relies on a simple Fourier analysis together with a hypocoercivity technique that was developed by Arnold and Erb in [6]. We begin with the former, and focus on the latter from the next subsection onwards.

Using the Fourier transform on the torus (i.e. in the spatial variable), we see that (21) is equivalent to infinitely many decoupled ODE systems:

$$\begin{aligned} \frac{d}{dt} \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix} = - \begin{pmatrix} 0&{} ik\\ ik&{} \sigma \end{pmatrix}\begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix}:=-\mathbf{C}_k \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix} ,\qquad k\in {\mathbb {Z}}. \end{aligned}$$
(22)

The eigenvalues of the matrices \(\mathbf{C}_k\in {\mathbb {C}}^{2\times 2}\) are given by

$$\begin{aligned} \lambda _{\pm ,k}:= \frac{\sigma }{2} \pm \sqrt{\frac{\sigma ^2}{4}-k^2},\quad k\in {\mathbb {Z}}, \end{aligned}$$

and as such:

  • Invariant space: For \(k=0\) we find that \(\lambda _{-,0}=0\) and \(\lambda _{+,0}=\sigma \). In fact, as

    $$\begin{aligned} \mathbf{C}_0=\begin{pmatrix} 0&{} 0\\ 0&{} \sigma \end{pmatrix} \end{aligned}$$
    (23)

    we can conclude immediately that \({\widehat{u}}(0,t)=\widehat{u_0}(0)\) and \({\widehat{v}}(0,t)=\widehat{v_0}(0)e^{-\sigma t}\), corresponding to the mass conservation of the original equation and the rapid decay of the difference between the masses of \(f_-\) and \(f_+\).

  • Case I: For \(0<|k|< \frac{\sigma }{2}\) one finds two real eigenvalues, whose minimum is

    $$\begin{aligned} \lambda _{-,k}=\frac{\sigma }{2} - \sqrt{\frac{\sigma ^2}{4}-k^2}=\frac{2k^2}{\sigma + \sqrt{\sigma ^2-4k^2}}\ , \end{aligned}$$

    i.e. the large time behaviour of \({\widehat{u}}(k)\) and \({\widehat{v}}(k)\) is controlled by \(e^{-\left( \frac{\sigma }{2} - \sqrt{\frac{\sigma ^2}{4}-k^2} \right) t}\).

  • Case II: For \(0<|k|=\frac{\sigma }{2}\in {\mathbb {N}}\) the two eigenvalues coincide and are equal to \(\frac{\sigma }{2}\). Moreover, that eigenvalue is defective (i.e. corresponds to a Jordan block of size 2) and the large time behaviour of \({\widehat{u}}(k)\) and \({\widehat{v}}(k)\) is controlled by \(\left( 1+t \right) e^{-\frac{\sigma }{2}t}\).

  • Case III: For \(|k|>\frac{\sigma }{2} \), one finds two complex conjugate eigenvalues, whose real part equals \(\frac{\sigma }{2}\). Thus the large time behaviour of \({\widehat{u}}(k)\) and \({\widehat{v}}(k)\) is controlled by \(e^{-\frac{\sigma }{2}t}\).

Fig. 2

The eigenvalues \(\lambda _{\pm ,k}\) of \(\mathbf{C}_k\), \(|k|\in {\mathbb {N}}\), for \(\sigma =5\). The spectral gap is \(\mu =(5-\sqrt{21})/2\).

From the observations above, we notice that as long as we subtract \({\widehat{u}}(0)\), i.e. as long as we remove the initial total mass from the original solution, all the modes converge exponentially to zero. Their rates have a sharp and uniform-in-k lower bound that depends on \(\sigma \). This spectral gap of (21) will be denoted by \(\mu \left( \sigma \right) \).

Case I, i.e. \(0<|k|< \frac{\sigma }{2} \), is the most “difficult” case, as the real part of the eigenvalues depends on k. However, one notices that the lower eigenvalue, \(\lambda _{-,k}\), increases with |k|, which implies that, if there are modes k with \(0<|k|< \frac{\sigma }{2} \), the slowest possible convergence is given by \(\lambda _{-,\pm 1}\). As we need to compare the decay rates of all modes simultaneously, it is enough to consider the following possibilities:

  • \(0<\sigma <2\): We only have possibilities of Case III, implying that all modes are controlled by \(e^{-\frac{\sigma }{2}t}\).

  • \(\sigma =2\): We have possibilities of Case III, as well as defectiveness in \(k=\pm 1\) (Case II). This means that the modes are controlled by \(\left( 1+t \right) e^{-t}\). If one searches for a pure exponential control, the best rate one would find is \(e^{-\left( 1-\epsilon \right) t}\) for any given fixed \(\epsilon >0\).

  • \(\sigma >2\): We have possibilities from Cases I and III, and potentially Case II. All the modes that correspond to Case I are controlled by \(e^{-\left( \frac{\sigma }{2} - \sqrt{\frac{\sigma ^2}{4}-1} \right) t}\), while those that correspond to Case III are controlled by \(e^{-\frac{\sigma }{2}t}\). If Case II is realised, i.e. \(\frac{\sigma }{2}\in {\mathbb {N}}\setminus \left\{ 1\right\} \), the modes \(k=\pm \frac{\sigma }{2}\) are controlled by \(\left( 1+t \right) e^{-\frac{\sigma }{2}t}\). In total, all the modes are controlled by \(e^{-\left( \frac{\sigma }{2} - \sqrt{\frac{\sigma ^2}{4}-1} \right) t}\), a decay rate that is realised by the \(k=\pm 1\) modes; the coefficient in the exponent is the spectral gap of the Goldstein-Taylor system (21).

An illustration of the eigenvalues of the matrices \(\mathbf{C}_k\) for \(\left|k\right|\in {\mathbb {N}}\) and \(\sigma =5\) can be viewed in Fig. 2.
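The modal case analysis above is easy to reproduce numerically. The following Python sketch (our own illustration) assembles the matrices \(\mathbf{C}_k\) of (22) for \(\sigma =5\), verifies that their eigenvalues solve \(\lambda ^2-\sigma \lambda +k^2=0\), and recovers the spectral gap \((5-\sqrt{21})/2\) shown in Fig. 2.

```python
import numpy as np

sigma = 5.0
min_real_parts = []
for k in range(1, 50):
    Ck = np.array([[0.0, 1j * k], [1j * k, sigma]])     # modal matrix from (22)
    lam = np.linalg.eigvals(Ck)
    # sanity check: trace = sigma, determinant = k^2
    assert np.isclose(lam.sum(), sigma) and np.isclose(lam.prod(), k**2)
    min_real_parts.append(lam.real.min())

gap = min(min_real_parts)
print(np.isclose(gap, (5.0 - np.sqrt(21.0)) / 2))       # True: attained at |k| = 1
```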

Before we properly consider these cases and “uncover” our spatial entropy, we remind the reader of the hypocoercivity technique which will allow us to transform the spectral information of \(\mathbf{C}_k\) into an appropriate twisted norm with which we will show the desired decay of the k-th mode.

4.2 Hypocoercivity and Modal Lyapunov Functionals

In the previous subsection we have concluded that, barring the zero mode, all the Fourier modes of (22) decay exponentially (excluding potentially those with \(|k|=\frac{\sigma }{2}\), where a polynomial correction is required). The lack of positive definiteness of the governing matrix, \(\mathbf{C}_k\), prevents us from seeing this behaviour in the Euclidean norm on \({\mathbb {C}}^2\). However, by modifying the norm with the help of another, closely related, positive definite matrix \({\mathbf {P}}_k\), one can construct a new Lyapunov functional, which is equivalent to the Euclidean norm, that decays with the expected exponential rate (at least for a non-defective \(\mathbf{C}_k\)).

This is exactly the idea that motivated Arnold and Erb; it is expressed in the following theorem (see [6, 1, Lemma 2]):

Theorem 2

Let the matrix \(\mathbf{C}\in {\mathbb {C}}^{n\times n}\) be positive stable (i.e. have only eigenvalues with positive real parts). Let

$$\begin{aligned} \mu =\min \left\{ {\text {Re}}\lambda \;|\;\lambda \text { is an eigenvalue of }\mathbf{C}\right\} . \end{aligned}$$

Then:

  1. i)

    If all eigenvalues with real part equal to \(\mu \) are non-defective, there exists a Hermitian, positive definite matrix \({\mathbf {P}}\) such that

    $$\begin{aligned} \mathbf{C}^*{\mathbf {P}}+{\mathbf {P}}\mathbf{C}\ge 2\mu {\mathbf {P}}. \end{aligned}$$
    (24)
  2. ii)

    If at least one eigenvalue with real part equal to \(\mu \) is defective, then for any \(\epsilon >0\), one can find a Hermitian, positive definite matrix \({\mathbf {P}}_{\epsilon }\) such that

    $$\begin{aligned} \mathbf{C}^*{\mathbf {P}}_{\epsilon }+{\mathbf {P}}_{\epsilon }\mathbf{C}\ge 2\left( \mu -\epsilon \right) {\mathbf {P}}_{\epsilon }\ , \end{aligned}$$
    (25)

    where \(\mathbf{C}^*\) denotes the Hermitian transpose of \(\mathbf{C}\).

We remark that the matrices \({\mathbf {P}}\) and \({\mathbf {P}}_\epsilon \) are never unique.

One can utilise the theorem in the following way: Assuming the eigenvalues associated to \(\mathbf{C}\)’s spectral gap, \(\mu \), are non-defective, then by defining the norm

$$\begin{aligned} \left\Vert y\right\Vert ^2_{{\mathbf {P}}}:=\left\langle y,{\mathbf {P}}y\right\rangle =y^*{\mathbf {P}}y, \end{aligned}$$

one sees that, if y(t) solves the ODE \({\dot{y}}=-\mathbf{C}y\), then

$$\begin{aligned} \frac{d}{dt}\left\Vert y\right\Vert ^2_{{\mathbf {P}}}=-\left\langle y, \left( \mathbf{C}^*{\mathbf {P}}+{\mathbf {P}}\mathbf{C} \right) y\right\rangle \le -2 \mu \left\Vert y\right\Vert ^2_{{\mathbf {P}}}, \end{aligned}$$
(26)

resulting in the correct decay rate. The same approach works in the second case of Theorem 2.
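The decay mechanism (26) can be illustrated numerically. The sketch below assumes the modal matrix has the form \(\mathbf{C}_1=\big ({\begin{matrix} 0 &{} i\\ i &{} \sigma \end{matrix}}\big )\) (consistent with the matrices \({\mathbf {P}}_k\) given further below) and checks that \(e^{2\mu t}\left\Vert y(t)\right\Vert ^2_{{\mathbf {P}}}\) is non-increasing along the flow:

```python
import numpy as np

sigma, k = 5.0, 1
C = np.array([[0.0, 1j * k], [1j * k, sigma]])   # assumed form of the modal matrix
P = np.array([[1.0, -2j * k / sigma],            # the matrix P_1^(I), cf. (28) below
              [2j * k / sigma, 1.0]])
mu = sigma / 2 - np.sqrt(sigma**2 / 4 - k**2)    # spectral gap of C

lam, V = np.linalg.eig(C)                        # C is non-defective here
y0 = np.array([1.0 + 0.5j, -0.3 + 1.0j])         # an arbitrary initial datum

def p_norm_sq(y):
    return (y.conj() @ P @ y).real               # ||y||_P^2 = <y, Py>

# exact solution y(t) = V exp(-Lambda t) V^{-1} y0 of y' = -Cy; by (26) the
# rescaled quantity exp(2 mu t) * ||y(t)||_P^2 must be non-increasing in t
vals = [np.exp(2 * mu * t) * p_norm_sq(V @ (np.exp(-lam * t) * np.linalg.solve(V, y0)))
        for t in np.linspace(0.0, 5.0, 51)]
```

In the plain Euclidean norm the same quantity is not monotone; the twist by \({\mathbf {P}}\) is precisely what restores monotonicity at the sharp rate.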

Besides the general idea of this methodology, Arnold and Erb have given a recipe (one that was later extended in [7] to defective cases, using a time dependent matrix \({\mathbf {P}}\)) for finding the matrices \({\mathbf {P}}\) and \({\mathbf {P}}_{\epsilon }\):

Assuming that \(\mathbf{C}\) is diagonalisable, and letting \(\left\{ \omega _i\right\} _{i=1,\dots ,n}\) be the eigenvectors of \(\mathbf{C}^*\), the matrix \({\mathbf {P}}>0\) can be chosen to be

$$\begin{aligned} {\mathbf {P}}=\sum _{i=1}^n b_i \omega _i \otimes \omega _i^*, \end{aligned}$$
(27)

for any positive sequence \(\left\{ b_i\right\} _{i=1,\dots , n}\). The above formula remains true, for a particular choice of \(\left\{ b_i\right\} _{i=1,\dots , n}\), in the case where \(\mathbf{C}\) is not diagonalisable. In that case we also need to augment the eigenvectors with the generalised eigenvectors. We refer the interested reader to Lemma 4.3 in [6]. Moreover, for \(n=2\), the case we shall need below, and \(\mathbf{C}\) non-defective, all matrices \({\mathbf {P}}\) satisfying (24) are indeed of the form (27), see [3, Lemma 3.1].
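The recipe (27) is straightforward to test numerically. The sketch below (again assuming the modal matrix \(\mathbf{C}_k=\big ({\begin{matrix} 0 &{} ik\\ ik &{} \sigma \end{matrix}}\big )\), with sample values \(\sigma =5\), \(k=2\)) builds \({\mathbf {P}}\) from the eigenvectors of \(\mathbf{C}^*\) with arbitrary positive weights and verifies (24):

```python
import numpy as np

sigma, k = 5.0, 2                                 # a non-defective Case I mode
C = np.array([[0.0, 1j * k], [1j * k, sigma]])    # assumed form of the modal matrix
mu = min(np.linalg.eigvals(C).real)               # here the eigenvalues are 1 and 4

lam, W = np.linalg.eig(C.conj().T)                # omega_i = eigenvectors of C^*
b = [0.7, 1.9]                                    # an arbitrary positive sequence
P = sum(b[i] * np.outer(W[:, i], W[:, i].conj()) for i in range(2))

M = C.conj().T @ P + P @ C - 2 * mu * P           # left-hand side of (24) minus 2*mu*P
```

Any other positive choice of the weights `b` passes the same check, illustrating the non-uniqueness of \({\mathbf {P}}\) remarked above.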

We now turn our attention back to the Fourier transformed Goldstein-Taylor system (22) and determine the modal Lyapunov functionals using the above recipe. A short computation, where the weights \(b_1,\,b_2\) are chosen such that both diagonal elements of \({\mathbf {P}}\) are 1, yields the following matrices (for Case III we also require \(b_1=b_2\), as this minimises the number of the resulting admissible matrices \({\mathbf {P}}_k\) satisfying (24)):

  • Case I: \(0<|k|< \frac{\sigma }{2} \). In this case we have:

    $$\begin{aligned} {\mathbf {P}}_k^{(I)}:= \begin{pmatrix} 1&{} -\frac{2ki}{\sigma }\\ \frac{2ki}{\sigma }&{} 1 \end{pmatrix}, \end{aligned}$$
    (28)
  • Case II: \(|k|= \frac{\sigma }{2}\in {\mathbb {N}}\). As this case features defective eigenvalues, we will only consider the case \(\sigma =2\) (as was mentioned beforehand), and state the matrix corresponding to \(k=\pm 1\) and a given fixed \(\epsilon >0\):

    $$\begin{aligned} {\mathbf {P}}^{(II)}_{\epsilon ,\pm 1}:= \begin{pmatrix} 1 &{} \mp \frac{i(2-\epsilon ^2)}{2+\epsilon ^2}\\ \pm \frac{i(2-\epsilon ^2)}{2+\epsilon ^2} &{} 1 \end{pmatrix} \end{aligned}$$
    (29)
  • Case III: \(|k|> \frac{\sigma }{2} \). In this case we have:

    $$\begin{aligned} {\mathbf {P}}_k^{(III)}:= \begin{pmatrix} 1&{} -\frac{i\sigma }{2k}\\ \frac{i\sigma }{2k}&{} 1 \end{pmatrix} \end{aligned}$$
    (30)

For each mode \(k\ne 0\), its modal Lyapunov functional will be given by \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_k}^2\), where the matrix \({\mathbf {P}}_k\) is chosen according to the above three cases. In Case II, the parameter \(\epsilon >0\) can be chosen arbitrarily small.
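As a sanity check, each of the three case matrices satisfies its matrix inequality ((24), resp. (25)) with the rate identified in §4.1. A minimal numerical verification, under the same assumption \(\mathbf{C}_k=\big ({\begin{matrix} 0 &{} ik\\ ik &{} \sigma \end{matrix}}\big )\) and with sample values of \(\sigma \), k and \(\epsilon \):

```python
import numpy as np

def min_gap(P, k, sigma, mu):
    # smallest eigenvalue of C_k^* P + P C_k - 2 mu P (should be >= 0),
    # assuming the modal matrix C_k = [[0, ik], [ik, sigma]]
    C = np.array([[0.0, 1j * k], [1j * k, sigma]])
    return np.linalg.eigvalsh(C.conj().T @ P + P @ C - 2 * mu * P).min()

# Case I: sigma = 5, k = 1, mu = lambda_{-,1}, matrix (28)
P1 = np.array([[1.0, -2j / 5], [2j / 5, 1.0]])
g1 = min_gap(P1, 1, 5.0, 5 / 2 - np.sqrt(5**2 / 4 - 1))

# Case II: sigma = 2, k = 1, mu = 1 - eps, matrix (29), inequality (25)
eps = 0.1
c = (2 - eps**2) / (2 + eps**2)
P2 = np.array([[1.0, -1j * c], [1j * c, 1.0]])
g2 = min_gap(P2, 1, 2.0, 1 - eps)

# Case III: sigma = 5, k = 3, mu = sigma/2, matrix (30)
P3 = np.array([[1.0, -5j / 6], [5j / 6, 1.0]])
g3 = min_gap(P3, 3, 5.0, 5 / 2)
```

In Case III the inequality is in fact an equality, \(\mathbf{C}_k^*{\mathbf {P}}_k^{(III)}+{\mathbf {P}}_k^{(III)}\mathbf{C}_k=\sigma {\mathbf {P}}_k^{(III)}\), which the check below confirms.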

4.3 Derivation of the Spatial Entropy \({E_\theta (u,v)}\)

The goal of this subsection is twofold: finding a modal entropy for our system, and translating it into a spatial, mode-independent entropy.

To begin with, we shall define a modal entropy to quantify the exponential decay of solutions to (22) towards its steady state, which is given by:

$$\begin{aligned} \widehat{u_\infty }(k) = \left\{ \begin{array}{ll} \widehat{u_0}(k=0)=(u_0)_{\mathrm {avg}}, &{} \, k=0 \\ 0, &{} \, k\ne 0 \\ \end{array} \right. \ ;\qquad \quad \widehat{v_\infty }(k)=0\ ,\, k\in {\mathbb {Z}}\ . \end{aligned}$$
(31)

Since the matrix \(\mathbf{C}_0\) from (23) has no spectral gap, the mode \(k=0\) plays a special role, and hence will be treated separately.

Once found, we will want to relate that modal-based entropy to the spatial entropy \(E_\theta \) from Definition 1, which is not based on a modal decomposition. To this end we already remark that the off-diagonal factors ik in (28) and 1/ik in (30) correspond in physical space, roughly speaking, to a first derivative and an anti-derivative, respectively.

As in §4.1 we shall distinguish three cases of \(\sigma \):

\({\varvec{0<\sigma <2}}:\) All modes \(k\ne 0\) satisfy \(|k|> \frac{\sigma }{2} \), and hence are of Case III. We recall from §4.1 that all modes decay here with the sharp rate \(\frac{\sigma }{2}\). For a modal entropy to reflect this decay, we hence have to use for each mode a Lyapunov functional \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_k}^2\), where \({\mathbf {P}}_k\) satisfies the inequality (24) with \(\mu =\frac{\sigma }{2}\). \({\mathbf {P}}_k={\mathbf {P}}_k^{(III)}\) is the most convenient choice.

We define the modal entropy for any \(\left\{ {\widehat{u}}(k),{\widehat{v}}(k)\right\} _{k\in {\mathbb {Z}}}\) such that \({\widehat{u}}(0)=0\) as

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right):= & {} \sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} }\left\Vert \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix}\right\Vert ^2_{{\mathbf {P}}_k^{(III)}}+\left\Vert \begin{pmatrix} {\widehat{u}}(0)\\ {\widehat{v}}(0) \end{pmatrix}\right\Vert ^2 \end{aligned}$$
(32)
$$\begin{aligned}= & {} \sum _{k\in {\mathbb {Z}}}\left( \left|{\widehat{u}}(k)\right|^2 -\sigma {\text {Re}}\left( \frac{{\widehat{u}}(k)}{ik}\overline{{\widehat{v}}(k)} \right) +\left|{\widehat{v}}(k)\right|^2 \right) \ , \end{aligned}$$
(33)

where we used the convention \(\frac{{\widehat{u}}(0)}{0}=0\). The mode \(k=0\) was included since \({\widehat{u}}(0,t)={\widehat{u}}(0)=0\) and \({\widehat{v}}(0,t)={\widehat{v}}(0)e^{-\sigma t}\). Using Plancherel’s equality, and (iv) from Lemma 2, we find that

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) =E_\sigma \left( u,v \right) , \end{aligned}$$
(34)

which shows why we consider the spatial entropy functional from Definition 1 in this case.

We note that, since \(u_{\mathrm {avg}}(t)\) is conserved, part (iv) of Lemma 2 explains why we have chosen to use the anti-derivative of u, and not of v.

\({\varvec{\sigma >2:}}\) This situation is more complicated than the previous one, as we have a mixture of at least two of the aforementioned three cases: finitely many modes k with \(0<|k|< \frac{\sigma }{2} \) (i.e. Case I), two modes in Case II if \(\frac{\sigma }{2}\in {\mathbb {N}}\), while the rest satisfy \(|k|> \frac{\sigma }{2} \) (i.e. Case III). Following the above methodology to construct the modal entropy, we would need to use a combination of \({\mathbf {P}}_k^{(I)}\) and \({\mathbf {P}}_k^{(III)}\), given by (28) and (30), and potentially a matrix for the defective modes. This is feasible on the modal level, but does not easily translate back to the spatial variables. It would yield a complicated pseudo-differential operator “inside” the spatial entropy.

Recalling the discussion from §4.1 we see that the overall decay rate, \(\mu =\frac{\sigma }{2}-\sqrt{\frac{\sigma ^2}{4}-1}\), is determined solely by the modes \(k=\pm 1\). Since all the other modes decay faster, we are not obliged to use “optimal” modal Lyapunov functionals for these higher modes. This gives some leeway for choosing the matrices \({\mathbf {P}}_k\), \(|k|>1\). Moreover, using these “optimal” functionals would worsen (i.e. enlarge) the multiplicative constant in the hypocoercive \(L^2\) estimate (7). For these reasons we will use the matrix

$$\begin{aligned} {\mathbf {P}}_k^{\mathrm {suff}}:={\mathbf {P}}_{k}^{(III)}\left( \sigma \rightarrow \frac{4}{\sigma } \right) =\begin{pmatrix} 1&{} -\frac{2i}{k\sigma }\\ \frac{2i}{k\sigma }&{} 1 \end{pmatrix} >0\ \end{aligned}$$
(35)

when \(k\ne 0\), which satisfies \({\mathbf {P}}_{\pm 1}^{\mathrm {suff}}={\mathbf {P}}_{\pm 1}^{(I)}\) for the crucial lowest modes. It also satisfies the following result, which implies exponential decay of all modal Lyapunov functionals \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_k^{\mathrm {suff}}}^2\), \(k\ne 0\) with rate \(2\mu =\sigma -\sqrt{\sigma ^2-4}\).

Lemma 4

Let \(\sigma >2\). Then

$$\begin{aligned} \mathbf{C}^*_k {\mathbf {P}}^{\mathrm {suff}}_k+{\mathbf {P}}^{\mathrm {suff}}_k\mathbf{C}_k-2\mu {\mathbf {P}}^{\mathrm {suff}}_k\ge 0 \qquad \forall \,k\ne 0\ . \end{aligned}$$
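A minimal numerical check of this inequality (assuming, as before, the modal matrix \(\mathbf{C}_k=\big ({\begin{matrix} 0 &{} ik\\ ik &{} \sigma \end{matrix}}\big )\), with the sample value \(\sigma =3.5\)):

```python
import numpy as np

sigma = 3.5                                        # any sample value sigma > 2
mu = sigma / 2 - np.sqrt(sigma**2 / 4 - 1)         # 2*mu = sigma - sqrt(sigma^2 - 4)

def min_gap(k):
    C = np.array([[0.0, 1j * k], [1j * k, sigma]])                     # assumed C_k
    P = np.array([[1.0, -2j / (k * sigma)], [2j / (k * sigma), 1.0]])  # (35)
    return np.linalg.eigvalsh(C.conj().T @ P + P @ C - 2 * mu * P).min()

gaps = {k: min_gap(k) for k in range(1, 41)}       # min_gap(-k) = min_gap(k)
```

The inequality is saturated precisely at the crucial modes \(k=\pm 1\), in line with the overall rate being attained there.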

The proof of this lemma is straightforward (see Footnote 3). Proceeding as in (32), we define the modal entropy for any \(\left\{ {\widehat{u}}(k),{\widehat{v}}(k)\right\} _{k\in {\mathbb {Z}}}\) such that \({\widehat{u}}(0)=0\):

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) :=\sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} }\left\Vert \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix}\right\Vert ^2_{{\mathbf {P}}_k^{\mathrm {suff}}}+\left\Vert \begin{pmatrix} {\widehat{u}}(0)\\ {\widehat{v}}(0) \end{pmatrix}\right\Vert ^2\ . \end{aligned}$$

Due to (34) and (35) it is related to the spatial entropy functional from Definition 1 as

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) =E_{\frac{4}{\sigma }}\left( u,v \right) . \end{aligned}$$

\({\varvec{\sigma =2:}}\) Just like in the previous case, the lowest frequency modes \(k=\pm 1\) control the large time behaviour. However, the matrices \(\mathbf{C}_{\pm 1}\) are now defective, which leads to a (purely) exponential decay rate reduced by \(\epsilon \).

We proceed similarly to the case \(\sigma >2\) and define for some \(\epsilon >0\):

$$\begin{aligned} {\mathbf {P}}_{\epsilon ,k}^{\mathrm {suff}}= {\mathbf {P}}^{(III)}_{k}\left( \sigma \rightarrow \frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2} \right) =\begin{pmatrix} 1&{} -\frac{i\left( 2-\epsilon ^2 \right) }{k\left( 2+\epsilon ^2 \right) }\\ \frac{i\left( 2-\epsilon ^2 \right) }{k\left( 2+\epsilon ^2 \right) }&{} 1 \end{pmatrix}>0\ , \end{aligned}$$
(36)

which satisfies \({\mathbf {P}}_{\epsilon ,\pm 1}^{\mathrm {suff}}={\mathbf {P}}^{(II)}_{\epsilon ,\pm 1}\) for the crucial lowest modes. It also satisfies the following result, which implies exponential decay of all modal Lyapunov functionals \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_{\epsilon ,k}^{\mathrm {suff}}}^2\), \(k\ne 0\) with rate of at least \(2\mu =2(1-\epsilon )\).

Lemma 5

Let \(\sigma =2\). Then

$$\begin{aligned} \mathbf{C}^*_k {\mathbf {P}}^{\mathrm {suff}}_{\epsilon ,k}+{\mathbf {P}}^{\mathrm {suff}}_{\epsilon ,k}\mathbf{C}_k-2\mu {\mathbf {P}}^{\mathrm {suff}}_{\epsilon ,k}>0 \qquad \forall \,k\ne 0\ . \end{aligned}$$
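As with Lemma 4, this strict inequality can be checked numerically (same assumed form of \(\mathbf{C}_k\), sample value \(\epsilon =0.05\)):

```python
import numpy as np

eps = 0.05                                         # a sample fixed epsilon
mu = 1 - eps                                       # sigma = 2
c = (2 - eps**2) / (2 + eps**2)

def min_gap(k):
    C = np.array([[0.0, 1j * k], [1j * k, 2.0]])                 # assumed C_k
    P = np.array([[1.0, -1j * c / k], [1j * c / k, 1.0]])        # (36)
    return np.linalg.eigvalsh(C.conj().T @ P + P @ C - 2 * mu * P).min()

gaps = [min_gap(k) for k in range(1, 41)]          # min_gap(-k) = min_gap(k)
```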

Proceeding as in (32), we define the modal entropy for any \(\left\{ {\widehat{u}}(k),{\widehat{v}}(k)\right\} _{k\in {\mathbb {Z}}}\) such that \({\widehat{u}}(0)=0\):

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) :=\sum _{k\in {\mathbb {Z}}\setminus \left\{ 0\right\} }\left\Vert \begin{pmatrix} {\widehat{u}}(k)\\ {\widehat{v}}(k) \end{pmatrix}\right\Vert ^2_{{\mathbf {P}}_{\epsilon ,k}^{\mathrm {suff}}}+\left\Vert \begin{pmatrix} {\widehat{u}}(0)\\ {\widehat{v}}(0) \end{pmatrix}\right\Vert ^2\ . \end{aligned}$$

Due to (34) and (36) it is related to the spatial entropy functional from Definition 1 as

$$\begin{aligned} {\mathscr {E}}\left( {\widehat{u}},{\widehat{v}} \right) =E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u,v \right) . \end{aligned}$$

4.4 The Evolution of the Spatial Entropy \(E_\theta \)

In the previous subsection we have shown how, depending on the value of \(\sigma \), the entropies \(E_{\sigma }\), \(E_{\frac{4}{\sigma }}\) and \(E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\) are the correct candidates to show the exponential convergence to equilibrium. A closer look at (26) shows that each modal Lyapunov functional \(\big \Vert \left( {\begin{array}{c}{\hat{u}} (k,t)\\ {\hat{v}} (k,t)\end{array}}\right) \big \Vert _{{\mathbf {P}}_k}^2\) decays exponentially, and hence so does the spatial entropy \(E_\theta \). Recalling the decay rates presented in §4.3 for the three regimes of \(\sigma \) confirms that we have actually already proved most of part (a) of Theorem 1. However, as our main goal is to consider these functionals in the spatial variable alone (i.e. without a modal decomposition), we shall show how one obtains the correct convergence result by a direct calculation. This will also serve as a preparation for §5.

Theorem 3

Under the same conditions as in Theorem 1 with \(\sigma (x)=\sigma \), one has that

  1. i)

    If \(0<\sigma <2\) then

    $$\begin{aligned} E_{\sigma }\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\sigma }\left( u_{0}-u_{\mathrm {avg}},v_0 \right) e^{- \sigma t}. \end{aligned}$$
  2. ii)

    If \(\sigma >2\) then

    $$\begin{aligned} E_{\frac{4}{\sigma }}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\frac{4}{\sigma }}\left( u_{0}-u_{\mathrm {avg}},v_0 \right) e^{- \left( \sigma -\sqrt{\sigma ^2-4} \right) t}. \end{aligned}$$
  3. iii)

    If \(\sigma =2\) then for any \(0<\epsilon <1\)

    $$\begin{aligned} E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u_{0}-u_{\mathrm {avg}},v_0 \right) e^{- 2(1-\epsilon ) t}. \end{aligned}$$

Proof

In order to prove this theorem we shall obtain differential inequalities for \(E_\theta \), from which we will conclude the desired result by a simple application of Gronwall’s inequality. Using Proposition 1 we find that:

\({\text {If }0<\sigma <2:}\)

$$\begin{aligned} \frac{d}{dt}E_{\sigma }\left( u(t)-u_{\mathrm {avg}},v(t) \right)= & {} -\sigma \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 - \sigma \left\Vert v(t)\right\Vert ^2\\&+\frac{\sigma ^2}{2\pi }\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx-\sigma \left( v(t)_{\mathrm {avg}} \right) ^2 \\= & {} -\sigma E_{\sigma }\left( u(t)-u_{\mathrm {avg}},v(t) \right) -\sigma \left( v(t)_{\mathrm {avg}} \right) ^2 \le \\&-\sigma E_{\sigma }\left( u(t)-u_{\mathrm {avg}},v(t) \right) . \end{aligned}$$

Note that, since \(v_{\mathrm {avg}}(t)=(v_{0})_{\mathrm {avg}}\,e^{-\sigma t}\), we can compute \(E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right) \) explicitly.

\({\text {If }\sigma >2:}\)

$$\begin{aligned} \frac{d}{dt}E_{\frac{4}{\sigma }}\left( u(t)-u_{\mathrm {avg}},v(t) \right)= & {} -\frac{4}{\sigma } \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 -\left( 2\sigma -\frac{4}{\sigma } \right) \left\Vert v(t)\right\Vert ^2\\&+\frac{4}{2\pi }\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx-\frac{4}{ \sigma } \left( v(t)_{\mathrm {avg}} \right) ^2\\\le & {} -\left( \sigma -\sqrt{\sigma ^2-4} \right) E_{\frac{4}{\sigma }}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \\&+\left( \sigma -\sqrt{\sigma ^2-4}-\frac{4}{\sigma } \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2\\&+\left( \frac{4}{\sigma }-\sigma -\sqrt{\sigma ^2-4} \right) \left\Vert v(t)\right\Vert ^2\\&+\frac{4}{2\pi }\left( 1-\frac{\sigma -\sqrt{\sigma ^2-4}}{\sigma } \right) \int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx. \end{aligned}$$

The desired inequality, \(\frac{d}{dt}E_{\frac{4}{\sigma }}\le -\big (\sigma -\sqrt{\sigma ^2-4}\,\big ) E_{\frac{4}{\sigma }}\), is valid if and only if

$$\begin{aligned} \begin{aligned}&\frac{4}{2\pi }\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx \\&\quad \le \left( \sigma -\sqrt{\sigma ^2-4} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2+\left( \sigma +\sqrt{\sigma ^2-4} \right) \left\Vert v(t)\right\Vert ^2. \end{aligned} \end{aligned}$$
(37)

The Cauchy-Schwarz inequality, together with the Poincaré inequality (Lemma 1) and Lemma 2, implies that

$$\begin{aligned}&\frac{4}{2\pi }\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx\\&\quad \le 4\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert \left\Vert v(t)\right\Vert \\&\quad =2\left( \sqrt{\sigma -\sqrt{\sigma ^2-4}}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert \right) \left( \sqrt{\sigma +\sqrt{\sigma ^2-4}}\left\Vert v(t)\right\Vert \right) \ . \end{aligned}$$

Together with the fact that \(2\left|ab\right|\le a^2+b^2\) this shows (37), concluding the proof in this case.

\({\text {If }\sigma =2:}\)

$$\begin{aligned} \begin{aligned}&\frac{d}{dt}E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}} \left( u(t)-u_{\mathrm {avg}},v(t) \right) \\&\quad =- \frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 -\frac{2\left( 2+3\epsilon ^2 \right) }{2+\epsilon ^2}\left\Vert v(t)\right\Vert ^2\\&\qquad +\frac{1}{2\pi }\cdot \frac{4\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx-\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2} \left( v(t)_{\mathrm {avg}} \right) ^2\\&\quad \le -2\left( 1-\epsilon \right) E_{\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}}\left( u(t)-u_{\mathrm {avg}},v(t) \right) -2\epsilon \left( 1-\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2\\&\qquad -2\epsilon \left( 1+\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert v(t)\right\Vert ^2+\frac{1}{2\pi }\cdot \frac{4\epsilon \left( 2-\epsilon ^2 \right) }{2+\epsilon ^2}\int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx. \end{aligned} \end{aligned}$$

Like before, the desired inequality will follow if

$$\begin{aligned}&\frac{1}{2\pi }\cdot \frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2} \int _0^{2\pi } \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx \\&\quad \le \left( 1-\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 +\left( 1+\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert v(t)\right\Vert ^2. \end{aligned}$$

This is valid since

$$\begin{aligned}&\frac{1}{2\pi }\cdot \frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2} \int _0^{2\pi }\partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t)dx \\&\quad \le \frac{2\sqrt{4+\epsilon ^4}}{2+\epsilon ^2}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert \left\Vert v(t)\right\Vert \\&\quad = 2 \left( \sqrt{1-\frac{2\epsilon }{2+\epsilon ^2}}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert \right) \left( \sqrt{1+\frac{2\epsilon }{2+\epsilon ^2}}\left\Vert v(t)\right\Vert \right) \\&\quad \le \left( 1-\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 +\left( 1+\frac{2\epsilon }{2+\epsilon ^2} \right) \left\Vert v(t)\right\Vert ^2, \end{aligned}$$

where we again used the Cauchy-Schwarz inequality, the Poincaré inequality, and Lemma 2.

The proof of the theorem is now complete. \(\square \)

As the last part of this section, we finally prove part (a) of Theorem 1:

Proof of part (a) of Theorem 1

The decay estimates of \(E_{\theta (\sigma )}\) are already shown in Theorem 3. To show (7) and (9) we recall that

$$\begin{aligned} f_+ = \frac{u+v}{2},\quad f_-=\frac{u-v}{2}\ , \end{aligned}$$

and

$$\begin{aligned} \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2\le \frac{2}{2-\theta }E_{\theta }\left( f,g \right) ,\quad E_{\theta }\left( f,g \right) \le \frac{2+\theta }{2}\left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) \end{aligned}$$

for \(0<\theta <2\) and \(f_{\mathrm {avg}}=0\), according to Lemma 3. Thus, using the definition of \(f_\infty \) from (8) we see that

$$\begin{aligned}&\left\Vert f_+(t)-f_\infty \right\Vert ^2 + \left\Vert f_-(t)-f_\infty \right\Vert ^2 \\&\quad =\frac{1}{2}\left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2+\frac{1}{2}\left\Vert v(t)\right\Vert ^2 \le \frac{1}{2-\theta }E_\theta \left( u(t)-u_{\mathrm {avg}},v(t) \right) \\&\quad \le \frac{1}{2-\theta }E_\theta \left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-2\mu (\sigma )t} \le \frac{1}{2}\cdot \frac{2+\theta }{2-\theta }\left( \left\Vert u_0-u_{\mathrm {avg}}\right\Vert ^2 +\left\Vert v_0\right\Vert ^2 \right) e^{-2\mu (\sigma )t} \\&\quad =\frac{2+\theta }{2-\theta } \left( \left\Vert f_{+,0}-f_\infty \right\Vert ^2+\left\Vert f_{-,0}-f_\infty \right\Vert ^2 \right) e^{-2\mu (\sigma )t}, \end{aligned}$$

which shows the result for the appropriate choices of \(\theta (\sigma )\) and \(\mu (\sigma )\). For \(\sigma =2\) we choose

$$\begin{aligned} \theta (2)=\frac{2\left( 2-\epsilon ^2 \right) }{2+\epsilon ^2},\quad \mu (2)=1-\epsilon \ . \end{aligned}$$

The sharpness of the decay rate for \(\sigma \ne 2\) can be verified easily on the first mode, e.g. for \(u_0=0\), \(v_0=e^{ix}\). \(\square \)
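This sharpness claim is easy to test numerically for the first mode. The sketch below solves the \(k=1\) modal ODE exactly (assuming the modal matrix \(\mathbf{C}_1=\big ({\begin{matrix} 0 &{} i\\ i &{} \sigma \end{matrix}}\big )\), sample value \(\sigma =5\)) with the data \(u_0=0\), \(v_0=e^{ix}\), and measures the observed exponential rate at large times:

```python
import numpy as np

sigma = 5.0                                        # any sample value sigma > 2
mu = sigma / 2 - np.sqrt(sigma**2 / 4 - 1)         # claimed sharp decay rate
C = np.array([[0.0, 1j], [1j, sigma]])             # k = 1 mode (assumed form)
lam, V = np.linalg.eig(C)

def y(t):
    # modal solution for u_0 = 0, v_0 = e^{ix}: (u_hat, v_hat)(1, 0) = (0, 1)
    return V @ (np.exp(-lam * t) * np.linalg.solve(V, np.array([0.0, 1.0])))

# observed exponential rate of ||y(t)|| between t = 40 and t = 41, where the
# fast eigenmode has long since become negligible
rate = -np.log(np.linalg.norm(y(41.0)) / np.linalg.norm(y(40.0)))
```

The measured rate matches \(\mu (\sigma )=\frac{\sigma }{2}-\sqrt{\frac{\sigma ^2}{4}-1}\) (so the squared norms in (7) decay like \(e^{-2\mu t}\)), confirming that the exponent cannot be improved.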

With the constant case fully behind us, we can now focus on the case where \(\sigma (x)\) is a non-constant function.

5 \(x-\)Dependent Relaxation Function

The large time behaviour of solutions to the Goldstein-Taylor equation (1), or equivalently its recast form (3), becomes considerably harder to understand when the relaxation function \(\sigma (x)\) is not constant. However, as shown in §4, we have managed to find a spatial entropy that captures the exact decay behaviour in the constant case. The idea we employ in this section is to use the same type of entropy to estimate the convergence rate even when \(\sigma (x)\) is not constant. This is, as mentioned in the introduction, a perturbative approach; yet the methodology and ideas are robust enough to deal with more complicated systems, as will be shown in the next section.

A fundamental theorem to establish our main result, Theorem 1 (b), is the following:

Theorem 4

Let \(u,v\in C([0,\infty );L^2\left( {\mathbb {T}} \right) )\) be mild solutions to (3) with initial data \(u_0,\; v_0\in L^2\left( {\mathbb {T}} \right) \). Writing \(u_{\mathrm {avg}}=\left( u_0 \right) _{\mathrm {avg}}\), we have that for any given \(0<\alpha ,\theta <2 \) the conditions

$$\begin{aligned} \begin{aligned} \alpha< \theta ,\quad \theta +\alpha < 2\sigma _{\mathrm {min}}\end{aligned} \end{aligned}$$
(38)

and

$$\begin{aligned} \begin{aligned} \sup _{x\in {\mathbb {T}}}\left( \theta ^2\left( \sigma (x)-\alpha \right) ^2-4\left( \theta -\alpha \right) \left( 2\sigma (x)-\theta -\alpha \right) \right) \le 0, \end{aligned} \end{aligned}$$
(39)

imply that

$$\begin{aligned} E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le E_{\theta }\left( u_0-u_{\mathrm {avg}},v_0 \right) e^{-\alpha t},\quad t\ge 0. \end{aligned}$$
(40)

Proof

Using (18) from Proposition 1, and the fact that \(\theta \left( v(t)_{\mathrm {avg}} \right) ^2 \ge 0\), we find that

$$\begin{aligned} \begin{aligned} \frac{d}{dt}E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right) \le&-\alpha E_{\theta }\left( u(t)-u_{\mathrm {avg}},v(t) \right) -\left( \theta -\alpha \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 \\&- \frac{1}{2\pi } \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx\\&+\frac{\theta }{2\pi }\int _0^{2\pi }\left( \sigma (x)-\alpha \right) \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx. \end{aligned} \end{aligned}$$
(41)

The proof of the theorem will follow from the above inequality if we can show that

$$\begin{aligned} \begin{aligned}&\frac{\theta }{2\pi }\int _0^{2\pi }\left( \sigma (x)-\alpha \right) \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx \\&\quad \le \left( \theta -\alpha \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 +\frac{1}{2\pi } \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx. \end{aligned} \end{aligned}$$
(42)

Due to condition (38) we have that

$$\begin{aligned} \inf _{x\in {\mathbb {T}}}\left( 2\sigma (x)-\theta -\alpha \right) =2\sigma _{\mathrm {min}}-\theta -\alpha >0. \end{aligned}$$

Hence, we obtain with Cauchy-Schwarz, Young’s inequality \(\left|ab\right| \le \frac{a^2}{\theta }+\frac{\theta b^2}{4}\), and the Poincaré inequality, (13), that

$$\begin{aligned} \begin{aligned}&\left|\frac{\theta }{2\pi }\int _0^{2\pi }\left( \sigma (x)-\alpha \right) \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) v(x,t) dx \right|\\&\quad \le \frac{\theta }{2\pi }\int _0^{2\pi }\sqrt{2\sigma (x)-\theta -\alpha } \left|v(x,t)\right| \frac{\left|\sigma (x)-\alpha \right|}{\sqrt{2\sigma (x)-\theta -\alpha }}\left|\partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) \right|dx \\&\quad \le \frac{\theta }{2\pi }\left( \int _0^{2\pi }\left( 2\sigma (x)-\theta -\alpha \right) v(x,t)^2dx \right) ^{\frac{1}{2}}\\&\qquad \left( \int _{0}^{2\pi }\frac{\left( \sigma (x) -\alpha \right) ^2}{2\sigma (x)-\theta -\alpha } \left( \partial _x^{-1} \left( u(x,t)-u_{\mathrm {avg}} \right) \right) ^2dx \right) ^{\frac{1}{2}} \\&\quad \le \frac{1}{2\pi } \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx \\&\qquad + \frac{1}{2\pi } \int _0^{2\pi }\frac{\theta ^2\left( \sigma (x) -\alpha \right) ^2}{4\left( 2\sigma (x)-\theta -\alpha \right) }\left( \partial _x^{-1}\left( u(x,t) -u_{\mathrm {avg}} \right) \right) ^2 dx \\&\quad \le \frac{1}{2\pi } \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx + \sup _{x\in {\mathbb {T}}}\left( \frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{4\left( 2\sigma (x) -\theta -\alpha \right) } \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2 . \end{aligned} \end{aligned}$$
(43)

The above implies that (42) will be valid when

$$\begin{aligned} \sup _{x\in {\mathbb {T}}}\frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{4\left( 2\sigma (x)-\theta -\alpha \right) } \le \theta -\alpha , \end{aligned}$$

which is equivalent, due to the positivity of the denominator, to (39). The proof is thus complete. \(\square \)

Remark 3

It is worth noting that the conditions expressed in (38) are crucial to our estimate. Indeed, they guarantee that

$$\begin{aligned} \left( \theta -\alpha \right) \left\Vert u(t)-u_{\mathrm {avg}}\right\Vert ^2\quad \text {and}\quad \int _0^{2\pi }( 2\sigma (x)-\theta -\alpha )v(x,t)^2 dx \end{aligned}$$

are non-negative. If either part of the condition failed, one could construct initial data such that the mixed uv-term in (42) vanishes while the above terms add up to something strictly negative, breaking the functional inequality we aim to attain.

The next step towards proving part (b) in Theorem 1 is to look for \(\theta \) and \(\alpha \) such that conditions (38) and (39) are satisfied.

We recall the definition of \(\theta ^*\) from Theorem 1:

$$\begin{aligned} \theta ^*:=\min \left( \sigma _{\mathrm {min}},\frac{4}{\sigma _{\mathrm {max}}} \right) , \end{aligned}$$

which in a sense captures the “worst possible” behaviour when comparing \(\sigma (x)\) to the constant case (with \(\sigma \not =2\)). We show the following:

Lemma 6

Assume that \(0<\sigma _{\mathrm {min}}<\sigma _{\mathrm {max}}<\infty \), where \(\sigma _{\mathrm {min}}\) and \(\sigma _{\mathrm {max}}\) were defined in Theorem 1. Then

$$\begin{aligned} \alpha ^*:=\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) :={\left\{ \begin{array}{ll} \frac{\sigma _{\mathrm {min}}\left( 4+2\sqrt{4-\sigma _{\mathrm {min}}^2}-\sigma _{\mathrm {min}}\sigma _{\mathrm {max}} \right) }{4+2\sqrt{4-\sigma _{\mathrm {min}}^2} -\sigma _{\mathrm {min}}^2}, &{} \sigma _{\mathrm {min}}< \frac{4}{\sigma _{\mathrm {max}}}\\ \sigma _{\mathrm {max}}- \sqrt{\sigma _{\mathrm {max}}^2-4}, &{} \sigma _{\mathrm {min}}\ge \frac{4}{\sigma _{\mathrm {max}}} \end{array}\right. } \end{aligned}$$

is such that \(\theta ^*\) and \(\alpha ^*\) satisfy conditions (38) and (39).

Proof

Clearly, since

$$\begin{aligned} \theta ^*\le {\left\{ \begin{array}{ll} \sigma _{\mathrm {min}}, &{} \sigma _{\mathrm {min}}<\sigma _{\mathrm {max}}\le 2 \\ \frac{4}{\sigma _{\mathrm {max}}}, &{} \sigma _{\mathrm {max}}>2 \end{array}\right. } \end{aligned}$$

we always have that \(0<\theta ^*<2\).

We continue by considering condition (39), and finding appropriate parameters which will give condition (38) automatically. Denoting by

$$\begin{aligned} f\left( \alpha ,\theta ,y \right) :=\theta ^2\left( y-\alpha \right) ^2-4\left( \theta -\alpha \right) \left( 2y-\theta -\alpha \right) \end{aligned}$$

for \(\left( \alpha ,\theta \right) \) satisfying condition (38) and \(y\in \left[ \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right] \), we find that, for fixed \(\alpha \) and \(\theta \), f is an upward-opening parabola in y whose non-positive part lies between its roots

$$\begin{aligned} y_{\pm }\left( \alpha ,\theta \right) :=\alpha + \frac{2\left( \theta -\alpha \right) }{\theta ^2}\left( 2\pm \sqrt{4-\theta ^2} \right) . \end{aligned}$$

Thus, condition (39) is satisfied if and only if

$$\begin{aligned} y_{-}\left( \alpha ,\theta \right) \le \sigma _{\mathrm {min}},\quad \text {and}\quad \sigma _{\mathrm {max}}\le y_{+}\left( \alpha ,\theta \right) . \end{aligned}$$

A simple calculation shows that for \(0<\theta <2\)

$$\begin{aligned} y_{-}\left( \alpha ,\theta \right)\le & {} \sigma _{\mathrm {min}}\;\;\Leftrightarrow \;\; \alpha \le \frac{\theta \left( 2\sqrt{4-\theta ^2}-\left( 4-\sigma _{\mathrm {min}}\theta \right) \right) }{2\sqrt{4-\theta ^2}-\left( 4-\theta ^2 \right) }=:\gamma _{\mathrm {min}}\left( \theta \right) , \\ \sigma _{\mathrm {max}}\le & {} y_{+}\left( \alpha ,\theta \right) \;\;\Leftrightarrow \;\; \alpha \le \frac{\theta \left( 2\sqrt{4-\theta ^2}+\left( 4-\sigma _{\mathrm {max}}\theta \right) \right) }{2\sqrt{4-\theta ^2}+\left( 4-\theta ^2 \right) }=:\gamma _{\mathrm {max}}\left( \theta \right) . \end{aligned}$$
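These equivalences can be verified numerically: by construction, \(\gamma _{\mathrm {min}}\) and \(\gamma _{\mathrm {max}}\) are exactly the values of \(\alpha \) at which the roots \(y_{\mp }\) hit \(\sigma _{\mathrm {min}}\), resp. \(\sigma _{\mathrm {max}}\). A sketch with sample values of \(\sigma _{\mathrm {min}}\), \(\sigma _{\mathrm {max}}\) and \(\theta \):

```python
import numpy as np

smin, smax, theta = 0.8, 3.0, 0.7                  # sample values, 0 < theta < 2
s = np.sqrt(4 - theta**2)

gamma_min = theta * (2 * s - (4 - smin * theta)) / (2 * s - (4 - theta**2))
gamma_max = theta * (2 * s + (4 - smax * theta)) / (2 * s + (4 - theta**2))

def y_root(alpha, sign):
    # the roots y_{+-}(alpha, theta) of f defined above
    return alpha + 2 * (theta - alpha) / theta**2 * (2 + sign * s)
```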

This means that, if we choose \(\alpha \left( \theta \right) \) for a fixed \(\theta \) so that condition (39) is valid, we must have that

$$\begin{aligned} \alpha \left( \theta \right) \le \min \left( \gamma _{\mathrm {min}}\left( \theta \right) ,\gamma _{\mathrm {max}}\left( \theta \right) \right) . \end{aligned}$$

One can continue and show that (see Appendix B):

  1. (i)

    For \(\theta \le \sigma _{\mathrm {min}}\) and \(0<\theta <2\) we have that \(\gamma _{\mathrm {max}}\left( \theta \right) \le \gamma _{\mathrm {min}}\left( \theta \right) \).

  2. (ii)

    For \(\theta \le \frac{4}{\sigma _{\mathrm {max}}}\) and \(0<\theta <\sigma _{\mathrm {max}}\) we have that \(0<\gamma _{\mathrm {max}}\left( \theta \right) < \theta \).

With these observations we deduce that for any

$$\begin{aligned} \theta \in (0,\theta ^*]= \left( 0,\min \left( \sigma _{\mathrm {min}},\frac{4}{\sigma _{\mathrm {max}}} \right) \right] \cap (0,2) \end{aligned}$$

we have \(\theta <\sigma _{\mathrm {max}}\) and hence

$$\begin{aligned} \gamma _{\mathrm {max}}(\theta )=\min \left( \gamma _{\mathrm {min}}(\theta ), \gamma _{\mathrm {max}}(\theta ) \right) \quad \text {and} \quad \gamma _{\max }\left( \theta \right) <\theta . \end{aligned}$$

Hence, the pair \(\left( \theta ,\alpha =\gamma _{\mathrm {max}}(\theta ) \right) \) satisfies not only condition (39) but also

$$\begin{aligned} \gamma _{\mathrm {max}}(\theta )+\theta< 2\theta \le 2\theta ^*\le 2\sigma _{\mathrm {min}}\quad \text {and}\quad \gamma _{\mathrm {max}}\left( \theta \right) < \theta , \end{aligned}$$

i.e. condition (38). We conclude that \(\theta \) and \(\alpha =\gamma _{\mathrm {max}}\left( \theta \right) \) satisfy both desired conditions, for any \(\theta \in (0,\theta ^*]\).

Noticing that

$$\begin{aligned} \gamma _{\mathrm {max}}\left( \theta ^* \right) = \left\{ \begin{aligned}&\frac{\sigma _{\mathrm {min}}\left( 2\sqrt{4-\sigma _{\mathrm {min}}^2}+\left( 4-\sigma _{\mathrm {max}}\sigma _{\mathrm {min}} \right) \right) }{2\sqrt{4-\sigma _{\mathrm {min}}^2} +\left( 4-\sigma _{\mathrm {min}}^2 \right) },&\sigma _{\mathrm {min}}< \frac{4}{\sigma _{\mathrm {max}}} \\&\frac{\frac{8}{\sigma _{\mathrm {max}}}\sqrt{4-\frac{16}{\sigma _{\mathrm {max}}^2}}}{2\sqrt{4-\frac{16}{\sigma _{\mathrm {max}}^2}}+ 4-\frac{16}{\sigma _{\mathrm {max}}^2}},&\sigma _{\mathrm {min}}>\frac{4}{\sigma _{\mathrm {max}}} \end{aligned}\right\} =\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) , \end{aligned}$$

we conclude the proof. \(\square \)
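To illustrate the lemma, one can test the admissibility of the pair \(\left( \theta ^*,\alpha ^*=\gamma _{\mathrm {max}}\left( \theta ^* \right) \right) \) numerically, with condition (38) read as \(0<\alpha <\theta \) and \(\alpha +\theta \le 2\sigma _{\mathrm {min}}\), and condition (39) as \(f\left( \alpha ,\theta ,y \right) \le 0\) on \(\left[ \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right] \), as in the proof above; a sketch:

```python
import math

def gamma_max(theta, sigma_max):
    # gamma_max(theta) as defined in the proof, for 0 < theta < 2
    r = 2 * math.sqrt(4 - theta**2)
    return theta * (r + (4 - sigma_max*theta)) / (r + (4 - theta**2))

def f(alpha, theta, y):
    return theta**2 * (y - alpha)**2 - 4*(theta - alpha)*(2*y - theta - alpha)

for s_min, s_max in [(0.5, 3.0), (1.0, 1.5), (0.2, 10.0)]:
    theta = min(s_min, 4/s_max)          # theta* (automatically < 2 in these samples)
    alpha = gamma_max(theta, s_max)      # alpha* = gamma_max(theta*)
    assert 0 < alpha < theta and alpha + theta <= 2*s_min        # condition (38)
    ys = [s_min + i*(s_max - s_min)/200 for i in range(201)]
    assert all(f(alpha, theta, y) <= 1e-9 for y in ys)           # condition (39)
```

Note that \(f\left( \alpha ^*,\theta ^*,\sigma _{\mathrm {max}} \right) =0\), since \(\alpha =\gamma _{\mathrm {max}}\left( \theta \right) \) corresponds exactly to \(y_{+}\left( \alpha ,\theta \right) =\sigma _{\mathrm {max}}\).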

Remark 4

The choice of \(\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) =\gamma _{\mathrm {max}}\left( \theta ^* \right) \) is not accidental. Indeed, one can easily show that

$$\begin{aligned} \frac{d}{d\theta }\gamma _{\mathrm {max}}\left( \theta \right) =\frac{8-2\sigma _{\mathrm {max}}\theta }{\left( 4-\theta ^2 \right) ^{\frac{3}{2}}}, \end{aligned}$$

and as such

$$\begin{aligned} \max _{\theta \in (0,\theta ^*]}\gamma _{\mathrm {max}}\left( \theta \right) = \gamma _{\mathrm {max}}\left( \theta ^* \right) . \end{aligned}$$

As the parameter \(\alpha ^*=\gamma _{\mathrm {max}}\left( \theta ^* \right) \) corresponds to the decay rate of our entropy according to Theorem 4, our choice of \(\alpha ^*\left( \sigma _{\mathrm {min}},\sigma _{\mathrm {max}} \right) \) was motivated by maximising the decay rate that is possible with our methodology.

We now possess all the tools required to prove part (b) of Theorem 1.

Proof of part (b) of Theorem 1

The convergence estimate for \(E_{\theta ^*}\left( u(t)-u_{\mathrm {avg}},v(t) \right) \) follows immediately from Theorem 4 and Lemma 6. To obtain (12) we use Lemma 3, in a similar fashion to the proof of part (a). \(\square \)

6 Convergence to Equilibrium in a \(3\)-Velocity Goldstein-Taylor Model

The Goldstein-Taylor model can be thought of as a simplification of the BGK equation [1, 9]

$$\begin{aligned}&\partial _{t}f(x,v,t)+ v\cdot \nabla _x f(x,v,t)- \nabla _x V(x)\cdot \nabla _{v}f(x,v,t)\\&\qquad =M(v)\int f(x,v,t)dv - f(x,v,t), \end{aligned}$$

where the variable v is now in the discrete velocity space \(\left\{ v_1,\dots ,v_n\right\} \), the variable x is in the torus \({\mathbb {T}}\), and the potential V(x) is zero. The r.h.s. of the above BGK equation corresponds to a projection onto the Maxwellian M(v); in the discrete velocity case this Maxwellian is replaced by a constant matrix that determines the large time behaviour of the new model. Under the natural physical assumption of symmetry in the velocities (i.e. \(\sum _{i=1}^n v_i=0\)) and the expectation that the solutions will converge towards a state that is equally distributed in v and constant in x, we find one potential multi-velocity extension of the Goldstein-Taylor model on \({\mathbb {T}}\times (0,\infty )\):

$$\begin{aligned} \partial _{t}\begin{pmatrix} f_1(x,t) \\ \vdots \\ f_n(x,t) \end{pmatrix} +{\mathscr {V}} \begin{pmatrix} f_1(x,t) \\ \vdots \\ f_n(x,t) \end{pmatrix} = \sigma (x)\left( \begin{pmatrix} \frac{1}{n} \\ \vdots \\ \frac{1}{n} \end{pmatrix}\otimes \left( 1,\dots ,1 \right) -\mathbf{I} \right) \begin{pmatrix} f_1(x,t) \\ \vdots \\ f_n(x,t) \end{pmatrix}, \end{aligned}$$
(44)

with the diagonal matrix \({\mathscr {V}}:=\text{ diag }[v_1,\dots ,v_n]\), and the discrete velocities

$$\begin{aligned} \left\{ v_1,\dots ,v_n\right\} ={\left\{ \begin{array}{ll} \left\{ -k+\frac{1}{2},\dots ,-\frac{1}{2},\frac{1}{2},\dots , k-\frac{1}{2}\right\} , &{} n=2k \\ \left\{ -k,\dots , -1,0,1,\dots , k \right\} ,&{} n=2k+1\end{array}\right. },\quad n\in {\mathbb {N}},\; n\ge 2. \end{aligned}$$

The matrix on the r.h.s. of (44) takes the form

$$\begin{aligned} {\varvec{Q}}=\frac{1}{n}\begin{pmatrix} 1-n &{} 1 &{} \dots &{} 1 \\ 1 &{} 1-n &{} \dots &{} 1 \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ 1 &{} 1 &{} \dots &{} 1-n \end{pmatrix} \end{aligned}$$

which has \(\left( 1,1,\dots , 1 \right) ^T\) in its kernel, and \({\mathscr {A}}=\left\{ \left( \xi _1,\dots , \xi _n \right) ^T\in {\mathbb {R}}^n\;|\; \sum _{i=1}^n \xi _i=0\right\} \) as its \((n-1)\)-dimensional eigenspace corresponding to the eigenvalue \(\lambda =-1\). This corresponds to the conservation of the total mass, and to the fact that the differences \(\left\{ f_i-f_j\right\} _{i,j=1,\dots , n}\) converge to zero. For more information we refer the interested reader to [1].
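These structural claims (zero-sum velocities, and the spectrum of \({\varvec{Q}}\)) are easy to confirm numerically; a small Python sketch:

```python
import numpy as np

def velocities(n):
    # the symmetric velocity grids defined above
    if n % 2 == 0:
        k = n // 2
        return [j + 0.5 for j in range(-k, k)]
    k = (n - 1) // 2
    return list(range(-k, k + 1))

for n in range(2, 8):
    v = velocities(n)
    assert len(v) == n and abs(sum(v)) < 1e-12        # symmetry: sum of velocities is 0

# Q = (1/n)(ones - n*I): eigenvalue 0 on (1,...,1)^T, eigenvalue -1 on zero-sum vectors
n = 5
Q = (np.ones((n, n)) - n*np.eye(n)) / n
assert np.allclose(Q @ np.ones(n), 0)
assert np.allclose(np.sort(np.linalg.eigvalsh(Q)), [-1]*(n-1) + [0])
```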

In this section we will consider a simple \(3\)-velocity Goldstein-Taylor model, which is governed by the following system of equations on \({\mathbb {T}}\times (0,\infty )\)

$$\begin{aligned} \begin{aligned}&\partial _t f_1(x,t) + \partial _x f_1(x,t)=\frac{\sigma (x)}{3}\left( f_2(x,t)+f_3(x,t)-2f_1(x,t) \right) ,\\&\partial _t f_2(x,t) =\frac{\sigma (x)}{3}\left( f_1(x,t)+f_3(x,t)-2f_2(x,t) \right) ,\\&\partial _t f_3(x,t) - \partial _x f_3(x,t)=\frac{\sigma (x)}{3}\left( f_1(x,t)+f_2(x,t)-2f_3(x,t) \right) . \end{aligned} \end{aligned}$$
(45)

Much like our Goldstein-Taylor equation, (1), we can recast the above with the variables

$$\begin{aligned} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}=\begin{pmatrix} \frac{1}{\sqrt{3}} &{} \frac{1}{\sqrt{3}} &{} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} &{} 0 &{} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{6}} &{} -\frac{2}{\sqrt{6}} &{} \frac{1}{\sqrt{6}} \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \\ f_3 \end{pmatrix}\ , \end{aligned}$$
(46)

which yields the following set of equations:

$$\begin{aligned} \begin{aligned}&\partial _t u_1(x,t) + \sqrt{\frac{2}{3}}\partial _x u_2(x,t)=0,\\&\partial _t u_2(x,t) +\sqrt{\frac{2}{3}} \partial _x u_1(x,t)+\frac{1}{\sqrt{3}}\partial _x u_3(x,t)=-\sigma (x) u_2(x,t),\\&\partial _t u_3(x,t) + \frac{1}{\sqrt{3}}\partial _x u_2(x,t)= - \sigma (x) u_3(x,t). \end{aligned} \end{aligned}$$
(47)

The orthogonal transformation (46) has a strong geometric motivation behind it, as it diagonalises the appropriate “interaction matrix” \({\varvec{Q}}\). It is also worth mentioning that, much like (3), this transformation brings us to the macroscopic variables. Indeed, up to some scaling, \(u_1\) is the mass, \(u_2\) is the flux, and \(u_3\) is a linear combination of the kinetic energy and the mass.
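One can verify directly that the matrix in (46) is orthogonal, that it diagonalises \({\varvec{Q}}\), and that conjugating \({\mathscr {V}}=\text{ diag }[1,0,-1]\) produces exactly the transport coefficients appearing in (47); a numerical sketch:

```python
import numpy as np

# the orthogonal transformation from (46)
T = np.array([[1/np.sqrt(3),  1/np.sqrt(3), 1/np.sqrt(3)],
              [1/np.sqrt(2),  0,           -1/np.sqrt(2)],
              [1/np.sqrt(6), -2/np.sqrt(6), 1/np.sqrt(6)]])
Q = (np.ones((3, 3)) - 3*np.eye(3)) / 3           # interaction matrix for n = 3
V = np.diag([1.0, 0.0, -1.0])                     # velocity matrix

assert np.allclose(T @ T.T, np.eye(3))            # (46) is orthogonal
assert np.allclose(T @ Q @ T.T, np.diag([0, -1, -1]))   # Q is diagonalised

# transport matrix in the u-variables: couplings sqrt(2/3) and 1/sqrt(3), as in (47)
expected = np.array([[0, np.sqrt(2/3), 0],
                     [np.sqrt(2/3), 0, 1/np.sqrt(3)],
                     [0, 1/np.sqrt(3), 0]])
assert np.allclose(T @ V @ T.T, expected)
```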

Following our intuition we expect that by denoting

$$\begin{aligned} u_\infty := \frac{1}{2\sqrt{3}\pi }\int _{{\mathbb {T}}}\left( f_{1,0}(x)+f_{2,0}(x)+f_{3,0}(x) \right) dx, \end{aligned}$$

we will find that

$$\begin{aligned} u_1(x,t){\mathop {\longrightarrow }\limits ^{t\rightarrow \infty }} u_\infty ,\quad u_2(x,t){\mathop {\longrightarrow }\limits ^{t\rightarrow \infty }} 0,\quad u_3(x,t){\mathop {\longrightarrow }\limits ^{t\rightarrow \infty }} 0. \end{aligned}$$

To prove this result we shall introduce an appropriate Lyapunov functional. To find this functional, we have two options, even for the simple case of constant \(\sigma \) (which is our base case): Proceeding as in § 4.2, we could use a modal decomposition of (47) and the (optimal) positive definite matrices \({\mathbf {P}}_k\) to construct an entropy functional with sharp decay, and then rewrite it in physical space, using pseudo-differential operators. This construction, which is analogous to the construction of \(E_\theta (f,g)\) from (4), can become extremely cumbersome in dimension 3 and higher.

As a simpler alternative, we shall instead follow the strategy from [1, §4.3] and [2, §2.3]: in Fourier space, the system matrix of (47) reads as

$$\begin{aligned} \mathbf{C}_k=\begin{pmatrix} 0 &{} \sqrt{\frac{2}{3}}ik &{} 0 \\ \sqrt{\frac{2}{3}}ik &{} \sigma &{} {\frac{1}{\sqrt{3}}}ik \\ 0 &{} {\frac{1}{\sqrt{3}}}ik &{} \sigma \end{pmatrix}\ . \end{aligned}$$
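Reading \(\mathbf{C}_k\) off (47) (for constant \(\sigma \)), one can check numerically that its Hermitian part is \(\text{ diag }(0,\sigma ,\sigma )\), so the kernel of the symmetric part is one dimensional, and that all eigenvalues of \(\mathbf{C}_k\) have positive real part for \(k\ne 0\), consistent with the decay of every non-zero mode; a sketch:

```python
import numpy as np

def C(k, sigma=1.0):
    # system matrix of (47) in Fourier mode k, for constant sigma
    a = np.sqrt(2/3) * 1j * k
    b = 1j * k / np.sqrt(3)
    return np.array([[0, a, 0],
                     [a, sigma, b],
                     [0, b, sigma]])

for k in [1, 2, 5, -3]:
    Ck = C(k)
    H = (Ck + Ck.conj().T) / 2                       # Hermitian part
    assert np.allclose(H, np.diag([0, 1.0, 1.0]))    # one-dimensional kernel
    # every eigenvalue of C_k has positive real part when k != 0
    assert np.linalg.eigvals(Ck).real.min() > 1e-6
```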

We note that, for \(k\ne 0\), the hypocoercivity index of \(\mathbf{C}_k\), as well as of (47), is one, since this index is always bounded from above by the kernel dimension of the symmetric part of the generator, cf. [2]. For such index-1 problems, Theorem 2.6 from [2] shows that the choice

$$\begin{aligned} {\mathbf {P}}_k=\begin{pmatrix} 1 &{} \frac{\lambda }{ik} &{} 0 \\ -\frac{\lambda }{ik} &{} 1 &{} 0 \\ 0 &{} 0 &{} 1 \end{pmatrix}\ \quad k\ne 0, \end{aligned}$$

with an appropriate \(\lambda \in {\mathbb {R}}\), always yields a (simple) Lyapunov functional for (47), typically with a sub-optimal decay rate. Much like in § 5, this guides us to the definition of our functional, expressed in the following theorem:

Theorem 5

Let \(u_1,u_2,u_3\in C([0,\infty );L^2\left( {\mathbb {T}} \right) )\) be real-valued mild solutions to (47) with initial data \(u_{1,0},u_{2,0},u_{3,0}\in L^2\left( {\mathbb {T}} \right) \). Denoting by

$$\begin{aligned} {\mathfrak {E}}_{\theta }\left( f,g,h \right) :=\left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2+\left\Vert h\right\Vert ^2 - \frac{\theta }{2\pi }\int _{0}^{2\pi }\left( \partial _x^{-1}f(x) \,{g(x)} \right) dx, \end{aligned}$$

we have that

$$\begin{aligned} {\mathfrak {E}}_{\theta }\left( u_1(t)-u_{\infty },u_2(t),u_3(t) \right) \le {\mathfrak {E}}_{\theta }\left( u_{1,0}-u_{\infty },u_{2,0},u_{3,0} \right) e^{-\alpha t}\ ,\quad t\ge 0, \end{aligned}$$
(48)

for any \(\theta >0\) and \(\alpha >0\) such that

$$\begin{aligned} \sqrt{\frac{2}{3}}\theta +\alpha < 2\sigma _{\mathrm {min}}, \quad \alpha \le \sqrt{\frac{2}{3}}\theta , \end{aligned}$$
(49)

and

$$\begin{aligned} \left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{8\sigma (x) -4\sqrt{\frac{2}{3}}\theta -4\alpha } \right) +\left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2}{12\left( 2\sigma (x)-\alpha \right) } \right) \le \sqrt{\frac{2}{3}}\theta -\alpha . \end{aligned}$$
(50)

Remark 5

For \(0<\theta <2\), \({\mathfrak {E}}_{\theta }\left( f,g,h \right) \) is equivalent to \(\left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2+\left\Vert h\right\Vert ^2\). Indeed, following Lemma 3 we see that

$$\begin{aligned} \left( 1-\frac{\left|\theta \right|}{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) +\left\Vert h\right\Vert ^2\le {\mathfrak {E}}_{\theta }\left( f,g,h \right) \le \left( 1+\frac{\left|\theta \right|}{2} \right) \left( \left\Vert f\right\Vert ^2+\left\Vert g\right\Vert ^2 \right) +\left\Vert h\right\Vert ^2. \end{aligned}$$
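This equivalence is easy to test numerically, realising \(\partial _x^{-1}\) in Fourier space (with zero average) and taking \(\left\Vert f\right\Vert ^2=\frac{1}{2\pi }\int _0^{2\pi }f(x)^2dx\); a sketch for trigonometric polynomials:

```python
import numpy as np

N = 64
x = 2*np.pi*np.arange(N)/N

def antiderivative(f):
    # zero-average antiderivative via the Fourier multipliers 1/(ik)
    F = np.fft.fft(f)
    k = np.fft.fftfreq(N, d=1.0/N)           # integer frequencies
    out = np.zeros_like(F)
    out[k != 0] = F[k != 0] / (1j*k[k != 0])
    return np.fft.ifft(out).real

def norm2(f):
    return np.mean(f**2)                     # (1/2π)∫ f² dx on the grid

f = np.cos(x) + 0.5*np.sin(3*x)              # zero-average test functions
g = 0.8*np.sin(x) - 0.3*np.cos(3*x)
h = np.cos(2*x)

theta = 1.0
E = norm2(f) + norm2(g) + norm2(h) - theta*np.mean(antiderivative(f)*g)
lower = (1 - theta/2)*(norm2(f) + norm2(g)) + norm2(h)
upper = (1 + theta/2)*(norm2(f) + norm2(g)) + norm2(h)
assert lower - 1e-12 <= E <= upper + 1e-12
```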

Proof of Theorem 5

We start by noticing that the transformation

$$\begin{aligned} u_1\rightarrow u_1-u_\infty ,\quad u_2\rightarrow u_2,\quad u_3\rightarrow u_3 \end{aligned}$$

keeps (47) invariant, so we may assume, without loss of generality, that \(u_\infty =0\). This, together with the equation for \(u_1(x,t)\) implies that

$$\begin{aligned} \left( u_1(t) \right) _{\mathrm {avg}}=\left( u_{1,0} \right) _{\mathrm {avg}}=u_{\infty }=0. \end{aligned}$$

Next, we compute the time derivatives of the \(L^2\) norms and obtain:

$$\begin{aligned} \begin{aligned} \frac{d}{dt}\left( \left\Vert u_1(t)\right\Vert ^2 + \left\Vert u_2(t)\right\Vert ^2+\left\Vert u_3(t)\right\Vert ^2 \right) =&-\frac{1}{\pi }\int _0^{2\pi }\sigma (x)u_2(x,t)^2dx\\&-\frac{1}{\pi }\int _0^{2\pi }\sigma (x)u_3(x,t)^2dx. \end{aligned} \end{aligned}$$
(51)

Continuing, we see that

$$\begin{aligned} \begin{aligned} \frac{d}{dt}\int _{0}^{2\pi } \partial _x^{-1}u_1(x,t) u_2(x,t)dx&=2\pi \sqrt{\frac{2}{3}}\left( \left( u_2(t)_{\mathrm {avg}} \right) ^2-\left\Vert u_2(t)\right\Vert ^2 \right) + \frac{2\sqrt{2}\pi }{\sqrt{3}}\left\Vert u_1(t)\right\Vert ^2 \\&+\frac{1}{\sqrt{3}}\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx\\&-\int _{0}^{2\pi } \sigma (x)\partial _x^{-1}u_1(x,t) u_2(x,t)dx, \end{aligned} \end{aligned}$$

where we used Lemma 2. As such, together with (51), we conclude that

$$\begin{aligned} \begin{aligned}&\frac{d}{dt}{\mathfrak {E}}_{\theta }\left( u_1(t),u_2(t),u_3(t) \right) \\&\quad = -\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\sqrt{\frac{2}{3}}\theta \right) u_2(x,t)^2dx\\&\qquad -\frac{1}{\pi }\int _0^{2\pi }\sigma (x)u_3(x,t)^2dx-\sqrt{\frac{2}{3}} \theta \left\Vert u_1(t)\right\Vert ^2-\sqrt{\frac{2}{3}}\theta \left( u_2(t)_{\mathrm {avg}} \right) ^2\\&\qquad -\frac{\theta }{2\sqrt{3}\pi }\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx+\frac{\theta }{2\pi }\int _{0}^{2\pi } \sigma (x)\partial _x^{-1}u_1(x,t) u_2(x,t)dx. \end{aligned} \end{aligned}$$
(52)

Thus

$$\begin{aligned} \frac{d}{dt}{\mathfrak {E}}_{\theta }\left( u_1(t),u_2(t),u_3(t) \right) =-\alpha {\mathfrak {E}}_{\theta }\left( u_1(t),u_2(t),u_3(t) \right) +R_{\theta ,\alpha ,\sigma }(t) \end{aligned}$$

with

$$\begin{aligned} \begin{aligned} R_{\theta ,\alpha ,\sigma }(t):=&-\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x) -\sqrt{\frac{2}{3}}\theta -\alpha \right) u_2(x,t)^2dx\\&-\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\alpha \right) u_3(x,t)^2dx -\left( \sqrt{\frac{2}{3}}\theta -\alpha \right) \left\Vert u_1(t)\right\Vert ^2\\&-\sqrt{\frac{2}{3}}\theta \left( u_2(t)_{\mathrm {avg}} \right) ^2-\frac{\theta }{2\sqrt{3}\pi }\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx\\&+\frac{\theta }{2\pi }\int _{0}^{2\pi } \left( \sigma (x)-\alpha \right) \partial _x^{-1}u_1(x,t) u_2(x,t)dx. \end{aligned} \end{aligned}$$
(53)

To conclude the proof it is enough to show that under conditions (49) and (50) we have that \(R_{\theta ,\alpha ,\sigma }(t) \le 0\). We will, in fact, show the stronger statement:

$$\begin{aligned} \begin{aligned}&\left|-\frac{\theta }{2\sqrt{3}\pi }\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx+\frac{\theta }{2\pi }\int _{0}^{2\pi } \left( \sigma (x)-\alpha \right) \partial _x^{-1}u_1(x,t) u_2(x,t)dx\right|\\&\quad \le \frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\sqrt{\frac{2}{3}} \theta -\alpha \right) u_2(x,t)^2dx\\&\qquad +\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\alpha \right) u_3(x,t)^2dx +\left( \sqrt{\frac{2}{3}}\theta -\alpha \right) \left\Vert u_1(t)\right\Vert ^2. \end{aligned} \end{aligned}$$
(54)

Similarly to the techniques we have used in the proof of part (b) of Theorem 1, and using the positivity of the coefficients in the last two terms (which follows from (49)), we see that

$$\begin{aligned}&\left|\frac{\theta }{2\pi }\int _{0}^{2\pi } \left( \sigma (x)-\alpha \right) \partial _x^{-1}u_1(x,t) u_2(x,t)dx\right| \\&\quad \le \frac{\theta }{2\pi }\int _{0}^{2\pi }\frac{\left|\sigma (x) -\alpha \right|}{\sqrt{2\sigma (x)-\sqrt{\frac{2}{3}}\theta -\alpha }}\left|\partial _x^{-1}u_1(x,t)\right| \cdot \sqrt{2\sigma (x)-\sqrt{\frac{2}{3}}\theta -\alpha }\left|u_2(x,t)\right|dx \\&\quad \le \left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{8\sigma (x) -4\sqrt{\frac{2}{3}}\theta -4\alpha } \right) \left\Vert u_1(t)\right\Vert ^2+ \frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\sqrt{\frac{2}{3}}\theta -\alpha \right) u_2(x,t)^2dx, \end{aligned}$$

and that

$$\begin{aligned}&\left|\frac{\theta }{2\sqrt{3}\pi }\int _{0}^{2\pi } u_1(x,t)u_3(x,t)dx\right| \le \frac{\theta }{2\pi }\int _{0}^{2\pi } \frac{\left|u_1(x,t)\right|}{\sqrt{6\sigma (x)-3\alpha }}\sqrt{2\sigma (x)-\alpha } \left|u_3(x,t)\right|dx \\&\quad \le \left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2}{12\left( 2\sigma (x)-\alpha \right) } \right) \left\Vert u_1(t)\right\Vert ^2+\frac{1}{2\pi }\int _0^{2\pi }\left( 2\sigma (x)-\alpha \right) u_3(x,t)^2dx. \end{aligned}$$

Thus, one sees that (54) holds when

$$\begin{aligned} \left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2\left( \sigma (x)-\alpha \right) ^2}{8\sigma (x) -4\sqrt{\frac{2}{3}}\theta -4\alpha } \right) +\left( \sup _{x\in {\mathbb {T}}}\frac{\theta ^2}{12\left( 2\sigma (x)-\alpha \right) } \right) \le \sqrt{\frac{2}{3}}\theta -\alpha , \end{aligned}$$

which is (50). The proof is complete. \(\square \)

While we have elected not to optimise the choice of \(\alpha \) (as was done in §5), we can still infer the following simpler, yet far from optimal, corollary:

Corollary 1

Let \(\theta >0\) and \(\alpha >0\) be such that

$$\begin{aligned} \sqrt{\frac{2}{3}}\theta +\alpha < 2\sigma _{\mathrm {min}}, \quad \alpha \le \sqrt{\frac{2}{3}}\theta \end{aligned}$$

and

$$\begin{aligned} \frac{\theta ^2\sigma _{\mathrm {max}}^2}{8\sigma _{\mathrm {min}}-4\sqrt{\frac{2}{3}}\theta -4\alpha } +\frac{\theta ^2}{12\left( 2\sigma _{\mathrm {min}}-\alpha \right) } \le \sqrt{\frac{2}{3}}\theta -\alpha . \end{aligned}$$
(55)

Then

$$\begin{aligned} {\mathfrak {E}}_{\theta }\left( u_1(t)-u_{\infty },u_2(t),u_3(t) \right) \le {\mathfrak {E}}_{\theta }\left( u_{1,0}-u_{\infty },u_{2,0},u_{3,0} \right) e^{-\alpha t}. \end{aligned}$$

In particular, for

$$\begin{aligned} \alpha := \min \left( \frac{\sigma _{\mathrm {min}}}{2},\frac{3\sigma _{\mathrm {min}}}{9\sigma _{\mathrm {max}}^2+1} \right) \end{aligned}$$

we have that \({\mathfrak {E}}_{\sqrt{6}\alpha }\) decays exponentially to zero with rate \(\alpha \).

Proof

Since \(\alpha <2\sigma _{\mathrm {min}}\le \sigma _{\mathrm {max}}+\sigma _{\mathrm {min}}\) we see that

$$\begin{aligned} \alpha - \sigma _{\mathrm {max}}< \sigma _{\mathrm {min}}\le \sigma (x) <\sigma _{\mathrm {max}}+ \alpha , \end{aligned}$$

implying that \(\left( \sigma (x)-\alpha \right) ^2 \le \sigma _{\mathrm {max}}^2\) for any \(x\in {\mathbb {T}}\). Combining this with elementary estimates on the denominators of the expressions appearing in (50), we see that (49) and (50) are valid. As such, the first statement of the corollary follows from Theorem 5.

To show the second part of the corollary we notice that with the choice \(\theta _{\alpha }:=\sqrt{6}\alpha \) and \(\alpha \le \frac{\sigma _{\mathrm {min}}}{2}\)

$$\begin{aligned} \sqrt{\frac{2}{3}}\theta _\alpha +\alpha =3\alpha < 2\sigma _{\mathrm {min}},\quad \alpha \le 2\alpha =\sqrt{\frac{2}{3}}\theta _{\alpha }, \end{aligned}$$

giving us (49). Using the inequalities

$$\begin{aligned} 8\sigma _{\mathrm {min}}-4\sqrt{\frac{2}{3}}\theta _\alpha -4\alpha \ge 2\sigma _{\mathrm {min}},\quad \text {and}\quad 2\sigma _{\mathrm {min}}-\alpha \ge \frac{3}{2}\sigma _{\mathrm {min}}\end{aligned}$$

for the l.h.s. of (55), we see that

$$\begin{aligned} \frac{\theta _\alpha ^2\sigma _{\mathrm {max}}^2}{8\sigma _{\mathrm {min}}-4\sqrt{\frac{2}{3}}\theta _\alpha -4\alpha }+\frac{\theta _\alpha ^2}{12\left( 2\sigma _{\mathrm {min}}-\alpha \right) } \le \left( 9\sigma _{\mathrm {max}}^2+1 \right) \frac{\alpha ^2}{3\sigma _{\mathrm {min}}}. \end{aligned}$$

Thus, since \(\sqrt{\frac{2}{3}}\theta _\alpha -\alpha =\alpha \), the desired condition (55) is valid when

$$\begin{aligned} \alpha \le \frac{3\sigma _{\mathrm {min}}}{9\sigma _{\mathrm {max}}^2+1}, \end{aligned}$$

which concludes the proof. \(\square \)
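The parameter choice of Corollary 1 is also easy to test: for sample bounds \(\sigma _{\mathrm {min}},\sigma _{\mathrm {max}}\), the following sketch checks (49) and (55) with \(\theta _\alpha =\sqrt{6}\alpha \):

```python
import math

def check(s_min, s_max):
    # alpha from Corollary 1 and theta_alpha = sqrt(6)*alpha
    alpha = min(s_min/2, 3*s_min/(9*s_max**2 + 1))
    theta = math.sqrt(6)*alpha
    c = math.sqrt(2/3)                       # note: c*theta = 2*alpha
    assert c*theta + alpha < 2*s_min and alpha <= c*theta          # (49)
    lhs = theta**2*s_max**2/(8*s_min - 4*c*theta - 4*alpha) \
        + theta**2/(12*(2*s_min - alpha))
    assert lhs <= c*theta - alpha + 1e-12                          # (55)

for pair in [(1.0, 2.0), (0.5, 0.5), (0.1, 7.0)]:
    check(*pair)
```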