Large Time Convergence of the Non-homogeneous Goldstein-Taylor Equation

The Goldstein-Taylor equations can be thought of as a simplified version of a BGK system, where the velocity variable is constricted to a discrete set of values. It is intimately related to turbulent fluid motion and the telegrapher’s equation. A detailed understanding of the large time behaviour of the solutions to these equations has been mostly achieved in the case where the relaxation function, measuring the intensity of the relaxation towards equally distributed velocity densities, is constant. The goal of the presented work is to provide a general method to tackle the question of convergence to equilibrium when the relaxation function is not constant, and to do so as quantitatively as possible. In contrast to the usual modal decomposition of the equations, which is natural when the relaxation function is constant, we define a new Lyapunov functional of pseudodifferential nature, one that is motivated by the modal analysis in the constant case, that is able to deal with full spatial dependency of the relaxation function. The approach we develop is robust enough that one can apply it to multi-velocity Goldstein-Taylor models, and achieve explicit rates of convergence. The convergence rate we find, however, is not optimal, as we show by comparing our result to those found in [8].


Introduction
The object of this work is the large time analysis of the Goldstein-Taylor equations on the onedimensional torus T, i.e. on [0, 2π] with periodic boundary conditions, and for t ∈ (0, ∞): , where f ± (x, t) are the density functions of finding an element with a velocity ±1 in a position x ∈ T at time t > 0. The function σ ∈ L ∞ + (T) := f ∈ L ∞ (T) essmin f > 0 is the relaxation coefficient, and f ±,0 are the initial conditions. Since (1) is mass conserving, its steady state is of the form with the notation The Goldstein-Taylor model was originally considered as a diffusion process, resulting as a limit of a discontinuous random migration in 1D, where particles may change direction with rate σ . It appeared in the context of turbulent fluid motion and the telegrapher's equation, see [15,22], respectively. (1) can also be seen as a special 1D case of a BGK-model (named after the three physicists Bhatnagar, Gross, and Krook [9]) with a discrete set of velocities. Such equations commonly appear in applications like gas and fluid dynamics as velocity discretisations of various kinetic models (e.g. the Boltzmann equation). The mathematical analysis of such discrete velocity models has a long standing tradition, see [10,18] and references therein.
Although the Goldstein-Taylor equation is very simple, it still exhibits an interesting and mathematically rich structure. Hence, it has been attracting continuous interest over the last 20 years. Most of its mathematical analyses was devoted to the following three topics: scaling limits, asymptotic preserving (AP) numerical schemes, and large time behaviour. In a diffusive scaling, the Goldstein-Taylor model can be viewed as a hyperbolic approximation to the heat equation [21]. Various AP-schemes for this model in the stiff relaxation regime (i.e. for σ → ∞) were constructed and analyzed in [4,16,17]. Since the large time convergence of solutions to (1) towards its unique steady state is also the topic of this work, we shall review the related literature in more detail: Analytically, the main difficulty of (1) is with its hypocoercivity, as defined in [24]: More specifically, the relaxation operator on the r.h.s. is not coercive on T × R 2 . Hence, for each fixed x, the r.h.s. by itself would drive the system to its local equilibrium, generated by the kernel of the relaxation operator, span{ 1 1 }, but the local mass (density) might be different at different positions. Convergence to the global equilibrium ( f ∞ , f ∞ ) T only arises due to the interplay between local relaxation and the transport operator on the l.h.s. of (1).
The Goldstein-Taylor model was also considered in the analysis of [5], if one chooses the velocity matrix to be V = diag(1, −1) and the relaxation matrix A(x) to be Exponential convergence to the steady state is then proved in the aforementioned work for the system (1) with inflow boundary conditions. Such boundary conditions make the problem significantly easier than in the periodic set-up envisioned here, in particular it allows for σ (x) to be zero on a subset of T, an issue that proves to be far more difficult in our setting. In [12] the authors proved polynomial decay towards the equilibrium, allowing σ (x) to vanish at finitely many points.
In [23] the author proved exponential decay for solutions to (1) for a more general σ (x) ≥ 0. That work is based on a (non-local in time) weak coercive estimate on the damping. All of the papers mentioned so far did not focus on the optimality of the (exponential) decay rate. Using the equivalence between (1) and the telegrapher's equation, the authors of [8] have shown that this optimal decay rate, μ(σ ), is the minimum of σ avg and the spectral gap of the telegrapher's equation (excluding the case when some of those eigenvalues with real part equal to μ(σ ) are defective). The precise value of this spectral gap, however, is hardly accessible -even for simple non-constant relaxation functions σ (x) (see e.g. Appendix A). Moreover, it is based on the restrictive requirement f ±,0 ∈ H 1 (T), and cannot be extended to other discrete velocity models in 1D. The reason for the latter is that [8] heavily relies on the equivalence of (1) to the telegrapher's equation.
The issues above motivated our subsequent analysis: We introduce a method for L 2initial data that can be extended to other discrete velocity BGK-models (as illustrated below for a 3−velocities system), and that yields sharp rates for constant σ . Moreover, and most importantly, it is applicable in the general non-homogeneous σ ∈ L ∞ + (T) case and yields in these cases an explicit, quantitative lower bound for the decay rate. In this case, however, it will not achieve an optimal rate of convergence 1 to the appropriate equilibrium of the system. The method to be derived here will use a Lyapunov function technique in the spirit of the earlier works [1,2,13,24]. This paper is structured as follows: In §2 we give the analytical setting of the problem and present our main convergence result (Theorem 1). In §3 we recall some analytical results which will be needed in the analysis that will follow, and explore some properties of the entropy functional E θ and the anti-derivative of functions on T, defined in (4) and (5), respectively. § 4 is devoted to the case where σ (x) = σ is constant, which will motivate our more general approach: Based on a modal decomposition of the Goldstein-Taylor system and its spectral analysis we derive the entropy functional E θ , first on a modal level and then as a pseudo-differential operator in physical space. We conclude by proving part (a) of our main theorem. Continuing to §5, we will prove, using a perturbative approach to the problem, part (b) of our main theorem. The robustness of our method will be shown in §6 where we use it to obtain an explicit rate of convergence for a 3−velocities Goldstein-Taylor model. Finally, in Appendix A we discuss a potential way to improve the technique from §5, and explicitly show the lack of optimality of it for a particular case of σ (x).

The Setting of the Problem and Main Results
To better understand the Goldstein-Taylor system, (1), one starts by recasting it in the macroscopic variables representing the spatial (mass) density and the flux density, respectively. The macroscopic variables yield the following system of equations on T × (0, ∞): whose theory of existence and uniqueness is straightforward (since the r.h.s. is a bounded perturbation of the transport operator; see §2 in [12] or, more generally, [20]). Moreover, when one tries to understand the qualitative behaviour of (3), one notices that the equation for u speaks of "total mass conservation" (upon integration over the spatial interval (0, 2π)), while the equation for v predicts a strong decay to zero for the function. This means, at least intuitively, that the difference between f + and f − should go to zero, and that their sum retains its mass. As the main driving force of the equation is a transport operation on the torus, we will not be surprised to learn that the large time behaviour of u (and since v should go to zero, of f + and f − as well) is convergence to a constant. All of this has been verified in several cases, most generally in [8].
We now set the framework that will assist us in the investigation of the large time behaviour of (3), in a relatively general case. The natural Hilbert space to consider this problem is L 2 (T) ⊗2 , with the standard inner product for each component: where the bar denotes complex conjugation. Since (1) and (3) are (only) hypocoercive, the symmetric part of their generators (i.e. the operators on their r.h.s.) are not coercive on L 2 (T) ⊗2 . Hence, the standard L 2 -norm cannot serve as a usable Lyapunov functional. As is typical for hypocoercive equations (see [1,13,24]), a possible remedy to this problem is to consider a "twisted" norm (often also referred to as entropy functional), constructed in a way that this functional strictly decays along each trajectory (u(t), v(t)).
The following functional, which will be our entropy functional, is not an ansatz, and its origin will be derived in §4. Moreover, we will show that it will yield the sharp exponential decay for constant σ , when one chooses θ = θ(σ ) appropriately.
Definition 1 Let f , g ∈ L 2 (T) and let θ > 0 be given. Then we define the entropy E θ ( f , g) as Here, the anti-derivative of f is defined as with the average defined in (2). The normalization constant in (5) is chosen such that Several recent studies (like [1,13]) considered the Goldstein-Taylor system with constant σ . This case can be investigated fairly easy as one is able to utilise Fourier analysis in this setting, and construct a Lyapunov functional as a sum of quadratic functionals of the Fourier modes. However, the moment we change σ (x) to a non-constant function -even to one that is natural in the Fourier setting, such as sine or cosine -the Fourier analysis becomes nigh impossible to solve. The main idea that guided us in our approach was to re-examine the case where σ is constant and to recast the modal Fourier norm by using a pseudo-differential operator, without needing its modal decomposition. This functional, which is exactly E θ for particular choices of θ = θ(σ ), can then be extended to the case where σ (x) is not constant, yielding quantitative estimates for the convergence. As the nature of this approach is perturbative, our decay rates are not optimal. The methodology itself, however, is fairly robust, and is viable in other cases, such as the multi-velocity Goldstein-Taylor model (as we shall see).
The main theorem we will show in this paper, with the use of the vector notation is the following: and if σ = 2 then for any 0 < < 1 where and the decay rate μ(σ ) is sharp.
then by defining θ * := min σ min , 4 σ max (10) and we have that and as result with f ∞ defined in (8).
Part (a) of this theorem will be proved in §4.4, and Part (b) in §5. In many of the proofs which will eventually lead to the proof of this theorem we will assume that (u, v) is a classical solution, pertaining to u 0 , v 0 in the periodic Sobolev space H 1 (T). The general result will follow by a simple density argument.

Remark 1
It is simple to see that if σ (x) satisfies the conditions of (b), then, as σ min and σ max approach a positive constant σ = 2, we find that recovering the results of part (a) of the above theorem. In addition, one should note that when σ min > 4 σ max we have that where μ (σ ) was defined in part (a) of the Theorem. This validates the intuition that, if σ max is "large enough", the convergence rate of the solution can be estimated using the "worst convergence rate", corresponding to μ (σ max ) of the σ (x) = σ case. Lastly, one notices that when σ min = 4 σ max which shows the continuity of α * on the curve that stitches the two formulas in (11).

Preliminaries
In this short section we will remind the reader of a few simple properties of functions on the torus, as well as explore properties of the anti-derivative function, ∂ −1 x f , and our functional E θ ( f , g). Most of the simple proofs of this section will be deferred to Appendix B.
We begin with the well known Poincaré inequality: Next we focus our attention on some simple, yet crucial, properties of the anti-derivative function which was defined in (5).
Lemma 2 Let f ∈ L 1 (T). Then: x f is a continuous function on the torus, and Remark 2 (ii), (iv), and the fact that f is a function on the torus, imply that if f avg = 0 we are allowed to use integration by parts with ∂ −1 x f (x) on this boundaryless manifold without qualms. where Proof We begin by noticing that the validity of (19) follows immediately from the fact that u is a mild solution and the conservation of mass property of the system (3). Moreover, one can see that replacing (u(t), v(t)) by u(t) − u avg , v(t) yields an equivalent solution (up to a constant shift in the initial data) to the system of equations, with the additional condition that the average of the first component is zero for all t ≥ 0. With this observation in mind, we can assume without loss of generality that u avg = 0. Using the Goldstein-Taylor equations we see that We now turn our attention to the mixed term of E θ (u, v): Using points (ii) and (iii) of Lemma 2, together with Remark 2, we find that the above equals Subtracting this from (20) (as there is a minus in definition (4)) yields (18).

Constant Relaxation Function
In recent years, the investigation of the Goldstein-Taylor model on T with constant relaxation function σ was frequently tackled with a modal decomposition in the Fourier space w.r.t.
x. This approach allows for an extension to other discrete velocity models and even some continuous velocities models [1], but is not suitable for the non-homogeneous case. Before beginning with our investigation we review a few recent results: In [13, §1.4] exponential convergence to equilibrium was shown, but without the sharp rate. In [1, §4.1] a hypocoercive decay estimate of the form with the vector notation from (6) and the sharp rate was obtained (see also Fig. 1 below). A further study on the minimal constant c in the above was provided in [3, Th. 1.1].
With these results in mind, we turn our attention to the following (recast) Goldstein-Taylor equation with a constant relaxation rate: In order to be able to discover our entropy functional, we shall consider the straightforward modal analysis in detail. This will allow us to obtain not only explicit decay rates for each Fourier mode, but also an "optimal Lyapunov functional" for such given mode, with which we Fig. 1 The exponential decay rate, μ (σ ), of the solution pair (u(t) − u avg , v(t)) grows linearly until σ = 2 where the defectiveness appears (hence the circle). From that point onwards the decay rate decreases, and is of order O 1 σ will then be able to construct a non-modal entropy functional in terms of a pseudo-differential operator as defined in (4). As was mentioned in §2, this will give us intuition to the large time behaviour of the equation in several cases even when σ (x) is not constant.

Fourier Analysis and the Spectral Gap
One natural way to understand the large time behaviour of (21) relies on a simple Fourier analysis together with a hypocoercivity technique that was developed by Arnold and Erb in [6]. We begin with the former, and focus on the latter from the next subsection onwards.
Using the Fourier transform on the torus (i.e. in the spatial variable), we see that (21) is equivalent to infinity many decoupled ODE systems: The eigenvalues of the matrices C k ∈ C 2×2 are given by and as such: -Invariant space: For k = 0 we find that λ −,0 = 0 and λ +,0 = σ . In fact, as we can conclude immediately that u(0, t) = u 0 (0) and v(0, t) = v 0 (0)e −σ t , corresponding to the mass conservation of the original equation and the rapid decay of the difference between the masses of f − and f + . -Case I: For 0 < |k| < σ 2 one finds two real eigenvalues, whose minimum is 2 ∈ N the two eigenvalues coincide and are equal to σ 2 . Moreover, that eigenvalue is defective (i.e. corresponds to a Jordan block of size 2) and the large time behaviour of u(k) and v(k) is controlled by (1 + t) e − σ 2 t . -Case III: For |k| > σ 2 , one finds two complex conjugate eigenvalues, whose real part equals σ 2 . Thus the large time behaviour of u(k) and v(k) is controlled by e − σ 2 t .
From the observations above, we notice that as long as we subtract u(0), i.e. as long as we remove the initial total mass from the original solution, all the modes converge exponentially to zero. Their rates have a sharp, and uniform-in-k lower bound that depends on σ . This spectral gap of (21) will be denoted by μ (σ ). Case I, i.e. 0 < |k| < σ 2 , is the most "difficult case" as the real part of the eigenvalues depends on k. However, one notices that the lower eigenvalue, λ −,k , increases with k, which implies that, if there are k−s such that 0 < |k| < σ 2 , the slowest possible convergence will be given by λ −,±1 . As we need to compare the decay rates of all modes simultaneously, we find that it is enough to consider the following possibilities: -0 < σ < 2: We only have possibilities of Case III, implying that all modes are controlled by e − σ 2 t . -σ = 2: We have possibilities of Case III, as well as defectiveness in k = ±1 (Case II). This means that the modes are controlled by (1 + t) e −t . If one searches for a pure exponential control, the best rate one would find is e −(1− )t for any given fixed > 0. -σ > 2: We have possibilities from Cases I and III, and potentially Case II. All the modes that correspond to Case I are controlled by e , while those that correspond to Case III are controlled by e − σ 2 t . If Case II is realised, i.e. σ 2 ∈ N \ {1}, we find that the modes k = ± σ 2 are controlled by (1 + t) e − σ 2 t . In total, thus, all the modes are controlled by e , a decay rate that is realised on the k = ±1 modes, and the coefficient in the exponent is the spectral gap of the Goldstein-Taylor system (21).
An illustration of the eigenvalues of the matrices C k for |k| ∈ N and σ = 5 can be viewed in Fig. 2. Before we turn our attention to properly consider these cases and "uncover" our spatial entropy, we remind the reader of the hypocoercivity technique which will allow us to transform the spectral information of C k into a an appropriate, twisted norm with which we will show the desired decay of the k-th mode.

Hypocoercivity and Modal Lyapunov Functionals
In the previous subsection we have concluded that, barring the zero mode, all the Fourier modes of (22) decay exponentially (excluding potentially those with |k| = σ norm with the help of another, closely related, positive definite matrix P k , one can construct a new Lyaponov functional, which is equivalent to the Euclidean norm, that decays with the expected exponential rate (at least for a non-defective C k ). This is exactly the idea that motivated Arnold and Erb, and which is expressed in the following theorem (see [6], [ ii) If at least one eigenvalue with real part equal to μ is defective, then for any > 0, one can find a Hermitian, positive definite matrix P such that where C * denotes the Hermitian transpose of C.
We remark that the matrices P and P are never unique. One can utilise the theorem in the following way: Assuming the eigenvalues associated to C's spectral gap, μ, are non-defective, then by defining the norm y 2 P := y, Py = y * Py, one sees that, if y(t) solves the ODEẏ = −Cy, then resulting in the correct decay rate. The same approach works in the second case of Theorem 2.
Besides the general idea of this methodology, Arnold and Erb have given a recipe (one that was later extended in [7] to defective cases, using a time dependent matrix P) to finding the matrices P and P : Assuming that C is diagonalisable, and letting {ω i } i=1,...,n be the eigenvectors of C * , the matrix P > 0 can be chosen to be for any positive sequence {b i } i=1,...,n . The above formula remains true, for a particular choice of {b i } i=1,...,n , in the case where C is not diagonalisable. In that case we also need to augment the eigenvectors with the generalised eigenvectors. We refer the interested reader to Lemma 4.3 in [6]. Moreover, for n = 2, the case we shall need below, and C non-defective, all matrices P satisfying (24)  We now turn our attention back to the Fourier transformed Goldstein-Taylor system (22) and determine the modal Lyapunov functionals using the above recipe. A short computation, where the weights b 1 , b 2 are chosen such that both diagonal elements of P are 1, finds the following matrices (For Case III we also require b 1 = b 2 , as this minimises the number of the resulting admissible matrices P k satisfying (24).): In this case we have: -Case II: |k| = σ 2 ∈ N. As this case fosters defective eigenvalues, we will only consider the case σ = 2 (as was mentioned beforehand), and state the matrix corresponding to k = ±1 and a given fixed > 0: In this case we have: For each mode k = 0, its modal Lyapunov functional will be given by , where the matrix P k is chosen according to the above three cases. In Case II, the parameter > 0 can be chosen arbitrarily small.

Derivation of the Spatial Entropy E Â (u, v)
The goal of this subsection is twofold: Finding a modal entropy to our system, and translating it to a spatial entropy that is modal-independent. To begin with we shall define a modal entropy to quantify the exponential decay of solutions to (22) towards its steady state: Since the matrix C 0 from (23) has no spectral gap, the mode k = 0 plays a special role, and hence will be treated separately. Once found, we will want to relate that modal-based entropy to the spatial entropy E θ from Definition 1, which is not based on a modal decomposition. To this end we already remark that the off-diagonal factors ik in (28) and 1/ik in (30) correspond in physical space, roughly speaking, to a first derivative and an anti-derivative, respectively. As in §4.1 we shall distinguish three cases of σ : 0 < σ < 2 : All modes k = 0 satisfy |k| > σ 2 , and hence are of Case III. We recall from §4.1 that all modes decay here with the sharp rate σ 2 . For a modal entropy to reflect this decay, we hence have to use for each mode a Lyapunov functional û(k,t) v(k,t) is the most convenient choice. We define the modal entropy for any { u(k), v(k)} k∈Z such that u(0) = 0 as where we used the convention u(0) 0 = 0. The mode k = 0 was included since u(0, t) = u(0) = 0 and v(0, t) = v(0)e −σ t . Using Plancherel's equality, and (iv) from Lemma 2, we find that which shows why we consider the spatial entropy functional from Definition 1 in this case. We note that, since u avg (t) is conserved, part (iv) of Lemma 2, explains why we have chosen to use the anti-derivative of u, and not of v. σ > 2 : This situation is more complicated than the previous one, as we have a mixture of at least two of the aforementioned three cases: finitely many k−s in Z for which 0 < |k| < σ 2 (i.e. Case I), Case II for two k−s if σ 2 ∈ N, while the rest satisfy |k| > σ 2 (i.e. Case III). Following the above methodology to construct the modal entropy, we would need to use a combination of P , given by (28) and (30), and potentially a matrix for the defective modes. This is feasible on the modal level, but does not easily translate back to the spatial variables. It would yield a complicated pseudo-differential operator "inside" the spatial entropy.
Recalling the discussion from §4.1 we see that the overall decay rate, μ = σ 2 − σ 2 4 − 1 is only determined by the modes k = ±1. Since all the other modes decay faster, we are not obliged to use "optimal" modal Lyapunov functionals for these higher modes. This gives some leeway for choosing the matrices P k , |k| > 1. Moreover, using these "optimal" functionals will result in worsening of (i.e. enlargement of) the multiplicative constant in the L 2 hypocoercive estimation (7). Due to these reasons we will use the matrix when k = 0, which satisfies P suff ±1 = P (I ) ±1 for the crucial lowest modes. It also satisfies the following result, which implies exponential decay of all modal Lyapunov functionals

Lemma 4 Let σ > 2. Then
The proof of this lemma is straightforward 3 . Proceeding like in (32) we define the modal entropy for any { u(k), v(k)} k∈Z such that u(0) = 0: Due to (34) and (35) it is related to the spatial entropy functional from Definition 1 as Just like in the previous case, the lowest frequency modes k = ±1 control the large time behaviour. However, the matrices C ±1 are now defective, which leads to a (purely) exponential decay rate reduced by .
We proceed similarly to the case σ > 2 and define for some > 0: which satisfies P suff ,±1 = P ,±1 for the crucial lowest model. It also satisfies the following result, which implies exponential decay of all modal Lyapunov functionals û(k,t) v(k,t) 2 P suff ,k , k = 0 with rate of at least 2μ = 2(1 − ).
Proceeding like in (32) we define the modal entropy for any { u(k), v(k)} k∈Z such that u(0) = 0: Due to (34) and (36) it is related to the spatial entropy functional from Definition 1 as

The Evolution of the Spatial Entropy E Â
In the previous subsection we have shown how, depending on the value of σ , the entropies E σ , E 4 σ and E 2 ( 2− 2 ) 2+ 2 are the correct candidates to show the exponential convergence to equilibrium. A closer look at (26) shows that each modal Lyapunov functional û(k,t) v(k,t) 2 P k decays exponentially, and hence also the spatial entropy E θ . Recalling the decay rates presented in §4.3 for the three regimes of σ , confirms that we have actually already proved most of part (a) of Theorem 1. However, as our main goal is to consider these functionals in the spatial variable alone (i.e. without a modal decomposition), we shall show how one achieves the correct convergence result following a direct calculation. This will also serve as a preparation for §5.

Theorem 3 Under the same conditions of Theorem 1 with σ (x) = σ , one has that
iii) If σ = 2 then for any 0 < < 1 Proof In order to prove this theorem we shall obtain differential inequalities for E θ , from which we will conclude the desired result by a simple application of Gronwall's inequality. Using Proposition 1 we find that: If 0 < σ < 2 : Cauchy-Schwarz inequality, together with Poincaré inequality (Lemma 1) and Lemma 2, imply that Together with the fact that 2 |ab| ≤ a 2 + b 2 this shows (37), concluding the proof in this case.
If σ = 2 : Like before, the desired inequality will follow if This is valid since where we used Cauchy-Schwarz inequality, Poincaré inequality, and Lemma 2 again. The theorem is now complete.
As the last part of this section, we finally prove part (a) of Theorem 1:

Proof of part (a) of Theorem 1 The decay estimates of E θ(σ ) are already shown in Theorem 3.
To show (7) and (9) we recall that for 0 < θ < 2 and f avg = 0, according to Lemma 3. Thus, using the definition of f ∞ from (8) we see that which shows the result for the appropriate choices of θ(σ ) and μ(σ ). For σ = 2 we choose The sharpness of the decay rate for σ = 2 can be verified easily on the first mode, e.g. for With the constant case fully behind us, we can now focus on the case where σ (x) is a non-constant function.

x−Dependent Relaxation Function
The large time behaviour of solutions to the Goldstein-Taylor equation (1), or equivalently its recast form (3), becomes increasingly harder to understand, if the relaxation function, σ (x), is not a constant. However, as shown in §4, we have managed to find a potential spatial entropy that captures the exact behaviour of the decay to equilibrium. The idea that we will employ in this section is to use the same type of entropy to try and estimate the convergence rate even when σ (x) is not constant. This is, as mentioned in the introduction, a perturbative approach -yet the methodology, and ideas, are robust enough to deal with more complicated systems, as will be shown in the next section.
A fundamental theorem to establish our main result, Theorem 1 (b), is the following: Denoting by u avg = (u 0 ) avg we have that for any given 0 < α, θ < 2 the conditions α < θ, θ + α < 2σ min (38) and imply that Proof Using (18) from Proposition 1, and the fact that θ v(t) avg 2 ≥ 0, we find that The proof of the theorem will follow from the above inequality if we can show that Due to condition (38) we have that Hence, we obtain with Cauchy-Schwarz, Young's inequality |ab| ≤ a 2 θ + θ b 2 4 , and the Poincaré inequality, (13), that The above implies that (42) will be valid when which is equivalent, due to the positivity of the denominator, to (39). The proof is thus complete.

Remark 3
It is worth to note that the conditions expressed in (38) are crucial in our estimation. Indeed, they tell us that are non-negative. If one part of the condition would not be true, we would be able to "cook" initial data such that the mixed u-v-term in (42) is zero, and the above terms add up to something strictly negative -breaking the functional inequality we are aiming to attain.
The next step towards proving part (b) in Theorem 1 is to look for θ and α such that conditions (38) and (39) are satisfied.
We recall the definition of θ * from Theorem 1: which in a sense captures the "worst possible" behaviour when comparing σ (x) to the constant case (with σ = 2). We show the following:
Proof Clearly, since we always have that 0 < θ * < 2. We continue by considering condition (39), and finding appropriate parameters which will give condition (38) automatically. Denoting by for (α, θ ) that satisfy condition (38) and y ∈ [σ min , σ max ], we find that for fixed α and θ , f is an upward parabola in y whose non-positive part lies between its roots
Noticing that we conclude the proof.
As the parameter α * = γ max (θ * ) corresponds to the decay rate of our entropy according to Theorem 4, our choice of α * (σ min , σ max ) was motivated by maximising the decay rate that is possible with our methodology.
We now posses all the tools which are required to prove part (b) of Theorem 1.

Proof of part (b) of Theorem 1
The convergence estimation for E θ * u(t) − u avg , v(t) follows immediately from Theorem 4 and Lemma 6. To obtain (12) we use Lemma 3 in a similar fashion to the way we proved part (a).

Convergence to Equilibrium in a 3−Velocity Goldstein-Taylor Model
The Goldstein-Taylor model can be thought of as a simplification of the BGK equation [1,9] where the variable v is now in the discrete velocity space {v 1 , . . . , v n }, the variable x is in the torus T, and the potential V (x) is zero. The r.h.s. of the above BGK equation corresponds to a projection onto the Maxwellian M(v); in the discrete velocity case this Maxwellian is replaced by a constant matrix that determines the large time behaviour of the new model. Under the natural physical assumption of symmetry in the velocities (i.e. n i=1 v i = 0) and the expectation that the solutions will converge towards a state that is equally distributed in v and constant in x 4 , we find one potential multi-velocity extension of the Goldstein-Taylor model on T × (0, ∞): x, t) . . . x, t) . . . x, t) . . .
with the the diagonal matrix V := diag[v 1 , . . . , v n ], and the discrete velocities The matrix on the r.h.s. of (44) takes the form which has (1, 1, . . . , 1) T in its kernel, and A = (ξ 1 , . . . , ξ n ) T ∈ R n | n i=1 ξ i = 0 as its n − 1 dimensional eigenspace corresponding to the eigenvalue λ = −1. This corresponds to the conservation of total mass, and the fact that differences such as f i − f j i, j=1,...,n converge to zero. For more information we refer the interested reader to [1]. In this section we will consider a simple 3−velocity Goldstein-Taylor model, which is governed by the following system of equations on T × (0, ∞) , , Much like our Goldstein-Taylor equation, (1), we can recast the above with the variables which yields the following set of equations: The orthogonal transformation (46) has a strong geometrical reasoning behind it, as it diagonalises the appropriate "interaction matrix", Q. It is also worth to mention that much like (3), this transformations brings us to the macroscopic variables. Indeed, up to some scaling u 1 is the mass, u 2 is the flux, and u 3 is a linear combination of the kinetic energy and the mass. Following our intuition we expect that by denoting we will find that To prove this result we shall introduce an appropriate Lyapunov functional. To find this functional, we have two options, even for the simple case of constant σ (which is our base case): Proceeding as in § 4.2, we could use a modal decomposition of (47) and the (optimal) positive definite matrices P k to construct an entropy functional with sharp decay, and then rewrite it in physical space, using pseudo-differential operators. This construction, which is analogous to the construction of E θ ( f , g) from (4), can become extremely cumbersome in dimension 3 and higher.
As a simpler alternative we shall hence rather follow the strategy from [1, §4.3] and [2, §2.3]: In Fourier space, the system matrix of (47) reads as We note that, for k = 0, the hypocoercivity index 5 of C k , as well as of (47) is one, since this index is always bounded from above by the kernel dimension of the symmetric part of the generator, cf. [2]. For such index-1 problems, Theorem 2.6 from [2] shows that the choice with an appropriate λ ∈ R, always yields a (simple) Lyapunov functional for (47), typically with a sub-optimal decay rate. Much like in § 5, this guides us to the definition of our functional, expressed in the following theorem: Theorem 5 Let u 1 , u 2 , u 3 ∈ C([0, ∞); L 2 (T)) be mild real valued solutions to (47) with initial datum u 1,0 , u 2,0 , u 3,0 ∈ L 2 (T). Denoting by we have that for any θ > 0 and α > 0 such that and ⎛ ⎝ sup is equivalent to f 2 + g 2 + h 2 . Indeed, following Lemma 3 we see that Proof of Theorem 5 We start by noticing that the transformation keeps (47) invariant, so we may assume, without loss of generality, that u ∞ = 0. This, together with the equation for u 1 (x, t) implies that (u 1 (t)) avg = u 1,0 avg = u ∞ = 0.
Next, we compute the time derivatives of the L 2 norms and obtain: Continuing, we see that where we used Lemma 2. As such, together with (51), we conclude that Thus To conclude the proof it is enough to show that under conditions (49) and (50) we have that R θ,α,σ (t) ≤ 0. We will, in fact, show the stronger statement: Similarly to the techniques we have used in the proof of part (b) of Theorem 1, and using the positivity of the coefficients in the last two terms (which follows from (49)), we see that and that Thus, one sees that (54) holds when ⎛ ⎝ sup which is (50). The proof is complete.
While we have elected not to optimise the choice of α (as in §5), we can still infer the following, simpler yet far from optimal, corollary: Corollary 1 Let θ > 0 and α > 0 be such that In particular, for α := min σ min 2 , 3σ min 9σ 2 max + 1 we have that E √ 6α decays exponentially to zero with rate α.
Proof Since α < 2σ min ≤ σ max + σ min we see that implying that (σ (x) − α) 2 ≤ σ 2 max for any x ∈ T. Using this with additional elementary estimation on the denominator of the expressions that appear in (50), we see that (49) and (50) are valid. As such the first statement of the corollary follows from Theorem 5. To show the second part of the corollary we notice that with the choice θ α := √ 6α and α ≤ σ min giving us (49). Using the inequalities 8σ min − 4 2 3 θ α − 4α ≥ 2σ min , and 2σ min − α ≥ 3 2 σ min for the l.h.s. of (55), we see that Thus, since 2 3 θ α − α = α, the desired condition (55) is valid when which concludes the proof.
Funding Open Access funding provided by University of Graz.
show the non-positivity of an appropriate "remainder". The use of these two inequalities is somewhat crude (yet due to that, quite general), and one can imagine that replacing these two estimations with an inequality that is more L 2 based would improve the range of validity of the theorem. One idea that comes to mind is a weighted Poincaré inequality, i.e. an inequality of the form for a given weight ω(x) ≥ 0 and constant C. Denoting by 6 we can replace condition (39) of Theorem 4 with the improved condition (58) will be explicitly derived in §1. From this point onwards the appendix will proceed as follows: First we will show how one can find the optimal weighted Poincaré constant, and compute it in some simple cases, which we will then use in the case were σ (x) is given by (56) to obtain an improvement of our current rate of convergence to equilibrium. Next we will compute the optimal rate given by [8], and conclude a lack of optimality by comparing the rate we achieved in our main theorem, the improved rate we have found, and the optimal rate of [8].

Weighted Poincaré Inequality
The problem of finding a weighted Poincaré inequality and its associated sharp constant can be recast as a constrained variational problem. We define the functional where H 1 (T) is the Sobolev space of real valued periodic functions, by and denote by Even though the minimization set is not convex, standard techniques from Calculus of Variation (see for instance [14,Sect. 8]) show that if ω is bounded then the infimum is attained (the conditions on ω can be weakened).
One can easily check that in that case Finding a minimiser to the problem (59) amounts to solving the following constrained Euler-Lagrange equation on T considered in weak form, with two Lagrange multipliers λ > 0 and τ ∈ R. Integrating (60) against u shows that which we will use shortly.
Since ω ∈ L ∞ (T), we find that u ∈ H 2 (T) → C 1 (T). When ω is piecewise constant, the ODE (60), now in strong form, can be solved explicitly. This shows that in these cases the minimiser of F is actually unique.
This is how we can find c min , and consequently C ω , explicitly (numerically in many cases).

Improved Methodology
We return now to the proof of the differential inequality (41) that governs the evolution of E θ , which is essentially based on the estimate (43). Choosing θ * = 1 and σ (x) as in (56) and using the weight which appears in the penultimate line of (43), we see that by using the previously discussed weighted Poincaré inequality instead of the last step of (43), we obtain from (41) and (43): We will maximise the decay rate α, satisfying (so that the second term in (64) is non-positive) by a processes of iteration: Guessing the starting value α 0 := α * (1, 4) = 2 2 − √ 3 (the rate one obtains from our main Theorem 1, cf. (11)) we follow the process described in the previous subsection and find the weighted Poincaré constant C 2 ω α 0 = 1.12013..., which indeed satisfies (65). We proceed and create a sequence {α n } n∈N 0 , defined recursively, so that each α n improves upon the previous step by taking its "optimal" value, i.e. , n ∈ N, as long as (65) is still satisfied for this choice. A change of α implies a change of our weight function ω α (x), yet these new weights are still of the form given in our previous subsection. As such we are able to compute the appropriate C ω αn s, and to show that this sequence converges to the improved decay rate 7 α max ≈ 0.7234.