Quantitative Rates of Convergence to Non-equilibrium Steady State for a Weakly Anharmonic Chain of Oscillators

We study a 1-dimensional chain of N weakly anharmonic classical oscillators coupled at its ends to heat baths at different temperatures. Each oscillator is subject to pinning potential and it also interacts with its nearest neighbors. In our set up both potentials are homogeneous and bounded (with N dependent bounds) perturbations of the harmonic ones. We show how a generalised version of Bakry–Emery theory can be adapted to this case of a hypoelliptic generator which is inspired by Baudoin (J Funct Anal 273(7):2275-2291, 2017). By that we prove exponential convergence to non-equilibrium steady state in Wasserstein–Kantorovich distance and in relative entropy with quantitative rates. We estimate the constants in the rate by solving a Lyapunov-type matrix equation and we obtain that the exponential rate, for the homogeneous chain, has order bigger than N-3\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$N^{-3}$$\end{document}. For the purely harmonic chain the order of the rate is in [N-3,N-1]\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$ [N^{-3},N^{-1}]$$\end{document}. This shows that, in this set up, the spectral gap decays at most polynomially with N.


Description of the Model
We consider a model for heat conduction consisting of a one-dimensional chain of N coupled oscillators. The evolution is a Hamiltonian dynamics with Hamiltonian where ( p, q) belong in the phase space R 2N and q 0 , q N +1 describe the boundaries which here are considered to be fixed: q 0 = q N +1 = 0. We denote by q = (q 1 , . . . , q N ) ∈ R N the displacements of the atoms from their equilibrium positions and by p = ( p 1 , . . . , p N ) ∈ R N the momenta. Each particle has its own pinning potential U pin and it also interacts with its nearest neighbors through an interaction potential U int . Notice that here all the masses are equal and we take them m i = 1. So we consider a homogeneous chain, where both the masses and the potentials that act on each oscillator, are the same. The classical Hamiltonian dynamics is perturbed by noise and friction in the following way: the two ends of the chain are in contact with heat Langevin baths at two different temperatures T L , T R > 0. So our dynamics is described by the following system of SDEs: where γ i are the friction constants, T i are the two temperatures and W 1 , W N are two independent normalised Wiener processes. The dynamics (1.1) is equivalently described by the following Liouville equation on the law of the process where L is the second order differential operator which is the generator of the semigroup P t acting on the space C 2 b (R 2N ) of bounded realvalued, C 2 functions on the phase space. We denote by L * the generator of the dual semigroup that acts on probability measures.

State of the Art
The model described by the SDEs (1.1), was first used to describe heat diffusion and derive rigorously Fourier's law (for an overview see [8,14,27] and [17]). Since then, it has been the subject of many studies, both from a numerical and from a theoretical perspective. First, the purely harmonic case with several idealised reservoirs at different temperatures has been solved explicitly in [35]. In this paper the authors found exactly how the non-equilibrium stationary state looks like: it is Gaussian in the positions and momenta of the system. For the anharmonic chain there are no explicit results in general. However it has been studied numerically for many different potentials and many kinds of heat baths, including the Langevin heat baths that we consider here. See for instance [2,20,28] and references therein.
There are two facts in this model that make its rigorous study very challenging: first of all, we do not know explicitly the form of the invariant measure of (1.1) and also our generator is highly degenerate, having the dissipation and noise acting only on two variables of momenta at the end of the chain. It is not difficult to see, though, that in the equilibrium case, i.e. when the two temperatures are equal T L = T R = T = β −1 , the stationary measure is the Gibbs-Boltzmann measure dμ( p, q) = exp(−β H ( p, q))d pdq: after explicit calculations we have L * e −β H ( p,q) = 0.
Since we are interested in the theoretical aspects of the model, we refer to [15,16], which is the first rigorous study of the anharmonic case. The existence of a steady state has only been obtained in some cases where the potentials act like polynomials near infinity. In particular under the following assumptions on the potentials: lim λ→∞ λ −k U (λq) = a k |q| k and lim λ→∞ λ 1−k U (λq) = ka k |q| k−1 sign(q) for constants a k > 0, where for the interaction: k ≥ 2 and for the pinning k ≥ 1 (the exponent k for the pinning was improved in [9]) and assuming that the interaction potential is at least as strong as the pinning, the existence and uniqueness of an invariant measure was first proved in [15] using functional analytic methods. In particular it was proved that the resolvent of the generator of (1.1) is compact in a suitable weighted L 2 space. Later it was proved in [34] that the rate of convergence to the steady state is exponential using probabilistic tools. Note that in the above-mentioned papers, the coupling of the chain with the heat baths is slightly different and a bit more complicated than considering Langevin thermostats, with physical interpretation: the model of the reservoirs is the classical field theory given by linear wave equations with initial conditions distributed with respect to appropriate Gibbs measures at different temperatures, see also [33,Sect. 2]. Later, an adaptation of a very similar probabilistic proof was provided in [9] for the Langevin thermostats. The difference with the Langevin heat baths is that the dissipation and the noise act on the momenta only indirectly through some auxiliary variables. Finally let us mention that the relaxation rates have been studied for short chains of rotors with Langevin thermostats in [11,13].
Regarding the existence, uniqueness of a non-equilibrium stationary state and exponential convergence towards it in more complicated networks of oscillators (multi-dimensional cases) see [12]. The proofs there are inspired by the above-mentioned works in the 1-dimensional chains.
There are also cases where there is no convergence to equilibrium, when for instance l > k, i.e. when the pinning is stronger than the coupling potential, see for example [21,22]. In [22] the resolvent of the generator fails to be compact or/and there is lack of spectral gap, under some scenarios included in l > k. In particular, when the interaction is harmonic, 0 belongs in the essential spectrum of the generator as soon as the pinning potential is of the form |q| k for k > 3. The conjecture is that this is true as soon as k > 2n 2n−1 if n is the center of the chain.

Notation
denote the elements of the canonical basis in R n and | · | to denote the Euclidean norm on R n , from the usual inner product ·, · . For a square matrix A = (a i j ) 1≤i, j≤n ∈ R n×n , we write A 2 for the operator (spectral) norm, induced by the Euclidean norm for vectors : We also write A 1/2 for the square root of a (positive definite) matrix A, i.e. the matrix such that A 1/2 A 1/2 = A, for A 1/2 a positive definite matrix as well. Moreover, by C ∞ b (R n ) we denote the space of the smooth and bounded functions, by ∇ z we denote the gradient on z-variables in a metric space X with respect to the Euclidean metric. We write P 2 (R n ) for the space of the probability measures on R n that have second moment finite, i.e.
[N ] denotes the set {1, 2, . . . , N } and we use the notation g(x) O f (x) to indicate that there is a dimensionless constant C > 0 so that |g(x)| ≤ C| f (x)|.

Set Up and Main Results
Let us state two assumptions: one on the boundary conditions of the chain and one on the potentials.
• (H1) Regarding the boundary conditions, we consider the oscillators chain with rigidly fixed edges: the left boundary of the chain is an oscillator labelled 0 and the right is an oscillator labelled N + 1 under the hypothesis that q 0 = q N +1 = 0. The first and the last particle are pinned with additional harmonic forces, corresponding to their attachment to a wall. Note that these boundary conditions and heat baths modelled by two Ornstein-Uhlenbeck processes at both ends as explained above, is the same model as in [35] and is known as the Casher-Lebowitz model, since it is also one of the models considered in [10]. 1 • (H2) The chain is weakly anharmonic: both pinning and interaction potentials differ from the quadratic ones by perturbing potentials U N pin , U N int ∈ C 2 (R) with bounded Hessians in the following sense: and C 0 is a dimensionless constant.
Under Assumptions (H1) and (H2) for a ≥ 0, c > 0, the Hamiltonian takes the form and denoting by L the infinitesimal generator, we look at the Liouville equation ∂ t f = L * f , where the generator of the dynamics now is where we take all the friction constants equal γ 1 = γ N = γ , for the two temperatures T L , T R we assume that they satisfy T L = T + T , T R = T − T , for some temperature difference T > 0. Also, B is the symmetric tridiagonal (Jacobi) matrix It is convenient to see the above form of the generator in the following block-matrix form: where z = ( p, q) T ∈ R 2N , (q) corresponds to the perturbing potentials so that and M in blocks is the following where I is the identity matrix, so that it corresponds to the transport part of the operator, while B and correspond to the harmonic part of the potentials and the drift from both ends, respectively.
Motivation. This study is motivated by a discussion opened in C. Villani's memoir on hypocoercivity, see Sect. 9.2 in [40], concerning open questions on the heat conduction model as defined above, and how to approach them by hypocoercive techniques. This chain of coupled oscillators corresponds to a hypocoercive situation, where the diffusion only at the ends of the chain leads to a convergence to the stationary distribution exponentially fast, under the following assumptions on the potentials: strict convexity on the interaction potential (being stronger than the pinning one) and bounded Hessians for both potentials. In particular, he points out that it might be possible to recover the previous results of exponential convergence in the weighted H 1 (μ)-norm for this different class of potentials (than the potentials assumed in [16] for instance) by applying a generalised version of Theorem 24 in [40]. For that, one needs to know some properties of the, non-explicit, non-equilibrium steady state μ: for instance, if it satisfies a Poincaré inequality or if the Hessian of the logarithm of its density is bounded. Finally we note that entropic hypocoercivity has been applied in [29] in order to develop estimates and to get quantitative convergence results to the limit equation, for anharmonic chains but with thermostats in contact with all the particles along the chain.
Main results. Here, considering a perturbation of the harmonic chain (homogeneous case), instead we follow an approach that combines hypocoercivity techniques and the Bakry-Émery theory of calculus and curvature conditions as in [4]. We prove the validity of the Bakry-Émery criterion in a modified setting. This is explained in more details and is implemented in Sect. 3. The whole idea was inspired by Baudoin in [6]: using this combination, Baudoin proved exponential convergence to equilibrium for the Kinetic Fokker-Planck equation in H 1 -norm and in Kantorovich-Wasserstein distance.
Thus we show, for the dynamics (1.1) as well, exponential convergence to the stationary state in Kantorovich-Wasserstein distance and in relative entropy and we get quantitative rates of convergence in these distances, i.e. we obtain information on the N -dependence of the rate. In particular our estimates show that the convergence rate in the harmonic chain approach 0 as N tends to infinity at a polynomial rate with order between C 1 /N 3 and C 2 /N and that the scaling of the rate is bigger than C 3 N −3 in the weakly anharmonic chain.
In order to quantify the above rates, we estimate b N 2 , where b N is a block matrix defined in Sect. 3 as a solution of a matrix equation, (1.10). Since b N 2 appears in the rates in the Theorems 1.4, 1.6 and the Proposition 1.2, we start by stating this result: (1.9), with pinning and interaction coefficients a ≥ 0, c > 0. For all N ∈ N, there exists a unique symmetric positive definite block matrix b N (1.10) Moreover there exists C a,c > 0, that depends only on the coefficients a, c, such that for all Second, we state the following Proposition, that is restricted to the harmonic chain, and provides us with a lower bound on the spectral gap (given the estimates on b N 2 by Proposition 1.1): Proposition 1.2 (Lower bound on the spectral gap of the harmonic chain) For the spectral gap ρ of the chain described by the generator (1.8) without the perturbing potentials (the harmonic chain), which is given by the relation we have the following property: there exists κ > 0 such that for all N ∈ N, This lower bound is in fact the optimal rate in the case of the harmonic homogeneous chain. In the work [7, Proposition 9.1] an upper bound is provided as well and thus the scaling of ρ is exactly N −3 . This is done by exploiting the form of the matrix M, (1.9), and more specifically using information on the spectrum of the discrete Laplacian. In [7] we study also the case of disordered chains by considering different pinning coefficients for each oscillator. Compared to the homogeneous case, as in this paper, where the decay is polynomial, in a disordered chain the spectral gap decays at an exponential rate in terms of N . Regarding the adaptation of the generalised Bakry-Emery theory presented in this paper to a non-homogeneous scenario, we can prove existence of a spectral gap for the weakly anharmonic chain as soon as the matrix M has a spectral gap (and this is the case as soon as all the interaction coefficients c i = 0). The difficulty in a non-homogeneous scenario will be the second part (as described in the Sect. 2): to solve the high-dimensional matrix equation (1.10) in order to estimate the spectral norm. Remark 1. 3 We expect the bound on the b N 2 , from Proposition 1.1, to be optimal, since from the proof of Proposition 1.2 combined with [7, Proposition 9.1]: there exist c 1 > 0, such that In the following, we consider b N as given by Proposition 1.1. Before we state the first main Theorem, we recall the definition of the Kantorovich-Rubinstein-Wasserstein L 2 -distance W 2 (μ, ν) between two probability measures μ, ν: where the infimum is taken over the set of all the couplings, i.e. the joint measures π on R N × R N with left and right marginals μ and ν respectively. It is easy to see that W 2 is indeed a metric. We restrict ourselves on the subspace P 2 (R 2N ), where μ and ν have second moments finite, so that their distance W 2 (μ, ν) will be finite. For more information on this distance we refer the reader for instance to [41] and references therein. Theorem 1. 4 We consider a chain of coupled oscillators whose dynamics are described by the system (1.1) under Assumptions (H1) and (H2). For a fixed number of particles N , there is a unique stationary state f ∞ , in particular, for initial data f 1 0 , f 2 0 of the evolution equation, we have the following contraction property: for C a,c , λ 0 dimensionless constants.
Moreover, in the set up of Theorem 1.4, we get some qualitative information about the non-equilibrium steady distribution, like the validity of a Poincaré inequality and even better, a Log-Sobolev inequality:

Proposition 1.5 (Log-Sobolev inequality) Let T be the quadratic form
Under Assumption (H2), the unique invariant measure μ = f ∞ from the Theorem 1.4 satisfies a Log-Sobolev inequality (L S I (C N )) : where γ, T L , C a,c , λ 0 := λ 0 (C 0 ) are all dimensionless constants with the prefactor in (1.5), C 0 , to satisfy C 0 < min(1, 2T R )C −2 a,c . Consequently we have convergence to the non-equilibrium steady state in Entropy. Let us first define the following information-theoretical functionals. For two probability measures μ and ν on R 2N with ν μ, we define the Boltzmann H functional h log h dμ, ν = hμ (1.13) and the relative Fisher information (1.14) We have entropic convergence in the following sense, as in [40,Sect. 6]: Theorem 1. 6 We consider a chain of coupled oscillators whose dynamics are described by the system (1.1) under Assumptions (H1) and (H2). For a fixed number of particles N , assuming that (i) μ is the invariant measure for P t and (ii) that it satisfies a Log-Sobolev inequality with constant C N > 0, for all f > 0 with we have a convergence to the non-equilibrium steady state in the following sense: for dimensionless constants λ a,c , λ 0 .
From Theorem 1.4 we get an exponential rate of order bigger than N −3 for the weakly anharmonic chain. In the purely harmonic case, we have that the convergence rate is between C 1 N −3 and C 2 N −1 for some constants C 1 , C 2 that are independent of N .

Remark 1.7
Note that a generalised version of calculus has been applied for a toy model of the dynamics (1.1) by Monmarché, [31]: working with the unpinned, non-kinetic version, with convex interaction and given that the center of the mass is fixed, he proves the same kind of convergences and ends up with explicit and optimal N -dependent rates, of order O(N −2 ), for the overdamped dynamics.

Plan of the Paper
Sections 2, 3, 4 and 5 concern the proofs of the convergence to the steady state by hypocoercive arguments (applying the generalised Bakry-Emery criterion) while Sect. 6 is devoted to estimating the spectral norm of b N , which is crucial in the final estimate for the scaling of the spectral gap. In particular, Sect. 2 contains an introduction to Bakry-Emery theory and an explanation of the method that is used. In Sect. 3 we obtain the estimates that lead to the proof of Proposition 1.5. In Sects. 4 and 5 we give the proof of Theorems 1.6 and 1.4 respectively. Finally in Sect. 6 we prove Propositions 1.1 and 1.2.

Introduction to Carré du Champ Operators
Consider a Markov semigroup P t with at least one invariant measure μ and infinitesimal generator L : D(L) ⊂ L 2 (μ) → L 2 (μ). Here we restrict ourselves to the case of the diffusion operators and we associate with the operator L, a bilinear quadratic differential form , the so-called Carré du Champ operator, which is defined as follows: for every pair of functions In other words measures the default of the distributivity of L. Then we define its iteration 2 , where instead of the multiplication we use the action of : From the theory of -calculus we have that a curvature condition of the form for all f in a suitable algebra A dense in the L 2 (μ)-domain of L and λ > 0 is equivalent to the following gradient estimate where P t is the semigroup generated by L. The uniqueness of the invariant measure then follows from the contraction property in W 2 distance (which is equivalent to the gradient estimate above thanks to Kuwada's duality, see [26] or Theorem 4.1 later on). This also implies a Log-Sobolev inequality (and thus a Poincaré inequality), see [4] or [3,Sect. 3].
Attempt to apply the classical theory to the generator L given by (1.8): For the generator of the dynamics (1.1), given by (1.8), we can not bound 2 by from below. Explicit calculations give Since we can not control the terms ∂ p i f ∂ q i f , we can not bound 2 from below by . In cases like this, we say that the particle system has −∞ Bakry-Emery curvature.

Description of the Method
In order to overcome this problem, we are doing the following: (1) First we modify the classical theory: we define a new quadratic form, different, but equivalent, to the |∇ z f | 2 that will play the role of the functional. This will spread the noise from p 1 and p N to all the other degrees of freedom as well. The general idea comes from Baudoin [6]. We make a suitable choice of a positive definite matrix, b N ∈ R 2N ×2N , to define a new quadratic form that will replace the functional, so that we obtain a 'twisted' curvature condition: an estimate of the form (2.3). This implies also a modified gradient estimate, and thus a Poincaré and Log-Sobolev inequality. We choose this matrix to be the unique solution of a Lyapunov equation with positive definite r.h.s.: In general in order to deal with a hypocoercive situation in H 1 -setting, one can perturb the norm to an equivalent norm, so that exponential convergence results can be deduced with this new norm. The idea is originally due to Talay in [38] and it was later generalised by Villani in [40]. Then one can have convergence in the usual norm thanks to their equivalence. Here, instead of the norm, we modify the gradient and thus the Carré du Champ, and work with a generalised -theory.
The idea of working with the matrix that solves the above-mentioned Lyapunov equation came from the fact that (i) we need to control from below the quantity b N M + M T b N and (ii) in the linear chain, the covariance and determines the stationary solution of the corresponding Liouville equation. Therefore, tackling the hypoellipticity problem, i.e. spreading the dissipation to all the degrees of freedom, corresponds to working with a Lyapunov equation with positive definite r.h.s. A way to think of it is as a sequence of Lyapunov equations: so that in each step we add a positive entry in the diagonal of the r.h.s. from both sides. This corresponds to spreading the noise and dissipation to the next oscillator from both ends until the center of the chain, like the commutators would do in a classical hypoelliptic setting, see also Fig. 1. So in the last step we have N > 0 which corresponds to having spread the noise everywhere in the space. This allows us to prove the validity of the generalised Bakry-Emery criterion (3.4), which is the key estimate in order to have exponential convergence to the non-equilibrium steady state.
(2) In order to make our estimates quantitative, we estimate the spectral norm of the matrix b N and its inverse. Regarding the bound on the norm of b N , we estimate its entries using that it solves the Lyapunov equation, while for the norm of b −1 N , we compare it to the norm of b −1 0 which is uniformly bounded in N . This corresponds to the proof of Proposition 1.1 which is the subject of Sect. 6.
For those familiar with Hörmander's method we describe briefly here the similarity with the spreading of dissipation-mechanism: in Hörmander's theory the smoothing mechanism is the one transferred through the interacting particles inductively by the use of commutators: the generator has the form Then [∂ p 1 , X 0 ] = −∂ p 1 +∂ q 1 . Now commuting ∂ q 1 with the first order terms of the generator: the smoothing mechanism' to p 2 . Continuing like that, commuting the 'new' variable with the first order terms of L, inductively we cover all the particles of the chain.

Functional Inequalities in the Modified Setting
In order to apply a 'twisted' Bakry-Emery machinery, introduced by Baudoin in Sect. 2.6 of [6], we work with the positive definite matrix b N chosen to be the solution of the Lyapunov equation (1.10). The following Proposition gives us existence of such a solution. Proof It is a matrix reformulation of a well known and classical result of Lyapunov that can be found for instance in [18, p. 224] or [30,Sect. 20].
The eigenvalues of M have strictly positive real part ([25, Lemma 5.1]) and the right hand side of (1.10) is positive definite. Therefore there exists a positive solution of (1.10). Also, we can easily see that the solution is given by the formula We define the following quadratic quantity for f , g ∈ C ∞ (R 2N ), Then we consider the functional Here T ( f , f ) is always positive since b N ≥ 0 (and in fact positive definite since b N > 0: this is proven in the last part of the proof of Proposition 1.1). In contrast with the original operator , our modified quadratic form T is related to L only indirectly through the different steps of commutators.
We have an equivalence of the following form between T and |∇ z | 2 : Combining this with the conclusion of Proposition 1.1, we write Proposition 3.2 With the above notation, under Assumption (H2), for all N ∈ N there exists constant such that for f ∈ C ∞ (R 2N ), We use the form of the generator L as in (1.8): where is the function that corresponds to the perturbing potentials. We write About the (−z T M∇ z ) -part of L, the last equation of the above formula gives and finally regarding the second order terms of the generator we end up with We write the second and third term of the last equation as N ∇ z f ) and then from the boundedness assumption on the operator norms of the Hessians for both perturbing potentials and the Lyapunov equation (1.10), we get the following We conclude by gathering the terms.
The assumption (H2) combined with the conclusion of the Proposition 1.1 ensures us that λ N is positive, by choosing suitable pre-factors, as we do in the proofs of the main Theorems 1.4 and 1.6. We state now the following lemma that gives the 'twisted' gradient bound.

Lemma 3.3 (Gradient bound) Under Assumption (H2), for all N
for λ N given by Proposition 3.2.
Proof We shall first present a formal derivation of the estimate (3.5).
Since from the semigroup property we have d ds P s = LP s = P s L, by differentiating and using the above inequality we get , by Grönwall's lemma we get the desired inequality for every smooth and bounded function f .
First we take W ( p, q) = 1 + |p| 2 + |q| 2 as a Lyapunov structure that satisfies the following conditions: W > 1, LW ≤ CW , the sets {W ≤ m} are compact for each m, and T (W ) ≤ CW 2 . This W satisfy the conditions thanks to the bounded-Hessians assumption, i.e. |∇(U N int + U N pin )| will be Lipschitz. In particular, for the inequality LW ≤ CW using Cauchy-Schwarz and Young's inequalities, we write Then L n has compact support in K n := {W ≤ 2n}, in the sense that it is 0 outside of it, due to the definition of h n . Let P n t be the semigroup generated by L n , which is given as the unique bounded solution of Then we also have that for every bounded f ∈ L ∞ (R 2N ), pointwise We do the 'interpolation semigroup argument' as before for L n and for for fixed t > 0, n ≥ 1 applied to a fixed point ( p, q) in the support inside the set {W ≤ n}. It is true, due to the properties of W , that T (P n t f , P n t f ) ≤ C f ,t with C f ,t independent of n and so we have a bound on T (P n t f , P n t f ) uniformly on the set {W ≤ n}. Indeed with C 1 constant independent of n. About the last term: with C 2 > 0 some constant again independent of n (from the assumptions on the Lyapunov functional W ). Therefore Combining this last estimate with the above bounds we end up with the differential inequality is again independent of n. We multiply both sides with e (2|λ N |+2)s so that the above inequality implies or equivalently, after integrating both sides in time from 0 to t, that which gives the boundedness of T (P n from the above bound we have that This comes from the formula Now C does not depend on n (from before), so passing to the limit we have and so T (P t f , P t f ) is also bounded. Now we can repeat the standard Bakry-Emery calculations as in the beginning of the proof.

Remark 3.4
Note that using the equivalence of T and |∇ z | 2 : we get the following L 2 -gradient estimate Once we have a curvature condition of the form (3.4) we are also able to show that the stationary measure satisfies a Poincaré inequality.
We denote by the Carré du Champ operator defined in (2.1). By differentiating we have where in the first inequality we used that for the second we used the gradient bound from Lemma 3.3 and just right after that, the semigroup property. The last line can be rewritten like Now letting t to go to ∞, thanks to the ergodicity, we have the desired inequality.
In fact it is possible to show a stronger pointwise gradient bound, that we exploit for the proof of a Log-Sobolev inequality for the invariant measure of the dynamics.

Proposition 3.6 (Strong gradient bound)
Remark 3.7 This is a better estimate than (3.5) in Lemma 3.3 because of Cauchy-Schwarz inequality.
Proof The rigorous justification, i.e. boundedness of T (P t−s f , P t−s f )), of the following formal calculations is exactly like in the proof of Lemma 3.3.
Here for f ∈ C ∞ c (R 2N ), and for fixed t ≥ 0, (p, q) ∈ R 2N , instead we define We denote by g = P t−s f , we differentiate and perform the standard calculations we have where in the first equality we used that g .
In the first inequality we used the formula where is the Carré du Champ operator defined in (2.1), and that T and ∂ p 1 obviously commute. Now from Grönwall's lemma we get This pointwise, strong gradient bound implies a Log-Sobolev inequality.

Proof of Proposition 1.5 For
for fixed s ∈ [0, t] evaluated at a fixed point in the phase space. We denote by the Carré du Champ operator defined in (2.1) and following again Bakry's recipes, we get where for the second inequality we used the bound from Proposition 3.6, while for the last inequality we applied Jensen's and the fact that the function y 2 /x is convex for x, y positive. Now integrating from 0 to t, we get Letting t → ∞ and thanks to the ergodicity of the semigroup, we get the LSI with constant corresponding to the constant with the non-perturbed Fischer information. Therefore, applying the estimates from Proposition 1.1 we have where C 0 is the constant in (1.5) which we choose small enough, i.e. to satisfy

Convergence to Equilibrium in Kantorovich-Wasserstein Distance
We use that the gradient estimate (3.6) is equivalent to an estimate in Wasserstein distance (Kuwada's duality [26]). More specifically, we have the following Theorem, here stated only in the Euclidean space with the Lebesgue measure (R 2N , |·|, λ) and only for the Wasserstein-2 distance: where this estimate is associated with the Lipschitz norm defined just above.
Since λ N , as defined in (3.3), is: by exploiting the estimates on b N 2 and b −1 N 2 from the Proposition 1.1 we quantify the rate: This gives us the statement of the Theorem: Finally, for the uniqueness of the stationary solution f ∞ , we see that all the solutions f t will converge towards it if we make the choice f 2 0 = f ∞ .

Entropic Convergence to Equilibrium
If μ is the invariant measure of the system, we prove here convergence to the stationary state in Entropy as stated in Theorem 1.6: first with respect to the functional and then using the equivalence of Proof of Theorem 1. 6 We consider the functional

(s) = P s P t−s f log P t−s f + P s P t−s f T (log P t−s f , log P t−s f )
and by differentiating and repeating similarly the steps from the Propositions 3.6 and 1.5 we end up with where we have used that for the second inequality and in the last inequality we used the bound (3.4). We introduce a constant η on which we will optimise later, we integrate against the invariant measure μ and we apply the Log-Sobolev inequality from Proposition 1.5: Finally, from Grönwall's inequality we have or equivalently the desired convergence, thanks to the invariance of the measure. Since we have that the exponential rate is indeed of order λ N (as in the convergence in Theorem 1.4): Since T and |∇ z | 2 are equivalent, see (3.2), we get the above convergence in the non-perturbed setting with equivalence-constant max 1, b −1 In particular, both the Boltzmann entropy H μ (P t f μ), given by (1.13), and the Fisher information I μ (P t f μ), given by (1.14), decay: Thus, combining with the conclusion of Proposition 1.1, the denominator is of order 1 with the dimension, and, as in the proof of Theorem 1.4, λ N ≥ λ 0 N −3 and we conclude.

Remark 5.1 (i)
The rate of the convergence to the stationary state, λ N , does not depend on the difference of the temperatures T : under the assumption (H2) we get existence of spectral gap for all T , since the twisted curvature condition from Proposition 3.2 sees only the first order terms of the generator. The scaling of λ N relies on the result of the Proposition 1.1 and we can see through its proof that it is not affected by T . Therefore, the same scaling holds in the equilibrium case T = 0 as well. (ii) Regarding the boundary conditions: Assumption (H1) is not necessary in order to obtain existence of a spectral gap with a lower bound N −3 . In fact, we have spectral gap as soon as there is a solution to the matrix equation (1.10), which is the case when a ≥ 0, c > 0 (see Proposition 3.1). Therefore, the proof of Proposition 1.1 still holds, with minor differences, when we consider the following b.c. as well (free in a sense): q 0 = q 1 , q N = q N +1 . Note that for the harmonic chain, this is suggested by numerical simulations similar to Fig. 2 (iv) A convergence to equilibrium in total variation norm for a similar small perturbation of the harmonic oscillator chain, has been shown recently in [32]. There, a version of Harris' ergodic Theorem was applied making it possible to treat more general cases of the oscillator chain with different kind of noises, as well. However, this is a non-quantitative version of Harris' Theorem, which provides no information on the dependency of the convergence rate in N .

Estimates on the Spectral Norm of b N
First, let us state the following Proposition on the optimal exponential rate of convergence for the purely harmonic chain. Moreover the spectral gap approaches 0 as N goes to infinity as follows: for some constant C independent of N .
Proof We exploit the results by Arnold and Erb in [1] or by Monmarché in [31, Proposition 13]: working with an operator of the form under the conditions that (i) no non-trivial subspace of Ker(F ) is invariant under M and (ii) the matrix M is positively stable, i.e. all the eigenvalues have real part greater than 0, then the associated semigroup has a unique invariant measure and if ρ > 0, then for the exponential rate λ H N of the above Ornstein-Uhlenbeck process we have for every ∈ (0, ρ). Fix such an > 0 and conclude the first statement of the Proposition.
In particular, when m is the maximal dimension of the Jordan block of M corresponding to the eigenvalue λ such that Re(λ) = ρ, the quantity (1 + t 2(m−1) )e −2ρt is the optimal one regarding the long time behaviour, [31]. This implies that the spectral gap of the generator is ρ − , whereas the constant in front of the exponential is where L i,i+1 are uniquely determined by the quadratic form We will use this information in the last part of the proof of Proposition 1.1, to bound the spectral norm of the inverse, b −1 N 2 .
The rest of this section is devoted to the study of the solution of the matrix equation (1.10). Note that [35,36] are two other cases where a Lyapunov equation is explicitly solved in order to study the thermal transport in atom harmonic chains. The right hand side of the equation in the two above-mentioned cases is much simpler though, therefore it is easier to provide an analytical formula which represents the unique solution as in [36].

Matrix Equations on Lyapunov Equation
and where From (6.7) and considering that x m and y m are symmetric matrices, we get From that we get (6.2) and (6.3) directly, and also that: and by applying (6.2) to (6.8) we get (6.4). Also, using that x m and y m are required to be symmetric matrices, from the transposed version of (6.3), we get the equation From now on, we perform all the calculations when the dimension of the block matrices, N , is odd. The same calculations with minor differences hold when N is even as well.

Calculations for m = 0, 1, 2
Before we start analysing the form of the block z N , we first present how each unit in the right hand side of the Lyapunov equation has been computed in [35], where they found exactly the elements of z 0 := (z (0) i j ) 1≤i, j≤N when a = 0, c = 1, to be for α constant such that cosh(α) = 1 + 1 2γ . (It was done in the same manner with [43,Sect. 11] but there the case was T = 0). Here we describe briefly the steps: first we notice that z 0 is antisymmetric since in (6.2) J (0) 0 = 0, and second, by (6.4) we get that it has a Toeplitz-form Indeed note that the r.h.s of (6.4) forms a bordered matrix ⎡ i.e. only the bordered elements are non zero and so the l.h.s of (6.4) should also have this bordered form. Due to the tridiagonal form of B we get a Toeplitz matrix: in particular using that B = −c N + a I , the l.h.s of (6.4) is and equating the non-boundary entries, due to the symmetry of N and the antisymmetry of z 0 , we have that the elements of z 0 will be constant along the diagonals: indeed, for 1 < i < N , for the diagonal's entries of the Eq. (6.11) we have −cz For the superdiagonal's entries of the Eq. (6.11) We repeat these calculations through all the non-boundary entries of the matrix, and using the information we get from each one calculation, we end up with the Toeplitz form of z 0 in (6.10).
We can now see that a solution to (6.6) is a symmetric Hankel matrix which is antisymmetric about the cross diagonal and such that (y 1, j+1 . Then we apply (6.3) to get a formula for the entries of x 0 and from the bordered entries of x 0 from (6.4), we end up with the linear equation Here z 0 , e 1 ∈ C N −1 are the vectors z 0 = (z (0) (1, 0, . . . , 0) T and K 0 is a (N − 1) × (N − 1) symmetric Jacobi matrix whose entries depend on the (dimensionless) friction constant γ and interaction constant c: We solve the above equation using for example Cramer's rule and we find an explicit formula for the z (0) 1, j 's: the recurrence formula of the determinant of K 0 is the same formula of the Chebyshev polynomials of the second kind, so using properties of these polynomials and imposing appropriate initial conditions we end up with the form (6.9).
For m ≥ 1 we use again the Eq. (6.4). In the first step we get that: For m = 1, i.e. for the form of the z 1 -block in b 1 , the elements z (1) 1,1 , z (1) N ,N in the main diagonal are −1/2. The difference with the m = 0 step is that z 1 is not antisymmetric anymore, since 1/2 is added in the first entry of the diagonal (due to the form of J 1 ). So from (6.2) we write But we still have the bordered form in the r.h.s. of (6.4), so we still have a Toeplitz-form for z 1 .
In the next Lemma we give the form of the z 2 block of b 2 .  (2) i,i = 0 otherwise z (2) 1,2 + z (2) N ,N −1 = 2 1+a+2c 4c , z The last property is that the Toeplitz form is not perturbed in more than 2 diagonals away from the centre. So we denote by μ a,c := 1+a+2c 4c and we write: Proof of Lemma 6.4 z 2 is not antisymmetric but from (6.2) we immediately have that z 2 = z anti 2 − J 2 , where z anti 2 is antisymmetric. So we work with z anti Here, besides that z 2 is not antisymmetric, the r.h.s of (6.4) is not a bordered matrix anymore and also the matrix B J 2 affects non boundary entries as well, in particular it adds the (3 × 2) top-left and bottom-right submatrices of B to the (3 × 2) respective submatrices of z 2 : Equating the entries that correspond to the zero-submatrix as drawn above we will have the same calculations as in the step m = 0.
This corresponds to the Toeplitz property that holds for all the diagonals apart from the 5 central ones. Remember that for m = 0 we end up with a Toeplitz matrix.
In the m-th step of the sequence of these matrix equations, for the z m -block of b m , the central (4m−3) diagonals have a perturbed Toeplitz form: the elements across these diagonals on each line are changed by constants that depend on the coefficients a, c.
The resulting matrix z m is described in the following way, where μ a,c := 1+a+2c 4c : The explanation is the same as in the step m = 2 but this holds for an arbitrary m ≤ N . From the above Lemma we conclude that z N can be written in the general form

Preliminaries
where we write J for the square matrix with 1's in the superdiagonal and J for the matrix with 1's in the subdiagonal. Also ι k for the matrix with 1 in the (k, k + 1)-entry and ι −k for the matrix with −1 in the (k + 1, k)-entry. So for example For a visualisation: Proof of Lemma 6.5 From (6.2) we have where z anti N is antisymmetric matrix. So in order to find the form of z N we only need to study z anti N and due to its antisymmetry, we only need to study its upper triagonal part.
We look at the non-bordered entries of the upper triagonal part of (6.4). That is the equation . (6.18) Looking at the diagonal's entries (i, i) for 1 < i < N of the above Eq. (6.18), we write −cz and using the antisymmetry of the elements of z anti N , it gives Therefore, inductively we get At the same time, looking from bottom-right to top-left, we can write Then, looking at the super-diagonal's entries, i.e. the (i, i + 1)-entry, for 1 < i < N − 1, of Eq.
Similarly, looking at the entries (i, i + 2) for 1 < i < N − 2: Also, from the reversed direction we get inductively For the general case, as stated in the Lemma, we prove it by induction in k. For k = 1, 2, 3 is true from the above calculations. We do it for k odd. Let it hold for k − 2, we look at the (i, i + k − 1)-entry of Eq. (6.18) : Then from the induction hypothesis we end up with the (6.13). The case k even follows similarly. Now generalise the previous induction formulas for k odd for example and write: and from the reversed direction From these two equations we have the specific case (6.16). k even is proven similarly. For (6.15) we write for k odd: where in the last line we applied (6.16). The case k even is proven in the same way.
The above discussion shows that in order to understand the entries of z N , we need only to understand the vector z N = (z 1, j and x (N ) i,N about the 'cross diagonal'.
and after the obvious cancellations we have for j even Similarly for j odd we have , we get the first equations in (6.21) as well. We conclude with (6.22) just by combining the above relations in both cases.
Finally to get this specific value for x (N ) 1,N we look at the (1, N )-entry of Eq. (6.4) and perform the same calculations as above.
Considering the above Lemma we can write the matrix z N also as follows:  Proof Due to symmetry of y N is enough to look at the upper-triagonal part. We look at the entries (i, i + k) of Eq. (6.5). For k = 1 we have −y i+1,i+1 = 0 which is the Eq. (6.26). For 1 < k < N − 1 we prove it by induction in k, like in the proof of Lemma 6.5. Let us now look at the (1, N )entry of (6.5): and this is the desired equation. For (6.28), we look at (k − 1, N )entry of (6.5) for k ≥ 3. Performing the same calculations as above we get Then using the relations (6.26) and (6.27) for each of the terms above, we get the stated relation.
With the result of the following Lemma we relate the entries of y N with the entries of z N . wherez N is the vectorz where μ a,c := 1+a+2c 2c . In particular: Proof We combine the information for x 1i 's we get from two equations: first from (6.3), we remind that Eq. (6.3) is and second from the bordered entries of (6.4), which is We look at the element x (N ) 1,N and we write: In general using again Lemma 6.7 and relation (6.25), we have Putting the above relations in a more compact form we have with (6.31).
Now we estimate the entries y N : from (6.30) and Lemma 6.9, This gives that and then also, since y We write L i for the i-th row of the matrix y N and then calculate |y N v| 2 2 = |L 1 · v| 2 + · · · + |L N · v| 2 ≤ N |y . . .

+ |y
We estimate the terms due to the first half of the matrix, i.e. the terms until L N 2 +1 · v: from Lemma 6.7 we write all the y (N ) i, j 's in terms of the entries of y N and z N that, due to the observations above, scale at most like N . In particular for the second line Then, from (6.36): So the highest order is due to L N 2 +1 · v 2 for which we estimate The terms (2i − 1) in the sum above, denote the number of the entries of y N , z N that each i, j is given by. Regarding the terms due to the second half of the matrix, we use again Lemma 6.7, Eq. (6.26). This way we write the elements y Then Before we finish the proof, we give more details on the estimates (6.38) above: For the first inequality we apply iteratively Lemma 6.7. Regarding the row L 2 : 2,2 is given by the sum of 3 terms whose absolute value is of order not more than O(N ). The same holds (from Lemma 6.7) for each y is given by the sum of 3 terms whose absolute value has order less than N , while for y (N ) 3,3 , by applying Lemma 6.7 twice, i.e. until we end up only with elements of y N and z N , we get y (N ) 3,3 is given by the sum of 5 terms whose absolute value has order less than N . For y (N ) 3, j , j ≤ N − 2 (until the 'cross-diagonal'), apply Lemma 6.7 twice: the value of y and so they are given by 3 terms with absolute value of order at most N . In general, the same holds for the row L i , i ≤ N 2 + 1 from applications of Lemma 6.7 inductively. For all y (N ) i, j we apply Lemma 6.7 until we have written each y (N ) i, j only in terms of entries of y N and z N .
For j ≤ i, i.e. until the main diagonal, y (N ) i, j is given by the sum of ν terms, whose order is less than N , and ν = 1, 3, 5, · · · , (2i − 1) for y This formula gives that y (N ) i, j is the sum of (2 j − 1) terms whose absolute value has order less than O(N ).
The same holds for j > N − (i − 1), i.e. after the 'cross-diagonal', considering also (6.37). As for the rest terms in L i , for i ≤ j ≤ N − (i − 1): y (N ) i, j is given by the sum of (2i − 1) terms whose order is less than O(N ). Now, from (6.3) we can see that the entries of x N can be written in terms of entries of z N as well: where β i j are the elements of the matrix B, (1.7), and the entries of y N are split into two sums regarding their position about the cross diagonal.
We write x N 2 ≤ B 2 y N 2 + Faz N 2 y N 2 + N N 3 .
Proof of Proposition 1. 1 We are ready now to bound from above b N 2 . We write for some positive constant C 1 a,c b N 2 ≤ x N 2 + y N 2 ≤ C 1 a,c N 3 where for the first inequality: since b N is positive definite, decomposing b N in its square root matrices: And since X * X and X X * are unitarily congruent and the same holds for Y * Y and Y Y * (from polar decomposition for example), there are unitary matrices U , V ∈ C N ×N so that: Then it is clear that for the spectral norm (which is unitarily invariant): x N z N z T N y N 2 ≤ x N 2 + y N 2 .
Regarding the last part of the statement that b −1 N 2 is bounded from above: Let us first state some facts about the spectrum of the matrix b 0 that solves b 0 M + M T b 0 = diag (2T L , 0, . . . , 2T R , 0, . . . , 0) :=˜ .
It is known that b 0 is the covariance matrix that determines the stationary solution of the Liouville equation in the harmonic chain (and it has been found explicitly in [35], see a description of their approach in the beginning of the proof of Lemma 6.5). From [25, Lemma 5.1], we know that b 0 is bounded below and above: To sum up: for the homogeneous weakly anharmonic chain, the method described in Sect. 3 with the modified Bakry-Emery criterion, gives a lower bound on the spectral gap that is of order N −3 (see the exponential rate in the main Theorems). For the purely harmonic chain, since we know that it always decays with N from Proposition 6.1, this lower bound shows that the spectral gap in this case can not decay at an exponential rate in N , it is at most polynomial.
In the next Proposition, exploiting the estimates on b N 2 from the above matrix analysis, we get alternatively the lower bound on the spectral gap of the harmonic chain.
Proof of Proposition 1. 2 We remind that b N 2 ≤ C a,c N 3 by Proposition 1.1 and that the spectral gap divided by inf{Re(μ) : μ ∈ σ (M)} is bounded below and above in terms of N , by Proposition 6.1. From [19,39,Inequality (13)  Eventually, from the whole procedure in this note we have that the scaling of the spectral gap of the homogeneous harmonic chain is in between N −3 and N −1 . In [7, Proposition 9.1] it is proven that this lower bound is the sharp one, i.e. an upper bound of order N −3 is provided. From a simple numerical simulation in Matlab on the spectral gap of the matrix M, the true value is indeed N −3 . In particular calculating the real part of the smallest eigenvalue of the matrix M and multiplying the result by N 3 we get the following behaviour in Fig. 2, which shows that then the spectral gap converges for large N : CDT, the CCA.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.