Quantitative Rates of Convergence to Non-Equilibrium Steady State for a Weakly Anharmonic Chain of Oscillators

We study a $1$-dimensional chain of $N$ weakly anharmonic classical oscillators coupled at its ends to heat baths at different temperatures. Each oscillator is subject to pinning potential and it also interacts with its nearest neighbors. In our set up both potentials are homogeneous and bounded (with $N$ dependent bounds) perturbations of the harmonic ones. We show how a generalized version of Bakry-Emery theory can be adapted to this case of a hypoelliptic generator which is inspired by F. Baudoin (2017). By that we prove exponential convergence to non-equilibrium steady state (NESS) in Wasserstein-Kantorovich distance and in relative entropy with quantitative rates. We estimate the constants in the rate by solving a Lyapunov-type matrix equation and we obtain that the exponential rate, for the homogeneous chain, has order bigger than $N^{-3}$. For the purely harmonic chain the order of the rate is in $ [N^{-3},N^{-1}]$. This shows that, in this set up, the spectral gap decays at most polynomially with $N$.

where q i ∈ R is the displacements of the atoms with respect to their respective equilibrium positions and p i ∈ R the momentum of the atoms i = 1, . . . , N in the phase space R N × R N . Each particle has its own pinning potential V and it also interacts with its nearest neighbors through an interaction potential U . Notice that here we take all the masses m i = 1 for simplicity. The classical Hamiltonian dynamics is perturbed by noise and friction in the following way: the two ends of the chain are in contact with heat baths at two different temperatures T L , T R > 0. In this case we consider the interaction of the chain with the heat baths to be described by two Langevin processes. So our dynamics is described by the following system of SDEs: dq i (t) = p i (t)dt for i = 1, . . . , N, dp i (t) = (−∂ q i H)dt for i = 2, . . . , N − 1, dp 1 (t) = (−∂ q 1 H − γ 1 p 1 )dt + 2γ 1 T 1 dW 1 (t), dp where γ i are the friction constants, T i the two temperatures and W 1 , W N are two independent normalized Wiener processes.
The dynamics (1.1) is equivalently described by Liouville equation where L is the second order differential operator (1.3) 1.1. Derivation of L. Since the solution to (1.1) form a Markov process, we define the transition probabilities where z is the initial condition and P * t satisfies the Chapman-Kolmogorov relation Thus we consider a semigroup {P * t , t ≥ 0} on the space of Borel probability measures on the space R 2N such that for B a Borel subset on R 2N . Now, one can similarly consider the dual semigroup {P t , t ≥ 0} acting on observables. For any measurable function f : Ω → R we define where z t = (p t , q t ) solves (1.1) and E z f (z t ) is the expectation taking over all the realizations of the Brownian motion starting from z ∈ R 2N . Having a well-defined semigroup {P t } t≥0 with invariant measure µ, we can make sense of the following definition of the generator of the semigroup {P t } t≥0 (see for instance Section 1.4 in [BGL14]) : for every f ∈ C ∞ c (R 2N ). By applying Ito's formula, we get the expression of the generator of the Markov process which solves (1.1) to be (1.3).
1.2. State of the art. This model was first used to describe heat diffusion and derive rigorously Fourier's law (for an overview see [BLRB00] and references therein). Since then, it has been the subject of many studies, both from a numerical and from a theoretical perspective. Firstly, the purely harmonic case with several idealized reservoirs at different temperatures has been solved explicitly in [RLL67] where they found exactly how the non-equilibrium stationary state (NESS) looks like: they show that the stationary measure is Gaussian in the positions and momenta of the system, corresponding to the distribution when the temperatures of all the reservoirs are equal, with the difference that the covariances between momenta and positions are non zero, but proportional to the temperature difference. For the anharmonic chain there are no explicit results in general. However it has been studied numerically for many different potentials and many kinds of heat baths, including the Langevin heat baths that we consider here. See for instance [ALS06] [GLPV00, LLP03] and references therein.
There are two facts in this model that make its rigorous study very challenging: first of all, we do not know explicitly the form of the invariant measure of (1.1) and also our generator is highly degenerate, having the dissipation and noise acting only on two variables of momenta at the end of the chain. It is not difficult to see, though, that in the equilibrium case, i.e. when the two temperatures are equal T L = T R = T = β −1 , the stationary measure is the Gibbs-Boltzmann measure dµ(p, q) = exp(−βH(p, q))dpdq: after explicit calculations we have L * e −βH(p,q) = 0 since . So if f t (p, q) solves (1.2) then f t (p, q)dµ(p, q) does not depend on t.
Since we are interested in the theoretical aspects of the model, we refer to [EPRB99a,EPRB99b], which is the first rigorous studies of the anharmonic case. The existence of a steady state has only been obtained in some cases where the potentials act like polynomials near infinity. In particular making the following assumptions on pinning and interaction potentials lim λ→∞ λ −k U (λq) = a k |q| k and lim λ→∞ λ 1−k U (λq) = ka k |q| k−1 sign(q) for constants a k > 0, and assuming that the interaction potential is at least as strong as the pinning, the existence and uniqueness of an invariant measure was first proved in [EPRB99a] using functional analytic methods: they show that the resolvent of the generator of (1.1) is compact in a suitable weighted L 2 space. Later it was proven in [RBT02] that the rate of convergence to equilibrium is exponential using probabilistic tools.
In these papers they model the heat baths slightly differently and a bit more complicated than the Langevin thermostats, with physical interpretation: the model of the reservoirs is the classical field theory given by linear wave equations with initial conditions distributed with respect to appropriate Gibbs measures at different temperatures, see also [RB06,Section 2]. Later, an adaptation of a very similar probabilistic proof was provided in [Car07] for the Langevin thermostats. The difference with the Langevin heat baths is that the dissipation and the noise act on the momenta only indirectly through some auxialiary variables.
Regarding the existence, uniqueness of a Non-Equilibrium Stationary State and exponential convergence towards it in more complicated networks of oscillators (multi-dimensional cases) see [CEHRB18]. The proofs there are inspired by the abovementioned works in the 1-dimensional chains.
There are also cases where there is no convergence to equilibrium, when for instance l > k, i.e. when the pinning is stronger than the coupling potential, see for example [Hai09,HM09] where they cover some cases like that. In [HM09] they show that the resolvent of the generator fails to be compact or/and that there is lack of spectral gap. In particular they show that with harmonic interaction, 0 belongs in the essential spectrum of the generator as soon as the pinning potential is of the form |q| k for k > 3. The conjecture is that this is true as soon as k > 2n 2n−1 if n is the center of the chain.
2. Set up and main results. We consider a small perturbation of the harmonic chain, weakly anharmonic: First, regarding the boundary conditions, we model the oscillators chain with rigidly fixed edges meaning that we consider a chain beginning with an oscillator labelled 0 and ending with one labelled N + 1 under the hypothesis that q 0 = q N +1 = 0. The first and the last particle are pinned with additional harmonic forces, modelling their attachment to a wall. These boundary conditions and heat baths modelled by adding two Ornstein-Uhlembeck processes at both ends as explained above, is the same model as in [RLL67] and is known as the Casher-Lebowitz model, since it is also one of the models considered in [CL] in order to study the heat flux behaviour in disordered harmonic chains 1 .
Second, we assume that both pinning and interaction potentials of the oscillator chain differ from the harmonic ones by potentials with Hessians bounded from above by positive constants, uniformly in q ∈ R N . Eventually we have a Hamiltonian of the form Hess U pin (p, q) 2 ≤ C pin (N ) and sup where the positive constants C pin (N ), C int (N ) will be chosen later and they will depend on N .
Denoting by L the infinitesimal generator of the Markov semigroup (P t ) t≥0 which is defined as before, we look at Liouville equation ∂ t f = Lf , where the generator of 1 The other one considered for studying the N -dependence of the energy flux in disordered harmonic chains, was first introduced by Rubin-Greer, [JRG71], where the heat baths are semiinfinite chains distributed according to Gibbs equilibrium measures of temperatures T L , T R (free boundaries).
the dynamics now is where we take all the friction constants equal γ 1 = γ N = γ, for the two temperatures T L , T R we assume that they satisfy T L = T +∆T , T R = T −∆T , for some temperature difference ∆T > 0. Also, B is the symmetric tridiagonal (Jacobi) matrix It is convenient to see the above form of the generator in the following block-matrix form: where z = (p, q) ∈ R 2N , Φ(q) corresponds to the perturbing potentials so that the matrix Γ is the friction matrix Γ = diag(γ, 0, · · · , 0, γ) the matrix Θ is the temperature matrix and M in blocks is the following where I is the identity matrix, so that it corresponds to the transport part of the operator, while B and Γ correspond to the harmonic part of the potentials and the drift from both ends, respectively.
Motivation. This study is motivated by a discussion opened in C. Villani's memoir on hypocoercivity, see Section 9.2 in [Vil09a], about approaching questions on this heat conduction model by hypocoercive techniques. This chain of coupled oscillators is a hypocoercive situation, where the diffusion only on the ends of the chain leads to a convergence to the stationary distribution exponentially fast, under the following assumptions on the potentials: strict convexity on interaction potential (being stronger than the pinning one) and bounded Hessians for both potentials. In particular, he points out that it might be possible to recover the previous results of exponential convergence in the weighted H 1 (µ)-norm for this different class of potentials (than the potentials assumed in [EPRB99b] for instance) by applying a generalized version of Theorem 24 in [Vil09a]. For that, one needs to know some properties of the, non-explicit, non-equilibrium steady state µ: for instance, if it satisfies a Poincaré inequality or if the Hessian of the logarithm of its density is bounded.
Main results. Here, considering a perturbation of the harmonic chain (homogeneous case), instead we follow an approach that combines hypocoercivity and Bakry-Émery theory of Γ calculus and curvature conditions as in [BE85]. We obtain the same results with Bakry-Émery but in a perturbed setting. This is explained in more details and is implemented in Section 3. The whole idea was inspired by F. Baudoin in [Bau17], where he used this combination in order to show exponential convergence to equilibrium for the Kinetic Fokker-Planck equation in H 1 -norm and in Kantorovich-Wasserstein distance. By that it is possible to show, for the dynamics (1.1) as well, exponential convergence in Kantorovich-Wasserstein distance and in relative entropy and to get quantitative rates of convergence in these distances, i.e. to obtain information on the N -dependence of the rate. In particular our estimates show that the convergence rate in the harmonic chain approach 0 as N tends to infinity at a polynomial rate of order 1/N γ where 1 ≤ γ ≤ 3 and that it is bigger than N −3 in the weakly anharmonic chain.
In order to quantify the above rates, we estimate b N 2 , where b N is a block matrix defined in Section 3 as a solution of a matrix equation, (3.22). Since b N 2 appears in the rates in the Theorems 1.2, 1.4, we start by stating this result: Proposition 1.1. Considering the homogeneous scenario of the oscillators chain, with pinning coefficient a > 0 and interaction coefficient c > 0, there exists a symmetric block matrix that can be constructed as the unique solution of the following Lyapunov equation where Π N = diag(2T L , 1, . . . , 1, 2T R , 1, 1, . . . , 1, 1) and M = Γ −I B 0 , and the spectral norm of which, b N 2 , is bounded from above in terms of the dimension N , as O(N 3 ).
Theorem 1.2. We consider a chain of coupled oscillators whose dynamics are described by the system (1.1) and the Hamiltonian is given by (1.5) under the assumptions on the potentials given by (1.6). In particular we assume bounded perturbations of the harmonic chain with bounds depending on N , (6.69), so that For a fixed number of particles N , we have a convergence to NESS in Wasserstein-Kantorovich distance for f 1 0 , f 2 0 initial data of the evolution equation. Here b N is the matrix that solves the equation (3.22) and λ N a function of b N 2 as in (3.26): Moreover, there is a unique stationary solution f ∞ , since all the solutions f t will converge towards it if we make the choice f 2 0 = f ∞ . Estimates on b N 2 regarding N are given by the Proposition 1.1 and this allows to conclude that and that the bounds on the perturbing potentials vanish asymptotically with rate Moreover, in the set up of Theorem 1.2, we get some qualitative information about the non-equilibrium steady distribution, like the validity of a Poincaré inequality and even better, a Log-Sobolev inequality: Proposition 1.3 (Log-Sobolev inequality). Let L be the generator of the dynamics described by the SDEs (1.1). Let Γ be the Carré du Champ operator defined in (2.19), while T the perturbed quadratic form defined in (3.23). Assuming that we have a gradient estimate of the form (3.29), we obtain that the unique invariant measure µ = f ∞ from the Theorem 1.2 satisfies a Log-Sobolev inequality (LSI(C N )) : Consequently we have convergence to NESS in Entropy. Let us first define the following information-theoretical functionals: the Boltzmann H functional (1.14) and the relative Fisher information we have entropic convergence in the following sense, as in [Vil09a, Section 6]: Theorem 1.4. We consider a chain of coupled oscillators whose dynamics are described by the system (1.1) and the Hamiltonian is given by (1.5) under the assumptions on the potentials given by (1.6): we assume bounded perturbations of the harmonic chain with bounds depending on N , (6.69), so that Here b N is the matrix that solves the equation (3.22). For a fixed number of particles N , assuming that (i) µ is the invariant measure for P t and (ii) that it satisfies a Log-Sobolev inequality with constant C N > 0, for all f with E(f ) < ∞ i.e. the initial data have finite relative entropy with respect to µ, we have a convergence to NESS in Entropy: Thanks to the equivalence of T and |∇ z | 2 in (3.24), we get the above convergence in the non-perturbed setting with equivalence-constant max 1, b −1 In particular, both the Boltzmann entropy H µ (P t f µ), given by (1.14), and the Fisher information I µ (P t f µ), given by (1.15), decay: Estimates on b N 2 regarding N are given by the Proposition 1.1 and so we conclude that (1.18) Let us motivate in the next Remark the advantages of working in entropy when studying convergence in times for many particle systems. In short Relative Entropy and the Wasserstein metrics behave good with the dimension, whereas L 2 -norm provides very bad, in N , estimates.
Remark 1.5 (Chaoticity and the Perks of Entropy/Wasserstein). In general, in order to study convergence to equilibrium/NESS for many particle systems, it is natural to work with the so-called extensive functionals which are functionals subadditive or even additive (proportional) in N . Typical examples of extensive functionals are the relative Entropy and the Wasserstein distances. In particlular, for the Entropy when we work with de-correlated data f ⊗N t (i.e. chaotic data) we can write and for the Wasserstein-2 distance where for the second inequality we used that for k ≥ 1 and a i nonnegative constants, (a 1 + · · · + a n ) k ≤ n k−1 (a k 1 + · · · + a k n ). So that if we have convergences in times of the following form So the convergence would take the form Then we need to wait until t ≥ N ln( f 0 2 ) λ N and only then the convergence to equilibrium will start taking place.
From Theorem 1.2 we get an exponential rate of order bigger than N −3 . In the purely harmonic case, we have rate of order 1/N γ where 1 ≤ γ ≤ 3.
We mention finally the following Proposition that gives directly the same lower bound on the spectral gap (given the estimates on b N 2 by Proposition 1.1): Proposition 1.6 (Lower bound on the spectral gap of the harmonic chain). It is true that the spectral gap of the harmonic chain ρ has a lower bound This lower bound is in fact the optimal rate in the case of the harmonic homogeneous chain. In the work [BM] it is proven that ρ = O(N −3 ) by exploiting the form of the matrix M , (1.9) and more specifically using information on the spectrum of the discrete Laplacian. There, we study also the case of disordered chains. Unlike the homogeneous case here where the decay is polynomial, in a disordered chain the spectral gap decays at an exponential rate in terms of N .
Remark 1.7. Note that a generalized version of Γ calculus has been applied for a toy model of the dynamics (1.1) by P. Monmarché, [Mon19]: working with the unpinned, non-kinetic version, with convex interaction and given that the center of the mass is fixed, he proves the same kind of convergences and ends up with explicit and optimal N -dependent rates, of order O(N −2 ), for the overdamped dynamics.
3. Plan of the paper. Section 2 contains an introduction to the Bakry-Emery theory and an explanation of the method used. In Section 3 we obtain the estimates leading to the proof of Proposition 1.3. In Section 4 and Section 5 we give the proof of Theorem 1.4 and Theorem 1.2 respectively. Finally in Section 6 we prove Propositions 1.1 and 1.6. 4. Notation. {e i } n i=1 denote the elements of the canonical basis in R n and | · | to denote the Euclidean norm on R n , from the usual inner product ·, · . For a square matrix A = (a ij ) 1≤i,j≤n ∈ C n×n , we write A 2 for the operator (spectral) norm, induced by the Euclidean norm for vectors : and A * for the complex conjugate transpose A H =Ā T . We also write A 1/2 for the square root of a (positive definite) matrix A, i.e. the matrix such that A 1/2 A 1/2 = A. Moreover, by C ∞ b (R n ) we denote the space of the smooth and bounded functions, by ∇ z we denote the gradient on z-variables in a metric space X with respect to the Euclidean metric. We write P 2 (R n ) for the space of the probability measures on R n that have second moment finite, i.e. P 2 (R n ) = ρ ∈ P(R n ) : [N ] denotes the set {1, 2, . . . , N }.

Carré du Champ operators and curvature condition
Consider a Markov semigroup P t generated by an infinitesimal generator L : D(L) ⊂ L 2 (µ) → L 2 (µ), where µ is the invariant measure of the dynamics. Here we restrict ourselves to the case of the diffusion operators and we associate a bilinear quadratic differential form Γ, the so-called Carré du Champ operator, and it is defined as follows: for every pair of functions In other words Γ measures the default of the distributivity of L.
Then we can naturally define its iteration Γ 2 , where instead of the multiplication we use the action of Γ: From the theory of Γ-calculus we have that a curvature condition of the form for all f in a suitable algebra A dense in the L 2 (µ) domain of L and λ > 0 is equivalent to the following gradient estimate which implies a Log-Sobolev inequality (and thus Poincaré inequality), see [BE85] or [Bak06, Section 3].
Note that here the case is not the spatially homogeneous one and the generator of the dynamics described by (1.1) is hypoelliptic, not elliptic, since the noise acts only on 2 out of 2N variables of our phase space. So, the main problem in this case here is the lack of ellipticity, since the curvature condition (2.21) requires the ellipticity of the generator. More specifically, for the operator (1.8) we can not bound Γ 2 by Γ from below since after explicit calculations we have the formulas Since we can not control the terms ∂ p i f ∂ q i f, we can not bound Γ 2 from below by Γ. In cases like this, we say that the particle system has −∞ Bakry-Emery curvature.
1. Description of the method. In order to overcome this problem, we are doing the following: (1) Firstly we perturb the classical Γ theory, by defining a new quadratic form, different, but equivalent, to the |∇ z f | 2 that will play the role of the Γ functional. This will spread the noise from p 1 and p N to all the other degrees of freedom as well. The general idea comes from Baudoin [Bau17]. We make a suitable choice of a positive definite matrix to define a new quadratic form that will replace the Γ functional, so that we obtain a 'twisted' curvature condition: an estimate of the form (2.21). This will imply also a perturbed gradient estimate, and thus a Poincaré and Log-Sobolev inequality. As Villani introduced in [Vil09a], in order to deal with a hypocoercive situation in H 1 -setting, one can perturb the norm to an equivalent norm, so that the desired convergence results can be deduced with this new norm. Then one can have convergence in the usual norm thanks to their equivalence. Here, instead of the norm, we perturb the gradient and thus the Γ Carré du Champ, and work with a generalised Γ-theory.
(2) Second, in order to make our estimates quantitative we construct the matrix that we use to perturb the gradient. In Section 6, this matrix is constructed as a solution of a sequence of continuous Lyapunov equations and every step of the sequence corresponds to the spreading of noise and dissipation to the next oscillator from both ends until the center of the chain. In fact this will be presented as follows: adding noise from the left to the right until the N -th particle and from the right to the left until the 1-st particle, as shown in the Figure 1. These steps will allow us to get estimates on how the rate of convergence to NESS behave with N .
For those familiar with the method of Hörmander we describe briefly here the similarity with the spreading of dissipation-mechanism: in Hörmander's theory the smoothing mechanism is the one transferred through the interacting particles inductively by the use of commutators: the generator has the form Given that ∂ q 1 q 2 H is non-vanishing we have 'spread the smoothing mechanism' to p 2 . Continue like that, commuting the 'new' variable with the first order terms of L, inductively we will cover all the particles of the chain.

Functional inequalities in the perturbed setting
The goal is to apply a 'twisted' Bakry-Emery machinery, introduced by Baudoin in Section 2.6 of [Bau17]. For that we define a perturbed quadratic form by using a positive definite matrix: this matrix is chosen as the solution of the following continuous Lyapunov equation if and only if all the eigenvalues of −M have negative real parts (that is −M is stable) .
If one chooses W (z, z) = −z Π N z T , a matrix reformulation of the above Theorem gives necessary and sufficient condition for existence of positive solution of (3.22).
Define the following quadratic quantity for f, g ∈ C ∞ (R 2N ), Here T (f, f ) is always positive since b N ≥ 0 in contrast with the original operator Γ, our perturbed quadratic form T is related to L only indirectly through the different steps of commutators.
We have an equivalence of the following form between T and |∇ z | 2 : (3.24) Proposition 3.2. With the above notation and for a fixed number of particles N , there exists constant λ N that depends on the spectral norm of the matrix b N and the bounds of the perturbing potentials C pin , C int such that for f ∈ C ∞ (R 2N ), Proof. Using the form of the generator L as in (1.8) : where Φ is the function that corresponds to the perturbing potentials, we write The first term is cancelled with the last one and the second with the fourth one.
The first term is cancelled with the fifth one and the second with the fourth one. Finally for the second order terms of the generator we write We eventually end up with where for the second inequality we used that the terms T (∂ p i f, ∂ p i f ) for i = 1, N , are positive. From the boundedness assumption on the operator norms of the Hessians for both perturbing potentials, and using that b N solves the equation (3.22), we get the following (3.26) and we choose the quantity (C pin (N ) + C int (N )) in a way so that λ N is positive. In particular (after getting information on b N 2 from Proposition 1.1) we require, see (6.69) We state now the following lemma that gives the 'twisted' gradient bound.
Lemma 3.3 (Gradient bound). If the operator L satisfies the curvature condition (3.25) for some λ N and for f ∈ C ∞ c (R 2N ), we have the following perturbed gradient estimate (3.27) Proof. If T is compactly supported we consider the functional for f ∈ C ∞ c (R 2N ) and for fixed t. Since from the semigroup property we have d ds P s = LP s = P s L, by differentiating and using the above inequality we get , by Grönwall's lemma we get the desired inequality for every smooth and bounded function f .
In general we need T (P t f, P t f ) to belong in L ∞ (R 2N ) because then we know that P s T (P t−s f, P t−s f ) is well defined. So we do the following: First we take W (p, q) = 1 + |p| 2 + |q| 2 as a Lyapunov structure that satisfies the following conditions: W > 1, LW ≤ CW , the sets {W ≤ m} are compact for each m, and T (W ) ≤ CW 2 . This W satisfy the conditions thanks to the bounded-Hessians assumption, i.e. |∇(U int + U pin )| will be Lipschitz. In particular, for the inequality LW ≤ CW using Cauchy-Schwarz and Young's inequalities, we write Then L n is compactly supported in B n := {W ≤ 2n}. Let P n t be the semigroup generated by L n , which is given as the bounded solution of Then we also have that We do the 'interpolation semigroup argument' as before for L n and for for fixed t > 0, n ≥ 1 applied to a fixed point in the support. It is true, due to the properties of W , that T (P n t f, P n t f ) ≤ C f,t with C f,t independent of n and so we have a bound on T (P n t f, P n t f ) uniformly on the set About the last term: with C 2 > 0 some constant again independent of n (from assumptions on the Lyapunov functional W ). Therefore Combining this last estimate with the above bounds we end up with is again independent on n. We integrate in time from 0 to t and we get the desired boundedness on {W ≤ n}.
from the above bound we have that with support in {W ≤ n}. C does not depend on n (from before), so passing to the limit we have and so T (P t f, P t f ) is also bounded. Now we can repeat the standard Bakry-Emery calculations as in the beginning of the proof.
We refer also to the discussion in the book [BGL14, Section 3.2.3, page 145] for more details about for which classes of functions does this gradient bound hold. In [BGL14, Theorem 3.2.4] they give an argument in order to justify the above gradient bound for f ∈ L 2 (µ)-domain of the generator L of the diffusion proccess. They assume though reversibility of their reference measure µ.
Remark 3.4. Note that using the equivalence of T and |∇ z | 2 : we get the following L 2 -gradient estimate Once we have a curvature condition of the form (3.25) we are also able to show that the stationary measure satisfies a Poincaré inequality.
Proposition 3.5. Let L be the generator of the dynamics described by the SDEs (1.1). Let Γ the operator defined in (2.19), while T the perturbed quadratic form defined in (3.23). If f ∈ C ∞ (R 2N ) and a constant λ N > 0 exists so that the inequality T 2 (f, f ) ≥ λ N T (f, f ) holds, the unique invariant measure µ = f ∞ from the Theorem 1.2 satisfies a Poincaré inequality and the non-perturbed Poincaré inequality takes the form Proof. For f ∈ C ∞ (R 2N ), we consider the functional By differentiating we have where in the first inequality we used that for the second we used the gradient bound from Lemma 3.3 and just right after that, the semigroup property. Now letting t to ∞, thanks to the ergodicity we have the desired inequality with constant C N = So the constant in the (non-perturbed) Poincaré inequality depends on N through the norms b N 2 , b −1 N 2 , sincẽ .
In fact it is possible to show a stronger pointwise gradient bound, that will be exploited in the proof of a Log-Sobolev inequality for the invariant measure of the dynamics.
Proposition 3.6 (Strong gradient bound). It is true that for f ∈ C ∞ c , ∀t ≥ 0, Note. This is a better estimate than (3.27) in Lemma 3.3 because of Jensen's inequality.
Proof. The rigorous justification, i.e. boundedness of T (P t−s f, P t−s f )), of the following formal calculations is exactly like in the proof of Lemma 3.3.
Here for f ∈ C ∞ c , and for fixed t ≥ 0, instead we define By differentiating and performing the standard calculations we have where in the first inequality we have used the formula for the generator of the dynamics (1.1) and that T and ∂ p 1 obviously commute. Now from Grönwall's lemma we get

Now this pointwise, strong gradient bound implies a Log-Sobolev inequality.
Proof of Proposition 1.3. For f ∈ C ∞ c (R 2N ), we introduce the functional H(s) = P s P t−s f log P t−s f and following again Bakry's recipes, we get where for the second inequality we used the bound from Proposition 3.6, while for the last inequality we used Jensen's and the Markov property of the semigroup. Now integrating from 0 to t, we get Letting t → ∞ and thanks to the ergodicity of the semigroup, we get the LSI with constantC N = corresponding to the constant with the non-perturbed Fischer information.
We note here that since a Log-Sobolev inequality holds, we also have (see [Gro93]) that our semigroup is hypercontractive. Hypercontractivity says that at time t > 0 the semigroup is regularizing from L p to L q(t) for q(t) − 1 = exp(4t/C N )(p − 1).
For the notion of hypercontractivity we refer for instance to [Bak06, Theorem 2.5] and references therein. Hypercontractivity and the validity of a Log-Sobolev inequality, is in fact stronger result than L 2 -exponential convergence since it implies stronger exponential convergence in entropy which implies L 2 by a linearization, for that see [Wan17, Proposition 2.3]. For similar results in hypoelliptic cases, for instance the kinetic Fokker-Planck equation, see [Bau17], where a Log-Sobolev inequality is proven by similar local Bakry-Emery computations.

Entropic Convergence to equilibrium
If µ is the invariant measure of the system, we will prove here convergence to NESS in Entropy as stated in Theorem 1.4: with respect to the functional Proof of Theorem 1.4. We consider the functional and by differentiating and repeating similarly the steps from the Propositions 3.6 and 1.3 we end up with where we have used that for the second equality and in the last inequality we used the bound (3.25). Now, integrating against the invariant measure µ, and applying the Log-Sobolev inequality from Proposition 1.3, Finally, from Grönwall's inequality we have or equivalently the desired convergence, thanks to the invariance of the measure.
The last inequality is due to the estimates later in (6.70).

Convergence to equilibrium in Kantorovich-Wasserstein distance
We recall the definition of the Kantorovich-Rubinstein-Wasserstein L 2 -distance W 2 (µ, ν) between two probability measures µ, ν, for some metric d: where the infimum is taken over the set of all the couplings, i.e. the joint measures π on R N × R N with left and right marginals µ and ν respectively.
It is easy to see that W 2 satisfies the definition of a metric, whenever of course d is a metric, and so it is indeed a metric. We also restrict ourselves on the subspace P 2 (R 2N ), where µ and ν have second moments finite, so that their distance W 2 (µ, ν) will be finite. For more information on this distance see for instance [Vil09b] and references therein. Also, even though that convergence in Monge-Kantorovich-Wasserstein distance is a weak convergence comparatively with convergence in entropic sense or convergence in L 2 for instance, it can be defined on the more natural subspace P 2 (R 2N ) of probability measures. Finally, in contrast with entropy, it is more general to assume finite Wasserstein distance.
For a measurable function f and x ∈ R 2N , we define the upper-gradient with respect to the metric d What we use here to get convergence in Kantorovich-Wasserstein distance is that the gradient estimate of type (3.28) is equivalent to an estimate in Wasserstein distance (Kuwada's duality, [Kuw10]). More specifically, we have the following Theorem, here stated only in the Euclidean space with the Lebesgue measure (R 2N , |·|, λ): Theorem 5.1 (Theorem 2.2 of [Kuw10]). Let the Markov semigroup P on R 2N , that has a continuous density with respect to the Lebesgue measure. For c > 0, the following are equivalent: (i) For all probability measures µ, ν we have, where W p denotes the Wasserstein distance associated with the Euclidean distance.
(ii) When p > 1 and q its Hölder conjugate, for all bounded and Lipschitz functions f and z ∈ R 2N , where this estimate is associated with the Lipschitz norm defined just above.
Now, considering all the above, we prove Theorem 1.2 from which we get both uniqueness of an invariant measure and convergence to NESS of the semigroup P t .
Proof of Theorem 1.2. The convergence follows if we apply Kuwada's duality from Theorem 5.1 since we have the estimate (3.28) with c = b −1 N 1/2 2 b N 1/2 2 . Remark 5.2. A convergence to equilibrium in total variation norm for a similar small perturbation of the harmonic oscillator chain, has been shown recently in [Raq19]. There, a version of Harris' ergodic Theorem was applied making it possible to treat more general cases of the oscillator chain with different kind of noises, as well. However, this is a non-quantitative version of Harris' Theorem, which provides no information on the dependency of the convergence rate in N .

Estimates on the spectral norm of b N
First, let us state the following Proposition on the optimal exponential rate of convergence for the purely harmonic chain. Proposition 6.1 (Proposition 7.1 and 7.2 (3) in [BM]). The optimal order of the spectral gap of the dynamics which evolution is described by the generator (1.8), without the perturbing potentials, is given by the order of The spectral gap approaches 0 as N goes to infinity, and the rate should be at least of order O(1/N ).
Proof. We exploit the results by Arnold and Erb in [AE] or by Monmarché in [Mon19, Proposition 13] saying that working with an operator of the form under the conditions that (i) no non-trivial subspace of KerD is invariant under B T and (ii) the matrix B is positively stable, i.e. all the eigenvalues have real part greater than 0, then the associated semigroup has a unique invariant measure and if ρ = inf{Re(µ) : µ ∈ σ(B)} > 0, the sharp exponential rate of the above generator (of an Ornstein-Uhlenbeck process) is at least ρ − and at most ρ. This holds for every ∈ (0, ρ), concluding the first statement of the Proposition. In particular, when m is the maximal dimension of the Jordan block of C corresponding to the eigenvalue λ such that Re(λ) = ρ, the quantity (1 + t 2(m−1) )e −2ρt is the optimal one regarding the long time behaviour, [Mon19]. This implies that the spectral gap of the generator is ρ− , whereas the constant in front of the exponential c( , m) := sup t (1+t 2(m−1) )e −2 t .
When we look at the purely harmonic chain this is our case as well: the first condition is equivalent to the hypoellipticity of the operator L, [H67, Section 1], and our generator (1.8) is indeed hypoelliptic: it is proven, [EPRB99b, Section 3, page 667] and [Car07, Section 3], for more general classes of potentials than the quadratic ones, that the generator satisfies the rank condition of Hörmander's hypoellipticity Theorem, [ for some constant C independent of N . Thus the smallest real part of the eigenvalues should have order less than 1 N .
Remark 6.2. Let us remark here that B can be seen as the Schrödinger operator : . . , N }) and δ i the projection on the i-th coordinate. We give the following definition for the (discrete) Laplacian on l 2 ({1, . . . , N }) with Dirichlet boundary conditions: where L i,i+1 are uniquely determined by the quadratic form u, L i,i+1 u = (u(i) − u(i + 1)) 2 with u(0) = u(N + 1) = 0 Dirichlet b.c.
We will use this information just after the proof of Proposition (1.1), to quantify the equivalence constants and the LSI constant from Section 3. . This steps represent the spread of dissipation from the endpoints of the chain to each particle like the commutators would do in a classical hypoelliptic setting. We could stop the process at the N/2 + 1 steps, but to ease the notation we do it N times, and it will be like running two inductions simultaneously: one adding 1/2 's from the top left until the bottom right side and one from the bottom right to the top left side of the matrix. See Figure 1 in the introduction for a visualization.
and where From (6.35) and considering that x m and y m are symmetric matrices, we get From that we get (6.31) and (6.32) directly, and also that: and by applying (6.31) to (6.36) we get (6.33). Also, using that x m and y m are required to be symmetric matrices, from the transposed version of (6.32), we get the equation m Γ which, combined with (6.32), gives We perform all the calculations from now on when the dimension of the block matrices, N , is odd. The same calculations with minor differences hold when N is even as well.
3. Calculations for m = 0, 1, 2. Before we start analyzing the form of the block z N , we first present in this subsection how each unit in the right hand side of the Lyapunov equation (6.35) for 0 ≤ m ≤ N (that corresponds to the spread of noise on the system), affects the z m block of the solution b m . This, motivated by the strategy-subsection above, is only to make it easier for the reader to follow on how perturbing the r.h.s. of the Lyapunov equation affects the solution in each sequential step. Then we analyze the final step (proof of Lemma 6.5) the result that we are interested in. Thus, the reader who is interested only in the proofs, and not in the motivation behind them, might skip this subsection.
For m = 0: It has been computed in [RLL67], where they found exactly the elements of z 0 := (z (0) ij ) 1≤i,j≤N when a = 0, c = 1, to be for α constant such that cosh(α) = 1 + 1 2γ . (It was done in the same manner with [WU45, Section 11] but there the case was ∆T = 0). Here we describe briefly the steps: first we notice that z 0 is antisymmetric since in (6.31) J (0) 0 = 0, and second, by (6.33) we get that it has a Toeplitz -form Indeed note that the r.h.s of (6.33) forms a bordered matrix i.e. only the bordered elements are non zero and so the l.h.s of (6.33) should also have this bordered form. Due to the tridiagonal form of B we get a Toeplitz matrix: in particular using that B = −c∆ N + aI, the l.h.s of (6.33) is and equating the nonboundary-entries, due to the symmetry of ∆ N and the antisymmetry of z 0 , we have that the elements of z 0 will be constant along the diagonals: indeed, for 1 < i < N , for the diagonal's entries of the equation (6.41) we have −cz For the superdiagonal's entries of the equation (6.41) We repeat these calculations through all the non-boundary entries of the matrix, and using the information we get from each one calculation, we end up with the Toeplitz form of z 0 in (6.40). Now find that a solution to (6.38) is a symmetric Hankel matrix which is antisymmetric about the cross diagonal and such that (y 1,j+1 . Then apply (6.32) to get a formula for the entries of x 0 and from the information due to (6.33) about the bordered entries of x 0 , end up with the linear equation Here z 0 , e 1 ∈ C N −1 are the vectors z 0 = (z 1,j 's: the recurrence formula of the determinant of K 0 is the same formula of the Chebyshev polynomials of the second kind, so using properties of these polynomials and imposing appropriate initial conditions we end up with the form (6.39). The detailed implementation of this is done in [RLL67].
For m ≥ 1 we use again the equation (6.33). In the first step we get that: For m = 1, i.e. for the form of the z 1 -block in b 1 , the elements z But we still have the bordered form in the r.h.s. of (6.33), so we still have a Toeplitzform for z 1 .
In the next Lemma we give the form of the z 2 block of b 2 .
Proof of Lemma 6.4. z 2 is not antisymmetric but from (6.31) we immediately have that z 2 = z a 2 − J (0) 2 , where z a 2 is antisymmetric. So we work with z a 2 and due to the antisymmetry we look only at the upper diagonal part of the matrix.
Here, besides that z 2 is not antisymmetric, the r.h.s of (6.33) is not a bordered matrix anymore and also the matrix BJ (6.42) Equating the entries that correspond to the zero-submatrix as drawn above we will have the same calculations as in the step m = 0. From (6.31) we have z Looking at the (2, 2)-entry and the (2, 3)-entry of the equation (6.42) we have respectively −cz (2) 2,1 + 2cz (2) 2,3 + cz N,N −2 − 1 2 where the second equalities in both lines are proved by looking at the reversed direction (bottom-right to top-left side of the matrix). Also for k ≥ 2 and 1 ≤ i ≤ N − k, look at (i, i + k) entry of the equation (6.42) and get This corresponds to the Toeplitz property that holds for all the diagonals apart from the 5 central ones. Remember that for m = 0 we end up with a Toeplitz matrix.
In the m-th step of the sequence of these matrix equations, for the z m -block of b m , the central (4m − 3) diagonals have a perturbed Toeplitz form: the elements across these diagonals on each line are changed by constants that depend on the coefficients a, c. The resulting matrix z m is described in the following way, where µ a,c := 1+a+2c 4c : The explanation is the same as in the step m = 2 but this holds for an arbitrary m ≤ N . . z N has the following perturbed Toeplitz form: for 2 ≤ i ≤ N − k and 1 ≤ k ≤ N − 2, for k even (6.43) and for the second and second-to-last line respectively: for k even (6.44) This corresponds to the relation of the first row with the last row of the matrix.
From the above Lemma we conclude that z N can be written in the general form where we write J for the square matrix with 1's in the superdiagonal and J for the matrix with 1's in the subdiagonal. Also ι k for the matrix with 1 in the (k, k + 1)-entry and ι −k for the matrix with −1 in the (k + 1, k)-entry. So for example For a visualization: Proof of Lemma 6.5. The proof of this Lemma corresponds in analyzing the final, the N -th step of the matrix equations-sequence. First, from (6.31) we have where z a N is antisymmetric matrix. So in order to find the form of z N we only need to study z a N and due to its antisymmetry, we only need to study its upper triagonal part.
We look at the non-bordered entries of the upper triagonal part of (6.33). That is the equation (6.48) Looking at the diagonal's entries (i, i) for 1 < i < N of the above equation (6.48), we write −cz i+1,i − (2c + a) = 1 and using the antisymmetry of the elements of z a N , it gives z Therefore, inductively we get (6.50) Also, from the reversed direction we get inductively . For the general case, as stated in the Lemma, we prove it by induction in k. For k = 1, 2, 3 is true from the above calculations. We do it for k odd. Let it hold for k − 2, we look at the (i, i + k − 1)-entry of equation (6.48) : for 1 < i < N − (k − 1), Then from the induction hypothesis we end up with the (6.43). The case k even follows similarly. Now generalize the previous induction formulas for k odd for example and write: From these two equations we have the specific case (6.46). k even is proven similarly. For (6.45) we write for k odd: where in the last line we applied (6.46). The case k even is proven in the same way.
The above discussion shows that in order to understand the entries of z N , we need only to understand the vector z N = (z i,N about the 'cross diagonal'. Lemma 6.6. For 3 ≤ k ≤ N ,   Now for k := N − j + 2 then 3 ≤ k ≤ N . Since N is odd, whenever j is odd, k is even and the opposite. Solving the equations (6.54) and (6.53) for z (N ),a 1,k , we get the second equalities in (6.51), whereas solving (6.55) for λ := j + 1, for z (N ),a 1,λ , we get the first equalities in (6.51) as well. We conclude with (6.52) just by combining the above relations in both cases.
Finally to get this specific value for x (N ) 1,N we look at the (1, N )-entry of equation (6.33) and perform the same calculations as above.
Considering the above Lemma we can write the matrix z N also as follows: where κ L := T L +a+2c 2c and κ R := T R +a+2c 2c .
In the following we state a Lemma about the symmetries that hold in y N -block of b N , concluding that all the entries of y N can be written in terms of the vectors y N := (y Proof. Due to symmetry of y N is enough to look at the upper-triagonal part. We look at the entries (i, i + k) of equation (6.34). For k = 1 we have −y i+1,i+1 = 0 which is the equation (6.56). For 1 < k < N − 1 we prove it by induction in k, like in the proof of Lemma (6.5). Let us now look at the (1, N )-entry of (6.34): and this is the desired equation. For (6.58), we look at (k − 1, N )-entry of (6.34) for k ≥ 3. Performing the same calculations as above we get k−1,N −1 . Then using the relations (6.56) and (6.57) for each of the terms above, we get the desired relation.
With the result of the following Lemma we relate the entries of y N with the entries of z N .
Lemma 6.8. Let B be the matrix (1.7). We have where µ a,c := 1+a+2c 2c . In particular: Proof. We combine the information for x 1i 's we get from two equations: first from (6.32), we remind that equation (6.32) is and second from the bordered entries of (6.33), which is We look at the element x In general using again Lemma 6.7 and relation (6.55), we have (a + 2c)y Putting the above relations in a more compact form we have We end up with (6.60) considering that B −1 2 is uniformly (in N ) bounded, since B has bounded spectral gap.
Proof of Proposition 1.1. The following Lemma shows, through its proof, that there is one unique solution to the Lyapunov matrix equation (since one can explicitly find the entries of z N , that determine all the rest) and eventually gives the scaling in N of the entries of z N : We conclude the last statement by combining the above estimate on z (N ),a 1,N with (6.61). Now we continue by estimating the entries y N : from (6.60) and Lemma 6.9, Lemma 6.10 (Estimate on the spectral norm of y N ). For the spectral norm of y N we have that We write L i for the i-th row of the matrix y N and then calculate |y N v| 2 2 = |L 1 · v| 2 + · · · + |L N · v| 2 ≤N |y We estimate the terms due to the first half of the matrix, i.e. the terms until L N 2 +1 · v: from Lemma 6.7 we write all the y (N ) i,j 's in terms of the entries of y N and z N that, due to the observations above, scale at most like N . In particular for the second line |L 1 · v| 2 + · · · + L N 2 +1 · v 2 N N 2 |v 1 | 2 + · · · + N 2 |v N | 2 + (6.68) + N 2 |v 1 | 2 + 3 2 N 2 |v 2 | 2 + · · · + 3 2 N 2 |v N −1 | 2 + N 2 |v N | 2 + +N 2 |v 1 | 2 + 3 2 N 2 |v 2 | 2 + 5 2 N 2 |v 3 | 2 + 5 2 N 2 |v 4 | 2 + · · · + 5 2 N 2 |v N −2 | 2 + 3 2 |v N −1 | 2 + N 2 |v N | 2 . . .
So the highest order is due to L N 2 +1 · v 2 for which we estimate The terms (2i − 1) in the sum above, denote the number of the entries of y N , z N that each y (N ) i,j is given by. Regarding the terms due to the second half of the matrix, we use again Lemma 6.7, equations (6.56). This way we write the elements y Then Before we finish the proof, we give more details on the estimates (6.68) above: For the first inequality we apply iteratively Lemma 6.7. Regarding the row L 2 : So y (N ) 2,2 is given by the sum of 3 terms whose absolute value is of order not more than O(N ). The same holds (from Lemma 6.7) for each y is given by the sum of 3 terms whose absolute value has order less than N , while for y 3,3 is given by the sum of 5 terms whose absolute value has order less than N . For y (N ) 3,j , j ≤ N − 2 (until the 'cross-diagonal'), apply Lemma 6.7 twice: the value of y i,j is given by the sum of ν terms, whose order is less than N , and ν = 1, 3, 5, · · · , (2i − 1) for y This formula gives that y (N ) i,j is the sum of (2j − 1) terms whose absolute value has order less than O(N ).
The same holds for j > N − (i − 1), i.e. after the 'cross-diagonal', considering also (6.67). As for the rest terms in L i , for i ≤ j ≤ N − (i − 1): y (N ) i,j is given by the sum of (2i − 1) terms whose order is less than O(N ). Now, from (6.32) we can see that the entries of x N can be written in terms of entries of z N as well: where β ij are the elements of the matrix B, (1.7), and the entries of y N are split into two sums regarding their position about the cross diagonal. We write x N 2 ≤ B 2 y N 2 + Γz N 2 y N 2 + N N 3 .
Eventually, we write where for the first inequality: since b N is positive definite, decomposing b N in its square root matrices: And since X * X and XX * are unitarily congruent and the same holds for Y * Y and Y Y * (from polar decomposition for example), there are unitary matrices U , V ∈ C N ×N so that: Then it is clear that for the spectral norm (which is unitarily invariant): x N z N z T N y N 2 ≤ x N 2 + y N 2 .

5.
Quantifying the constants from Section 3. Let us first state some facts about the spectrum of the matrix b 0 that corresponds to the 0-th step of the induction: It is known that b 0 corresponds to the covariance matrix of the NESS of the harmonic chain (and it has been found explicitly in [RLL67], see a description of their approach in the beggining of the proof of Lemma (6.5)). From [JPS17, Lemma 5.1], we know that b 0 is bounded below and above: So b 0 2 and b −1 0 2 are uniformly bounded in terms of N : note that, Remark (6.2), B = −c ∆ N + N i=1 αδ i . Even though here we will only use that b −1 0 2 is finite, in fact when a > 0, B possesses a spectral gap uniformly in N (also in a more general non homogeneous scenario).
We estimate the equivalence constants in the relation (3.24):  So the same bounds hold for this weak anharmonicity as well.
To sum up: for the homogeneous weakly anharmonic chain, the method described in Section 3 with the perturbed Bakry-Emery criterion, bounds the LSI(C) from above by N 6 . It gives a lower bound on the spectral gap that is of order N −3 (see the exponential rate in Theorem 1.2). For the purely harmonic chain, since we know that it always decays with N from Proposition (6.1), this lower bound shows that the spectral gap in this case can not decay at an exponential rate in N , it is at most polynomial.
In the next Proposition, exploiting the estimates on b N 2 from the above matrix analysis, we get alternatively the lower bound on the spectral gap of the harmonic chain.
Proof of Proposition 1.6. Remember that b N 2 O(N 3 ) by Proposition (1.1) and that the order of the spectral gap is given by the order of inf{Re(µ) : µ ∈ σ(M )}. From [Ves03, Inequality (13)], [GKK], we have an estimate for the decay of e −M t : for u be the (normalized) eigenvector corresponding to an eigenvalue of M , µ > 0, we write e −2Re(µ)t = e −2Re(µ)t u 2 ≤ e −M t u 2 ≤ b N b −1 N e −t/ b N 3 Whenever we write LSI(C) h and LSI(C) nh is the constant in the Log-Sobolev inequality in the purely harmonic and weakly anharmonic chain respectively. So we have −2Re(µ) ≤ − 1 b N which means Re(µ) ≥ 1 2 b N and taking the infimum over the real parts of the eigenvalues of M , we get that the spectral gap of the chain is of order bigger than N −3 : Eventually, from the whole procedure in this note we have that the spectral gap of the homogeneous harmonic chain is in between N −3 and N −1 . In [BM] it is proven that this bound is the sharp one.
From a simple numerical simulation on the spectral gap of the matrix M , the true value is indeed N −3 . In particular calculating the real part of the smallest eigenvalue of the matrix M in N iterations, and multiplying the result by N 3 we get the following behaviour in Figure 2, which shows that then the spectral gap converges for large N : pointing out a mistake in earlier version of this draft and suggesting the reference [Ves03], to me. Finally, I thank Josephine Evans for first discussions on the idea. This work was supported by the EPSRC grant EP/L016516/1 for the University of Cambridge CDT, the CCA.