Exponential equilibration of genetic circuits using entropy methods

We analyse a continuum model for genetic circuits based on a partial integro-differential equation initially proposed in Friedman et al. (Phys Rev Lett 97(16):168302, 2006) as an approximation of a chemical master equation. We use entropy methods to show exponentially fast convergence to equilibrium for this model with explicit bounds. The asymptotic equilibration for the multidimensional case of more than one gene is also obtained under suitable assumptions on the equilibrium stationary states. The asymptotic equilibration property for networks involving one and more than one gene is investigated via numerical simulations.


Introduction
Translation of the information encoded in genes is responsible for all cellular functions. The decoding of DNA can be summarised, following the central dogma of molecular biology, in two steps: the transcription into messenger RNA and the translation into proteins. Cells produce responses to environmental signals, thanks to the regulation of DNA expression via certain feedback mechanism activating or inhibiting the genes. Typically, regulation is produced by the union of proteins to the DNA binding sites. Moreover, the number of species involved in gene regulatory networks (gene expression together with their regulation) is small, which makes its behaviour inherently stochastic (Elowitz et al. 2002;Gillespie 2007;Kaeet al. 2005;McAdams and Arkin 1997;Paulsson 2004). This underlying stochastic behaviour in gene regulatory networks is captured by using the chemical master equation (CME) (Kepler and Elston 2001;Mackey et al. 2011;Paulsson 2005;Sherman and Cohen 2014). However, the CME solution is unavailable in most cases, due to the large (even infinite) number of coupled equations.
There are two main ways to obtain the CME solution: via stochastic simulation or via approximations of the CME. One of the most extended methods to reproduce the CME dynamics using stochastic realisations is the stochastic simulation algorithm (SSA) (Gillespie 1976(Gillespie , 2007. This method has no restrictions in its applicability, even though it is computationally expensive. On the other hand, CME approximations which remain valid under certain conditions include the finite state projection (Munsky and Khammash 2006), moment methods (Engblom 2006;Hasenauer et al. 2015), linear noise approximations (Thomas et al. 2014;Kampen 2007;Wallace et al. 2012) or hybrid models (Jahnke 2011).
In addition to the above mentioned methods, assuming that protein production takes place in bursts one can obtain a partial integro-differential equation (PIDE) as a continuous approximation of the CME. This PIDE has a mathematical structure very similar to kinetic and transport equations in mathematical biology (Perthame 2007) and it admits an analytical solution for its steady state in the case of networks involving only one gene. In the next subsections, we describe both the one dimensional PIDE model (Friedman et al. 2006) for self-regulated gene networks and the generalised PIDE model (Pájaro et al. 2017) for arbitrary genetic circuits. We will discuss the main properties of the stationary states in one dimension to finally explain the main results of this work.

1-dimensional PIDE model
The kinetic equation, first proposed by Friedman et al. (2006), is a continuous approximation of the CME for gene self-regulatory networks. A schematic representation of this genetic circuit is illustrated in Fig. 1, where the transcription-translation mechanism from DNA to a protein X is shown. Note that DNA transcribes into messenger RNA not only from the active state at rate (per unit time τ ) k m , but also from the inactive state with rate constant k ε lower than k m , which is known as basal transcription level or transcriptional leakage (Friedman et al. 2006;Ochab-Marcinek and Tabaka 2015;Pájaro et al. 2015). The messenger RNA transcribes into protein X following a first-order process with rate constant (per unit time) k x . The messenger RNA and protein are degraded at rate constants γ m and γ x respectively.
For self-regulated gene networks, activation or inhibition of the DNA promoter is produced by the union of the protein expressed to the DNA binding sites (feedback mechanism). So that, under protein action the promoter can switch between its inactive (DNA off ) and active (DNA on ) forms, with rate constants k on and k off respectively (see Fig. 1). There are two types of feedback mechanism: positive or negative, cor- 1 Schematic representation of the transcription-translation mechanism under study. The promoters associated with the gene of interest are assumed to switch between active (DNA on ) and inactive (DNA off ) states, with rate constants k on and k off per unit time, respectively. In this study, the transition is assumed to be controlled by a feedback mechanism induced by the binding/unbinding of a given number of X -protein molecules, what makes the network self-regulated. Transcription of messenger RNA (mRNA) from the active DNA form, and translation into protein X are assumed to occur at rates (per unit time) k m and k x , respectively. k ε is the rate constant associated with transcriptional leakage. The mRNA and protein degradations are assumed to occur by first order processes with rate constants γ m and γ x , respectively responding to whether the protein inhibits or promotes their production, respectively. The fraction of the promoter in the active or inactive state is typically described by Hill functions (Alon 2007). We can express the probability that the promoter is in its inactive state as a function of the protein amount x, denoted by ρ : R + → [0, 1] (see Ochab-Marcinek and Tabaka 2015;Pájaro et al. 2015): where K := k off k on is the equilibrium binding constant and H ∈ Z\{0} is the Hill coefficient which is positive if H proteins bound to the DNA inhibiting their production (negative feedback) and negative if |H | proteins bound to the DNA activating their production (positive feedback). Then, the rate R T of messenger RNA production (transcription) can be written as function of the Hill expression (1.1), R T = k m c(x), with the input function c(x) := (1 − ρ(x))+ρ(x)ε, where ε is the leakage constant defined as ε := k ε k m . Note that the function R T accounts for the messenger RNA production both from the DNA active state (with probability 1 − ρ(x)) with rate constant k m and from the inactive DNA (with probability ρ(x)) with lower rate constant k ε .
The PIDE model is valid under the assumption of protein production in bursts. So, we consider gene self-regulatory networks where the degradation rate of m R N A is much faster than the corresponding to protein, γ m /γ x 1. Such condition is verified in many gene regulatory networks, both in prokaryotic and eukaryotic organisms (Shahrezaei and Swain 2008;Dar et al. 2012), and results in protein being produced in bursts. As suggested in Friedman et al. (2006) and Elgart et al. (2011), the burst size (denoted by b = k x γ m ) is typically modelled by an exponential distribution. The conditional probability for protein level to jump from a state y to a state x > y after a burst is proportional to: The temporal evolution of the probability density function of the amount of proteins, p : R + × R + → R + is described by the following PIDE model: where τ is time, t = γ x τ represents a dimensionless time associated to the time scale of protein degradation, a = k m γ x is the dimensionless rate constant related to transcription, which represents the mean number of bursts (burst frequency) and ω(x − y) is given by (1.2). The input function c : R + → [ε, 1], which represents the feedback mechanism, takes the form (Ochab-Marcinek and Tabaka 2015; Pájaro et al. 2015): Note that the above input function can be constant, equal to one, when the protein does not promote or repress its production (open loop). This constant c(x) = 1 is used when the DNA is always in its active state, thus implying a unique messenger RNA production rate (k m ), reducing the system complexity.
We denote the stationary solution of Eq. (1.3) (which we sometimes call equilibrium) as P ∞ (x), which therefore verifies the following equation: We say a stationary solution is normalised when its integral over [0, +∞) (which we sometimes call its mass) is equal to 1. This equation has a unique solution with mass 1, which can be written out explicitly as (Ochab-Marcinek and Tabaka 2015;Pájaro et al. 2015): with ρ(x) defined in (1.1) and Z being a normalising constant such that ∞ 0 P ∞ (x) dx = 1. Alternatively, stationary solutions may be studied by considering the zero-flux case; see for example Bokes and Singh (May 2017); Bokes et al. (Jul 2018).
In case of no self-regulation (open loop network with c(x) = 1; that is, = 1) the stationary solution is a gamma distribution (Friedman et al. 2006), which is in fact the limit of (1.6) as tends to 1: (1.7)

Generalised n-dimensional PIDE model
Recently the 1D PIDE model has been extended to overcome more general gene regulatory networks than the self-regulation considered by Friedman et al. (2006). As a first step in this extension, Bokes and Singh (2015) propose the use of variable protein degradation rate, in order to accommodate gene networks with decoy binding sites (Lee and Maheshri 2012) to the PIDE model structure. Finally, including the previous models and considering genetic networks involving more than one gene Pájaro et al. (2017) proposed the generalised PIDE model for any number of genes. In Pájaro et al. (2017) a general gene regulatory network comprising n genes, . . , DN A n }, is proposed. These genes encoded by DNA-subchains are transcribed into n different messenger RNAs M = {m R N A 1 , . . . , m R N A i , . . . , m R N A n }, which are translated into n proteins types X = {X 1 , . . . , X i , . . . , X n }. We show a schematic representation of the general network in Fig. 2, which is similar to the self-regulation circuit. The main differences are that: (i) each DNA type can be regulated by others different proteins than the one expressed by the considered gene (cross regulation), and (ii) the protein degradation rate can be a variable function of all proteins types considered.
The structure of this multidimensional network is equivalent to the previous selfregulation case. Each promoter can switch from the inactive states (DN Ai off ) to the active one (DN Ai on ) or vice versa with rate constants k i on and k i off respectively. The leakage (basal) messenger RNA production from the inactive promoter is conserved at lower rate constant (k i ε ) than its production from the active state (k i m ). Each i messenger RNA type is translated into the protein X i at rate constant k i x . Both messengers RNA and proteins are degraded with rates γ i m and γ i x (x) respectively. Note that for this general network the total rate of production of m R N A i , R i T , can be written as the rate constant production from the active DN A i state times one input function c i (x) describing all possible types of feedback mechanism. However, there are not universal expressions for c i (x), due to their dependence on the regulatory mechanism considered (the messenger RNA production can occur from intermediate DNA states between the total activated and the total repressed ones), some examples have been described in Alon (2007) and Pájaro et al. (2017). Without loss of generality, we can construct the input function verifying that its image is a positive interval, Considering the set of n proteins X = {X 1 , . . . , X n }, we define the n-vector x = (x 1 , . . . , x n ) ∈ R n + as the amount of each protein type. The generalised (ndimensional) PIDE model, proposed in Pájaro et al. (2017), describes the temporal evolution of the joint density distribution function of n proteins p : R + × R n + → R + : Schematic representation of the transcription-translation mechanism under study. The promoters associated with the genes of interest are assumed to switch between active (DN Ai on ) and inactive (DN Ai off ) states, with rate constants k i on and k i off per unit time, respectively. The transition is assumed to be controlled by a feedback mechanism induced by the binding/unbinding of a given number of X jprotein molecules with j ∈ J (more than one protein type can bind to the DNA), which makes the network self-regulated if i = j or cross-regulated if j = i. Transcription of messenger RNA (m RN A i ) from the active DN Ai form, and translation into protein X i are assumed to occur at rates (per unit time) k i m and k i x , respectively. k i ε is the rate constant associated with transcriptional leakage. The m RN A i degradation is assumed to occur by first order processes with rate constant γ i m . Degradation of the X i -protein may follow different pathways, which is modelled by the function γ i where y i represents the vector state x with its i-th position changed to y i , (that is: is the degradation rate function of each protein. The first term in the right-hand side of the equation accounts for protein degradation whereas the integral describes protein production by bursts. The burst size is assumed to follow an exponential distribution, what leads to the conditional probability for protein jumping from a state y i to a state x i after a burst be given by: are dimensionless frequencies associated to translation which corresponds with the mean protein produced per burst (burst size). The function c i (x) (c i : R n + → [ε i , 1]) is an input function, which models the regulation mechanism of the network considered.
The stationary solution P ∞ (x) of (1.8) satisfies: Note that an analytical expression for the steady state solution is not known for the general case of the PIDE model (1.8). Some properties of the 1D solution remain valid for the nD steady state since P ∞ (x) is a probability density function, then R n + P ∞ (x) dx = 1. However, we do not have any other prior information about the properties of stationary solutions.

Main results
In this work we will apply entropy methods in order to analyse the asymptotic equilibration for the kinetic equations (1.3) and (1.8). These equations bear a similar structure to the self-similar fragmentation and the growth-fragmentation equations (Perthame and Ryzhik 2005;Laurençot and Perthame 2009;Doumic 2010;Cáceres et al. 2011;Balagué et al. 2013), used for instance in cell division modelling. In those cases, the transport term makes the cluster size of particles grow while the integral term breaks the particles into pieces of smaller size. In our present models, the transport term degrades the number density of proteins while the integral term makes the protein number density to grow.
In fact, the kinetic equations (1.3) and (1.8) have the structure of linear population models as in Michel et al. (2004Michel et al. ( , 2005 and Carrillo et al. (2011) for which the socalled general relative entropy applies. This fact already reported in Pájaro et al. (2016) implies the existence of infinitely many Lyapunov functionals for these models useful for different purposes among which to analyse their asymptotic behavior. We will make a summary of the main properties of Eq. (1.3) in Sect. 2 together with a quick treatment of the well-posedness theory for these models. They are easily generalisable to the multidimensional case (1.8).
In Sects. 3 and 4, we will improve over the direct application of the general relative entropy method in Pájaro et al. (2016). On one hand, we study in Sect. 3 the case of gene circuits involving one gene, Eq. (1.3), a direct functional inequality between the L 2 -relative entropy and its production leading to exponential convergence. In order to fix our setting, we recall that ω is given by (1.2) for some b > 0, and c = c(x) is given by (1.4), for some constants K > 0, H ∈ Z\{0} and 0 < ≤ 1; and a > 0 is a constant. For 1 ≤ p < +∞ we denote by L p ( ) the usual Lebesgue spaces of real functions f on such that | f | p is integrable in the Lebesgue sense. We also write L p ( , w) to denote the corresponding spaces of functions f such that | f | p is integrable with a weight w. Theorem 1.1 (Long-time behaviour for the 1-dimensional model) Let p 0 be a probability distribution such that p 0 ∈ L 1 ((0, +∞)) ∩ L 2 ((0, +∞), P −1 ∞ ), and let p be the mild solution to Eq. (1.3) with initial data p 0 (see Definition 2.1). There exists a constant λ > 0 depending only on the parameters of the equation (and not on p 0 ) such that The value of λ can be estimated explicitly from the arguments in the proof, though we do not consider the specific value to be a good approximation of the optimal decay rate. The behaviour of the stationary solutions P ∞ (x) near the origin and infinity is crucial for direct functional inequalities involving the relative entropy and its production in the one dimensional case. What we are showing is essentially a spectral gap in a weighted L 2 norm, and some remarks are in order regarding the specific choice of space L 2 ((0, +∞), P −1 ∞ ) that we have made. As will be seen later, this space is very natural for the technique we are going to use, since the evolution operator is contractive in this norm, and a similar observation is true for any Markov semigroup with an equilibrium. However, it is very likely that this operator also has a spectral gap in L 2 norms with different weights, in weighted L 1 norms, and in other metrics, as is often the case with Markov operators.
In many examples (such as the Fokker-Planck equation) it is known that the spectral gap property breaks for weights which are slowly decaying, so that there may not be a spectral gap in L 1 , for example. In those cases there are well-known examples of initial data with slowly-decaying tails whose associated solution converges to equilibrium as slowly as one wishes. The same happens for example to the Boltzmann equation from kinetic theory; we refer to Gualdani et al. (2010) for details on the extension of spectral gaps to different weights. So the weight is not only a technical assumption: there may be norms and weights in which the convergence is not exponential. However, exponential weights as the ones we use are probably far from being the optimal ones where one can show a similar result.
Section 4 is devoted to the analysis of the multidimensional Eq. (1.8) corresponding to multiple genes involved in the gene transcription. In this case, solutions to the stationary problem (1.9) are not explicit and hence we are not able to control precisely the behaviour of the stationary solutions near the origin and infinity as before. For this reason, we are only able to show convergence towards a unique equilibrium solution assuming its existence with suitable behavior near the origin and infinity: Theorem 1.2 (Long-time behaviour for the n D model) Given any mild solution p with normalised nonnegative initial data p 0 ∈ L 1 (R + ) to Eq. (1.8) and given a normalised stationary solution P ∞ (x) to (1.8) satisfying the technical Assumption 4.1 from Sect. 4, it holds that As a consequence, if a normalised stationary solution P ∞ (x) of (1.8) and satisfying Assumption 4.1 exists, it is unique.
The proof is based on a weaker variant of our one-dimensional inequality, in which the control between the relative entropy and its production is obtained except for an error term which happens to be small under the assumptions of the behavior of the stationary solution P ∞ (x). Both results of equilibration are illustrated with numerical simulations in their corresponding sections.

Properties of stationary solutions
Let us start by discussing the basic properties of the one dimensional stationary states to (1.3). The behaviour of the stationary state at zero and at +∞ depends on both r = aε − 1 and a due to the presence of the function ρ(x) and its dependence on H . It is as follows: There are two large areas where the protein distribution change fundamentally its properties, the first including the shapes one and two, where a < 1 ε and lim x→0 P ∞ (x) = +∞ and the second with P ∞ (x) finite for all non-negative x, which includes shapes three to five Then the stationary state P ∞ (x) exhibits a singularity at zero for 0 < a < 1 and it is smooth otherwise having zero limit for a > 1 and a positive limit for a = 1.
Then the stationary state P ∞ (x) exhibits a singularity at zero for aε < 1 and it is smooth otherwise having zero limit for aε > 1 and a positive limit for aε = 1.
As a particular case, if c(x) ≡ 1 then P ∞ (x) is given by (1.7) and we have P ∞ (x) x a−1 as x → 0 + and P ∞ (x) x a−1 e −x/b as x → +∞. Then the stationary state P ∞ (x) exhibits a singularity at zero for a < 1 and it is smooth otherwise having zero limit for a > 1 and a positive limit for a = 1.
Note that in all cases lim x→∞ P ∞ (x) = 0. As we can see in Fig 3, the stationary solution has five different qualitative behaviours for H < 0 (see also Pájaro et al. 2015): Fig 3). 1.2 Two peaks one in x = 0 and another in x > 0 (Case 2 Fig 3).
Note that, cases 2.1 and 2.3 are equivalent, and lim x→∞ P ∞ (x) = 0 for all cases. If H > 0 (or c(x) = 1) the bimodal behaviour disappears, and only cases 3 or 5 remain for a > 1 and case 1 if a < 1.

Well-posedness
The 1D Eq. (1.3) is a linear integro-differential equation for which well-posedness and some basic properties follow from standard methods. A classical solution to Eq.
for all x ∈ (0, +∞). It is not hard to show that, given an integrable initial condition there exists a unique mass-conserving classical solution. In order to give a brief sketch of the proof it is perhaps easier to work with mild solutions, which we will introduce now. Given and given any function This notation is motivated by the fact that X t # p 0 is the transport of the function p 0 by the dilation map X t (x) := xe −t . By the method of characteristics one easily sees that a classical solution p to (1.3) must satisfy This suggests the following definition.
In addition, there is a constant C > 0 (independent of p 0 ) such that Moreover, for any p 0 ∈ C 1,b (0, +∞) there exists a unique classical solution of (1.3) with initial data p 0 .
Proof This result can be obtained by considering the functional: By following an argument very similar to that of Picard iterations, one obtains the existence of mild solutions on a time interval [0, T ]. Since the equation is linear (and our equation is invariant under time translations), this argument can be iterated to find solutions on [0, +∞). We refer to Engel and Nagel (2006) and Cañizo et al. (2013) for full details of this standard argument. If the initial condition p 0 is in C 1,b (0, +∞), one can see that the iteration above can also be done in the space Z : 1. Positivity is preserved: if p 0 ≥ 0 a.e. then p(t) ≥ 0 a.e., for all t ≥ 0. 2. The L 1 norm is decreasing p(t) 1 ≤ p 0 1 for all t ≥ 0, leading to L 1 -contraction by linearity. If p 0 ≥ 0, the above inequality becomes an identity. 3. Maximum principle: Proof In order to show that positivity is preserved for any classical solution, we can rewrite, using Duhamel's formula, where S t is the semigroup associated to the equation ∂ t p − ∂ x (x p) + ac(x) p = 0 and L + is the operator given by This way of writing the solution clearly shows p is nonnegative if p 0 is nonnegative, since p is a fixed point of the positivity-preserving operator , which is also contractive in the L ∞ norm (for example) for t small enough. Now, for a mild solution we obtain the same result by approximation from classical solutions, taking into account the L 1 -stability (2.2).
For the second part of the result, denote by T t the semigroup in L 1 (0, +∞) defined by the equation, and write f + := max{0, f }, f − := max{0, − f } for the positive and negative parts of a function f , so that f = f + − f − . The positivity and mass preservation imply that: Finally, for the maximum principle just notice that, if M is the supremum on the right hand side, the function q = M P ∞ − p is a mild solution with nonnegative initial data. Due to preservation of positivity we obtain the inequality on the right-hand side. The minimum principle is obtained analogously.

Entropy and H-theorem
Let H [0, +∞) → R be a convex function. We define the general relative entropy functional as: for all t ≥ 0.

Remark 2.5
Notice that the dependence on the time variable in (2.4) has been omitted for simplicity. Observe that the right-hand side in (2.4) is non-positive since the Proposition 2.4 is very close to the results in Section 2 of Michel et al. (2005), but is strictly not contained there due to the form of the integral operator. It is worth giving a derivation of the result, so we include a proof here. We first obtain a technical lemma involving some classical computations in Michel et al. (2005): Lemma 2.6 Under the assumptions of Proposition 2.4, then the following equality is satisfied So that, replacing the first expression in the second we have that: (2.6) Next, by using the following identities: in (2.6) we obtain: Note that the terms u(x)P ∞ (x)− p(x) vanish, since u(x)P ∞ = p(x). Finally, reordering terms in the last equation we obtain the equality (2.5).

Proof of Proposition 2.4
We start the proof computing the time derivative of the general relative entropy functional We replace the time derivative of p(τ, x) by its expression (1.3) to obtain: Using lemma 2.6 and the fact that p(x) = u(x)P ∞ (x) we have: In the above equation the term vanishes since lim x→+∞ x P ∞ (x) = lim x→0 x P ∞ (x) = 0, and noticing that u(x) ≤ M for all t ≥ 0, x > 0 due to the maximum principle in Lemma 2.3. Replacing the term containing the first order derivative by its value in Eq. (1.5) we get Reordering terms in the above equation we have that

H (u(y))c(y)P ∞ (y)dy,
so we can change the order of integration in the above equation to obtain Since ∞ y ω(x − y)dx = 1, we multiply by this integral the second term in the first line on the right-hand side of the above equation to conclude which is the desired identity.

Exponential convergence for the 1D PIDE model
In this section our aim is to prove that Eq. (1.3) converges exponentially to the steady state, P ∞ . For this purpose, we consider the L 2 -relative entropy, i.e., the convex function H is chosen as H (u) = (u − 1) 2 , and where we have used that p(t, x) and P ∞ (x) are probability density functions. Now, by replacing the value of the considered convex function in Proposition 2.4, we obtain the following identity The entropy method consists in finding conditions under which the following functional inequality holds: Notice that the dependence on the time variable can be forgotten at this point, since our objective is to show such an inequality among a subset of suitable probability densities. For this purpose, we start by rewriting G 2 (u) in a equivalent form (Cáceres et al. 2011): Lemma 3.1 Given a non-negative measurable function P ∞ : (0, ∞) → R + such that ∞ 0 P ∞ (x)dx = 1 and defining the functional Proof Expanding the square implies while H 2 (u) is a symmetric function, so that: which is equal to (3.3).
As consequence of this lemma we are reduced to show the inequality among a suitable subset of probability densities.

Entropy-entropy production inequality
We start by obtaining bounds for the steady state solution P ∞ , of the Friedman Eq. (1.3).

Lemma 3.2 (P ∞ bounds)
For δ > 0 we define the intervals of length 1 2 : Then, the following inequality holds: ∀x ∈ I k,δ and ∀k, (3.5) with P ∞ (x) given by (1.6) and A(δ) and B(δ) being positive constants that only depend on δ (and network parameters), but they are independent of protein amount k.

Proof Note that x H + K H a(ε−1) H
and e −x b are decreasing functions, so that their maxima are atx 0 = δ + k 2 and their minima are atx 1 = δ + k+1 2 in I k,δ . The term x a−1 shows different behaviours which depend on the parameter a, (this term is increasing if a > 1, constant if a = 1 and decreasing if a < 1). So that, we can bound P ∞ (x) in the interval I k,δ as follows: . Now, in order to calculate the bounds of P ∞ (x) p k , we divide the expression (3.6) by p k to obtain A(δ, k) ≤ P ∞ (x) p k ≤ B(δ, k) with the functions A and B being, , k)) are well-defined and positive, leading to desired inequality (3.5).
Note that inequality (3.5) can be directly checked for the simplest open loop case, whose stationary solution is given by (1.7).

Lemma 3.3 Let us define
with {m k } k≥1 a positive sequence given by m k = p k e δ+ k 2 2b . Then, there exists C > 0 such that for all k ∈ N. (3.8)

Proof
We define {a j } j≥1 with a j = 1 m j to calculate the following limit Since this limit exists and {M j } j≥1 is a strictly increasing and divergent sequence, we can use the Stolz-Cesàro theorem to obtain that M j ≤ C 0 a j , with C 0 > 0 constant. Then, The summation term at the right hand side can be calculated as follows e−1 , concluding the proof. In order to prove the exponential convergence of the Friedman Eq. (1.3) we are going to split the proof of inequality (3.4) in the following two propositions. with u = p/P ∞ , for all p ∈ L 1 ((0, +∞)) ∩ L 2 ((0, +∞), P −1 ∞ ).
Proof We take 0 < δ < 1 and split H 2 (u) in two parts For i, j ≥ 0 integers we define We can estimate both the left and the right-hand sides of (3.9) by using the quantities A i, j .
Step 1: H 21 (u) bound.-We start working on the term H 21 (u)(τ ), where 0 < δ < y < x. By swapping (x, y) in the domain of integration, we get Now, using the inequality (3.5) and the symmetry A i, j = A j,i , we obtain (3.10) Note that some terms in this expression already appear in the right hand side of (3.9), since: P ∞ (x) < ∞ due to the properties described in Sect. 2.1.
In order to estimate A i, j for j > i we fix i, j and call n := j − i ≥ 1. We use n − 1 "intermediate reactions" to write the following: introduce n − 1 dummy integration variables z i+1 , . . . , z j−1 and denote averaged integrals with a stroke. Thus, we have: where the last step is just renaming x ≡ z j and y ≡ z i . Observe that nothing has been done in the case j = i + 1. Using the Cauchy-Schwarz inequality and (3.7), we have Hence, we deduce that The inequality k i=0 p i ≤ C, in the previous expression, holds because ∞ i=0 p i is a convergent series due to the d'Alembert's ratio test. Moreover, (3.8) implies for a generic constant C > 0. We finally work in the Eq. (3.12) to obtain where we use that y < δ+ k+1 2 < δ+ k+2 2 < y +1 and (3.5). We conclude by plugging the above estimate in (3.12), which together with Eqs. (3.10) and (3.11) show that for some constant λ 1 > 0.
Step 2: H 22 (u) bound.-To prove that there exists λ 2 > 0 such that we use an intermediate variable z ∈ (δ, 1) as follows: We bound each of the terms I 1 , I 2 . First, for I 1 we deduce that since ∞ 0 P ∞ (y)dy = 1. For I 11 we use that P ∞ is bounded below on [δ, 1] ( 1 C δ ≤ P ∞ (x), x ∈ [δ, 1]) to deduce Note that the right hand side of the above equation is bounded by a multiple of the term H 21 (u), thus leading to I 11 ≤ CH 21 (u) with C = 2C δ 1 − δ . Using (3.13) we deduce that I 11 ≤ C D(u).
The integral I 12 is clearly smaller than the right hand side of (3.9) since it involves a smaller domain of integration, indeed we obtain and thus, we also deduce that I 2 ≤ C D (u). Putting together the estimates on I 11 , I 12 and I 2 , we conclude that (3.14) for some λ 2 > 0. Finally, inequalities (3.13) and (3.14) together imply that λH 2 (u) ≤ D(u) concluding the proof.
Proof Note that, y < x < y + 1 on the left hand side of (3.15). Thus, we can bound the term ω(x − y) with x ∈ [y, y + 1]. Since ω(x) is a decreasing function of x, then Moreover, the term c(x) is bounded, ε ≤ c(x) ≤ 1 for all x ∈ R + . So that: which proves the inequality (3.15).

Numerical illustration of exponential convergence
The entropy functional, G 2 (u)(t), is represented in the plots B of Figs. 4, 5, 6, 7 and 8, which address the five possible steady states plots A of Figs. 4, 5, 6, 7 and 8 (see also Fig. 3). For all cases, these functions are represented in a semi-logarithm scale to numerically validate the exponential convergence shown in the previous section. A Gaussian distribution with mean 2 and standard deviation 0.1, N (2, 0.1), has been considered as initial condition.

The nD PIDE model
We can generalise the entropy functional (2.3) defined for the one dimension PIDE model in order to study the convergence of the multidimensional model. A wellposedness theory of mild and classical solutions satisfying the positivity and mass preservation, the L 1 -contraction principle, and the maximum principle can be analogously obtained from the one dimensional strategy in Sect. 2. Let us summarize these properties in the next proposition.
(iv) L q bounds, 1 < q < ∞: (v) Maximum principle: We will not do any details of these classical results. We just point out that these properties can be formally seen as consequences of the general relative entropy method (Michel et al. 2004(Michel et al. , 2005. Let us now concentrate on the entropy method. Given H (u) any convex function of u, we define the n-dimensional general relative entropy functional as: with u(x) := p(x)/P ∞ (x) as above. The main difference in the multidimensional case is that the stationary states are not explicit and thus, we need to assume certain properties on their behavior. In fact, in order to apply the entropy-entropy production method we make the following assumption: Assumption 4.1 The following property holds for any convex function H (u) and for all differentiable p ∈ L 1 ((0, +∞)) ∩ L 2 ((0, +∞), P −1 ∞ ).
Similarly to the one dimensional case, we can obtain the following identity. The proof is totally analogous to the one of Lemma 2.6 and we skip it here for brevity.

Lemma 4.2 Let p be a differentiable function on R n
+ . For any i = 1, . . . , n the following equality is verified: With this identity, we can now derive the evolution of the relative entropy as in the one dimensional case. We will not make explicit the time dependency of the solutions again for simplicity.

Proposition 4.3 Let p be a classical solution to the n D PIDE model with initial data
. For any convex function H (u(x)), the general entropy func-

Proof of proposition 4.3
We compute the time derivative of the general relative entropy functional to get Replacing the time derivative of p(x) in the last equality by its expression (1.8), we obtain Summations and integrals in the above expression are interchangeable, so that (4.2) Next, using Lemma 4.2, the first term on the right hand side in the above equation becomes (4.3) this last identity holds using Assumption 4.1. Note that, the first term in the last summation in Eq. (4.3) is equivalent to (4.4) and the second term in the last summation in Eq. (4.3) is equivalent to (4.5) Thus, using the expressions (4.4-4.5), replacing first in (4.3) and finally in the Eq. (4.2), we obtain the following equality (4.6) By changing the order of integration in the above expression and using the following identity the Eq. (4.6) can be rewritten in the following equivalent form which is equivalent to the expression (4.1) defined in Proposition 4.3, thus concluding the derivation of the identity. Observe finally that due to the convexity of H (u), we deduce that As in the one dimensional case, we will focus on the L 2 -relative entropy, i.e., we choose H (u) = (u − 1) 2 to define Proposition 4.3 leads to the relation (4.7)

Approach to equilibrium
Based on the Assumption 4.1 on stationary solutions, we are now able to control the entropy by the entropy production except for a small error term.
For the integral over the complement, using p ≤ C 1 P ∞ , we deduce On the other hand, for the integral over δ × δ we get We now rewrite u(x) − u(y) as a sum of n terms, each of which being a difference of values of u at points which differ only by one coordinate where K δ,2 is defined by with the infimum running over all i = 1, . . . , n and over all the points in the domain of integration. We notice that the first of the equalities in (4.9) is just obtained by integrating in the variables that do not appear in the expression and renaming the others; and the second equality is due to the symmetry of the integrand in the variables (x i , y i ). Using (4.8)-(4.9) finally gives: We may choose δ > 0 such that the first term is smaller than . This gives then the result with K = 1 2 K δ,1 K δ,2 .
Theorem 4.5 (Long-time behaviour) Given any mild solution p with normalised nonnegative initial data p 0 ∈ L 1 (R + ) to Eq. (1.8) and given a stationary solution P ∞ (x) to (1.8) satisfying Assumption 4.1, then As a consequence, stationary solutions P ∞ (x) of (1.8) satisfying Assumption 4.1, if they exist, they are unique.

Proof
Step 1: Proof for "nice" initial data. We first prove the result for initial data p 0 ∈ L 1 (R n + ) ∩ C 2 (R n + ) such that p 0 ≤ C 1 P ∞ , for some constant C 1 > 0. Observe that this implies in particular that p 0 ∈ L 2 (R n + , P ∞ (x) −1 dx). For such initial data we deduce that for all t ≥ 0 p(t, x) ≤ C 1 P ∞ (x) for almost all x ∈ R n + , from the maximum principle. This enables us to use Lemma 4.4. Using the general A B 0 1 2 3 4 5 10 -2 10 -1 10 0 10 1 10 2 Fig. 9 Example of two self regulated proteins whose distribution has a peak in x = (0, 0). (Same parameters as in the example depicted in Fig. 4 for both proteins)

Numerical exploration of the convergence rates
The entropy functional, G n 2 (u)(t), is represented in the plots B of Figs. 9, 10 and 11, which address three possible steady states (plots A of Figs. 9, 10 and 11) that have been obtained using the SELANSI toolboox (Pájaro et al. 2018). For all cases, these functions are represented in a semi-logarithm scale to numerically check if the convergence shown in the previous section is exponential in higher dimensions.
In the first example, Fig. 9, we consider two different self-regulated proteins with input functions: with H i = −4, ε i = 0.15, K i = 45, a i = 5 and b i = 10 as in the example depicted in Fig. 4. The second example, Fig. 10, is a self and cross-regulated gene network expressing two different proteins where the first one activates the production of both itself and the second protein, while the second protein inhibits the expression of both proteins. The input functions considered, as in Pájaro et al. (2017)  Example of two mutual repressed proteins whose joint distribution is bimodal attaining two peaks in two positive points. Parameters:γ 1 x = γ 2 x = 1, γ 1 m = γ 2 m = 25, k 1 m = k 2 m = 8 and b 1 = b 2 = 16 with input functions defined in (4.13) Our third example, Fig. 11, corresponds to a mutual repressing network of two genes in which the protein produced by the expression of one gene inhibits the production of the other protein in the network. The input functions, as in Pájaro et al. (2017), for this example take the following form: with H 12 = H 21 = 4, K 1 = K 2 = 45 and ε 1 = ε 2 = 0.15. The dimensionless network parameters are γ 1 x = γ 2 x = 1, γ 1 m = γ 2 m = 25, k 1 m = k 2 m = 8 and b 1 = b 2 = 16.

Conclusions
Analytical results for the n D model show convergence to equilibrium via a very general method, but do not give a bound on the convergence rate. The numerical simulations we have carried out clearly support the idea that exponential convergence also holds in the multidimensional case, though we have not been able to prove this using the same entropy method as in the one-dimensional case. Approach to equilibrium seems to follow a steady exponential speed, being quickly dominated by the spectral gap expected from our analysis. There also seem to be initial regimes where the approach to equilibrium can occur much faster; our interpretation is that smaller (more negative) eigenvalues can dominate at initial stages of time evolution, but are overcome by the dominant eigenvalue as equilibrium is approached.