1 Introduction: Motivation and Problem Formulation

There are two sources of motivation for the study of the gradient flow of the entropy of Anosov maps with respect to the Sinai–Ruelle–Bowen (SRB) invariant measure. One is purely mathematical: Let U(L) denote a family of \(C^3\) transitive Anosov systems on a torus of dimension \(n\ge 2\) that are topologically conjugate to a linear Anosov map L. We know that every transitive Anosov map belongs to one of these families [20] (Page 587). Each map \(f \in U(L)\) possesses a unique SRB measure \(\rho _f\) supported on the phase space. It is now well-known that the Kolmogorov–Sinai entropy of f with respect to the SRB measure \(\rho _f\), \(H(\rho _f)\) depends on the Anosov map f differentiably [27]. The value of this entropy \( H(\rho _f), f \in U(L)\) can vary between zero and the map’s topological entropy [14]. Because of the significance of the SRB measure [29], one naturally wishes to know what properties this entropy functional may have and whether it induces a gradient flow in U(L).

But the main source of motivation for studying this gradient flow comes from a new perspective on mathematical modeling of the evolution process of an isolated nonequilibrium thermodynamic system of many particles based on the Gallavotti–Cohen Chaotic Hypothesis [5,6,7,8]. In their hypothesis, one transitive Anosov map is used to represent a thermodynamic system for the study of macroscopic properties: A reversible many-particle system in a stationary state can be regarded as a transitive Anosov system for the purpose of computing the macroscopic properties of the system. We extend the chaotic hypothesis to modeling the process in which an isolated thermodynamic system evolves from its nonequilibrium state to an equilibrium and propose macroscopic properties of the process that we can investigate with this mathematical model.

Extension of Gallavotti–Cohen Chaotic Hypothesis: The process in which an isolated nonequilibrium thermodynamic system evolves to its equilibrium can be regarded as a gradient flow of an entropy functional in a family of transitive Anosov systems for the purpose of investigating macroscopic properties of the process.

This extension of the chaotic hypothesis to the evolution process of an isolated thermodynamic system is quite natural based on the following three considerations:

It extends the chaotic hypothesis to irreversible many-particle systems. As an isolated thermodynamic system evolves to its equilibrium, the entropy increases. Thus, the process is irreversible.

It makes the maximal entropy production principle (MEPP) [25] an integral part of the hypothesis.

It will allow us to explore whether laws of thermodynamics can be proved rigorously in this mathematical model. Proving the basic postulate of thermodynamics that every system evolves to a unique equilibrium would mean to establish the global existence of a gradient flow in which every trajectory converges to a unique limiting Anosov system. Proving the second law of thermodynamics would mean to show that the entropy increases along any trajectory of the flow.

Before we can begin our investigation, we need to clarify a couple of technical issues. First, we need to decide which entropy to choose for a transitive Anosov system corresponding to that of a nonequilibrium thermodynamic system.

The choice of an entropy. We choose the entropy for a transitive Anosov system to be the Kolmogorov–Sinai (a.k.a. measure-theoretic or metric) entropy of its SRB measure for several reasons. As Goldstein et al. have pointed out [11], there are essentially two notions of entropy for an isolated thermodynamic system: the Gibbs entropy, which is defined for every equilibrium measure that is invariant under the dynamics, and the Boltzmann entropy, which is defined for the system. They explained the differences between these two notions and why the Boltzmann entropy is the right choice for nonequilibrium systems. Choosing the entropy for a transitive Anosov system to be the Kolmogorov–Sinai entropy of the SRB measure combines the essential features of both notions: it is defined through an invariant measure and it is uniquely defined for each transitive Anosov system.

There is another significant reason why the SRB measure is chosen: it reflects the asymptotic distribution of orbits starting from a typical initial point with respect to the Lebesgue measure. Within the family of transitive Anosov systems, the value of the SRB entropy can take any value greater than zero and less than or equal to the topological entropy. Roughly speaking, an entropy value near zero often indicates that many orbits spend a disproportionate amount of time near a fixed point with a small expansion rate, while a large entropy value indicates that orbits are more evenly distributed over the phase space.

There are many other possible choices of invariant measures for the entropy. The SRB measure is the equilibrium state for the potential function \(\varphi = -\log J^uf \), essentially, the rate of expansion along the unstable subspace. Theoretically, any f-dependent potential function could be chosen as long as it is Hölder continuous and its dependence on f is differentiable. For example, one may choose the sum of the rates along both the stable and unstable subspaces, \(\varphi = -\log J^uf + \log J^s f \). The resulting gradient flow may have different behaviors.

Differentiable conjugacy: After this SRB entropy is chosen, we need to address the issue that different Anosov systems may have the same entropy. Indeed, two systems that are differentiably conjugate always have the same SRB entropy. Thus, the entropy functional is defined over equivalence classes. Systems that are differentiably conjugate can be considered as reversible systems since the entropy does not change.

There are unaddressed issues concerning this extension of the chaotic hypothesis: If Anosov systems are truly good coarse mathematical models for thermodynamic systems, we wish that such systems as a collection would satisfy the axioms proposed by Lieb and Yngvason in [22]. Properties such as the scaling property would hold for the entropy. However, at the moment, it is unclear how to address such issues. Another related question is about the role of the dimension of the phase space. Ideally, the phase space’s dimension should be very large, even infinite as in the coupled systems over an infinite lattice.

Under this extended version of Gallavotti–Cohen Chaotic Hypothesis with the aforementioned choice of the entropy functional, we now are ready to formulate mathematical problems that reflect laws of thermodynamics. We first describe briefly the space where the SRB entropy functional will be defined. Let \(f_0\) be a \(C^r, r \ge 3\) transitive Anosov map or an expanding map on a closed Riemannian manifold M. The \(C^3\) condition is used to guarantee the differentiability of the entropy functional for Anosov maps. For expanding maps, the \(C^3\) condition can be relaxed to \(C^{1+\alpha }\) for some \(\alpha > 0\). Since we will later consider the differentiability of the entropy in a different norm (Sobolev norm), for convenience, we will require expanding maps to have at least the third-order derivative, but not necessarily \(C^3\). Let \(U(f_0)\) be the connected open component of all \(C^r\) maps in either family topologically conjugate to \(f_0\). It is well-known that there exists a unique Sinai–Ruelle–Bowen (SRB) measure \(\rho _f\) for every map \(f \in U(f_0)\) [29] and the entropy of the map with respect to the SRB measure \(\rho _f\) is given by the formula \(H(f)= \int _M \ln J^uf \ d \rho _f\) [26] (Pages 230 & 294), where \(J^uf\) is the Jacobian of f along the unstable subspace. This entropy is a Fréchet differentiable functional in f with respect to the \(C^r\) topology on \(U(f_0)\) [2, 3, 15, 27]. The map \(f_0\) can also be a piecewise \(C^r\) expanding map on an interval or an Axiom A diffeomorphism on a closed Riemannian manifold. For simplicity, we limit our formulation of problems to the Anosov and expanding map cases.

Problem One. Does the entropy functional \(H(f): U(f_0) \rightarrow {\mathbb {R}}\) induce a gradient flow in \(U(f_0)\)? If it does, what properties does the gradient flow have? In particular, does every trajectory of the flow converge to a limiting map?

We note that local or global existence of a gradient flow does not follow from the analyticity of the entropy functional defined on the space of Hölder potential functions. In the traditional setup, a fixed transitive Anosov system f is used to model a thermodynamic system. An equilibrium state is then defined via the variational principle for every Hölder potential function. This equilibrium state can also be defined via a transfer operator induced by the Anosov map [12]. The Kolmogorov–Sinai entropy of the equilibrium state is analytic in terms of potential functions. Thus, the entropy functional defines a gradient flow after an adjustment of the norm in the potential space [9, 23]. We note that this gradient flow is not defined in the space of Anosov systems. It is true that the space \(U(f_0)\) can be embedded into the space of Hölder continuous potential functions via symbolic representation since all maps in \(U(f_0)\) share a common Markov partition. But \(U(f_0)\) is, in general, not an invariant submanifold of the gradient flow defined on the potential space.

In our setup, if the global existence of the gradient flow of the SRB entropy functional holds and each trajectory is shown to converge to a unique Anosov system, it can be regarded as a rigorous proof of the basic postulate of thermodynamics: an isolated nonequilibrium thermodynamic system evolves to a unique equilibrium.

Problem Two. Does the SRB entropy functional have any critical point, for example, a local maximum, where its value is strictly less than the global maximum, i.e., the topological entropy?

Such a critical point is supposed to be absent in the evolution of an isolated thermodynamic system. If we can show that such a critical point does not exist for the gradient flow and every trajectory converges to a limiting map, then the limiting map’s SRB entropy must be the maximum.

Problem Three. Would it be possible to have a differential equation description of the diffusion process induced by the gradient flow of the SRB entropy functional in the space of Anosov maps or the expanding maps?

The process of evolution of an isolated thermodynamic system from a nonequilibrium state to an equilibrium is a diffusive process as the heat, density, or pressure diffuses. If the gradient flow exists and the gradient vector can be explicitly calculated, it may potentially be expressed as differential equations in an infinite-dimensional space. One would like to compare such equations with diffusion equations such as the heat equation derived at the macroscopic level by making assumptions on the heat flux. Considering the gradient flow in the space of Anosov maps or expanding maps will lead to alternative mathematical descriptions of this diffusion process. With possible physics experiments, we can gain insights into the relation between the choice of the entropy and the resulting diffusion process.

At the moment, rigorous results are only proven for low dimensional systems. In [16], it has been proven that for both families of expanding maps and Markov transformations on a closed interval semi-conjugating to a full shift, the SRB entropy functional does not have any nontrivial critical point. If f is a critical point, then f has to be a linear map up to a differentiable conjugacy. In [17, 28], this property is proven for families of general transitive Markov transformations and Anosov maps on a torus of dimension two. Proofs of these results, with the exception of the Anosov system case, involve only well-established techniques.

In this article we prove global existence of the gradient flow induced by the SRB entropy on the family of expanding maps on the circle preserving a common SRB measure. We also prove that every trajectory of the flow starting from a nonequilibrium map, i.e., its SRB entropy is strictly smaller than its topological entropy, will converge to an expanding map that is differentiably conjugate to the linear expanding map. Furthermore, we obtain an ordinary differential equation representation of the gradient flow that has a close connection to a gradient-dependent diffusion equation.

When the unstable subspace of an Anosov or expanding map is one dimensional, increasing the SRB entropy forces the map to become linear. The mechanism is similar to the situation of maximizing the Shannon entropy \(- \sum _{i=1}^n p_i \log p_i\) in a probability space. When the dimension of the unstable subspace is greater than one, the nonlinearity of the map will not disappear by simply increasing the SRB entropy since the entropy is equal to a sum of positive Lyapunov exponents. However, local and global existence of the gradient flow is expected to be true. But trajectories of the flow are unlikely to converge to linear maps.
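The Shannon entropy analogy can be made concrete with a small numerical sketch. The probability vectors below are hypothetical illustrations, not data from the paper; the sketch only shows that the uniform distribution attains the maximum \(\log n\) and that moving a skewed distribution toward the uniform one raises the entropy.

```python
import math

def shannon_entropy(p):
    """Return -sum p_i log p_i for a probability vector p (0 log 0 := 0)."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

# Hypothetical probability vectors on n = 4 points.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

# Convex combination that moves the skewed vector halfway to the uniform one.
mixed = [0.5 * s + 0.5 * u for s, u in zip(skewed, uniform)]

h_uniform = shannon_entropy(uniform)   # equals log 4, the maximum
h_skewed = shannon_entropy(skewed)
h_mixed = shannon_entropy(mixed)

print(h_skewed, h_mixed, h_uniform)
```

Since the entropy is strictly concave, the three printed values are strictly increasing, which mirrors how raising the SRB entropy pushes a one-dimensional expanding map toward the linear one.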

In the next section, we first extend the family \(U(f_0)\) (still denoted by \(U(f_0)\)) to include all expanding maps whose rth derivative is \(L^2\). We show that \(U(f_0)\) is then equipped with a natural Hilbert manifold structure with a Sobolev norm in the tangent space. Under this Hilbert manifold structure, we consider the submanifold of \(U(f_0)\) consisting of maps preserving a common SRB measure and show that the SRB entropy functional H(f) remains Fréchet differentiable and thus the functional gives a gradient vector field over \(U(f_0)\). In Sects. 2 and 3, we show the gradient vector field is at least Lipschitz continuous, which guarantees the existence of a local gradient flow. We note that a differentiable function’s gradient is metric dependent. Under a slightly modified metric, we show in Sect. 4 the global existence of a gradient flow of the SRB entropy functional and the convergence of the flow to the linear expanding map as time approaches infinity. In the last section, via harmonic analysis, we obtain an ordinary differential equation representation of the gradient flow over the Hilbert manifold equipped with the Sobolev norm and give an example of a typical orbit using numerical approximation. The gradient flow also leads to a gradient-dependent diffusion equation on the circle.

2 Hilbert Manifold Structure on the Family of Circle Expanding Maps

The SRB entropy is a differentiable functional in the space of \(C^r, (r \ge 3)\) expanding maps on the unit circle. But when we consider the gradient of the entropy functional, the \(C^r\) norm may not be the most convenient one. So, a Sobolev norm becomes a more natural choice instead of the \(C^r\)-norm. We first give the definition of a gradient vector for any Gateaux differentiable functional on the Hilbert manifold \({\mathcal {M}}\) with a tangent space \(T_p{\mathcal {M}}\) at each point \(p \in {\mathcal {M}}\) and a Hilbert metric (inner product) \(<\cdot >_{\mathcal {M}}\).

Definition 1

A vector \(V \in T_p{\mathcal {M}}\) is called a gradient vector of a Gateaux differentiable functional H if the Gateaux derivative of H at p defines a bounded linear functional on \(T_p{\mathcal {M}}\): \( \nabla H(p): T_p{\mathcal {M}} \rightarrow {\mathbb {R}}\) and the Riesz representation of \(\nabla H(p)\) is V.

We now describe a Hilbert manifold structure on a family of circle expanding maps.

2.1 Hilbert manifold of expanding maps on the circle

First of all, by considering its lifts to the real line \({\mathbb {R}}\), we identify every continuous map f on \(S^1=\{ e^{ i 2\pi x}, x \in [0, 1)\}\) with a unique function \({{\tilde{f}}}\) defined on \({\mathbb {R}}\) satisfying the conditions \( {{\tilde{f}}}(0) \in [0, 1)\) and \(\tilde{f}(x+1) = {{\tilde{f}}} (x) +n\), where n is the degree of the map. For simplicity of statements, we limit our exposition to the orientation preserving expanding maps where \({{\tilde{f}}} '(x) > 0, x \in {\mathbb {R}}\). In the case of orientation reversing expanding maps where \({{\tilde{f}}} '(x) < 0, x \in {\mathbb {R}}\), the results of this paper remain valid but the exposition needs to be modified. Since expanding maps are defined on the circle, there must be a fixed point and we may assume 0 is a fixed point for both f and its lift \({{\tilde{f}}}\): \(f(0)=0={{\tilde{f}}}(0)\) and \({{\tilde{f}}}(1) =n\). Thus, each \(C^r, r \ge 1\) map f on the circle is identified with its lift \({{\tilde{f}}}\) on \({\mathbb {R}}\), the universal covering of \(S^1\), with the following properties: \( {\tilde{f}}(0)=0, {\tilde{f}}(1) = n, {\tilde{f}}'(x) > 1, x \in {\mathbb {R}}\) and \( {\tilde{f}}^{(k)}(0)= {\tilde{f}}^{(k)}(1), k=1,2,\ldots , r.\) We now define a family of \(C^{r-1}\) expanding maps \(F_r, r\ge 3\) via properties of their lift maps where we consider the gradient flow of the entropy functional:

$$\begin{aligned} F_r =\{f:&f \in C^{r-1} (S^1), {{\tilde{f}}}(0)=0, {{\tilde{f}}}(1) = n, {{\tilde{f}}}'(x) > 1, x \in (0,1), \nonumber \\&{\tilde{f}}^{(k)}(0)={\tilde{f}}^{(k)}(1) , k=1,2,\ldots , r-1, {{\tilde{f}}}^{(r)} \in L^2[0,1] \} \end{aligned}$$
(1)

The family \(F_r\) is slightly larger than the \(C^r \) family of expanding maps since we only require the rth derivative \(f^{(r)}\) to be \(L^2\), instead of being continuous. \(F_r\) is a separable Hilbert manifold modeled on the Sobolev space \(H^r\). For any given map \( f\in F_r\), its open neighborhood is given with an open neighborhood of the origin of the following Hilbert space

$$\begin{aligned} \Phi _r =\{ \phi \in C^{r-1} [0,1]: \phi ^{(r)} \in L^2[0,1],\phi (0)=0, \phi ^{(k)}(0)=\phi ^{(k)}(1), 0\le k \le r-1 \} \end{aligned}$$

equipped with the Sobolev norm

$$\begin{aligned} \Vert \phi (x) \Vert ^2_{H^r} = \sum _{k=0}^r \int _0^1 [\phi ^{(k)}]^2 (x) dx. \end{aligned}$$

Notice that \(\Phi _r\) can be identified with a Sobolev sequence space

$$\begin{aligned} \{\phi (x)= \sum _{n=0}^\infty a_n \cos 2 \pi n x + \sum _{n=1}^\infty b_n \sin 2 \pi n x: \ \sum _{n=1}^\infty n^{2r} ( a_n^2 + b_n^2 ) < \infty , \phi (0)=0\} \end{aligned}$$

equipped with the corresponding Sobolev norm

$$\begin{aligned} \Vert \phi (x) \Vert ^2_{H^r}= a_0^2 + \sum _{k=0}^r \sum _{n=1}^\infty n^{2k} ( a_n^2 + b_n^2 ). \end{aligned}$$

where \(\{a_n\}_{n=0}^\infty \) and \(\{b_n\}_{n=1}^\infty \) are Fourier coefficients of \(\phi \).

When \(r \ge 3\), each expanding map \( f \in F_r\) possesses a unique invariant probability measure absolutely continuous with respect to the Lebesgue measure on \(S^1\). Its probability density function \(\rho _f(x)\) is at least \(C^1\) [1] and depends on f differentiably in \(C^{r-1}\) topology. It yields the Fréchet differentiability of the entropy of f with respect to the measure \(\rho _f \), i.e., \(H(f) = \int _{S^1} \ln f'(x) \rho _f(x) dx.\) We want to prove that H(f) is also a Fréchet differentiable functional with respect to the new Hilbert metric on \(F_r\). In general, this can be done by using the imbedding theorem (Theorem 2.7 [13]). Since in this paper, we restrict our study to a simpler case when the SRB measure \(\rho _f\) is preserved by the perturbation of f, \(\rho _f(x)\) is independent of f, the differentiability of the entropy with respect to the Hilbert metric is much easier to prove. We can directly calculate the derivative operator and prove the Fréchet differentiability with respect to the Sobolev norm.

2.2 Hilbert manifold of expanding maps preserving the Lebesgue measure

We now define a Hilbert manifold \(F_r(\rho ), r\ge 3\) to be the subset of \(F_r\) consisting of maps that preserve the same invariant measure with a density function \(\rho (x)\). We may assume \(\rho (x) =\rho _0(x)=1\) by changing the Riemannian metric on the circle [4, 16]. The corresponding subset is denoted by \(F_r(\rho _0)\).

Given any map \(f\in F_r(\rho _0)\), the invariance of the Lebesgue measure under f is characterized by the equation

$$\begin{aligned} 1 = \rho _0(x)=\sum _{i=1}^n \frac{\rho _0(y_i) }{f'(y_i) }=\sum _{i=1}^n \frac{1}{f'(y_i) }, x \in [0,1) \end{aligned}$$
(2)

where \( 0\le y_1< y_2< \ldots< y_n < 1\) are n preimages of \(x \in [0,1):\) \({{\tilde{f}}}(y_i) = x, \text {mod\ 1}, i=1,2, \ldots , n.\)

We see that not only is equation (2) nonlinear in f, but the points \(\{y_i\}\), the preimages of x, also depend on f. Thus, equation (2) is not convenient for calculating Gateaux derivatives of the entropy functional with respect to f. Instead, we now identify the subset \(F_r(\rho _0)\) with another Hilbert manifold with a Sobolev tangent space where the same SRB entropy functional’s properties are much easier to study.

For each map \(f \in F_r(\rho _0)\), we consider its lift \({{\tilde{f}}}\)’s inverse map \(g(y), y \in [0, n]\). The rth derivative of \({{\tilde{f}}}\) is in \(L^2[0,1]\) if and only if g(y)’s rth derivative is in \(L^2[0,n]\) since we have \({{\tilde{f}}}'(x) > 1, x \in [0,1]\). Moreover, \({{\tilde{f}}}^{(k)}(0)= {{\tilde{f}}}^{(k)}(1), k=1,\dots r\) if and only if g(y) is differentiable in [0, n] up to order r and \(g^{(k)}(0^+) = g^{(k)}(n^-)\), \(k=1,\dots r\). That is, g(y) can be extended to a function whose kth derivative \(1 \le k \le r\) is a period n function. Given any \(x \in [0, 1)\), \({{\tilde{f}}}\) maps n preimages of \(x\in [0,1)\) under f to \(x, x+1, \ldots , x + (n-1)\) in the universal covering space: \({{\tilde{f}}}(y_i) = x+ i-1, i=1,2,\ldots , n\). Thus, we have

$$\begin{aligned} \frac{1}{f'(y_i) } = g'(x + i-1),\ i=1,2,\ldots , n \end{aligned}$$

and the invariance of the Lebesgue measure becomes an equation linear in g:

$$\begin{aligned} 1 = \sum _{i=1}^n g'(x + i-1), x \in [0,1). \end{aligned}$$
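As a sanity check of the linear invariance condition above, the following sketch uses a hypothetical example that does not appear in the paper: degree \(n=2\) and \(g(y) = y/2 + (A/\pi )\sin \pi y\) with an assumed amplitude \(A = 0.3\). Because \(\cos \pi (y+1) = -\cos \pi y\), the two branches of \(g'\) sum to 1, and \(f = g^{-1}\) is expanding since \(0< g' < 1\).

```python
import math

N = 2      # degree of the expanding map (illustrative choice)
A = 0.3    # hypothetical amplitude; |A| < 1/2 keeps 0 < g' < 1

def g(y):
    """Inverse of the lift: g(y) = y/2 + (A/pi) sin(pi y) on [0, 2]."""
    return y / N + (A / math.pi) * math.sin(math.pi * y)

def g_prime(y):
    """Derivative g'(y) = 1/2 + A cos(pi y)."""
    return 1.0 / N + A * math.cos(math.pi * y)

def invariance_defect(x):
    """Return sum_{i=1}^{n} g'(x + i - 1) - 1; zero when Lebesgue is preserved."""
    return sum(g_prime(x + i) for i in range(N)) - 1.0

# Largest violation of the invariance equation on a grid of [0, 1).
max_defect = max(abs(invariance_defect(k / 1000.0)) for k in range(1000))
print(max_defect)
```

The defect is zero up to rounding, so this \(g\) is a concrete non-linear element satisfying the measure-preservation constraint.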

We now define the Hilbert manifold where we consider the SRB entropy’s gradient for \(r \ge 3\).

$$\begin{aligned} G_r=\{&g(y) \in C^{r-1}[0,n]: g(0)=0, g(n)=1, 0< g'(y) < 1, g^{(r)} \in L^2[0, n], \\&g^{(k)}(0^+) = g^{(k)}(n^-), 1 \le k \le r, \sum _{i=1}^n g'(y + i-1) =1, y \in [0,1)\}. \end{aligned}$$

For any \(g_0, g \in G_r\), \(g-g_0 \) belongs to the following Sobolev space, still denoted by \(\Phi _r\):

$$\begin{aligned} \Phi _r=&\{ \phi : \phi ^{(r)} \in L^2[0, n], \phi (0)=0, \phi ^{(k)}(0^+) = \phi ^{(k)}(n^-), 0 \le k \le r, \nonumber \\&\sum _{i=1}^n \phi '(y + i-1) =0, y \in [0,1)\}. \end{aligned}$$
(3)

The Sobolev norm on \(\Phi _r\) is defined in the same way:

$$\begin{aligned} \Vert \phi (x) \Vert ^2_{H^r} = \sum _{k=0}^r \int _0^n [\phi ^{(k)}]^2 (y) dy. \end{aligned}$$

On the other hand, given any \(g \in G_r\), when \(\epsilon >0\) is sufficiently small, \(g +\epsilon \phi \in G_r\) for all \(\phi \in \Phi _r\) with \( \Vert \phi (x) \Vert ^2_{H^r} \le 1\). Thus, for each \(g \in G_r\), its open neighborhood can be identified with an open neighborhood of 0 in \(\Phi _r\).

For convenience, we denote the even larger family of functions without the constraint of preservation of Lebesgue measure by \(\bar{G}_r\) and \({{\bar{\Phi }}}_r\):

$$\begin{aligned} {{\bar{G}}}_r&=\{ g(y) \in C^{r-1}[0,n]: g(0)=0, g(n)=1, 0< g'(y) < 1, g^{(r)} \in L^2[0, n], \\&\qquad g^{(k)}(0^+) = g^{(k)}(n^-), 1 \le k \le r \}, \\ {{\bar{\Phi }}}_r&=\{ \phi : \phi ^{(r)} \in L^2[0, n], \phi (0)=0, \phi ^{(k)}(0^+) = \phi ^{(k)}(n^-), 0 \le k \le r \}. \end{aligned}$$

Indeed, \(G_r\) is a submanifold of \({{\bar{G}}}_r\) and \(\Phi _r\) a subspace of \({{\bar{\Phi }}}_r\) since the constraint is a linear affine equation.

On the Hilbert manifold \({{\bar{G}}}_r\), there is a natural metric \(d(g_1, g_2)\) that is consistent with the Sobolev norm in the tangent space of \({{\bar{G}}}_r\):

$$\begin{aligned} d^2(g_1, g_2)= \sum _{k=0}^r \int _0^n [g_1^{(k)}- g_2^{(k)} ]^2 (y) dy. \end{aligned}$$

Indeed, \(g_1-g_2 \in {{\bar{\Phi }}}_r\) for any \(g_1, g_2 \in {{\bar{G}}}_r \). Thus,

$$\begin{aligned} d(g_1, g_2) =\Vert g_1-g_2\Vert _{H^r}. \end{aligned}$$

Let \(a_i\) denote the preimage of i under \({{\tilde{f}}}\) for \(i=0, 1, \ldots , n-1\). The SRB entropy defined for every \(f \in F_r(\rho _0)\) becomes

$$\begin{aligned} H(f)= & {} \int _{S^1} \ln f'(x) d x =\sum _{i=1}^n \int _{a_{i-1}}^{a_{i}} \ln {{\tilde{f}}}'(x) d x\nonumber \\= & {} \sum _{i=1}^n \int _{ i-1}^{i} \ln \frac{1}{g'(y_i)}\ g'(y_i) d y_i = - \int _{0}^{n} \ln {g'(y )}\ g'(y ) d y=: H(g). \end{aligned}$$
(4)
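Formula (4) is easy to evaluate numerically. The sketch below assumes degree \(n=2\) and compares the linear inverse branch \(g'(y) = 1/2\) with a hypothetical perturbed one, \(g'(y) = 1/2 + 0.3\cos \pi y\) (an illustrative choice that preserves Lebesgue measure); the midpoint rule is used only for simplicity.

```python
import math

N = 2  # degree of the map (illustrative choice)

def entropy(g_prime, n=N, steps=20000):
    """Midpoint-rule approximation of H(g) = -int_0^n g'(y) ln g'(y) dy."""
    h = n / steps
    total = 0.0
    for k in range(steps):
        d = g_prime((k + 0.5) * h)
        total -= d * math.log(d) * h
    return total

def linear(y):
    """g' for the linear map f(x) = 2x mod 1."""
    return 1.0 / N

def perturbed(y):
    """Hypothetical non-linear example preserving Lebesgue measure."""
    return 1.0 / N + 0.3 * math.cos(math.pi * y)

h_linear = entropy(linear)        # equals ln 2, the topological entropy
h_perturbed = entropy(perturbed)  # strictly smaller, by concavity
print(h_linear, h_perturbed)
```

The linear map attains \(\ln 2\), while the perturbed map has strictly smaller but positive SRB entropy, consistent with the range of values described in the introduction.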

Remark

  (1) We point out an interesting connection between the SRB entropy of measure-preserving expanding maps and the Gibbs entropy of a probability measure with a density. Any function \(g(y) \in {{\bar{G}}}_r\) can be considered as a probability measure on [0, n] with a density function \(0<g'(y)<1\). H(g) is then precisely the Gibbs entropy of a probability measure.

  (2) We also point out similarities and differences between our approach and the approach of Jordan, Kinderlehrer, and Otto (JKO) [18, 19] in their study of the gradient flow of an entropy functional (see also [24]). The main similarity is that both approaches start from the Gibbs entropy. But the two approaches have major differences: In JKO’s approach, the entropy (called the Gibbs–Boltzmann entropy in [18, 19]; for more discussion of the Gibbs and Boltzmann entropies, see [11]) is defined for probability density functions on \({\mathbb {R}}^n\) not associated with any dynamical system. They use a discretized process to obtain an approximate orbit from an initial density and then show that the orbit converges to an orbit from the heat equation as the step-size approaches zero. In our approach, the SRB entropy is defined for a chaotic dynamical system. The SRB entropy changes as the underlying chaotic dynamical system varies and we directly calculate the entropy functional’s gradient under the Sobolev norm. Our approach leads to a system of countably many ordinary differential equations where the vector field is defined via integrals. The system has a close connection to a nonlinear partial differential equation, a gradient-dependent diffusion equation on the unit circle. See Sect. 5 for details.

  (3) During the reviewing process of the paper, referees brought to our attention another closely related work [9] on the gradient flow of another entropy functional. For a transitive Anosov map on a closed Riemannian manifold, for every Hölder continuous function \(\varphi \), there exists a unique equilibrium measure \(\mu _\varphi \) that satisfies the variational principle. The entropy functional \( \varphi \rightarrow h_{\mu _\varphi } (f)\) is analytic on the space of Hölder continuous functions. The gradient flow of this functional is different from ours since the map f is fixed. Since all maps in \(U(f_0)\) share a common Markov partition and the potential function for the SRB measure is Hölder continuous, \(U(f_0)\) can be embedded into the space of Hölder continuous functions. But \(U(f_0)\), in general, is not an invariant set of the gradient flow induced by the functional \( \varphi \rightarrow h_{\mu _\varphi } (\Sigma _A)\), where \(\Sigma _A\) is the subshift of finite type defined by the Markov partition. The differentiability of these two functionals is also different: \( \varphi \rightarrow h_{\mu _\varphi } (f)\) is analytic while the differentiability of \( f \rightarrow H (f)\) depends on that of \(f \in U(f_0)\). See also [23] for related work.

We now state main results on the SRB entropy functional H(g) on the Hilbert manifold \(G_r\).

Theorem 1

  (1) The SRB entropy functional H(g) is Fréchet differentiable on \(G_r\).

  (2) The gradient vector field of H(g) is well-defined: for each \(g \in G_r\), there exists a unique vector \(X(g) \in \Phi _r\) such that the directional derivative \(<DH(g), X(g)/\Vert X(g)\Vert _{H^r}>\) is the unique maximum among all directional derivatives.

  (3) The gradient vector field \(X(g) \in \Phi _r\) is Lipschitz continuous in g.

  (4) There is a unique critical point for the gradient vector field X(g) at the point where g is a linear function on [0, n].

An immediate consequence of Theorem 1 is that the differential equation defined on \(G_r\) by \( \frac{d {\mathcal {F}}_t}{d t }\big |_{t=0} = X(g)\) has a unique local solution for \(t \in (-\epsilon , \epsilon )\): \({\mathcal {F}}_t(g) = {\mathcal {F}}(t, g) \) is a local flow defined on \((-\epsilon _g, \epsilon _g) \times G_r \).

3 Proof of Theorem 1

Proof of Theorem 1 (1)

We need to show that H(g) is Fréchet differentiable. Since \(G_r\) is a submanifold of \({{\bar{G}}}_r\), it suffices to prove differentiability on \({{\bar{G}}}_r\). That means we do not need to consider the constraint of preservation of the Lebesgue measure.

We first calculate the first order term in \(\epsilon \) of the difference \(H(g + \epsilon \phi ) - H(g)\).

$$\begin{aligned} H(g+ \epsilon \phi ) = - \int _0^n \ln ( g'(y) +\epsilon \phi '(y)) ( g'(y) + \epsilon \phi '(y)) dy. \end{aligned}$$

Note that

$$\begin{aligned} \ln ( g'(y) + \epsilon \phi '(y))&= \ln \left[ g'(y) (1 + \epsilon \phi '(y)/g'(y) ) \right] \nonumber \\&= \ln g'(y) + \ln (1 +\epsilon \phi '(y)/g'(y) )\nonumber \\&= \ln g'(y) + \epsilon \phi '(y)/g'(y) + O([\epsilon \phi '(y)/g'(y) ]^2) . \end{aligned}$$
(5)

We have

$$\begin{aligned}&\ln ( g'(y) + \epsilon \phi '(y)) ( g'(y) + \epsilon \phi '(y)) = g'(y) \ln g'(y) \nonumber \\&\quad + \epsilon \phi '(y)+ \epsilon \phi '(y) \ln g'(y) + O([\epsilon \phi '(y)]^2/g'(y)) + O( [\epsilon \phi '(y)]^3/[g'(y) ]^2). \end{aligned}$$
(6)

Thus, the first order term of \(H(g + \epsilon \phi ) - H(g)\) in \(\epsilon \) is

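$$\begin{aligned} - \int _0^n ( 1+ \ln g'(y) ) \phi '(y) dy. \end{aligned}$$

Since \(\phi (0)=\phi (n) =0\), we have \(\int _0^n \phi '(y) dy = 0\), so the constant term drops out. Integrating by parts, where the boundary term vanishes for the same reason, we obtain the derivative operator formula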

$$\begin{aligned} DH_g\phi&= <DH(g), \phi > = - \int _0^n \ln g'(y) \ \phi '(y) dy\nonumber \\&= - \int _0^n \ln g'(y) \ d \phi (y) = \int _0^n \frac{g''(y)}{g'(y)} \ \phi (y) dy . \end{aligned}$$
(7)
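Formula (7) can be checked against a finite difference. The sketch below reuses a hypothetical example with \(n=2\): \(g'(y) = 1/2 + 0.3\cos \pi y\) and direction \(\phi (y) = \sin \pi y\), which satisfies \(\phi (0)=\phi (2)=0\) and \(\phi '(y)+\phi '(y+1)=0\). The names and parameter values are assumptions made for illustration only.

```python
import math

N, A = 2, 0.3  # illustrative degree and hypothetical amplitude

def g1(y):   # g'(y)
    return 1.0 / N + A * math.cos(math.pi * y)

def g2(y):   # g''(y)
    return -A * math.pi * math.sin(math.pi * y)

def phi(y):  # tangent direction in Phi_r
    return math.sin(math.pi * y)

def phi1(y):  # phi'(y)
    return math.pi * math.cos(math.pi * y)

def integrate(func, a=0.0, b=float(N), steps=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def H(eps):
    """H(g + eps*phi) = -int (g' + eps*phi') ln(g' + eps*phi') dy."""
    def integrand(y):
        d = g1(y) + eps * phi1(y)
        return -d * math.log(d)
    return integrate(integrand)

eps = 1e-5
finite_difference = (H(eps) - H(-eps)) / (2 * eps)            # central difference
formula_log = integrate(lambda y: -math.log(g1(y)) * phi1(y))  # -int ln g' phi'
formula_parts = integrate(lambda y: g2(y) / g1(y) * phi(y))    # int (g''/g') phi
print(finite_difference, formula_log, formula_parts)
```

All three numbers agree to high accuracy and are negative here: the direction \(\phi \) increases the existing non-linearity of \(g\), so the entropy decreases along it.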

We now show that H(g) is Fréchet differentiable on \({{\bar{G}}}_r\), i.e., for any given \(g \in {{\bar{G}}}_r\),

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \sup _{ \Vert \phi \Vert _{H^r} = 1 } \frac{1}{\epsilon }[H(g + \epsilon \phi ) - H(g) - \epsilon DH_g \phi ] =0. \end{aligned}$$

According to our earlier calculation,

$$\begin{aligned}&| H(g + \epsilon \phi ) - H(g) - \epsilon DH_g \phi |\nonumber \\&\quad = \epsilon ^2 \int _0^n \left[ O([\phi '(y)]^2/g'(y)) + \epsilon \ O( [\phi '(y)]^3/[g'(y) ]^2) \right] dy . \end{aligned}$$
(8)

\(\square \)

We now need a simple lemma on the upper bound of \(|\phi '(y)|\):

Lemma 1

Given any function \(\phi \in \Phi _r\), \(r\ge 2\), \(|\phi '(y)| \le M \Vert \phi (y) \Vert _{H^2}\). In general, \(| \phi ^{(k)} | \le M \Vert \phi (y) \Vert _{H^{k+1}}\), where \(1 \le k < r \) and M is a constant independent of \(\phi \).

Proof of Lemma 1

Since \(\phi \) is of period n and \(\phi ^{(r)} \in L^2[0,n]\), we have \(\phi \)’s Fourier expansion

$$\begin{aligned} \phi (y) = a_0 + \sum _{k=1}^\infty a_k \cos \frac{ 2 k \pi }{n} y + b_k \sin \frac{ 2 k \pi }{n} y,\end{aligned}$$

where the Fourier coefficients \(a_k, b_k, k\ge 1\) satisfy the condition

$$\begin{aligned} \sum _{k=1}^\infty k^{2r} (a^2_k + b^2_k ) < \infty . \end{aligned}$$

For convenience, we may assume \(a_0=0.\) Thus, by the Cauchy–Schwarz inequality,

$$\begin{aligned} | \phi (y) |&= | \sum _{k=1}^\infty a_k \cos \frac{ 2 k \pi }{n} y + b_k \sin \frac{ 2 k \pi }{n} y | \le \sum _{k=1}^\infty |a_k | + |b_k|\nonumber \\&= \sum _{k=1}^\infty \frac{1}{k} (k |a_k | + k |b_k|) \le 2 \left[ \sum _{k=1}^\infty \frac{1}{k^2} \right] ^{\frac{1}{2}} \left[ \sum _{k=1}^\infty k^{2} (a^2_k + b^2_k )\right] ^{\frac{1}{2}} . \end{aligned}$$
(9)

Let \(M= 2 \left[ \sum _{k=1}^\infty \frac{1}{k^2} \right] ^{\frac{1}{2}}\) and apply the inequality above to \(\phi '(y)\). We have

$$\begin{aligned} |\phi '(y)| \le M \Vert \phi (y)\Vert _{H^2}. \end{aligned}$$

\(\square \)
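The bound of Lemma 1 can be checked numerically on a sample function. The sketch below is only an illustration under stated assumptions: it takes \(n=2\), uses a randomly generated trigonometric polynomial with eight modes, and assumes \(\Vert \phi \Vert ^2_{H^2} = \int _0^n (\phi ^2 + \phi '^2 + \phi ''^2) dy\); none of these choices comes from the text.

```python
import numpy as np

# Assumptions (not from the text): n = 2 and
# ||phi||_{H^2}^2 = int_0^n (phi^2 + phi'^2 + phi''^2) dy.
n = 2.0
K = 8                                    # number of Fourier modes kept (illustrative)
rng = np.random.default_rng(0)
k = np.arange(1, K + 1)
a = rng.normal(size=K) / k**3            # decaying coefficients, a_0 = 0
b = rng.normal(size=K) / k**3
w = 2.0 * np.pi * k / n                  # angular frequencies 2*k*pi/n

N = 4096
y = np.linspace(0.0, n, N, endpoint=False)    # uniform periodic grid on [0, n)
C = np.cos(np.outer(y, w))
S = np.sin(np.outer(y, w))
phi = C @ a + S @ b
dphi = S @ (-a * w) + C @ (b * w)             # first derivative
d2phi = C @ (-a * w**2) + S @ (-b * w**2)     # second derivative

# Rectangle rule on a uniform periodic grid.
dy = n / N
h2_norm = np.sqrt(np.sum(phi**2 + dphi**2 + d2phi**2) * dy)

M = 2.0 * np.sqrt(np.pi**2 / 6.0)        # M = 2 * (sum_k 1/k^2)^(1/2)
sup_dphi = np.max(np.abs(dphi))
print(sup_dphi, M * h2_norm)
```

The first printed number is the grid maximum of \(|\phi '|\) and the second is \(M \Vert \phi \Vert _{H^2}\); the lemma predicts that the first does not exceed the second.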

Since \(g'(y)>0\) is bounded away from 0 and \(\phi ^{(2)} \in L^2[0,n]\), the integrand \( O([\phi '(y)]^2/g'(y)) + \epsilon O( [\phi '(y)]^3/[g'(y) ]^2)\) is a bounded function over [0, n]. We have

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \epsilon \int _0^n O([\phi '(y)]^2/g'(y)) + \epsilon O( [\phi '(y)]^3/[g'(y) ]^2) dy=0. \end{aligned}$$

H(g) is Fréchet differentiable at any \( g \in G_r\).

Proof of Theorem 1 (2) & (4)

We pick an orthonormal basis of the separable Hilbert space \(\Phi _r\): \(\{{{\textbf {e}}}_i\}_{i=1}^\infty \). Via Riesz representation, every linear functional on \(\Phi _r\) can be identified with an element in \(\Phi _r\). Define

$$\begin{aligned} X(g)= DH_g= \sum _{i=1}^\infty b_i {{\textbf {e}}}_i, \end{aligned}$$

where \(b_i= DH_g ({{\textbf {e}}}_i), i=1,2, \ldots .\)

Given any \(\phi \in \Phi _r\) with \(\Vert \phi \Vert _{H^r} =1\) and \(\phi = \sum _{i=1}^\infty <\phi , {{\textbf {e}}}_i>_{H^r} {{\textbf {e}}}_i \), where \(<\cdot >_{H^r}\) denotes the inner product of \(\Phi _r\), we have

$$\begin{aligned} DH_g(\phi ) =\sum _{i=1}^\infty b_i <\phi , {{\textbf {e}}}_i>_{H^r}. \end{aligned}$$

By the Cauchy-Schwarz inequality, \(| DH_g(\phi ) |\) reaches its maximum if and only if \( <\phi , {{\textbf {e}}}_i>_{H^r} =C b_i, i = 1,2, \ldots \) for some constant C, i.e., \(\phi = C X(g)\). Since \(\Vert \phi \Vert _{H^r} =1 \), we have \(C = [ \Vert X(g)\Vert _{H^r} ]^{-1}.\)

Thus, the gradient vector field X(g) is well-defined on Hilbert manifold \(G_r\). Since we know that \(X(g)=0\) if and only if g is linear [16], \(X(g) \not =0\) for all \(g \in G_r\) except for g linear. \(\square \)

Proof of Theorem 1 (3)

We now prove that the derivative operator \(DH_g\) is Lipschitz, which leads to the local existence of the gradient flow [21].

Given any two maps \(g_1\not = g_2 \in G_r\), denote \(\psi = g_2 - g_1 \in \Phi _r\). We now estimate the distance between two derivative operators \(DH_{g_1}\) and \(DH_{g_1+\psi }\). When \(g_1\) and \(g_2\) are close, their open neighborhoods overlap. Thus, we can assume two derivative operators \(DH_{g_1}\) and \(DH_{g_2}\) are acting on the same Sobolev space \(\Phi _r\). For simplicity of notation, we drop the subscript in \(g_1\). We have

$$\begin{aligned}{} & {} \Vert DH_{g + \psi } - DH_{g } \Vert = \sup _{\Vert \phi \Vert _{H^r}=1} | \int ^n_0 [ \ln ( g' + \psi ') - \ln (g')] \phi ' dy| \\{} & {} \quad \le \sup _{\Vert \phi \Vert _{H^r}=1} \left[ \int _0^n | \ln ( g' + \psi ') - \ln (g') |^2 d y \int _0^n |\phi '|^2 d y \right] ^{1/2} \\{} & {} \quad \le \left[ \int _0^n | \ln ( g' + \psi ') - \ln (g') |^2 d y \right] ^{1/2} = \left[ \int _0^n | \ln ( 1 + \frac{\psi '}{g'}) |^2 d y \right] ^{1/2} \end{aligned}$$

Since \(g' > 0\) is bounded from below and \(\psi \in \Phi _r, r\ge 2\), we may assume that \(|\psi '|\) is sufficiently small and \(|\frac{\psi '}{g'}|< \delta < 1\). So, there is a constant \(K>0 \) such that

$$\begin{aligned} | \ln ( 1 + \frac{\psi '}{g'}) | \le K |\frac{\psi '}{g'}|. \end{aligned}$$

Thus, we have

$$\begin{aligned} \Vert DH_{g + \psi } - DH_{g } \Vert \le \left[ \int _0^n K^2 | \frac{\psi '}{g'} |^2 d y \right] ^{1/2} \le C \Vert \psi \Vert _{H^r}, \end{aligned}$$

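where \(C = K[{\displaystyle \min _{y \in [0,n]} g'(y) }]^{-1}\), since \(|\psi '/g'| \le |\psi '| / \min g'\) and \(\Vert \psi '\Vert _{L^2} \le \Vert \psi \Vert _{H^r}\). We conclude that \(DH_g\) is Lipschitz continuous over \(G_r\). \(\square \)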
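As a worked illustration of the constant K used in the proof above (our own choice, offered as one admissible value rather than the one the argument necessarily intends), the power series of the logarithm gives, for \(|x| \le \delta < 1\),

$$\begin{aligned} | \ln (1+x) | \le \sum _{j=1}^\infty \frac{|x|^j}{j} \le \sum _{j=1}^\infty |x|^j = \frac{|x|}{1-|x|} \le \frac{|x|}{1-\delta }, \end{aligned}$$

so that \(K = (1-\delta )^{-1}\) suffices when \(x = \psi '/g'\).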

4 Global Existence of the Gradient Flow

We now prove the global existence of the gradient flow of the SRB entropy for \(t \in [0, \infty )\) and show that every trajectory converges to the unique equilibrium where the expansion rate of the map is a constant and H(f) attains its global maximum. We note that under the Sobolev norm, the gradient vector at point \(g \in G_r\) is defined by the integral \( \int _0^n \frac{g''(y)}{g'(y)} \ \phi (y) dy\). While the integral does define a linear functional in the tangent space \(\Phi _r\) for each \( g \in G_r\), its Riesz representation in \(\Phi _r\) is, in general, not \( \frac{g''(y)}{g'(y)}\) since it may not be a vector in \(\Phi _r\). This poses an obstacle for proving the global existence directly through the integral. We instead reconsider the global existence of the gradient flow in a different Hilbert metric on a slightly different Hilbert manifold.

We first expand the domain of the entropy functional to a larger Hilbert space where the gradient vector’s Riesz representation can be obtained explicitly.

For the Hilbert manifold

$$\begin{aligned}{} & {} G_r=\{ g(y) \in C^{r-1}[0,n]: y\in [0,n]; g(0)=0, g(n)=1; 0< g'(y) < 1; g^{(r)} \in L^2[0,n]; \\{} & {} \qquad \qquad g^{(k)}(0^+) = g^{(k)}(n^-), 1 \le k \le r-1, r\ge 2, \sum _{i=0}^{n-1} g'(y +i) =1, y \in [0,1] \}, \end{aligned}$$

each map \(g \in G_r \) is uniquely defined by its derivative: \(g(y) = \int _0^y g'(\tau ) d \tau \). So, we can embed \( G_r\) into another Hilbert manifold \(G'\):

$$\begin{aligned} G'{} & {} =\{ h(y) \in L^2[0, n]: 0< h(y) < 1\ (a.e.), \int _0^n h(y) dy = 1, \sum _{i=0}^{n-1} h(y+i) =1,\\{} & {} \quad y \in [0,1] \}. \end{aligned}$$

Given any \(g \in G_r\), the embedding map is defined by

$$\begin{aligned} g(y) \rightarrow h(y) = g'(y) \in G'. \end{aligned}$$

\(G'\) is clearly a Hilbert manifold with a tangent space

$$\begin{aligned} \Psi =\{ \psi (y) \in L^2[0, n], \int _0^n \psi (y) dy = 0, \sum _{i=0}^{n-1} \psi (y+i) =0, y \in [0,1] \}, \end{aligned}$$
(10)

equipped with the common Hilbert norm \(\Vert \psi \Vert = \left[ \int _0^n \psi ^2(y) dy\right] ^{1/2}.\)

Note that the condition \( \int _0^n \psi (y) dy = 0 \) in (10) can be removed since it is implied by the condition \( \sum _{i=0}^{n-1} \psi (y+i) =0\):

$$\begin{aligned}{} & {} \int _0^n \psi (y) dy = \int _0^1 \psi (y) dy + \int _1^2 \psi (y) dy + \ldots + \int _{n-1}^n \psi (y) dy \\{} & {} \quad = \int _0^1 \psi (y) dy + \int _0^1 \psi (z+1) dz +\ldots + \int _0^1 \psi (z+n-1) dz = \int _0^1 \sum _{i=0}^{n-1} \psi (y+i) dy=0. \end{aligned}$$

The tangent space \(\Psi \) is a subspace of \( L^2[0, n]\).

Note also that the entropy functional,

$$\begin{aligned} H(g)= - \int _0^n \ln g'(y)\ g'(y) dy \end{aligned}$$

on \( G_r\) becomes

$$\begin{aligned} H(h) = - \int _0^n \ln h(y)\ h(y) dy, \end{aligned}$$

which is well defined on entire \(G'\) and the Gateaux derivative of H(h) in the direction of \(\psi \in \Psi \) exists and has the same formula (see (7)):

$$\begin{aligned} DH_h(\psi ) = - \int _0^n \ln h(y)\ \psi dy. \end{aligned}$$

This Gateaux derivative defines a bounded linear functional on the tangent space \(\Psi \).

A direct calculation will confirm that this linear functional’s Riesz representation is given by

$$\begin{aligned} R_h(y) =- \ln h(y) + \frac{1}{n}\sum _{i=0}^{n-1} \ln h(y+i), \end{aligned}$$

where h(y) is extended into a period n function over \([0, \infty )\). Indeed, we can easily verify that \(R_h(y) \in \Psi \). We only need to verify that \( \int _0^n R_h(y) \psi dy = - \int _0^n \ln h(y) \psi dy\) for all \(\psi \in \Psi \) since the identity

$$\begin{aligned} \sum _{i=0}^{n-1} R_h(y+i) =0 \end{aligned}$$

for all \(y \in [0, 1]\) clearly holds due to the periodicity of h(y).

To see that \( \int _0^n R_h(y) \psi dy = - \int _0^n \ln h(y) \psi dy\) for all \(\psi \in \Psi \), we first extend \(\psi \) to a period n function and calculate the following integral applying integration by substitution and periodicity of both functions h(y) and \(\psi (y)\):

$$\begin{aligned} \int _0^n \sum _{i=0}^{n-1} \ln h(y+i) \ \psi (y) dy. \end{aligned}$$

Let \(z= y+i\) in each integral, we have

$$\begin{aligned} \int _0^n \sum _{i=0}^{n-1} \ln h(y+i) \ \psi (y) dy = \int _0^n \ln h(z) \ \sum _{i=0}^{n-1} \psi (z-i) dz=0. \end{aligned}$$

We summarize the properties of the entropy functional \(H(h)=- \int _0^n \ln h(y)\ h(y) dy \) over the Hilbert manifold \(G'\) in the following proposition.

Proposition 1

  (1) H(h) is Gateaux differentiable at every \(h \in G'\) and the derivative formula in the direction of \(\psi \in \Psi \) is given by a continuous linear functional on \(\Psi \):

    $$\begin{aligned} DH_h(\psi ) = - \int _0^n \ln h(y) \psi dy. \end{aligned}$$

  (2) The Riesz representation of the derivative operator \(DH_h \) over \(\Psi \) is

    $$\begin{aligned} R_h(y)= - \ln h(y) + \frac{1}{n} \sum _{i=0}^{n-1}\ln h(y+i), \end{aligned}$$

    where h(y) is extended periodically to \([0, \infty )\).

  (3) The maximum value of \(DH_h(\psi )\) over \(\psi \in \Psi \) with \( \int _0^n \psi ^2(y) d y = 1 \) is reached at the unit vector \( R_h(y) [ \int _0^n R_h^2(y) dy]^{-\frac{1}{2}}.\)

We denote this gradient vector field over \(G'\) by

$$\begin{aligned} Y(h)=R_h(y)= - \ln h(y) + \frac{1}{n} \sum _{i=0}^{n-1}\ln h(y+i). \end{aligned}$$

It is Lipschitz continuous in terms of h under the \(L^2\) norm, thus, locally integrable.
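The Riesz representation in Proposition 1 (2) can be sanity-checked on a grid. The following is a minimal sketch, assuming \(n=3\), a randomly generated positive h with \(\sum _i h(y+i)=1\), and a random test direction \(\psi \) with \(\sum _i \psi (y+i)=0\); the grid size, the random data and the array layout are our own illustrative choices.

```python
import numpy as np

n, N = 3, 200                      # n unit intervals, N grid points per interval (illustrative)
rng = np.random.default_rng(1)

# Row i holds the samples of a function on [i, i+1); column j is the point y_j + i.
p = rng.uniform(0.5, 1.5, size=(n, N))
h = p / p.sum(axis=0)              # 0 < h < 1 and sum_i h(y+i) = 1 at every grid point

q = rng.normal(size=(n, N))
psi = q - q.mean(axis=0)           # sum_i psi(y+i) = 0, a discrete element of Psi

log_h = np.log(h)
R = -log_h + log_h.mean(axis=0)    # R_h(y) = -ln h(y) + (1/n) sum_i ln h(y+i)

dy = 1.0 / N
lhs = np.sum(R * psi) * dy         # int_0^n R_h psi dy
rhs = -np.sum(log_h * psi) * dy    # DH_h(psi) = -int_0^n ln h psi dy
print(lhs, rhs, np.abs(R.sum(axis=0)).max())
```

The two integrals agree up to rounding, and the last printed value confirms that \(\sum _i R_h(y+i)\) vanishes on the grid, so the discrete \(R_h\) lies in the discrete analogue of \(\Psi \).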

We now prove the following theorem on the global existence of the gradient flow of the SRB entropy on \(G'\) and the convergence of every flow trajectory to a unique equilibrium as \( t\rightarrow \infty .\) Recall \(F_r(\rho _0)\) denotes the Hilbert manifold composed of Lebesgue-measure preserving \(C^r, r\ge 3\) expanding maps.

Theorem 2

The SRB entropy functional \(f \rightarrow H(f) = \int _0^1 \ln f'(x) dx, f \in F_r(\rho _0)\) induces a gradient flow on \(G'\), the space of derivatives of inverse map of \(f \in F_r(\rho _0)\) under the \(L^2\)-norm. This gradient flow exists globally for all \(t \in [0, \infty )\) and every trajectory converges to the unique equilibrium which corresponds to the linear expanding map in \(F_r(\rho _0)\) where H(f) attains its maximum value.

Proof

For any fixed initial map \(h(y) \in G'\), let \({{\mathcal {G}}}_t(h)=g(t, h)\) denote the local flow defined by the gradient vector field Y(h) on \(G'\) for \(t \in (- \epsilon _h, \epsilon _h)\). For any \(y \in [0,1]\), we have

$$\begin{aligned} \frac{d}{d t} g(t, h(y) )|_{t=0} = - \ln h(y) + \frac{1}{n} \sum _{i=0}^{n-1}\ln h(y+i). \end{aligned}$$

By periodicity of h(y), for all \(k=1,2, \ldots , n-1\), we also have

$$\begin{aligned} \frac{d}{d t} g(t, h(y+k) )|_{t=0} = - \ln h(y+k) + \frac{1}{n} \sum _{i=0}^{n-1}\ln h(y+i). \end{aligned}$$

Introducing new variables \(x_k = h(y+k-1)\), \( k=1,2,\ldots , n\), we have a system of n ordinary differential equations

$$\begin{aligned} {\dot{x}}_k = - \ln x_k + \frac{1}{n} \sum _{i=1}^{n} \ln x_i, k=1,2, \ldots , n \end{aligned}$$

subject to the condition \( \sum _{k=1}^n x_k =1\). Consequently, we have

$$\begin{aligned} {\left\{ \begin{array}{ll} {\dot{x}}_1 - {\dot{x}}_2 &{}= - ( \ln x_1 - \ln x_2 ) \\ {\dot{x}}_2 - {\dot{x}}_3 &{}= - ( \ln x_2 - \ln x_3 ) \\ \quad \cdots \cdots &{}=\qquad \cdots \cdots \\ {\dot{x}}_n - {\dot{x}}_1 &{}= - ( \ln x_n - \ln x_1 ). \end{array}\right. } \end{aligned}$$

Notice that when \(0< x< y < 1\), the mean value theorem gives \( \ln y - \ln x = \frac{1}{c} (y-x)\) for some \(c \in (x,y)\). Since \(c<1\), \( \ln y - \ln x > y-x > 0\). The differential equation \({\dot{y}} - {\dot{x}} = - ( \ln y - \ln x )\) therefore implies \({\dot{y}} - {\dot{x}} < - (y-x)\). So, \(y-x\) converges to zero exponentially fast. Therefore, the solution to the system exists globally for all initial values in the region \(0< x_1, x_2, \ldots , x_n < 1 \) on the invariant plane \( x_1+x_2 + \cdots + x_n =1\) and all solutions converge to the unique equilibrium \(x_1=x_2 =\cdots = x_n=\frac{1}{n}\). \(\square \)
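The finite-dimensional system used in the proof is easy to integrate numerically. The sketch below applies explicit Euler steps to \({\dot{x}}_k = - \ln x_k + \frac{1}{n} \sum _i \ln x_i\) for \(n=3\); the initial point, the step size and the number of steps are illustrative assumptions and not values from the text.

```python
import numpy as np

def rhs(x):
    """Right-hand side x_k' = -ln x_k + (1/n) sum_i ln x_i."""
    log_x = np.log(x)
    return -log_x + log_x.mean()

x = np.array([0.6, 0.3, 0.1])      # illustrative initial point with x_1 + x_2 + x_3 = 1
dt, steps = 0.01, 2000             # illustrative Euler step and horizon (t = 20)
for _ in range(steps):
    x = x + dt * rhs(x)

print(x, x.sum())
```

Because the components of the right-hand side sum to zero, each Euler step preserves \(\sum _k x_k\) up to rounding, and the iterates approach the equilibrium \(x_k = 1/n\) described in the proof.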

5 Differential Equation Representation of the Gradient Flow

We now explore the possibility of representing the gradient flow \({\mathcal {F}}_t(g)\) from Sect. 2 as explicit differential equations.

Let \({{\mathcal {F}}}_t\) denote the gradient flow defined by the vector field X(g) over \(G_r\), i.e., \( {{\mathcal {F}}}_t(g) \) is a map from \( (- \epsilon , \epsilon ) \times G_r\) to \(G_r\) that is differentiable in t and satisfies \({{\mathcal {F}}}_0(g)=g\) and

$$\begin{aligned} \frac{d}{dt} {{\mathcal {F}}}_t(g)\big |_{t=0} = X(g), \end{aligned}$$

where \(X(g) \in \Phi _r\) is defined by an integral operator

$$\begin{aligned} <X(g), \phi > = - \int _{0}^n \ln g'(y) \ \phi ' dy = \int _{0}^n \frac{g''(y)}{g'(y)} \phi (y) dy. \end{aligned}$$

We see that maps in \( {{\bar{G}}}_r\) can be easily represented as a series. In the simple case when \(n=2\), we can obtain a system of ordinary differential equations that generates the flow. Numerical methods such as Euler’s method [10] can then be used to obtain typical approximate trajectories of the flow.

For any given \( g \in {{\bar{G}}}_r\), \(g(y) - \frac{y}{n} \in {{\bar{\Phi }}}_r \) is a continuous periodic function of period n and its derivative is bounded. Thus, its Fourier series converges to itself both pointwise and in the Sobolev norm. Thus, Hilbert manifold \({{\bar{G}}}_r\) can be represented as

$$\begin{aligned} {{\bar{G}}}_r =\{ \frac{y}{n} + \sum _{k=1}^\infty a_k \cos \frac{2\pi k }{n} y + b_k \sin \frac{2\pi k }{n} y, y \in [0,n], \sum _{k=1}^\infty k^{2r} (a^2_k +b^2_k) < \infty \}, \end{aligned}$$

where \(a_k, b_k\) are Fourier coefficients of \(g(y) - \frac{y}{n}\) satisfying the condition

$$\begin{aligned} 0< \frac{1}{n} + \sum _{k=1}^\infty - \frac{2a_k\pi k }{n} \sin \frac{2\pi k }{n} y + \frac{2b_k\pi k }{n} \cos \frac{2\pi k }{n} y < 1. \end{aligned}$$

Notice that we have replaced the condition \(g(0)=0\) by dropping the constant term in the Fourier series since the entropy is a function of \(g'(x)\). This adjustment is also made to the tangent space \(\bar{\Phi }_r\).

Maps in the submanifold \(G_r\) will have to satisfy an additional linear equation:

$$\begin{aligned} \sum _{i=1}^n \sum _{k=1}^\infty - k a_k \sin \frac{2\pi k }{n} (y+i-1) + kb_k \cos \frac{2\pi k }{n} (y +i -1)=0, y \in [0,1]. \end{aligned}$$
(11)

Suppose that we can find an orthonormal basis of \(\Phi _r\): \(\{ {{\textbf {e}}}_k \}_{k=1}^\infty \). Then, any trajectory of the flow \( u(t,y)= {{\mathcal {F}}}_t(g) \) can be written in the form

$$\begin{aligned} u(t,y) = \frac{y}{n} + \sum _{k=1}^\infty c_k(t) \textbf{e}_k, \end{aligned}$$

with \(u(0,y) - \frac{y}{n}=\sum _{k=1}^\infty c_k(0) \textbf{e}_k = g -\frac{y}{n} \). Thus the flow equation \(\frac{d {\mathcal {F}}_t(g)}{dt}\big |_{t=0} = X_g\) becomes

$$\begin{aligned} \sum _{k=1}^\infty \dot{c}_k(t)\ \textbf{e}_k= \sum _{k=1}^\infty <X_{u(t,y)}, {{\textbf {e}}}_k> {{\textbf {e}}}_k. \end{aligned}$$

We have a system of countably many ordinary differential equations:

$$\begin{aligned} \dot{c}_k(t) = <X_{u(t,y)}, {{\textbf {e}}}_k> =\int _0^n \frac{u_{yy}}{u_y}\ {{\textbf {e}}}_k dy, \quad k=1,2,\ldots , \end{aligned}$$
(12)

where

$$\begin{aligned}{} & {} u_y = \frac{\partial u(t,y)}{\partial y} =\frac{1}{n}+ \sum _{k=1}^\infty c_k(t) \frac{d \textbf{e}_k}{dy},\ \text {and} \\{} & {} u_{yy} = \frac{\partial ^2 u(t,y)}{\partial y^2} = \sum _{k=1}^\infty c_k(t) \frac{d^2 \textbf{e}_k}{dy^2}. \end{aligned}$$

While it is easy to obtain an orthonormal basis for \(\bar{\Phi }_r\) since the set \(\{ \cos \frac{2\pi k }{n} y, \sin \frac{2\pi k }{n} y \}_{k=1}^\infty \) is clearly an orthogonal basis, the linear constraint (11) poses an obstacle to finding an orthogonal basis for \(\Phi _r\). Fortunately, in the simple case when \(n=2\), an orthonormal basis for \(\Phi _r\) can be obtained directly from this set. That will allow us to obtain a system of countably many ordinary differential equations explicitly and thus to approximate typical trajectories of the flow numerically.

5.1 Ordinary differential equation representation when \(n=2\)

We now look at the case when \(n=2\); \(r \ge 2\) can be any integer. In this case, the linear constraint (11) becomes

$$\begin{aligned} \sum _{k=1}^\infty - {k a_k }(1+ (-1)^k) \sin { \pi k y} + { k b_k } (1+ (-1)^k) \cos { \pi k y } =0. \end{aligned}$$

So, we can conclude that \(a_k=b_k =0\) when k is even. For simplicity, we also let \(r=2\). Since

$$\begin{aligned} \int _0^2 \cos ^2 {\pi k } y dy = \int _0^2 \sin ^2 { \pi k } y dy = 1, \end{aligned}$$

we have

$$\begin{aligned}\Vert \cos \pi k y \Vert _{H^2} =\Vert \sin { \pi k } y \Vert _{H^2} = \left[ (1+ (k\pi )^2 + (k\pi )^4 )\right] ^{1/2} =:\frac{1}{c_k}. \end{aligned}$$

So, the set \(\{ c_{2\,m-1} \cos (2\,m-1) \pi y, c_{2\,m-1} \sin (2\,m-1) \pi y\}_{m=1}^\infty \) is an orthonormal basis of \(\Phi _2\).
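The orthonormality claim and the role of the odd modes can be checked numerically. The sketch below assumes that the \(H^2\) inner product is \(<u,v> = \int _0^2 (uv + u'v' + u''v'') dy\), which is consistent with the norm formula above but is our reading of it; it tests only the cosine modes \(k=1,3,5\) and uses a rectangle rule on a uniform grid.

```python
import numpy as np

N = 2048
y = np.linspace(0.0, 2.0, N, endpoint=False)   # uniform periodic grid on [0, 2)
dy = 2.0 / N

def c(k):
    """Normalising constant c_k = (1 + (k pi)^2 + (k pi)^4)^(-1/2)."""
    return (1.0 + (k * np.pi)**2 + (k * np.pi)**4) ** -0.5

def h2_inner(k, j):
    """H^2 inner product of c_k cos(k pi y) and c_j cos(j pi y), assuming
    <u, v> = int_0^2 (u v + u' v' + u'' v'') dy."""
    u, v = np.cos(k * np.pi * y), np.cos(j * np.pi * y)
    du, dv = -k * np.pi * np.sin(k * np.pi * y), -j * np.pi * np.sin(j * np.pi * y)
    d2u, d2v = -(k * np.pi)**2 * u, -(j * np.pi)**2 * v
    return c(k) * c(j) * np.sum(u * v + du * dv + d2u * d2v) * dy

gram = np.array([[h2_inner(k, j) for j in (1, 3, 5)] for k in (1, 3, 5)])

# Odd modes flip sign under y -> y + 1, so the sum over the two shifts vanishes.
shift_sum = np.cos(3 * np.pi * y) + np.cos(3 * np.pi * (y + 1.0))
print(np.round(gram, 12), np.abs(shift_sum).max())
```

The Gram matrix is the identity up to rounding, and the vanishing shift sum illustrates why only odd frequencies are compatible with the constraint (11) when \(n=2\).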

Let \(u(t,y)\) be a trajectory of the flow \({\mathcal {F}}_t(g)\). For each t,

$$\begin{aligned} u(t,y) = \frac{y}{2} + \sum _{k=1}^\infty { a_{2k-1}(t) } \cos [ (2k-1) \pi y] + b_{2k-1} (t) \sin [(2k-1)\pi y] \in G_2. \end{aligned}$$

For a fixed value of t, the gradient vector at \(u(t,y)\) is

$$\begin{aligned} X(u(t,y))= & {} \sum _{m=1}^\infty<X(u(t,y)), c_{2m-1} \cos (2m-1)\pi y> c_{2m-1} \cos (2m-1)\pi y \\{} & {} +<X(u(t,y)), c_{2m-1} \sin (2m-1)\pi y> c_{2m-1} \sin (2m-1)\pi y \\= & {} \sum _{m=1}^\infty c_{2m-1}^2 \left[<X(u(t,y)), \cos (2m-1)\pi y> \cos (2m-1)\pi y \right. \\{} & {} \left. +<X(u(t,y)), \sin (2m-1)\pi y > \sin (2m-1)\pi y \right] \\= & {} \sum _{m=1}^\infty c_{2m-1}^2 \left[ \int _0^2 \frac{u_{yy}}{u_y} \cos (2m-1) \pi y dy\ \cos (2m-1)\pi y \right. \\{} & {} +\left. \int _0^2 \frac{u_{yy}}{u_y} \sin (2m-1) \pi y dy \ \sin (2m-1)\pi y \right] . \end{aligned}$$

Notice that

$$\begin{aligned} u_t=\frac{d {\mathcal {F}}_t(g)}{dt} = \sum _{m=1}^\infty { \dot{a}_{2m-1} } \cos [ (2m-1) \pi y] + \dot{b}_{2m-1} \sin [(2m-1)\pi y] \in \Phi _2. \end{aligned}$$

Comparing the coefficients of this expansion with those of \(X(u(t,y))\), we obtain explicitly a system of ordinary differential equations defined on \(G_2\) that generates the gradient flow, i.e., the local flow \({\mathcal {F}}_t(g)\) is the solution to the system of differential equations:

$$\begin{aligned}{} & {} \dot{a}_{2m-1} = c_{2m-1}^2 \int _0^2\frac{u_{yy}}{u_y} \cos (2m-1) \pi y dy; \end{aligned}$$
(13)
$$\begin{aligned}{} & {} \dot{b}_{2m-1} = c_{2m-1}^2 \int _0^2 \frac{u_{yy}}{u_y} \sin (2m-1) \pi y dy, \end{aligned}$$
(14)

where

$$\begin{aligned} u_y = \frac{1}{2} + \pi \sum _{k=1}^\infty (2k-1)\left[ - { a_{2k-1} } \sin [ (2k-1) \pi y] + b_{2k-1} \cos [(2k-1)\pi y]\right] . \end{aligned}$$
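For concreteness, the right-hand sides of (13) and (14) can be evaluated for a finite set of coefficients. The sketch below is a hypothetical helper, not part of the original computation: the name `rhs_13_14`, the truncation to finitely many odd modes, the rectangle-rule quadrature and the sample coefficients are all assumptions made for illustration.

```python
import numpy as np

N = 2048
y = np.linspace(0.0, 2.0, N, endpoint=False)   # uniform periodic grid on [0, 2)
dy = 2.0 / N

def rhs_13_14(a, b):
    """Truncated right-hand sides of (13) and (14).

    a[m-1], b[m-1] hold a_{2m-1}, b_{2m-1} for m = 1..len(a); all higher
    modes are set to zero (a truncation assumed here for illustration).
    """
    odd = 2 * np.arange(1, len(a) + 1) - 1          # 1, 3, 5, ...
    w = odd * np.pi
    C, S = np.cos(np.outer(y, w)), np.sin(np.outer(y, w))
    u_y = 0.5 + S @ (-a * w) + C @ (b * w)
    u_yy = C @ (-a * w**2) + S @ (-b * w**2)
    if np.any(u_y <= 0.0):
        raise ValueError("u_y must stay positive on [0, 2]")
    ratio = u_yy / u_y
    c2 = 1.0 / (1.0 + w**2 + w**4)                  # c_{2m-1}^2 for r = 2
    a_dot = c2 * (C.T @ ratio) * dy                 # equation (13)
    b_dot = c2 * (S.T @ ratio) * dy                 # equation (14)
    return a_dot, b_dot

a0 = np.zeros(3)
b0 = np.array([0.05, 0.0, 0.0])                     # illustrative small perturbation
print(rhs_13_14(a0, b0))
```

With all \(a_{2m-1}=0\) the derivative \(u_y\) is even in y, so the computed \(\dot{a}_{2m-1}\) vanish up to rounding, while \(\dot{b}_1\) is negative and pushes the trajectory toward the linear map.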

5.2 The partial differential equation connection

Since the gradient vector X at \(u(t,y)\in G_2\) is defined by

$$\begin{aligned} <X( u(t,y)), \varphi > = \int _0^2 \frac{u_{yy}}{u_y} \varphi dy,\ \varphi \in \Phi _2, \end{aligned}$$

there is a close connection between the gradient flow \({\mathcal {F}}_t(g)\) and the solution to the nonlinear partial differential equation \(w_t = \frac{w_{yy}}{w_y}\), a gradient-dependent diffusion equation defined on the unit circle. Assume that

$$\begin{aligned} w(t,y) = \frac{y}{2} + \sum _{k=1}^\infty a_{2k-1}(t) \cos (2k-1)\pi y + b_{2k-1}(t) \sin (2k-1)\pi y \in G_2 \end{aligned}$$

is a solution to \(w_t = \frac{w_{yy}}{w_y}\) in some open interval of t and \(w_t \) and \( \frac{w_{yy}}{w_y}\) are both in \(L^2[0, 2]\) for each t. We have

$$\begin{aligned} \sum _{k=1}^\infty \dot{a}_{2k-1}(t) \cos (2k-1)\pi y + \dot{b}_{2k-1}(t) \sin (2k-1)\pi y = \frac{w_{yy}}{w_y} \end{aligned}$$

as functions of \(L^2[0,2]\) for each t. By orthogonality of the set

$$\begin{aligned} \{\cos (2m-1)\pi y, \sin (2m-1)\pi y \}_{m=1}^\infty \end{aligned}$$

and

$$\begin{aligned} \int _0^2 \cos ^2 (2m-1)\pi y dy = \int _0^2 \sin ^2 (2m-1)\pi y dy =1, \end{aligned}$$

we have

$$\begin{aligned} \dot{a}_{2m-1}(t)&= \int ^2_0 \frac{w_{yy}}{w_y} \cos (2m-1)\pi y dy , \end{aligned}$$
(15)
$$\begin{aligned} \dot{b}_{2m-1}(t)&= \int ^2_0 \frac{w_{yy}}{w_y} \sin (2m-1)\pi y dy , \end{aligned}$$
(16)

where

$$\begin{aligned} w_y = \frac{1}{2} + \pi \sum _{k=1}^\infty (2k-1)[-a_{2k-1}(t) \sin (2k-1)\pi y + b_{2k-1}(t) \cos (2k-1)\pi y]. \end{aligned}$$

The systems (13)-(14) and (15)-(16) differ only by the constant factor \(c_{2m-1}^2\) in front of each equation.

Fig. 1

Graphs of the deviation of the derivative of the inverse map from its equilibrium for various values of t along a trajectory of the gradient flow

Fig. 2

The dotted line is the graph of a cosine curve. The solid line is the graph of the deviation of the derivative of the inverse map from its equilibrium when t is large. The vertical axis is re-scaled with a factor of 1000

5.3 Numerical approximation of a flow trajectory

We limit the scope of our numerical exploration to the case when \(g'\) is an even function:

$$\begin{aligned} g'= \frac{1}{2} + \pi \sum _{k=1}^\infty (2k-1) b_{2k-1}\cos (2k-1)\pi y, \ g''=- \pi ^2 \sum _{k=1}^\infty (2k-1)^2 b_{2k-1}\sin (2k-1)\pi y. \end{aligned}$$

We have \(\dot{a}_{2\,m-1}= 0\) for all \(m \in {\mathbb {N}}\). Thus, the system of ODEs in (14) is reduced to

$$\begin{aligned} \dot{b}_{2m-1} = c_{2m-1}^2 \int _0^2 \frac{g''}{g'} \sin (2m-1) \pi y dy, \end{aligned}$$
(17)

where

$$\begin{aligned} g' = \frac{1}{2} + \pi \sum _{k=1}^\infty (2k-1) b_{2k-1} \cos [(2k-1)\pi y]. \end{aligned}$$

Let \(\pi y=\tau . \) We have

$$\begin{aligned} \dot{b}_{2m-1} = c_{2m-1}^2 \int _0^{2\pi } \frac{g''}{g'} \sin [ (2m-1) \tau ] \frac{d\tau }{\pi }, \end{aligned}$$

where \(g' = \frac{1}{2} + \pi \sum _{k=1}^\infty b_{2k-1} (2k-1) \cos [(2k-1)\tau ]\) and \(g''=- \pi ^2 \sum _{k=1}^\infty (2k-1)^2 b_{2k-1}\sin [(2k-1)\tau ]\). Denote \( B_k=\pi b_{2k-1}(2k-1)\), \(h(\tau ) = g' = \frac{1}{2} + \sum _{k=1}^\infty B_k \cos [ (2k-1)\tau ]\) and \(h'(\tau ) = - \sum _{k=1}^\infty B_k (2k-1) \sin [ (2k-1)\tau ]\), so that \(g'' = \pi h'(\tau )\). The system (17) then becomes

$$\begin{aligned} \dot{b}_{2m-1} = c_{2m-1}^2 \int _0^{2\pi } \frac{h'}{h} \sin [ (2m-1) \tau ] d\tau . \end{aligned}$$
(18)

Replacing \( \dot{b}_{2m-1}\) in (18) by \(\frac{\dot{B}_m}{\pi (2\,m-1)}\), we have

$$\begin{aligned} \dot{B}_{m} = \pi (2m-1) c_{2m-1}^2 \int _0^{2\pi } \frac{h'}{h} \sin [ (2m-1) \tau ] d\tau , \end{aligned}$$

where

$$\begin{aligned} h(\tau ) = \frac{1}{2} + \sum _{k=1}^\infty B_k \cos [(2k-1)\tau ]. \end{aligned}$$

Or, in a single complete formula,

$$\begin{aligned} \dot{B}_{m}&= \frac{- \pi (2m-1)}{ 1 + (2m-1)^2 \pi ^2 + (2m-1)^4\pi ^4} \nonumber \\&\quad \cdot \int _0^{2\pi } \frac{ \sum _{k=1}^\infty B_k (2k-1) \sin [(2k-1)\tau ]}{ \frac{1}{2} + \sum _{k=1}^\infty B_k \cos [(2k-1)\tau ] } \sin [ (2m-1) \tau ] d\tau .\end{aligned}$$
(19)

Let \(F_m( \{B_k\})\) denote the right hand side of the equation (19).

We use Euler’s method to approximate a trajectory of this system of ODEs:

$$\begin{aligned} B_m( k \epsilon ) = B_m( (k-1) \epsilon ) + \epsilon F_m ( \{B_m( (k-1)\epsilon )\}), k=1,2,\ldots , \end{aligned}$$

where \(\epsilon >0\) is the step size and the initial point is \(\{B_m(0)\}\).

We choose \(B_1(0)= \frac{1}{4}\) and \(B_m(0)=0, m>1\), i.e., the initial map’s derivative is \(h(\tau )= \frac{1}{2} + \frac{1}{4} \cos \tau .\) We have

$$\begin{aligned} B_m( \epsilon ) = B_m( 0) + \epsilon F_m( \{B_m(0)\}), m\ge 1 \end{aligned}$$

where

$$\begin{aligned} F_m( \{B_m(0)\})= & {} - \pi (2m-1)c_{2m-1}^2 \int _0^{2\pi } \frac{ \frac{1}{4} \sin \tau }{ \frac{1}{2} + \frac{1}{4} \cos \tau } \sin (2m-1)\tau d \tau \\= & {} - \pi (2m-1) c_{2m-1}^2 \int _0^{2\pi } \frac{ \sin \tau }{ 2 + \cos \tau } \sin (2m-1)\tau d \tau . \end{aligned}$$

These values can be easily computed using numerical integration. We see that \(B_m(\epsilon )\) is generally nonzero for every \(m \ge 1\), regardless of how small the step size \(\epsilon >0\) is.

The numerical simulation of solutions to the system (19) is carried out in Maple (Maplesoft). Since \(B_m(k\epsilon )\) decays very fast in m, we keep only three terms in the Galerkin method. The step size in Euler’s method [10] is \(\epsilon =0.1\).
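The truncated Euler scheme can be sketched in a few lines. The code below is not the Maple computation and is not claimed to reproduce its figures; it is an independent illustration that keeps three modes, uses \(\epsilon = 0.1\) and the initial data \(B_1(0)=\frac{1}{4}\) as in the text, and assumes a rectangle-rule quadrature on a uniform grid. The names `F`, `euler`, `MODES` and `EPS` are our own.

```python
import numpy as np

MODES = 3                      # B_1, B_2, B_3 (three-term truncation, as in the text)
EPS = 0.1                      # Euler step size, as in the text
N = 4096                       # quadrature points (an assumption of this sketch)
tau = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dtau = 2.0 * np.pi / N

odd = 2 * np.arange(1, MODES + 1) - 1                    # 1, 3, 5
C = np.cos(np.outer(tau, odd))
S = np.sin(np.outer(tau, odd))
coef = -np.pi * odd / (1.0 + (odd * np.pi)**2 + (odd * np.pi)**4)

def F(B):
    """Right-hand side of (19), truncated to MODES cosine modes."""
    h = 0.5 + C @ B                # h(tau) = 1/2 + sum_k B_k cos[(2k-1) tau]
    numer = S @ (B * odd)          # sum_k B_k (2k-1) sin[(2k-1) tau]
    return coef * (S.T @ (numer / h)) * dtau

def euler(B0, steps):
    """Explicit Euler iterates B(k * EPS), k = 0..steps."""
    traj = [np.asarray(B0, dtype=float)]
    for _ in range(steps):
        traj.append(traj[-1] + EPS * F(traj[-1]))
    return np.array(traj)

traj = euler([0.25, 0.0, 0.0], steps=200)
print(traj[1])     # coefficients after one step
print(traj[-1])    # coefficients after 200 steps
```

Already after the first step the coefficients of \(\cos 3\tau \) and \(\cos 5\tau \) are nonzero, which is the qualitative behaviour the section emphasizes; the quadrature rule and grid size can be changed without affecting this conclusion.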

In Fig. 1, graphs of \(\frac{d}{dy} {\mathcal {F}}_t(g) - \frac{1}{2} \) are shown for three values of t, \(t=0, \) \(t= 10\), and \(t= 20\):

$$\begin{aligned} t=0, \quad&h_0 = \frac{1}{2} + \frac{1}{4} \cos \tau .\\ t=10, \quad&h_{100} \approx \frac{1}{2} + 0.121 \cos \tau - 0.000198 \cos 3 \tau - 0.00000196 \cos 5 \tau . \\ t=20, \quad&h_{200} \approx \frac{1}{2} + 0.043 \cos \tau - 0.000196 \cos 3 \tau - 0.00000186 \cos 5 \tau . \end{aligned}$$

In Fig. 2, we show the difference between \(c_m \cos (y)\) (from the trajectory of the heat equation \(u_t =u_{xx}\) with the same initial value) and \(\frac{d}{dy} {\mathcal {F}}_t(g) - \frac{1}{2} \) for a larger value of t. The vertical axis is re-scaled by a factor of 1000. At \(t=50\),

$$\begin{aligned} h_{500} \approx \frac{1}{2} + 0.000431 \cos \tau - 0.000113 \cos 3 \tau - 0.00000152 \cos 5 \tau . \end{aligned}$$

Ending Remarks

We see that the diffusion process from this gradient flow is different from that of the heat equation. Due to the linearity, the flow from the heat equation does not create higher frequency terms if the initial heat distribution does not have them. In the gradient flow induced by the SRB entropy, the higher frequency terms appear immediately when t increases even though the amplitudes of these high frequency terms are very small.