Variational structures beyond gradient flows: a macroscopic fluctuation-theory perspective

Macroscopic equations arising out of stochastic particle systems in detailed balance (called dissipative systems or gradient flows) have a natural variational structure, which can be derived from the large-deviation rate functional for the density of the particle system. While large deviations can be studied in considerable generality, these variational structures are often restricted to systems in detailed balance. Using insights from macroscopic fluctuation theory, in this work we aim to generalise this variational connection beyond dissipative systems by augmenting densities with fluxes, which encode non-dissipative effects. Our main contribution is an abstract framework, which for a given flux-density cost and a quasipotential, provides a decomposition into dissipative and non-dissipative components and a generalised orthogonality relation between them. We then apply this abstract theory to various stochastic particle systems -- independent copies of jump processes, zero-range processes, chemical-reaction networks in complex balance and lattice-gas models.


Introduction
When studying an evolution equation, it is often helpful to know if it has an associated variational structure, in order to obtain physical insight and tools for mathematical analysis. An important example of such a structure is a gradient flow or dissipative system; in this case the structure consists of an energy functional and a dissipation mechanism, and the evolution equation is completely characterised by a corresponding minimisation problem involving these two objects. From a thermodynamic point of view, such a variational structure is often related to random fluctuations of an underlying microscopic particle system via a large-deviation principle; examples include the Boltzmann-Gibbs-Helmholtz free energy and the Onsager-Machlup theory.
It has recently become clear that macroscopic equations are always dissipative (that is, gradient flows) if the underlying microscopic stochastic system is in detailed balance. The energy functional and the dissipation mechanism for such macroscopic equations are then uniquely determined by an appropriate decomposition of the large-deviation rate functional associated to the microscopic systems [ADPZ11, ADPZ13, MPR14, PRV14]. These observations have provided a canonical approach to constructing a variational structure for such macroscopic equations. In addition to having a clear physical interpretation, these variational structures have been used to isolate interesting features of the macroscopic equations and study singular-limit problems arising therein.
So far, this approach has largely been limited to particle systems in detailed balance and corresponding macroscopic dissipative systems. Since a large deviation study is possible far beyond detailed balance, this leads to the following natural question.
Do the large deviations of the underlying particle systems provide a variational structure beyond detailed balance?
While this is a hard question to answer in general, considerable progress has been made in the case of some specific systems in two seemingly independent directions. One direction that is tailored to allow for non-dissipative effects is the study of so-called FIR inequalities, first introduced for the many-particle limit of Vlasov-type nonlinear diffusions [DLPS17], independent particles on a graph [HPST20] and chemical reactions [RZ21, Sec. 5]. These inequalities bound the free-energy difference and Fisher information by the large-deviation rate functional, providing a useful tool to study singular-limit problems and to derive error estimates [DLP+18, PR21]. Strictly speaking, these inequalities are not variational structures in the sense that they do not fully determine the macroscopic dynamics. However, in this paper we will construct a variational structure which generalises these inequalities and completely characterises the macroscopic dynamics.
Another direction of generalising dissipative systems is by using Macroscopic Fluctuation Theory (MFT) [BDSG+15]. The main idea here is to consider, in addition to the usual density of the particle system, the particle fluxes at the microscopic level, and to study the large deviations of these fluxes. Consequently, using time-reversal arguments, MFT explicitly captures the dissipative and non-dissipative effects in the system. However, most MFT literature has been devoted to diffusive scaling of particle systems and corresponding quadratic rate functions. Such rate functions define a Hilbert space with a natural orthogonal decomposition into dissipative and non-dissipative components. Recently non-quadratic rate functions and connections to MFT have been explored in the case of independent particles on a graph [KJZ18] and chemical reaction networks [RZ21], but a general MFT for non-quadratic rate functions is largely open.
Spurred on by these exciting new developments, we provide a partial but affirmative answer to the question posed above. The basis of our analysis is an abstract action functional $(\rho, j) \mapsto \int_0^T L(\rho(t), j(t))\,dt$. This functional will correspond to the large deviations of random particle systems, but this identification is not necessary for our analysis; in this sense our approach is purely macroscopic. Inspired by FIR inequalities and MFT, we set up an abstract theory whose central outcome will be a series of decompositions of the integrand $L$ into distinct dissipative and non-dissipative components. These decompositions generalise: (1) the connection between large deviations and dissipative systems from [MPR14] to include non-dissipative effects, (2) the known cases of FIR inequalities [HPST20] to a general setting, and (3) MFT to non-quadratic action functions.
Finally we apply this abstract theory to the density-flux large-deviation rate functional for various stochastic particle systems without assuming detailed balance, and derive new variational formulations for the corresponding macroscopic equations.

Summary of results
Abstract results. Consider the macroscopic densities and fluxes $[0, T] \ni t \mapsto (\rho(t), j(t))$ that evolve according to a coupled system of evolution equations: $\dot\rho(t) = -\operatorname{div} j(t)$, (1.1a) $j(t) = j^0(\rho(t))$, (1.1b) with an associated action functional $(\rho, j) \mapsto \int_0^T L(\rho(t), j(t))\,dt$, (1.2) where the non-negative cost function $L$ has the crucial property that for any $(\rho, j)$, $j = j^0(\rho) \iff L(\rho, j) = 0$, and hence the action (1.2) is minimised by the trajectory (1.1b). We will interpret equation (1.1a) as a continuity equation and call $j^0(\rho)$ the zero-cost flux associated to $L$. Equation (1.1) often describes the macroscopic dynamics arising from a microscopic stochastic particle system and (1.2) is typically the corresponding large-deviation rate functional. Although writing the flux explicitly in (1.1b) instead of directly studying $\dot\rho(t) = -\operatorname{div} j^0(\rho(t))$ might seem superfluous at first sight, it is motivated by the fact that fluxes can encode information on non-dissipative, for instance divergence-free, effects in the system. Consequently, while studying densities is usually sufficient for dissipative systems [Ons31a,Ons31b,OM53,MPR14,MPPR17] (see Section 1.2 below for more details), the inclusion of fluxes is better suited to describe non-dissipative effects at the macroscopic level [BDSG+15,Mae18].
Our abstract theory requires the existence of three objects: a sufficiently regular density-flux cost function $L(\rho, j)$, an operator that will play the role of divergence and as such defines the continuity equation (1.1a), and a non-negative quasipotential $V$ associated to $L$. The basis of our approach will be the decomposition $L(\rho, j) = \Phi(\rho, j) + \Phi^*(\rho, F(\rho)) - \langle F(\rho), j\rangle$, where $F(\rho) := -d_j L(\rho, 0)$ (the derivative being taken in the flux variable) is called the driving force and $\Phi$ and its convex dual $\Phi^*$ the dissipation potentials, see Theorem 2.9 for details. This decomposition is standard in the literature [MPR14,KJZ18,Mae18] and corresponds to a (possibly nonlinear) force-flux response relation $j = d\Phi^*(\rho, F(\rho))$ for the zero-cost dynamics; it includes gradient flows as a special case as discussed in Section 1.2.1.
Borrowing ideas from MFT, we uniquely decompose this driving force into a symmetric and antisymmetric part $F(\rho) = F^{\mathrm{sym}}(\rho) + F^{\mathrm{asym}}(\rho)$.
On a macroscopic level, these notions of (anti)symmetry (defined in Section 2.3) are consistent with the time-reversal symmetry of Markov processes in the context of MFT and large deviations. In particular, if the microscopic system is in detailed balance, then $F(\rho) = F^{\mathrm{sym}}(\rho)$ and the (macroscopic) dynamics is purely dissipative, i.e. described by a gradient flow driven by a quasipotential $V$ [MPR14]. It turns out that even for systems that are not in detailed balance, the symmetric force $F^{\mathrm{sym}}$ always relates to such a $V$, which can be defined in terms of the cost $L$ (see Definition 2.6) and is a natural Lyapunov functional for the system. In particular, the symmetric part $F^{\mathrm{sym}}(\rho)$ is a conservative force driven by the quasipotential (energy) $V$. More generally, from a physical point of view, a purely dissipative system is thermodynamically closed, so that the work done is related to the free energy or quasipotential via $\langle F^{\mathrm{sym}}(\rho(t)), j(t)\rangle = -\tfrac12 \frac{d}{dt} V(\rho(t))$. (1.4) Thus for non-closed systems one can think of $F^{\mathrm{sym}}(\rho)$ as an internally generated force and the remainder, $F^{\mathrm{asym}}(\rho)$, as the force exerted by the system upon the environment. While $\langle F^{\mathrm{asym}}(\rho(t)), j(t)\rangle$ and $\langle F(\rho(t)), j(t)\rangle$ (1.5) can be understood as expressions of power or rates of work, in general there is no reason to expect these to be exact differentials. In our main result, Theorem 2.29, we relate the cost function $L$ to the three powers from (1.4) and (1.5). Specifically, for any $\lambda \in [0, 1]$, the cost function $L$ admits the following decompositions:
$L(\rho, j) = L_{(1-2\lambda)F}(\rho, j) + R^\lambda_F(\rho) - 2\lambda\langle F(\rho), j\rangle$, with $R^\lambda_F(\rho) \ge 0$, (1.6a)
$L(\rho, j) = L_{F-2\lambda F^{\mathrm{sym}}}(\rho, j) + R^\lambda_{F^{\mathrm{sym}}}(\rho) - 2\lambda\langle F^{\mathrm{sym}}(\rho), j\rangle$, with $R^\lambda_{F^{\mathrm{sym}}}(\rho) \ge 0$, (1.6b)
$L(\rho, j) = L_{F-2\lambda F^{\mathrm{asym}}}(\rho, j) + R^\lambda_{F^{\mathrm{asym}}}(\rho) - 2\lambda\langle F^{\mathrm{asym}}(\rho), j\rangle$, with $R^\lambda_{F^{\mathrm{asym}}}(\rho) \ge 0$. (1.6c)
The parameter $\lambda$ can be used to switch between different forces and the non-negative terms $L_G(\rho, j)$ are modified versions of $L$ where the driving force $F(\rho)$ is replaced by a different covector field $G(\rho)$.
Consequently, the zero-cost flux of $L_G$ will be a modified dynamics, different from (1.1b). Of particular interest is the case $\lambda = \tfrac12$, where the decompositions (1.6b) and (1.6c) can be seen as two different ways to split $L$ into purely dissipative and purely non-dissipative components. Indeed, the modified cost $L_{F^{\mathrm{sym}}}$ is related to a purely dissipative system that can be formalised as a gradient flow (see Section 1.2.1). By contrast, we interpret the zero-cost flux of $L_{F^{\mathrm{asym}}}$ as purely non-dissipative. Although the variational structure and physical interpretation of $L_{F^{\mathrm{asym}}}$ remains an open question (see discussion in Section 6), we show for certain examples that its zero-cost behaviour corresponds to a purely Hamiltonian macroscopic evolution. This idea is illustrated by Figure 1, where we plot the phase diagram for the zero-cost flux associated with $L_F$, $L_{F^{\mathrm{sym}}}$ and $L_{F^{\mathrm{asym}}}$ in the case of independent Markov jump particles on a three-point state space. For details on this example see Sections 2.6 and 4.
Figure 1. Phase diagram for the (zero-cost) trajectories $\rho(t)$ associated to (a) $L(\rho(t), j(t)) = 0$; (b) $L_{F^{\mathrm{sym}}}(\rho(t), j(t)) = 0$; (c) $L_{F^{\mathrm{asym}}}(\rho(t), j(t)) = 0$. Here $\rho_i$ is the mass at point $i$ and we do not plot $\rho_3$ since $\sum_i \rho_i = 1$. The zero-cost trajectories for $L_{F^{\mathrm{sym}}}$ and $L_{F^{\mathrm{asym}}}$ follow a purely dissipative and Hamiltonian dynamics respectively.
The middle terms on the right-hand side of (1.6) are inspired by [HPST20, Def. 1.5], [RZ21, Sec. 5], and are called generalised Fisher informations. For $\lambda \in [0, 1]$ and covector fields $G = F, F^{\mathrm{sym}}, F^{\mathrm{asym}}$, they are defined as $R^\lambda_G(\rho) := -H(\rho, -2\lambda G(\rho))$, (1.7) where $H$ is the convex dual of $L$. The terminology is motivated by the small-$\lambda$ behaviour of $R^\lambda_G$ (see Proposition 2.18), which in the case $G = F^{\mathrm{sym}}$ is given by the time derivative or dissipation rate of the quasipotential along the zero-cost path, i.e. in the limit $\lambda \to 0$, $R^\lambda_{F^{\mathrm{sym}}}$ coincides with the classical Fisher information [HPST20]. The non-negativity of the generalised Fisher informations in (1.6) is essential, since it shows that the three powers in (1.4) and (1.5) are non-negative along the zero-cost flux, thus generalising the second law of thermodynamics.
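To make definition (1.7) concrete, the following minimal sketch evaluates the generalised Fisher informations for independent jump particles on three points. Several ingredients are assumptions that do not appear in the text above: the generator `Q` is a hypothetical example, the Hamiltonian on net fluxes is taken to be $H(\rho,\zeta) = \sum_{x<y} [\rho_x Q_{xy}(e^{\zeta_{xy}} - 1) + \rho_y Q_{yx}(e^{-\zeta_{xy}} - 1)]$, and the quasipotential is taken to be the relative entropy with respect to the invariant measure.

```python
import math

# Hypothetical 3-state generator (rows sum to zero), biased around the cycle
# 0 -> 1 -> 2 -> 0, so it is NOT in detailed balance. Its invariant measure
# is uniform by cyclic symmetry.
Q = [[-3.0, 2.0, 1.0],
     [1.0, -3.0, 2.0],
     [2.0, 1.0, -3.0]]
PI = [1.0 / 3.0] * 3
EDGES = [(0, 1), (0, 2), (1, 2)]  # net fluxes live on ordered pairs x < y


def hamiltonian(rho, zeta):
    """ASSUMED convex dual H(rho, zeta) of the jump-process cost on net fluxes."""
    total = 0.0
    for x, y in EDGES:
        total += rho[x] * Q[x][y] * (math.exp(zeta[(x, y)]) - 1.0)
        total += rho[y] * Q[y][x] * (math.exp(-zeta[(x, y)]) - 1.0)
    return total


def force(rho):
    """F = -d_j L(rho, 0): minus the covector zeta at which d_zeta H(rho, zeta) = 0."""
    return {(x, y): 0.5 * math.log(rho[x] * Q[x][y] / (rho[y] * Q[y][x]))
            for x, y in EDGES}


def force_sym(rho):
    """F^sym = -(1/2) grad dV, with V the relative entropy w.r.t. PI (assumed)."""
    return {(x, y): 0.5 * math.log(rho[x] * PI[y] / (rho[y] * PI[x]))
            for x, y in EDGES}


def fisher(rho, G, lam):
    """Generalised Fisher information R^lam_G(rho) = -H(rho, -2 lam G(rho))."""
    return -hamiltonian(rho, {e: -2.0 * lam * G[e] for e in EDGES})


rho = [0.5, 0.3, 0.2]
F = force(rho)
F_sym = force_sym(rho)
F_asym = {e: F[e] - F_sym[e] for e in EDGES}  # remainder of the force
for lam in (0.25, 0.5, 0.75):
    print(lam, fisher(rho, F, lam), fisher(rho, F_sym, lam), fisher(rho, F_asym, lam))
```

Under these assumptions the printed values should all be non-negative, in line with (1.6), and the antisymmetric force does not depend on the density.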
Scope. To highlight the minimal underlying structure required to obtain the decompositions (1.6), analysis will be carried out in a general abstract setting. This implies that our results can be applied to a broad range of models, not restricted to large deviations or to continuity equations of divergence-type. In theory, after properly setting up the spaces, the only requirements of analysis will be the cost function L together with a continuity equation of the form (1.1a). However for specific applications, explicit calculations are restricted to cost functions L for which the associated quasipotential V is known. For the purpose of this paper, we define the quasipotential in terms of a Hamilton-Jacobi-Bellman equation (Definition 2.6), and solve it for a number of examples. For cost functions that are derived from large deviations, this definition coincides with the large-deviation rate functional of the invariant measure (see Theorem 3.7). However we reiterate that the abstract definition is purely macroscopic and does not require connections to large deviations.
Application. All three decompositions (1.6) are power balances, split into purely dissipative and purely non-dissipative powers in a physically consistent way. From a mathematical perspective, this generalises ideas from dissipative systems to a larger class of systems which include non-dissipative effects. For dissipative systems (F asym (ρ) ≡ 0) these decompositions coincide with the variational formulation of a gradient flow (see Section 1.2.1). However, our abstract theory only requires a suitably convex cost L and quasipotential V for the decompositions (and therefore the corresponding variational ideas) to hold. Lyapunov functions, Fisher informations and dissipation potentials are central ingredients in gradient-flow theory and often difficult to discern in non-dissipative systems (for instance the laws of non-reversible Markov processes). This work provides explicit formulae for these objects in terms of the cost and the quasipotential.
For the zero-cost dynamics (1.1), our results imply that the three powers $\langle F, j\rangle$, $\langle F^{\mathrm{sym}}, j\rangle$ and $\langle F^{\mathrm{asym}}, j\rangle$ are always non-negative, and in particular that $V$ is a Lyapunov functional with an explicit expression for its decay (rather than merely an upper bound).
By contrast, the decay (1.3) of the quasipotential $V$ is bounded by a FIR inequality, which connects the cost to the quasipotential and Fisher information. These inequalities are crucial in studying singular limits in non-dissipative systems, for instance to prove compactness of densities and fluxes in suitable topologies. However they are only available in a limited setting. It turns out that since the modified cost functions $L_G$ in (1.6) are non-negative, the FIR inequalities naturally arise from these decompositions and therefore we provide a universal recipe to arrive at such inequalities. In fact, the decompositions (1.6) explicitly characterise the gap in the FIR inequalities. For more details see Section 1.2.3.
The aforementioned gap in the inequalities corresponds to the term $L_G$ on the right-hand side of (1.6). This new term exactly characterises the influence of non-dissipative effects on the variational structure and the corresponding macroscopic evolution. This is especially revealing for jump processes where we find that purely non-dissipative systems ($F^{\mathrm{sym}}(\rho) \equiv 0$) correspond to Hamiltonian-type structures.
From a physical standpoint, the decompositions (1.6) can be interpreted as a novel combination of gradient flows and Hamiltonian systems, in a similar spirit to GENERIC (see Section 1.2.2). However, we stress that all of our examples, apart from the lattice-gas model, cannot be cast into the GENERIC framework. This work also provides a framework to study physically relevant 'open-boundary' jump-process systems (see a recent application in [RS22]).
Finally, these decompositions also have numerical implications, since numerical schemes inspired by gradient-flow structures of evolution equations have gained importance in recent years [CCH15]. Numerical schemes often add artificial non-reversibility to speed up convergence to equilibrium, but their analysis is tricky except in special situations [LNP13]. The decompositions (1.6) explicitly characterise the role of Fisher informations and antisymmetric forces, and a natural goal would be to optimise this force to speed up convergence.
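The Lyapunov property can be checked numerically in the simplest setting. The sketch below is illustrative only and rests on assumptions not stated above: a hypothetical three-state generator `Q`, the zero-cost net flux $j^0_{xy}(\rho) = \rho_x Q_{xy} - \rho_y Q_{yx}$, and the relative entropy as quasipotential. It compares the time derivative of $V$ along $\dot\rho = Q^T\rho$ with $-2\langle F^{\mathrm{sym}}, j^0\rangle$ and tracks $V$ along an explicit Euler path.

```python
import math

# Hypothetical 3-state generator (rows sum to zero), not in detailed balance;
# its invariant measure is uniform by cyclic symmetry.
Q = [[-3.0, 2.0, 1.0],
     [1.0, -3.0, 2.0],
     [2.0, 1.0, -3.0]]
PI = [1.0 / 3.0] * 3
EDGES = [(0, 1), (0, 2), (1, 2)]


def zero_cost_flux(rho):
    """Assumed zero-cost net flux j0_xy = rho_x Q_xy - rho_y Q_yx for x < y."""
    return {(x, y): rho[x] * Q[x][y] - rho[y] * Q[y][x] for x, y in EDGES}


def velocity(rho):
    """Zero-cost velocity u0 = Q^T rho."""
    n = len(rho)
    return [sum(rho[y] * Q[y][x] for y in range(n)) for x in range(n)]


def quasipotential(rho):
    """Assumed quasipotential: relative entropy of rho with respect to PI."""
    return sum(r * math.log(r / p) for r, p in zip(rho, PI))


def sym_power(rho):
    """Power <F^sym, j0> with F^sym_xy = (1/2) log(rho_x pi_y / (rho_y pi_x))."""
    j0 = zero_cost_flux(rho)
    return sum(0.5 * math.log(rho[x] * PI[y] / (rho[y] * PI[x])) * j0[(x, y)]
               for x, y in EDGES)


def dV_dt(rho):
    """<dV(rho), u0>; the constant part of dV drops out because sum(u0) = 0."""
    return sum(math.log(r / p) * u for r, p, u in zip(rho, PI, velocity(rho)))


# Explicit Euler path of the zero-cost dynamics, recording V at each step.
rho = [0.7, 0.2, 0.1]
dt = 0.01
values = []
for _ in range(100):
    values.append(quasipotential(rho))
    rho = [r + dt * u for r, u in zip(rho, velocity(rho))]

print(dV_dt([0.7, 0.2, 0.1]), -2.0 * sym_power([0.7, 0.2, 0.1]))
```

The two printed numbers should agree up to rounding, which is the discrete counterpart of (1.4) evaluated along the zero-cost flux.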
Examples. Above we discussed the abstract framework and theory derived from it; this theory is purely macroscopic in that we do not require any connection to particle systems and large deviations. In the latter part of this paper we apply this abstract theory to several microscopic particle systems.
First, we focus on independent Markov jump particles on a finite graph as a guiding example throughout this paper, and generalise the results of [KJZ18]. Second, we study zero-range processes in a scaling which leads to an ordinary differential equation (ODE) in the limit. Third, we study chemical reaction networks in complex balance [AK11] and generalise the results in [RZ21]. In all three of these examples the macroscopic dynamics are ODEs and the large-deviation principle yields an exponential rate functional. Finally, we focus on the setting of particles that hop on a lattice in a diffusive limit, which leads to a drift-diffusion equation as the macroscopic evolution. These particles can either be independent random walkers or interact via exclusion. In this setting, the large-deviation principle yields a quadratic rate functional, and we recover the classical MFT results [BDSG+15].
Boundary issues and global-in-time decompositions. The decompositions (1.6) do not involve time, and therefore when considering trajectories $t \mapsto (\rho(t), j(t))$, they should be considered as local-in-time or instantaneous decompositions of $L(\rho(t), j(t))$ at time $t$. Naively, one would simply integrate in time to obtain global decompositions of the rate functional $\int_0^T L(\rho(t), j(t))\,dt$ for arbitrary trajectories $(\rho, j)$. This argument is formal since, strictly speaking, the decompositions (1.6) hold only for those $(\rho, j)$ for which the required terms are defined. More precisely, it turns out that the forces $F$, $F^{\mathrm{sym}}$ and $F^{\mathrm{asym}}$ are well-defined only on a proper subset of the domain of definition for the modified cost functions $L_G$ and generalised Fisher informations $R^\lambda_G$. This issue, which is often ignored in the MFT literature, becomes clear in the various examples we consider. For instance when dealing with independent jump processes on a finite lattice $X$, the large-deviation cost is well defined for any trajectory in the space of probability measures, i.e. $\rho(t) \in P(X)$ (see Example 2.1), whereas the symmetric force is only well-defined for trajectories in the space of strictly positive probability measures, i.e. $\rho(t) \in P_+(X)$ (see (2.29)). This difference in the domains arises due to the logarithm present in the definition of the symmetric force. Such issues are typically dealt with by first extending the domains of definition of the forces involved by appropriately regularising them, second proving the decompositions on these extended domains, and finally passing to the limit in the regularisations (see for instance the proof of [HPST20, Thm. 1.6]). Although we expect that similar arguments can be applied to (1.6) to arrive at global-in-time decompositions, in this first study we focus on local-in-time results.

Related work
As mentioned earlier, this work connects and generalises existing literature in various directions. Barring fairly recent works [KJZ18,Ren18b,RZ21] which deal with particular examples, the connections between MFT, dissipative systems and FIR inequalities have largely been unexplored in the literature. Not all of these works consider fluxes, and so we will also make use of a 'contracted' cost function, $\hat L(\rho, u) := \inf_{j :\, u = -\operatorname{div} j} L(\rho, j)$, (1.8) where the velocity $u$ is a placeholder for $\dot\rho(t)$ and $-\operatorname{div}$ is the abstract operator that maps fluxes to velocities as in (1.1a). This construction is consistent with the notion of contraction in large deviations (see Example 2.1). Since $\hat L(\rho, -\operatorname{div} j^0(\rho)) = 0$, we refer to $u^0(\rho) := -\operatorname{div} j^0(\rho)$ as the zero-cost velocity.

Dissipative/Gradient systems
In the case of dissipative systems $F = F^{\mathrm{sym}}$ and $F^{\mathrm{asym}} = 0$, and therefore with $\lambda = \tfrac12$ both (1.6a) and (1.6b) become $L(\rho, j) = \Phi(\rho, j) + \Phi^*(\rho, F^{\mathrm{sym}}(\rho)) - \langle F^{\mathrm{sym}}(\rho), j\rangle$, (1.9) with the convex dual pair of dissipation potentials defined as $\Phi(\rho, j) := L_0(\rho, j)$ and $\Phi^*(\rho, \zeta) := \sup_j \{\langle\zeta, j\rangle - \Phi(\rho, j)\}$. This decomposition of $L$ is exactly the characterisation of dissipative systems in the density-flux setting [Mae18,Ren18b]; see Section 2.6 for a further elaboration. Using (1.4), $F^{\mathrm{sym}} = -\tfrac12 \nabla dV$ (see Corollary 2.21 for definition) and applying the contraction (1.8), we switch to the density setting $\hat L(\rho, u) = \hat\Psi(\rho, u) + \hat\Psi^*(\rho, -\tfrac12 dV(\rho)) + \tfrac12\langle dV(\rho), u\rangle$, (1.10) where $\hat\Psi$ is the contraction of $\Phi$ and $\hat\Psi$, $\hat\Psi^*$ are convex duals of each other (see [Ren18b, Thm. 3] for details). The identity (1.10) is the standard decomposition of the density cost function that characterises a dissipative system or generalised gradient flow in the following sense. For the zero-cost velocity, the left-hand side satisfies $\hat L(\rho, u^0(\rho)) = 0$, and the vanishing of the right-hand side of (1.10) is the Energy-Dissipation identity (EDI) [SS04,AGS08,RMS08], which is equivalent by convex duality to $u^0(\rho) = d_\xi \hat\Psi^*(\rho, -\tfrac12 dV(\rho))$, (1.11) where $d_\xi$ is the derivative with respect to the second argument. In the special case when $\hat\Psi^*(\rho, \xi) = \tfrac12 \langle K(\rho)\xi, \xi\rangle$ is a quadratic form with an inverse metric tensor $K(\rho)$ of a manifold, we arrive at the usual gradient-flow representation of the zero-cost velocity on that manifold, $u^0(\rho) = -\tfrac12 K(\rho)\, dV(\rho)$. This connection between generalised gradient flows and the symmetry $F = F^{\mathrm{sym}}$ at the level of densities has been explored more directly in [MPR14], where it was shown that this symmetry holds if $\hat L$ corresponds to the large-deviation principle of a Markov process in detailed balance. The density-flux formulation (1.9) of a dissipative system with quadratic dissipation has also been investigated extensively in the literature, see for instance [BDSG+15,Mae18,Ren18b]. Since we derived this decomposition from (1.6a) and (1.6b), these two decompositions can be thought of as the natural generalisations of the EDI to non-dissipative systems.

GENERIC
The GENERIC framework is specifically designed as a coupling between dissipative and non-dissipative effects in a thermodynamically consistent way [GÖ97,ÖG97,Ött05]. Although originally meant to describe evolution equations, recent work has also studied a natural connection between GENERIC and large deviations from a variational perspective (see (1.10)), stated as (1.12), where the Poisson structure $J$ and energy $E$ define the Hamiltonian part of the dynamics, and additional non-interaction conditions are required to ensure that the zero-cost velocity takes the form (1.13). Such a connection is discussed in [DPZ13] in the particular setting of weakly interacting diffusions and more recently in the context of hypocoercivity [DO21]. More generally, the recent paper [KLMP20] shows that (1.12) can only hold if the underlying microscopic system consists of stochastic dynamics in detailed balance combined with a deterministic drift. The drift may be replaced by stochastic fluctuations as long as they appear deterministic on the large-deviation scale [Ren18b], but any larger-scale fluctuations that are not in detailed balance will break down the GENERIC structure. Therefore, the class of large-deviation cost functions with a GENERIC structure is rather limited.
By contrast, the decompositions (1.6) always hold as soon as the quasipotential $V$ is identified. The crucial difference is that our decompositions are based on a decomposition of forces, i.e. $F(\rho) = F^{\mathrm{sym}}(\rho) + F^{\mathrm{asym}}(\rho)$, rather than a decomposition of fluxes or velocities as in GENERIC (1.13). Furthermore, generalised orthogonality between $F^{\mathrm{sym}}$ and $F^{\mathrm{asym}}$ (see Subsection 2.4) is a natural analogue of the non-interaction conditions used in GENERIC.

FIR inequalities
Since the modified cost $L_{F-2\lambda F^{\mathrm{sym}}}$ in (1.6b) is non-negative, discarding it and using $F^{\mathrm{sym}} = -\tfrac12\nabla dV$ together with the contraction (1.8) yields, for any $\lambda \in [0,1]$, $\hat L(\rho, u) \ge R^\lambda_{F^{\mathrm{sym}}}(\rho) + \lambda\langle dV(\rho), u\rangle$. (1.14) Assume that a smooth trajectory $[0, T] \ni t \mapsto \rho(t)$ satisfies (1.14) for every $t$. Substituting $u = \dot\rho$, formally applying the chain rule $\langle dV(\rho), \dot\rho\rangle = \frac{d}{dt} V(\rho)$, and integrating in time over $[0, T]$ we arrive at the F("free energy")-I("rate functional")-R("Fisher information") inequality [HPST20, Thm. 1.6] $\lambda V(\rho(T)) + \int_0^T R^\lambda_{F^{\mathrm{sym}}}(\rho(t))\,dt \le \lambda V(\rho(0)) + \int_0^T \hat L(\rho(t), \dot\rho(t))\,dt$. (1.15) Therefore, the decomposition (1.6b) can be thought of as a generalisation of [HPST20] in various ways. First, (1.6b) holds fairly generally (in the abstract framework) and can be applied to systems well beyond independent copies of Markov jump processes studied in [HPST20]. Second, (1.6b) exactly characterises the gap in the inequality (1.14) via $L_{F-2\lambda F^{\mathrm{sym}}}$, which we discarded in this discussion due to its non-negativity. And third, a different version of the FIR inequality can also be derived from (1.6c).
It should be noted that the FIR inequalities have been used in the literature as a priori estimates to study singular limits, and we expect that the decomposition (1.6b) and inequality (1.14) will serve the same purpose for a considerably larger class of systems. However, in this paper we limit ourselves to the local-in-time decompositions (1.6b) as opposed to the global-in-time inequality (1.15) discussed in [HPST20], since moving from local to global descriptions is a nontrivial technical step outside the scope of this work.

MFT and (non-)quadratic cost function
As stated earlier, most MFT literature is concerned with the diffusive scaling of underlying stochastic particle systems, which converge to diffusion-type macroscopic partial differential equations and correspond to quadratic cost functions of the form [BDSG+15] $L(\rho, j) = \tfrac12 \|j - j^0(\rho)\|_\rho^2$, for some Hilbert norm $\|\cdot\|_\rho$.
Crucial arguments in MFT are based on the fact that the dissipative and the non-dissipative effects are orthogonal in this Hilbert space, i.e. $\langle F^{\mathrm{sym}}(\rho), F^{\mathrm{asym}}(\rho)\rangle_\rho \equiv 0$.
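In the quadratic case the decompositions (1.6) reduce to elementary algebra, which the following sketch verifies numerically. It is an illustration under explicit assumptions: the Hilbert norm is the Euclidean norm on a two-dimensional flux space, forces are identified with fluxes so that $F = j^0$, the vectors `F_SYM` and `F_ASYM` are hypothetical, and the modified cost is taken to be $L_G(\rho,j) = \Phi(\rho,j) + \Phi^*(\rho,G) - \langle G, j\rangle$ with $\Phi = \Phi^* = \tfrac12\|\cdot\|^2$.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scale(c, a):
    return [c * x for x in a]


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


# Hypothetical orthogonal splitting of the driving force at a fixed density.
F_SYM = [1.0, 2.0]
F_ASYM = [2.0, -1.0]           # <F_SYM, F_ASYM> = 0
F = [3.0, 1.0]                 # F = F_SYM + F_ASYM; here F coincides with j0


def cost(j):
    """Quadratic cost L(j) = 1/2 |j - j0|^2 with j0 = F."""
    d = sub(j, F)
    return 0.5 * dot(d, d)


def tilted_cost(j, G):
    """ASSUMED modified cost L_G = Phi(j) + Phi*(G) - <G, j>."""
    return 0.5 * dot(j, j) + 0.5 * dot(G, G) - dot(G, j)


def hamiltonian(zeta):
    """Convex dual of the quadratic cost: H(zeta) = 1/2 |zeta|^2 + <zeta, j0>."""
    return 0.5 * dot(zeta, zeta) + dot(zeta, F)


def fisher(G, lam):
    """Generalised Fisher information R^lam_G = -H(-2 lam G), cf. (1.7)."""
    return -hamiltonian(scale(-2.0 * lam, G))


def decomposition(j, G, lam):
    """Right-hand side of (1.6) for G in {F, F_SYM, F_ASYM}."""
    tilted_force = sub(F, scale(2.0 * lam, G))
    return tilted_cost(j, tilted_force) + fisher(G, lam) - 2.0 * lam * dot(G, j)


j = [0.3, -0.7]
for lam in (0.25, 0.5, 0.75):
    print(lam, cost(j), [decomposition(j, G, lam) for G in (F, F_SYM, F_ASYM)])
```

In this setting the identity itself holds for any splitting of $F$, whereas the non-negativity of the Fisher-information terms for `F_SYM` and `F_ASYM` uses their orthogonality.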
However, even the simple example of independent particles on a finite graph (see Example 2.1) yields a non-quadratic cost function $L$, and the aforementioned orthogonality arguments break down. In [KJZ18] (for independent jump processes) and [RZ21] (for chemical reactions) these ideas are ported to the non-quadratic setting by introducing a generalised notion of orthogonality, where the pairing is no longer bilinear, but rather satisfies a relation of the form (1.16). By contrast, the abstract theory that we develop is not necessarily based on such orthogonality relations, although we do borrow many notions such as time-reversed cost functions and forces from MFT. However we will show that within our framework, one can also construct a generalised orthogonality pairing $\theta_\rho$ (fully characterised by $L$) that satisfies (1.16), and coincides with the bilinear pairings $\langle\cdot,\cdot\rangle_\rho$ in case of quadratic cost functions and with $\theta_\rho(\cdot,\cdot)$ from [KJZ18,RZ21] in the case of specific non-quadratic cost functions. This will be the content of Subsection 2.4.

Summary of notation and outline of the article
In Section 2 we present the abstract framework and theory. In Section 3 we connect (and thereby motivate) the abstract ideas developed in Section 2 to large deviations. In Section 4 we analyse the zero-cost velocity for the antisymmetric L-function in the setting of independent particles on a finite graph. In Section 5 we apply the abstract theory to various stochastic particle systems, and we conclude with a discussion in Section 6.

Abstract theory
In the introduction we worked with the large-deviation cost; we now work with its abstraction, the so-called L-function. In what follows we first introduce the L-function and other key ingredients of the abstract framework in Section 2.1. Using these objects we introduce dissipation potentials, tilted L-functions and Fisher information in Section 2.2. Using time-reversal-type arguments from MFT, in Section 2.3 we introduce time-reversed L-functions, symmetric and antisymmetric forces, and in Section 2.4 we introduce a generalised notion of orthogonality satisfied by these forces. Section 2.5 contains various decompositions of the L-function and in Section 2.6 we study the symmetric and antisymmetric L-function. Throughout this section we will use the guiding example of Independent Markovian Particles on a Finite Graph (IPFG), which we now introduce.
Example 2.1 (IPFG). Let $X$ be a finite graph with strict ordering. Consider $n$ independent Markovian particles $X_1(t), \ldots, X_n(t)$ on $X$, with irreducible generator $Q \in \mathbb{R}^{X \times X}$. The particle density (also called empirical measure or mean field), defined as $\rho^{(n)}(t) := n^{-1}\sum_{i=1}^n \delta_{X_i(t)}$, is a Markov process on $\mathbb{R}^X$ whose generator replaces $\rho$ by $\rho - \tfrac1n 1_x + \tfrac1n 1_y$ at rate $n\rho_x Q_{xy}$, where $1_x$ is the indicator function for $x \in X$. With a suitable initial condition, Varadarajan's Theorem implies that the random process $\rho^{(n)}$ converges in the many-particle limit $n \to \infty$ to the deterministic solution of the ODE $\dot\rho(t) = Q^T \rho(t)$. (2.1) In addition to the empirical measure, we will also track the number of jumps through each edge, which characterises the flux over an edge. For reasons that will be clarified in Section 2.2, it is important to consider net fluxes (rather than the usual one-sided fluxes), defined on half of the edges (for this purpose we impose an arbitrary ordering $<$ on the finite set $X$): $X^2/2 := \{(x, y) \in X \times X : x < y\}$. (2.2) More precisely, the so-called integrated net flux $W^{(n)}_{xy}(t)$ over the edge connecting $x, y \in X$ is defined as the difference between the number of jumps from $x \to y$ and in the opposite direction from $y \to x$ in the time interval $[0, t]$, all rescaled by $1/n$. Then the pair $(\rho^{(n)}(t), W^{(n)}(t))$ is again a Markov process, now in $\mathbb{R}^X \times \mathbb{R}^{X^2/2}$, whose generator additionally records each jump in the corresponding net flux. This process converges as $n \to \infty$ to the solution of the macroscopic system $\dot w_{xy}(t) = \rho_x(t) Q_{xy} - \rho_y(t) Q_{yx}$, $\dot\rho(t) = -\operatorname{div} \dot w(t)$, (2.3) where the operator $\operatorname{div}_x j := \sum_{y \in X : y > x} j_{xy} - \sum_{y \in X : y < x} j_{yx}$ (2.4) is the discrete divergence for net fluxes. Indeed the system (2.3) is of the form (1.1). In the many-particle limit ($n \to \infty$), the random fluctuations around the mean behaviour decay fast due to averaging effects. The unlikeliness of observing an atypical flux for large but finite $n$ is quantified by the large-deviation principle, formally written as (2.5), where the cost $L$ is given by (2.6) [Ren18a,Kra21] (the flux $j$ is a placeholder for $\dot w$), which uses the Boltzmann function (2.7). Here $I_0$ is the large-deviation rate functional corresponding to the initial distribution of $\rho^{(n)}(0)$. Indeed $L(\rho, j)$ is non-negative and minimised by (2.3). Due to the contraction principle [DZ09, Thm. 4.2.1], the infimum is taken over all non-negative one-way fluxes $(j^+_{xy})_{x<y}$ and $(j^+_{yx} - j_{yx})_{x>y}$. Applying the contraction principle again, the empirical measure satisfies a large-deviation principle with rate functional $I_0(\rho(0)) + \int_0^T \hat L(\rho(t), \dot\rho(t))\,dt$, where $\hat L$ is related to $L$ via (1.8).
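The macroscopic system of Example 2.1 can be sketched in a few lines. The code below is a minimal illustration, assuming a hypothetical three-state generator `Q` and the net-flux convention $\operatorname{div}_x j = \sum_{y>x} j_{xy} - \sum_{y<x} j_{yx}$ with zero-cost flux $j^0_{xy}(\rho) = \rho_x Q_{xy} - \rho_y Q_{yx}$; all function names are illustrative. It lets one check that $-\operatorname{div} j^0(\rho) = Q^T\rho$ and that the discrete gradient $\nabla_{xy}\xi = \xi_y - \xi_x$ is the adjoint of $-\operatorname{div}$.

```python
# Hypothetical 3-state generator (rows sum to zero); any irreducible generator
# with positive off-diagonal rates could be used instead.
Q = [[-3.0, 2.0, 1.0],
     [1.0, -3.0, 2.0],
     [2.0, 1.0, -3.0]]
N = len(Q)
EDGES = [(x, y) for x in range(N) for y in range(N) if x < y]  # the set X^2/2


def zero_cost_flux(rho):
    """Net flux j0_xy = rho_x Q_xy - rho_y Q_yx on edges x < y (assumed form)."""
    return {(x, y): rho[x] * Q[x][y] - rho[y] * Q[y][x] for x, y in EDGES}


def divergence(j):
    """Discrete divergence for net fluxes: outflow to larger minus inflow from smaller."""
    return [sum(j[(x, y)] for y in range(x + 1, N))
            - sum(j[(y, x)] for y in range(x))
            for x in range(N)]


def gradient(xi):
    """Discrete gradient on edges x < y, the adjoint of minus the divergence."""
    return {(x, y): xi[y] - xi[x] for x, y in EDGES}


def master_equation_rhs(rho):
    """Right-hand side Q^T rho of the limiting ODE."""
    return [sum(Q[y][x] * rho[y] for y in range(N)) for x in range(N)]


def euler_path(rho, dt, steps):
    """Explicit Euler scheme for the continuity equation with zero-cost flux."""
    for _ in range(steps):
        div_j0 = divergence(zero_cost_flux(rho))
        rho = [r - dt * d for r, d in zip(rho, div_j0)]
    return rho


rho0 = [0.7, 0.2, 0.1]
rho_end = euler_path(rho0, 0.01, 1000)
print(rho_end, sum(rho_end))
```

For this particular generator the invariant measure is uniform, so the Euler path is expected to approach equal masses while conserving total mass.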

Abstract framework
Although at first sight the general setup in this section may seem heavy, it appears naturally in various specific systems. We illustrate this via our guiding example.
Between the two manifolds we define the map φ : W → Z as where div is the discrete divergence from (2.4), ∇ xy ξ := ξ y − ξ x and ρ 0 ∈ Z is an arbitrary but fixed reference measure. Hence the continuity equation can be abstractly written as u = dφ w j ∈ T φ[w] Z for j ∈ T w W. It will be important that the operator φ is surjective. For an arbitrary µ ∈ M 1 (X ), the difference µ − ρ 0 ∈ M 0 (X ). Note that the underlying dynamics (2.3) as well as any path with J (ρ, w) < ∞ conserves the total mass as well as the non-negativity of ρ(t), so that the states will in fact be restricted to the simplex P(X ) ⊂ M 1 (X ) ⊂ R X of probability measures on X (i.e. coordinate-wise non-negative vectors in R X which sum to one). However, we always work with the full manifold M 1 (X ) so that derivatives and the (co)tangent spaces are well defined without needing to worry about boundaries, boundary points etc. Instead we set L(ρ, j) = ∞ whenever ρ lies on (or outside of) the boundary ∂P(X ) and the flux j ∈ T ρ W pushes the state in the outward direction. Indeed, the functional J (ρ, w) and cost L(ρ, j) from Example 2.1 are defined for all ρ ∈ Z = R X , but for any path with J (ρ, w) < ∞, the densities are contained in P(X ).
For the above example $d\phi_w$, $d\phi_w^{\mathsf T}$ and the (co)tangent spaces $T_w\mathcal W$, $T^*_w\mathcal W$ do not depend on $w$. In practice, $d\phi_w$, $d\phi_w^{\mathsf T}$ and $T_w\mathcal W$, $T^*_w\mathcal W$ might depend on $w$, but only through the corresponding state $\rho=\phi[w]$, as for example in a continuity equation of the form $v=-\operatorname{div}(\rho j)$. By a slight abuse of notation we shall therefore write $d\phi_\rho$, $d\phi_\rho^{\mathsf T}$ and $T_\rho\mathcal W$, $T^*_\rho\mathcal W$ for $\rho\in\mathcal Z$. In particular, this allows us to regard $\mathcal L$ as a function of the pair $(\rho,j)$ with $j\in T_\rho\mathcal W$. Inspired by these observations we now introduce the state-flux triple, the L-function and the quasipotential, which are the key ingredients in the abstract framework.
The state-space Z and the flux-space W are differentiable Banach manifolds, with corresponding local tangent Banach spaces T ρ Z and T w W.
, so that by a slight abuse of notation we can replace T w W by T ρ W and write T W := {(ρ, j) : ρ ∈ Z, j ∈ T ρ W}.
(iv) φ has a linear bounded differential that depends on w only through ρ = φ[w], so that by a slight abuse of notation we write dφ ρ : The Banach structure should be seen as a reference norm only, that we use to define Gateaux derivatives, the Banach dual spaces T * ρ W, T * ρ Z and the duality pairings T * ρ Z ·, · TρZ , T * ρ W ·, · TρW (where we omit the indices since it will be clear to which spaces the elements belong). Analogously we write The differential dφ ρ corresponds to a continuity equation u = dφ ρ j, where dφ ρ is usually minus a divergence operator or some generalisation thereof. The assumption that dφ is bounded, ensures the existence of a well-defined adjoint. In order to avoid confusion with convex duality, we will denote adjoint operators by T, e.g. dφ T ρ : T * ρ Z → T * ρ W. Remark 2.4. Our state-flux triple is essentially identical to the framework of [ACE + 23]; there Z is called the 'base manifold', T W is called the 'total manifold', and the differential dφ : T W → T Z is called the 'anchor map'.
Definition 2.5. For any S ⊆ Z define (2.10) A mapping L : T S W → R ∪ {∞} is called an L-function on S, if for all ρ ∈ S: is convex and lower semicontinuous (with respect to the Banach norm on T ρ W).
While this definition allows for flexibility in the domain, throughout this paper we will reserve the symbol L for L-functions on the full space S = Z. From Section 2.2 onwards we will encounter functions L G that are only defined on proper subsets of Z (see Remark 2.8 below). The inclusion of ∞ in the codomain of L is essential to encode forbidden fluxes as discussed in Example 2.2.
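By lower semicontinuity and convexity, $\mathcal L(\rho,\cdot)$ is its own convex bidual with respect to the second variable [Pey15, Prop. 3.56], i.e. there exists an $\mathcal H$, defined for $\rho\in\mathcal Z$ and $\zeta\in T^*_\rho\mathcal W$ with values in $\mathbb R\cup\{\infty\}$, such that
\[ \mathcal H(\rho,\zeta) = \sup_{j\in T_\rho\mathcal W}\ \langle\zeta,j\rangle - \mathcal L(\rho,j), \qquad \mathcal L(\rho,j) = \sup_{\zeta\in T^*_\rho\mathcal W}\ \langle\zeta,j\rangle - \mathcal H(\rho,\zeta). \tag{2.11} \]
It is easy to see that $\mathcal L$ is an L-function if and only if for any $\rho\in\mathcal Z$, $\mathcal H(\rho,0)=0$ and $\mathcal H(\rho,\cdot)$ is convex, lower semicontinuous, proper and bounded from below by an affine function. Typically $\mathcal L(\rho,0)<\infty$, so that $\mathcal H(\rho,\cdot)$ is bounded from below.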
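The convex duality between an L-function and its dual can be illustrated in one dimension. The sketch below takes the hypothetical scalar function $\zeta\mapsto c(e^{\zeta}-1)+d(e^{-\zeta}-1)$ with $c,d>0$ as $\mathcal H$ (the constants are assumptions made for illustration) and computes its Legendre dual both in closed form and by a brute-force maximisation over a grid. It also confirms that the dual is non-negative and vanishes at the flux $j=c-d$, the derivative of $\mathcal H$ at $\zeta=0$.

```python
import math


def H(zeta, c, d):
    """Scalar Hamiltonian c (e^zeta - 1) + d (e^-zeta - 1); satisfies H(0) = 0."""
    return c * (math.exp(zeta) - 1.0) + d * (math.exp(-zeta) - 1.0)


def L(j, c, d):
    """Closed-form Legendre dual sup_zeta [zeta * j - H(zeta)] for c, d > 0."""
    # Stationarity c e^zeta - d e^-zeta = j is a quadratic equation in z = e^zeta.
    z = (j + math.sqrt(j * j + 4.0 * c * d)) / (2.0 * c)
    zeta_star = math.log(z)
    return zeta_star * j - H(zeta_star, c, d)


def L_grid(j, c, d, lo=-5.0, hi=5.0, steps=10001):
    """Brute-force approximation of the same supremum on a uniform grid."""
    h = (hi - lo) / (steps - 1)
    return max((lo + k * h) * j - H(lo + k * h, c, d) for k in range(steps))


c, d = 2.0, 1.0
zero_cost = c - d  # flux at which the dual vanishes
test_fluxes = [-3.0, -1.0, 0.0, 0.5, 2.0, 5.0]
```

For all test fluxes the grid value agrees with the closed form up to the grid resolution, which is the scalar instance of $\mathcal L(\rho,\cdot)$ being the convex dual of $\mathcal H(\rho,\cdot)$.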
We are now ready to introduce the following notion of the quasipotential.
We stress that this notion of a quasipotential is only related to the convex dual $\mathcal H$ of some abstract function $\mathcal L$, where a priori no stochastic particle system is involved. Both nowhere differentiable functions and the zero function are quasipotentials by definition, and our results are true but mostly trivial in this setting. In all the examples we consider, (2.12) will have at least one non-trivial solution, and in fact this definition is consistent with the usual definition from statistical physics when large deviations are involved (see Section 3.2). We envisage that (2.12) should be understood in the sense of viscosity solutions; however, it is not clear how one can define a viscosity solution in the general setup of this section.
Example (IPFG). 2.7. In Example 2.1, the processes X 1 (t), X 2 (t), . . . are irreducible and X is finite which ensures the existence of an invariant measure π ∈ P + (X ) (the space of strictly positive probability measures). Consequently, the n-particle density ρ (n) (t) admits an invariant measure Π (n) ∈ P(R X ), where By Sanov's theorem, the large-deviation rate functional corresponding to Π (n) is where s(· | ·) is defined in (2.7), and hence V is indeed the quasipotential corresponding to L in the classical large-deviation sense (see Theorem 3.7). This can also be checked macroscopically by verifying (2.12), without invoking any connection to large deviations of a microscopic particle system. To check this, we first calculate the convex dual of the L-function (2.6): Note that while V(·) would be nowhere differentiable as a functional on R X , it is differentiable at all ρ ∈ P + (X ) (which is a subset of the manifold M 1 (X ) introduced in Example 2.2) since π x > 0 for every In fact by the chain rule, ∇dV(ρ) can also be interpreted as the (classical) derivative of V(φ[w]) with respect to w ∈ R X 2 /2 ; this also explains why the constants c do not play a role after taking the discrete gradient. We then check that V is a quasipotential by concluding that at all points of differentiability of V (i.e. for ρ ∈ P + (X )) using Q T π = 0 and y Q xy = 0 we find where the third and fourth equality follows by interchanging the indices in the second terms of the summation.
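The macroscopic verification that the relative entropy is a quasipotential can also be carried out numerically. The sketch below relies on two assumptions that are not spelled out in the surrounding text: that the convex dual of (2.6) is $\mathcal H(\rho,\zeta)=\sum_{x<y}\big[\rho_xQ_{xy}(e^{\zeta_{xy}}-1)+\rho_yQ_{yx}(e^{-\zeta_{xy}}-1)\big]$, and that $d\mathcal V(\rho)_x=\log(\rho_x/\pi_x)$ up to an additive constant. The generator is a hypothetical three-state example with uniform invariant measure that does not satisfy detailed balance.

```python
import math
from itertools import combinations

# Hypothetical generator: rows and columns sum to zero, so pi is uniform.
Q = [[-3.0, 2.0, 1.0],
     [1.0, -3.0, 2.0],
     [2.0, 1.0, -3.0]]
PI = [1.0 / 3.0] * 3
STATES = range(3)
EDGES = list(combinations(STATES, 2))


def H(rho, zeta):
    """Assumed Hamiltonian for net fluxes (convex dual of the cost (2.6))."""
    return sum(rho[x] * Q[x][y] * (math.exp(zeta[(x, y)]) - 1.0)
               + rho[y] * Q[y][x] * (math.exp(-zeta[(x, y)]) - 1.0)
               for x, y in EDGES)


def grad_dV(rho):
    """Discrete gradient of dV(rho)_x = log(rho_x / pi_x); constants cancel."""
    return {(x, y): math.log(rho[y] / PI[y]) - math.log(rho[x] / PI[x])
            for x, y in EDGES}


# Invariance of pi: every component of Q^T pi should vanish.
invariance_defect = max(abs(sum(Q[y][x] * PI[y] for y in STATES)) for x in STATES)

# Residual of the stationary Hamilton-Jacobi equation at a few interior points.
test_points = ([0.5, 0.3, 0.2], [0.1, 0.1, 0.8], [0.34, 0.33, 0.33])
hj_residuals = [H(rho, grad_dV(rho)) for rho in test_points]
```

The residuals vanish up to rounding at every strictly positive test point, even though this chain is not in detailed balance.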
Remark 2.8. Most of the analysis that follows will be carried out locally for fixed $\rho$. Therefore the $\rho$-dependencies in $\mathcal L(\rho,j)$ and $d\phi_\rho$ do not play a role in the calculations. We however include the dependency for two reasons. First, for almost all practical applications, $\mathcal L$ and $d\phi_\rho$ will depend on $\rho$, either explicitly or implicitly through the domains of definition $T_\rho\mathcal W$, $T_\rho\mathcal Z$. Second, even though writing the $\rho$-dependency is standard in the literature, so far practically all literature on the topic completely ignores the problems at the boundaries, where $\mathcal V$ may cease to be differentiable due to the appearance of $\log 0$. Our paper is one of the first to make completely precise claims regarding the domains of definition of the various objects involved, by very carefully identifying all points $\rho$ for which our results hold; this also motivates the definition of L-functions on subsets $S$.

Dissipation potentials, tilted L-functions and Fisher information
While the concept of a dissipation potential is standard [CV90, LS95, Mie11], the connection to convex analysis [MPR14] and the application to flux spaces is more recent. Classically, a dissipation potential $\Phi(\rho,j)$ is convex, lower semicontinuous in the second variable, and satisfies $\inf\Phi(\rho,\cdot)=0=\Phi(\rho,0)$. To define the dissipation potential in our context, we first present the following basic result on $\mathcal L$, which was originally derived in the context of gradient flows where the driving force is the derivative of a certain free energy. As in the literature [Sch76, MN08, Mae17, KJZ18, Ren18a, RZ21], the setting with fluxes allows for more general driving forces. We first focus on a driving force $\tilde\zeta\in T^*_\rho\mathcal W$ for a fixed $\rho$, and later introduce it as a $\rho$-dependent force field $F(\rho)$.
Definition 2.10. Let L be an L-function on Z. Define and recall the definition of the restricted (co)tangent spaces (2.10). The driving force F and dissipation potentials (corresponding to L) are defined as Note that, Φ * as defined in (2.16) indeed satisfies inf Φ * (ρ, ·) = 0 = Φ * (ρ, 0), since −F is a minimiser of H(ρ, ·) by (2.15), and consequently inf Φ(ρ, ·) = 0 = Φ(ρ, 0) which makes Φ a dissipation potential. Furthermore combining Theorem 2.9 with Definition 2.10, for any (ρ, j) ∈ T Dom(F ) W we have the decomposition In what follows we will make use of (2.18) The following lemma states that the dissipation potential is indeed symmetric in Dom symdiss (F ).
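Example 2.12 (IPFG). In practice the force (2.15) is more easily calculated via the equivalent statement $d\mathcal H(\rho,-F(\rho))=0$. Since $\xi=\tfrac12\log\frac{d}{c}$ minimises $\xi\mapsto c(e^{\xi}-1)+d(e^{-\xi}-1)$, we find
\[ F_{xy}(\rho) = \tfrac12\log\frac{\rho_xQ_{xy}}{\rho_yQ_{yx}}, \qquad (x,y)\in\mathcal X^2/2. \]
This definition of the driving force has been introduced in [KJZ18, Sec. 2.2]. Using (2.16), the dissipation potentials are given by
\[ \Phi^*(\rho,\zeta) = 2\sum_{(x,y)\in\mathcal X^2/2}\sqrt{\rho_xQ_{xy}\rho_yQ_{yx}}\,\big(\cosh(\zeta_{xy})-1\big), \qquad \Phi(\rho,j) = 2\sum_{(x,y)\in\mathcal X^2/2}\sqrt{\rho_xQ_{xy}\rho_yQ_{yx}}\,\Big(\cosh^*\Big(\frac{j_{xy}}{2\sqrt{\rho_xQ_{xy}\rho_yQ_{yx}}}\Big)+1\Big). \tag{2.19} \]
These dissipation potentials are indeed symmetric (since $\cosh$ is even), and therefore $\mathrm{Dom}_{\mathrm{symdiss}}(F)=\mathrm{Dom}(F)$. Note that, while a priori $\Phi$ and $\Phi^*$ are only defined for strictly positive probability measures, they can easily be extended to the full space $\mathcal Z=\mathcal P(\mathcal X)$. For instance, the observation that $\lim_{a\to0}a\cosh^*(\tfrac{x}{a})=0$ if $x=0$ and $+\infty$ otherwise offers a trivial extension of $\Phi$ to $\mathcal Z$, which also reflects the idea "vanishing jump rates guarantee vanishing fluxes".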
We note that the Hamiltonian corresponding to one-way fluxes is given by $\mathcal H^{\text{one-way}}(\rho,\zeta)=\sum_{x\neq y}\rho_xQ_{xy}(e^{\zeta_{xy}}-1)$. Each term is strictly increasing in $\zeta_{xy}$, so this Hamiltonian has no minimiser and the corresponding driving force does not exist at all, i.e. $\mathrm{Dom}(F^{\text{one-way}})=\emptyset$ (also see [Ren18a, Rem. 4.10]). Hence one can only construct a meaningful macroscopic fluctuation theory for net fluxes. This further justifies the net-flux approach used in this paper, as opposed to the one-way fluxes typically used for Markov jump processes.
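The cosh structure of the IPFG dissipation potential can be confirmed numerically from the definition of $\Phi^*$ as a shifted Hamiltonian. The sketch below assumes the net-flux Hamiltonian $\mathcal H(\rho,\zeta)=\sum_{x<y}\big[\rho_xQ_{xy}(e^{\zeta_{xy}}-1)+\rho_yQ_{yx}(e^{-\zeta_{xy}}-1)\big]$, the force $F_{xy}(\rho)=\tfrac12\log\frac{\rho_xQ_{xy}}{\rho_yQ_{yx}}$ and the relation $\Phi^*(\rho,\zeta)=\mathcal H(\rho,\zeta-F(\rho))-\mathcal H(\rho,-F(\rho))$. The generator and the test data are hypothetical.

```python
import math
from itertools import combinations

Q = [[-3.0, 2.0, 1.0],
     [1.0, -3.0, 2.0],
     [2.0, 1.0, -3.0]]  # hypothetical generator
EDGES = list(combinations(range(3), 2))


def H(rho, zeta):
    """Assumed Hamiltonian for net fluxes."""
    return sum(rho[x] * Q[x][y] * (math.exp(zeta[(x, y)]) - 1.0)
               + rho[y] * Q[y][x] * (math.exp(-zeta[(x, y)]) - 1.0)
               for x, y in EDGES)


def force(rho):
    """Driving force F_xy = (1/2) log(rho_x Q_xy / (rho_y Q_yx))."""
    return {(x, y): 0.5 * math.log(rho[x] * Q[x][y] / (rho[y] * Q[y][x]))
            for x, y in EDGES}


def phi_star_via_H(rho, zeta):
    """Dual dissipation potential as H(rho, zeta - F) - H(rho, -F)."""
    F = force(rho)
    shifted = {e: zeta[e] - F[e] for e in EDGES}
    minus_F = {e: -F[e] for e in EDGES}
    return H(rho, shifted) - H(rho, minus_F)


def phi_star_cosh(rho, zeta):
    """Closed cosh form 2 sum sqrt(rho_x Q_xy rho_y Q_yx) (cosh(zeta_xy) - 1)."""
    return sum(2.0 * math.sqrt(rho[x] * Q[x][y] * rho[y] * Q[y][x])
               * (math.cosh(zeta[(x, y)]) - 1.0) for x, y in EDGES)


rho = [0.5, 0.3, 0.2]
zeta = {(0, 1): 0.7, (0, 2): -1.1, (1, 2): 0.2}
minus_zeta = {e: -v for e, v in zeta.items()}
zero = {e: 0.0 for e in EDGES}
```

Both expressions agree, vanish at $\zeta=0$ and are even in $\zeta$, which is the symmetry used to conclude $\mathrm{Dom}_{\mathrm{symdiss}}(F)=\mathrm{Dom}(F)$.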
Remark 2.13. In the IPFG example above and all the examples considered in Section 5, Dom symdiss (F ) = Dom(F ), i.e. the dissipation potential is symmetric. However, in general Dom symdiss (F ) may be an (empty) subset of Dom(F ) as the following construction shows. Consider Z = W = R and φ = id. Let H(ρ, ζ) = −ζ +e ζ −1, which corresponds to a real-valued Markov process with generator ( So far we have dealt with L-functions on Z. Using (2.14), we now introduce L-functions defined on subsets of Z. For a given L and an appropriate cotangent field G(ρ), using (2.14) we can define a (G-tilted) L-function L G defined on a subset of Z. We call this a 'tilted' L-function since its definition is motivated by tilted Markov processes (see Section 3.1). Although, technically G is a cotangent field, in this paper we will often refer to it as a force field due to physical considerations.
Definition 2.14. Let L be an L-function on Z. For any G : Lemma 2.15. Let L be an L-function on Z. The tilted function L G is an L-function on Dom(F ) ∩ Dom(G), and satisfies the decomposition The two equalities follow by using convex duality and (2.13), (2.14) withζ = F . For special choices of G(ρ) we obtain L F (ρ, j) = L(ρ, j) and L 0 (ρ, j) = Φ(ρ, j). (2.22) Example (IPFG). 2.16. For any force field G(ρ) ∈ R X 2 /2 we have We now define the notion of generalised Fisher information which was introduced in Section 1.1.
As discussed in Section 1.1, it is important to choose λ and ζ such that R λ ζ is non-negative, as this guarantees that the corresponding powers are non-negative along the zero-cost flux. The following result explores the set of force fields for which this is true (also see Figure 2). Proposition 2.18. Let L be an L-function on Z. For any ρ ∈ Z we have where j 0 is the zero-cost flux for L (see Definition 2.5).
The claim then follows from the definition of the Gateaux derivative.
Note that [HPST20, Thm. 1.7] is a special case of this result for the IPFG example. Following [HPST20], we call R λ the generalised Fisher information since it generalises the classical notion of Fisher information as the dissipation rate of free energy along the solutions of the zero-cost flux of the L-function. This property follows by using (2.25) with appropriate choices for ζ. In the next section we construct ζ for which R 1 2 ζ (ρ) = 0 and the above result can be applied.

Reversed L-function, symmetric and antisymmetric forces
Inspired by the notion of time-reversibility in MFT we now introduce the reversed L-function which will then be used to define symmetric and antisymmetric forces. From now on we assume that V is a quasipotential associated to L in the sense of Definition 2.6.
Definition 2.19. Let $\mathcal L$ be an L-function on $\mathcal Z$. For any $\rho\in\mathcal Z$ where $\mathcal V$ is Gateaux differentiable and any $j\in T_\rho\mathcal W$, we define the reversed L-function as
\[ \overleftarrow{\mathcal L}(\rho,j) := \mathcal L(\rho,-j) + \langle d\phi_\rho^{\mathsf T}\,d\mathcal V(\rho),\, j\rangle. \]
This notion of the reversed L-function is motivated by the large deviations of time-reversed Markov processes (see Section 3.3 for details). Note that we use the name reversed L-function as opposed to time-reversed L-function since there is no time variable in this abstract framework.
The following result states that ← − L is indeed an L-function, and discusses the driving force and dissipation potential associated to it.
(iii) Additionally, if ρ ∈ Dom(F ) (recall Definition 2.10), then the driving force and dissipation potentials corresponding to ← − L are given by Proof. (i) Follows by a straightforward calculation of the convex dual.
The following result summarises these ideas.
Then for any ρ ∈ Dom(F asym ), (2.28) Note that while we make use of the reversed L-function to construct the symmetric and antisymmetric force, it does not explicitly appear in their definition. In the case of zero antisymmetric force, i.e. F asym (ρ) = 0, the driving forces satisfy F (ρ) = ← − F (ρ) = F sym (ρ), which is the setting of dissipative systems (see Section 2.6).
The expression $\frac{\pi_x}{\pi_y}Q_{xy}$ is the generator matrix for a single time-reversed jump process [Nor98, Thm. 3.7.1]. Again, beware that a priori $\overleftarrow{\mathcal H}$ and $\overleftarrow{\mathcal L}$ are only defined on $\mathcal Z=\mathrm{Dom}(F)$, but can be continuously extended to $\mathcal P(\mathcal X)$ in a straightforward manner.
The symmetric and antisymmetric (with respect to the reversal) components of the driving force are (also see [KJZ18]) with Dom(F ) = Dom(F sym ) = Dom(F asym ) = P + (X ). Note that for reversible Markov chains, i.e. those satisfying detailed balance, F asym = 0.
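The splitting of the driving force can be made concrete for small chains. The sketch below assumes the explicit forms $F^{\mathrm{sym}}_{xy}(\rho)=\tfrac12\log\frac{\rho_x\pi_y}{\rho_y\pi_x}$ and $F^{\mathrm{asym}}_{xy}=\tfrac12\log\frac{\pi_xQ_{xy}}{\pi_yQ_{yx}}$, which are what one obtains from $F_{xy}(\rho)=\tfrac12\log\frac{\rho_xQ_{xy}}{\rho_yQ_{yx}}$ and the relative-entropy quasipotential. Both generators are hypothetical examples with uniform invariant measure: one is not in detailed balance, the other has symmetric rates and hence is.

```python
import math
from itertools import combinations

EDGES = list(combinations(range(3), 2))
PI = [1.0 / 3.0] * 3  # uniform invariant measure for both generators

Q_CYCLE = [[-3.0, 2.0, 1.0],
           [1.0, -3.0, 2.0],
           [2.0, 1.0, -3.0]]   # hypothetical, not in detailed balance
Q_REV = [[-3.0, 2.0, 1.0],
         [2.0, -3.0, 1.0],
         [1.0, 1.0, -2.0]]     # hypothetical, symmetric rates: detailed balance


def forces(Q, pi, rho):
    """Return (F, F_sym, F_asym) as dictionaries indexed by edges x < y."""
    F, F_sym, F_asym = {}, {}, {}
    for x, y in EDGES:
        F[(x, y)] = 0.5 * math.log(rho[x] * Q[x][y] / (rho[y] * Q[y][x]))
        F_sym[(x, y)] = 0.5 * math.log(rho[x] * pi[y] / (rho[y] * pi[x]))
        F_asym[(x, y)] = 0.5 * math.log(pi[x] * Q[x][y] / (pi[y] * Q[y][x]))
    return F, F_sym, F_asym


rho = [0.5, 0.3, 0.2]
F, F_sym, F_asym = forces(Q_CYCLE, PI, rho)
_, _, F_asym_rev = forces(Q_REV, PI, rho)
```

In this form the antisymmetric force does not depend on $\rho$, and it vanishes identically for the chain in detailed balance.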
Recall the generalised Fisher information $\mathcal R^{\lambda}_{\zeta}$ from Definition 2.17, and that we are looking for force fields that make this quantity non-negative. The following result shows that $\mathcal R^{1/2}_{\zeta}(\rho)=0$ for $\zeta=2F(\rho),\,2F^{\mathrm{sym}}(\rho),\,2F^{\mathrm{asym}}(\rho)$. This will be crucial to derive the key decompositions of $\mathcal L$ in Section 2.5.
In this result we make use of (analogous to (2.18)), Lemma 2.23. Let L be an L-function on Z. We have Figure 2 is a schematic diagram of force fields ζ for which R λ ζ is non-negative. Note that, while there are various possibilities for such ζ, we focus on ζ = 2F (ρ), 2F sym (ρ), 2F asym (ρ) since they correspond to the physically relevant powers defined in (1.4) and (1.5).
Remark 2.24. For all ρ ∈ Dom(F asym ), we can write the reversed function as a tilting in the sense of (2.20) Using (2.21), the corresponding reversed L-function then satisfies

Generalised orthogonality
Before we continue with deriving the main decompositions (1.6) of the L-function, we elaborate further on the decomposition of the driving force F into the symmetric force F sym and antisymmetric force F asym , and investigate the natural question whether these forces are orthogonal in some sense. It turns out that they are indeed orthogonal in a generalised sense, and using this notion of orthogonality we can already derive decompositions (1.6) for λ = 1 2 . As discussed in the introduction, in MFT the dissipation potentials are often squares of appropriate Hilbert norms · ρ , and in that setting one can write where ·, · ρ is the inner product induced by the norm. Typically F sym and F asym are orthogonal in the sense that F sym , F asym ρ = 0. We reiterate these ideas in Section 5.3 which deals with the classical MFT setting of lattice gases. However this orthogonality relation is specific to the quadratic setting. A generalised notion of orthogonality was introduced in [KJZ18] for non-quadratic dissipation potential (2.19) corresponding to independent Markov chains which have cosh-type structure (see Example 2.12) and this principle was further generalised to chemical reaction networks in [RZ21] (see Section 5.2 for details). Based on these results, we now provide a notion of generalised orthogonality which applies to arbitrary dissipation potentials arising within the abstract framework of this section (and does not require any specific structure).
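For the cosh-type potential of the IPFG example, the analogue of the vanishing cross term $\langle F^{\mathrm{sym}},F^{\mathrm{asym}}\rangle_\rho=0$ can be checked numerically. In the quadratic case, $\tfrac12\big[\tfrac12\|a+b\|^2-\tfrac12\|a-b\|^2\big]=\langle a,b\rangle$, so the sketch below uses the half-difference $\tfrac12\big[\Phi^*(\rho,a+b)-\Phi^*(\rho,a-b)\big]$ as a stand-in for the pairing. This helper, named `cross_term`, is a hypothetical construction for illustration and is not claimed to coincide with the definition used in the text. The generator, the explicit forces and the cosh form of $\Phi^*$ are assumptions as well.

```python
import math
from itertools import combinations

Q = [[-3.0, 2.0, 1.0],
     [1.0, -3.0, 2.0],
     [2.0, 1.0, -3.0]]  # hypothetical generator with uniform invariant measure
PI = [1.0 / 3.0] * 3
EDGES = list(combinations(range(3), 2))


def phi_star(rho, zeta):
    """Assumed cosh-type dual dissipation potential."""
    return sum(2.0 * math.sqrt(rho[x] * Q[x][y] * rho[y] * Q[y][x])
               * (math.cosh(zeta[(x, y)]) - 1.0) for x, y in EDGES)


def f_sym(rho):
    return {(x, y): 0.5 * math.log(rho[x] * PI[y] / (rho[y] * PI[x]))
            for x, y in EDGES}


def f_asym():
    return {(x, y): 0.5 * math.log(PI[x] * Q[x][y] / (PI[y] * Q[y][x]))
            for x, y in EDGES}


def cross_term(rho, a, b):
    """Hypothetical pairing: half the difference of phi_star at a + b and a - b."""
    plus = {e: a[e] + b[e] for e in EDGES}
    minus = {e: a[e] - b[e] for e in EDGES}
    return 0.5 * (phi_star(rho, plus) - phi_star(rho, minus))


test_points = ([0.5, 0.3, 0.2], [0.1, 0.1, 0.8], [0.2, 0.7, 0.1])
cross_terms = [cross_term(rho, f_sym(rho), f_asym()) for rho in test_points]
self_pairing = cross_term(test_points[0], f_sym(test_points[0]), f_sym(test_points[0]))
```

The cross term between the symmetric and antisymmetric forces vanishes at every test point, whereas the pairing of $F^{\mathrm{sym}}$ with itself is strictly positive. The cancellation relies on the invariance $Q^{\mathsf T}\pi=0$.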
The following result collects the properties of Φ ζ 2 and θ ρ clarifying the notion of orthogonality in the abstract framework. Recall the definition of Dom symdiss (F asym ) from (2.30).
From the general decomposition (2.17) and the generalised orthogonality result above, we can already provide two distinct decompositions of L, as derived in [RZ21,Cor. 4.3] for the case of chemical reactions.
Corollary 2.27. Let L be an L-function on Z. Then for all (ρ, j) ∈ T Dom(F asym ) W, and for all (ρ, j) ∈ T Domsymdiss(F asym ) W, In both decompositions, we may interpret the first three terms as an L-function with a modified force, the fourth term as a Fisher information, and the last term as a power (see Remark 2.32 for details).

Decomposing the L-function
We now present decompositions of the L-function, which are the main results of the abstract theory presented so far. Using G = F, F sym , F asym in (2.21) and encoding convex combinations via the parameter λ, we arrive at three distinct decompositions of L; this corresponds to all the points on the three lines depicted in Figure 2.
Theorem 2.29. Let L be an L-function on Z. It admits the following decompositions (i) For any ρ ∈ Dom symdiss (F ), j ∈ T ρ W and λ ∈ [0, 1], Proof. The decompositions follow directly from Lemma 2.15. The non-negativity of the Fisher informations follows from Proposition 2.18 and Lemma 2.23.
The second equality in (2.35) follows from (2.22) and (2.16), where we use $\mathcal H(\rho,0)=0$, and the Fisher-information term vanishes by Lemma 2.23. A careful analysis of the zero-cost flux for $\mathcal L_{F^{\mathrm{sym}}}$ and $\mathcal L_{F^{\mathrm{asym}}}$ will be presented in Subsection 2.6 and Section 4.
Remark 2.32. Using (2.17), we see that (2.36) and (2.37) are the same decompositions as those in Corollary 2.27 which use generalised orthogonality, and that the two corresponding Fisher informations are in fact modified dissipation potentials (as introduced in Section 2.4) This also explains the non-negativity of these Fisher informations for λ = 1 2 .
Example (IPFG). 2.33. Decompositions (2.32), (2.33) and (2.34) hold with the tilted L-functions and the corresponding Fisher informations While non-negativity of these Fisher informations is guaranteed by construction, it can also be proven directly by using (1 − λ)a + λb ≥ a 1−λ b λ . For λ = 1 2 , all three Fisher informations are of the form x =y ( √ · − √ ·) 2 ; interpreting the difference as an abstract discrete gradient, this is reminiscent of the usual Fisher information in continuous space 1 2 (∇ ρ(x)) 2 dx. These decompositions provide new variational characterisations for the IPFG example, which coincide with the classical gradient-flow structure for Markov chains satisfying detailed balance (see Section 2.6) and lead to the FIR inequality as a special case (see Example 2.35 below). The decomposition (2.33) with λ = 1 2 was first discussed in [KJZ18,Cor. 4]. All three L-functions L (1−2λ)F , L F −2λF sym and L F −2λF asym are the large-deviation cost functions for processes with altered jump rates. In particular, L F sym = L F −F asym is the large-deviation cost function corresponding to the jump process with jump rates for a particle to jump from x to y given by where we write ← − v xy := v yx πy πx for the jump rate of a single time-reversed jump process [Nor98, Thm. 3.7.1]. The linearity in ρ x reflects that the system consists of independent Markov particles with generator Q xy ← − Q xy [Ren18a,Kra21]. Similarly, L F asym = L F −F sym is the large-deviation cost function corresponding to a system with jump rates for one particle to jump from x to y given by [PR19] κ asym xy (ρ) := Q xy ρ x ρ y πx πy We can interpret L F asym (ρ, j) as the flux large-deviation cost function corresponding to a system of interacting particles with jump rates nκ asym xy (ρ) [AAPR22]. 
It should be noted that the usual large-deviation proof techniques break down in this particular case due to the non-uniqueness of solutions to the limiting antisymmetric ODE (see Proposition 4.2).
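The direct proof of non-negativity via $(1-\lambda)a+\lambda b\geq a^{1-\lambda}b^{\lambda}$ can be illustrated numerically. The sketch below evaluates one expression of this type, $\sum_{x<y}\big[a_{xy}+b_{xy}-a_{xy}^{1-\lambda}b_{xy}^{\lambda}-a_{xy}^{\lambda}b_{xy}^{1-\lambda}\big]$ with $a_{xy}=\rho_xQ_{xy}$ and $b_{xy}=\rho_yQ_{yx}$. Identifying this expression with the Fisher information associated to the force $F$ is an assumption, made on the basis of the cosh-type Hamiltonian and the hints in the text. The generator is hypothetical.

```python
import math
from itertools import combinations

Q = [[-3.0, 2.0, 1.0],
     [1.0, -3.0, 2.0],
     [2.0, 1.0, -3.0]]  # hypothetical generator
EDGES = list(combinations(range(3), 2))


def fisher(rho, lam):
    """Assumed Fisher-information-type expression for the force F."""
    total = 0.0
    for x, y in EDGES:
        a = rho[x] * Q[x][y]   # one-way rate x -> y
        b = rho[y] * Q[y][x]   # one-way rate y -> x
        total += a + b - a ** (1.0 - lam) * b ** lam - a ** lam * b ** (1.0 - lam)
    return total


def sqrt_form(rho):
    """Sum of squared differences of square roots, expected at lam = 1/2."""
    return sum((math.sqrt(rho[x] * Q[x][y]) - math.sqrt(rho[y] * Q[y][x])) ** 2
               for x, y in EDGES)


rho = [0.5, 0.3, 0.2]
lambdas = [k / 10.0 for k in range(11)]
values = [fisher(rho, lam) for lam in lambdas]
```

The values are non-negative for all $\lambda\in[0,1]$, vanish at the endpoints, and reduce to the $(\sqrt{\cdot}-\sqrt{\cdot})^2$ form at $\lambda=\tfrac12$.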
The next corollary connects the decomposition (2.33) to an (abstract) FIR inequality (recall Section 1.2.3), which is defined only on the state space $\mathcal Z$ and has no dependence on the flux space $\mathcal W$. In order to make this connection we introduce the contracted L-function
\[ \hat{\mathcal L}(\rho,u) := \inf\big\{\mathcal L(\rho,j) : j\in T_\rho\mathcal W,\ u=d\phi_\rho j\big\}. \]
The definition of $\hat{\mathcal L}$ is inspired by the contraction principle in large-deviation theory, where $\hat{\mathcal L}$ is the large-deviation rate functional only on the state space (recall Example 2.1). This connection will be further clarified in Proposition 3.4.

Symmetric and antisymmetric L-functions
In this section we focus on the two terms L F sym and L F asym in the decompositions (2.37) and (2.36) respectively. Observe that L = L F sym if F asym = 0, and therefore L F sym corresponds to a system with a purely symmetric force. The relation between such systems with gradient flows is well known and follows from the theory in the previous sections, but for completeness we will make this connection explicit here. Similarly, L F asym corresponds to a system with a purely antisymmetric force; in the level of abstraction of our current paper such systems are less understood. Motivated by our analysis in Section 4 and the examples in Section 5 we conjecture below that these L-functions are related to Hamiltonian systems. We first discuss the purely symmetric case. Note that when particle systems and large-deviations are involved, L F sym is the large-deviation cost function of a microscopic system in detailed balance (see Corollary 3.11). In what follows we will make use of the contracted dissipation potentialΨ : Corollary 2.36 (EDI). Let L be an L-function on Z and ρ ∈ Dom(F asym ). For any j ∈ T ρ W we have
Note that the decomposition (2.43) also follows from (2.37) by using (2.13), but only for $\rho\in\mathrm{Dom}_{\mathrm{symdiss}}(F^{\mathrm{asym}})$. Let us first comment on the contracted symmetric function $\hat{\mathcal L}_{F^{\mathrm{sym}}}$. Clearly, its zero-cost velocity $u^0(\rho)$ satisfies the EDI
\[ \hat\Psi\big(\rho,u^0(\rho)\big) + \hat\Psi^*\big(\rho,-\tfrac12 d\mathcal V(\rho)\big) + \big\langle \tfrac12 d\mathcal V(\rho),\,u^0(\rho)\big\rangle = 0, \]
which is equivalent by convex duality to a generalised gradient flow (1.11). Summarising Corollaries 3.11 and 2.36, if a microscopic system is in detailed balance, the large-deviation cost function $\mathcal L=\mathcal L_{F^{\mathrm{sym}}}$ has a purely symmetric force, and hence induces a generalised gradient flow. This connection between gradient flows and detailed balance was first discussed in this generality in [MPR14]. For the IPFG example, the second symmetry relation in (2.45) corresponds to the classical gradient structure for finite-state Markov chains in detailed balance [MPR14, Sec. 4.1] and the decomposition (2.43) is the corresponding flux formulation of the gradient structure for this example [Ren18a, Sec. 4.5]. Note that, strictly speaking, (2.43) is not a gradient flow in the density-flux space. However, a careful rewriting allows us to see $\mathcal L_{F^{\mathrm{sym}}}$ as a gradient flow, as summarised in the following remark.
Remark 2.37. With L W F sym (w, j) := L F sym (φ[w], j), and applying the chain rule (2.46) In this formulation L F sym is indeed a gradient flow in the density-flux space [Ren18b].
As far as we are aware, the purely antisymmetric cost L F asym has not been studied in the literature, and we could not produce rigorous results for it in the abstract setting of this section. However, as will be discussed in forthcoming sections, we are able to show that for certain examples the zero-cost velocity associated to L F asym is non-dissipative, in the sense that one can associate a non-trivial conserved energy and a skew-symmetric operator to it, which motivates the following conjecture.
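Conjecture 2.38. Let $\mathcal L$ be an L-function on $\mathcal Z$ and let $\hat{\mathcal L}_{F^{\mathrm{asym}}}$ be the contracted L-function corresponding to $\mathcal L_{F^{\mathrm{asym}}}$, i.e.
\[ \hat{\mathcal L}_{F^{\mathrm{asym}}}(\rho,u) := \inf\big\{\mathcal L_{F^{\mathrm{asym}}}(\rho,j) : j\in T_\rho\mathcal W,\ u=d\phi_\rho j\big\}. \]
Then there exists an energy $\mathcal E:\mathcal Z\to\mathbb R$ and a skew-symmetric operator $\mathbb J:\rho\mapsto(T^*_\rho\mathcal Z\to T_\rho\mathcal Z)$ such that the zero-cost velocity of $\hat{\mathcal L}_{F^{\mathrm{asym}}}$ can be written as $u^0(\rho)=\mathbb J(\rho)D\mathcal E(\rho)$.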
Clearly, the skew-symmetry of $\mathbb J(\rho)$ implies that the energy $\mathcal E(\rho(t))$ will be conserved along solutions of $\dot\rho(t)=\mathbb J(\rho(t))D\mathcal E(\rho(t))$. In fact, for the IPFG and lattice gas examples, the corresponding $\mathbb J$ even satisfies the Jacobi identity, so that the purely antisymmetric velocity has a Hamiltonian structure (see Sections 4, 5.3 for details).
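For completeness, the conservation of energy can be spelled out in one line. The computation below is a formal sketch: it assumes that $\mathcal E$ is differentiable along the solution and that skew-symmetry is understood as $\langle\xi,\mathbb J(\rho)\eta\rangle=-\langle\eta,\mathbb J(\rho)\xi\rangle$ for all $\xi,\eta\in T^*_\rho\mathcal Z$.

```latex
\begin{align*}
\frac{d}{dt}\,\mathcal E(\rho(t))
  &= \big\langle D\mathcal E(\rho(t)),\ \dot\rho(t)\big\rangle
   = \big\langle D\mathcal E(\rho(t)),\ \mathbb J(\rho(t))\,D\mathcal E(\rho(t))\big\rangle \\
  &= -\big\langle D\mathcal E(\rho(t)),\ \mathbb J(\rho(t))\,D\mathcal E(\rho(t))\big\rangle ,
\end{align*}
```

so the pairing equals its own negative and hence $\frac{d}{dt}\mathcal E(\rho(t))=0$.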

Formal connection with large deviations
In Section 2 we focussed on the purely macroscopic setting. In this section we motivate the abstract structures introduced therein by connecting them to Markov processes and their large deviations. Although the results presented in this section are largely known in the literature in specific settings, we include them here in a more general setting to provide rationale for the abstract framework discussed in the last section. While these results are formal due to the level of generality at which we work, they can be made rigorous case by case.
Throughout this section we assume a microscopic dynamics described by a sequence of Markov processes $(\rho^{(n)}(t),W^{(n)}(t))$ defined on $\mathcal Z\times\mathcal W$. Typically, $\rho^{(n)}(t)$ is the empirical measure, concentration or density corresponding to $O(n)$ particles, and $W^{(n)}(t)$ is the integrated/cumulative particle flux (recall Example 2.1 and see Section 5 for further examples). For now, we assume a fixed deterministic initial condition $\rho^{(n)}(0)$ for the empirical measure; this will be relaxed later on. We always assume that the initial condition for the flux satisfies $W^{(n)}(0)=0$ almost surely, since the particles have not moved yet at the initial time. For any $t\geq0$, the integrated flux $W^{(n)}(t)$ contains all information required to reconstruct the current state of the system, i.e. almost surely
\[ \rho^{(n)}(t) = \phi[W^{(n)}(t)]. \]
Equivalently, if the random paths allow for a notion of (measure-valued) time-integration, we write
\[ \dot\rho^{(n)}(t) = d\phi_{\rho^{(n)}(t)}\,\dot W^{(n)}(t). \]
We assume that the sequence $(\rho^{(n)}(t),W^{(n)}(t))$ satisfies a law of large numbers, whereby the microscopic process $(\rho^{(n)}(t),W^{(n)}(t))$ converges to a macroscopic, deterministic trajectory $(\rho(t),w(t))$, which satisfies an equation of the form (1.1), where at this stage we are only interested in the instantaneous flux $j=\dot w$. Consequently, the corresponding path probability measures $P^{(n)}=\mathrm{law}(\rho^{(n)},W^{(n)})$ will concentrate on that path $(\rho,w)$ as $n\to\infty$.
Finally we assume that the sequence $(\rho^{(n)}(t),W^{(n)}(t))$ satisfies a corresponding large-deviation principle in $\mathcal Z\times\mathcal W$, which can be formally written as
\[ P^{(n)}\big((\rho^{(n)},W^{(n)})\approx(\rho,w)\big) \sim \exp\Big(-n\int_0^T\mathcal L\big(\rho(t),\dot w(t)\big)\,dt\Big). \tag{3.1} \]
This large-deviation principle characterises the exponentially vanishing probability of paths starting from the fixed deterministic initial conditions which do not converge to the macroscopic path $(\rho,w)$. The function $\mathcal L$ is non-negative and its zero-cost flux corresponds to the macroscopic path, since for that path $P^{(n)}\sim1$.
In what follows, we first focus on the classical technique for proving the aforementioned large-deviation statement, which motivates the tilted L-function introduced in Lemma 2.15. Consequently we motivate the Definition 2.6 of the quasipotential via the large deviations of invariant measures, and the Definition 2.19 of the reversed L-function using time-reversal.

Tilting, contraction and mixture
Rigorous proofs of large-deviation principles for Markov processes tend to be rather technical. We nevertheless briefly review the classical proof technique, since it is closely related to the macroscopic framework The assumption that H depends on w only via ρ = φ[w] will generally be justified if the noise only depends on the state ρ of the system.
Main proof technique. In order to derive the large deviations (3.1) for a given, atypical path (ρ, w), one changes the probability measure P (n) to a tilted probability measure P (n) ζ . The tilting is defined via a time-dependent force field ζ(t) to be chosen later, and the Radon-Nikodym derivative is explicitly given by (see [PR02] for the generator of the tilted process and related technical details) (3.2) One can then (formally) estimate, for a small ball B ε (ρ, w) around the given atypical path (ρ, w), We choose ζ(t) to be optimum in supζ ζ ,ẇ(t) − H(ρ(t),ζ). It turns out that with this choice, the tilted probability P (n) ζ will concentrate on the given path (ρ, w) and therefore the final term in the right hand side vanishes (even for small ε), which results in Remark 3.2. On this formal level we do not specify the precise topological space in which the large-deviation principle holds; typically one can choose the Skorohod space D(0, T ; Z × W), possibly requiring weaker topologies on Z ×W. However, this topological setting does not influence the geometric picture of Section 2.1. We also stress that although the described proof strategy is classic, there are known cases were it fails [Hey23]. A different proof technique is developed in [FK06], but the main argument described above are the same.
Following similar arguments one can derive the large deviations of the tilted measures.
Corollary 3.3. For a given path ζ(t), the tilted probability P (n) ζ from (3.2) satisfies the large-deviation principle The proof follows from the same arguments as Formal Theorem 3.1, with (3.2) replaced by Note that H ζ−F is exactly as in (2.14) and consequently we interpret the tilted L-functions introduced in Definition 2.14 as the large-deviation cost functions for the tilted probability measures. From the Formal Theorem 3.1, one immediately obtains the following large-deviation principle for the state by applying the contraction principle [DZ09, Thm. 4.2.1], which motivates the definition (1.8) 3 Proposition 3.4. Assume that the large-deviation principle (3.1) holds for the pair (ρ (n) , W (n) ). Then the large-deviation principle also holds for ρ (n) , i.e.
Proposition 3.5 (Mixing [Big04]). Assume that the large-deviation principle (3.1) holds for the pair (ρ (n) , W (n) ) with a deterministic initial condition. If the initial condition is replaced by a sequence ρ (n) (0) ∈ Z which satisfies the large-deviation principle for some functional I 0 : Z → [0, ∞] and W (n) (0) = 0 almost surely, then the pair (ρ (n) , W (n) ) with random initial condition ρ (n) (0) ∈ Z satisfies the large deviation principle Remark 3.6. The abstract framework introduced in Subsection 2.1 automatically fixes the state ρ(0) = φ[0], which coincides with deterministic initial conditions in context of large deviations. Strictly speaking, to work with varying random initial conditions would require additional flexibility in the abstract framework. This can be achieved by either replacing the mapping φ (recall Definition 2.3) by a family of mappings (φ ρ(0) ) ρ(0) , or by keeping a fixed reference state φ[0], and redefining the initial integrated flux as w(0) ∈ φ −1 [ρ(0)], exploiting the surjectivity of φ. To keep the notation simple, we stick to the setup of a deterministic initial condition, and with a slight abuse of notation always tacitly assume that ρ(t) = φ[w(t)] = φ ρ(0) (w(t)).

Quasipotential
We now motivate Definition 2.6 of the quasipotential V. The following result is largely known in the literature, see for instance [BDSG +  Theorem 3.7. Assume that the Markov process ρ (n) (t) satisfies the large-deviation principle (3.4) and has an invariant measure Π (n) ∈ P(Z) that satisfies the large-deviation principle where µ (n) denotes a random variable distributed with Π (n) . Then we have whereL,Ĥ are defined in Proposition 3.4.
Note that (3.7) implies that V is always a Lyapunov function along the zero-cost dynamics, which can also be deduced from the decomposition (2.33).
Hence the large-deviation functional of the left-hand side is equal to the large-deviation rate of the right-hand side, which using a mixing argument [Big04] is given by which proves the first claim. From here on the arguments are purely macroscopic. We proceed by noting that which has the form of the value function from classical control theory, and hence solves the Hamilton-Jacobi-Bellman equationΞ We have already shown that Ξ T ≡ V does not depend on T , and thereforeΞ T (ρ) ≡ 0, which proves the second claim.
Remark 3.8. Strictly speaking, $\mathcal V$ should be a viscosity solution of the Hamilton-Jacobi-Bellman equation (3.9) and hence also of the stationary version in Theorem 3.7(ii). However, it is not precisely clear to us which boundary conditions should be imposed in the definition of the viscosity solution. This issue is particularly challenging since most classical Hamilton-Jacobi-Bellman theory is developed for quadratic $\hat{\mathcal H}$ only. Therefore, Theorem 3.7(ii) should be seen as formal. We remind the reader that a viscosity solution $\mathcal V(\rho)$ is a solution in the classical sense at points of differentiability. At least on a formal level, this already suffices for the applications in this paper.
Remark 3.9. In Theorem 3.7(ii) we do not require that the invariant measure is unique, neither do we claim that the quasipotential V(ρ) will be unique. In particular, we do not require stable points π ∈ Z for whicĥ L(π, 0) = 0 to be unique. In case of uniqueness, the quasipotential from Theorem 3.7(ii) will also satisfy the classical definition of the quasipotential [FW94] V(ρ) = inf In case of multiple stable points, one usually defines a family of non-equilibrium quasipotentials indexed by the stable points [FW94]. Any one of these will also satisfy Theorem 3.7(ii), which is sufficient for our purpose. Therefore the abstract theory from Section 2 can be constructed with any of these quasipotentials.
A special and important case of the previous result pertains to detailed balance.

Zero-cost velocity for IPFG antisymmetric L-function
In Subsection 2.6 we argued that both the purely symmetric flux and velocity are dissipative, that is, they are generalised gradient flows of the energy ½ V (and ½ V W respectively). Moreover, L F sym defines the variational structure of those gradient flows via the equalities (2.43) and (2.46).
The interpretation of L F asym is more complicated. In general L F asym will not have V as its quasipotential, and using Lemmas 2.11 and 2.15 for any ρ ∈ Dom symdiss (F asym ) and j ∈ T ρ W it satisfies the time-reversal relation L −F asym (ρ, j) = L F asym (ρ, −j).
This relation in fact holds for any tilted L-function, but $-F^{\mathrm{asym}}$ can be interpreted as the time-reversed counterpart of $F^{\mathrm{asym}}$ in the sense that $\overleftarrow{F^{\mathrm{sym}} + F^{\mathrm{asym}}} = F^{\mathrm{sym}} - F^{\mathrm{asym}}$ (see Remark 2.24). Formally this means that time-reversal reverses the fluxes, which is a physical indication that $L_{F^{\mathrm{asym}}}$ might correspond to Hamiltonian dynamics, as proposed in Conjecture 2.38.
In this section we illustrate this principle for the IPFG example with L-function L from Example 2.3. As far as we are aware this has not been studied in the literature, and as a first step we will focus solely on the trajectories of the zero-cost velocity u(t) = ρ̇(t) = u 0 (ρ(t)) of $L_{F^{\mathrm{asym}}}$, largely ignoring fluxes as well as the variational structure.
Let $(\rho, j)$ satisfy $L_{F^{\mathrm{asym}}}(\rho(t), j(t)) = 0$ or equivalently $j(t) \in \partial\Phi^*(\rho(t), F^{\mathrm{asym}}(\rho(t)))$, where the subdifferential is with respect to the second variable. Substituting $\lambda = \tfrac12$ in $L_{F - 2\lambda F^{\mathrm{sym}}}$ (defined in Example 2.33), for any $x \in X$, $\rho : [0, T] \to P(X)$ satisfies the ODE⁵ (4.1). Introducing the change of variables $\omega_x(t) := \sqrt{\rho_x(t)}$, the zero-cost velocity (4.1) transforms into a linear ODE (4.2) with a matrix $A \in \mathbb{R}^{X \times X}$. Solutions to this equation have a nice geometric interpretation, see Figure 3 for an example in three dimensions. Clearly, $|\omega(t)|_2^2 = |\rho(t)|_1 = 1$ and so the solutions are confined to the unit sphere $S^{|X|-1}$. On the other hand, the matrix $A$ is skew-symmetric with imaginary eigenvalues and represents rotations around the axis $\sqrt{\pi}$, implying that the solutions are confined to a plane perpendicular to $\sqrt{\pi}$. Therefore, solutions $\omega(t)$ lie on the intersection of these planes with the unit sphere, resulting in periodic orbits that conserve the distance of the plane to the origin. In the following result we show that this transformed system is indeed a Hamiltonian system with a suitable energy and Poisson structure which satisfies the Jacobi identity (see Lemma A.1 for a useful alternative characterisation of the Jacobi identity in our context).
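The geometric picture above can be checked numerically. The following is only a sketch under assumptions: the displayed forms of (4.1) and (4.2) are not reproduced here, so we assume the linear system reads $\dot\omega = \tfrac12 A\omega$ with $A_{xy} = Q_{yx}\sqrt{\pi_y/\pi_x} - Q_{xy}\sqrt{\pi_x/\pi_y}$, which is one choice consistent with the stated properties (skew-symmetry, $A\sqrt{\pi} = 0$). The three-state generator, the helper names and the Runge-Kutta integrator are hypothetical illustration choices.

```python
import math

# Hypothetical 3-state example. c[x][y] = pi_x * Q_xy are edge fluxes chosen so that
# inflow equals outflow at every node (hence Q^T pi = 0) while c is not symmetric,
# i.e. the chain is not in detailed balance.
pi = [0.5, 0.3, 0.2]
c = [[0.00, 0.15, 0.10],
     [0.10, 0.00, 0.15],
     [0.15, 0.10, 0.00]]
n = len(pi)
Q = [[c[x][y] / pi[x] for y in range(n)] for x in range(n)]
for x in range(n):
    Q[x][x] = -sum(Q[x][y] for y in range(n) if y != x)  # rows sum to zero


def build_A(Q, pi):
    """ASSUMED form: A_xy = Q_yx sqrt(pi_y/pi_x) - Q_xy sqrt(pi_x/pi_y)."""
    m = len(pi)
    return [[Q[y][x] * math.sqrt(pi[y] / pi[x]) - Q[x][y] * math.sqrt(pi[x] / pi[y])
             for y in range(m)] for x in range(m)]


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def rk4_step(A, w, h):
    """One classical Runge-Kutta step for the assumed linear ODE w' = A w / 2."""
    f = lambda v: [0.5 * a for a in matvec(A, v)]
    k1 = f(w)
    k2 = f([wi + 0.5 * h * ki for wi, ki in zip(w, k1)])
    k3 = f([wi + 0.5 * h * ki for wi, ki in zip(w, k2)])
    k4 = f([wi + h * ki for wi, ki in zip(w, k3)])
    return [wi + h / 6.0 * (a + 2 * b + 2 * d + e)
            for wi, a, b, d, e in zip(w, k1, k2, k3, k4)]


A = build_A(Q, pi)
sqrt_pi = [math.sqrt(p) for p in pi]
skew_error = max(abs(A[x][y] + A[y][x]) for x in range(n) for y in range(n))
kernel_error = max(abs(v) for v in matvec(A, sqrt_pi))

w = [math.sqrt(r) for r in [0.6, 0.25, 0.15]]  # omega_x = sqrt(rho_x)
overlap_start = sum(a * b for a, b in zip(sqrt_pi, w))
for _ in range(5000):
    w = rk4_step(A, w, 0.01)
norm_sq_end = sum(a * a for a in w)                  # stays on the unit sphere
overlap_end = sum(a * b for a, b in zip(sqrt_pi, w))  # stays in a plane normal to sqrt(pi)
```

Both conserved quantities follow from skew-symmetry and $A\sqrt{\pi} = 0$ alone, so the check does not depend on the particular rates chosen.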
Proposition 4.1. The ODE (4.2) admits a Hamiltonian structure $(\mathbb{R}^{X \times X}, \tilde E, J)$, i.e. $\dot\omega = J(\omega)\nabla\tilde E(\omega)$, where the linear energy $\tilde E : \mathbb{R}^X \to \mathbb{R}$ and Poisson structure $J : \mathbb{R}^X \to \mathbb{R}^{X \times X}$ are given by

Here $\omega \cdot v$ is the standard Euclidean inner product and $\omega \otimes v$ is the outer product of vectors $\omega, v$.
Proof. In Appendix A we present a Hamiltonian structure for a general class of ODEs, which includes the transformed system (4.2). The proof of Proposition 4.1 follows directly from Theorem A.2 with the choice $d = |X|$, $\omega^* = \sqrt{\pi}$ and observing that $|\omega^*|^2 = \sum_x \pi_x = 1$ and $A\sqrt{\pi} = A^T\sqrt{\pi} = 0$ since π is the invariant solution corresponding to the original dynamics (4.1).
We would now like to transform the Hamiltonian structure of the transformed ODE (4.2) back to obtain a Hamiltonian structure for the original non-linear equation (4.1). This transforms the positive octant of the sphere in Figure 3 to the simplex in Figure 1(c). However, transforming back via $\omega_x(t) = \sqrt{\rho_x(t)}$ is valid only if $\omega_x(t) \ge 0$ for every $x \in X$. In the following result we state the criterion for this to hold.

the energy $E : \mathbb{R}^X \to \mathbb{R}$ and the Poisson structure $J : \mathbb{R}^X \to \mathbb{R}^{X \times X}$ as where $A$ is defined in (4.2). If the energy of the initial distribution $\rho_0 \in P(X)$ for the ODE (4.1) satisfies $0 \le E(\rho_0) < \sigma$, then (4.1) has a unique solution and admits a Hamiltonian structure $(\mathbb{R}^{X \times X}, E, J)$, i.e. $\dot\rho = J(\rho)\nabla E(\rho)$. If the energy of the initial distribution satisfies $E(\rho_0) \ge \sigma$, then (4.1) has non-unique, non-energy-conserving solutions.
Proof. We first analyse the critical case, where the periodic orbit $\omega(t)$ of (4.2) touches one of the boundaries of $S^{|X|-1} \cap \mathbb{R}^X_{\ge 0}$. The energy level of such an orbit can be calculated by solving the constrained minimisation problem $\min\{\tilde E(\omega) : \omega \in S^{|X|-1},\ \omega_x = 0 \text{ for some } x \in X\} = \min_{x \in X} \min\{\tilde E(\omega) : \omega \in S^{|X|-1},\ \omega_x = 0\}$.
Here $t_1 := \min\{t \ge 0 : (e^{\frac12 A t}\omega_0)_{\hat x} = 0\}$, $\omega_1 := e^{\frac12 A t_1}\omega_0$ and $\omega_2 := e^{\frac12 A (t_1 + \delta)}\omega_1$, and $\bar A_{xy} := A_{xy}\mathbb{1}_{\{x, y \neq \hat x\}}$. Note that $\delta > 0$ must be large enough so that outgoing instead of incoming characteristics cross the boundary $\hat x = 0$ and small enough that the corners in the simplex are avoided. It is easily checked that $\rho(t)$ is continuously differentiable and satisfies the ODE (4.1). Since $\delta > 0$ is arbitrary we have constructed an infinite number of solutions.

Figure 3: For $|X| = 3$, the trajectories $\omega(t)$ rotate around the $\sqrt{\pi}$-axis, and lie at the intersection of the two-dimensional sphere $S^2$ and a plane perpendicular to the $\sqrt{\pi}$-axis. The transformation $\rho_x = \omega_x^2$ maps the (octant) sphere to the simplex of Figure 1(c).
In the following remark we comment on the role of $\lambda = \tfrac12$ in $L_{F - 2\lambda F^{\mathrm{sym}}}$.

Remark 4.3. One can also study the zero-cost velocity associated to $L_{F - 2\lambda F^{\mathrm{sym}}}$ from (2.33) for $\lambda \in (0, 1)$. For $\lambda < \tfrac12$, the symmetric part is dominant and the trajectories spiral inwards towards π, i.e. π is a spiral sink, and for $\lambda > \tfrac12$, the antisymmetric part is dominant and the trajectories spiral outwards from π, i.e. π is a spiral source (compare with Figure 1(c) for $\lambda = \tfrac12$).

Remark 4.4. As pointed out to us by André Schlichting, the energy $E(\rho) = \tfrac12 \sum_{x \in X} (\sqrt{\pi_x} - \sqrt{\rho_x})^2$ is exactly the squared Hellinger distance between ρ and the steady state π. At this stage we do not know the physical meaning behind the Hellinger distance, but it appears naturally in the context of purely time-antisymmetric flows.
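A short computation may help connect the Hellinger energy of Remark 4.4 with the linear energy of the transformed system. Expanding the square and using $\sum_x \pi_x = \sum_x \rho_x = 1$ gives $E(\rho) = 1 - \sum_x \sqrt{\pi_x \rho_x}$, which is affine in $\omega = \sqrt{\rho}$. The sketch below checks this identity numerically; the two probability vectors and the function names are hypothetical illustration values.

```python
import math


def hellinger_energy(rho, pi):
    """E(rho) = 1/2 * sum_x (sqrt(pi_x) - sqrt(rho_x))^2, as in Remark 4.4."""
    return 0.5 * sum((math.sqrt(p) - math.sqrt(r)) ** 2 for p, r in zip(pi, rho))


def overlap(rho, pi):
    """sum_x sqrt(pi_x * rho_x), i.e. the inner product of sqrt(pi) and omega = sqrt(rho)."""
    return sum(math.sqrt(p * r) for p, r in zip(pi, rho))


pi = [0.5, 0.3, 0.2]      # hypothetical steady state
rho = [0.6, 0.25, 0.15]   # hypothetical state, also a probability vector

energy = hellinger_energy(rho, pi)
affine_form = 1.0 - overlap(rho, pi)  # equals `energy` when both vectors sum to one
```

The identity relies on both vectors being normalised; off the simplex the two expressions differ by $\tfrac12(\sum_x \rho_x - 1)$.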

Examples
Throughout Section 2 we applied the abstract theory developed therein to the example of independent Markovian particles. We now apply the abstract theory to three examples of interacting particle systems. In Section 5.1 we consider the example of zero-range processes with an atypical scaling limit which leads to an ODE system in the limit as opposed to the usual parabolic scaling. Section 5.2 deals with the case of chemical reaction networks in complex balance. Finally in Section 5.3 we consider the case of lattice gases with parabolic scaling (which lead to diffusive systems).
For each of these examples we derive the decompositions in Theorem 2.29, and explicitly calculate all the different terms. We stress that these decompositions were previously unknown for zero-range processes and chemical reactions; we include the lattice gas example to show that for quadratic cost functions our decompositions coincide with existing results in MFT. We expect that by using approximation arguments similar to [HPST20, Thm 1.6], [RZ21, Sec. 5] and [Hoe23, Part II.A], one can derive global-in-time decompositions of the rate functionals $\int_0^T L(\rho(t), j(t))\,dt$; this is beyond the scope of the current paper.

Zero-range processes
Microscopic particle system. To simplify and unify notation, we first consider the irreducible Markov process on a finite graph $X$ from the IPFG example, with generator (represented by a matrix) $Q \in \mathbb{R}^{X \times X}$, and assume that it has a unique and coordinate-wise positive invariant measure $\pi \in P_+(X)$. Similar to the setup in Example 2.1 we study the Markov process $(\rho^{(n)}(t), W^{(n)}(t))$ on $P(X) \times X^2/2$, where $\rho^{(n)}(t)$ is the particle density of interacting particles and $W^{(n)}(t)$ is the integrated net flux (both defined in Example 2.1). The interaction between the particles is such that the jump rate $n\kappa_{xy}(\rho)$ from $x$ to $y$ depends only on the density at the source node $x$ ("zero-range"), $\kappa_{xy}(\rho) = \kappa_{xy}(\rho_x) = Q_{xy}\,\pi_x\,\eta_x(\rho_x/\pi_x)$, for a family of functions $\eta_x : [0, \infty) \to [0, \infty)$ satisfying: (i) each $\eta_x$ is strictly increasing, (ii) $\eta_x(0) = 0$ and $\eta_x(1) = 1$, and (iii) an integrability condition. The condition $\eta_x(0) = 0$ ensures that $\rho_x \ge 0$, i.e. there are no negative densities. The condition $\eta_x(1) = 1$ ensures that π is also an invariant measure for the many-particle limit (5.1), and is assumed only for convenience (see Remark 5.2 below). The integrability condition is necessary and sufficient for the large-deviation principle to hold [AAPR22]. Observe that the particular choice $\eta_x \equiv \mathrm{id}$ corresponds to the IPFG model.
State-flux triple and L-function. The manifolds Z, W with the corresponding tangent and cotangent spaces and the map φ : W → Z with dφ ρ = −div, dφ T = ∇ are exactly as in Example 2.2. It is easily checked that L and H are convex duals of each other, so that L is indeed convex and lower semicontinuous.
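The role of the normalisation $\eta_x(1) = 1$ can be illustrated with a small computation. This is a sketch under assumptions: we take the many-particle limit to be the continuity-type ODE $\dot\rho_x = \sum_{y \neq x} (\kappa_{yx}(\rho) - \kappa_{xy}(\rho))$, and the generator, the steady state and the three functions $\eta_x$ below are hypothetical choices satisfying conditions (i) and (ii); the integrability condition is not examined.

```python
# Hypothetical 3-state example: c[x][y] = pi_x * Q_xy with inflow = outflow at each node,
# so that Q^T pi = 0.
pi = [0.5, 0.3, 0.2]
c = [[0.00, 0.15, 0.10],
     [0.10, 0.00, 0.15],
     [0.15, 0.10, 0.00]]
n = len(pi)
Q = [[c[x][y] / pi[x] for y in range(n)] for x in range(n)]

# Hypothetical eta_x: strictly increasing on [0, inf), eta_x(0) = 0, eta_x(1) = 1.
eta = [lambda a: a,
       lambda a: a * a,
       lambda a: 0.5 * a * (1.0 + a)]


def kappa(x, y, rho):
    """Zero-range jump rate kappa_xy(rho) = Q_xy * pi_x * eta_x(rho_x / pi_x)."""
    return Q[x][y] * pi[x] * eta[x](rho[x] / pi[x])


def velocity(rho):
    """ASSUMED limit dynamics: net inflow minus outflow at each node."""
    return [sum(kappa(y, x, rho) - kappa(x, y, rho) for y in range(n) if y != x)
            for x in range(n)]


velocity_at_pi = velocity(pi)                    # vanishes because eta_x(1) = 1
velocity_elsewhere = velocity([0.6, 0.25, 0.15])  # generic state: nonzero, but mass-conserving
```

At $\rho = \pi$ every $\eta_x$ is evaluated at $1$, so the rates reduce to those of the independent-particle model and stationarity follows from $Q^T\pi = 0$.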
Note that V depends on Q through the steady state π only. Moreover, the integral is well-defined due to the integrability condition on η x . This function can be found as the large-deviation rate of the explicitly known invariant measure Π (n) using Theorem 3.7, [KL99,Prop. 3.2] and [GR20, Sec. 4.1]. However, in the next proposition we show that it is the correct quasipotential without any reference to a microscopic particle system, in the macroscopic sense of Definition 2.6.
Proof. At the points of differentiability of V we have where the fourth and fifth equalities follow by exchanging indices and the final equality follows since $Q^T\pi = 0$.
Remark 5.2. Let us discuss the various assumptions on $\eta_x$. Since $\eta_x$ is nonnegative and strictly increasing, it follows that V is strictly convex on $P(X)$, and consequently has a unique minimiser. The property $\eta_x(1) = 1$ ensures that π is this unique minimiser of V. If this condition is not satisfied then, as we show below, one can always construct $\tilde Q \in \mathbb{R}^{X \times X}$, $\tilde\pi \in P_+(X)$ and a family $\tilde\eta_x$ with $\tilde\eta_x(1) = 1$, such that $\kappa_{xy}(\rho) = \tilde Q_{xy}\,\tilde\pi_x\,\tilde\eta_x(\rho_x/\tilde\pi_x)$, $\tilde Q^T\tilde\pi = 0$, and $\tilde\pi$ is the unique stable point of (5.1). To calculate these modified objects, we minimise V(ρ) for $\rho \in P(X)$, which gives the minimiser
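The index-exchange argument in the proof can be mirrored numerically. The sketch below rests on two assumptions that are not displayed above: that the contracted Hamiltonian of the jump process is $\hat H(\rho, \xi) = \sum_{x \neq y} \kappa_{xy}(\rho)(e^{\xi_y - \xi_x} - 1)$, and that the quasipotential has derivative $\partial_{\rho_x} V(\rho) = \log \eta_x(\rho_x/\pi_x)$. The rates and the functions $\eta_x$ are hypothetical.

```python
import math

# Hypothetical 3-state example with Q^T pi = 0 (inflow = outflow for c[x][y] = pi_x Q_xy).
pi = [0.5, 0.3, 0.2]
c = [[0.00, 0.15, 0.10],
     [0.10, 0.00, 0.15],
     [0.15, 0.10, 0.00]]
n = len(pi)
eta = [lambda a: a, lambda a: a * a, lambda a: 0.5 * a * (1.0 + a)]


def kappa(x, y, rho):
    # kappa_xy(rho) = Q_xy pi_x eta_x(rho_x / pi_x), written with c[x][y] = pi_x Q_xy
    return c[x][y] * eta[x](rho[x] / pi[x])


def H_hat(rho, xi):
    """ASSUMED contracted Hamiltonian of the jump process."""
    return sum(kappa(x, y, rho) * (math.exp(xi[y] - xi[x]) - 1.0)
               for x in range(n) for y in range(n) if x != y)


def grad_V(rho):
    """ASSUMED derivative of the zero-range quasipotential."""
    return [math.log(eta[x](rho[x] / pi[x])) for x in range(n)]


rho = [0.6, 0.25, 0.15]
residual = H_hat(rho, grad_V(rho))
# For comparison: the relative-entropy derivative log(rho_x / pi_x) only works when eta = id.
residual_relative_entropy = H_hat(rho, [math.log(rho[x] / pi[x]) for x in range(n)])
```

With the assumed derivative the sum telescopes to $\sum_y \eta_y (Q^T\pi)_y - \sum_x \pi_x\eta_x \sum_y Q_{xy} = 0$, which is the same cancellation as in the proof.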
It is easily checked that these modified objects satisfy all the properties described above, and one can work with these objects instead.
with the dissipation potentials Since ℓ → cosh(ℓ) is an even function, using Lemma 2.11 it follows that Dom symdiss (F ) = Dom(F ), i.e. the dissipation potential is symmetric. Using Corollary 2.21 we find with Dom(F sym ) = Dom(F asym ) = P + (X ). Observe that the expressions of F sym and F asym imply that their domains can be easily extended to P + (X ) and P(X ) respectively; however the theory of Section 2 will not automatically be valid on that extension. Also note that F asym xy = 0 if the particle system satisfies detailed balance with respect to π. The orthogonality relations in Proposition 2.26 apply with (see [RZ21]) Decomposition of the L-function. The decompositions in Theorem 2.29 hold with the L-functions and the corresponding Fisher informations In particular, with η x ≡ id, we indeed arrive at the expressions in Example 2.33. With the expressions above the zero-range model satisfies the FIR inequality from Corollary 2.34 for λ = 1 2 , which is consistent with [RZ21,Cor. 4.3] but also holds more generally for λ ∈ [0, 1]. We also mention that the zero-cost flux for the symmetric L F sym satisfies EDI (see Corollary 2.36), i.e. it induces a gradient flow structure. We now turn our attention to its antisymmetric counterpart.
Zero-cost velocity for antisymmetric L-function. As in the IPFG case in Section 4, we now consider the zero-cost velocity associated to $L_{F^{\mathrm{asym}}}$ which for any $x \in X$ solves the ODE (5.4). Note that the corresponding ODE for IPFG (4.1) follows with $\eta_x \equiv \mathrm{id}$. The geometric arguments of Section 4 cannot be fully repeated, because it is unclear how to transform (5.4) into a linear equation. However, by analogy to that section, we make an educated guess for the energy and the Poisson structure, which is summarised in the following result. We will make use of the following family of functions $g_x$: for every $x \in X$. Using these functions we now show that Conjecture 2.38 holds for the zero-range process.
Proof. For any $x \in X$ we have where the third equality follows since $\sum_y \pi_y = 1$ and $(A^T\sqrt{\pi})_y = 0$ for any $y \in X$. Finally, note that (5.4) has unique solutions if the right-hand side is Lipschitz, which follows if $\rho_x > 0$, since $\eta_x(0) = 0$, for every $x \in X$. The expression (5.5) for this threshold follows by solving $\min\{E(\rho) : \rho \in P(X),\ \rho_x = 0 \text{ for some } x \in X\} = \min_{x \in X} \min\{E(\rho) : \rho \in P(X),\ \rho_x = 0\}$, where $\lambda_x$ in (5.5) is the Lagrange multiplier for the constraint $\sum_{z \neq x} \rho_z = 1$. The non-uniqueness of solutions follows if $E(\rho_0) \ge \sigma$ due to the non-Lipschitz right-hand side in (5.4).
The equation (5.4) may have an underlying Hamiltonian structure, but while the matrix field J(ρ) proposed here is skew-symmetric, it generally does not satisfy the Jacobi identity.

Complex-balanced chemical reaction networks
Microscopic particle system. We now describe a particle system that is commonly used to model chemical reactions. For a detailed review of this particle system with motivation and connections to related particle systems see [AK11].
Let X be a finite set of species, R be the finite set of reactions between the species, and let the vectors γ (r) ∈ R X denote the net number of particles of each species that are created/annihilated during a reaction r ∈ R. Furthermore, let R = R fw ∪ R bw such that each forward reaction r ∈ R fw corresponds to a backward reaction bw(r) ∈ R bw , meaning that γ (bw(r)) = −γ (r) for all r ∈ R fw 6 . The set R fw will play the role of X 2 /2 from Example 2.1.
The microscopic model involves a finite volume V that controls the number of randomly reacting particles in the system. For a fixed V , we study the random concentration or empirical measure ρ (V ) x (t), which is the number of particles belonging to species x ∈ X . Note that the total number of particles may not be conserved here, as opposed to the setting of Example 2.1. We also consider the integrated net reaction flux for r ∈ R fw , Forward and backward microscopic reactions r take place with given microscopic jump rates V κ (V ) r (ρ (V ) ) and V κ (V ) bw(r) (ρ (V ) ) respectively. Typically these jump rates are modelled with combinatoric terms (B.2), see also [AK11]. Since our framework is purely macroscopic, the precise expressions for the microscopic jump rates are not relevant; the only crucial point is that both converge sufficiently strongly to macroscopic reaction rates κ r (ρ) and κ bw(r) (ρ). The pair (ρ (V ) (t), W (V ) (t)) is a Markov process on R X × R R fw with generator Using the matrix notation Γ := [γ (r) ] r∈R fw ∈ R X ×R fw , in the limit V → ∞ the pair (ρ (V ) , W (V ) ) converges to the solution of (see [Kur70] and [RZ21,Sec. 3 x ∈ X . and s(· | ·) is defined in (2.7). As in the IPFG and zero-range models, the infimum over one-way fluxes j + can be derived using the contraction principle. We mention that at this level of generality one can already derive many interesting MFT properties, see [RZ21]. After all, the IPFG and zero-range models fall within this class. However, in order to apply our framework and obtain explicit results, the quasipotential needs to be known. To this aim we make two crucial assumptions. First, the system satisfies mass-action kinetics i.e. 
there exist stoichiometric vectors or complexes $\alpha^{(r)} \in \mathbb{R}^X_{\ge 0}$ (encoding the number of reactants involved) and reaction constants $c_r > 0$ for each $r \in R$ such that $\gamma^{(r)} = \alpha^{(\mathrm{bw}(r))} - \alpha^{(r)}$, $\gamma^{(\mathrm{bw}(r))} = \alpha^{(r)} - \alpha^{(\mathrm{bw}(r))}$, and the forward and backward rates satisfy, setting $\rho^{\alpha^{(r)}} := \prod_{x \in X} \rho_x^{\alpha^{(r)}_x}$, $\kappa_r(\rho) = c_r\,\rho^{\alpha^{(r)}}$ and $\kappa_{\mathrm{bw}(r)}(\rho) = c_{\mathrm{bw}(r)}\,\rho^{\alpha^{(\mathrm{bw}(r))}}$ (5.7). Second, we assume that the system is in complex balance [ACK10, Sec. 3.2] with respect to some $\pi \in \mathbb{R}^X_{>0}$, i.e.
State-flux triple and L-function. Fix a reference or initial concentration $\rho_0 \in \mathbb{R}^X_{\ge 0}$ and recall the matrix notation $\Gamma w = \sum_{r \in R_{\mathrm{fw}}} \gamma^{(r)} w_r$. The state space is the flat manifold of concentrations that can be produced from $\rho_0$ via reactions, with corresponding local (co)tangent spaces: As in the case of IPFG and zero-range, we include negative concentrations to simplify the geometric setting; this set Z is known in the literature as the stoichiometric compatibility class, whereas the subset of Z of coordinate-wise non-negative concentrations is called the stoichiometric simplex 7 . Moreover, as in the previous examples, $T_\rho Z$ restricts the directions of $\mathbb{R}^X$ in which one can differentiate, and $T^*_\rho Z$ appears as a quotient space. Indeed the Euclidean inner product between tangents $u = \Gamma j \in \mathrm{Ran}(\Gamma)$ and cotangents $\xi \in \mathbb{R}^X / \mathrm{Ker}(\Gamma^T)$ is again invariant under addition of vectors $\nu \in \mathrm{Ker}(\Gamma^T)$, since ${}_{T^*_\rho Z}\langle \xi + c\nu, u \rangle_{T_\rho Z} = (\xi + c\nu) \cdot \Gamma j = \xi \cdot u$. The space $\mathrm{Ker}(\Gamma^T)$ encodes the quantities (usually numbers of atoms) that are conserved under the reactions.
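To make the two assumptions concrete, the following sketch evaluates mass-action rates and tests complex balance in the usual sense of the reaction-network literature: at π, the total rate of reactions leaving each complex equals the total rate of reactions producing it. Since the displayed condition is not reproduced above, this formulation is an assumption, as are the triangular network, its rate constants and all function names.

```python
from collections import defaultdict


def mass_action_rate(c, alpha, rho):
    """kappa_r(rho) = c_r * prod_x rho_x ** alpha_x for a source complex alpha."""
    rate = c
    for species, power in alpha:
        rate *= rho[species] ** power
    return rate


def is_complex_balanced(reactions, pi, tol=1e-12):
    """ASSUMED definition: per complex, outgoing rate at pi equals incoming rate at pi."""
    outflow, inflow = defaultdict(float), defaultdict(float)
    for source, product, c in reactions:
        rate = mass_action_rate(c, source, pi)
        outflow[source] += rate
        inflow[product] += rate
    return all(abs(outflow[k] - inflow[k]) <= tol for k in set(outflow) | set(inflow))


def is_detailed_balanced(reactions, pi, tol=1e-12):
    """Each reaction is balanced by its reverse reaction at pi."""
    rates = {(s, p): mass_action_rate(c, s, pi) for s, p, c in reactions}
    return all(abs(rate - rates.get((p, s), 0.0)) <= tol for (s, p), rate in rates.items())


# Hypothetical reversible triangle A <-> B <-> C <-> A; complexes are tuples of (species, count).
A, B, C = (("A", 1),), (("B", 1),), (("C", 1),)
reactions = [(A, B, 2.0), (B, C, 2.0), (C, A, 2.0),   # forward reactions
             (B, A, 1.0), (C, B, 1.0), (A, C, 1.0)]   # corresponding backward reactions
pi = {"A": 1.0, "B": 1.0, "C": 1.0}

complex_balanced = is_complex_balanced(reactions, pi)    # True for this example
detailed_balanced = is_detailed_balanced(reactions, pi)  # False: net circulation A -> B -> C
```

The example is chosen so that complex balance holds while detailed balance fails, which is the regime of interest in this section.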
The flux space and its associated tangent and cotangent spaces are simply the Euclidean space and the continuity map φ : W → Z and its differential are Note that with this setup, φ is indeed surjective. Again, L is convex and lower semicontinuous since L is its own convex bidual.
Quasipotential. The quasipotential is again the relative entropy with respect to the invariant measure, ∞, otherwise.
Recall the relation between the quasipotential and the large-deviation rate functional for the invariant measure of the microscopic system from Theorem 3.7. Whereas in the IPFG model this relative entropy appears as the large-deviation rate functional for independent particles by Sanov's Theorem, in the complex balance case this is the rate functional of the explicitly known invariant measure of the microscopic particle system [ACK10, Thm. 4.1]. As in the previous examples, it can also be checked purely macroscopically that this is the correct quasipotential satisfying (2.12). In fact, it turns out that (2.12) is equivalent to complex balance; both directions of the equivalence will be shown in Theorem B.1 in Appendix B.
Remark 5.4. As mentioned in Subsections 1.1 and 3.2, the quasipotential V is always a Lyapunov function along the zero-cost dynamics (1.1). For the case of chemical reactions this was worked out explicitly in [ACGW15].
Dissipation potential, forces and orthogonality. The driving force is recalling that κ r (ρ) = c r ρ α (r) and Z + denote the positive concentrations in Z. The dissipation potentials are Note that Dom symdiss (F ) = Dom(F ), i.e. the dissipation potential is symmetric. Following Corollary 2.21, the symmetric and antisymmetric forces are with Dom(F sym ) = Dom(F asym ) = Dom(F ) = Z + . The orthogonality relations in Proposition 2.26 apply with This notion of generalised orthogonality is consistent with the derivations in [RZ21].
Decomposition of the L-function. The decompositions in Theorem 2.29 hold with the L-functions with the corresponding Fisher informations .
The zero-cost flux for L F sym is related to a gradient flow by Corollary 2.36; this has been discussed in [Ren18a,Cor. 4.8]. As opposed to IPFG and zero-range examples, the construction of a Poisson structure for L F asym is difficult in the chemical reaction setting due to the non-locality of the jump rates and the interplay with the stoichiometric vectors, and remains an open question.

Lattice gases
In this section we focus on the typical setting of MFT [BDSG + 15], namely discrete state-space particle systems whose hydrodynamic limit is the following drift-diffusion equation on the torus T d : As before ρ ∈ P(T d ) is the limiting density of the particle system, but now ∇, div denote the continuous differential operators in R d . We assume that the strictly positive potential U ∈ C ∞ (T d ; (0, ∞)), covector field A ∈ C ∞ (T d ; R d ) and the 'mobility' χ ∈ C ∞ (R; [0, ∞)) are smooth, and that furthermore 8 , Most results about this class of models are well known; we present them here to show that our abstract theory is consistent with 'classical' MFT.
Microscopic particle system. Although the macroscopic framework works for general mobilities, we only describe two standard microscopic particle systems that give rise to different mobilities. For independent random walkers χ(a) = a, h(a) = a log a − a + 1 and for the simple-exclusion process χ(a) = a(1 − a), h(a) = a log a + (1 − a) log(1 − a). Since these two particle systems with limit (5.11) have been extensively studied in the literature, we only present the essential features here.
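For the two mobility and entropy-density pairs just listed, one can check numerically that $h''(a)\,\chi(a) = 1$, the local Einstein relation familiar from MFT. That this is the relation imposed on $h$ and $\chi$ above is our assumption here, since the corresponding display is not reproduced; the finite-difference helper and the evaluation point are hypothetical choices.

```python
import math


def second_derivative(f, a, step=1e-4):
    """Central finite-difference approximation of f''(a)."""
    return (f(a + step) - 2.0 * f(a) + f(a - step)) / step ** 2


# (mobility chi, entropy density h) for the two particle systems named in the text.
models = {
    "independent walkers": (lambda a: a,
                            lambda a: a * math.log(a) - a + 1.0),
    "simple exclusion": (lambda a: a * (1.0 - a),
                         lambda a: a * math.log(a) + (1.0 - a) * math.log(1.0 - a)),
}

a0 = 0.3  # hypothetical density in (0, 1)
einstein_products = {name: chi(a0) * second_derivative(h, a0)
                     for name, (chi, h) in models.items()}
```

Analytically, $h''(a) = 1/a$ for the walkers and $h''(a) = 1/a + 1/(1-a) = 1/(a(1-a))$ for exclusion, so both products equal one up to discretisation error.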
For both particle systems, the particles can jump to neighbouring sites on the lattice $\mathbb{T}^d \cap (\tfrac1n \mathbb{Z})^d$. In order to pass to the hydrodynamic limit (5.11) and derive the corresponding large deviations, the state space will be embedded in the continuous torus. The first particle system consists of independent random walkers with drift. For any $n \in \mathbb{N}$, the corresponding empirical measure-flux pair $(\rho^{(n)}(t), W^{(n)}(t))$ is a Markov process in $P(\mathbb{T}^d) \times M(\mathbb{T}^d; \mathbb{R}^d)$ with generator (see [Ren18b]) $\times \big[ f\big(\rho - \tfrac{1}{n^d}\delta_x + \tfrac{1}{n^d}\delta_{x + \frac1n \tau},\ w + \tfrac{1}{n^{d+1}}\tau\,\delta_{x + \frac{1}{2n}\tau}\big) - f(\rho, w) \big]$. This system can also be derived as the spatial discretisation of interacting stochastic differential equations, although in such a continuous-space setting it becomes less straightforward to define particle fluxes. The second particle system is the weakly asymmetric simple exclusion process (WASEP) which has been extensively studied in the MFT literature (see for instance [BDSG + 07, BDSG + 15]). In this case the Markov process $(\rho^{(n)}(t), W^{(n)}(t))$ has generator Observe that in both generators, the flux $w$ has a different scaling than the particle density ρ. This is required to ensure that the discrete-space, finite-$n$ continuity equation converges to the continuous-space continuity equation with differential operator $-\operatorname{div}$.
8 In order to make sure that the rate functional $\int_0^T \hat L(\rho(t), \dot\rho(t))\,dt = \infty$ whenever negative concentrations are reached, one should in fact require $\chi(a) \equiv 0$ for all $a \notin [0, 1]$, and assume that χ is continuous and smooth away from its zeros.
Letting $n \to \infty$ we arrive at the hydrodynamic limit (5.11) with $\chi(a) := a$ for the first particle system and $\chi(a) := a(1 - a)$ for the second particle system. The corresponding large-deviation cost function and its dual are given by (5.13). Here $L^2(\chi(\rho))$ is the $\chi(\rho)$-weighted $L^2$-space on $\mathbb{T}^d$ with $\|f\|^2_{L^2(\chi(\rho))} := \int_{\mathbb{T}^d} |f(x)|^2 \chi(\rho(x))\,dx$ and $\|\cdot\|_{L^2(1/\chi(\rho))}$ is the dual norm to $\|\cdot\|_{L^2(\chi(\rho))}$. Note that L is constructed by taking the convex dual of H which is defined in terms of $\|\cdot\|_{L^2(\chi(\rho))}$. See

State-flux triple and L-function. Apart from the fact that the state space is infinite-dimensional, the lattice gas example differs from the previous examples in a number of ways. First of all, in this setting, one actually has a microscopic state-flux triple $(Z_n, W_n, \phi_n)$ that converges to the macroscopic one $(Z, W, \phi)$ in a suitable sense, see for example [Ren18a, Sec. 5]. For simplicity we only present the macroscopic structure. The second difference is that the cost (5.13) happens to be a quadratic functional, which induces a norm on the cotangent space. However as in the finite-dimensional examples, we regard such induced geometry to be a posteriori; one first needs a basic geometric setup in order to derive the dissipation potentials. Therefore we shall work with the following setting, and discuss the geometry induced by (5.13) in Remark 5.5.
The induced flux space is the metric space And the continuity operator is again (5.14). This setup is slightly different from the standard Wasserstein geometry, where by convention the fluxes are defined so as to satisfy ρ̇ = div(ρ j), while in our context the fluxes satisfy ρ̇ = div j. However, this induced state-flux triple is formal, as Z and W are not Banach manifolds, and differentiability of the quasipotential V becomes less straightforward. We therefore work with the simpler triple described above.
Observe that Dom symdiss (F ) = Dom(F ), i.e. the dissipation potential is symmetric. Following Corollary 2.21, the symmetric and antisymmetric forces are Indeed the antisymmetric force F asym is again independent of ρ.
The positivity of these Fisher informations is obvious from the definition. In this setting, the decompositions in Theorem 2.29 can be derived simply by expanding the squares in the L-function. Repeating the calculations in Corollary 2.34 for χ(a) = a, we arrive at the local FIR equality for diffusion processes (with u as a placeholder for ρ̇) [HPST20, Eq. (14)] d RelEnt(ρ|µ), u + ∇ log ρ µ L 2 (ρ)

Conclusion and discussion
In this paper we have presented an abstract macroscopic framework, which, for a given flux-density L-function, provides its decomposition into dissipative and non-dissipative components and a generalised notion of orthogonality between them. This decomposition provides a natural generalisation of the gradient-flow framework to systems with non-dissipative effects. Specifically we prove that the symmetric component of the L-function corresponds to a purely dissipative system and conjecture that the antisymmetric component corresponds to a Hamiltonian system, which has been verified in several examples. We then apply this framework to various examples, both with quadratic and non-quadratic L-functions.
We now comment on several related issues and open questions.

Why does the density-flux description work? At the level of the evolution equations, which are of continuity type, the density-flux description does not offer any advantage (recall (1.1)); at the level of the cost functions, however, it allows us to naturally encode divergence-free effects. This is clearly visible for instance in Theorem 2.29, where the evolutions corresponding to L F sym , L F asym are dissipative and energy-preserving respectively, while the zero of the full L-function characterises the macroscopic evolution. A simple contraction argument allows us to retrieve the classical gradient-flow structure as well as the FIR inequalities in a fairly general setting, which further reveals the power of this description.
Antisymmetric force and L-function. While in the abstract theory the antisymmetric force F asym = F asym (ρ) is a function of ρ ∈ Dom(F asym ), in all the concrete examples studied in this paper, F asym is independent of ρ. It is not clear to us if this is a general property of the antisymmetric force or a special characteristic of the examples studied in this paper.
In Section 2.6 we conjectured that the zero-velocity flux for the contracted antisymmetric L-function admits a Hamiltonian structure, which was concretely proved for the IPFG and zero-range processes in Propositions 4.2 and 5.3 respectively. While this gives insight into the associated zero-flows, it is not clear if L F asym admits a variational formulation akin to the gradient-flow structure for L F sym discussed in Corollary 2.36.
Chemical-reaction networks. In Appendix B we provide a new interpretation of systems in complex balance as being exactly those systems which admit the relative entropy as the quasipotential. This also restricts the search for invariant measures of the CME without complex balance to measures that are not exponentially equivalent to the product-Poisson form. However, motivated by the example in that appendix, an interesting question would be to identify the class of systems which admit a rescaled relative entropy as their quasipotential.
Furthermore, the Hamiltonian structure of the zero-velocity for L F asym in the chemical-reaction setting is open. As pointed out in Section 5.2, the non-locality of the jump rates for chemical-reaction networks offers a challenge as opposed to the local jump rates for IPFG and zero-range process.
Generalised orthogonality. The notion of generalised orthogonality as introduced in Section 2.4 allows us to decompose the L-function as in Theorem 2.29 for the special case λ = 1 2 . However a natural question is whether this notion of orthogonality encoded via θ ρ can be generalised to allow for any λ ∈ [0, 1]. This would provide a deeper understanding of our main decomposition Theorem 2.29 as well as a clear interpretation of the Fisher information in terms of a modified dissipation potential.
Quasipotentials for multiple invariant measures. In Remark 3.9 we discussed the possibility of having multiple quasipotentials. On a macroscopic level, forcing uniqueness for non-quadratic Hamilton-Jacobi-Bellman equations is generally challenging. This is not merely a technical issue, since even on a microscopic level there may be multiple invariant measures; we have not pursued this possibility any further.
Global-in-time decompositions. In this paper we have focussed on the local-in-time description of the L-function as opposed to working with time-dependent trajectories. While it is not obvious how to generalise the various abstract results to allow for global-in-time descriptions, we expect that it can be worked out case by case for the examples presented in this paper. The main difficulty here is that the time-dependent trajectories are allowed to explore the boundary of the domain where the forces are not well-defined, and therefore an appropriate regularisation procedure is required to extend the domain of definition of these forces.
where G 1 , G 2 : R d → R are C 2 -mappings, and the C 1 matrix-valued function ω → J(ω) ∈ R d×d is antisymmetric, i.e. J T = − J. The bracket (A.2) satisfies the Jacobi identity if and only if for all smooth G 1 , G 2 , G 3 : R d → R and for all ω ∈ R d we have The proof follows by straightforward manipulation of the Jacobi identity. We now present the Hamiltonian structure for (A.1).
Setting F i = ∇G i we can compute the remaining two terms on the left hand side of (A.3) by rotating the indices.

B Complex balance and quasipotential condition (2.12)
The results in this appendix were developed during discussions with the participants of the online AIM workshop "Limits and control of stochastic reaction networks". This appendix explores the relation between the quasipotential for chemical reaction networks on the one hand (defined via (2.12)) and the notion of complex balance on the other. Interestingly, it turns out that complex balance is equivalent to having the relative entropy as the quasipotential.
Consider chemical-reaction networks satisfying mass-action kinetics (5.7) as explained in Section 5.2. To stress that these results are independent of the flux formulation and do not require a decomposition into forward and backward reactions, we will simply work with the density formulation (2.40), with the dual of the contracted L-function given by $\hat H(\rho, \xi) := \sup_{u \in T_\rho Z} \xi \cdot u - \hat L(\rho, u) = \sum_{r \in R} c_r\,\rho^{\alpha^{(r)}} \big(e^{\gamma^{(r)} \cdot \xi} - 1\big)$.
The equation (2.12) for the quasipotential reads $\hat H(\rho, \nabla V(\rho)) = 0$ for all coordinate-wise positive $\rho \in \mathbb{R}^X_{>0}$. (B.1) The following result shows that the notion of complex balance is inherently connected to (B.1), in the case when the quasipotential is the relative entropy. This result also appears in [GL22, Lemma 3.6].
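Equation (B.1) is easy to probe numerically for a small network. The sketch below is an illustration under assumptions: we take the relative entropy to have gradient $\log(\rho_x/\pi_x)$, which presumes the usual form $s(a \mid b) = a\log(a/b) - a + b$ for the function from (2.7), and we use a hypothetical cycle of three complexes $2A \to A + B \to 2B \to 2A$ whose rate constants are chosen so that $\pi = (2, 1)$ is complex balanced.

```python
import math


def monomial(rho, alpha):
    """rho^alpha = prod_x rho_x ** alpha_x."""
    return math.prod(r ** a for r, a in zip(rho, alpha))


def H_hat(rho, xi, reactions):
    """Contracted Hamiltonian: sum_r c_r rho^alpha_r (exp(gamma_r . xi) - 1)."""
    total = 0.0
    for alpha, alpha_out, c in reactions:
        gamma = [b - a for a, b in zip(alpha, alpha_out)]  # net change of the reaction
        total += c * monomial(rho, alpha) * (
            math.exp(sum(g * x for g, x in zip(gamma, xi))) - 1.0)
    return total


def grad_relative_entropy(rho, pi):
    """ASSUMED gradient of the relative entropy: log(rho_x / pi_x)."""
    return [math.log(r / p) for r, p in zip(rho, pi)]


# Hypothetical network on species (A, B): (source complex, product complex, rate constant).
# At pi = (2, 1) each reaction carries rate 4, so every complex is balanced.
balanced = [((2, 0), (1, 1), 1.0), ((1, 1), (0, 2), 2.0), ((0, 2), (2, 0), 4.0)]
perturbed = balanced[:2] + [((0, 2), (2, 0), 5.0)]  # breaks complex balance at pi
pi = (2.0, 1.0)
rho = (1.0, 1.5)

xi = grad_relative_entropy(rho, pi)
residual_balanced = H_hat(rho, xi, balanced)    # vanishes up to rounding
residual_perturbed = H_hat(rho, xi, perturbed)  # equals (rho_A/pi_A)^2 - (rho_B/pi_B)^2 here
```

In the balanced case the sum telescopes around the cycle of complexes; after perturbing one rate constant the residual no longer vanishes at this state, in line with the equivalence discussed in this appendix.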
Theorem B.1. The following two statements are equivalent.
where $Z_V$ is the $V$-dependent normalisation constant. Note that the corresponding (zero-cost) reaction rate equation reads $\dot\rho(t) = \kappa_b - \kappa_d\,\rho(t)^2$, which clearly has the steady state $\pi := \sqrt{\kappa_b/\kappa_d}$. The CME is in detailed balance with respect to $\Pi^{(V)}$, but the reaction network is not in complex balance w.r.t. π. Again by Stirling's formula and using the fact that $\inf V = 0$, we find

Although the CME is in detailed balance, this result does not contradict the findings of [MPPR17], since this reaction network is not reversible in the sense of footnote 6.
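The failure of the plain relative entropy in this example can be seen directly from (B.1). The sketch below assumes that the rate equation $\dot\rho = \kappa_b - \kappa_d\rho^2$ arises from the two reactions $\emptyset \to A$ (rate $\kappa_b$) and $2A \to A$ (rate $\kappa_d\rho^2$), so that $\hat H(\rho, \xi) = \kappa_b(e^{\xi} - 1) + \kappa_d\rho^2(e^{-\xi} - 1)$; this reading of the network and the numerical values of the rate constants are our assumptions. Under them, the relative-entropy derivative $\log(\rho/\pi)$ leaves a nonzero residual, whereas twice that derivative solves the equation.

```python
import math

kappa_b, kappa_d = 2.0, 0.5           # hypothetical birth and death constants
pi = math.sqrt(kappa_b / kappa_d)     # zero of kappa_b - kappa_d * rho**2


def H_hat(rho, xi):
    """ASSUMED Hamiltonian for the reactions 0 -> A and 2A -> A (jumps of size +1 and -1)."""
    return kappa_b * (math.exp(xi) - 1.0) + kappa_d * rho ** 2 * (math.exp(-xi) - 1.0)


rho = 1.0
residual_entropy = H_hat(rho, math.log(rho / pi))         # plain relative entropy: nonzero
residual_rescaled = H_hat(rho, 2.0 * math.log(rho / pi))  # rescaled by a factor two: zero
```

Indeed, with $\xi = 2\log(\rho/\pi)$ and $\pi^2 = \kappa_b/\kappa_d$ the two terms reduce to $\kappa_d\rho^2 - \kappa_b$ and $\kappa_b - \kappa_d\rho^2$, which cancel for every $\rho > 0$. This is consistent with the question, raised in the discussion, of which systems admit a rescaled relative entropy as quasipotential.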