Information geometry of dynamics on graphs and hypergraphs

Kobayashi, Tetsuya J.; Loutchko, Dimitri; Kamimura, Atsushi; Horiguchi, Shuhei A.; Sughiyama, Yuki

doi:10.1007/s41884-023-00125-w

Research Paper
Open access
Published: 22 December 2023

Information Geometry, Volume 7, pages 97–166 (2024)


Tetsuya J. Kobayashi (ORCID: orcid.org/0000-0001-8474-7942)^{1,2,3}, Dimitri Loutchko^{1}, Atsushi Kamimura^{1}, Shuhei A. Horiguchi^{2} & Yuki Sughiyama^{1}

Abstract

We introduce a new information-geometric structure associated with the dynamics on discrete objects such as graphs and hypergraphs. The presented setup consists of two dually flat structures built on the vertex and edge spaces, respectively. The former is the conventional duality between density and potential, e.g., the probability density and its logarithmic form induced by a convex thermodynamic function. The latter is the duality between flux and force induced by a convex and symmetric dissipation function, which drives the dynamics of the density. These two are connected topologically by the homological algebraic relation induced by the underlying discrete objects. The generalized gradient flow in this doubly dual flat structure is an extension of the gradient flows on Riemannian manifolds, which include Markov jump processes and nonlinear chemical reaction dynamics as well as the natural gradient. The information-geometric projections on this doubly dual flat structure lead to information-geometric extensions of the Helmholtz–Hodge decomposition and the Otto structure in $L^{2}$-Wasserstein geometry. The structure can be extended to non-gradient nonequilibrium flows, from which we also obtain the induced dually flat structure on cycle spaces. This abstract but general framework can broaden the applicability of information geometry to various problems of linear and nonlinear dynamics.

1 Introduction

Information geometry is finding and establishing a firm position as a geometric language in various scientific disciplines [1, 2]. Information geometry enables us to gain an intuitive understanding of the structures behind complicated problems of inference and estimation, for which Euclidean or Riemannian geometry may not be sufficient. In addition, it can provide ways to devise new solutions and approaches for the problems [1]. While information geometry was originally developed for statistics, its applicability now reaches far beyond statistical problems. Whenever the notions of probability, information, or positive density appear in a problem, it is natural to consider its information-geometric structure.

1.1 Information geometry of dynamics

Dynamical systems and phenomena can be naturally analyzed with information geometric methods, as conventionally one considers the dynamics of probability distributions [3,4,5], e.g., via the Fokker–Planck equations (FPE) and the Master equation, or those of positive densities, e.g., via population dynamics, epidemic models, diffusion dynamics on networks, and chemical reaction dynamics [6,7,8]. Although the application of information geometry to dynamical systems has been attempted almost since its birth, information geometry for dynamics is much less organized and principled than that for static problems in statistics, optimization, and others [1]. In connection with statistical inference, information geometry was employed by Amari and others to investigate Gaussian time series and autoregressive moving average (ARMA) models by representing their power spectrum as parametric manifolds [9,10,11]. This idea was also used to investigate linear systems [12]. Markov jump processes on finite states^{Footnote 1} were investigated information-geometrically by considering the hierarchical structure of joint or conditional probabilities at different time points, e.g., $\mathbb {P}_{\varvec{\theta }}(x_{1},x_{2},\ldots , x_{t})$ [13], or by introducing exponential families of Markov kernels (transition matrices), $\mathbb {T}_{\varvec{\theta }}(x|x')$, via exponential tilting of the kernels [14,15,16,17,18,19,20]. Furthermore, information geometry was applied to studies of random walks, nonlinear diffusion equations of porous media, and networks [21,22,23]. In relation to mechanics, integrable systems were associated with the dualistic gradient flow of information geometry in the seminal works [24, 25], and other connections of information geometry with Lagrangian or Hamiltonian mechanics have been pursued [26,27,28].

1.2 Information measures for dynamics

Concurrently with and almost independently of these attempts within the community of information geometry, information measures relevant to information geometry have been employed in various problems of dynamical systems and stochastic processes in information theory [29], filtering theory [30, 31], control theory [32,33,34], and non-equilibrium physics and chemistry [35,36,37]. The Kullback–Leibler (KL) divergence [38] for probabilities and positive densities was shown to be a Lyapunov function of Markov jump processes (MJP) [5], FPE [3, 39], deterministic chemical reaction networks (CRN) [40, 41], and other dynamical systems [42, 43], the origin of which can be dated back to Gibbs’ H-theorem [44]. Among those topics, since the establishment of chemical thermodynamics by Gibbs [44] and chemical kinetics by Guldberg and Waage [45], CRN has played the role of a seedbed for cultivating the theory between dynamics and divergences owing to its close connection with thermodynamics [46,47,48]. More recently, it was also clarified that the divergences and information geometry are fundamental in stochastic thermodynamics [49,50,51,52,53].

In addition to the KL divergence, the Fisher-information-like quantity

$$\begin{aligned} \mathbb {I}_{F}[p] :=\int p(\varvec{r})(\nabla _{\varvec{r}} \ln p(\varvec{r}))^{2}\textrm{d}\varvec{r} \in \mathbb {R}_{\ge 0} \end{aligned}$$

(1)

was also revealed to play an important role in characterizing dynamics for densities on a continuous space, e.g., Gaussian convolution, diffusion processes, and FPE [54,55,56]. Various governing equations in physics were claimed to be derived in a unified way from this quantity [36]. The quantity $\mathbb {I}_{F}$ looks like the Fisher information [57] but is different from the conventional Fisher information matrix [58,59,60] because the derivative $\nabla _{\varvec{r}} \ln p(\varvec{r})$ is not for the parameters but for the base space variable of $p(\varvec{r})$.^{Footnote 2} Because $\mathbb {I}_{F}$ is a scalar, we follow [59] and call it Fisher information number. The Fisher information number $\mathbb {I}_{F}$ is related to the KL divergence in additive Gaussian channels [54] and other systems [56, 62], which is known as the De Bruijn identity [54]. In addition, the logarithmic Sobolev inequality also provides a relation between the Fisher information number and the KL divergence (or Shannon information) [63, 64]. These results have recently been associated with the formal Riemannian geometric structure induced by the $L^{2}$-Wasserstein geometry [65, 66].
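As a numerical illustration of Eq. 1, the following minimal sketch approximates $\mathbb {I}_{F}$ on a one-dimensional grid. It assumes NumPy, and the helper name `fisher_information_number`, the grid, and the Gaussian test density are illustrative choices rather than part of the original text. For a Gaussian with variance $\sigma ^{2}$, the integral evaluates analytically to $1/\sigma ^{2}$, which gives a simple check.

```python
import numpy as np

def fisher_information_number(p, r):
    """Approximate I_F[p] = int p(r) (d/dr ln p(r))^2 dr on a uniform 1D grid."""
    dr = r[1] - r[0]
    score = np.gradient(np.log(p), dr)       # derivative w.r.t. the base variable r
    return float(np.sum(p * score**2) * dr)  # Riemann-sum approximation of the integral

# Illustrative test density: zero-mean Gaussian with standard deviation sigma.
sigma = 1.5
r = np.linspace(-12.0, 12.0, 4001)
p = np.exp(-r**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

I_F = fisher_information_number(p, r)
print(I_F, 1 / sigma**2)  # the two values should agree closely
```

Note that the derivative acts on the base-space variable $\varvec{r}$, not on a model parameter, which is exactly the distinction from the Fisher information matrix drawn in the text.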

1.3 Information geometry and dynamics in machine learning

On top of these traditional trends, information geometry is now playing a pivotal role in machine learning for designing and evaluating online optimization algorithms (dynamics) in the space of model parameters such as natural gradient [67] and mirror descent [68, 69] as well as evolutionary computation (information-geometric optimization) [70]. Geometric interpretation allows us to understand the behaviors and efficiency of algorithms and their dynamics more intuitively in a principled manner [69,70,71].

1.4 Aim and contributions of this work

Despite the wide applicability and the long history of information geometry, we still lack a solid theoretical framework to unify these outcomes that spread across different fields from the viewpoint of information geometry. In this work, we introduce a new information geometric structure for the dynamics of probability and positive densities. In this structure, we consider not only the single dually flat structure built on the space of densities as in [24, 25] but also another structure constructed on the space of fluxes. These two structures are linked algebraically and topologically via the continuity equation and the gradient equation as illustrated in Fig. 1.

Under this doubly dual flat structure, we can consider the dynamics of densities as a generalized flow, and various previous results can be unified in this framework. We exclusively consider dynamics of densities on finite-dimensional discrete manifolds, i.e., finite graphs or hypergraphs, because the structure introduced here can be explicitly manifested in this setup and also because we do not need the mathematically elaborated setup for infinite-dimensional information geometry on a smooth manifold [72]. For the case of FPE in a continuous state space, the dually flat structure built on the flux space can be reduced to the formal Riemannian geometric structure of $L^{2}$ Wasserstein geometry where the convex functions that induce the dually flat structure become quadratic. Our structure generalizes the linear inner product on the tangent and cotangent spaces with the nonlinear Legendre transform, thereby requiring information geometry. By elucidating this information geometric structure, we can easily see that some quantities such as the bilinear product, convex thermodynamic potential functions, the Fisher information matrix, and the Fisher information number are consolidated into one quantity for FPE with the quadratic convex functions (see Sect. 5.3 and Sect. 5.4). Therefore, our structure provides a way to unify the dualistic gradient flow mentioned in Sect. 1.1 and also the information-number related topics in Sect. 1.2.

From the viewpoint of homological algebra, the structure we work on is a modification of the chain and cochain complexes of graphs or hypergraphs, which replaces the usual inner product duality [73] on each pair of chains and cochains with Legendre duality. Moreover, the dually flat space built on the flux space is linked to a finite-dimensional version of Orlicz spaces [74], which have been employed for constructing infinite-dimensional information geometry [72]. From the nice properties of the doubly dual flat structures, we can obtain information-geometric extensions of the Helmholtz-Hodge-Kodaira (HHK) decomposition (Theorem 1), the Otto calculus (Theorem 2), and its induction to cycle spaces (Theorem 3).

Our construction of an information geometry for dynamics is heavily based on the idea of using Legendre duality for the force and flux relation, proposed in the recent work on large deviations theory and the macroscopic fluctuation theorem for MJP and CRN led by A. Mielke, R.I.A. Patterson, M.A. Peletier, D.R.M. Renger, J. Zimmer, and others^{Footnote 3} [75,76,77,78,79,80,81,82]. We clarified its information-geometric aspects in the context of CRN and thermodynamics in our previous work [83]. We also concurrently elucidated the intimate link of equilibrium chemical thermodynamics and information geometry on the density state space [48, 84, 85]. In light of those, the contribution of this work is three-fold. First, we integrate these results in terms of information geometry, which clarifies the underlying geometric nature of the problem, provides transparent interpretations for known results, and leads to new information geometric results and insights (Theorem 1–Theorem 3). Second, this structure substantially extends the applicability of information geometry to a wide variety of dynamical problems. Lastly, the structure links information geometry to algebraic graph theory, discrete calculus, and homological algebra, which have not yet been fully appreciated in this context but provide a versatile way to consider the topology of the base manifold in information geometry.

1.5 Organization of this paper

This work is organized as follows: In Sect. 2, we introduce a range of models of dynamics on graphs and hypergraphs. In Sect. 3, we outline the homological algebra of graphs and hypergraphs. In Sect. 4, we abstractly introduce the doubly dual flat structures on the density and flux spaces and define the generalized flow associated with these structures. In Sect. 5, we clarify that the introduced structures include a wide class of dynamics on graphs and hypergraphs. In Sect. 6 and Sect. 7, we further define information-geometric objects and quantities, which naturally appear from this setup and play an integral role in the subsequent analysis of dynamics. In Sect. 8 and Sect. 9, we derive several results for equilibrium and nonequilibrium flows, respectively. Finally, we provide a summary and prospects of our work in Sect. 10. The notations and symbols are listed in the appendix.

2 Classes of models for density dynamics on graphs and hypergraphs

In this work, we focus on linear and nonlinear dynamics defined on graphs [86] and hypergraphs [87].

The linear dynamics of densities on graphs (LDG) includes Markov jump processes (MJP) [88], monomolecular chemical reaction networks [89], and others [86]. We consider an extension of LDG to hypergraphs and nonlinear dynamics, common instances of which are chemical reaction networks (CRN) with the law of mass action (LMA) kinetics [8] and polynomial dynamical systems (PDS) [90]. Because the extension we deal with in this work is a subclass of nonlinear dynamical systems on hypergraphs, we use CRN to designate this subclass.

In the following subsections, LDG and CRN are introduced using the language of algebraic graph theory [86, 91]. Then, we also give a brief and formal introduction of the Fokker-Planck equation (FPE) [3], a linear dynamics of probability densities defined in Euclidean space. We use the FPE throughout this paper only to contrast our results with the previous ones obtained for the FPE.

2.1 Reversible linear dynamics of densities on graphs

Definition 1

(Edge-weighted finite graph $\mathbb {G}_{\varvec{k}^{\pm }}$) A finite graph $\mathbb {G}:=(\{\mathbb {v}_{i}\},\{\mathbb {e}_{e}\},\mathbb {B})$ consists of $N_{\mathbb {v}} \in \mathbb {Z}_{> 0}$ vertices, $\{\mathbb {v}_{i}\}_{i\in [1,N_{\mathbb {v}}]}$, and $N_{\mathbb {e}} \in \mathbb {Z}_{> 0}$ oriented edges, $\{\mathbb {e}_{e}\}_{e\in [1,N_{\mathbb {e}}]}$, each of which connects two different vertices^{Footnote 4} (Fig. 2a). The incidence relation is represented by the incidence matrix $\mathbb {B}\in \{0,\pm 1\}^{N_{\mathbb {v}}\times N_{\mathbb {e}}}$ where, for $\mathbb {B}=(b_{i,e})$,

$$\begin{aligned} b_{i,e}&:=+1{} & {} \text{ if } \mathbb {v}_{i} \text{ is } \text{ the } \text{ tail } \text{ of } \text{ edge } \mathbb {e}_{e}, \\ b_{i,e}&:=-1{} & {} \text{ if } \mathbb {v}_{i} \text{ is } \text{ the } \text{ head } \text{ of } \text{ edge } \mathbb {e}_{e}, \\ b_{i,e}&:=0{} & {} \text{ otherwise }. \end{aligned}$$

An edge-weighted finite graph $\mathbb {G}_{\varvec{k}^{\pm }}:=(\{\mathbb {v}_{i}\},\{\mathbb {e}_{e}\},\mathbb {B}, \{k_{e}^{\pm }\})$ has two positive weighting parameters $k_{e}^{\pm }=(k_{e}^{+},k_{e}^{-})\in \mathbb {R}_{> 0}^{2}$ for each edge $\mathbb {e}_{e}$. The parameters $k_{e}^{+}$ and $k_{e}^{-}$ are called the forward and reverse rates (or weights) of edge $\mathbb {e}_{e}$, respectively.
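The incidence matrix of Definition 1 can be assembled directly from an edge list. The sketch below is a minimal illustration assuming NumPy; the function name `incidence_matrix`, the 0-based vertex indices, and the three-vertex cycle are hypothetical choices. It follows the sign convention stated above, with $+1$ at the tail and $-1$ at the head of each oriented edge.

```python
import numpy as np

def incidence_matrix(n_vertices, edges):
    """Build B from a list of (tail, head) pairs with 0-based vertex indices.

    Convention as in Definition 1: +1 at the tail, -1 at the head, 0 otherwise.
    """
    B = np.zeros((n_vertices, len(edges)), dtype=int)
    for e, (tail, head) in enumerate(edges):
        if tail == head:
            raise ValueError("each edge must connect two different vertices")
        B[tail, e] = +1
        B[head, e] = -1
    return B

# Illustrative graph: an oriented 3-cycle 0 -> 1 -> 2 -> 0.
edges = [(0, 1), (1, 2), (2, 0)]
B = incidence_matrix(3, edges)
print(B)
```

Every column of $\mathbb {B}$ contains exactly one $+1$ and one $-1$, so each column sums to zero.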

A reversible linear dynamics (rLDG) on a graph is defined on the edge-weighted finite graph $\mathbb {G}_{\varvec{k}^{\pm }}$:

Definition 2

(Reversible linear dynamics of density on graph $\mathbb {G}_{\varvec{k}^{\pm }}$) The reversible linear dynamics of non-negative density $\varvec{x}(t)=(x_{1}(t), \cdots , x_{N_{\mathbb {v}}}(t))^{T}\in \mathbb {R}_{\ge 0}^{N_{\mathbb {v}}}$ on $\mathbb {G}_{\varvec{k}^{\pm }}$ is defined by the continuity equation

$$\begin{aligned} \dot{\varvec{x}}&=-\mathbb {B}\varvec{j}(\varvec{x})=-\mathbb {B}[\varvec{j}^{+}(\varvec{x})-\varvec{j}^{-}(\varvec{x})], \end{aligned}$$

(2)

and linear forward and reverse one-way fluxes $\varvec{j}^{\pm }(\varvec{x})=(j^{\pm }_{1}(\varvec{x}),\cdots , j^{\pm }_{N_{\mathbb {e}}}(\varvec{x}))^{T}\in \mathbb {R}^{N_{\mathbb {e}}}_{\ge 0}$ with the following specific functional form^{Footnote 5}:

$$\begin{aligned} \varvec{j}^{\pm }(\varvec{x})&=\varvec{k}^{\pm }\circ (\mathbb {B}^{\pm })^{T}\varvec{x} , \end{aligned}$$

(3)

where $\varvec{j}(\varvec{x}):=\varvec{j}^{+}(\varvec{x})-\varvec{j}^{-}(\varvec{x})$ is the total flux, the symbol $\circ $ denotes the component-wise product of two vectors,^{Footnote 6} and $\mathbb {B}^{+}$ and $\mathbb {B}^{-}$ are the head and tail incidence matrices defined respectively as $\mathbb {B}^{+}:=\max [\mathbb {B},0]$ and $\mathbb {B}^{-}:=\max [-\mathbb {B},0]$. The incidence matrix $\mathbb {B}$ in Eq. 2 is often regarded as the discrete divergence operator on a graph [73] and denoted also by $\textrm{div}_{\mathbb {B}}=\mathbb {B}$ to emphasize this interpretation in this work.^{Footnote 7}
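To make Eq. 2 and Eq. 3 concrete, the following sketch evaluates the one-way fluxes and the resulting velocity for a small example. It assumes NumPy; the three-vertex cycle, the rate values, and the density are illustrative and not taken from the original text.

```python
import numpy as np

# Incidence matrix of an oriented 3-cycle (illustrative example).
B = np.array([[ 1,  0, -1],
              [-1,  1,  0],
              [ 0, -1,  1]])
B_plus = np.maximum(B, 0)    # B^+ := max[B, 0]
B_minus = np.maximum(-B, 0)  # B^- := max[-B, 0]

k_plus = np.array([2.0, 1.0, 0.5])   # forward rates k^+ (assumed values)
k_minus = np.array([1.0, 3.0, 1.0])  # reverse rates k^- (assumed values)
x = np.array([1.0, 2.0, 4.0])        # density on the vertices (assumed values)

j_plus = k_plus * (B_plus.T @ x)     # Eq. 3, forward one-way flux
j_minus = k_minus * (B_minus.T @ x)  # Eq. 3, reverse one-way flux
j = j_plus - j_minus                 # total flux
x_dot = -B @ j                       # Eq. 2, continuity equation

print(j)      # [  0. -10.   1.]
print(x_dot)  # [  1.  10. -11.]
```

The component-wise product $\circ $ corresponds to the elementwise `*` of NumPy arrays, and the entries of `x_dot` sum to zero because every column of $\mathbb {B}$ sums to zero.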

Reversible Markov jump processes (rMJP) are a representative class of the rLDG describing random jumps of noninteracting particles on $\mathbb {G}_{\varvec{k}^{\pm }}$.^{Footnote 8} The weighting parameter $k_{e}^{+}$ is interpreted as the forward jump rate from the tail of the oriented edge $\mathbb {e}_{e}$ to its head, whereas $k_{e}^{-}$ is the reverse jump rate from the head to the tail of $\mathbb {e}_{e}$.^{Footnote 9} For infinitely many such particles, we consider $p_{i}(t)\in [0,1]$, the fraction of particles on vertex $\mathbb {v}_{i}$ at time t, which is a non-negative density on vertices. Then, the forward and reverse one-way fluxes on the eth edge defined by Eq. 3 are represented as

$$\begin{aligned} j^{+}_{e}(\varvec{p})&=k_{e}^{+}p_{\mathbb {v}^{+}_{e}}\in \mathbb {R}_{\ge 0},&j^{-}_{e}(\varvec{p})&=k_{e}^{-}p_{\mathbb {v}^{-}_{e}}\in \mathbb {R}_{\ge 0}, \end{aligned}$$

(4)

where $\mathbb {v}^{+}_{e}$ and $\mathbb {v}^{-}_{e}$ are the head and tail vertices of edge $\mathbb {e}_{e}$^{Footnote 10}. The linearity of $j^{\pm }_{e}(\varvec{p})$ with respect to $\varvec{p}$ comes from the independence of particles on the graph. Then, the continuity equation (Eq. 2) with the state vector $\varvec{p}(t):=(p_{1}(t), \cdots , p_{N_{\mathbb {v}}}(t))^{T}\in \mathbb {R}^{N_{\mathbb {v}}}_{\ge 0}$ is reduced to the master equation: $\dot{\varvec{p}}=-\mathbb {B}\varvec{j}(\varvec{p})$.
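The master equation conserves the total probability because $\varvec{1}^{T}\mathbb {B}=\varvec{0}^{T}$. The sketch below checks this numerically with an explicit Euler scheme, which is an illustrative integrator chosen for brevity and not a method proposed in the text. It assumes NumPy; the graph, rates, step size, and function name `velocity` are hypothetical.

```python
import numpy as np

B = np.array([[ 1,  0, -1],
              [-1,  1,  0],
              [ 0, -1,  1]])         # oriented 3-cycle (illustrative)
k_plus = np.array([2.0, 1.0, 0.5])   # assumed forward jump rates
k_minus = np.array([1.0, 3.0, 1.0])  # assumed reverse jump rates

def velocity(p):
    """Right-hand side of the master equation, dp/dt = -B j(p)."""
    j = k_plus * (np.maximum(B, 0).T @ p) - k_minus * (np.maximum(-B, 0).T @ p)
    return -B @ j

p = np.array([1.0, 0.0, 0.0])  # all particles start on the first vertex
dt = 1e-3                      # small enough that each Euler step keeps p non-negative here
for _ in range(20000):         # integrate up to t = 20
    p = p + dt * velocity(p)

print(p, p.sum())
```

With these rates the state relaxes to a stationary distribution well before $t=20$, and the sum of `p` stays at one up to floating-point error.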

Definition 3

(Weighted asymmetric graph Laplacian [91, 92]) For $\mathbb {G}_{\varvec{k}^{\pm }}$, the corresponding weighted asymmetric graph Laplacian is defined by

$$\begin{aligned} \mathcal {L}_{\varvec{\theta }}:=\mathbb {B}\left[ \textrm{diag}[\varvec{k}^{+}] (\mathbb {B}^{+})^{T}-\textrm{diag}[\varvec{k}^{-}] (\mathbb {B}^{-})^{T} \right] , \end{aligned}$$

(5)

where $\varvec{\theta }:=(\varvec{k}^{+},\varvec{k}^{-})$ and $\textrm{diag}[\varvec{k}^{+}]$ is the diagonal matrix whose diagonal elements are $\varvec{k}^{+}$. Using $\mathcal {L}_{\varvec{\theta }}$, Eq. 2 and Eq. 3 are represented as

$$\begin{aligned} \dot{\varvec{x}}&=-\mathbb {B}\varvec{j}(\varvec{x})=-\mathbb {B}\left[ \textrm{diag}[\varvec{k}^{+}] (\mathbb {B}^{+})^{T}-\textrm{diag}[\varvec{k}^{-}] (\mathbb {B}^{-})^{T} \right] \varvec{x}=-\mathcal {L}_{\varvec{\theta }}\varvec{x}. \end{aligned}$$

(6)

The operator $\mathcal {L}_{\varvec{\theta }}$ is reduced to the weighted symmetric graph Laplacian if $\varvec{k}^{+}=\varvec{k}^{-}$ and also to the conventional graph Laplacian if $\varvec{k}^{+}=\varvec{k}^{-}=\varvec{1}$ [91, 92]. Equation 6 can also cover linear transport on graphs, a class of linear electric circuits [93], consensus dynamics on graphs [94], and other linear dynamics on graphs [86, 95].^{Footnote 11}
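The reduction to the conventional graph Laplacian can be verified numerically, since $(\mathbb {B}^{+})^{T}-(\mathbb {B}^{-})^{T}=\mathbb {B}^{T}$ gives $\mathcal {L}_{\varvec{\theta }}=\mathbb {B}\mathbb {B}^{T}$ for unit rates. The following sketch assumes NumPy; the function name `asymmetric_laplacian` and the example graph and rates are illustrative.

```python
import numpy as np

def asymmetric_laplacian(B, k_plus, k_minus):
    """Weighted asymmetric graph Laplacian of Eq. 5."""
    B_plus, B_minus = np.maximum(B, 0), np.maximum(-B, 0)
    return B @ (np.diag(k_plus) @ B_plus.T - np.diag(k_minus) @ B_minus.T)

B = np.array([[ 1,  0, -1],
              [-1,  1,  0],
              [ 0, -1,  1]])  # oriented 3-cycle (illustrative)

# Unit rates: the operator coincides with the conventional Laplacian B B^T.
ones = np.ones(3)
L_unit = asymmetric_laplacian(B, ones, ones)

# General rates (assumed values): the operator is no longer symmetric.
L_general = asymmetric_laplacian(B, np.array([2.0, 1.0, 0.5]),
                                 np.array([1.0, 3.0, 1.0]))
print(L_unit)
print(L_general)
```

In both cases the columns of the matrix sum to zero, which is the matrix form of the conservation of the total density under Eq. 6.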

2.2 Chemical reaction network and polynomial dynamical systems on hypergraphs

Next, we introduce a class of nonlinear dynamics on hypergraphs, which includes the rLDG (Eq. 2 and Eq. 3) as a special case. The most common instance is deterministic chemical reaction networks (CRN) with the law of mass action (LMA) kinetics [7, 8, 45, 96], and this class is sometimes referred to as polynomial dynamical systems (PDS). Because the major part of the PDS theory has been developed for CRN, we use CRN to introduce and specify this class in this work.

Definition 4

(Reversible edge-weighted CRN hypergraph $\mathbb {H}_{\varvec{k}^{\pm }}$) The reversible CRN hypergraph $\mathbb {H}$ consists of a finite number of vertices $\{\mathbb {X}_{i}\}_{i\in [1,N_{\mathbb {X}}]}$ and hyperedges $\{\mathbb {e}_{e}\}_{e\in [1,N_{\mathbb {e}}]}$ where $N_{\mathbb {X}}, N_{\mathbb {e}} \in \mathbb {Z}_{>0}$ (Fig. 2b). Each hyperedge $\mathbb {e}_{e}$ connects two different hypervertices $\hat{\mathbb {v}}^{+}_{e}$ and $\hat{\mathbb {v}}^{-}_{e}$ where $\hat{\mathbb {v}}^{+}_{e} \ne \hat{\mathbb {v}}^{-}_{e}$.^{Footnote 12} The hypervertices are multisets of vertices $\{\mathbb {X}_{i}\}_{i\in [1,N_{\mathbb {X}}]}$, each of which is defined as $\hat{\mathbb {v}}_{\ell }=\sum _{i=1}^{N _{\mathbb {X}}}\gamma _{i,\ell }\mathbb {X}_{i}$ where $\gamma _{i,\ell }\in \mathbb {Z}_{\ge 0}$ is the number of the ith vertex included in the $\ell $th hypervertex.^{Footnote 13} Thus, the nonnegative integer vector $\varvec{\gamma }_{\ell }:=(\gamma _{1,\ell }, \cdots , \gamma _{N_{\mathbb {X}},\ell })^{T}\in \mathbb {Z}_{\ge 0}^{N_{\mathbb {X}}}$ defines the $\ell $th hypervertex. Let $N_{\hat{\mathbb {v}}} \in \mathbb {Z}_{>0}$ be the total number of the hypervertices and $\Gamma :=(\varvec{\gamma }_{1},\cdots ,\varvec{\gamma }_{N_{\hat{\mathbb {v}}}})\in \mathbb {Z}_{\ge 0}^{N_{\mathbb {X}}\times N_{\hat{\mathbb {v}}}}$ be the hypervertex matrix. The matrix $\mathbb {B}\in \{0,\pm 1\}^{N_{\hat{\mathbb {v}}}\times N_{\mathbb {e}}}$ is the incidence matrix encoding the incidence relations among the hypervertices and the hyperedges. The hypergraph incidence matrix $\mathbb {S}\in \mathbb {Z}^{N_{\mathbb {X}} \times N_{\mathbb {e}}}$ is then defined as

$$\begin{aligned} \mathbb {S}:=\Gamma \mathbb {B}. \end{aligned}$$

(7)

If $\Gamma =I$ where $I$ is the identity matrix, then $\mathbb {H}$ is reduced to $\mathbb {G}=(\{\mathbb {v}_{\ell }\}_{\ell \in [1,N_{\mathbb {X}}]},\{\mathbb {e}_{e}\}_{e\in [1,N_{\mathbb {e}}]},\mathbb {B})$ where $\mathbb {v}_{\ell }=\mathbb {X}_{\ell }$. An edge-weighted CRN hypergraph $\mathbb {H}_{\varvec{k}^{\pm }}$ has forward and reverse rates $k_{e}^{\pm }> 0$ as the weights of edge $\mathbb {e}_{e}$.

In the context of CRN theory, the vertices $\{\mathbb {X}_{i}\}$ correspond to the molecular species involved in a CRN, and each hyperedge $\mathbb {e}_{e}$ represents a pair of forward and reverse reactions:

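$$\begin{aligned} \gamma ^{+}_{1,e}\mathbb {X}_{1}+\cdots +\gamma ^{+}_{N_{\mathbb {X}},e}\mathbb {X}_{N_{\mathbb {X}}} \rightleftharpoons \gamma ^{-}_{1,e}\mathbb {X}_{1}+\cdots +\gamma ^{-}_{N_{\mathbb {X}},e}\mathbb {X}_{N_{\mathbb {X}}}, \end{aligned}$$

(8)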

where the forward and reverse reactions are from left to right and from right to left, respectively. Head and tail hypervertices $\hat{\mathbb {v}}_{e}^{+}:=(\gamma ^{+}_{1,e}\mathbb {X}_{1}+\cdots +\gamma ^{+}_{N_{\mathbb {X}},e}\mathbb {X}_{N_{\mathbb {X}}})$ and $\hat{\mathbb {v}}_{e}^{-}:=(\gamma ^{-}_{1,e}\mathbb {X}_{1}+\cdots +\gamma ^{-}_{N_{\mathbb {X}},e}\mathbb {X}_{N_{\mathbb {X}}})$ in Eq. 8 are the sets of reactants and products of the eth forward reaction, respectively. More specifically, $\gamma ^{+}_{i,e}\in \mathbb {Z}_{\ge 0}$ and $\gamma ^{-}_{i,e}\in \mathbb {Z}_{\ge 0}$ are the numbers of the molecule $\mathbb {X}_{i}$ involved as the reactants and products of the eth forward reaction, respectively. For the reverse reaction, $\hat{\mathbb {v}}_{e}^{-}$ and $\hat{\mathbb {v}}_{e}^{+}$ are the reactants and products. Some head and tail hypervertices may be shared among different reactions (hyperedges) as in Fig. 2b. As a result, $\{\hat{\mathbb {v}}_{\ell }\}_{\ell \in [1,N_{\hat{\mathbb {v}}}]}$ is the union of the head and tail hypervertices, $\{\hat{\mathbb {v}}_{\ell }\}_{\ell \in [1,N_{\hat{\mathbb {v}}}]}=\bigcup _{e \in [1,N_{\mathbb {e}}]}\{\hat{\mathbb {v}}_{e}^{+}, \hat{\mathbb {v}}_{e}^{-}\}$.

The hypervertices are called complexes in CRN theory [8]^{Footnote 14}. From $\{\gamma ^{+}_{i,e}\}$ and $\{\gamma ^{-}_{i,e}\}$, we can define

$$\begin{aligned} \varvec{s}_{e}:=(\gamma ^{+}_{1,e}-\gamma ^{-}_{1,e},\cdots , \gamma ^{+}_{N_{\mathbb {X}},e}-\gamma ^{-}_{N_{\mathbb {X}},e})^{T}\in \mathbb {Z}^{N_{\mathbb {X}}}, \end{aligned}$$

(9)

where $\mp \varvec{s}_{e}$ specify the change in the number of molecules induced when the eth forward and reverse reaction occurs just once, respectively. The hypergraph incidence matrix $\mathbb {S}$ defined in Eq. 7 is represented as $\mathbb {S}= (\varvec{s}_{1},\cdots , \varvec{s}_{N_{\mathbb {e}}})\in \mathbb {Z}^{N_{\mathbb {X}} \times N_{\mathbb {e}}}$. In chemistry, the negative of $\varvec{s}_{e}$ and $\mathbb {S}$, i.e., $-\varvec{s}_{e}$ and $-\mathbb {S}$, are called the stoichiometric vector and matrix, respectively [8].

Remark 1

To define a reversible CRN hypergraph, the hypergraph incidence matrix $\mathbb {S}$ is not sufficient. If the head and tail hypervertices of a hyperedge contain the same vertex (molecule), the corresponding element in $\mathbb {S}$ of such a shared vertex becomes 0 by canceling out. Thus, the existence of shared vertices (molecules) is invisible in $\mathbb {S}$, and the pair $(\Gamma ,\mathbb {B})$ of the hypervertex matrix and the incidence matrix is required to define $\mathbb {H}$. Such shared molecules are called catalysts in CRN.
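The cancellation described in Remark 1 can be seen in a two-line computation. The sketch below assumes NumPy and uses a hypothetical pair of reactions over three species: an uncatalyzed reaction $\mathbb {X}_{1}\rightleftharpoons \mathbb {X}_{2}$ and a catalyzed one, $\mathbb {X}_{1}+\mathbb {X}_{3}\rightleftharpoons \mathbb {X}_{2}+\mathbb {X}_{3}$. Both yield the same vector $\varvec{s}_{e}$ of Eq. 9 although their hypervertices differ.

```python
import numpy as np

# Reactant (gamma^+) and product (gamma^-) multiplicities over (X1, X2, X3).
# Both reactions are hypothetical examples chosen to illustrate Remark 1.
gamma_plus_plain = np.array([1, 0, 0])   # X1
gamma_minus_plain = np.array([0, 1, 0])  # X2
gamma_plus_cat = np.array([1, 0, 1])     # X1 + X3
gamma_minus_cat = np.array([0, 1, 1])    # X2 + X3

s_plain = gamma_plus_plain - gamma_minus_plain  # Eq. 9
s_cat = gamma_plus_cat - gamma_minus_cat        # Eq. 9: the X3 entries cancel

print(s_plain, s_cat)  # [ 1 -1  0] [ 1 -1  0]
```

The two reactions are nevertheless dynamically distinct: any flux that depends on the reactant multiplicities would involve $x_{3}$ only in the catalyzed case.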

For a CRN hypergraph, the continuity equation for CRN is defined:

Definition 5

(CRN continuity equation) Let a vector of nonnegative densities $\varvec{x}=(x_{1},\cdots ,x_{N_{\mathbb {X}}})^{T}\in \mathbb {R}_{\ge 0}^{N_{\mathbb {X}}}$ represent the concentration of molecules $\{\mathbb {X}_{i}\}$. The CRN continuity equation is defined as

$$\begin{aligned} \dot{\varvec{x}}=-\mathbb {S}\varvec{j}(\varvec{x})= -\textrm{div}_{\mathbb {S}} \varvec{j}(\varvec{x}), \end{aligned}$$

(10)

where $j_{e}^{+}(\varvec{x})\in \mathbb {R}_{\ge 0}$ and $j_{e}^{-}(\varvec{x})\in \mathbb {R}_{\ge 0}$ are the one-way fluxes of the eth forward and reverse reactions, $\varvec{j}^{\pm }(\varvec{x}):=(j_{1}^{\pm }(\varvec{x}),\cdots ,j_{N_{\mathbb {e}}}^{\pm }(\varvec{x}))^{T}\in \mathbb {R}_{\ge 0}^{N_{\mathbb {e}}}$ are their vector representations, and $\varvec{j}(\varvec{x}):=\varvec{j}^{+}(\varvec{x})-\varvec{j}^{-}(\varvec{x})\in \mathbb {R}^{N_{\mathbb {e}}}$ is the total reaction flux [7, 8, 96]. The hypergraph divergence operator $\textrm{div}_{\mathbb {S}} :=\mathbb {S}$ is defined accordingly.

To define the dynamics of a CRN, the functional form of $j_{e}^{\pm }(\varvec{x})$ is required.^{Footnote 15} Before introducing specific forms, we define two important properties of the fluxes and also other functions defined on edges:

Definition 6

(Consistency of fluxes $\varvec{j}^{\pm }(\varvec{x})$ with hypergraph $\mathbb {H}$) One-way fluxes $\varvec{j}^{\pm }(\varvec{x})$ are consistent with the hypergraph $\mathbb {H}$ if, for all $e\in [1,N_{\mathbb {e}}]$, $j^{\pm }_{e}(\varvec{x})$ becomes 0 when $x_{i}=0$ where $\mathbb {X}_{i}$ is any reactant of $j^{\pm }_{e}(\varvec{x})$, respectively. In other words, $j^{\pm }_{e}(\varvec{x})$ satisfies $\gamma _{i,e}^{\pm }j_{e}^{\pm }(\varvec{x})=0$ if $x_{i}=0$ for any $i\in [1,N_{\mathbb {X}}]$.

Definition 7

(Locality of function on edges over $\mathbb {H}$) A vector function $\varvec{g}(\varvec{x})\in \mathbb {R}^{N_{\mathbb {e}}}$ defined on edges is local on $\mathbb {H}$ if, for all $e\in [1,N_{\mathbb {e}}]$, $g_{e}(\varvec{x})$ is a function only of the elements of $\varvec{x}$ incident to the edge $\mathbb {e}_{e}$ on $\mathbb {H}$, i.e., $g_{e}(\varvec{x})=g_{e}(\bar{\gamma }_{1,e}^{+} x_{1},\cdots , \bar{\gamma }_{N_{\mathbb {X}},e}^{+}x_{N_{\mathbb {X}}},\bar{\gamma }_{1,e}^{-} x_{1},\cdots , \bar{\gamma }_{N_{\mathbb {X}},e}^{-} x_{N_{\mathbb {X}}})$ where $\bar{\gamma }_{i,e}^{\pm }:=\min [1, \gamma _{i,e}^{\pm }] \in \{0,1\}$.

The consistency condition is indispensable to prohibit a reaction that can decrease $x_{i}$ from occurring when $x_{i}=0$. For $\varvec{j}^{\pm }(\varvec{x})$, the locality means that the fluxes of the eth reaction depend only on the concentrations of their reactants and products. The local flux is determined solely by the information stored on the vertices incident to the edge and plays a crucial role when we regard the structure introduced in this work as an extension of differential forms on continuous manifolds to graphs and hypergraphs. When we work on specific forms of fluxes in this work, we consider only local fluxes consistent with the given hypergraph $\mathbb {H}$.

In chemistry, we have a variety of candidates for the functional form of flux, e.g., the Michaelis-Menten function, Hill’s function, and others [7, 97]. Among others, the LMA kinetics is the most basic and well-established one.

Definition 8

(Waage–Guldberg’s law of mass action kinetics (LMA kinetics)) A CRN follows the LMA kinetics if, for all $e\in [1,N_{\mathbb {e}}]$, the eth forward and reverse reaction fluxes are represented as

$$\begin{aligned} j^{\pm }_{e}(\varvec{x})=k^{\pm }_{e} \prod _{j=1}^{N_{\mathbb {X}}}x_{j}^{\gamma ^{\pm }_{j,e}}=k^{\pm }_{e} \sum _{\ell =1}^{N_{\hat{\mathbb {v}}}}b_{\ell ,e}^{\pm }\prod _{j=1} ^{N_{\mathbb {X}}}x_{j}^{\gamma _{j,\ell }}, \end{aligned}$$

(11)

where $k_{e}^{+}\in \mathbb {R}_{> 0}$ and $k_{e}^{-}\in \mathbb {R}_{> 0}$ are the reaction rate constants of the eth forward and reverse reactions, respectively. The fluxes under LMA kinetics can be compactly represented as

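$$\begin{aligned} \varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x})=\varvec{k}^{\pm }\circ (\mathbb {B}^{\pm })^{T}\varvec{x}^{\Gamma ^{T}}, \end{aligned}$$

(12)

where, for the hypervertex matrix $\Gamma $ with columns $\varvec{\gamma }_{\ell }$, $\varvec{x}^{\Gamma ^{T}}:=(\varvec{x}^{\varvec{\gamma }_{1}},\cdots ,\varvec{x}^{\varvec{\gamma }_{N_{\hat{\mathbb {v}}}}})^{T}\in \mathbb {R}_{\ge 0}^{N_{\hat{\mathbb {v}}}}$ and $\varvec{x}^{\varvec{\gamma }_{\ell }}:=\prod _{j=1}^{N_{\mathbb {X}}}x_{j}^{\gamma _{j,\ell }}$.^{Footnote 16} We use the subscript $\textrm{MA}$ as in $\varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x})$ to discriminate this specific form of the fluxes from others. We can easily observe that $\varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x})$ is consistent and local with respect to $\mathbb {H}$. Furthermore, $\varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x})$ is specified by the edge-weighted CRN hypergraph $\mathbb {H}_{\varvec{k}^{\pm }}$.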
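The LMA fluxes of Eq. 11 can be evaluated from a matrix whose $\ell $th column is $\varvec{\gamma }_{\ell }$ (written `Gamma` below) together with the incidence matrix. The following is a minimal sketch assuming NumPy; the helper names `monomials` and `mass_action_fluxes` and the single reaction $2\mathbb {X}_{1}\rightleftharpoons \mathbb {X}_{2}$ with its rate values are hypothetical.

```python
import numpy as np

def monomials(x, Gamma):
    """Vector whose l-th entry is prod_j x_j ** Gamma[j, l]."""
    return np.prod(x[:, None] ** Gamma, axis=0)

def mass_action_fluxes(x, Gamma, B, k_plus, k_minus):
    """One-way LMA fluxes of Eq. 11 for all hyperedges."""
    m = monomials(x, Gamma)
    j_plus = k_plus * (np.maximum(B, 0).T @ m)
    j_minus = k_minus * (np.maximum(-B, 0).T @ m)
    return j_plus, j_minus

# Hypothetical single reaction 2 X1 <-> X2 with hypervertices (2 X1, X2).
Gamma = np.array([[2, 0],
                  [0, 1]])
B = np.array([[1], [-1]])
k_plus, k_minus = np.array([0.5]), np.array([2.0])

j_plus, j_minus = mass_action_fluxes(np.array([3.0, 5.0]), Gamma, B, k_plus, k_minus)
print(j_plus, j_minus)  # [4.5] [10.]

# Consistency (Definition 6): the forward flux vanishes when its reactant is absent.
j_plus_empty, _ = mass_action_fluxes(np.array([0.0, 5.0]), Gamma, B, k_plus, k_minus)
print(j_plus_empty)     # [0.]
```

The sketch relies on the NumPy convention `0.0 ** 0 == 1.0`, so species that do not take part in a hypervertex contribute a factor of one even at zero concentration.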

Remark 2

(Algebraic aspect of LMA kinetics) Because $\varvec{x}^{\Gamma ^{T}}$, whose $\ell $th component is $\prod _{j=1}^{N_{\mathbb {X}}}x_{j}^{\gamma _{j,\ell }}$, is a vector of monomials of $\varvec{x}$, each one-way flux, $j^{\pm }_{e}(\varvec{x})$, is a monomial of $\varvec{x}$ under Eq. 12 and thus the total flux $j_{e}(\varvec{x})=j^{+}_{e}(\varvec{x})-j^{-}_{e}(\varvec{x})$ is a binomial. This fact links the real algebraic geometry of toric varieties [98, 99] to CRN [84, 100] as it does in algebraic statistics [48, 101].

Remark 3

(Extended LMA kinetics) While we mainly work on the normal LMA kinetics, we can extend it. The extended LMA kinetics on $\mathbb {H}$ is defined as

$$\begin{aligned} \varvec{j}^{\pm }(\varvec{x})=\varvec{g}(\varvec{x})\circ \varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x}), \end{aligned}$$

(13)

where $\varvec{g}(\varvec{x}) \in \mathbb {R}_{> 0}^{N_{\mathbb {e}}}$ is local with respect to $\mathbb {H}$.^{Footnote 17} An example of the extended LMA kinetics is reversible Michaelis-Menten kinetics [103].

By combining the continuity equation (Eq. 10) and the LMA kinetics (Eq. 12), we have the following chemical rate equation:

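$$\begin{aligned} \dot{\varvec{x}}=-\mathbb {S}\varvec{j}_{\textrm{MA}}(\varvec{x})=-\Gamma \mathbb {B}\left[ \textrm{diag}[\varvec{k}^{+}] (\mathbb {B}^{+})^{T}-\textrm{diag}[\varvec{k}^{-}] (\mathbb {B}^{-})^{T} \right] \varvec{x}^{\Gamma ^{T}}=-\Gamma \mathcal {L}_{\varvec{\theta }}\varvec{x}^{\Gamma ^{T}}, \end{aligned}$$

(14)

where $\mathcal {L}_{\varvec{\theta }}$ is the weighted asymmetric graph Laplacian defined as in Eq. 5. Now, we can see that CRN contains rLDG (Eq. 6) as a special case if the hypervertex matrix satisfies $\Gamma =I$. Owing to this inclusion relation, CRN with LMA kinetics is a mathematically sound generalization of rLDG. Because LDG has been used in various fields of social science, network science, machine learning, and so on, CRN theory is potentially important for extending the results there.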
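The inclusion of rLDG in CRN can be checked numerically. The sketch below assumes NumPy and writes the rate equation as $\dot{\varvec{x}}=-\Gamma \mathcal {L}_{\varvec{\theta }}\varvec{m}(\varvec{x})$, where `Gamma` holds the vectors $\varvec{\gamma }_{\ell }$ as columns and $\varvec{m}(\varvec{x})$ is the vector of monomials $\prod _{j}x_{j}^{\gamma _{j,\ell }}$; this matrix form, the function name `crn_rate`, and the example values are assumptions made for illustration. With `Gamma` equal to the identity, the result coincides with the linear dynamics of Eq. 6.

```python
import numpy as np

def crn_rate(x, Gamma, B, k_plus, k_minus):
    """Rate equation x_dot = -Gamma L m(x), with L the Laplacian of Eq. 5."""
    L = B @ (np.diag(k_plus) @ np.maximum(B, 0).T
             - np.diag(k_minus) @ np.maximum(-B, 0).T)
    m = np.prod(x[:, None] ** Gamma, axis=0)  # monomials prod_j x_j ** Gamma[j, l]
    return -Gamma @ L @ m

B = np.array([[ 1,  0, -1],
              [-1,  1,  0],
              [ 0, -1,  1]])         # oriented 3-cycle (illustrative)
k_plus = np.array([2.0, 1.0, 0.5])   # assumed rates
k_minus = np.array([1.0, 3.0, 1.0])
x = np.array([1.0, 2.0, 4.0])

# Identity hypervertex matrix: every hypervertex is a single vertex.
x_dot_crn = crn_rate(x, np.eye(3, dtype=int), B, k_plus, k_minus)

# Linear dynamics of Eq. 2 and Eq. 3 on the same graph.
j = k_plus * (np.maximum(B, 0).T @ x) - k_minus * (np.maximum(-B, 0).T @ x)
x_dot_linear = -B @ j

print(x_dot_crn, x_dot_linear)  # both give [  1.  10. -11.]
```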

Example 1

(Simplified Brusselator CRN [8, 104]) The Brusselator is a representative CRN, which can generate non-trivial dynamic behaviors such as oscillations. We use a reversible CRN version of the simplified Brusselator [8, 104], whose CRN-hypergraph depicted in Fig. 2b has the following structural information:

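$$\begin{aligned} \emptyset \rightleftharpoons \mathbb {X}_{1},\qquad \mathbb {X}_{1} \rightleftharpoons \mathbb {X}_{2},\qquad 2\mathbb {X}_{1}+\mathbb {X}_{2} \rightleftharpoons 3\mathbb {X}_{1}. \end{aligned}$$

(15)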

The rate equation (Eq. 14) can be represented as

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\begin{pmatrix}x_{1}\\ x_{2}\end{pmatrix}&=- \overbrace{\begin{pmatrix} -1 &{} +1 &{} -1 \\ 0 &{} -1 &{}+1 \end{pmatrix}}^{\mathbb {S}} \left[ \overbrace{ \begin{pmatrix} k_{1}^{+} \\ k_{2}^{+}x_{1} \\ k_{3}^{+} x_{1}^{2}x_{2} \end{pmatrix}}^{\varvec{j}^{+}(\varvec{x})} - \overbrace{\begin{pmatrix} k_{1}^{-}x_{1} \\ k_{2}^{-}x_{2} \\ k_{3}^{-} x_{1}^{3} \end{pmatrix}}^{\varvec{j}^{-}(\varvec{x})} \right] . \end{aligned}$$

(16)
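As a minimal numerical sketch of Eq. 16, the following Python snippet evaluates its right-hand side at a given state. The rate constants and states below are arbitrary illustrative values (assumptions, not taken from the text), and the function names are hypothetical.

```python
# Right-hand side of the simplified Brusselator rate equation (Eq. 16).
# Rate constants and states are arbitrary illustrative values (assumptions).

S = [[-1, +1, -1],
     [ 0, -1, +1]]  # incidence matrix as it appears in Eq. 16

def forward_flux(x, kp):
    """j^+(x): one monomial per reaction."""
    x1, x2 = x
    return [kp[0], kp[1] * x1, kp[2] * x1**2 * x2]

def backward_flux(x, km):
    """j^-(x): one monomial per reaction."""
    x1, x2 = x
    return [km[0] * x1, km[1] * x2, km[2] * x1**3]

def rate(x, kp, km):
    """dx/dt = -S (j^+(x) - j^-(x))."""
    j = [a - b for a, b in zip(forward_flux(x, kp), backward_flux(x, km))]
    return [-sum(S[i][e] * j[e] for e in range(3)) for i in range(2)]

kp = [1.0, 1.0, 1.0]  # assumed forward rate constants
km = [1.0, 1.0, 1.0]  # assumed backward rate constants

dxdt_eq = rate([1.0, 1.0], kp, km)   # every net flux vanishes at this state
dxdt_off = rate([2.0, 1.0], kp, km)  # a state away from that equilibrium
```

With these assumed constants, every net flux $j_{e}$ vanishes at $\varvec{x}=(1,1)^{T}$, so the state is stationary, whereas $\varvec{x}=(2,1)^{T}$ gives a nonzero velocity.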

2.3 Fokker-Planck equations

While our main focus is the dynamics on graphs and hypergraphs, we use FPE as a representative class of density dynamics on a continuous Euclidean space. Specifically, we use FPE only to demonstrate the relation of our results with previous ones obtained for FPE in various contexts. Because FPE is infinite-dimensional, we treat it here only formally.

Let $\varvec{r}\in \mathbb {R}^{d}$ be a vector in a $d$-dimensional Euclidean space. We consider infinitely many noninteracting particles randomly walking in the space and describe the dynamics by a probability density $p_{t}(\varvec{r})\in \mathbb {R}_{\ge 0}$ of the particles. The continuity equation for $p_{t}(\varvec{r})$ is

$$\begin{aligned} \partial _{t}p_{t}(\varvec{r})&=-\nabla \cdot \varvec{j}_{\textrm{FP}}[p_{t}(\varvec{r})] \end{aligned}$$

(17)

where $\varvec{j}_{\textrm{FP}}[p_{t}(\varvec{r})]\in \mathbb {R}^{d}$ is the probability flux, $\nabla :=(\partial /\partial r_{1}, \cdots , \partial /\partial r_{d})^{T}$ is the gradient operator on the Euclidean space, and $(\nabla \cdot ): \nabla \cdot \varvec{F}(\varvec{r}):=\sum _{i=1}^{d}\partial F_{i}(\varvec{r})/\partial r_{i} \in \mathbb {R}$ is the divergence. The flux of the FPE is defined as

$$\begin{aligned} \varvec{j}_{\textrm{FP}}[p(\varvec{r})]&= \left[ \varvec{F}(\varvec{r})p(\varvec{r}) - D_{0} \nabla p(\varvec{r})\right] , \end{aligned}$$

(18)

where $\varvec{F}(\varvec{r})\in \mathbb {R}^{d}$ is the drift force, and $D_{0} \in \mathbb {R}_{>0}$ is the diffusion constant.
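As a quick sanity check of Eq. 18, consider an assumed one-dimensional example ($d=1$) with the linear drift $F(r)=-r$; this choice is for illustration only and is not taken from the text. For the Gaussian density with variance $D_{0}$, the drift and diffusion contributions cancel exactly:

$$\begin{aligned} p(r)&=\frac{1}{\sqrt{2\pi D_{0}}}\exp \left[ -\frac{r^{2}}{2D_{0}}\right] ,&\nabla p(r)&=-\frac{r}{D_{0}}p(r),\\ j_{\textrm{FP}}[p(r)]&=-r\,p(r)-D_{0}\left( -\frac{r}{D_{0}}p(r)\right) =0. \end{aligned}$$

Because the flux vanishes everywhere, the right-hand side of Eq. 17 is zero and this density is stationary.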

3 Discrete calculus and homological algebra of graphs and hypergraphs

The algebraic and topological structure of the dynamics on graphs and hypergraphs can be treated explicitly and abstractly using the language of discrete calculus and homological algebra. The discrete versions of the gradient and divergence mentioned in Sect. 2 are also characterized in this language. In this section, we briefly introduce the chain and cochain complexes defined for a finite graph or hypergraph, together with discrete calculus [73, 91, 105, 106]. We first introduce the complexes for a graph $\mathbb {G}$ and then extend them to a hypergraph $\mathbb {H}$ algebraically.^{Footnote 18} It should be noted that the conventional discrete calculus (the discrete version of the theory of differential forms) presumes a Riemannian metric in the dual space of chains and cochains or that of cochains on primal and dual complexes [107, 108]. However, we will introduce Legendre duality instead. For this purpose, our introduction of chain and cochain complexes depends only on the topological (algebraic) information of the underlying graph and hypergraph [73] without specifying the metric information.

3.1 Chain and cochain complexes on graphs

The elements of a graph $\mathbb {G}$ are called cells in discrete calculus.^{Footnote 19} A vertex and an edge are, respectively, called a 0-cell and a 1-cell, and the graph $\mathbb {G}$ is regarded as a cell complex.^{Footnote 20} For each type of cell, we consider vectors (chains and cochains) defined on the cells. For $\mathbb {G}$, a 0-chain with field $\mathbb {R}$ is an $N_{\mathbb {v}}$-tuple of real scalars, each of which is assigned to a vertex, i.e., a 0-cell. Thus, a 0-chain is a real vector defined on the vertices of $\mathbb {G}$ with the basis $\{\mathbb {v}_{i}\}$. This basis is called the standard basis. The vector space of real 0-chains is called the vertex space here and denoted as $C_{0}(\mathbb {G})=\mathbb {R}^{N_{\mathbb {v}}}$ [91].^{Footnote 21} The components of the vector $\varvec{x} \in C_{0}(\mathbb {G})$ are given as $\varvec{x}(\mathbb {v}_{i}):=x_{i}$. Similarly, a real 1-chain is a real vector defined on the edges of $\mathbb {G}$. The real vector space of 1-chains is called the edge space and denoted as $C_{1}(\mathbb {G})=\mathbb {R}^{N_{\mathbb {e}}}$. The standard basis is introduced accordingly by using the edges $\{\mathbb {e}_{e}\}$. A flux $\varvec{j}$ is a 1-chain: $\varvec{j}(\mathbb {e}_{e}):=j_{e}$. The graph incidence matrix $\mathbb {B}$ induces the discrete differential $\delta _{1}: C_{1}(\mathbb {G}) \rightarrow C_{0}(\mathbb {G})$ as $\delta _{1}\varvec{j}:=\mathbb {B}\varvec{j}$.^{Footnote 22}

To obtain an exact sequence, we algebraically define the $(-1)$-chains and 2-chains and the corresponding differentials $\delta _{0}$ and $\delta _{2}$. Let $C_{2}(\mathbb {G})=\mathbb {R}^{N_{\mathbb {z}}}$ where $N_{\mathbb {z}}=\textrm{dim}[\textrm{Ker}\mathbb {B}]$ and $\{\varvec{v}_{i}\}_{i\in [1,N_{\mathbb {z}}]}$ is a basis of $\textrm{Ker}\mathbb {B}$ where $\varvec{v}_{i} \in \{0,+1,-1\}^{N_{\mathbb {e}}}$^{Footnote 23}. In algebraic graph theory, $\textrm{Ker}\mathbb {B}$ is called the cycle subspace [86, 91, 109]. For a graph $\mathbb {G}$, we can construct $\{\varvec{v}_{i}\}_{i\in [1,N_{\mathbb {z}}]}$ by, for example, using the fundamental cycle basis of $\mathbb {G}$ obtained from a fixed spanning tree of $\mathbb {G}$^{Footnote 24} [86]. Thus, $C_{2}(\mathbb {G})$ is the vector space defined on the cycles of $\mathbb {G}$ and is isomorphic to the cycle subspace. We define a matrix, $\mathbb {V}:=(\varvec{v}_{1},\cdots , \varvec{v}_{N_{\mathbb {z}}})$^{Footnote 25}, and the differential $\delta _{2}: C_{2}(\mathbb {G}) \rightarrow C_{1}(\mathbb {G})$ as $\delta _{2}:=\mathbb {V}$. From the construction, $\mathbb {B}\mathbb {V}=\delta _{1}\delta _{2}=0$ and $\textrm{Im}[\delta _{2}]=\textrm{Ker}[\delta _{1}]$ hold. Similarly, let $C_{-1}(\mathbb {G})=\mathbb {R}^{N_{\mathbb {l}}}$ where $N_{\mathbb {l}}=\textrm{dim}[\textrm{Ker}\mathbb {B}^{T}]$ and $\{\varvec{u}_{\ell }\}_{\ell \in [1,N_{\mathbb {l}}]}$ is a basis of $\textrm{Ker}\mathbb {B}^{T}$ where $\varvec{u}_{\ell } \in \{0,+1,-1\}^{N_{\mathbb {v}}}$. The subspace $\textrm{Ker}\mathbb {B}^{T}$ is related to the connected components of $\mathbb {G}$, and $\varvec{u}_{\ell }$ can be chosen such that $u_{i,\ell }=+1$ if the $i$th vertex is included in the $\ell $th connected component and $u_{i,\ell }=0$ otherwise. Thus, $C_{-1}(\mathbb {G})$ is the vector space on the connected components.
From the matrix $\mathbb {U}:=(\varvec{u}_{1},\cdots , \varvec{u}_{N_{\mathbb {l}}})^{T}$, the differential $\delta _{0}: C_{0}(\mathbb {G}) \rightarrow C_{-1}(\mathbb {G})$ is defined as $\delta _{0}:=\mathbb {U}$. From the construction, $\mathbb {U}\mathbb {B}=\delta _{0}\delta _{1}=0$ and $\textrm{Im}[\delta _{1}]=\textrm{Ker}[\delta _{0}]$ hold. Then, we obtain the exact chain sequence^{Footnote 26}^{Footnote 27}:

$$\begin{aligned} 0 \xleftarrow {} C_{-1}(\mathbb {G})\xleftarrow {\delta _{0} =\mathbb {U}}C_{0}(\mathbb {G})\xleftarrow {\delta _{1} =\mathbb {B}}C_{1}(\mathbb {G})\xleftarrow {\delta _{2}=\mathbb {V}} C_{2}(\mathbb {G})\xleftarrow {}0. \end{aligned}$$

(19)

Because $C_{p}(\mathbb {G})$ is a vector space for each $p\in \{-1,0,1,2\}$, we can consider its dual vector space $C^{p}(\mathbb {G}):=C_{p}^{*}(\mathbb {G})$ consisting of the linear functionals on $C_{p}(\mathbb {G})$. An element of $C^{p}(\mathbb {G})$ is called a $p$-cochain. Let $\langle \cdot , \cdot \rangle : C_{p}(\mathbb {G}) \times C^{p}(\mathbb {G})\rightarrow \mathbb {R}$ be the standard bilinear pairing of a $p$-chain and a $p$-cochain defined with the standard basis. The transposes of $\mathbb {U}$, $\mathbb {B}$, and $\mathbb {V}$ induce the differentials between cochains as $\delta ^{-1}:=\mathbb {U}^{T}: C^{-1}(\mathbb {G})\rightarrow C^{0}(\mathbb {G})$, $\delta ^{0}:=\mathbb {B}^{T}: C^{0}(\mathbb {G})\rightarrow C^{1}(\mathbb {G})$, and $\delta ^{1}:=\mathbb {V}^{T}: C^{1}(\mathbb {G})\rightarrow C^{2}(\mathbb {G})$. The differentials $\delta ^{p}$ on cochains are the adjoints of the differentials $\delta _{p+1}$ on chains, and they induce the exact cochain sequence:

$$\begin{aligned} 0 \xrightarrow {} C^{-1}(\mathbb {G})\xrightarrow {\delta ^{-1} =\mathbb {U}^{T}}C^{0}(\mathbb {G})\xrightarrow {\delta ^{0} =\mathbb {B}^{T}}C^{1}(\mathbb {G})\xrightarrow {\delta ^{1} =\mathbb {V}^{T}}C^{2}(\mathbb {G})\xrightarrow {}0. \end{aligned}$$

(20)

Note that the definitions of chains, cochains, and differential operators are topological in the sense that they do not include any metric information.
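The construction above can be checked on a small example. The sketch below uses a hypothetical graph (a triangle with one pendant edge) whose orientation, cycle vector, and component vector are chosen by hand; the sign convention for $\mathbb {B}$ is an assumption, and the identities do not depend on it.

```python
# Chain and cochain complexes of a small oriented graph (illustrative example).
# Vertices 0..3; edges e0: 0->1, e1: 1->2, e2: 2->0, e3: 2->3.
# Assumed sign convention: +1 at the tail of an edge, -1 at its head.

def matmul(A, M):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * m for a, m in zip(row, col)) for col in zip(*M)] for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

B = [[+1,  0, -1,  0],   # rows: vertices, columns: edges
     [-1, +1,  0,  0],
     [ 0, -1, +1, +1],
     [ 0,  0,  0, -1]]

V = [[1], [1], [1], [0]]  # single cycle e0 + e1 + e2, so N_z = 1
U = [[1, 1, 1, 1]]        # one connected component, so N_l = 1

BV = matmul(B, V)         # delta_1 delta_2, expected to vanish
UB = matmul(U, B)         # delta_0 delta_1, expected to vanish

# Cochain side: the discrete curl of a discrete gradient vanishes.
y = [[3], [1], [4], [1]]                # an arbitrary 0-cochain (potential)
grad_y = matmul(transpose(B), y)        # delta^0 y = B^T y, a 1-cochain
curl_grad_y = matmul(transpose(V), grad_y)  # delta^1 delta^0 y
```

Here $\textrm{rank}\,\mathbb {B}=3$, so $N_{\mathbb {e}}=\textrm{rank}\,\mathbb {B}+N_{\mathbb {z}}$ and $N_{\mathbb {v}}=\textrm{rank}\,\mathbb {B}+N_{\mathbb {l}}$, as exactness requires.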

3.2 Chain and cochain complexes on hypergraphs

The definitions of chain and cochain complexes introduced above are algebraically extended to hypergraphs $\mathbb {H}$ simply by replacing the graph incidence matrix $\mathbb {B}$ with the hypergraph incidence matrix $\mathbb {S}$.

Definition 9

(Exact chain and cochain sequences on a hypergraph) The chain and cochain complexes on a hypergraph are defined by the following diagram:

$$\begin{aligned} 0&\xrightarrow {}&C^{-1}(\mathbb {H})&\xrightarrow {\delta ^{-1}=\mathbb {U}^{T}}&C^{0}(\mathbb {H})&\xrightarrow {\delta ^{0}=\mathbb {S}^{T}}&C^{1}(\mathbb {H})&\xrightarrow {\delta ^{1} =\mathbb {V}^{T}}&C^{2}(\mathbb {H})&\xrightarrow {}&0&\\ 0&\xleftarrow {}&C_{-1}(\mathbb {H})&\xleftarrow {\delta _{0}=\mathbb {U}}&C_{0} (\mathbb {H})&\xleftarrow {\delta _{1}=\mathbb {S}}&C_{1}(\mathbb {H})&\xleftarrow {\delta _{2}=\mathbb {V}}&C_{2}(\mathbb {H})&\xleftarrow {}&0&. \end{aligned}$$

where $C^{-1}(\mathbb {H})\simeq C_{-1}(\mathbb {H})\simeq \mathbb {R}^{N_{\mathbb {l}}}$, $C^{0}(\mathbb {H})\simeq C_{0}(\mathbb {H})\simeq \mathbb {R}^{N_{\mathbb {X}}}$, $C^{1}(\mathbb {H})\simeq C_{1}(\mathbb {H})\simeq \mathbb {R}^{N_{\mathbb {e}}}$, and $C^{2}(\mathbb {H})\simeq C_{2}(\mathbb {H})\simeq \mathbb {R}^{N_{\mathbb {z}}}$.

The bases, $\mathbb {V}$ and $\mathbb {U}$, are obtained as integral bases, i.e., the components of $\mathbb {V}$ and $\mathbb {U}$ can be chosen from $\mathbb {Z}$ because $\mathbb {S}$ is an integer-valued matrix.^{Footnote 28} As we will explain in Sect. 6 and Sect. 9, the meaning of $C_{2}(\mathbb {H})$ can be retained as the space on generalized cycles. The meaning of $C_{-1}(\mathbb {H})$ becomes the space of conserved quantities under the dynamics (Eq. 10).

3.3 Discrete calculus on graphs and hypergraphs

The $p$-cochain and $p$-chain introduced above are algebraic abstractions of the differential $p$-form and its Hodge dual on a differentiable manifold [73]. Accordingly, the discrete versions of the gradient, divergence, and curl are associated with the differentials (exterior derivatives).

Definition 10

(Discrete gradients, divergences, and curls) The discrete gradient is defined as $\textrm{grad}_{\mathbb {B}}:=\delta ^{0}=\mathbb {B}^{T}$ for $\mathbb {G}$ and also as $\textrm{grad}_{\mathbb {S}}:=\delta ^{0}=\mathbb {S}^{T}$ for $\mathbb {H}$. The adjoints of the gradients are defined with the corresponding adjoint differentials: $\textrm{grad}^{*}_{\mathbb {B}}:=\delta _{1}=\mathbb {B}$ and $\textrm{grad}^{*}_{\mathbb {S}}:=\delta _{1}=\mathbb {S}$. They are called discrete divergences and denoted also as $\textrm{div}_{\mathbb {B}}=\textrm{grad}^{*}_{\mathbb {B}}$ and $\textrm{div}_{\mathbb {S}}=\textrm{grad}^{*}_{\mathbb {S}}$.^{Footnote 29} The discrete curl and its adjoint are defined as $\textrm{curl}_{\mathbb {V}}:=\delta ^{1}=\mathbb {V}^{T}$ and $\textrm{curl}^{*}_{\mathbb {V}}:=\delta _{2}=\mathbb {V}$, respectively.

3.4 Linear graph Laplacian dynamics and metric structure in discrete calculus

In graph Laplacian theory, the space of $p$-chains is typically endowed with a metric matrix $M_{p}$ and its associated inner product for each $p$. We briefly outline this metric structure here to contrast it with the Legendre duality introduced later. For an edge-weighted graph $\mathbb {G}_{\varvec{k}^{\pm }}$ with symmetric weights $\varvec{k}^{+}=\varvec{k}^{-}=\varvec{k}\in \mathbb {R}_{>0}^{N_{\mathbb {e}}}$, the choices $M_{0}=I$ and $M_{1}=\textrm{diag}[1/\varvec{k}]$ are conventional. With these metric matrices, the graph Laplacian introduced in Eq. 5 can be written as

$$\begin{aligned} \mathcal {L}_{\varvec{k}}=\textrm{div}_{\mathbb {B}} M^{1} \textrm{grad}_{\mathbb {B}} M_{0} \end{aligned}$$

(21)

where $M^{p}:=M_{p}^{-1}$. By including such metric information, the following pair of metric gradient and divergence is often used in graph theory and network theory: $\textrm{grad}_{M}:=\sqrt{M^{1}}\mathbb {B}^{T}$ and $\textrm{div}_{M} :=\mathbb {B}\sqrt{M^{1}}$ where $\sqrt{M^{1}}:=\textrm{diag}[\sqrt{\varvec{k}}]$. This symmetric graph Laplacian $\mathcal {L}_{\varvec{k}}$ induces linear dynamics of $\varvec{x}\in \mathbb {R}^{N_{\mathbb {v}}}$ on the graph via Eq. 6^{Footnote 30}:

$$\begin{aligned} \dot{\varvec{x}}=-\mathcal {L}_{\varvec{k}}\varvec{x}. \end{aligned}$$

(22)

The eigenvalues and eigenvectors of $\mathcal {L}_{\varvec{k}}$ enable us to obtain spectral information of the underlying graph [92]. Even for nonlinear dynamics on a hypergraph as in Eq. 14, the same symmetric Laplacian can provide some information when $\varvec{k}^{+}=\varvec{k}^{-}=\varvec{k}$. We can also include other information in the metric matrices such as the degree of vertices [110]. Various normalizations of the graph Laplacian can be attributed to the choice of metrics.
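A minimal sketch of Eqs. 21 and 22 on a hypothetical triangle graph follows. With $M_{0}=I$ and $M^{1}=\textrm{diag}[\varvec{k}]$, Eq. 21 reads $\mathcal {L}_{\varvec{k}}=\mathbb {B}\,\textrm{diag}[\varvec{k}]\,\mathbb {B}^{T}$; the edge weights and the state below are arbitrary assumed values.

```python
# Symmetric graph Laplacian L_k = B diag(k) B^T (Eq. 21 with M_0 = I).
# Triangle graph with edges e0: 0->1, e1: 1->2, e2: 2->0 (illustrative).

B = [[+1,  0, -1],
     [-1, +1,  0],
     [ 0, -1, +1]]
k = [1, 2, 3]  # assumed symmetric edge weights k^+ = k^- = k

n_v, n_e = len(B), len(B[0])

# (L_k)_{ij} = sum_e B_{ie} k_e B_{je}
L = [[sum(B[i][e] * k[e] * B[j][e] for e in range(n_e)) for j in range(n_v)]
     for i in range(n_v)]

# Velocity of the linear dynamics dx/dt = -L_k x (Eq. 22) at one state.
x = [1, 0, 0]
dxdt = [-sum(L[i][j] * x[j] for j in range(n_v)) for i in range(n_v)]
```

The resulting matrix is symmetric with zero row sums, so the components of the velocity sum to zero and the total density is conserved for this connected graph.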

However, such a choice of metric matrices ends up only with linear dynamics on $\mathbb {R}^{N_{\mathbb {v}}}$ and is relevant only when the weighting is symmetric: $\varvec{k}^{+}=\varvec{k}^{-}=\varvec{k}$. In addition, it may not always capture important aspects of the density dynamics such as gradient flow properties and information–theoretic properties, because nonlinear terms such as $\ln \varvec{p}$ appear in information–theoretic quantities. To extend the class of dynamics being covered and to enable the information-geometric characterization of dynamics, we have to generalize the conventional inner product structure by replacing it with the Legendre dual structure induced by convex functions.

4 Dually flat spaces on vertices and edges and generalized flow

In this section, we introduce two pairs of dually flat spaces (Fig. 1): one is associated with the vertex spaces, i.e., the dual spaces of 0-chains and 0-cochains. The other corresponds to the edge spaces, i.e., the dual spaces of 1-chains and 1-cochains. By combining them, the dynamics on graphs and hypergraphs are characterized as a generalized flow.

4.1 Dually flat spaces on vertices and thermodynamic functions

We work on the density $\varvec{x}$ and the vertex space for CRN because its reduction to rLDG is straightforward. For a probability vector $\varvec{p}$, the introduction of dually flat spaces of $\varvec{p}$ and $\ln \varvec{p}$ is natural from the information-geometric viewpoint. In CRN, $\varvec{x}$ is the vector of concentrations of molecular species. As we recently clarified [48], the dually flat spaces, in this case, result from the Legendre duality between extensive and intensive variables in thermodynamics, which is also natural from the physical viewpoint.

Definition 11

(Density space (primal vertex affine space)) The density space (also called primal affine vertex space) is the positive orthant $\mathcal {X}:=\mathbb {R}_{>0}^{N_{\mathbb {X}}}$ of a vector space $\mathbb {R}^{N_{\mathbb {X}}}$, which is isomorphic to $C_{0}(\mathbb {H})$; $\mathbb {R}^{N_{\mathbb {X}}} \simeq C_{0}(\mathbb {H})$ (Fig. 1, lower left).

Remark 4

The density space $\mathcal {X}$ is defined as the positive orthant rather than as $\mathbb {R}_{\ge 0}^{N_{\mathbb {X}}}$. This excludes the cases where some elements of $\varvec{x}$ become 0. From the viewpoint of information geometry, this restriction is necessary to consider densities with the same support (all $\varvec{x}$ in $\mathcal {X}$ should be equivalent in terms of absolute continuity of measures). From the viewpoint of dynamical systems, depending on the specific functional form of the flux $\varvec{j}(\varvec{x})$, the trajectory $\varvec{x}(t)$ may not be restricted within $\mathcal {X}$. The property that $\varvec{x}(t) \in \mathcal {X}$ for $t\in [0,\infty )$ is known as persistence.^{Footnote 31} Without going into this intricate problem, we simply assume that $\varvec{x}(t) \in \mathcal {X}$ for $t\in [0,\infty )$. We call $\partial \mathcal {X}:=\mathbb {R}^{N_{\mathbb {X}}}_{\ge 0}{\setminus }\mathcal {X}$ the boundary of $\mathcal {X}$.

We define the dual of the density space by the Legendre transformation via the thermodynamic function:

Definition 12

(Primal thermodynamic function) A strictly convex differentiable function $\Phi : \mathcal {X}\rightarrow \mathbb {R}$ is called the primal thermodynamic function^{Footnote 32}^{Footnote 33} if the following two conditions are satisfied: (1) the associated Legendre transformation

$$\begin{aligned} \partial \Phi : \mathcal {X}&\rightarrow \mathbb {R}^{N_{\mathbb {X}}} \end{aligned}$$

(23)

$$\begin{aligned} \varvec{x}&\longmapsto \varvec{y} :=\partial _{\varvec{x}}\Phi (\varvec{x}) = \left( \frac{\partial \Phi (\varvec{x})}{\partial x_{1}}, \cdots , \frac{\partial \Phi (\varvec{x})}{\partial x_{N_{\mathbb {X}}}}\right) ^{T} \end{aligned}$$

(24)

has the image $\mathcal {Y}:=\left\{ \varvec{y}|\varvec{y}=\partial \Phi (\varvec{x}),\varvec{x}\in \mathcal {X}\right\} $ being equal to $\mathbb {R}^{N_{\mathbb {X}}}$, i.e., $\mathcal {Y}=\mathbb {R}^{N_{\mathbb {X}}}$; (2) for any $\varvec{x}_{in}\in \mathcal {X}$ and any point on the boundary $\varvec{x}_{bd}\in \partial \mathcal {X}$,

$$\begin{aligned} \lim _{\lambda \rightarrow +0}\frac{\textrm{d}\Phi (\varvec{x}_{\lambda })}{\textrm{d}\lambda } =- \infty \end{aligned}$$

(25)

holds where $\varvec{x}_{\lambda }:=\lambda \varvec{x}_{in} + (1-\lambda ) \varvec{x}_{bd}$ for $\lambda \in [0,1]$.

Definition 13

(Potential space (dual affine vertex space) and dual thermodynamic function) The potential (field) space $\mathcal {Y}= \mathbb {R}^{N_{\mathbb {X}}}$ (also called the dual affine vertex space) is an affine space dual to $\mathcal {X}$ with the associated vector space $C^{0}(\mathbb {H})$ (Fig. 1, upper left).^{Footnote 34} The dual thermodynamic function $\Phi ^{*}: \mathcal {Y}\rightarrow \mathbb {R}$ is the Legendre-Fenchel conjugate of the primal thermodynamic function:

$$\begin{aligned} \Phi ^{*}: \mathcal {Y}\rightarrow \mathbb {R}, \quad \varvec{y} \mapsto \Phi ^{*}(\varvec{y}) :=\max _{\varvec{x}'\in \mathcal {X}}\left[ \langle \varvec{x}',\varvec{y} \rangle - \Phi (\varvec{x}') \right] , \end{aligned}$$

(26)

where $\langle \cdot ,\cdot \rangle : \mathcal {X}\times \mathcal {Y}\rightarrow \mathbb {R}$ is the bilinear pairing under the standard basis. From the properties of the primal function, $\Phi ^{*}(\varvec{y})$ is also a strictly convex differentiable function. From $\Phi ^{*}(\varvec{y})$, we have the inverse Legendre transformation $\partial \Phi ^{*}: \mathcal {Y}\rightarrow \mathcal {X},\,\varvec{y}\mapsto \varvec{x} =\partial _{\varvec{y}}\Phi ^{*}(\varvec{y})$.

The Legendre transformations, $\partial \Phi $ and $\partial \Phi ^{*}$, are continuous and establish a bijection between $\mathcal {X}$ and $\mathcal {Y}$, where $\partial \Phi ^{*}=(\partial \Phi )^{-1}$. In the following, we regard a pair $(\varvec{x},\varvec{y})$ with the same decoration as a Legendre dual pair satisfying $\varvec{y}=\partial \Phi (\varvec{x})$. For such a pair, the Legendre-Fenchel-Young identity holds:

$$\begin{aligned} \Phi (\varvec{x})+\Phi ^{*}(\varvec{y})=\langle \varvec{x},\varvec{y}\rangle . \end{aligned}$$

(27)

Different pairs are distinguished by different decorations, e.g., $(\varvec{x}', \varvec{y}')$ or $(\varvec{x}_{p}, \varvec{y}_{p})$.
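As a concrete sketch of this duality, take the separable function $\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})$. This choice is an assumption made for illustration (it is the one that pairs a density with its logarithm), not a definition from the text. Then $\varvec{y}=\partial \Phi (\varvec{x})=\ln \varvec{x}$ and $\Phi ^{*}(\varvec{y})=\sum _{i}e^{y_{i}}$, and Eq. 27 can be checked numerically:

```python
import math

# Illustrative thermodynamic function (an assumption for this sketch):
# Phi(x) = sum_i (x_i ln x_i - x_i), Phi*(y) = sum_i exp(y_i).

def phi(x):
    return sum(xi * math.log(xi) - xi for xi in x)

def grad_phi(x):
    """Legendre transformation x -> y = ln x."""
    return [math.log(xi) for xi in x]

def phi_star(y):
    return sum(math.exp(yi) for yi in y)

def grad_phi_star(y):
    """Inverse Legendre transformation y -> x = exp(y)."""
    return [math.exp(yi) for yi in y]

x = [0.5, 2.0, 3.0]          # an arbitrary point in the positive orthant
y = grad_phi(x)              # its Legendre dual
x_back = grad_phi_star(y)    # should recover x

# Legendre-Fenchel-Young identity (Eq. 27): Phi(x) + Phi*(y) - <x, y> = 0
pairing = sum(xi * yi for xi, yi in zip(x, y))
identity_gap = phi(x) + phi_star(y) - pairing
```

For this $\Phi $, the image of $\partial \Phi $ is all of $\mathbb {R}^{N_{\mathbb {X}}}$ and the derivative diverges toward the boundary, consistent with the two conditions in Definition 12.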

For a thermodynamic function, the Bregman divergence can be defined:

Definition 14

(Bregman divergence [1, 113]) The Bregman divergence on $\mathcal {X}$ with the generating thermodynamic function $\Phi (\varvec{x})$ is defined as

$$\begin{aligned} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']:=\Phi (\varvec{x})-\Phi (\varvec{x}') - \langle \varvec{x}-\varvec{x}', \partial \Phi (\varvec{x}') \rangle \in \mathbb {R}_{\ge 0}. \end{aligned}$$

(28)

The non-negativity of the Bregman divergence follows from the Fenchel-Young inequality for products [114, 115]. Furthermore, from the strict convexity of the thermodynamic function, $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']$ is also strictly convex with respect to $\varvec{x}$ and $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']=0$ if and only if $\varvec{x}=\varvec{x}'$. Bregman divergences are defined for $(\varvec{y},\varvec{y}')$ and also for $(\varvec{x},\varvec{y}')$ as

$$\begin{aligned} \mathcal {D}^{\mathcal {Y}}_{\Phi ^{*}}[\varvec{y}'\Vert \varvec{y}]&:=\Phi ^{*}(\varvec{y}') -\Phi ^{*}(\varvec{y}) - \langle \partial \Phi ^{*}(\varvec{y}), \varvec{y}'-\varvec{y} \rangle , \end{aligned}$$

(29)

$$\begin{aligned} \mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\Phi ,\Phi ^{*}}[\varvec{x}; \varvec{y}']&:=\Phi (\varvec{x})+\Phi ^{*}(\varvec{y}') - \langle \varvec{x}, \varvec{y}' \rangle . \end{aligned}$$

(30)

Because $(\varvec{x},\varvec{y})$ and $(\varvec{x}', \varvec{y}')$ are Legendre pairs, all the three representations are equivalent^{Footnote 35}: $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']=\mathcal {D}^{\mathcal {Y}}_{\Phi ^{*}}[\varvec{y}'\Vert \varvec{y}]=\mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\Phi ,\Phi ^{*}}[\varvec{x}; \varvec{y}']$.^{Footnote 36} The divergences $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']$, $ \mathcal {D}^{\mathcal {Y}}_{\Phi ^{*}}[\varvec{y}'\Vert \varvec{y}]$, and $\mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\Phi ,\Phi ^{*}}[\varvec{x}; \varvec{y}']$ are abbreviated as $\mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}']$, $\mathcal {D}^{\mathcal {Y}}[\varvec{y}'\Vert \varvec{y}]$, and $\mathcal {D}^{\mathcal {X},\mathcal {Y}}[\varvec{x}; \varvec{y}']$, respectively.
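The equivalence of the three representations can be verified numerically. The sketch below again assumes the illustrative choice $\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})$ with $\Phi ^{*}(\varvec{y})=\sum _{i}e^{y_{i}}$, for which the Bregman divergence is the generalized Kullback-Leibler divergence; the two points are arbitrary.

```python
import math

# Assumed illustrative pair: Phi(x) = sum(x ln x - x), Phi*(y) = sum(exp(y)).

def phi(x):
    return sum(v * math.log(v) - v for v in x)

def phi_star(y):
    return sum(math.exp(v) for v in y)

def pairing(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x = [0.5, 2.0, 3.0]    # arbitrary points in the positive orthant
xp = [1.0, 1.5, 2.5]
y = [math.log(v) for v in x]     # y  = dPhi(x)
yp = [math.log(v) for v in xp]   # y' = dPhi(x')

# Eq. 28: D^X[x || x']
d_x = phi(x) - phi(xp) - pairing([a - b for a, b in zip(x, xp)], yp)

# Eq. 29: D^Y[y' || y], using dPhi*(y) = exp(y)
grad_star_y = [math.exp(v) for v in y]
d_y = phi_star(yp) - phi_star(y) - pairing(grad_star_y, [a - b for a, b in zip(yp, y)])

# Eq. 30: mixed representation D^{X,Y}[x; y']
d_xy = phi(x) + phi_star(yp) - pairing(x, yp)

# Closed form for this particular Phi: generalized Kullback-Leibler divergence
gen_kl = sum(a * math.log(a / b) - a + b for a, b in zip(x, xp))
```

All four quantities agree up to floating-point rounding and are strictly positive because $\varvec{x}\ne \varvec{x}'$.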

Finally, the Hessian matrices of the primal and dual thermodynamic functions are defined when they are twice differentiable^{Footnote 37}:

Definition 15

(Hessian matrices) The primal and dual Hessian matrices, $G_{\varvec{x}}\in \mathbb {R}^{N_{\mathbb {v}}\times N_{\mathbb {v}}}$ and $G_{\varvec{y}}^{*}\in \mathbb {R}^{N_{\mathbb {v}}\times N_{\mathbb {v}}}$, of thermodynamic functions, $\Phi (\varvec{x})$ and $\Phi ^{*}(\varvec{y})$, are defined as

$$\begin{aligned} (G_{\varvec{x}})_{i,j}&:=\frac{\partial ^{2}\Phi (\varvec{x})}{\partial x_{i} \partial x_{j}},&(G_{\varvec{y}}^{*})_{i,j}&:=\frac{\partial ^{2}\Phi ^{*}(\varvec{y})}{\partial y_{i} \partial y_{j}}. \end{aligned}$$

(31)

In addition, they are positive definite and $G_{\varvec{x}}^{-1}=G_{\varvec{y}}^{*}$ holds for a Legendre dual pair $\varvec{x}$ and $\varvec{y}$.
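As a worked instance, assume the illustrative separable function $\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})$, whose conjugate is $\Phi ^{*}(\varvec{y})=\sum _{i}e^{y_{i}}$ with $y_{i}=\ln x_{i}$ (an example chosen here, not prescribed by the text). Both Hessians are diagonal, and the inverse relation can be read off directly:

$$\begin{aligned} G_{\varvec{x}}&=\textrm{diag}\left[ \frac{1}{x_{1}},\cdots ,\frac{1}{x_{N_{\mathbb {v}}}}\right] ,&G_{\varvec{y}}^{*}&=\textrm{diag}\left[ e^{y_{1}},\cdots ,e^{y_{N_{\mathbb {v}}}}\right] =\textrm{diag}\left[ x_{1},\cdots ,x_{N_{\mathbb {v}}}\right] =G_{\varvec{x}}^{-1}. \end{aligned}$$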

The Hessian matrices induce a Riemannian metric over $\mathcal {X}$. The tangent and cotangent spaces $\mathcal {T}_{\varvec{x}}\mathcal {X}$ and $\mathcal {T}^{*}_{\varvec{x}}\mathcal {X}$ are isomorphic to the corresponding tangent and cotangent spaces $\mathcal {T}^{*}_{\varvec{y}}\mathcal {Y}$ and $\mathcal {T}_{\varvec{y}}\mathcal {Y}$ over $\mathcal {Y}$ and also to $C_{0}(\mathbb {H})$ and $C^{0}(\mathbb {H})$: $\mathcal {T}_{\varvec{x}}\mathcal {X}\cong \mathcal {T}^{*}_{\varvec{y}}\mathcal {Y}\cong C_{0}(\mathbb {H})$ and $\mathcal {T}^{*}_{\varvec{x}}\mathcal {X}\cong \mathcal {T}_{\varvec{y}}\mathcal {Y}\cong C^{0}(\mathbb {H})$.

A typical example of the duality between $\varvec{x}$ and $\varvec{y}$ in statistics is that between a probability vector $\varvec{p}$ and its logarithm $\ln \varvec{p}$. Beyond this typical case, different forms of the thermodynamic functions $(\Phi (\varvec{x}), \Phi ^{*}(\varvec{y}))$, the associated dual variables, and the Bregman divergence are adopted, depending on the purpose, to endow the inference or estimation methods being designed with different properties [1]. In the case of CRN, the thermodynamic functions and Legendre duality are associated with equilibrium thermodynamics [117]. Specifically, as we recently demonstrated [48], $\mathcal {X}$ and $\mathcal {Y}$ are the conjugate spaces of the extensive and intensive thermodynamic variables (density of molecules and their chemical potential), $\Phi (\varvec{x})$ is the thermodynamic potential function of the system, and the Bregman divergence becomes the difference of the total entropy. These correspondences are derived directly from the axiomatic formulation of thermodynamics [48, 117]. The explicit functional form of $\Phi (\varvec{x})$ is then determined by the physical details of the thermodynamic system that we work on.

Before closing this subsection, we introduce the notion of separability, which will be linked to the locality of the flux.

Definition 16

(Separability of a thermodynamic function) A thermodynamic function $\Phi (\varvec{x})$ is separable if it can be represented as

$$\begin{aligned} \Phi (\varvec{x})=\sum _{i=1}^{N_{\mathbb {v}}}c_{i}\phi (x_{i}/x_{i}^{o}), \end{aligned}$$

(32)

where $c_{i}>0$, $x_{i}^{o}>0$, and $\phi (x):\mathbb {R}_{>0} \rightarrow \mathbb {R}$ is a scalar primal thermodynamic function.

If $\Phi (\varvec{x})$ is separable, then its conjugate $\Phi ^{*}(\varvec{y})$ is also separable as $\Phi ^{*}(\varvec{y})=\sum _{i=1}^{N_{\mathbb {v}}}c_{i}\phi ^{*} (\frac{x_{i}^{o}}{c_{i}}y_{i})$ where $\phi ^{*}(y): \mathbb {R}\rightarrow \mathbb {R}$ is the Legendre conjugate of $\phi (x)$.^{Footnote 38} If a thermodynamic function is separable, then the corresponding Bregman divergence is separable. The Hessian matrices become diagonal for a separable thermodynamic function. Most of our results hold without separability, but common thermodynamic functions and related quantities are typically separable. For example, the Kullback-Leibler divergence is a separable Bregman divergence.

4.2 Dually flat spaces on edges and dissipation functions

Next, we introduce another dually flat structure onto the edge space of graphs and hypergraphs based on the flux-force relation.

Definition 17

(Flux and force spaces (primal and dual edge spaces)) The flux and force spaces on the edges, $\mathcal {J}_{\varvec{x}}=\mathbb {R}^{N_{\mathbb {e}}}$ and $\mathcal {F}_{\varvec{x}}=\mathbb {R}^{N_{\mathbb {e}}}$, are a pair of the primal and dual vector spaces defined for each $\varvec{x}\in \mathcal {X}$, which are isomorphic to $C_{1}(\mathbb {H})$ and $C^{1}(\mathbb {H})$, respectively (Fig. 1, right). The bilinear pairing under the standard basis $\langle \cdot , \cdot \rangle : C_{1}(\mathbb {H}) \times C^{1}(\mathbb {H}) \rightarrow \mathbb {R}$ is inherited by $(\mathcal {J}_{\varvec{x}}, \mathcal {F}_{\varvec{x}})$.

To introduce Legendre duality on $(\mathcal {J}_{\varvec{x}}, \mathcal {F}_{\varvec{x}})$, we use the dissipation functions:

Definition 18

(Dissipation function^{Footnote 39}) A dissipation function on $\mathcal {F}_{\varvec{x}}$, $\Psi ^{*}_{\varvec{x}}:\mathcal {F}_{\varvec{x}} \rightarrow \mathbb {R}, \varvec{f} \mapsto \Psi ^{*}_{\varvec{x}}(\varvec{f})$, is a strictly convex and continuously differentiable function with respect to $\varvec{f}$ for all $\varvec{x} \in \mathcal {X}$ that also satisfies the following additional conditions:

$$\begin{aligned} \text{1-coercive: }{} & {} \frac{\Psi ^{*}_{\varvec{x}}(\varvec{f})}{\Vert \varvec{f}\Vert }&\rightarrow \infty \quad \text{ as } \Vert \varvec{f}\Vert \rightarrow \infty , \end{aligned}$$

(33)

$$\begin{aligned} \text{ Symmetric: }{} & {} \Psi ^{*}_{\varvec{x}}(\varvec{f})&=\Psi ^{*}_{\varvec{x}}(-\varvec{f}) \end{aligned}$$

(34)

$$\begin{aligned} \text{ Bounded } \text{ below } \text{ by } 0:{} & {} \Psi ^{*}_{\varvec{x}}(\varvec{0})&=0, \end{aligned}$$

(35)

Proposition 1

(Duality of dissipation functions) The Legendre-Fenchel conjugate of $\Psi _{\varvec{x}}^{*}(\varvec{f})$, i.e., $\Psi _{\varvec{x}}(\varvec{j}) :=\max _{\varvec{f}}\left[ \langle \varvec{j},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] $, is also a dissipation function on $\mathcal {J}_{\varvec{x}}$. $\Psi _{\varvec{x}}(\varvec{j})$ and $\Psi ^{*}_{\varvec{x}}(\varvec{f})$ are called the primal and dual dissipation functions, respectively.

Proof

For each $\varvec{x}\in \mathcal {X}$, the function $\Psi _{\varvec{x}}(\varvec{j})$ is strictly convex, continuously differentiable, 1-coercive, and satisfies $\Psi _{\varvec{x}}(\varvec{j})<+\infty $ for all $\varvec{j}\in \mathcal {J}_{\varvec{x}}$ because $\Psi _{\varvec{x}}^{*}(\varvec{f})$ is so (see Corollary 4.1.4 in [118]). For $\varvec{j}\in \mathcal {J}_{\varvec{x}}$, the symmetry holds as $\Psi _{\varvec{x}}(-\varvec{j}) = \max _{\varvec{f}}\left[ \langle -\varvec{j},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] = \max _{\varvec{f}}\left[ \langle \varvec{j},-\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] = \max _{\varvec{f}}\left[ \langle \varvec{j},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(-\varvec{f}) \right] = \max _{\varvec{f}}\left[ \langle \varvec{j},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] =\Psi _{\varvec{x}}(\varvec{j})$. From the convexity and symmetry, the minimum of $\Psi _{\varvec{x}}(\varvec{j})$ is attained at $\varvec{j}=\varvec{0}$ and $\min _{\varvec{j}}\Psi _{\varvec{x}}(\varvec{j}) = \Psi _{\varvec{x}}(\varvec{0}) =\max _{\varvec{f}}\left[ \langle \varvec{0},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] =-\min _{\varvec{f}}\Psi ^{*}_{\varvec{x}}(\varvec{f})=0$. $\square $

From these properties, for each $\varvec{x}\in \mathcal {X}$, a one-to-one Legendre duality via Legendre transformations is established over the whole of $(\mathcal {J}_{\varvec{x}}, \mathcal {F}_{\varvec{x}})$:

$$\begin{aligned} \varvec{j}&=\partial _{\varvec{f}} \Psi ^{*}_{\varvec{x}}(\varvec{f}),&\varvec{f}&=\partial _{\varvec{j}} \Psi _{\varvec{x}}(\varvec{j}). \end{aligned}$$

(36)

In the following, we abbreviate the Legendre transformations as $\partial _{\varvec{f}} \Psi ^{*}_{\varvec{x}}(\varvec{f})=\partial \Psi ^{*}_{\varvec{x}}(\varvec{f})$ and $\partial _{\varvec{j}} \Psi _{\varvec{x}}(\varvec{j})=\partial \Psi _{\varvec{x}}(\varvec{j})$^{Footnote 40}. Similarly to the Legendre dual pair $(\varvec{x},\varvec{y})$ in $\mathcal {X}$ and $\mathcal {Y}$, a pair of flux and force with the same decoration, e.g., $(\varvec{j},\varvec{f})_{\varvec{x}}$ or $(\varvec{j}_{0},\varvec{f}_{0})_{\varvec{x}}$, represents a Legendre dual pair linked by Eq. 36 at $\varvec{x}$. We omit the $\varvec{x}$-dependency for simplicity. The Legendre dual pair $(\varvec{j},\varvec{f})$ satisfies the Legendre-Fenchel-Young identity for each $\varvec{x}\in \mathcal {X}$:

$$\begin{aligned} \Psi ^{*}_{\varvec{x}}(\varvec{f})+ \Psi _{\varvec{x}}(\varvec{j})-\langle \varvec{j},\varvec{f}\rangle =0. \end{aligned}$$

(37)

Furthermore, the additional conditions of dissipation functions enable the Legendre duality to work as an extension of a Riemannian metric structure:

Proposition 2

([75]) The Legendre transformations satisfy the following properties:

$$\begin{aligned}&\text{ Pairing } \text{ of } \varvec{0}\in \mathcal {J} \text{ and } \varvec{0}\in \mathcal {F}: \varvec{0}=\partial \Psi ^{*}_{\varvec{x}}(\varvec{0}),\quad \varvec{0}=\partial \Psi _{\varvec{x}}(\varvec{0}) \end{aligned}$$

(38)

$$\begin{aligned}&\text{ Symmetry: } -\varvec{f}=\partial \Psi _{\varvec{x}}(-\varvec{j}),\, \quad -\varvec{j}=\partial \Psi ^{*}_{\varvec{x}}(-\varvec{f}). \end{aligned}$$

(39)

$$\begin{aligned}&\text{ Nonnegativity } \text{ of } \text{ bilinear } \text{ pairing: } \langle \varvec{j},\varvec{f}\rangle = \Psi ^{*}_{\varvec{x}}(\varvec{f})+ \Psi _{\varvec{x}}(\varvec{j})\ge 0. \end{aligned}$$

(40)

The first property means that zero force $\varvec{f}=\varvec{0}$ and zero flux $\varvec{j}=\varvec{0}$ are always Legendre dual regardless of $\varvec{x}$, and the second one indicates that if $(\varvec{j},\varvec{f})$ is a Legendre dual pair, then $(-\varvec{j},-\varvec{f})$ is as well.^{Footnote 41} The third property, as well as the nonnegativity of the dissipation functions, enables them to play a role similar to that of the metric-induced norm in Riemannian geometry.^{Footnote 42}
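To make Eqs. 36 to 40 concrete, the sketch below uses a cosh-type function, $\Psi ^{*}(\varvec{f})=\sum _{e}w_{e}(\cosh f_{e}-1)$ with $w_{e}>0$, chosen here only as an assumed non-quadratic example satisfying Eqs. 33 to 35. Its Legendre transformation is $j_{e}=w_{e}\sinh f_{e}$, and its conjugate is $\Psi (\varvec{j})=\sum _{e}[j_{e}\,\textrm{arcsinh}(j_{e}/w_{e})-\sqrt{w_{e}^{2}+j_{e}^{2}}+w_{e}]$. The weights and the force are arbitrary values.

```python
import math

# Assumed example of a dissipation function (not prescribed by the text):
# Psi*(f) = sum_e w_e (cosh f_e - 1), with arbitrary positive weights w_e.
w = [1.0, 2.0, 0.5]

def psi_star(f):
    return sum(we * (math.cosh(fe) - 1.0) for we, fe in zip(w, f))

def flux_from_force(f):
    """j = dPsi*(f): component-wise w_e sinh(f_e)."""
    return [we * math.sinh(fe) for we, fe in zip(w, f)]

def psi(j):
    """Legendre-Fenchel conjugate of psi_star, in closed form."""
    return sum(je * math.asinh(je / we) - math.sqrt(we * we + je * je) + we
               for we, je in zip(w, j))

def force_from_flux(j):
    """f = dPsi(j): component-wise arcsinh(j_e / w_e)."""
    return [math.asinh(je / we) for we, je in zip(w, j)]

f = [0.3, -1.2, 2.0]             # an arbitrary force
j = flux_from_force(f)           # its Legendre dual flux (Eq. 36)
f_back = force_from_flux(j)      # should recover f

pairing = sum(je * fe for je, fe in zip(j, f))
young_gap = psi_star(f) + psi(j) - pairing         # Eq. 37, expected ~0
j_neg = flux_from_force([-fe for fe in f])         # Eq. 39, expected -j
zero_flux = flux_from_force([0.0, 0.0, 0.0])       # Eq. 38, expected 0
```

For this example the pairing $\langle \varvec{j},\varvec{f}\rangle =\sum _{e}w_{e}f_{e}\sinh f_{e}$ is nonnegative term by term, in line with Eq. 40.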

With the dissipation functions, $\Psi _{\varvec{x}}(\varvec{j})$ and $\Psi ^{*}_{\varvec{x}}(\varvec{f})$, we now have the second dually flat structure on the edge spaces $(\mathcal {J}_{\varvec{x}},\mathcal {F}_{\varvec{x}})$. On these dually flat spaces, we define the Bregman divergence and Hessian matrices:

Definition 19

(Bregman divergence and Hessian matrices on the edge spaces) For each $\varvec{x}\in \mathcal {X}$, the Bregman divergence between $\varvec{j}\in \mathcal {J}_{\varvec{x}}$ and $\varvec{f}'\in \mathcal {F}_{\varvec{x}}$ is defined as

$$\begin{aligned} \mathcal {D}_{\varvec{x}}^{\mathcal {J},\mathcal {F}}[\varvec{j};\varvec{f}']:=\Psi _{\varvec{x}}(\varvec{j})+\Psi ^{*}_{\varvec{x}}(\varvec{f}')-\langle \varvec{j}, \varvec{f}'\rangle . \end{aligned}$$

(41)

$\mathcal {D}_{\varvec{x}}^{\mathcal {J}}[\varvec{j}\Vert \varvec{j}']$ and $\mathcal {D}_{\varvec{x}}^{\mathcal {F}}[\varvec{f}\Vert \varvec{f}']$ are also defined analogously to the Bregman divergence on the vertex space $(\mathcal {X}, \mathcal {Y})$. For a Legendre conjugate pair of twice differentiable dissipation functions, the Hessian matrices, $G_{\varvec{x},\varvec{j}}$ and $G_{\varvec{x},\varvec{f}}^{*}$, are defined as

$$\begin{aligned} (G_{\varvec{x},\varvec{j}})_{e,e'}&:=\frac{\partial ^{2}\Psi _{\varvec{x}}(\varvec{j})}{\partial j_{e} \partial j_{e'}},&(G_{\varvec{x},\varvec{f}}^{*})_{e,e'}&:=\frac{\partial ^{2}\Psi _{\varvec{x}}^{*}(\varvec{f})}{\partial f_{e} \partial f_{e'}}. \end{aligned}$$

(42)

These matrices are positive-definite.

The Legendre dual structure via the dissipation functions provides an extension of a Riemannian metric structure in the following sense. If the dissipation function is a quadratic function, i.e., a positive definite quadratic form as

$$\begin{aligned} \Psi ^{q,*}_{\varvec{x}}(\varvec{f}):=\frac{1}{2}\langle \varvec{f},M^{*}_{\varvec{x}} \varvec{f} \rangle , \end{aligned}$$

(43)

where $M^{*}_{\varvec{x}}$ is a positive definite $N_{\mathbb {e}}\times N_{\mathbb {e}}$ matrix, the Legendre transformation reduces to the linear mapping $\varvec{j}=\partial \Psi ^{q,*}_{\varvec{x}}(\varvec{f})=M^{*}_{\varvec{x}}\varvec{f}$^{Footnote 43}. Then, the bilinear pairing, $\langle \varvec{j},\varvec{f}'\rangle = \langle \varvec{j},M_{\varvec{x}}\varvec{j}'\rangle =\langle M^{*}_{\varvec{x}}\varvec{f},\varvec{f}'\rangle $, becomes the inner product under the metric matrix $M_{\varvec{x}}$ where $M_{\varvec{x}}=(M_{\varvec{x}}^{*})^{-1}$. The dissipation functions are associated with the induced norms: $\Psi ^{*}_{\varvec{x}}(\varvec{f})=\frac{1}{2}\Vert \varvec{f}\Vert _{M^{*}_{\varvec{x}}}^{2}$, $\Psi _{\varvec{x}}(\varvec{j})=\frac{1}{2}\Vert \varvec{j}\Vert _{M_{\varvec{x}}}^{2}$. The Bregman divergence reduces to the norm-induced squared distance: $\mathcal {D}_{\varvec{x}}^{\mathcal {J},\mathcal {F}}[\varvec{j};\varvec{f}']=\frac{1}{2}\Vert \varvec{j}-\varvec{j}'\Vert _{M_{\varvec{x}}}^{2}=\frac{1}{2}\Vert \varvec{f}-\varvec{f}'\Vert _{M^{*}_{\varvec{x}}}^{2}$.
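The reduction of the Bregman divergence to a squared distance can be checked in one worked step. The sketch below assumes that $M_{\varvec{x}}$ is symmetric and that $\varvec{f}'=M_{\varvec{x}}\varvec{j}'$ is the Legendre dual of $\varvec{j}'$, so that $\langle \varvec{f}',M^{*}_{\varvec{x}}\varvec{f}'\rangle =\langle \varvec{j}',M_{\varvec{x}}\varvec{j}'\rangle $:

$$\begin{aligned} \mathcal {D}_{\varvec{x}}^{\mathcal {J},\mathcal {F}}[\varvec{j};\varvec{f}']&=\frac{1}{2}\langle \varvec{j},M_{\varvec{x}}\varvec{j}\rangle +\frac{1}{2}\langle \varvec{f}',M^{*}_{\varvec{x}}\varvec{f}'\rangle -\langle \varvec{j},\varvec{f}'\rangle \nonumber \\&=\frac{1}{2}\langle \varvec{j},M_{\varvec{x}}\varvec{j}\rangle +\frac{1}{2}\langle \varvec{j}',M_{\varvec{x}}\varvec{j}'\rangle -\langle \varvec{j},M_{\varvec{x}}\varvec{j}'\rangle =\frac{1}{2}\langle \varvec{j}-\varvec{j}',M_{\varvec{x}}(\varvec{j}-\varvec{j}')\rangle . \end{aligned}$$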

Finally, we also introduce the notion of separability to the dissipation functions:

Definition 20

(Separability and locality of dissipation functions) A dissipation function $\Psi ^{*}_{\varvec{x}}(\varvec{f})$ is separable if it can be represented as

$$\begin{aligned} \Psi ^{*}_{\varvec{x}}(\varvec{f})=\sum _{e=1}^{N_{\mathbb {e}}}\omega _{e}(\varvec{x}) \psi ^{*}(f_{e}/f^{o}_{e}(\varvec{x})), \end{aligned}$$

(44)

where $\omega _{e}(\varvec{x})>0$ and $f^{o}_{e}(\varvec{x})>0$ for $\varvec{x}\in \mathcal {X}$ are positive weights and $\psi ^{*}(f):\mathbb {R}\rightarrow \mathbb {R}$ is a scalar dissipation function, i.e., a strictly convex differentiable scalar function satisfying Eq. 34, Eq. 35, and Eq. 33. If $\omega _{e}(\varvec{x})$ and $f^{o}_{e}(\varvec{x})$ are additionally local, then the dissipation function is separable and local. If $\Psi ^{*}_{\varvec{x}}(\varvec{f})$ is separable, then its dual $\Psi _{\varvec{x}}(\varvec{j})$ is also separable. The same is true for the locality.

Remark 5

(Young functions and N-functions) The scalar dissipation function is an N-function, which appears in the theory of Orlicz spaces. A function $\tilde{\psi }(j): [0,\infty ) \rightarrow [0,\infty ]$ represented as $\tilde{\psi }(j)=\int _{0}^{j}\varsigma (j')\textrm{d}j'$ is called a Young function, where $\varsigma (j):[0,\infty )\rightarrow [0,\infty ]$ is a non-decreasing function that satisfies $\varsigma (0)=0$ and is left-continuous on $(0,\infty )$. If $\varsigma (j)$ additionally satisfies $0<\varsigma (j)<+\infty $ for $0<j<\infty $, $\lim _{j\rightarrow +0} \varsigma (j)=0$, and $\lim _{j\rightarrow \infty } \varsigma (j)=+\infty $, then $\tilde{\psi }(j)$ is called an N-function. If we define a function $\psi (j)$ from an N-function $\tilde{\psi }(j)$ as $\psi (j)=\tilde{\psi }(|j|)$, it becomes a scalar dissipation function [119]. A separable dissipation function (Eq. 44) is often called a weighted N-function [120, 121]. The dissipation function and the induced Legendre duality are, therefore, related to Birnbaum-Orlicz spaces, which are an extension of $L^{p}$ spaces.

4.3 Generalized flow on graphs and hypergraphs and its steady state

Because of the one-to-one Legendre duality between $(\varvec{j},\varvec{f})_{\varvec{x}}$, the continuity equation (Eq. 10) can be represented as a generalized flow driven by the force $\varvec{f}(\varvec{x})$ dual to $\varvec{j}(\varvec{x})$ [77]:

Definition 21

(Generalized flow) A curve $\varvec{x}(t)$ is a generalized flow on $\mathbb {H}$ driven by force $\varvec{f}(\varvec{x})$ under the dissipation function $\Psi ^{*}_{\varvec{x}}$ if it can be represented as

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \varvec{j}(\varvec{x})=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]. \end{aligned}$$

(45)

This representation is independent of the specific functional form of $\varvec{f}(\varvec{x})$ and $\Psi ^{*}_{\varvec{x}}(\varvec{f})$ and also of the definition of $\textrm{div}_{\mathbb {S}}$ as long as the generated $\varvec{j}(\varvec{x})$ is consistent with $\mathbb {H}$^{Footnote 44}$^{,}$^{Footnote 45}. Thus, we can potentially apply this framework to various systems by choosing these functions appropriately depending on the system or the problem we work on.

The generalized flow naturally encompasses three types of steady states:

Definition 22

(Steady state, complex-balanced state, and detailed-balanced state) We define the manifolds of steady state $\mathcal {M}^{\textrm{ST}}$, complex-balanced (CB) state $\mathcal {M}^{\textrm{CB}}$, and detailed-balanced (DB) state $\mathcal {M}^{\textrm{DB}}$, respectively, as follows:

$$\begin{aligned} \mathcal {M}^{\textrm{ST}}&:=\{\varvec{x} \in \mathcal {X}| \mathbb {S}\varvec{j}(\varvec{x})=0\}, \end{aligned}$$

(46)

$$\begin{aligned} \mathcal {M}^{\textrm{CB}}&:=\{\varvec{x} \in \mathcal {X}| \mathbb {B}\varvec{j}(\varvec{x})=0\}, \end{aligned}$$

(47)

$$\begin{aligned} \mathcal {M}^{\textrm{DB}}&:=\{\varvec{x} \in \mathcal {X}| \varvec{j}(\varvec{x})=0\}=\{\varvec{x} \in \mathcal {X}| \varvec{f}(\varvec{x})=0\}, \end{aligned}$$

(48)

where we used $\varvec{j}(\varvec{x})=0$ iff $\varvec{f}(\varvec{x})=0$ from the properties of the dissipation functions. The relations $\varvec{j}(\varvec{x})=0$ and $\mathbb {B}\varvec{j}(\varvec{x})=0$ are called the detailed-balanced (DB) condition and the complex-balanced (CB) condition, respectively. From the decomposition of $\mathbb {S}$ through $\mathbb {B}$, by which $\mathbb {B}\varvec{j}(\varvec{x})=0$ implies $\mathbb {S}\varvec{j}(\varvec{x})=0$, an inclusion relation holds: $\mathcal {M}^{\textrm{DB}} \subseteq \mathcal {M}^{\textrm{CB}} \subseteq \mathcal {M}^{\textrm{ST}}$. It should be noted that, depending on the details of $\varvec{j}(\varvec{x})$, these manifolds can be empty.

A steady state is a state at which $\dot{\varvec{x}}=0$ holds. The DB condition $\varvec{j}(\varvec{x})=\varvec{0}$ means that all the fluxes are zero at $\varvec{x}$. In other words, all the forward and reverse fluxes are balanced at $\varvec{x}$, i.e., $j_{e}^{+}(\varvec{x})=j_{e}^{-}(\varvec{x})$. The CB condition is equivalent to the balance of all influx and outflux at each hypervertex of $\mathbb {H}$. As we will see later, DB states are tightly linked to the equilibrium state and equilibrium flow. The CB state is relevant as an extension of the equilibrium state to nonequilibrium flows.
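To illustrate why the steady-state condition is weaker than the DB condition, the following minimal Python sketch uses a hypothetical three-vertex cycle graph (an assumption for illustration, not an example from the text). The matrix `S` is an assumed incidence matrix oriented so that $\dot{\varvec{x}}=-\mathbb {S}\varvec{j}$; a uniform circulating flux satisfies $\mathbb {S}\varvec{j}=0$ although $\varvec{j}\ne \varvec{0}$.

```python
# Hypothetical 3-vertex cycle with edges e1: v1->v2, e2: v2->v3, e3: v3->v1.
# Column e of S is (tail indicator) - (head indicator), so that dx/dt = -S j.
S = [
    [1, 0, -1],
    [-1, 1, 0],
    [0, -1, 1],
]

def matvec(A, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def is_steady(j, tol=1e-12):
    """Steady-state condition S j = 0 (Eq. 46)."""
    return all(abs(c) < tol for c in matvec(S, j))

def is_detailed_balanced(j, tol=1e-12):
    """Detailed-balance condition j = 0 (Eq. 48)."""
    return all(abs(c) < tol for c in j)

zero_flux = [0.0, 0.0, 0.0]        # detailed-balanced, hence steady
cyclic_flux = [0.7, 0.7, 0.7]      # uniform circulation: steady but not DB
unbalanced_flux = [1.0, 0.0, 0.0]  # net transport from v1 to v2: not steady
```

On a graph without cycles, $\mathbb {S}$ has a trivial kernel and the two conditions coincide; the cycle is what makes room for a nonzero steady flux.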

4.4 Generalized gradient flow and De Giorgi’s formulation

When $\varvec{f}(\varvec{x})$ can be represented as a gradient, i.e., $\varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})$ of a function $\mathcal {F}(\varvec{x})\in \mathbb {R}$ on the density space, Eq. 45 is reduced to the generalized gradient flow of $\mathcal {F}(\varvec{x})$.

Definition 23

(Generalized gradient flow) $\varvec{x}(t)$ is a generalized gradient flow when it is a generalized flow driven by a gradient force of $\mathcal {F}(\varvec{x})$, i.e., $\varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})$ and

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \varvec{j}(\varvec{x})=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})]. \end{aligned}$$

(49)

The following proposition ensures that the generalized gradient flow behaves like the conventional gradient flow:

Proposition 3

($\mathcal {F}(\varvec{x})$ is non-increasing along the trajectory of generalized gradient flow) For a trajectory $\{\varvec{x}_{t}\}_{t \in [0,\tau ]}$ of the generalized gradient flow of $\mathcal {F}(\varvec{x})$, $\mathcal {F}(\varvec{x}_{t})$ is always decreasing except at the DB states $\mathcal {M}^{\textrm{DB}}$. In addition, all the steady states of the generalized gradient flow are the DB states, i.e., $\mathcal {M}^{\textrm{ST}}=\mathcal {M}^{\textrm{DB}}$^{Footnote 46}.

Proof

$\mathcal {F}(\varvec{x}_{t})$ is non-increasing over time as follows:

$$\begin{aligned} \dot{\mathcal {F}}(\varvec{x}_{t})&=\langle \dot{\varvec{x}} , \!\partial _{\varvec{x}} \mathcal {F}(\varvec{x})\rangle \!=\! -\langle \textrm{div}_{\mathbb {S}} \!\partial \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})] , \!\partial _{\varvec{x}} \mathcal {F}(\varvec{x})\rangle \nonumber \\&= -\langle \partial \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})] , \textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}} \mathcal {F}(\varvec{x})\rangle \nonumber \\&=-\langle \varvec{j}(\varvec{x}),\varvec{f}(\varvec{x})\rangle = -\left( \Psi _{\varvec{x}}[\varvec{j}(\varvec{x})] + \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]\right) \le 0, \end{aligned}$$

(50)

where Eq. 40 is used. The equality holds iff $\varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})=0$ because $\Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]=\Psi _{\varvec{x}}[\varvec{j}(\varvec{x})]=0$ iff $\varvec{f}(\varvec{x})=\varvec{j}(\varvec{x})=0$. Thus, $\dot{\mathcal {F}}(\varvec{x}_{t})=0$ iff $\varvec{x}_{t}\in \mathcal {M}^{\textrm{DB}}$. Because $\dot{\varvec{x}}_{t}=0 \Rightarrow \dot{\mathcal {F}}(\varvec{x}_{t})=0$, $\mathcal {M}^{\textrm{ST}}=\mathcal {M}^{\textrm{DB}}$. $\square $
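As a hedged numerical illustration of Proposition 3, the sketch below integrates a generalized gradient flow with the quadratic dissipation function of Eq. 43 on a hypothetical three-vertex path graph. The graph, the edge weights `m`, the reference state `x_tilde`, the choice $\mathcal {F}(\varvec{x})=\frac{1}{2}\Vert \varvec{x}-\tilde{\varvec{x}}\Vert ^{2}$, and the explicit Euler scheme are all assumptions made for illustration; $\textrm{grad}_{\mathbb {S}}$ is taken to be multiplication by $\mathbb {S}^{T}$.

```python
# Hypothetical path graph v1 - v2 - v3 with edges e1: v1->v2, e2: v2->v3.
S = [[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]]  # oriented so that dx/dt = -S j
m = [1.0, 2.0]                              # assumed edge weights (diagonal M*)
x_tilde = [1.0, 1.0, 1.0]                   # assumed reference state

def F(x):
    """Assumed potential F(x) = 0.5 * |x - x_tilde|^2."""
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, x_tilde))

def force(x):
    """f = grad_S dF(x) = S^T (x - x_tilde)."""
    y = [xi - ti for xi, ti in zip(x, x_tilde)]
    return [sum(S[i][e] * y[i] for i in range(3)) for e in range(2)]

def flux(f):
    """j = dPsi*(f) = M* f for the quadratic dissipation function."""
    return [me * fe for me, fe in zip(m, f)]

def dissipation(j, f):
    """Psi(j) + Psi*(f) for the quadratic Legendre pair."""
    psi_star = 0.5 * sum(me * fe ** 2 for me, fe in zip(m, f))
    psi = 0.5 * sum(je ** 2 / me for me, je in zip(m, j))
    return psi + psi_star

def simulate(x0, dt=0.01, steps=3000):
    """Explicit Euler integration of dx/dt = -S j(x); returns F-values and final state."""
    x, values = list(x0), [F(x0)]
    for _ in range(steps):
        j = flux(force(x))
        x = [x[i] - dt * sum(S[i][e] * j[e] for e in range(2)) for i in range(3)]
        values.append(F(x))
    return values, x

values, x_final = simulate([3.0, 0.5, 1.0])
```

In this sketch the recorded values of $\mathcal {F}$ never increase, and the trajectory settles at a state with vanishing force. Because this particular $\mathbb {S}$ conserves the total $\sum _{i}x_{i}$, that state is a DB state but not the global minimizer $\tilde{\varvec{x}}$ of $\mathcal {F}$.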

It should be noted that, even if $\mathcal {F}(\varvec{x})$ has a single minimum, the steady state $\varvec{x}_{st}:=\lim _{t \rightarrow \infty }\varvec{x}(t)$ may not be the minimum, because $\dot{\mathcal {F}}(\varvec{x}_{t})=0$ holds for any $\varvec{x}\in \mathcal {M}^{\textrm{DB}}$.^{Footnote 47}

The generalized gradient flow of this form (Eq. 49) was devised in the process of extending the conventional gradient flow to metric spaces [122, 123].^{Footnote 48} Furthermore, dissipation functions have been recognized since the seminal work of Onsager [124,125,126]. However, only quadratic dissipation functions have been investigated until very recently [75,76,77,78,79,80,81,82]. This may be partly because we lack an adequate geometric language to handle the non-quadratic cases, i.e., information geometry. Actually, if the dissipation function is quadratic $\Psi ^{q,*}_{\varvec{x}}[\varvec{f}]$ as in Eq. 43, then the generalized flow (Eq. 45) formally reduces to the flow on a Riemannian manifold with the metric $(\mathbb {S}M^{*}_{\varvec{x}}\mathbb {S}^{T})^{-1}$.

The non-positivity of $\dot{\mathcal {F}}(\varvec{x}_{t})$ is essentially attributed to the fact that $\dot{\mathcal {F}}(\varvec{x}_{t})=-\langle \varvec{j}(\varvec{x}),\varvec{f}(\varvec{x})\rangle $ holds in Eq. 50 for the generalized gradient flow. The converse also holds.

Proposition 4

(De Giorgi’s formulation of generalized gradient flow [75, 79]) Let $\varvec{x}_{t}$ be a generalized flow induced by a force $\varvec{f}(\varvec{x})$. $\varvec{x}_{t}$ is the generalized gradient flow of $\mathcal {F}(\varvec{x})$ iff

$$\begin{aligned} \dot{\mathcal {F}}(\varvec{x}_{t})&=-\langle \varvec{j}(\varvec{x}),\varvec{f}(\varvec{x})\rangle = -\left( \Psi _{\varvec{x}}[\varvec{j}(\varvec{x})] + \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]\right) \end{aligned}$$

(51)

holds. The integral form of Eq. 51,

$$\begin{aligned} \mathcal {F}(\varvec{x}_{0})-\mathcal {F}(\varvec{x}_{t})=\int _{0}^{t} \left[ \Psi ^{*}_{\varvec{x}_{t'}}(\varvec{f}(\varvec{x}_{t'}))+ \Psi _{\varvec{x}_{t'}}(\varvec{j}(\varvec{x}_{t'}))\right] \textrm{d}t', \end{aligned}$$

(52)

is called De Giorgi’s $(\Psi ,\Psi ^{*})$-formulation of generalized gradient flow.

Proof

For a generalized flow $\varvec{x}_{t}$ driven by force $\varvec{f}(\varvec{x})$ as in Eq. 45 and for any $\mathcal {F}(\varvec{x})$, the following inequality holds:

$$\begin{aligned} \dot{\mathcal {F}}(\varvec{x}_{t})=\left\langle \dot{\varvec{x}},\frac{\partial \mathcal {F}(\varvec{x}_{t})}{\partial \varvec{x}} \right\rangle&=\left\langle -\varvec{j}(\varvec{x}_{t}),\textrm{grad}_{\mathbb {S}}\frac{\partial \mathcal {F}(\varvec{x}_{t})}{\partial \varvec{x}} \right\rangle \end{aligned}$$

(53)

$$\begin{aligned}&=-\left[ \Psi _{\varvec{x}_{t}}(\varvec{j}(\varvec{x}_{t})) + \Psi ^{*}_{\varvec{x}_{t}}(\varvec{f}'(\varvec{x}_{t}))\right] \nonumber \\&\quad + \mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}_{t}}[\varvec{j}(\varvec{x}_{t});\varvec{f}'(\varvec{x}_{t})] \end{aligned}$$

(54)

$$\begin{aligned}&\ge -\left[ \Psi _{\varvec{x}_{t}}(\varvec{j}(\varvec{x}_{t})) + \Psi ^{*}_{\varvec{x}_{t}}(\varvec{f}'(\varvec{x}_{t}))\right] , \end{aligned}$$

(55)

where we define $\varvec{f}'(\varvec{x}):=\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})$. The last inequality becomes an equality if and only if $\varvec{f}'(\varvec{x}_{t})$ is the Legendre dual of $\varvec{j}(\varvec{x}_{t})$,^{Footnote 49} i.e.,

$$\begin{aligned} \mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}_{t}}[\varvec{j}(\varvec{x}_{t});\varvec{f}'(\varvec{x}_{t})]=0 \Longleftrightarrow \varvec{j}(\varvec{x}_{t})=\partial \Psi ^{*}_{\varvec{x}_{t}}[\varvec{f}'(\varvec{x}_{t})] \end{aligned}$$

(56)

Thus, Eq. 51 holds only when $\varvec{x}_{t}$ is the generalized gradient flow of $\mathcal {F}(\varvec{x})$. $\square $
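The inequality used in the proof (Eqs. 53 to 55) is the Fenchel-Young inequality for the pair $(\Psi ,\Psi ^{*})$. The following minimal Python sketch checks it for a scalar quadratic pair in the sense of Eq. 43; the weight `M_STAR` and the function names are assumptions chosen for illustration.

```python
M_STAR = 2.0  # assumed positive scalar weight (a 1x1 instance of M* in Eq. 43)

def psi_star(f):
    """Scalar quadratic dissipation function Psi*(f) = 0.5 * M* * f^2."""
    return 0.5 * M_STAR * f * f

def psi(j):
    """Legendre conjugate Psi(j) = 0.5 * j^2 / M*."""
    return 0.5 * j * j / M_STAR

def bregman(j, f_prime):
    """D[j; f'] = Psi(j) + Psi*(f') - <j, f'> (Eq. 41), scalar case."""
    return psi(j) + psi_star(f_prime) - j * f_prime

def both_sides(j, f_prime):
    """Return (-<j, f'>, -(Psi(j) + Psi*(f'))), the two sides compared in Eqs. 53-55."""
    return -j * f_prime, -(psi(j) + psi_star(f_prime))

def dual_flux(f_prime):
    """Legendre dual flux j = dPsi*(f') = M* f'."""
    return M_STAR * f_prime
```

For any pair `(j, f_prime)` the first returned value is at least the second, and the gap is exactly `bregman(j, f_prime)`, which vanishes only when `j == dual_flux(f_prime)`.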

De Giorgi’s formulation is a well-established approach for defining gradient flow in metric spaces [122].

4.5 Equilibrium and nonequilibrium flow

In this work, we mainly focus on the case that $\mathcal {F}(\varvec{x})= \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]$ where $\mathcal {D}^{\mathcal {X}}_{\Phi }$ is the Bregman divergence associated with a thermodynamic function $\Phi $.

Definition 24

(Equilibrium force, equilibrium flux, and equilibrium flow) The force generated by the gradient of Bregman divergence associated with a thermodynamic function $\Phi $ is called the (thermodynamic) equilibrium force, and the following equation is denoted as the thermodynamic gradient equation:

$$\begin{aligned} \varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}], \end{aligned}$$

(57)

where $\tilde{\varvec{x}}\in \mathcal {X}$ is a parameter. The dual of $\varvec{f}(\varvec{x})$, i.e., $\varvec{j}(\varvec{x})=\partial \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]$, is called the equilibrium flux. A generalized flow $\varvec{x}(t)$ is an equilibrium flow if it is driven by the equilibrium force:

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]]. \end{aligned}$$

(58)

Using the relation $\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]=\partial \Phi (\varvec{x}) - \tilde{\varvec{y}}$ where $\tilde{\varvec{y}}=\partial \Phi (\tilde{\varvec{x}})$, Eq. 58 can be rewritten as

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}}\left[ \partial \Psi ^{*}_{\varvec{x}}\left[ \textrm{grad}_{\mathbb {S}}\left\{ \partial \Phi (\varvec{x}) - \tilde{\varvec{y}}\right\} \right] \right] , \end{aligned}$$

(59)

which explicitly shows the contribution of both the thermodynamic function and the dissipation function to the dynamics (Fig. 3a).

Various properties of the equilibrium flow (Eq. 58) can be obtained from the doubly dual flat structure as we will see in the following sections. In addition, the equilibrium flow captures the properties that the dynamics of thermodynamic equilibrium systems should have. In this sense, the equilibrium flow is the mathematical representation of the dynamics of equilibrium systems.

Beyond the gradient equilibrium flow, we also consider the non-gradient nonequilibrium flow of the following type:

Definition 25

(Nonequilibrium force and nonequilibrium flow) The force generated by a shift of the equilibrium force

$$\begin{aligned} \varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}] + \varvec{f}_{NE}, \end{aligned}$$

(60)

is called a nonequilibrium force if $\varvec{f}_{NE}\not \in \textrm{Im}[\mathbb {S}^{T}]$.^{Footnote 50} If the shift $\varvec{f}_{NE}$ satisfies $\varvec{f}_{NE}\in \textrm{Im}[\mathbb {S}^{T}]$, then $\varvec{f}(\varvec{x})$ is reduced to the equilibrium force, i.e., the case $\varvec{f}_{NE}=\varvec{0}$, by appropriately changing $\tilde{\varvec{x}}$. The nonequilibrium flow is the flow induced by the nonequilibrium force (Fig. 3b):

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}\left[ \left[ \textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\right] +\varvec{f}_{NE}\right] . \end{aligned}$$

(61)

In the next section, we show that this equation can cover a sufficiently wide class of models, e.g., all types of rLDG and CRN with extended LMA kinetics. Equation 61 can also be associated with nonequilibrium dynamics with a constant environmental force. The techniques in information geometry, Hessian geometry, and convex analysis enable us to investigate such non-gradient dynamics.

Remark 6

(Variational modeling [128]) We introduced and characterized dynamics based on the thermodynamic functions and dissipation functions. While we employed a restricted definition in order to link dynamics to information geometry, we may further generalize this approach by appropriately choosing the state space, $\varvec{f}(\varvec{x})$, $\Psi ^{*}_{\varvec{x}}(\varvec{f})$, and $\textrm{div}_{\mathbb {S}}$. For example, we may consider an $\varvec{x}$-dependent and noninteger-valued matrix for $\mathbb {S}(\varvec{x})$. The equilibrium flow may not be restricted to $\mathcal {F}(\varvec{x})= \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]$, and the nonequilibrium flow may be defined for $\varvec{x}$-dependent $\varvec{f}_{NE}(\varvec{x})$. This type of approach for modeling dissipative dynamics has been known as variational modeling.

Before closing this section, we mention that the existence of DB states, i.e., $\mathcal {M}^{\textrm{DB}}\ne \emptyset $, is necessary and sufficient for a flow of the form of Eq. 61 to be an equilibrium flow.

Proposition 5

(Detailed balance condition and equilibrium flow) Consider a flow given by Eq. 61. If $\mathcal {M}^{\textrm{DB}}\ne \emptyset $, then the flow is equilibrium, i.e., $\varvec{f}_{NE}\in \textrm{Im}\mathbb {S}^{T}$.

Proof

$\mathcal {M}^{\textrm{DB}}\ne \emptyset $ means that there exists $\varvec{x}_{DB}$ satisfying $\varvec{j}(\varvec{x}_{DB})=\varvec{0}$. Then we have $\varvec{j}(\varvec{x}_{DB})=\varvec{0}\Leftrightarrow \varvec{f}(\varvec{x}_{DB})=\varvec{0}$. If $\varvec{f}_{NE}\not \in \textrm{Im}[\mathbb {S}^{T}]$, then $\varvec{f}_{NE}\ne \varvec{0}$ and, because the gradient part of Eq. 60 lies in $\textrm{Im}[\mathbb {S}^{T}]$ and therefore cannot cancel $\varvec{f}_{NE}$, $\varvec{f}(\varvec{x}) \ne \varvec{0}$ for all $\varvec{x}\in \mathcal {X}$, which contradicts $\varvec{f}(\varvec{x}_{DB})=\varvec{0}$. Thus, $\varvec{f}_{NE} \in \textrm{Im}[\mathbb {S}^{T}]$ if $\mathcal {M}^{\textrm{DB}}\ne \emptyset $.

The necessity follows basically from Prop. 3, but we have to show $\mathcal {M}^{\textrm{ST}} \ne \emptyset $. This will be shown in the following section (Lemma 1).

5 Explicit form of thermodynamic and dissipation functions

Before investigating the dynamics of the equilibrium (Eq. 58) and nonequilibrium (Eq. 61) flows, we show how the flows can be associated with the dynamics on graphs and hypergraphs via specific forms of the thermodynamic and dissipation functions. The forms of functions depend on the functional form of the flux that we assume: Eq. 3 for rLDG, Eq. 12 for CRN with LMA kinetics, and Eq. 18 for FPE. It should be noted that the choice of the thermodynamic function and the dissipation function is not unique for a given dynamics in general. Depending on the purpose, we should choose or find an appropriate set of functions.

5.1 Explicit form of thermodynamic functions for rLDG and CRN

For rLDG (Eq. 3) and CRN with LMA kinetics (Eq. 12), the following pair of thermodynamic functions is particularly relevant^{Footnote 51}:

$$\begin{aligned} \Phi (\varvec{x})&:=\left[ \ln \varvec{x} - \ln \varvec{x}^{o} - \varvec{1}\right] ^{T}\varvec{x} = \sum _{i=1}^{N_{\mathbb {X}}}\left[ \ln \frac{x_{i}}{x_{i}^{o}}-1\right] x_{i},\nonumber \\ \Phi ^{*}(\varvec{y})&:=(\varvec{x}^{o})^{T}e^{\varvec{y}} =\sum _{i=1}^{N_{\mathbb {X}}}x_{i}^{o}e^{y_{i}}, \end{aligned}$$

(62)

which induce the following Legendre transformation:

$$\begin{aligned} \varvec{y}&=\partial \Phi (\varvec{x}) = \ln \varvec{x}-\ln \varvec{x}^{o},&\varvec{x}&=\partial \Phi ^{*}(\varvec{y}) = \varvec{x}^{o}\circ e^{\varvec{y}}. \end{aligned}$$

(63)

Here, $\mathcal {Y}=\mathbb {R}^{N_{\mathbb {X}}}$, and $\varvec{x}^{o}\in \mathcal {X}$ is a parameter determining the point in $\mathcal {X}$ that is associated with the origin of $\mathcal {Y}$ via the Legendre transformation. For these thermodynamic functions, the Bregman divergence is reduced to the generalized Kullback-Leibler divergence:

$$\begin{aligned} \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}']=\left( \ln \frac{\varvec{x}}{\varvec{x}'}\right) ^{T}\varvec{x}-\varvec{1}^{T}(\varvec{x}-\varvec{x}'). \end{aligned}$$

(64)

These thermodynamic functions and the generalized KL divergence are separable.

If we choose $\varvec{x}^{o}=\varvec{1}$, then the conventional dual representation for the probability density $\varvec{p}$ on a discrete space is recovered:

$$\begin{aligned} \Phi (\varvec{p})&\! = \left[ \ln \varvec{p} \!-\! \varvec{1}\right] ^{T}\varvec{p},&\!\Phi ^{*}(\varvec{y})&= \varvec{1}^{T}e^{\varvec{y}}, \nonumber \\ \!\varvec{y}&\!=\partial _{\varvec{p}}\Phi (\varvec{p}) \!=\! \ln \varvec{p},&\!\varvec{p}&\!=\partial _{\varvec{y}}\Phi ^{*}(\varvec{y}) = e^{\varvec{y}}. \end{aligned}$$

(65)

In this case, $\mathcal {Y}$ is the space of the logarithm of $\varvec{p}$. These representations hold even if $\varvec{p}$ is not a probability density. If $\varvec{p}$ satisfies $\varvec{1}^{T}\varvec{p}=1$, the generalized KL divergence becomes the normal KL divergence $\mathcal {D}^{\mathcal {X}}[\varvec{p}\Vert \varvec{p}']=\left( \ln \frac{\varvec{p}}{\varvec{p}'}\right) ^{T}\varvec{p}$.^{Footnote 52}
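The Legendre pair of Eqs. 62 and 63 and its reduction to the generalized KL divergence (Eq. 64) can be checked numerically. The sketch below is a minimal Python illustration under stated assumptions: the reference point `x_o` is an arbitrary positive vector, and the Bregman divergence on the vertex space is taken in the form $\Phi (\varvec{x})+\Phi ^{*}(\varvec{y}')-\langle \varvec{x},\varvec{y}'\rangle $, by analogy with Eq. 41.

```python
import math

x_o = [0.5, 2.0, 1.0]  # assumed reference point x^o (any positive vector works)

def phi(x):
    """Thermodynamic function Phi(x) of Eq. 62."""
    return sum((math.log(xi / oi) - 1.0) * xi for xi, oi in zip(x, x_o))

def phi_star(y):
    """Conjugate function Phi*(y) of Eq. 62."""
    return sum(oi * math.exp(yi) for oi, yi in zip(x_o, y))

def to_y(x):
    """Legendre transformation y = dPhi(x) = ln x - ln x^o (Eq. 63)."""
    return [math.log(xi / oi) for xi, oi in zip(x, x_o)]

def to_x(y):
    """Inverse transformation x = dPhi*(y) = x^o * exp(y) (Eq. 63)."""
    return [oi * math.exp(yi) for oi, yi in zip(x_o, y)]

def bregman(x, x_prime):
    """Phi(x) + Phi*(y') - <x, y'> with y' = dPhi(x')."""
    y_prime = to_y(x_prime)
    return phi(x) + phi_star(y_prime) - sum(a * b for a, b in zip(x, y_prime))

def generalized_kl(x, x_prime):
    """Generalized KL divergence of Eq. 64, written componentwise."""
    return sum(a * math.log(a / b) - (a - b) for a, b in zip(x, x_prime))
```

The two divergence functions agree for any positive `x` and `x_prime`, and the result does not depend on the choice of `x_o`, which only fixes the point mapped to the origin of $\mathcal {Y}$.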

5.2 Explicit form of dissipation functions for rLDG and CRN

To determine the dissipation functions, we need the definition of force, which may depend on the phenomena and purpose.^{Footnote 53} In physics, the flux-force relations, which are also called constitutive equations [129], are central because they determine what kind of change is induced by an incurred force.^{Footnote 54} For rMJP and CRNs, the flux and force are conventionally defined using the one-way fluxes, $\varvec{j}^{+}(\varvec{x})$ and $\varvec{j}^{-}(\varvec{x})$ as

$$\begin{aligned} \varvec{j}&=\varvec{j}^{+}-\varvec{j}^{-},&\varvec{f}&=\ln \varvec{j}^{+} - \ln \varvec{j}^{-}, \end{aligned}$$

(66)

where the dependency of $\varvec{j}^{\pm }(\varvec{x})$ on $\varvec{x}$ is abbreviated for notational simplicity. In physics, assuming this form of force-flux relation goes by the name of the local detailed balance (LDB) assumption,^{Footnote 55} or the generalized detailed balance assumption.^{Footnote 56} By defining the frenetic activity [132]:

$$\begin{aligned} \varvec{\omega }:=2 \sqrt{\varvec{j}^{+}\circ \varvec{j}^{-}}\in \mathbb {R}_{{\ge } 0}^{N_{\mathbb {e}}}, \end{aligned}$$

(67)

we have a relation $\varvec{j}=\varvec{\omega }\circ \left[ \frac{\exp (\varvec{f}/{2})-\exp (-\varvec{f}/{2})}{2}\right] $. For a fixed $\varvec{\omega }$, this relation between the pair $(\varvec{j},\varvec{f})$ is a one-to-one Legendre duality induced by the following specific form of dissipation functions:

$$\begin{aligned} {\begin{matrix} \Psi ^{*}_{\varvec{\omega }}(\varvec{f})&{}:={2} \varvec{\omega }^{T} \left[ \cosh (\varvec{f}/{2})-\varvec{1}\right] , \\ \Psi _{\varvec{\omega }}(\varvec{j})&{}:={2}\varvec{\omega }^{T}\left( \textrm{diag}\left[ \frac{\varvec{j}}{\varvec{\omega }}\right] \sinh ^{-1}\left( \frac{\varvec{j}}{\varvec{\omega }}\right) - \left[ \sqrt{\varvec{1}+\left( \frac{\varvec{j}}{\varvec{\omega }} \right) ^{2}}-\varvec{1}\right] \right) , \end{matrix}} \end{aligned}$$

(68)

which lead to the Legendre transformation:

$$\begin{aligned} \varvec{j}&=\partial \Psi ^{*}_{\varvec{\omega }}(\varvec{f})=\varvec{\omega }\circ \sinh (\varvec{f}/{2}),&\varvec{f}&=\partial \Psi _{\varvec{\omega }}(\varvec{j})={2} \sinh ^{-1}\left( \frac{\varvec{j}}{\varvec{\omega }}\right) . \end{aligned}$$

(69)

We can easily verify that these functions satisfy the conditions for dissipation functions, i.e., Eq. 34, Eq. 35, and Eq. 33.
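A numerical check of these claims can be sketched as follows. The Python code below implements Eqs. 66 to 69 componentwise and is meant as a minimal illustration; the function names and the sample one-way fluxes used in any call are assumptions, not part of the original formulation.

```python
import math

def from_one_way(j_plus, j_minus):
    """Flux, force (Eq. 66) and frenetic activity (Eq. 67) from one-way fluxes."""
    j = [p - q for p, q in zip(j_plus, j_minus)]
    f = [math.log(p / q) for p, q in zip(j_plus, j_minus)]
    omega = [2.0 * math.sqrt(p * q) for p, q in zip(j_plus, j_minus)]
    return j, f, omega

def psi_star(f, omega):
    """Psi*_omega(f) = 2 omega^T [cosh(f/2) - 1] (Eq. 68)."""
    return 2.0 * sum(w * (math.cosh(fe / 2.0) - 1.0) for fe, w in zip(f, omega))

def psi(j, omega):
    """Psi_omega(j) of Eq. 68, evaluated edge by edge with r = j_e / omega_e."""
    total = 0.0
    for je, w in zip(j, omega):
        r = je / w
        total += 2.0 * w * (r * math.asinh(r) - (math.sqrt(1.0 + r * r) - 1.0))
    return total

def flux_from_force(f, omega):
    """j = omega * sinh(f/2) (Eq. 69)."""
    return [w * math.sinh(fe / 2.0) for fe, w in zip(f, omega)]

def force_from_flux(j, omega):
    """f = 2 asinh(j / omega) (Eq. 69)."""
    return [2.0 * math.asinh(je / w) for je, w in zip(j, omega)]
```

For positive one-way fluxes, `flux_from_force` and `force_from_flux` invert each other on the triple returned by `from_one_way`, and `psi(j, omega) + psi_star(f, omega)` equals the pairing $\langle \varvec{j},\varvec{f}\rangle $, as required by Eq. 37.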

For the flux $\varvec{j}_{\textrm{MA}}(\varvec{x})$ of LMA kinetics (Eq. 12), the force and activity become^{Footnote 57}

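$$\begin{aligned} \varvec{f}_{\textrm{MA}}(\varvec{x};\varvec{K})&=\ln \varvec{K}+\mathbb {S}^{T}\ln \varvec{x},&\omega _{e}(\varvec{x})&=2\kappa _{e}\prod _{i=1}^{N_{\mathbb {X}}}x_{i}^{(\gamma ^{+}_{i,e}+\gamma ^{-}_{i,e})/2}, \end{aligned}$$

(70)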

where we introduced a transformation of the kinetic parameters $(\varvec{k}^{+}, \varvec{k}^{-})$ into the force part $\varvec{K}$ and activity part $\varvec{\kappa }$ as $\varvec{\kappa }:=\sqrt{\varvec{k}^{+}\circ \varvec{k}^{-}}$ and $\varvec{K}:=\varvec{k}^{+}/\varvec{k}^{-}$.^{Footnote 58} Because $\varvec{k}^{\pm }=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}$ holds, $(\varvec{\kappa },\varvec{K})$ has the same information as $(\varvec{k}^{+}, \varvec{k}^{-})$. Moreover, we can verify that the force and activity are dependent only on $\varvec{K}$ and $\varvec{\kappa }$, respectively. The dissipation functions of the forms above and their relations to rLDG and CRN were derived from the large deviation function of the corresponding microscopic stochastic models [75, 133]. Actually, the Bregman divergence $\mathcal {D}_{\varvec{x}}^{\mathcal {J}}[\varvec{j};\varvec{j}_{\textrm{MA}}(\varvec{x})]$ of the dissipation functions is identical to the rate function of the flux for rMJP and CRN. Thus, these dissipation functions are keystones connecting macroscopic and microscopic dynamics.

If there exists $\tilde{\varvec{y}}$ satisfying $-\mathbb {S}^{T}\tilde{\varvec{y}}=\ln \varvec{K}$, i.e., $\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}$, the force in Eq. 70 is represented as

$$\begin{aligned} \varvec{f}_{\textrm{MA}}(\varvec{x};\varvec{K})&=\textrm{grad}_{\mathbb {S}}\left( \ln \frac{\varvec{x}}{\tilde{\varvec{x}}}\right) =\textrm{grad}_{\mathbb {S}} \partial _{\varvec{x}} \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \tilde{\varvec{x}}] \in \textrm{Im}\mathbb {S}^{T}, \end{aligned}$$

(71)

where $\tilde{\varvec{x}}$ is the Legendre conjugate of $\tilde{\varvec{y}}$.^{Footnote 59} Thus, CRN (and rMJP) is an equilibrium flow of the generalized KL divergence $\mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \tilde{\varvec{x}}]$ when the parameter $\varvec{K}$ satisfies $\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}$. In chemistry, the condition $\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}$ is called Wegscheider’s equilibrium condition [47, 134], and a CRN satisfying this parametric condition is called an equilibrium CRN.^{Footnote 60} Even if $\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}$ is not satisfied, we can represent $\ln \varvec{K}= -\mathbb {S}^{T}\tilde{\varvec{y}} + \varvec{f}_{NE}$ with $\varvec{f}_{NE}\not \in \textrm{Im}\mathbb {S}^{T}$. The force in Eq. 70 can therefore always be represented as

$$\begin{aligned} \varvec{f}_{\textrm{MA}}(\varvec{x};\varvec{K})&=\textrm{grad}_{\mathbb {S}}\left( \ln \frac{\varvec{x}}{\tilde{\varvec{x}}}\right) + \varvec{f}_{NE}=\left[ \textrm{grad}_{\mathbb {S}} \left[ \partial _{\varvec{x}} \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \tilde{\varvec{x}}]\right] + \varvec{f}_{NE}\right] , \end{aligned}$$

(72)

which leads to the nonequilibrium flow (Eq. 61). Thus, CRN with LMA kinetics as well as rLDG are generally within the class of Eq. 61.
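The split of $\ln \varvec{K}$ into an equilibrium part and a shift $\varvec{f}_{NE}$ can be made concrete on a small example. The Python sketch below uses a hypothetical monomolecular cycle $X_{1}\rightleftharpoons X_{2}\rightleftharpoons X_{3}\rightleftharpoons X_{1}$ with an assumed mass-action form of the one-way fluxes; it is not the Brusselator of the text. For this cycle, $\textrm{Ker}\,\mathbb {S}$ is spanned by $(1,1,1)^{T}$, and taking $\varvec{f}_{NE}$ as the component of $\ln \varvec{K}$ along that direction is one possible choice among many.

```python
import math

# Hypothetical cycle X1 <-> X2 <-> X3 <-> X1; reaction e converts X_e into X_{e+1}.
# S = gamma_plus - gamma_minus, oriented so that dx/dt = -S j.
S = [[1, 0, -1], [-1, 1, 0], [0, -1, 1]]

def one_way_fluxes(x, k_plus, k_minus):
    """Assumed mass-action form: j+_e = k+_e x_e and j-_e = k-_e x_{e+1}."""
    j_plus = [k_plus[e] * x[e] for e in range(3)]
    j_minus = [k_minus[e] * x[(e + 1) % 3] for e in range(3)]
    return j_plus, j_minus

def force(x, k_plus, k_minus):
    """f = ln K + S^T ln x with K = k+ / k-."""
    ln_x = [math.log(v) for v in x]
    return [
        math.log(k_plus[e] / k_minus[e]) + sum(S[i][e] * ln_x[i] for i in range(3))
        for e in range(3)
    ]

def nonequilibrium_part(k_plus, k_minus):
    """Component of ln K along Ker S = span{(1,1,1)}; one possible choice of f_NE."""
    mean = sum(math.log(p / q) for p, q in zip(k_plus, k_minus)) / 3.0
    return [mean] * 3

def satisfies_wegscheider(k_plus, k_minus, tol=1e-12):
    """ln K lies in Im S^T iff its component along Ker S vanishes."""
    return all(abs(c) < tol for c in nonequilibrium_part(k_plus, k_minus))
```

With `k_plus = [2.0, 3.0, 1.0]` and `k_minus = [1.0, 1.5, 4.0]` the product of the $K_{e}$ around the cycle is one, the condition holds, and the force vanishes at `x = [1.0, 2.0, 4.0]`. If the product differs from one, the components of the force sum to the same nonzero constant for every positive `x`, so no DB state exists, in line with Proposition 5.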

Example 2

(Simplified Brusselator CRN [8, 104] (continued)) For the Brusselator CRN introduced in Ex. 1, the force and activity defined in Eq. 70 can be explicitly represented as

(73)

Remark 7

(Wegscheider’s equilibrium condition and detailed balance condition) While we defined equilibrium flow by the specific functional form of force and obtained Wegscheider’s equilibrium condition as the necessary and sufficient condition to have the equilibrium force under LMA kinetics, the equilibrium dynamics is often defined by the existence of the steady state satisfying the DB condition (Eq. 48) in CRN theory. In addition, the DB condition is also often assumed in statistics when we design or analyze a random walk in parameter spaces, e.g., in the Markov Chain Monte Carlo (MCMC) simulations or in other random-walk-based optimization schemes.^{Footnote 61} These two are equivalent for (extended) LMA kinetics. Actually, $\mathcal {M}^{\textrm{DB}} \ne \emptyset $ means that there exists $\varvec{x}_{DB} \in \mathcal {X}$ such that $\varvec{j}_{MA}(\varvec{x}_{DB})=\varvec{0} \Leftrightarrow - \mathbb {S}^{T}\ln \varvec{x}_{DB}=\ln \varvec{K}$. From the Fredholm alternative, we obtain Wegscheider’s equilibrium condition $\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}$ for the existence of $\varvec{x}_{DB}$.

Remark 8

(Linear graph Laplacian dynamics) The linear graph Laplacian dynamics defined by Eq. 22 can be formally regarded as a generalized flow. From the form of the graph Laplacian (Eq. 21),^{Footnote 62} it is easy to see that Eq. 22 coincides with Eq. 58 if

$$\begin{aligned} \Phi (\varvec{x})&=\frac{1}{2}\langle \varvec{x},M_{0}\varvec{x} \rangle ,&\Psi ^{*,q}_{\varvec{x}}(\varvec{f})&=\frac{1}{2}\langle \varvec{f},M^{1}\varvec{f}\rangle , \end{aligned}$$

(74)

where $M_{0}=I$, $M^{1}=\textrm{diag}[\varvec{k}]$, and $\tilde{\varvec{x}}=\varvec{0}$. In contrast to rLDG, the natural state space and the corresponding dual are $\mathcal {X}=\mathcal {Y}=\mathbb {R}^{N_{\mathbb {v}}}$.^{Footnote 63} In [23], a general non-quadratic $\Phi (\varvec{x})$ is considered as a class of nonlinear diffusion on a network from an information-geometric viewpoint.

5.3 Some remarks on the dissipation functions for rLDG and CRN

The dissipation functions in Eq. 68 have several notable properties. First, they are separable:

$$\begin{aligned} \Psi ^{*}_{\varvec{\omega }(\varvec{x})}(\varvec{f})&= \sum _{e=1}^{N_{\mathbb {e}}} \omega _{e}(\varvec{x}) \psi ^{*}(f_{e}),&\Psi _{\varvec{\omega }(\varvec{x})}(\varvec{j})&= \sum _{e=1}^{N_{\mathbb {e}}} \omega _{e}(\varvec{x}) \psi (j_{e}/\omega _{e}(\varvec{x})), \end{aligned}$$

(75)

where

$$\begin{aligned} \psi ^{*}(f)&:={2} \left[ \cosh (f/{2})-1\right] \in [0,\infty ), \end{aligned}$$

(76)

$$\begin{aligned} \psi (j)&:={2}\left( j\sinh ^{-1}\left( j\right) - \left[ \sqrt{1+j^{2}}-1\right] \right) \in [0,\infty ), \end{aligned}$$

(77)

and $\varvec{\omega }(\varvec{x})$ is local: $\omega _{e}(\varvec{x})=2 \kappa _{e}\prod _{i=1}^{N_{\mathbb {X}}}x_{i}^{(\gamma ^{+}_{i,e}+\gamma ^{-}_{i,e})/2}$. The thermodynamic functions in Eq. 62 are also separable.^{Footnote 64}
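The Legendre duality between the scalar functions in Eqs. 76 and 77 can be checked numerically. The sketch below assumes that the argument under the square root in Eq. 77 is $j$ itself; the function names are illustrative only. It verifies that $\psi '(j)=2\sinh ^{-1}(j)$ and $\psi ^{*\prime }(f)=\sinh (f/2)$ are mutually inverse and that the Fenchel–Young equality $\psi (j)+\psi ^{*}(f)=jf$ holds on a dual pair.

```python
import numpy as np

def psi_star(f):
    """Scalar dissipation function of the force, Eq. 76."""
    return 2.0 * (np.cosh(f / 2.0) - 1.0)

def psi(j):
    """Scalar dissipation function of the flux, Eq. 77 (with j under the root)."""
    return 2.0 * (j * np.arcsinh(j) - (np.sqrt(1.0 + j**2) - 1.0))

def force_of_flux(j):
    """Derivative of psi: the Legendre map from flux to force."""
    return 2.0 * np.arcsinh(j)

def flux_of_force(f):
    """Derivative of psi_star: the Legendre map from force to flux."""
    return np.sinh(f / 2.0)

j_grid = np.linspace(-3.0, 3.0, 13)
f_grid = force_of_flux(j_grid)
roundtrip = flux_of_force(f_grid)                                 # should reproduce j_grid
fenchel_gap = psi(j_grid) + psi_star(f_grid) - j_grid * f_grid    # should vanish
```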

Second, the scalar function $\psi ^{*}(f)$ is an N-function. N-functions of the $(\cosh (f)-1)$-type and the associated Orlicz spaces have been employed by Pistone to establish infinite-dimensional information geometry [72, 135, 136]. In functional analysis, Orlicz spaces generalize the $L^{p}$ spaces and arise naturally when working with the $L \log ^{+} L$ space for divergences and large deviation functions. Hence, the dissipation functions in Eq. 68 are closely related to these topics.

Third, various information geometric measures and quantities are related to the dissipation functions in Eq. 68 and also to the associated quantities as follows:

$$\begin{aligned} \frac{1}{4} \Psi ^{*}_{\varvec{\omega }}(\varvec{f})&=\frac{1}{2} \sum _{e=1}^{N_{\mathbb {e}}}\left[ \sqrt{j_{e}^{+}}- \sqrt{j_{e}^{-}}\right] ^{2} =: \mathcal {D}_{Hel}[\varvec{j}^{+};\varvec{j}^{-}]^{2} \end{aligned}$$

(78)

$$\begin{aligned} \frac{1}{2}\varvec{1}^{T}\varvec{\omega }&= \sum _{e=1}^{N_{\mathbb {e}}}\sqrt{j_{e}^{+}j_{e}^{-}} =:\textrm{BC}[\varvec{j}^{+};\varvec{j}^{-}] \end{aligned}$$

(79)

$$\begin{aligned} \langle \varvec{j},\varvec{f}\rangle&= \sum _{e=1}^{N_{\mathbb {e}}}(j^{+}_{e}-j^{-}_{e})\ln \frac{j^{+}_{e}}{j^{-}_{e}}=: \mathcal {D}_{Jef}[\varvec{j}^{+};\varvec{j}^{-}], \end{aligned}$$

(80)

where $\mathcal {D}_{Hel}[\varvec{j}^{+};\varvec{j}^{-}]$, $\textrm{BC}[\varvec{j}^{+};\varvec{j}^{-}]$, and $\mathcal {D}_{Jef}[\varvec{j}^{+};\varvec{j}^{-}]$ are the Hellinger–Kakutani distance, the Bhattacharyya coefficient, and the Jeffreys divergence (symmetrized KL divergence) for $\varvec{j}^{+}$ and $\varvec{j}^{-}$, respectively. In addition, in physics, the bilinear pairing $\langle \varvec{j},\varvec{f}\rangle $ of a Legendre dual pair and its approximation using the Hessian matrix are often referred to as the entropy production rate (EPR) $\dot{\Sigma }$ and pseudo-entropy production rate (pEPR) $\dot{\Sigma }^{p}$, respectively [83, 137]:

$$\begin{aligned} \dot{\Sigma }&:=\langle \varvec{j},\varvec{f}\rangle =\sum _{e=1}^{N_{\mathbb {e}}}(j^{+}_{e}-j^{-}_{e})\ln \frac{j^{+}_{e}}{j^{-}_{e}}. \end{aligned}$$

(81)

$$\begin{aligned} \dot{\Sigma }^{p}&:=\langle \varvec{j}, G_{\varvec{\omega },\varvec{j}}\varvec{j}\rangle = 2\sum _{e=1}^{N_{\mathbb {e}}}\frac{(j^{+}_{e}-j^{-}_{e})^{2}}{j^{+}_{e}+j^{-}_{e}}, \end{aligned}$$

(82)

where we treat $\varvec{j}\in \mathcal {J}_{\varvec{x}}$ as a member of $\mathcal {T}_{\varvec{j}}\mathcal {J}_{\varvec{x}}$ by the isomorphism: $\mathcal {J}_{\varvec{x}} \cong \mathcal {T}_{\varvec{j}}\mathcal {J}_{\varvec{x}} \cong C_{1}(\mathbb {H})$. The pEPR $\dot{\Sigma }^{p}$ is an approximation of EPR by replacing $\varvec{f}=\partial \Psi _{\varvec{\omega }}(\varvec{j})$ with $G_{\varvec{\omega },\varvec{j}}\varvec{j}$ and works as a lower bound of $\dot{\Sigma }$: $\dot{\Sigma }\ge \dot{\Sigma }^{p}$ [137]^{Footnote 65}.
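The identities in Eqs. 78–82 are easy to confirm numerically. The following is a minimal sketch with randomly drawn one-way fluxes; it assumes $\omega _{e}=2\sqrt{j^{+}_{e}j^{-}_{e}}$ and $f_{e}=\ln (j^{+}_{e}/j^{-}_{e})$, which is how we read Eqs. 79 and 80, and the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
jp = rng.uniform(0.1, 3.0, size=5)   # forward one-way fluxes j^+
jm = rng.uniform(0.1, 3.0, size=5)   # backward one-way fluxes j^-

omega = 2.0 * np.sqrt(jp * jm)       # activity (assumed form, cf. Eq. 79)
f = np.log(jp / jm)                  # force (assumed form, cf. Eq. 80)
j = jp - jm                          # net flux

psi_star_total = np.sum(omega * 2.0 * (np.cosh(f / 2.0) - 1.0))   # Eq. 75 with Eq. 76
hellinger_sq = 0.5 * np.sum((np.sqrt(jp) - np.sqrt(jm)) ** 2)     # Eq. 78
bhattacharyya = np.sum(np.sqrt(jp * jm))                          # Eq. 79
epr = np.sum(j * f)                                               # Eqs. 80 and 81
pepr = 2.0 * np.sum(j**2 / (jp + jm))                             # Eq. 82
```

With these definitions, `psi_star_total / 4` matches `hellinger_sq`, `omega.sum() / 2` matches `bhattacharyya`, and `epr` is never smaller than `pepr`.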

Finally, the dissipation functions in Eq. 68 are not the unique choice to reproduce the force-flux relation in Eq. 66. The quadratic dissipation functions $\Psi ^{q,*}_{\varvec{x}}(\varvec{f}):=\frac{1}{2}\langle \varvec{f},M^{*}_{\varvec{x}} \varvec{f} \rangle $ in Eq. 43 with the following diagonal metric tensor can reproduce the relation in Eq. 66:

$$\begin{aligned} M^{*}_{\varvec{x}}=\textrm{diag}\left[ \frac{\varvec{j}^{+}(\varvec{x}) -\varvec{j}^{-}(\varvec{x})}{\ln \varvec{j}^{+}(\varvec{x})- \ln \varvec{j}^{-}(\varvec{x})} \right] =\textrm{diag}\left[ \left( \frac{j^{+}_{e}(\varvec{x})-j^{-}_{e}(\varvec{x})}{\ln j^{+}_{e}(\varvec{x})- \ln j^{-}_{e}(\varvec{x})}\right) _{e} \right] . \end{aligned}$$

(83)

This type of quadratic dissipation function was proposed even earlier than the non-quadratic ones [138,139,140] and has been investigated [104, 141, 142]. Its advantage is that the induced geometry is Riemannian, and thus the information geometric argument is not necessarily required. In addition, this Riemannian geometric structure is analogous to the formal Riemannian geometric structure of FPE and other diffusion processes on continuous manifolds induced via the $L^{2}$-Wasserstein geometry [65, 66] (Fig. 4). Thus, this quadratic dissipation function provides a consistent extension of these results for FPE and diffusion processes to graphs and hypergraphs. Nevertheless, the doubly dual flat structure with the non-quadratic dissipation functions that we introduce is another sound generalization of the formal Riemannian geometry of FPE, as we see in the next subsection.

As long as we focus only on the trajectory of the generalized flow (Eq. 45), the difference between the quadratic and non-quadratic functions does not matter because both induce the same dynamics. However, the Bregman divergence of the quadratic dissipation functions is not directly related to the rate function of the microscopic stochastic models, whereas that of the non-quadratic ones in Eq. 68 is [133]. Thus, if we consider projections of fluxes and forces in the edge spaces, different choices of dissipation functions lead to different projections. In addition, for non-quadratic dissipation functions, the contributions of the kinetic parameters $\varvec{k}^{\pm }$ can be clearly separated into the force part $\varvec{K}$ and the activity part $\varvec{\kappa }$ in the case of CRN with the LMA kinetics (Eq. 70). This separation enables a physical realization of the projected flux as we derive in the following section.
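The claim that the quadratic and non-quadratic dissipation functions induce the same flux while differing as functions can be illustrated as follows. This is a sketch under the assumptions $f_{e}=\ln (j^{+}_{e}/j^{-}_{e})$ and $\omega _{e}=2\sqrt{j^{+}_{e}j^{-}_{e}}$; the sample fluxes are arbitrary and the variable names are ours.

```python
import numpy as np

jp = np.array([2.0, 0.5, 1.2])   # illustrative forward fluxes j^+
jm = np.array([0.7, 1.5, 1.1])   # illustrative backward fluxes j^-
f = np.log(jp / jm)              # force on each edge

# Quadratic choice: diagonal metric given by the logarithmic mean (Eq. 83).
# Note: the logarithmic mean needs its limit value when jp == jm (not handled here).
log_mean = (jp - jm) / (np.log(jp) - np.log(jm))
flux_quadratic = log_mean * f

# Non-quadratic choice: flux = omega * sinh(f / 2), the derivative of omega * psi_star.
omega = 2.0 * np.sqrt(jp * jm)
flux_cosh = omega * np.sinh(f / 2.0)

# Same flux, but different values of the dissipation functions themselves.
psi_quadratic = 0.5 * np.sum(log_mean * f**2)
psi_cosh = np.sum(2.0 * omega * (np.cosh(f / 2.0) - 1.0))
```

Both constructions return the net flux $\varvec{j}^{+}-\varvec{j}^{-}$, whereas `psi_quadratic` and `psi_cosh` differ, which is why the associated Bregman divergences and projections differ as well.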

5.4 Explicit forms of thermodynamic and dissipation functions for FPE

For FPE, the dualistic representation of the density $p(\varvec{r})$ and its logarithm $y(\varvec{r})=\ln p(\varvec{r})$ is also relevant. This duality is induced formally by the following thermodynamic functions^{Footnote 66}:

$$\begin{aligned} \Phi [p]&= \int [\ln p(\varvec{r}) - \varvec{1}]p(\varvec{r})\textrm{d}\varvec{r},&\Phi ^{*}[y]&= \int e^{y(\varvec{r})}\textrm{d}\varvec{r}, \end{aligned}$$

(84)

the Legendre transformations of which are

$$\begin{aligned} y(\varvec{r})&=\frac{\delta \Phi [p]}{\delta p} = \ln p(\varvec{r}),&p(\varvec{r})&=\frac{\delta \Phi ^{*}[y]}{\delta y} = e^{y(\varvec{r})}. \end{aligned}$$

(85)

The Bregman divergence becomes the KL divergence $\mathcal {D}_{\mathcal {X}}[p\Vert p']=\int \textrm{d}\varvec{r}p(\varvec{r})\ln \frac{p(\varvec{r})}{p'(\varvec{r})}$. In physics, the flux and force for FPE are defined conventionally as

$$\begin{aligned} \varvec{j}_{\textrm{FP}}[p(\varvec{r})]&=D_{0}p(\varvec{r})\left\{ \varvec{F}(\varvec{r})/D_{0}-\nabla \ln p(\varvec{r}) \right\} , \end{aligned}$$

(86)

$$\begin{aligned} \varvec{f}_{\textrm{FP}}[p(\varvec{r})]&=D^{-1}_{0}\varvec{F}(\varvec{r})- \nabla \ln p(\varvec{r}) . \end{aligned}$$

(87)

The dissipation functions associated with the force-flux relation above are

$$\begin{aligned} \Psi ^{\textrm{FP},*}_{\varvec{\omega }[p]}[\varvec{f}]&= \frac{1}{2}\int \varvec{f}(p(\varvec{r}))^{T}M^{*}_{p(\varvec{r})}\varvec{f}(p(\varvec{r}))\textrm{d}\varvec{r},\nonumber \\ \Psi ^{\textrm{FP}}_{\varvec{\omega }[p]}[\varvec{j}]&= \frac{1}{2}\int \varvec{j}(p(\varvec{r}))^{T}M_{p(\varvec{r})} \varvec{j}(p(\varvec{r}))\textrm{d}\varvec{r}, \end{aligned}$$

(88)

where $M^{*}_{p(\varvec{r})}:=\textrm{diag}[ \varvec{\omega }[p(\varvec{r})] ] $, $M_{p(\varvec{r})}:=(M^{*}_{p(\varvec{r})})^{-1}$, and $\omega _{i}[p(\varvec{r})]=D_{0} p(\varvec{r})$. Thus, the dissipation functions are formally quadratic and positive definite. If $\varvec{F}(\varvec{r})$ is a gradient of $U(\varvec{r})$ as $\varvec{F}(\varvec{r})=D_{0}\nabla U(\varvec{r})$, $\varvec{f}_{\textrm{FP}}[p(\varvec{r})]=- \nabla \ln \frac{p(\varvec{r})}{\tilde{p}(\varvec{r})}$ holds where $\tilde{p}(\varvec{r}):=\exp [U(\varvec{r})]$. Then, the dissipation functions, the bilinear pairing $\langle \varvec{j}_{\textrm{FP}},\varvec{f}_{\textrm{FP}}\rangle $, the EPR $\dot{\Sigma }_{\textrm{FP}}$ in Eq. 81, and the pEPR $\dot{\Sigma }^{p}_{\textrm{FP}}$ in Eq. 82 formally consolidate into the same quantity:

$$\begin{aligned}&2\Psi ^{\textrm{FP},*}_{\varvec{\omega }[p]}[\varvec{f}_{\textrm{FP}}] =2\Psi ^{\textrm{FP}}_{\varvec{\omega }[p]}[\varvec{j}_{\textrm{FP}}] =\langle \varvec{j}_{\textrm{FP}},\varvec{f}_{\textrm{FP}}\rangle =\dot{\Sigma }_{\textrm{FP}}=\dot{\Sigma }^{p}_{\textrm{FP}}\nonumber \\&= D_{0}\int p(\varvec{r})\left( \nabla _{\varvec{r}} \ln \frac{p(\varvec{r})}{\tilde{p}(\varvec{r})}\right) ^{2}\textrm{d}\varvec{r}. \end{aligned}$$

(89)

The last quantity without $D_{0}$ is known as the relative Fisher information [66, 143] and as the Hyvärinen divergence [120, 144] between p and $\tilde{p}$. For $U(\varvec{r})=0$, it reduces to the Fisher information number $\mathbb {I}_{F}[p]$ in Eq. 1. This consolidation is a source of confusion because the same quantity for FPE or linear diffusion processes has different names in different contexts and disciplines. However, these quantities actually have different definitions, roles, and meanings, which become explicit in the information-geometric formulation.
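The consolidation in Eq. 89 can be checked on a grid. The following is a rough numerical sketch under assumptions of our own: a one-dimensional domain, a standard normal $p$, and a quadratic $U$ so that $\tilde{p}=\exp [U]$ is an unnormalized Gaussian with variance `sigma2`; the values of `D0`, `sigma2`, and the grid are illustrative. In this case the relative Fisher information has the closed form $(1/\sigma ^{2}-1)^{2}$.

```python
import numpy as np

# Illustrative parameters (assumptions, not taken from the paper).
D0, sigma2 = 0.7, 2.0
r = np.linspace(-12.0, 12.0, 24001)
dr = r[1] - r[0]
p = np.exp(-r**2 / 2.0) / np.sqrt(2.0 * np.pi)   # standard normal density
U = -r**2 / (2.0 * sigma2)                        # p_tilde = exp(U)

# Force and flux of Eqs. 87 and 86 with F = D0 * grad U.
f_fp = np.gradient(U, dr) - np.gradient(np.log(p), dr)
j_fp = D0 * p * f_fp

# The quantities of Eq. 89, evaluated by a simple Riemann sum.
two_psi_star = np.sum(D0 * p * f_fp**2) * dr     # 2 * Psi^{FP,*}[f_FP]
two_psi = np.sum(j_fp**2 / (D0 * p)) * dr        # 2 * Psi^{FP}[j_FP]
pairing = np.sum(j_fp * f_fp) * dr               # <j_FP, f_FP>
closed_form = D0 * (1.0 / sigma2 - 1.0) ** 2     # D0 * relative Fisher information
```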

6 Orthogonal subspaces, dual foliations, and Pythagorean relation

To investigate the behaviors and properties of the equilibrium (Eq. 58) and nonequilibrium (Eq. 61) flows, especially their topological and algebraic constraints from the graph or hypergraph structure, information geometry provides the ideal tools. In particular, the four affine subspaces associated with the cycle and cocycle subspaces of the chain and cochain complexes (Fig. 5) form dual foliations via the Legendre transformation, whose geometric properties are captured by information geometry [1, 145]. It should be noted that the results of this section do not assume the specific forms of the thermodynamic and dissipation functions introduced in Sect. 5.

6.1 Four affine subspaces

Two families of orthogonally complementary affine subspaces are naturally introduced on $\mathcal {X}$ and $\mathcal {Y}$, respectively, from the topological structure of the graph or hypergraph, i.e., $\mathbb {B}$ and $\mathbb {S}$.

Definition 26

(Stoichiometric subspaces in $\mathcal {X}$) The stoichiometric subspaces are defined as^{Footnote 67}

$$\begin{aligned} \mathcal {P}^{sc}(\varvec{x}_{0})&:=\{\varvec{x}\in \mathcal {X}| \varvec{x}-\varvec{x}_{0} \in \textrm{Im}\mathbb {S}\}, \quad \varvec{x}_{0}\in \mathcal {X} \end{aligned}$$

(90)

where $\varvec{x}_{0}$ is a parameter to specify the position of the subspace (Fig. 5, lower left)^{Footnote 68}.

Definition 27

(Equilibrium subspaces in $\mathcal {Y}$) The equilibrium subspaces (Fig. 5, upper left) are defined as

$$\begin{aligned} \mathcal {P}^{eq}(\tilde{\varvec{y}})&:=\left\{ \varvec{y}\in \mathcal {Y}| \varvec{y}-\tilde{\varvec{y}} \in \textrm{Ker}\mathbb {S}^{T}\right\} , \quad \tilde{\varvec{y}}\in \mathcal {Y}. \end{aligned}$$

(91)

$\mathcal {P}^{sc}(\varvec{x}_{0})$ and $\mathcal {P}^{eq}(\tilde{\varvec{y}})$ are of orthogonal complement to each other: $\langle \varvec{x}-\varvec{x}_{0},\varvec{y}'-\tilde{\varvec{y}}\rangle =0$ for $\varvec{x}\in \mathcal {P}^{sc}(\varvec{x}_{0})$ and $\varvec{y}' \in \mathcal {P}^{eq}(\tilde{\varvec{y}})$.^{Footnote 69} Because $\mathbb {S}$ and $\mathbb {S}^{T}$ are the discrete differentials, $\delta _{1}$ and $\delta ^{0}$, $\mathcal {P}^{sc}(\varvec{x}_{0})$ and $\mathcal {P}^{eq}(\tilde{\varvec{y}})$ are associated with the 0-cycle and 0-cocycle spaces, respectively.

Two other families of orthogonal-complement subspaces are introduced on $\mathcal {J}_{\varvec{x}}$ and $\mathcal {F}_{\varvec{x}}$.

Definition 28

(Iso-velocity subspaces in $\mathcal {J}_{\varvec{x}}$) The iso-velocity subspaces (Fig. 5, lower right) are defined as

$$\begin{aligned} \mathcal {P}^{vl}(\hat{\varvec{j}})=\left\{ \varvec{j}\in \mathcal {J}_{\varvec{x}}|\varvec{j}-\hat{\varvec{j}} \in \textrm{Ker}\mathbb {S}\right\} , \qquad \hat{\varvec{j}}\in \mathcal {J}_{\varvec{x}}. \end{aligned}$$

(92)

Definition 29

(Iso-force subspaces in $\mathcal {F}_{\varvec{x}}$) The iso-external-force subspaces, or iso-force subspaces for short (Fig. 5, upper right), are defined as

$$\begin{aligned} \mathcal {P}^{fr}(\varvec{f}'):=\left\{ \varvec{f}\in \mathcal {F}_{\varvec{x}}|\varvec{f}-\varvec{f}'\in \textrm{Im}\mathbb {S}^{T} \right\} , \qquad \varvec{f}'\in \mathcal {F}_{\varvec{x}}. \end{aligned}$$

(93)

Again, from the correspondence of $\delta _{1}=\mathbb {S}$ and $\delta ^{0}=\mathbb {S}^{T}$, $\mathcal {P}^{vl}(\hat{\varvec{j}})$ and $\mathcal {P}^{fr}(\varvec{f}')$ are associated with the 1-cycle and 1-cocycle spaces, respectively. We specifically call $\mathcal {P}^{vl}(\varvec{0})$ and $\mathcal {P}^{fr}(\varvec{0})$ zero-velocity subspace and equilibrium force subspace, respectively.
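The direction spaces of the four affine subspaces can be obtained from a single singular value decomposition of $\mathbb {S}$. The sketch below is illustrative only: the helper name `four_subspaces` and the three-vertex cycle used as `S` are our own choices. It returns orthonormal bases of $\textrm{Im}\mathbb {S}$, $\textrm{Ker}\mathbb {S}^{T}$, $\textrm{Im}\mathbb {S}^{T}$, and $\textrm{Ker}\mathbb {S}$, and checks the two orthogonality relations.

```python
import numpy as np

def four_subspaces(S, tol=1e-10):
    """Orthonormal bases of Im S, Ker S^T, Im S^T, and Ker S from one SVD."""
    left, sing, right_t = np.linalg.svd(S)   # full_matrices=True by default
    rank = int(np.sum(sing > tol))
    return {
        "Im S": left[:, :rank],        # directions of P^sc (vertex space)
        "Ker S^T": left[:, rank:],     # directions of P^eq (vertex space)
        "Im S^T": right_t[:rank].T,    # directions of P^fr (edge space)
        "Ker S": right_t[rank:].T,     # directions of P^vl (edge space)
    }

# Hypothetical example: a cycle on three vertices, one column per edge.
S = np.array([[-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
bases = four_subspaces(S)
vertex_gram = bases["Im S"].T @ bases["Ker S^T"]   # should be zero
edge_gram = bases["Ker S"].T @ bases["Im S^T"]     # should be zero
```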

6.2 Meaning of the subspaces

All four subspaces are natural constituents of algebraic graph theory and homological algebra. Here, we provide their meaning in terms of the dynamics on graphs and hypergraphs.

The stoichiometric and iso-velocity subspaces, $\mathcal {P}^{sc}(\varvec{x}_{0})$ and $\mathcal {P}^{vl}(\hat{\varvec{j}})$, are related by the continuity equation (Eq. 10). From the continuity equation, $\mathcal {P}^{vl}(\hat{\varvec{j}})$ is the set of fluxes that induce the same velocity as a reference $\hat{\varvec{j}}$ does: $\varvec{j}\in \mathcal {P}^{vl}(\hat{\varvec{j}}) \Longleftrightarrow \dot{\varvec{x}}=-\mathbb {S}\hat{\varvec{j}}=-\mathbb {S}\varvec{j} $. Thereby, $\mathcal {P}^{vl}(\hat{\varvec{j}})$ is parametrized as follows:

$$\begin{aligned} \mathcal {P}^{vl}(\dot{\varvec{x}})&= \{\varvec{j}\in \mathcal {J}_{\varvec{x}}| - \mathbb {S}\varvec{j} =\dot{\varvec{x}}\},\quad \dot{\varvec{x}}\in \textrm{Im}[\mathbb {S}]=\textrm{Ker}[\mathbb {U}]. \end{aligned}$$

(94)

This subspace is crucial to characterize fluxes that can realize the same dynamics as the reference one.

The stoichiometric subspace $\mathcal {P}^{sc}(\varvec{x}_{0})$ determines the subspace in which the dynamics are algebraically constrained via the topology of the underlying graph or hypergraph. Because $\dot{\varvec{x}}=-\mathbb {S}\varvec{j}(\varvec{x}(t))$, for an initial state $\varvec{x}(0)=\varvec{x}_{0}$, $\varvec{x}(t)-\varvec{x}_{0} \in \textrm{Im}[\mathbb {S}]$ should hold, meaning that $\varvec{x}(t)\in \mathcal {P}^{sc}(\varvec{x}_{0})$. Thus, $\mathcal {P}^{sc}(\varvec{x}_{0})$ is the subspace in which the dynamics are restricted by the initial condition $\varvec{x}_{0}$. $\mathcal {P}^{sc}(\varvec{x}_{0})$ can also be represented parametrically by the quantities which are conserved by the dynamics. For any vector $\varvec{u} \in \textrm{Ker}\mathbb {S}^{T}$, $\eta (t):=\varvec{u}^{T}\varvec{x}(t)$ is constant over time:

$$\begin{aligned} \dot{\eta }(t)=\frac{\textrm{d}\varvec{u}^{T}\varvec{x}(t)}{\textrm{d}t}=\varvec{u}^{T}\frac{\textrm{d}\varvec{x}(t)}{\textrm{d}t}=-\varvec{u}^{T}\mathbb {S}\varvec{j}(\varvec{x})=0. \end{aligned}$$

(95)

In Sect. 3.1, we defined a matrix $\mathbb {U}$ by a complete basis of $\textrm{Ker}\mathbb {S}^{T}$ so that $\textrm{Im}\mathbb {U}^{T}=\textrm{Ker}\mathbb {S}^{T}$. Using $\mathbb {U}$, the conserved quantities for a given initial condition $\varvec{x}_{0}$ are obtained as $\varvec{\eta }=\mathbb {U}\varvec{x}_{0}=\mathbb {U}\varvec{x}(t)$. Because $\textrm{Im}\mathbb {U}$ is isomorphic to $C_{-1}(\mathbb {H})$, the stoichiometric subspace is explicitly parametrized by the conserved quantities (an element of $C_{-1}(\mathbb {H})$):

$$\begin{aligned} \mathcal {P}^{sc}(\varvec{\eta })&= \{\varvec{x}\in \mathcal {X}| \mathbb {U}\varvec{x} =\varvec{\eta }\},\quad \varvec{\eta }\in C_{-1}(\mathbb {H}). \end{aligned}$$

(96)

For rMJP, the conserved quantity is reduced to the conservation of probability $\varvec{1}^{T}\varvec{p}(t)=1$ and $\mathcal {P}^{sc}(\varvec{p}_{0})$ becomes the probability simplex. Because $\textrm{Ker}\mathbb {B}^{T}$ determines the connected components of the graph $\mathbb {G}$ and we conventionally assume that the underlying graph is connected in rMJP, we only have the one-dimensional cokernel space and one conserved quantity, which is $\eta =1$. Thus, the conservation of probability or, equivalently, the restriction of $\varvec{p}$ in the probability simplex is automatically guaranteed from the topological constraint of the dynamics if we start from an initial state satisfying $\varvec{1}^{T}\varvec{p}_{0}=1$.
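The conservation law $\varvec{\eta }=\mathbb {U}\varvec{x}(t)$ can be observed in a direct simulation. The following is a minimal sketch for a hypothetical reaction $A+B\rightleftharpoons C$ with mass-action fluxes; the rate constants, the sign convention of `S` (chosen so that $\dot{\varvec{x}}=-\mathbb {S}\varvec{j}$ consumes reactants when the net flux is positive), and the explicit Euler scheme are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical CRN: A + B <-> C, species ordered as (A, B, C).
S = np.array([[1.0], [1.0], [-1.0]])
# Rows of U span Ker S^T: the totals A + C and B + C are conserved.
U = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

def net_flux(x, k_plus=1.0, k_minus=0.5):
    """Mass-action net flux j = j^+ - j^- (illustrative rate constants)."""
    return np.array([k_plus * x[0] * x[1] - k_minus * x[2]])

x = np.array([1.0, 2.0, 0.5])
eta_initial = U @ x
dt = 1e-3
for _ in range(5000):
    x = x - dt * (S @ net_flux(x))   # explicit Euler step of dx/dt = -S j(x)
eta_final = U @ x
```

Because $\mathbb {U}\mathbb {S}=0$, every Euler step leaves $\mathbb {U}\varvec{x}$ unchanged up to rounding, so the trajectory stays in $\mathcal {P}^{sc}(\varvec{x}_{0})$ regardless of the flux function used.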

The iso-force subspace $\mathcal {P}^{fr}(\varvec{f}')$ and the equilibrium subspace $\mathcal {P}^{eq}(\tilde{\varvec{y}})$ are related to the equilibrium and nonequilibrium force equations, Eq. 57 and Eq. 60. The equilibrium force defined in Eq. 57 satisfies $\varvec{f}(\varvec{x})\in \textrm{Im}\mathbb {S}^{T}=\mathcal {P}^{fr}(\varvec{0})$. Thus, the equilibrium-force subspace $\mathcal {P}^{fr}(\varvec{0})$ is literally the set of equilibrium forces. $\mathcal {P}^{fr}(\varvec{f}')$ is its shift by $\varvec{f}'\in \mathcal {F}_{\varvec{x}}$. Using $\mathbb {V}$ defined in Sect. 3.1, we can represent $\mathcal {P}^{fr}$ parametrically as

$$\begin{aligned} \mathcal {P}^{fr}(\varvec{\zeta })&= \{\varvec{f}\in \mathcal {F}_{\varvec{x}}| \mathbb {V}^{T}\varvec{f} =\varvec{\zeta }\},\quad \varvec{\zeta }\in C^{2}(\mathbb {H}) \end{aligned}$$

(97)

because $\mathcal {F}_{\varvec{x}}/\textrm{Im}\mathbb {S}^{T}\cong \mathcal {F}_{\varvec{x}}/\textrm{Ker}\mathbb {V}^{T}\cong \textrm{Im}\mathbb {V}^{T}\cong C^{2}(\mathbb {H})$. Thus, $\varvec{\zeta }$ characterizes the type of nonequilibrium forces modulo the equilibrium forces.
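The statement that $\varvec{\zeta }=\mathbb {V}^{T}\varvec{f}$ only sees the force modulo equilibrium forces can be illustrated on a single cycle. In this sketch the matrix `S`, the cycle basis `V` (a basis of $\textrm{Ker}\mathbb {S}$, as we assume $\mathbb {V}$ is defined in Sect. 3.1), and the sample vectors are hypothetical.

```python
import numpy as np

# Hypothetical cycle on three vertices; the single column of V spans Ker S.
S = np.array([[-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
V = np.array([[1.0], [1.0], [1.0]])

f = np.array([0.4, -0.1, 0.9])    # an arbitrary force on the edges
y = np.array([0.2, -1.3, 0.7])    # an arbitrary potential on the vertices

zeta = V.T @ f                    # cycle parameter of the force
zeta_shifted = V.T @ (f + S.T @ y)   # same force plus an equilibrium force
```

Adding any element of $\textrm{Im}\mathbb {S}^{T}$ leaves `zeta` unchanged, so `zeta` labels the leaf $\mathcal {P}^{fr}(\varvec{\zeta })$ rather than an individual force.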

Finally, the equilibrium subspace $\mathcal {P}^{eq}(\tilde{\varvec{y}})$ can also be regarded as the set of potentials $\varvec{y}$ that generate the same equilibrium force because any $\varvec{y}\in \mathcal {P}^{eq}(\tilde{\varvec{y}})$ satisfies $\varvec{f}'=\mathbb {S}^{T}\varvec{y}=\mathbb {S}^{T}\tilde{\varvec{y}} \in \mathcal {P}^{fr}(\varvec{0})$. Due to this, the equilibrium subspace $\mathcal {P}^{eq}$ is parameterized as

$$\begin{aligned} \mathcal {P}^{eq}(\varvec{f}')&= \{\varvec{y}\in \mathcal {Y}| \mathbb {S}^{T}\varvec{y} =\varvec{f}'\},\quad \varvec{f}'\in \textrm{Im}[\mathbb {S}^{T}]. \end{aligned}$$

(98)

The parametric forms of the subspaces are summarized as follows:

$$\begin{aligned} \mathcal {P}^{vl}(\dot{\varvec{x}})&= \{\varvec{j}\in \mathcal {J}_{\varvec{x}}| - \mathbb {S}\varvec{j} =\dot{\varvec{x}}\},&\dot{\varvec{x}}&\in \textrm{Im}[\mathbb {S}]=\textrm{Ker}[\mathbb {U}], \end{aligned}$$

(99)

$$\begin{aligned} \mathcal {P}^{sc}(\varvec{\eta })&= \{\varvec{x}\in \mathcal {X}| \mathbb {U}\varvec{x} =\varvec{\eta }\},&\varvec{\eta }&\in \textrm{Im}[\mathbb {U}] =C_{-1}(\mathbb {H}), \end{aligned}$$

(100)

$$\begin{aligned} \mathcal {P}^{fr}(\varvec{\zeta })&= \{\varvec{f}\in \mathcal {F}_{\varvec{x}}| \mathbb {V}^{T}\varvec{f} =\varvec{\zeta }\},&\varvec{\zeta }&\in \textrm{Im}[\mathbb {V}^{T}]=C^{2}(\mathbb {H}), \end{aligned}$$

(101)

$$\begin{aligned} \mathcal {P}^{eq}(\varvec{f}')&= \{\varvec{y}\in \mathcal {Y}| \mathbb {S}^{T}\varvec{y} =\varvec{f}'\},&\varvec{f}'&\in \textrm{Im}[\mathbb {S}^{T}]=\textrm{Ker}[\mathbb {V}^{T}]. \end{aligned}$$

(102)

From these subspaces, we can obtain dual foliations on the vertex and edge spaces.

6.3 Dual manifolds, dual foliations, and Pythagorean relation in vertex spaces

For the subspaces $\mathcal {P}^{sc}$ and $\mathcal {P}^{eq}$ in the density and potential spaces, we introduce their Legendre transformation via the thermodynamic functions, $\Phi (\varvec{x})$ and $\Phi ^{*}(\varvec{y})$, which form the dual foliation with the subspaces of orthogonal complement (Fig. 6, left).

Definition 30

(Stoichiometric manifold in $\mathcal {Y}$ and equilibrium manifold in $\mathcal {X}$) The stoichiometric and equilibrium manifolds (Fig. 6, left) are defined respectively as

$$\begin{aligned} \mathcal {M}^{sc}(\varvec{y}_{0})&:=\partial \Phi [\mathcal {P}^{sc}(\varvec{x}_{0})]\subset \mathcal {Y}, \quad \varvec{y}_{0}=\partial \Phi (\varvec{x}_{0}), \end{aligned}$$

(103)

$$\begin{aligned} \mathcal {M}^{eq}(\tilde{\varvec{x}})&:=\partial \Phi ^{*}[\mathcal {P}^{eq}(\tilde{\varvec{y}})] \subset \mathcal {X}, \quad \tilde{\varvec{x}}=\partial \Phi ^{*}(\tilde{\varvec{y}}). \end{aligned}$$

(104)

Lemma 1

(Dual foliations in density and potential spaces [48]) $\mathcal {P}^{sc}$ and $\mathcal {M}^{eq}$ are foliations of $\mathcal {X}$, and $\mathcal {M}^{sc}$ and $\mathcal {P}^{eq}$ are foliations of $\mathcal {Y}$. For each pair $(\varvec{x}_{0},\tilde{\varvec{x}})$, the intersection of $\mathcal {P}^{sc}(\varvec{x}_{0})$ and $\mathcal {M}^{eq}(\tilde{\varvec{x}})$ is unique and transversal. The same applies to $\mathcal {M}^{sc}(\varvec{y}_{0})$ and $\mathcal {P}^{eq}(\tilde{\varvec{y}})$. Then, $(\mathcal {P}^{sc}, \mathcal {M}^{eq})$ and $(\mathcal {M}^{sc}, \mathcal {P}^{eq})$ form dual foliations (nonlinear coordinate systems) in $\mathcal {X}$ and $\mathcal {Y}$ spaces, respectively.

Proof

The polyhedron $\mathcal {P}^{sc}(\varvec{x}_{0})$ and the affine subspace $\mathcal {P}^{eq}(\tilde{\varvec{y}})$ can cover the whole $\mathcal {X}$ and $\mathcal {Y}$ by changing $\varvec{x}_{0}$ and $\tilde{\varvec{y}}$, respectively. Similarly, $\mathcal {M}^{eq}(\tilde{\varvec{x}})$ and $\mathcal {M}^{sc}(\varvec{y}_{0})$ can cover the whole $\mathcal {X}$ and $\mathcal {Y}$ because Legendre transformations by the thermodynamic functions are one-to-one between $\mathcal {X}$ and $\mathcal {Y}$. Consider the intersection of $\mathcal {P}^{sc}(\varvec{x}_{0})$ and $\mathcal {M}^{eq}(\tilde{\varvec{x}})$ in $\mathcal {X}$ space. The condition that $\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}}) \ne \emptyset $ is related to the existence of $\varvec{x}^{\dagger }$ defined by the following convex optimization problem:

$$\begin{aligned} \varvec{x}^{\dagger }:=\arg \min _{\varvec{x}\in \overline{\mathcal {P}^{sc}(\varvec{x}_{0})}} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]. \end{aligned}$$

(105)

Because of the properties of $\Phi (\varvec{x})$, $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]$ and its restriction to $\overline{\mathcal {P}^{sc}(\varvec{x}_{0})}$ are strictly convex with respect to $\varvec{x}$. Thus, $\varvec{x}^{\dagger }$ is unique and either satisfies the stationarity condition $\varvec{x}^{\dagger } \in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$ if $\varvec{x}^{\dagger }\in \mathcal {P}^{sc}(\varvec{x}_{0})$ or lies on the boundary $\partial \mathcal {X}$ if $\varvec{x}^{\dagger }\not \in \mathcal {P}^{sc}(\varvec{x}_{0})$, where we used $\mathbb {S}^{T}\frac{\partial \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \tilde{\varvec{x}}]}{\partial \varvec{x}} =\varvec{0} \Leftrightarrow \mathbb {S}^{T}(\varvec{y}-\tilde{\varvec{y}}) =\varvec{0} \Leftrightarrow \varvec{y}\in \mathcal {P}^{eq}(\tilde{\varvec{y}})\Leftrightarrow \varvec{x}\in \mathcal {M}^{eq}(\tilde{\varvec{x}})$. Let $\varvec{x}_{bd}\in \partial \mathcal {X}$ and $\varvec{x}_{in}\in \mathcal {X}$ be arbitrary points on the boundary and in the interior of $\mathcal {X}$, respectively. From the condition Eq. 25 of the thermodynamic function, for $\varvec{x}_{\lambda }:=\lambda \varvec{x}_{in} + (1-\lambda )\varvec{x}_{bd}$ where $\lambda \in [0,1]$,

$$\begin{aligned} \lim _{\lambda \rightarrow +0}\frac{\textrm{d}\mathcal {D}^{\mathcal {X}}[\varvec{x}_{\lambda }\Vert \tilde{\varvec{x}}]}{\textrm{d}\lambda }=\lim _{\lambda \rightarrow +0}\left[ \frac{\textrm{d}\Phi (\varvec{x}_{\lambda })}{\textrm{d}\lambda } - \left\langle \tilde{\varvec{y}}, \frac{\textrm{d}\varvec{x}_{\lambda }}{\textrm{d}\lambda } \right\rangle \right] = -\infty . \end{aligned}$$

(106)

Thus, $\varvec{x}^{\dagger }\not \in \mathcal {X}$ is excluded, and the intersection exists, i.e., $\varvec{x}^{\dagger }\in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$. The intersection $\varvec{x}^{\dagger }$ is unique and transversal because $\langle \varvec{x}_{sc}-\varvec{x}^{\dagger }, \varvec{y}_{eq}-\varvec{y}^{\dagger }\rangle =0$ holds for any $\varvec{x}_{sc}\in \mathcal {P}^{sc}(\varvec{x}_{0})$ and $\varvec{x}_{eq} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})$ and the dimensions of $\mathcal {P}^{sc}(\varvec{x}_{0})$ and $\mathcal {M}^{eq}(\tilde{\varvec{x}})$ are complementary because $\mathcal {P}^{sc}(\varvec{x}_{0})$ and $\mathcal {P}^{eq}(\tilde{\varvec{y}})$ are of orthogonal complement (see also the proof in [83]). As a result, $\varvec{x}^{\dagger } \in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$ always exists, and $(\mathcal {P}^{sc}, \mathcal {M}^{eq})$ forms a dual foliation in $\mathcal {X}$. Also $(\mathcal {M}^{sc}, \mathcal {P}^{eq})$ does in $\mathcal {Y}$ because they are bijective Legendre duals of $(\mathcal {P}^{sc}, \mathcal {M}^{eq})$. $\square $

This result reduces to Birch’s theorem [48, 100] and the seminal result by Horn and Jackson [41] when the thermodynamic function is the generalized KL divergence.
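The unique intersection point $\varvec{x}^{\dagger }$ can be computed numerically in the KL case. The sketch below assumes the thermodynamic function $\Phi (\varvec{x})=\sum _{i}x_{i}(\ln x_{i}-1)$, so that $\mathcal {M}^{eq}(\tilde{\varvec{x}})=\{\tilde{\varvec{x}}\circ \exp [\mathbb {U}^{T}\varvec{c}]\}$, and solves $\mathbb {U}\varvec{x}=\mathbb {U}\varvec{x}_{0}$ for the coefficients $\varvec{c}$ by a damped Newton iteration. The function name `birch_point`, the example network $A+B\rightleftharpoons C$, and the numerical values are our own illustrative choices, not the paper's algorithm.

```python
import numpy as np

def birch_point(U, x_tilde, x0, iters=100, tol=1e-13):
    """Intersection of P^sc(x0) and M^eq(x_tilde) for the KL-type Phi (sketch)."""
    eta = U @ x0                                     # conserved quantities to match
    point = lambda c: x_tilde * np.exp(U.T @ c)      # point on M^eq(x_tilde)
    coef = np.zeros(U.shape[0])
    for _ in range(iters):
        x = point(coef)
        res = U @ x - eta                            # violation of U x = eta
        if np.linalg.norm(res) < tol:
            break
        jac = U @ (x[:, None] * U.T)                 # Jacobian U diag(x) U^T
        step = np.linalg.solve(jac, res)
        t = 1.0
        # Halve the step until the residual norm decreases.
        while t > 1e-10 and np.linalg.norm(U @ point(coef - t * step) - eta) >= np.linalg.norm(res):
            t *= 0.5
        coef = coef - t * step
    return point(coef)

# Hypothetical example: A + B <-> C with conserved totals A + C and B + C.
U = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
S = np.array([[1.0], [1.0], [-1.0]])
x0 = np.array([1.0, 2.0, 0.5])
x_tilde = np.array([0.5, 0.5, 1.0])
x_dagger = birch_point(U, x_tilde, x0)
```

For these numbers the intersection can also be found by hand as $(0.25, 1.25, 1.25)$, which the iteration reproduces.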

With the dual foliation, we can consider the generalized Pythagorean relations and orthogonal decomposition. For any three points satisfying $\varvec{x} \in \mathcal {P}^{sc}(\varvec{x}_{0})$, $\varvec{x}_{q} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})$, and $\varvec{x}^{\dagger } = \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$^{Footnote 70}, we have the generalized Pythagorean relation:

$$\begin{aligned} \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}_{q}]=\mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}^{\dagger }] + \mathcal {D}^{\mathcal {X}}[\varvec{x}^{\dagger }\Vert \varvec{x}_{q}]. \end{aligned}$$

(107)

In $\mathcal {Y}$ space, we also have the dual version of the relations as

$$\begin{aligned} \mathcal {D}^{\mathcal {Y}}[\varvec{y}_{q}\Vert \varvec{y}]=\mathcal {D}^{\mathcal {Y}}[\varvec{y}_{q}\Vert \varvec{y}^{\dagger }] + \mathcal {D}^{\mathcal {Y}}[\varvec{y}^{\dagger }\Vert \varvec{y}]. \end{aligned}$$

(108)

These relations are used to characterize the steady state of equilibrium and nonequilibrium flow geometrically and also variationally.
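The generalized Pythagorean relation in Eq. 107 can be verified in the simplest setting. This sketch assumes a connected graph, so that the only conserved quantity is the total mass, together with the generalized KL divergence as the Bregman divergence; then $\mathcal {M}^{eq}(\tilde{\varvec{x}})$ consists of the positive multiples of $\tilde{\varvec{x}}$ and $\varvec{x}^{\dagger }$ has a closed form. All numerical values are illustrative.

```python
import numpy as np

def gen_kl(x, x_ref):
    """Generalized KL divergence: the Bregman divergence of sum x (ln x - 1)."""
    return float(np.sum(x * np.log(x / x_ref) - x + x_ref))

x_tilde = np.array([0.2, 0.5, 0.3])
x0 = np.array([0.6, 0.1, 0.3])

x = np.array([0.1, 0.3, 0.6])        # same total mass as x0, hence in P^sc(x0)
x_q = 1.7 * x_tilde                  # a multiple of x_tilde, hence in M^eq(x_tilde)
x_dagger = x_tilde * x0.sum() / x_tilde.sum()   # intersection of the two leaves

lhs = gen_kl(x, x_q)
rhs = gen_kl(x, x_dagger) + gen_kl(x_dagger, x_q)
```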

Remark 9

(Interpretation in terms of statistical inference) The meaning of the equilibrium manifold in statistics can be clarified more explicitly by considering the specific form of thermodynamic function (Eq. 65). For this thermodynamic function, the equilibrium manifold $\mathcal {M}^{eq}(\tilde{\varvec{p}})$ is represented as

$$\begin{aligned} \mathcal {M}^{eq}(\tilde{\varvec{p}})&=\left\{ \varvec{p}\in \mathcal {X}| \ln \varvec{p}-\ln \tilde{\varvec{p}} \in \textrm{Ker}\mathbb {S}^{T}\right\} \nonumber \\ {}&=\left\{ \varvec{p}\in \mathcal {X}| \varvec{p}=\tilde{\varvec{p}}\circ \exp \left[ \mathbb {U}^{T}\varvec{\eta }^{*}\right] , \varvec{\eta }^{*}\in C^{-1}(\mathbb {H}) \right\} \end{aligned}$$

(109)

where we use the fact $\mathbb {S}^{T}\mathbb {U}^{T}=0$. Thus, $\mathcal {M}^{eq}(\tilde{\varvec{p}})$ is an exponential family with algebraic constraints via $\mathbb {U}^{T}$. In contrast, $\mathcal {P}^{sc}(\varvec{\eta })$ can be regarded as the data manifold, which constrains $\varvec{p}$ by $\varvec{\eta }=\mathbb {U}\varvec{p}$, because $\mathbb {U}\varvec{p}$ can be interpreted as expectation of observables $\{\varvec{u}_{\ell }\}_{\ell \in [1,N_{\mathbb {l}}]}$. Thus, the intersection $\varvec{p}^{\dagger }=\mathcal {P}^{sc}(\varvec{\eta })\cap \mathcal {M}^{eq}(\tilde{\varvec{p}})$ is the maximum likelihood estimator. The exponential family with linear algebraic constraints as in Eq. 109 appears in algebraic statistics where $\mathbb {U}$ is sometimes called the design matrix [48].^{Footnote 71}

6.4 Dual manifolds, dual foliations in edge spaces and information-geometric extension of Helmholtz-Hodge-Kodaira decomposition

For the edge spaces, we similarly introduce the iso-velocity and iso-force manifolds, which are the duals of $\mathcal {P}^{vl}(\hat{\varvec{j}})$ and $\mathcal {P}^{fr}(\varvec{f}')$, respectively, via the Legendre transformations by the dissipation functions, $\Psi _{\varvec{x}}(\varvec{j})$ and $\Psi ^{*}_{\varvec{x}}(\varvec{f})$ (Fig. 6, right):

Definition 31

(Iso-velocity manifold in $\mathcal {F}_{\varvec{x}}$ and iso-force manifold in $\mathcal {J}_{\varvec{x}}$) The iso-velocity and iso-force manifolds (Fig. 6, right) are defined as follows:

$$\begin{aligned} \mathcal {M}^{vl}_{\varvec{x}}(\hat{\varvec{f}})&:=\partial \Psi _{\varvec{x}}[\mathcal {P}^{vl}(\hat{\varvec{j}})]\subset \mathcal {F}_{\varvec{x}},&\hat{\varvec{f}}&=\partial \Psi _{\varvec{x}}(\hat{\varvec{j}}), \end{aligned}$$

(110)

$$\begin{aligned} \mathcal {M}^{fr}_{\varvec{x}}(\varvec{j}')&:=\partial \Psi ^{*}_{\varvec{x}}[\mathcal {P}^{fr}(\varvec{f}')] \subset \mathcal {J}_{\varvec{x}},&\varvec{j}'&=\partial \Psi ^{*}_{\varvec{x}}(\varvec{f}'). \end{aligned}$$

(111)

It should be noted that $\mathcal {M}^{vl}_{\varvec{x}}(\hat{\varvec{f}})$ and $\mathcal {M}^{fr}_{\varvec{x}}(\varvec{j}')$ are dependent on $\varvec{x}$ via the $\varvec{x}$ dependence of the dissipation functions. We obtain the intersections in $\mathcal {J}_{\varvec{x}}$ and $\mathcal {F}_{\varvec{x}}$:

$$\begin{aligned} \varvec{j}^{\dagger }&:=\mathcal {P}^{vl}(\hat{\varvec{j}}) \cap \mathcal {M}^{fr}_{\varvec{x}}(\varvec{j}'),&\varvec{f}^{\dagger }&:=\mathcal {M}^{vl}_{\varvec{x}}(\hat{\varvec{f}}) \cap \mathcal {P}^{fr}(\varvec{f}'), \end{aligned}$$

(112)

which are also unique and transversal for each $\varvec{x}\in \mathcal {X}$ because $\mathcal {J}_{\varvec{x}}$ and $\mathcal {F}_{\varvec{x}}$ are whole vector spaces and the Legendre transformations are one-to-one. Thus, similarly to the case of vertex space, we have the dual foliation:

Lemma 2

(Dual foliations in edge spaces [83]) For each $\varvec{x}\in \mathcal {X}$, $(\mathcal {P}^{vl}, \mathcal {M}^{fr}_{\varvec{x}})$ and $(\mathcal {M}^{vl}_{\varvec{x}}, \mathcal {P}^{fr})$ form dual foliations in $\mathcal {J}_{\varvec{x}}$ and $\mathcal {F}_{\varvec{x}}$ spaces, respectively.

For $\hat{\varvec{j}}$ and $\varvec{f}'$, and their intersections $\varvec{j}^{\dagger }$ and $\varvec{f}^{\dagger }$ defined in Eq. 112, $\langle \hat{\varvec{j}}-\varvec{j}^{\dagger },\varvec{f}^{\dagger }-\varvec{f}' \rangle =0$ holds. Thus, we have the generalized Pythagorean relations:

$$\begin{aligned} \mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\hat{\varvec{j}}\Vert \varvec{j}']&=\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\hat{\varvec{j}}\Vert \varvec{j}^{\dagger }] +\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}^{\dagger }\Vert \varvec{j}'], \nonumber \\ \mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}'\Vert \hat{\varvec{f}}]&=\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}'\Vert \varvec{f}^{\dagger }] +\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}^{\dagger }\Vert \hat{\varvec{f}}]. \end{aligned}$$

(113)

In contrast to the thermodynamic functions $(\Phi , \Phi ^{*})$, the dissipation functions have symmetry, which makes the origins $\varvec{0}$ in $\mathcal {J}_{\varvec{x}}$ and $\mathcal {F}_{\varvec{x}}$ special and leads to an extension of Helmholtz-Hodge-Kodaira decomposition.

Theorem 1

(Information-geometric extension of Helmholtz-Hodge-Kodaira (HHK) decomposition [83]) For a given flux-force Legendre pair $(\varvec{j},\varvec{f})\in (\mathcal {J}_{\varvec{x}},\mathcal {F}_{\varvec{x}})$, we have their unique $\varvec{x}$-dependent decompositions:

$$\begin{aligned} \varvec{j}&=\varvec{j}_{eq}(\varvec{x})+(\varvec{j}-\varvec{j}_{eq}(\varvec{x})),&\varvec{f}&=\varvec{f}_{st}(\varvec{x})+(\varvec{f}-\varvec{f}_{st}(\varvec{x})), \end{aligned}$$

(114)

such that $\varvec{f}_{eq}(\varvec{x}) \in \mathcal {P}^{fr}(\varvec{0})$, $\varvec{j}-\varvec{j}_{eq}(\varvec{x}) \in \mathcal {P}^{vl}(\varvec{0})$, $\varvec{f}-\varvec{f}_{st}(\varvec{x}) \in \mathcal {P}^{fr}(\varvec{0})$, and $\varvec{j}_{st}(\varvec{x}) \in \mathcal {P}^{vl}(\varvec{0})$ hold, where $\varvec{f}_{eq}(\varvec{x})$ and $\varvec{j}_{st}(\varvec{x})$ denote the Legendre duals of $\varvec{j}_{eq}(\varvec{x})$ and $\varvec{f}_{st}(\varvec{x})$, respectively. In addition, $\varvec{j}_{eq}(\varvec{x})$ and $\varvec{f}_{st}(\varvec{x})$ are characterized geometrically as

$$\begin{aligned} \varvec{j}_{eq}(\varvec{x})&:=\mathcal {P}^{vl}(\varvec{j})\cap \mathcal {M}^{fr}_{\varvec{x}}(\varvec{0}),&\varvec{f}_{st}(\varvec{x})&:=\mathcal {M}^{vl}_{\varvec{x}}(\varvec{0})\cap \mathcal {P}^{fr}(\varvec{f}). \end{aligned}$$

(115)

Furthermore, $\varvec{j}_{eq}$ and $\varvec{f}_{st}$ are also characterized variationally as the minimizers of dissipation functions:

$$\begin{aligned} \varvec{j}_{eq}(\varvec{x})&= \arg \min _{\varvec{j}' \in \mathcal {P}^{vl}(\varvec{j})} \Psi _{\varvec{x}}(\varvec{j}'),&\varvec{f}_{st}(\varvec{x})&= \arg \min _{\varvec{f}'' \in \mathcal {P}^{fr}(\varvec{f})} \Psi _{\varvec{x}}^{*}(\varvec{f}''). \end{aligned}$$

(116)

Proof

The uniqueness of $\varvec{j}_{eq}(\varvec{x})$ and $\varvec{f}_{st}(\varvec{x})$ as intersections in Eq. 115 follows immediately from the property of the dual foliations. Because, for any $\varvec{j}' \in \mathcal {P}^{vl}(\varvec{j})$ and $\varvec{f}'' \in \mathcal {P}^{fr}(\varvec{f})$, $\langle \varvec{j}'-\varvec{j}_{eq},\varvec{f}_{eq} \rangle =0$ and $\langle \varvec{j}_{st}, \varvec{f}''-\varvec{f}_{st} \rangle =0$ hold, the generalized Pythagorean relations lead to

$$\begin{aligned} \begin{aligned} \mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}'\Vert \varvec{0}]&=\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}'\Vert \varvec{j}_{eq}] +\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}_{eq}\Vert \varvec{0}],&\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}''\Vert \varvec{0}]&=\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}''\Vert \varvec{f}_{st}] +\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}_{st}\Vert \varvec{0}]. \end{aligned} \end{aligned}$$

Because $\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}'\Vert \varvec{0}]=\Psi _{\varvec{x}}(\varvec{j}')$ and $\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}''\Vert \varvec{0}]=\Psi ^{*}_{\varvec{x}}(\varvec{f}'')$ hold, the relations are reduced to

$$\begin{aligned} \Psi _{\varvec{x}}(\varvec{j}')&=\mathcal {D}_{\varvec{x}}^{\mathcal {J}}[\varvec{j}'\Vert \varvec{j}_{eq}]+\Psi _{\varvec{x}}(\varvec{j}_{eq}),&\Psi ^{*}_{\varvec{x}}(\varvec{f}'')&=\mathcal {D}_{\varvec{x}}^{\mathcal {F}}[\varvec{f}''\Vert \varvec{f}_{st}]+\Psi ^{*}_{\varvec{x}}(\varvec{f}_{st}). \end{aligned}$$

(117)

Then Eq. 116 follows. $\square $

The decomposed flux $\varvec{j}_{eq}$ and force $\varvec{f}_{st}$ play a particularly important role in dynamics. By definition, $\varvec{j}_{eq}$ is the equilibrium flux, which induces the same instantaneous velocity $\dot{\varvec{x}}$ as $\varvec{j}$ does, i.e., $\dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}}\varvec{j}=-\textrm{div}_{\mathbb {S}}\varvec{j}_{eq}$. Thus, $\varvec{j}_{eq}$ is the equilibrium flux mimicking the instantaneous dynamics induced by the nonequilibrium flux $\varvec{j}$. This equilibrium flux is uniquely determined owing to the information-geometric orthogonality of $\mathcal {P}^{vl}(\varvec{j})$ and $\mathcal {M}^{fr}_{\varvec{x}}(\varvec{0})$. Moreover, the decomposition $\varvec{j}=\varvec{j}_{eq}+(\varvec{j}-\varvec{j}_{eq})$ can be regarded as an information-geometric extension of the Helmholtz-Hodge-Kodaira decomposition in vector calculus and differential forms, because $(\varvec{j}-\varvec{j}_{eq})$ is divergence free, i.e., $\textrm{div}_{\mathbb {S}}(\varvec{j}-\varvec{j}_{eq})=0$, and $\varvec{f}_{eq}$ is a curl-free equilibrium force, i.e., $\varvec{f}_{eq}\in \mathcal {P}^{fr}(\varvec{0})=\textrm{Im}[\mathbb {S}^{T}]=\textrm{Ker}[\mathbb {V}^{T}]$.

In contrast, by definition, $\varvec{j}_{st}\in \mathcal {P}^{vl}(\varvec{0})$ is the flux that makes the state $\varvec{x}$ a steady state, i.e., $\dot{\varvec{x}}=0$, and is induced by a force $\varvec{f}_{st}$ in the same quotient set $\mathcal {P}^{fr}(\varvec{f})$ as $\varvec{f}$. The decomposition $\varvec{f}=\varvec{f}_{st}+(\varvec{f}-\varvec{f}_{st})$ is also an HHK decomposition because $\varvec{j}_{st}\in \mathcal {P}^{vl}(\varvec{0})$ is divergence free, i.e., $\textrm{div}_{\mathbb {S}}\varvec{j}_{st}=0$, and $\varvec{f}-\varvec{f}_{st}$ is a curl-free equilibrium force, i.e., $\varvec{f}-\varvec{f}_{st}\in \mathcal {P}^{fr}(\varvec{0})=\textrm{Im}[\mathbb {S}^{T}]=\textrm{Ker}[\mathbb {V}^{T}]$.^{Footnote 72} These decompositions are used in the subsequent sections (Sect. 8 and Sect. 9).
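The decomposition of Theorem 1 can be illustrated numerically in the simplest setting. The following is a minimal sketch, not the general construction: it assumes a quadratic dissipation function $\Psi _{\varvec{x}}(\varvec{j})=\frac{1}{2}\sum _{e}j_{e}^{2}/w_{e}$ on a triangle graph, for which the cycle space is one-dimensional and the minimizer in Eq. 116 has a closed form. The edge weights `W`, the sign convention of `S`, and all function names are hypothetical choices made for this illustration.

```python
# Triangle graph: vertices 0, 1, 2 and edges e0: 0->1, e1: 1->2, e2: 2->0.
# ASSUMED sign convention: S[v][e] = +1 if edge e leaves vertex v and -1 if it
# enters v, so that xdot = -S j moves mass along the edge direction.
S = [[1, 0, -1],
     [-1, 1, 0],
     [0, -1, 1]]
CYCLE = [1.0, 1.0, 1.0]   # spans Ker S for the triangle
W = [1.0, 2.0, 4.0]       # hypothetical edge weights


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def div(j):
    """Discrete divergence S j."""
    return [dot(row, j) for row in S]


def psi(j):
    """Quadratic primal dissipation: 1/2 * sum_e j_e^2 / w_e."""
    return 0.5 * sum(je * je / we for je, we in zip(j, W))


def to_force(j):
    """Legendre map f = dPsi/dj (quadratic case: f_e = j_e / w_e)."""
    return [je / we for je, we in zip(j, W)]


def hhk_decompose(j):
    """Split j into an equilibrium part j_eq and a cyclic part j_st."""
    f = to_force(j)
    # Circulation of f around the cycle divided by the cycle 'resistance'.
    b = dot(CYCLE, f) / sum(ce * ce / we for ce, we in zip(CYCLE, W))
    j_st = [b * ce for ce in CYCLE]
    j_eq = [je - js for je, js in zip(j, j_st)]
    return j_eq, j_st


j = [3.0, 1.0, -2.0]
j_eq, j_st = hhk_decompose(j)
f_eq = to_force(j_eq)

print("div j    =", div(j))
print("div j_eq =", div(j_eq))                    # same velocity as j
print("circulation of f_eq =", dot(CYCLE, f_eq))  # close to zero: f_eq in Im S^T
print("Psi(j) vs Psi(j_eq) + Psi(j - j_eq):", psi(j), psi(j_eq) + psi(j_st))
```

In this quadratic special case the Bregman divergence $\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}\Vert \varvec{j}_{eq}]$ reduces to $\Psi _{\varvec{x}}(\varvec{j}-\varvec{j}_{eq})$, so the last line checks Eq. 117. The two decompositions of Eq. 114 also coincide here, $\varvec{j}-\varvec{j}_{eq}=\varvec{j}_{st}$, which need not hold for nonquadratic dissipation functions.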

7 Central affine manifold and Hilbert orthogonality

The dual foliation is an essential geometric object in information geometry. While less common than the dual foliation, the central affine manifold defined by a convex function also plays an integral role in information geometry [145].

Definition 32

(Central affine manifolds in $\mathcal {J}_{\varvec{x}}$ and $\mathcal {F}_{\varvec{x}}$) The central affine manifolds in $\mathcal {J}_{\varvec{x}}$ and $\mathcal {F}_{\varvec{x}}$ are defined as the level sets of $\Psi _{\varvec{x}}(\varvec{j})$ and $\Psi ^{*}_{\varvec{x}}(\varvec{f})$, respectively^{Footnote 73}:

$$\begin{aligned} \mathcal {C}_{\varvec{x}}^{\Psi }(c)&:=\{\varvec{j}|\Psi _{\varvec{x}}(\varvec{j})=c\} \subset \mathcal {J}_{\varvec{x}},&\mathcal {C}_{\varvec{x}}^{\Psi ^{*}}(c)&:=\{\varvec{f}|\Psi ^{*}_{\varvec{x}}(\varvec{f})=c\}\subset \mathcal {F}_{\varvec{x}}, \end{aligned}$$

(118)

where $c\in \mathbb {R}_{\ge 0}$. For a given $\varvec{j}'\in \mathcal {J}_{\varvec{x}}$ or $\varvec{f}'\in \mathcal {F}_{\varvec{x}}$, the manifolds are also denoted as

$$\begin{aligned} \mathcal {C}_{\varvec{x}}^{\Psi }(\varvec{j}')&:=\{\varvec{j}|\Psi _{\varvec{x}}(\varvec{j})=\Psi _{\varvec{x}}(\varvec{j}')\},&\mathcal {C}_{\varvec{x}}^{\Psi ^{*}}(\varvec{f}')&:=\{\varvec{f}|\Psi ^{*}_{\varvec{x}}(\varvec{f})=\Psi ^{*}_{\varvec{x}}(\varvec{f}')\}. \end{aligned}$$

(119)

Their Legendre transformations are also called (dual) central affine manifolds:

$$\begin{aligned} \mathcal {M}_{\varvec{x}}^{\Psi }(c)&:=\partial \Psi _{\varvec{x}}[\mathcal {C}_{\varvec{x}}^{\Psi }(c)]\subset \mathcal {F}_{\varvec{x}},&\mathcal {M}_{\varvec{x}}^{\Psi ^{*}}(c)&:=\partial \Psi ^{*}_{\varvec{x}}[\mathcal {C}_{\varvec{x}}^{\Psi ^{*}}(c)]\subset \mathcal {J}_{\varvec{x}}. \end{aligned}$$

(120)

7.1 Pseudo-Hilbert-isosceles orthogonality and decomposition

By employing the central affine manifold, we can introduce another type of generalized orthogonality:

Definition 33

(Pseudo-Hilbert-isosceles orthogonality [78, 79, 81]) Pseudo-Hilbert-isosceles orthogonality between $\varvec{j}_{S}, \varvec{j}_{A} \in \mathcal {J}_{\varvec{x}}$ and between $\varvec{f}_{S}, \varvec{f}_{A} \in \mathcal {F}_{\varvec{x}}$ are defined as follows:

$$\begin{aligned} \varvec{j}_{S}\perp _{H} \varvec{j}_{A}&\Longleftrightarrow \Psi _{\varvec{x}}(\varvec{j}_{S}+\varvec{j}_{A})=\Psi _{\varvec{x}}(\varvec{j}_{S} -\varvec{j}_{A}) \end{aligned}$$

(121)

$$\begin{aligned} \varvec{f}_{S}\perp _{H} \varvec{f}_{A}&\Longleftrightarrow \Psi _{\varvec{x}}^{*}(\varvec{f}_{S}+\varvec{f}_{A})=\Psi _{\varvec{x}}^{*}(\varvec{f}_{S} -\varvec{f}_{A}). \end{aligned}$$

(122)

This orthogonality is motivated by the relation $\Vert \varvec{j}_{S}+\varvec{j}_{A}\Vert ^{2}=\Vert \varvec{j}_{S}-\varvec{j}_{A}\Vert ^{2}$ satisfied by an orthogonal pair $\varvec{j}_{S}\perp \varvec{j}_{A}$ under the usual inner product structure and its induced norm $\Vert \cdot \Vert $.^{Footnote 74} By employing this orthogonality, we obtain pseudo-Hilbert-isosceles decompositions of $\varvec{j}$ and $\varvec{f}$ as follows:

Lemma 3

(Positive decompositions of the bilinear pairing via pseudo-Hilbert-isosceles orthogonality [78, 79, 81, 83]) For a given $\varvec{j}\in \mathcal {J}_{\varvec{x}}$ and any $\varvec{j}'$ on the same central affine manifold as $\varvec{j}$, i.e., $\varvec{j}' \in \mathcal {C}_{\varvec{x}}^{\Psi }(\varvec{j})$, we obtain the pseudo-Hilbert-isosceles orthogonal decomposition $\varvec{j}=\varvec{j}_{S}+\varvec{j}_{A}$:

$$\begin{aligned} \varvec{j}_{S}&:=\frac{1}{2}(\varvec{j}+\varvec{j}'),&\varvec{j}_{A}&:=\frac{1}{2}(\varvec{j}-\varvec{j}'), \end{aligned}$$

(123)

where $\varvec{j}_{S} \perp _{H}\varvec{j}_{A}$ and $\varvec{j}'=\varvec{j}_{S}-\varvec{j}_{A}$ hold. In addition, this decomposition induces a positive decomposition of the bilinear product $\langle \varvec{j},\varvec{f}\rangle =\langle \varvec{j}_{S},\varvec{f}\rangle +\langle \varvec{j}_{A},\varvec{f}\rangle $ where

$$\begin{aligned} \langle \varvec{j}_{S}, \varvec{f}\rangle&=\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j}'; -\varvec{f}]\ge 0,&\langle \varvec{j}_{A}, \varvec{f}\rangle&=\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j}'; \varvec{f}]\ge 0, \end{aligned}$$

(124)

hold. Similarly, for $\varvec{f} \in \mathcal {F}_{\varvec{x}}$ and $\varvec{f}''\in \mathcal {C}_{\varvec{x}}^{\Psi ^{*}}(\varvec{f})$, a positive orthogonal decomposition $\varvec{f}=\varvec{f}_{S}+\varvec{f}_{A}$ is obtained by $\varvec{f}_{S}:=\frac{1}{2}(\varvec{f}+\varvec{f}'')$ and $\varvec{f}_{A}:=\frac{1}{2}(\varvec{f}-\varvec{f}'')$, which satisfy the associated relations:

$$\begin{aligned} \langle \varvec{j}, \varvec{f}_{S}\rangle&=\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j}; -\varvec{f}'']\ge 0,&\langle \varvec{j}, \varvec{f}_{A}\rangle&=\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j}; \varvec{f}'']\ge 0. \end{aligned}$$

(125)

These decompositions were introduced in [78] for rMJP and extended to CRN in [79, 81], whereas we pointed out their information-geometric aspect in [83]. The decomposition plays a role in characterizing the gradient-flow-like properties of non-gradient flows.
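Lemma 3 can be checked numerically on a small example. The sketch below assumes a cosh-type dual dissipation function $\Psi ^{*}(\varvec{f})=\sum _{e}\omega (\cosh f_{e}-1)$ with a uniform, hypothetical weight $\omega =1$ on two edges; this particular normalization is an illustrative choice and is not taken from the text. Because every edge contributes the same term, swapping the two components of $\varvec{j}$ yields a flux $\varvec{j}'$ on the same central affine manifold. The force $\varvec{f}$ is taken to be the Legendre dual of $\varvec{j}$.

```python
import math

OMEGA = 1.0  # hypothetical uniform edge weight (an assumption of this sketch)


def pair(a, b):
    """Bilinear pairing <a, b>."""
    return sum(ai * bi for ai, bi in zip(a, b))


def psi_star(f):
    """Assumed dual dissipation: sum_e OMEGA * (cosh(f_e) - 1)."""
    return sum(OMEGA * (math.cosh(fe) - 1.0) for fe in f)


def psi(j):
    """Legendre conjugate of psi_star, computed in closed form."""
    total = 0.0
    for je in j:
        z = je / OMEGA
        total += je * math.asinh(z) - OMEGA * (math.sqrt(1.0 + z * z) - 1.0)
    return total


def to_force(j):
    """f = dPsi/dj, the inverse of j = OMEGA * sinh(f)."""
    return [math.asinh(je / OMEGA) for je in j]


def mixed_divergence(j, f):
    """Psi(j) + Psi*(f) - <j, f>, non-negative by the Fenchel-Young inequality."""
    return psi(j) + psi_star(f) - pair(j, f)


j = [2.0, 0.5]
j_prime = [0.5, 2.0]   # same level set: Psi is a sum of identical edge terms
f = to_force(j)

j_S = [0.5 * (a + b) for a, b in zip(j, j_prime)]
j_A = [0.5 * (a - b) for a, b in zip(j, j_prime)]
neg_f = [-fe for fe in f]

sym_part = pair(j_S, f)    # compare with 1/2 * D[j'; -f] in Eq. 124
asym_part = pair(j_A, f)   # compare with 1/2 * D[j'; f] in Eq. 124
print(sym_part, 0.5 * mixed_divergence(j_prime, neg_f))
print(asym_part, 0.5 * mixed_divergence(j_prime, f))
```

Both printed pairs should agree up to rounding, and both parts are non-negative, which is the positive decomposition of $\langle \varvec{j},\varvec{f}\rangle $ stated in Eq. 124.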

8 Information-geometric properties of equilibrium flow

In this section, we describe several properties of the equilibrium flow (Eq. 58) from the viewpoint of information geometry by employing the objects introduced in the previous sections. Such properties include the existence and uniqueness of the steady state (static property), convergence to the state (kinetic property), and the balance between information-geometric quantities associated with the steady state and convergence along the trajectory (the connection between static and kinetic properties). These properties are consistent with those that thermodynamic equilibrium systems should have. In addition, several results are extensions of the results obtained for FPE in the context of functional analysis, partial differential equations, and optimal transport.

8.1 Properties of equilibrium flow

The following property of the equilibrium state characterizes the static aspect of the equilibrium flow and is fundamentally ascribed to the dually flat structure of density and potential spaces^{Footnote 75}:

Proposition 6

(Equilibrium state and its geometric and variational characterizations) The steady state of the equilibrium flow $\varvec{x}_{t}$ (Eq. 58) starting from $\varvec{x}(0)=\varvec{x}_{0}$ is called the equilibrium state $\varvec{x}_{eq}$. For each $\varvec{x}_{0}$, the equilibrium state is identical to the intersection $\varvec{x}^{\dagger }=\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$, i.e., $\varvec{x}_{eq}=\varvec{x}^{\dagger }$, and thus uniquely exists for a given pair of the initial state $\varvec{x}_{0}$ and the parameter of equilibrium flow $\tilde{\varvec{x}}$. The equilibrium state $\varvec{x}_{eq}$ is also characterized variationally as

$$\begin{aligned} \varvec{x}_{eq}&= \arg \min _{\varvec{x}\in \mathcal {P}^{sc}(\varvec{x}_{0})}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}] =\arg \min _{\varvec{x}_{q}\in \mathcal {M}^{eq}(\tilde{\varvec{x}})}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{0}\Vert \varvec{x}_{q}]. \end{aligned}$$

(126)

Moreover, $\mathcal {M}^{eq}(\tilde{\varvec{x}})=\mathcal {M}^{\textrm{DB}}$ holds.

Proof

From Prop. 3, $\varvec{x}_{eq} \in \mathcal {M}^{\textrm{DB}}$, from which $\varvec{f}(\varvec{x}_{eq})=0$ follows. For the equilibrium force (Eq. 57), $ \mathcal {M}^{\textrm{DB}}=\{\varvec{x}|\varvec{f}(\varvec{x})=0\}=\{\varvec{x}|\mathbb {S}^{T}(\partial \Phi [\varvec{x}]-\partial \Phi [\tilde{\varvec{x}}])=0\}=\mathcal {M}^{eq}(\tilde{\varvec{x}})$ holds. Thus, $\varvec{x}_{eq} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})$. Because the initial state is $\varvec{x}_{0}$, $\varvec{x}_{eq}\in \mathcal {P}^{sc}(\varvec{x}_{0})$. Thus, $\varvec{x}_{eq} = \varvec{x}^{\dagger } \in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$. The first equality of Eq. 126 is obvious from the proof of the dual foliation (Lemma 1). The second equality is from the generalized Pythagorean relation (Eq. 107). $\square $

The second property of the equilibrium flow is kinetic in nature and characterizes the Bregman divergence as the generalized driving potential, which ensures the convergence of $\varvec{x}_{t}$ to the equilibrium state. This property is attributed to the dually flat structure on the edge spaces.

Proposition 7

(Bregman divergence and Gibbs’ H-Theorem) For the trajectory of the equilibrium flow $\varvec{x}_{t}$ (Eq. 58) starting from $\varvec{x}(0)=\varvec{x}_{0}$, the thermodynamic function $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]$ decreases, that is, $\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t< 0$ except at $\varvec{x}_{t} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})$ where $\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t=0$ holds. Thus, the equilibrium state $\varvec{x}_{eq} \in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$ is locally and asymptotically stable.

Proof

By replacing $\mathcal {F}(\varvec{x})$ in Prop. 3 with $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]$, we obtain $\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t=-\left[ \Psi ^{*}_{\varvec{x}_{t}}(\varvec{f}(\varvec{x}_{t}))+ \Psi _{\varvec{x}_{t}}(\varvec{j}(\varvec{x}_{t}))\right] \le 0$ and the equality holds if and only if $\varvec{f}(\varvec{x}_{t})=0 (\Leftrightarrow \varvec{x}_{t} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})$). $\square $

Because $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]$ can be identified with the difference of total entropy between $\varvec{x}$ and $\tilde{\varvec{x}}$ for thermodynamic systems such as CRN [85], $\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t\le 0$ corresponds to the nondecreasing property of thermodynamic entropy, which is also referred to as Gibbs’ H-theorem.^{Footnote 76}

The third property provides a connection between the thermodynamic function and the dissipation function, which is immediately obtained from De Giorgi’s formulation of the generalized gradient flow (Eq. 52):

Proposition 8

(Balancing of thermodynamic function and dissipation function) For the trajectory of the equilibrium flow $\varvec{x}_{t}$ (Eq. 58) starting from $\varvec{x}(0)=\varvec{x}_{0}$, the following relation holds for the thermodynamic function and the dissipation function:

$$\begin{aligned} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{0}\Vert \tilde{\varvec{x}}]-\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]= \int _{t'=0}^{t}\left[ \Psi ^{*}_{\varvec{x}_{t'}}(\varvec{f}(\varvec{x}_{t'}))+ \Psi _{\varvec{x}_{t'}}(\varvec{j}(\varvec{x}_{t'}))\right] \textrm{d}t' = \int _{t'=0}^{t}\dot{\Sigma }_{t'}\textrm{d}t', \end{aligned}$$

(127)

In physics and chemistry, this relation means that the difference in the thermodynamic (potential) function between $\varvec{x}_{t}$ and $\varvec{x}_{0}$ (the left-hand side), i.e., the change in total entropy, is equal to the integral of dissipation along $\varvec{x}_{t}$ (the right-hand side), i.e., the entropy production, for equilibrium systems.

All these results indicate that the equilibrium flow and its properties mathematically abstract the properties of physical equilibrium systems. The equilibrium state $\varvec{x}_{eq}$ is characterized geometrically by the unique intersection of $\mathcal {P}^{sc}(\varvec{x}_{0})$ and $\mathcal {M}^{eq}(\tilde{\varvec{x}})$ and also variationally by Eq. 126. The convergence to $\varvec{x}_{eq}$ is guaranteed by $\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t\le 0$. Furthermore, the entropy-dissipation balance relation (Eq. 127) itself defines the equilibrium system abstractly, as De Giorgi’s formulation (Eq. 52) does.
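Props. 6 to 8 can be observed together in a small simulation. The sketch below is an illustration under explicit assumptions rather than the general setting: a triangle graph, $\Phi (\varvec{x})=\sum _{i}(x_{i}\log x_{i}-x_{i})$, a quadratic dual dissipation function with constant hypothetical weights `W`, the force written as $\mathbb {S}^{T}(\partial \Phi [\varvec{x}]-\partial \Phi [\tilde{\varvec{x}}])$ under the stated sign convention for `S`, and explicit Euler time stepping. With Euler steps the balance relation (Eq. 127) holds only up to a discretization error of order `dt`.

```python
import math

# ASSUMED sign convention: S[v][e] = +1 if edge e leaves v, -1 if it enters v,
# so that xdot = -S j moves mass along the edge direction.
S = [[1, 0, -1],
     [-1, 1, 0],
     [0, -1, 1]]
W = [1.0, 1.0, 1.0]          # hypothetical edge weights
X_TILDE = [1.0, 2.0, 1.0]    # parameter of the equilibrium flow
X0 = [2.0, 0.5, 0.5]         # initial state


def bregman(x):
    """D_Phi[x || x_tilde] for Phi(x) = sum_i (x_i log x_i - x_i)."""
    return sum(xi * math.log(xi / ti) - xi + ti for xi, ti in zip(x, X_TILDE))


def force(x):
    """f = S^T (dPhi(x) - dPhi(x_tilde)): potential drop along each edge."""
    y = [math.log(xi / ti) for xi, ti in zip(x, X_TILDE)]
    return [sum(S[v][e] * y[v] for v in range(3)) for e in range(3)]


def simulate(dt=5e-4, steps=8000):
    x = list(X0)
    divergences = [bregman(x)]
    dissipated = 0.0
    for _ in range(steps):
        f = force(x)
        j = [we * fe for we, fe in zip(W, f)]   # j = dPsi*/df (quadratic case)
        # For a Legendre pair, Psi*(f) + Psi(j) = <j, f>.
        dissipated += dt * sum(je * fe for je, fe in zip(j, f))
        x = [xi - dt * sum(S[v][e] * j[e] for e in range(3))
             for v, xi in enumerate(x)]
        divergences.append(bregman(x))
    return x, divergences, dissipated


x_final, divergences, dissipated = simulate()

# On this connected graph, M^eq(x_tilde) is the ray {c * x_tilde} and
# P^sc(x0) fixes the total mass, so their intersection is explicit.
scale = sum(X0) / sum(X_TILDE)
x_eq = [scale * ti for ti in X_TILDE]

print("x_final:", x_final, "x_eq:", x_eq)
print("divergence drop:", divergences[0] - divergences[-1])
print("integrated dissipation:", dissipated)
```

The trajectory should approach the intersection point `x_eq`, the divergence should decrease along the way, and its total drop should match the integrated dissipation up to the Euler error. The divergence does not go to zero here because $\tilde{\varvec{x}}$ does not lie in $\mathcal {P}^{sc}(\varvec{x}_{0})$.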

8.2 Induced dually flat structure on tangent–cotangent spaces

The equilibrium state is characterized geometrically and variationally via the information-geometric structure on the vertex spaces ($\mathcal {X}$, $\mathcal {Y}$) as in Prop. 6. Similarly, the flux (kinetic law) of equilibrium systems (gradient systems) can be obtained variationally as the flux minimizing the dissipation function under the restriction of the continuity equation.

Lemma 4

(Equilibrium force as the minimizer of primal dissipation function) For a given trajectory $\{\varvec{x}_{t}\}$, we define the trajectory of the flux $\{\varvec{j}^{\dagger }_{t}\}$ minimizing the primal dissipation:

$$\begin{aligned} \{\varvec{j}^{\dagger }_{t}\}:=\arg \min _{\{\varvec{j}_{t}\}}\int _{0}^{t}\Psi _{\varvec{x}_{t'}}[\varvec{j}_{t'}]\textrm{d}t',\quad \text{ s.t. } \dot{\varvec{x}}_{t'}+\textrm{div}_{\mathbb {S}} \varvec{j}_{t'}=0\hbox { for all }t'\in [0,t]. \end{aligned}$$

(128)

Then, $\varvec{j}^{\dagger }_{t}$ is generated by the equilibrium force, $\varvec{f}_{t}^{\dagger }=\partial \Psi _{\varvec{x}}[\varvec{j}^{\dagger }_{t}] \in \mathcal {P}^{fr}(\varvec{0})$. Thus, the minimum primal dissipation flux that generates the given $\{\varvec{x}_{t}\}$ is the equilibrium flux.

Proof

Because the minimization in Eq. 128 can be carried out pointwise for each $t'\in [0,t]$ and $\dot{\varvec{x}}_{t'}+\textrm{div}_{\mathbb {S}} \varvec{j}_{t'}=0\Longleftrightarrow \varvec{j}_{t'}\in \mathcal {P}^{vl}(\dot{\varvec{x}}_{t'})$, we have

$$\begin{aligned} \varvec{j}_{t'}^{\dagger }=\varvec{j}^{\dagger }(\varvec{x}_{t'},\dot{\varvec{x}}_{t'})= \arg \min _{\varvec{j}\in \mathcal {P}^{vl}(\dot{\varvec{x}}_{t'}) }\Psi _{\varvec{x}_{t'}}[\varvec{j}]=\mathcal {P}^{vl}(\dot{\varvec{x}}_{t'}) \cap \mathcal {M}^{fr}_{\varvec{x}_{t'}}(\varvec{0}), \end{aligned}$$

(129)

where we used Eq. 115 and Eq. 116. Thus, from $\varvec{j}_{t'}^{\dagger } \in \mathcal {M}^{fr}_{\varvec{x}_{t'}}(\varvec{0}) \Longleftrightarrow \varvec{f}_{t'}^{\dagger } \in \mathcal {P}^{fr}(\varvec{0})$, the minimum dissipation flux $\{\varvec{j}^{\dagger }_{t}\}$ is generated by the equilibrium force, $\{\varvec{f}_{t}^{\dagger }\}=\{\varvec{f}^{\dagger }(\varvec{x}_{t'},\dot{\varvec{x}}_{t'})\}\in \mathcal {P}^{fr}(\varvec{0})$ where $\varvec{f}^{\dagger }(\varvec{x},\dot{\varvec{x}}):=\partial \Psi _{\varvec{x}}[\varvec{j}^{\dagger }(\varvec{x},\dot{\varvec{x}})]$. $\square $
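The pointwise minimization in Eq. 129 can be sketched for a nonquadratic dissipation function. The example below assumes a triangle graph, whose cycle space is spanned by a single vector, and a cosh-type dual dissipation function $\Psi ^{*}(\varvec{f})=\sum _{e}\omega _{e}(\cosh f_{e}-1)$ with hypothetical weights `OMEGA`; both are illustrative choices. Every flux generating a given velocity $\varvec{v}$ differs from a particular solution by a multiple of the cycle, so the minimization becomes one-dimensional and convex, and it is solved here by bisection on the circulation of the force.

```python
import math

# ASSUMED sign convention: S[v][e] = +1 if edge e leaves v, -1 if it enters v.
# Edges: e0: 0->1, e1: 1->2, e2: 2->0, so Ker S is spanned by (1, 1, 1).
S = [[1, 0, -1],
     [-1, 1, 0],
     [0, -1, 1]]
OMEGA = [1.0, 2.0, 0.5]   # hypothetical edge weights


def to_force(j):
    """f = dPsi/dj, the inverse of j_e = OMEGA_e * sinh(f_e)."""
    return [math.asinh(je / we) for je, we in zip(j, OMEGA)]


def psi(j):
    """Primal dissipation: Legendre conjugate of sum_e OMEGA_e (cosh f_e - 1)."""
    total = 0.0
    for je, we in zip(j, OMEGA):
        z = je / we
        total += je * math.asinh(z) - we * (math.sqrt(1.0 + z * z) - 1.0)
    return total


def velocity(j):
    """xdot = -S j."""
    return [-sum(s * je for s, je in zip(row, j)) for row in S]


def min_dissipation_flux(v):
    """Flux with velocity(j) = v whose force has zero circulation."""
    j0 = [-v[0], v[2], 0.0]           # particular solution (needs sum(v) = 0)
    lo, hi = -max(j0), -min(j0)       # circulation is <= 0 at lo, >= 0 at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(to_force([je + mid for je in j0])) < 0.0:
            lo = mid
        else:
            hi = mid
    shift = 0.5 * (lo + hi)
    return [je + shift for je in j0]


v = [-1.0, 0.25, 0.75]
j_dag = min_dissipation_flux(v)
f_dag = to_force(j_dag)
print("velocity(j_dag):", velocity(j_dag))
print("circulation of f_dag:", sum(f_dag))   # close to zero: f_dag in Im S^T
```

The vanishing circulation is the stationarity condition of the one-dimensional problem and expresses $\varvec{f}^{\dagger }\in \mathcal {P}^{fr}(\varvec{0})$; perturbing `j_dag` along the cycle should only increase `psi`.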

By exploiting this unique pairing between $\dot{\varvec{x}}_{t'}$ and $\varvec{j}_{t'}^{\dagger }$ or $\dot{\varvec{x}}_{t'}$ and $\varvec{f}_{t'}^{\dagger }$, we can obtain an induced dually flat structure on the restricted tangent and cotangent spaces of $\mathcal {X}$ and $\mathcal {Y}$ (Fig. 7), which can be regarded as an information-geometric extension of the Otto structure.

Theorem 2

(Induced dually flat structure on tangent and cotangent spaces) Let $\tilde{\mathcal {T}}_{\varvec{x}} \mathcal {X}:=\textrm{Im}\mathbb {S}\cong \mathcal {P}^{sc}(\varvec{0})\subset \mathcal {T}_{\varvec{x}}\mathcal {X}$ and $\tilde{\mathcal {T}}_{\varvec{x}}^{*} \mathcal {X}:=\mathcal {T}_{\varvec{x}}^{*}\mathcal {X}/\textrm{Ker}\mathbb {S}^{T}$ be tangent and cotangent spaces on $\mathcal {X}$ restricted by $\mathbb {S}$. On $\tilde{\mathcal {T}}_{\varvec{x}} \mathcal {X}$ and $\tilde{\mathcal {T}}_{\varvec{x}}^{*} \mathcal {X}$, we have the Legendre conjugate dissipation functions $\tilde{\Psi }_{\varvec{x}}: \tilde{\mathcal {T}}_{\varvec{x}} \mathcal {X}\rightarrow \mathbb {R}$ and $\tilde{\Psi }_{\varvec{x}}^{*}: \tilde{\mathcal {T}}_{\varvec{x}}^{*} \mathcal {X}\rightarrow \mathbb {R}$ induced by the dissipation functions on the edge spaces (Fig. 7).

Proof

By employing Eq. 129, for each $\varvec{v}\in \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}$, we can uniquely determine $\varvec{j}^{\dagger }(\varvec{x},\varvec{v})$, $\varvec{f}^{\dagger }(\varvec{x},\varvec{v})\in \mathcal {P}^{fr}(\varvec{0})$, and $\varvec{u}^{\dagger }(\varvec{x},\varvec{v}) \in \tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}$^{Footnote 77}. They satisfy

$$\begin{aligned} \varvec{v}&= - \mathbb {S}\varvec{j}^{\dagger }(\varvec{x},\varvec{v}), \quad \varvec{j}^{\dagger }(\varvec{x},\varvec{v}) = \partial \Psi ^{*}_{\varvec{x}}[\varvec{f}^{\dagger }(\varvec{x},\varvec{v})],\nonumber \\ \varvec{f}^{\dagger }(\varvec{x},\varvec{v})&= - \mathbb {S}^{T} \varvec{u}^{\dagger }(\varvec{x},\varvec{v}). \quad \end{aligned}$$

(130)

Conversely, for a given $\varvec{u}\in \tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}$, we have $\varvec{f}^{\ddagger }(\varvec{u})$, $\varvec{j}^{\ddagger }(\varvec{x},\varvec{u})$, and $\varvec{v}^{\ddagger }(\varvec{x},\varvec{u})$ as follows:

$$\begin{aligned} \varvec{v}^{\ddagger }(\varvec{x},\varvec{u}) = - \mathbb {S}\varvec{j}^{\ddagger }(\varvec{x},\varvec{u}),\quad \varvec{j}^{\ddagger }(\varvec{x},\varvec{u}) = \partial \Psi ^{*}_{\varvec{x}}[\varvec{f}^{\ddagger }(\varvec{u})], \quad \varvec{f}^{\ddagger }(\varvec{u}) = - \mathbb {S}^{T} \varvec{u}. \end{aligned}$$

(131)

Thus, for a pair $(\varvec{v},\varvec{u})_{\varvec{x}}$ satisfying $\varvec{u}=\varvec{u}^{\dagger }(\varvec{x},\varvec{v})$, we have $\varvec{v}=\varvec{v}^{\ddagger }(\varvec{x},\varvec{u})$, $\varvec{j}^{\dagger }(\varvec{x},\varvec{v})=\varvec{j}^{\ddagger }(\varvec{x},\varvec{u})$, and $\varvec{f}^{\dagger }(\varvec{x},\varvec{v})=\varvec{f}^{\ddagger }(\varvec{u})$. This pairing establishes a bijection between $\tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}$ and $\tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}$. Moreover, this bijection is realized by the Legendre transformations of the following induced dissipation functions on $\tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}$ and $\tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}$:

$$\begin{aligned} \tilde{\Psi }_{\varvec{x}}(\varvec{v})&:=\Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},\varvec{v})),&\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u})&:=\Psi _{\varvec{x}}^{*}(\varvec{f}^{\ddagger }(\varvec{u})). \end{aligned}$$

(132)

These functions are Legendre conjugate as follows:

$$\begin{aligned} \max _{\varvec{u}'\in \tilde{\mathcal {T}}^{*}_{\varvec{x}}\mathcal {X}}&\left[ \langle \varvec{v},\varvec{u}'\rangle -\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u}') \right] \\&=\max _{\begin{array}{c} \varvec{u}'\in \tilde{\mathcal {T}}^{*}_{\varvec{x}}\mathcal {X}\\ \varvec{j}\in \mathcal {P}^{vl}(\varvec{v}) \end{array}}\left[ \langle -\mathbb {S}\varvec{j},\varvec{u}'\rangle -\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u}') \right] =\max _{\begin{array}{c} \varvec{u}'\in \tilde{\mathcal {T}}^{*}_{\varvec{x}}\mathcal {X}\\ \varvec{j}\in \mathcal {P}^{vl}(\varvec{v}) \end{array}}\left[ \langle \varvec{j},-\mathbb {S}^{T}\varvec{u}'\rangle -\Psi _{\varvec{x}}^{*}(-\mathbb {S}^{T}\varvec{u}') \right] \\&=\max _{\begin{array}{c} \varvec{f}'\in \mathcal {P}^{fr}(\varvec{0})\\ \varvec{j}\in \mathcal {P}^{vl}(\varvec{v}) \end{array}}\left[ \langle \varvec{j},\varvec{f}'\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}') \right] \\&=\max _{\begin{array}{c} \varvec{f}'\in \mathcal {P}^{fr}(\varvec{0})\\ \varvec{j}\in \mathcal {P}^{vl}(\varvec{v}) \end{array}}\left[ \langle \varvec{j}^{\dagger }(\varvec{x},\varvec{v}),\varvec{f}'\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}') + \langle (\varvec{j}-\varvec{j}^{\dagger }(\varvec{x},\varvec{v})),\varvec{f}'\rangle \right] \\&=\max _{\begin{array}{c} \varvec{f}'\in \mathcal {P}^{fr}(\varvec{0}) \end{array}}\left[ \langle \varvec{j}^{\dagger }(\varvec{x},\varvec{v}),\varvec{f}'\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}') \right] =\Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},\varvec{v}))=\tilde{\Psi }_{\varvec{x}}(\varvec{v}), \end{aligned}$$

where we used $\langle \varvec{j}-\varvec{j}^{\dagger }(\varvec{x},\varvec{v}),\varvec{f}'\rangle =0$ because $\varvec{f}' \in \mathcal {P}^{fr}(\varvec{0})=\textrm{Im}\mathbb {S}^{T}$ and $(\varvec{j}-\varvec{j}^{\dagger }(\varvec{x},\varvec{v})) \in \textrm{Ker}\mathbb {S}$. The converse is shown similarly:

$$\begin{aligned} \max _{\varvec{v}'\in \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}}&\left[ \langle \varvec{v}',\varvec{u}\rangle -\tilde{\Psi }_{\varvec{x}}(\varvec{v}') \right] =\max _{\begin{array}{c} \varvec{v}'\in \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X} \end{array}}\left[ \langle -\mathbb {S}\varvec{j}^{\dagger }(\varvec{x},\varvec{v}'),\varvec{u}\rangle -\Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},\varvec{v}')) \right] \\&=\max _{\begin{array}{c} \varvec{j}^{\dagger }\in \mathcal {M}^{fr}_{\varvec{x}}(\varvec{0}) \end{array}}\left[ \langle \varvec{j}^{\dagger },-\mathbb {S}^{T}\varvec{u}\rangle -\Psi _{\varvec{x}}(\varvec{j}^{\dagger }) \right] =\max _{ \varvec{j}^{\dagger }\in \mathcal {M}^{fr}_{\varvec{x}}(\varvec{0})}\left[ \langle \varvec{j}^{\dagger },\varvec{f}^{\ddagger }(\varvec{u})\rangle -\Psi _{\varvec{x}}(\varvec{j}^{\dagger }) \right] \\&=\Psi _{\varvec{x}}^{*}(\varvec{f}^{\ddagger }(\varvec{u}))=\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u}), \end{aligned}$$

where we used the fact that $\{\varvec{j}^{\dagger }(\varvec{x},\varvec{v}')\}_{\varvec{v}'\in \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}}=\mathcal {M}^{fr}_{\varvec{x}}(\varvec{0})$ in the second line. The pair $(\varvec{v},\varvec{u})_{\varvec{x}}$ is Legendre dual with respect to these functions:

$$\begin{aligned} \partial _{\varvec{v}} \tilde{\Psi }_{\varvec{x}}(\varvec{v})&= \left[ \frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}\right] ^{T}\left. \frac{\partial \Psi _{\varvec{x}}(\varvec{j})}{\partial \varvec{j}}\right| _{\varvec{j}=\varvec{j}^{\dagger }(\varvec{x},\varvec{v})} =\left[ \frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}\right] ^{T}\varvec{f}^{\dagger }(\varvec{x},\varvec{v}) \end{aligned}$$

(133)

$$\begin{aligned}&=-\left[ \frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}\right] ^{T}\mathbb {S}^{T}\varvec{u}^{\dagger } (\varvec{x},\varvec{v})=\varvec{u}, \end{aligned}$$

(134)

$$\begin{aligned} \partial _{\varvec{u}} \tilde{\Psi }^{*}_{\varvec{x}}(\varvec{u})&= \left[ \frac{\partial \varvec{f}^{\ddagger }(\varvec{u})}{\partial \varvec{u}}\right] ^{T}\left. \frac{\partial \Psi ^{*}_{\varvec{x}}(\varvec{f})}{\partial \varvec{f}}\right| _{\varvec{f}=\varvec{f}^{\ddagger }(\varvec{u})}=\left[ \frac{\partial \varvec{f}^{\ddagger }(\varvec{u})}{\partial \varvec{u}}\right] ^{T}\varvec{j}^{\ddagger }(\varvec{x},\varvec{u}) \end{aligned}$$

(135)

$$\begin{aligned}&=-\mathbb {S}\varvec{j}^{\ddagger }(\varvec{x},\varvec{u})=\varvec{v}, \end{aligned}$$

(136)

where we used $\left[ \frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}\right] ^{T}\mathbb {S}^{T}=-I$ from $\frac{\partial }{\partial \varvec{v}}[\varvec{v}+\mathbb {S}\varvec{j}^{\dagger }(\varvec{x},\varvec{v})]=I+ \mathbb {S}\frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}=0$ and $\frac{\partial \varvec{f}^{\ddagger }(\varvec{u})}{\partial \varvec{u}} =-\mathbb {S}^{T}$. They are dissipation functions; strict convexity and 1-coercivity follow from those of the original dissipation functions. Also, we have

$$\begin{aligned}&\text{ Symmetry }:\quad \tilde{\Psi }_{\varvec{x}}(-\varvec{v}) = \Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},-\varvec{v}))=\Psi _{\varvec{x}} (-\varvec{j}^{\dagger }(\varvec{x},\varvec{v}))=\tilde{\Psi }_{\varvec{x}}(\varvec{v}) \end{aligned}$$

(137)

$$\begin{aligned}&\text{ Bounded } \text{ by } 0 \text{ at } \varvec{0}: \quad \tilde{\Psi }_{\varvec{x}}(\varvec{v}=\varvec{0}) =\Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},\varvec{0}))=\Psi _{\varvec{x}}(\varvec{0})=0. \end{aligned}$$

(138)

$\square $

Using the induced dissipation functions, we define the Bregman divergence on $(\tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}, \tilde{\mathcal {T}}^{*}_{\varvec{x}}\mathcal {X})$, which is associated with the Bregman divergence on $(\mathcal {J}_{\varvec{x}},\mathcal {F}_{\varvec{x}})$:

$$\begin{aligned} \mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\tilde{\Psi }_{\varvec{x}}}[\varvec{v}\Vert \varvec{u}']&:=\tilde{\Psi }_{\varvec{x}}(\varvec{v}) + \tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u}') -\langle \varvec{v}, \varvec{u}'\rangle \end{aligned}$$

(139)

$$\begin{aligned}&=\Psi _{\varvec{x}}(\varvec{j}^{\dagger }) + \Psi _{\varvec{x}}^{*}(\varvec{f}'^{\ddagger })-\langle \varvec{j}^{\dagger }, \varvec{f}'^{\ddagger }\rangle =\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}} [\varvec{j}^{\dagger }\Vert \varvec{f}'^{\ddagger }], \end{aligned}$$

(140)

where $\varvec{j}^{\dagger }=\varvec{j}^{\dagger }(\varvec{x},\varvec{v})$ and $\varvec{f}'^{\ddagger }=\varvec{f}^{\ddagger }(\varvec{u}')$. Therefore, we have the induced dually flat structure on $(\tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}, \tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X})$. This induced structure can be regarded as an extension to discrete manifolds of the Otto structure [65, 66], i.e., the formal Riemannian structure induced by the $L^{2}$-Wasserstein distance. This is also related to Pistone’s infinite-dimensional information geometry [72, 121].
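As a consistency check on Theorem 2, the induced functions can be written out in the quadratic special case. The choice $\Psi ^{*}_{\varvec{x}}(\varvec{f})=\frac{1}{2}\langle \varvec{f},W_{\varvec{x}}\varvec{f}\rangle $ with a positive diagonal matrix $W_{\varvec{x}}$ is an assumption made here for illustration, and $L_{\varvec{x}}$ and $L_{\varvec{x}}^{+}$ are symbols introduced only for this sketch. Under this assumption, Eqs. 131 and 132 give

$$\begin{aligned} \tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u})&=\Psi _{\varvec{x}}^{*}(-\mathbb {S}^{T}\varvec{u})=\frac{1}{2}\langle \varvec{u},L_{\varvec{x}}\varvec{u}\rangle ,&L_{\varvec{x}}&:=\mathbb {S}W_{\varvec{x}}\mathbb {S}^{T}, \\ \varvec{v}&=\partial _{\varvec{u}}\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u})=L_{\varvec{x}}\varvec{u},&\tilde{\Psi }_{\varvec{x}}(\varvec{v})&=\frac{1}{2}\langle \varvec{v},L_{\varvec{x}}^{+}\varvec{v}\rangle , \end{aligned}$$

where $L_{\varvec{x}}^{+}$ is the Moore-Penrose pseudoinverse and $\varvec{v}\in \textrm{Im}\mathbb {S}=\textrm{Im}L_{\varvec{x}}$. The map $\varvec{u}\mapsto L_{\varvec{x}}\varvec{u}$ is well defined on $\tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}$ because $\textrm{Ker}L_{\varvec{x}}=\textrm{Ker}\mathbb {S}^{T}$. Here $L_{\varvec{x}}$ has the form of a weighted graph Laplacian, so $\tilde{\Psi }_{\varvec{x}}$ is a discrete counterpart of the weighted $H^{-1}$-type quadratic form that underlies the Otto structure; for nonquadratic dissipation functions no such closed form is expected in general.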

8.3 Fisher information, natural gradient, mirror descent, evolutionary computation, and optimal transport

In information geometry, it is conventional to use the Fisher information matrices, i.e., the Hessian matrices $G_{\varvec{x}}$ and $G^{*}_{\varvec{y}}$ (Eq. 31) as the metric tensor (Fisher–Rao metric) on $(\mathcal {T}_{\varvec{x}}\mathcal {X}, \mathcal {T}_{\varvec{x}}^{*}\mathcal {X})$ or equivalently on $(\mathcal {T}_{\varvec{y}}\mathcal {Y}, \mathcal {T}_{\varvec{y}}^{*}\mathcal {Y})$ (Fig. 7). Gradient systems have been defined information-geometrically [25] as a Riemannian gradient flow using the Bregman divergence and the Fisher information matrix of $\Phi (\varvec{x})$ as the gradient function and the metric tensor, respectively: $\dot{\varvec{x}}=-G_{\varvec{x}}^{-1} \partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]$. Because both $G_{\varvec{x}}$ and $\mathcal {D}^{\mathcal {X}}_{\Phi }$ are derived from $\Phi (\varvec{x})$, this gradient flow becomes a geodesic in $\mathcal {Y}$ space: $\dot{\varvec{y}}=- (\varvec{y}-\tilde{\varvec{y}})$. In natural gradient descent [67, 69, 148], the Fisher information matrix is used to find the steepest descent gradient of a function $\mathcal {F}(\varvec{\theta })$ on a parameter space $\Theta $ as $\dot{\varvec{\theta }}=-G_{\varvec{\theta }}^{-1} \partial \mathcal {F}(\varvec{\theta })$, where $G_{\varvec{\theta }}$ is determined independently of $\mathcal {F}(\varvec{\theta })$ by considering the underlying model parameter space. In optimization, the natural gradient is fundamental in information-geometric optimization algorithms, which contain various evolutionary optimization schemes [70]. In relation to machine learning, the mirror descent is identified with the natural gradient descent by a naive continuous limit [69, 149]. Furthermore, optimal transport has recently been employed to replace or integrate the Fisher-Rao metric with the Wasserstein metric [150, 151]. 
Because the Wasserstein metric can take the information of the base manifold into account, their integration may provide more amenable ways to accommodate various prior and structural information.
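The statement that the information-geometric gradient flow becomes a straight relaxation in the dual coordinates can be checked numerically. The following is a minimal sketch, assuming the potential $\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})$, so that $\varvec{y}=\ln \varvec{x}$, $G_{\varvec{x}}^{-1}=\textrm{diag}(\varvec{x})$, and $\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]=\ln \varvec{x}-\ln \tilde{\varvec{x}}$. The function names, initial state, and reference point are illustrative choices, not taken from the paper.

```python
import math

def natural_gradient_step(x, x_ref, dt):
    # Assumed potential: Phi(x) = sum_i (x_i ln x_i - x_i), hence
    # G_x^{-1} = diag(x) and dD[x||x_ref]/dx = ln(x / x_ref).
    return [xi - dt * xi * math.log(xi / ri) for xi, ri in zip(x, x_ref)]

def integrate(x0, x_ref, t_end, n_steps):
    # Explicit Euler integration of dx/dt = -G_x^{-1} dD[x||x_ref]/dx.
    x, dt = list(x0), t_end / n_steps
    for _ in range(n_steps):
        x = natural_gradient_step(x, x_ref, dt)
    return x

x0 = [0.2, 1.5, 3.0]      # illustrative initial density
x_ref = [1.0, 0.5, 2.0]   # illustrative reference point (x tilde)
t_end = 1.0

x_num = integrate(x0, x_ref, t_end, 20000)
y_num = [math.log(v) for v in x_num]
# Closed-form solution of dy/dt = -(y - y_ref) in the dual coordinates y = ln x.
y_exact = [math.log(r) + math.exp(-t_end) * math.log(a / r)
           for a, r in zip(x0, x_ref)]
max_err = max(abs(a - b) for a, b in zip(y_num, y_exact))
print(max_err)
```

The numerically integrated flow in $\mathcal {X}$ agrees with the exponential relaxation $\varvec{y}_{t}=\tilde{\varvec{y}}+e^{-t}(\varvec{y}_{0}-\tilde{\varvec{y}})$ up to the discretization error of the Euler scheme.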

The doubly dual flat structure introduced in this work provides a way to generalize these results and to address the associated problems. The base space $\mathcal {X}$ with the dually flat structure and the associated Fisher information matrix accommodates the conventional natural gradient. The graph or hypergraph structure endows the base space $\mathcal {X}$ with an additional topological relation. The dissipation functions on the edge spaces, or their induced versions, offer a more flexible way than the Fisher–Rao metric to represent the loss of the potential function, i.e., the dissipation, at each point in the state space. When necessary, we may combine both of them (Fig. 7), for example, as $\dot{\varvec{x}}=-G_{\varvec{x}}^{-1} \partial \mathcal {F}^{(1)}(\varvec{x})-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}^{(2)}(\varvec{x})]$ where $\mathcal {F}^{(1)}(\varvec{x})$ and $\mathcal {F}^{(2)}(\varvec{x})$ could be different. This flexibility may contribute to the design of new algorithms for machine learning. In fact, this integrated representation is closely related to the filtering equations [152] in sequential inference, where the first term, i.e., $-G_{\varvec{x}}^{-1} \partial \mathcal {F}^{(1)}(\varvec{x})$, can usually be associated with the update of the posterior probability by observation, and the second term, $-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}^{(2)}(\varvec{x})]$, can represent the prediction by the prior information on the dynamics. Our framework may provide a unified perspective on various information-geometric analyses and extensions of filtering, e.g., projection filters [153], information-geometric nonlinear filtering [154], and information-geometric optimization [70]. 
Furthermore, the generalized gradient flow can be regarded as a continuous time limit of the mirror descent where the nonlinear Legendre duality between primal and dual spaces is preserved at the limit. This fact may be employed to design new gradient-based algorithms via the doubly dual flat structure.

9 Information-geometric properties of generalized nonequilibrium flow

In this section, we consider the nonequilibrium flow defined by Eq. 61, i.e.,

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}\left[ \left[ \textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\right] +\varvec{f}_{NE}\right] , \end{aligned}$$

(141)

with $\varvec{f}_{NE}\not \in \textrm{Im}\mathbb {S}^{T}$, and show how information geometry can be employed to analyze such dynamics. While we can obtain several properties of equilibrium flow independently of the detail of the thermodynamic function and the dissipation function, these functions should be related so as to obtain nice properties for the nonequilibrium flow. We will observe that the thermodynamic function and the dissipation function of LMA kinetics actually have such a relation.

9.1 Gradient-flow-like property and Lyapunov function of nonequilibrium flow

For the equilibrium flow (Eq. 58), the Bregman divergence $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]$ is a Lyapunov function. The Bregman divergence can still be a Lyapunov function even for the nonequilibrium flow (Eq. 141) under the following conditions:

Lemma 5

Suppose that, for all $\varvec{x}\in \mathcal {X}$, the force $\varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]+\varvec{f}_{NE}$ is orthogonally decomposed as $\varvec{f}(\varvec{x})=\varvec{f}_{S}(\varvec{x}) + \varvec{f}_{A}(\varvec{x})$ where $\varvec{f}_{S}(\varvec{x}):=\textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{CB}]$ and $\varvec{f}_{A}(\varvec{x})\in \mathcal {F}_{\varvec{x}}$ satisfy the pseudo-Hilbert-isosceles orthogonality $\varvec{f}_{S}(\varvec{x}) \perp _{H} \varvec{f}_{A}(\varvec{x})$. Then $\frac{\textrm{d}}{\textrm{d}t}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}]\le 0$ holds. In addition, $\varvec{x}_{CB}=\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})$ is the unique steady state of Eq. 141 with the initial state $\varvec{x}_{0}$ that attains $\frac{\textrm{d}}{\textrm{d}t}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}]=0$. Thus, $\varvec{x}_{CB}$ is locally and asymptotically stable.^{Footnote 78}

Proof

We can directly verify $\frac{\textrm{d}}{\textrm{d}t}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}]\le 0$ as follows:

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}]&=\langle \dot{\varvec{x}}, \partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}] \rangle =-\langle \textrm{div}_{\mathbb {S}}\varvec{j}(\varvec{x}), \partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}] \rangle \end{aligned}$$

(142)

$$\begin{aligned}&=-\langle \varvec{j}(\varvec{x}), \textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}] \rangle \nonumber \\&=-\langle \varvec{j}(\varvec{x}), \varvec{f}_{S}(\varvec{x}) \rangle =-\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j} (\varvec{x})\Vert -\varvec{f}''(\varvec{x})] \le 0 \end{aligned}$$

(143)

where we used Eq. 125 and $\varvec{f}''(\varvec{x}):=\varvec{f}_{S}(\varvec{x})-\varvec{f}_{A}(\varvec{x})$. The equality holds if and only if $\varvec{f}(\varvec{x})=-\varvec{f}''(\varvec{x})$, which means that

$$\begin{aligned} \varvec{f}(\varvec{x})&=-\varvec{f}''(\varvec{x}) \Longleftrightarrow \varvec{f}_{S}(\varvec{x})=0 \Longleftrightarrow \textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{CB}]=0\nonumber \\&\qquad \Longleftrightarrow \varvec{x} \in \mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB}). \end{aligned}$$

(144)

Because $\varvec{x}_{t}\in \mathcal {P}^{sc}(\varvec{x}_{0})$, $\varvec{x}_{CB}=\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})$ holds. $\square $

Thus, if the pseudo-Hilbert-isosceles orthogonal decomposition exists, then the nonequilibrium flow behaves like the equilibrium flow.

9.2 Complex-balanced state and pseudo-Hilbert-isosceles orthogonality

Identifying general conditions under which the orthogonal decomposition in Lemma 5 exists remains an open problem. However, for CRN with LMA kinetics, the decomposition exists whenever a complex-balanced steady state exists.

Proposition 9

(Complex-balanced steady state and orthogonal decomposition for CRN with LMA kinetics [78, 79, 81, 83]) Suppose that a complex balanced steady state exists, i.e., $\mathcal {M}^{\textrm{CB}}\ne \emptyset $ for CRN with LMA kinetics (Eq. 12). Using any $\tilde{\varvec{x}}_{CB} \in \mathcal {M}^{\textrm{CB}}$, consider a decomposition of the force $\varvec{f}_{MA}(\varvec{x})$ as $\varvec{f}_{MA}(\varvec{x})=\varvec{f}_{S}(\varvec{x})+\varvec{f}_{A}$ where $\varvec{f}_{S}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{CB}]$, $\varvec{f}_{A}=\ln \varvec{K}+\mathbb {S}^{T}\ln \tilde{\varvec{x}}_{CB}$, and $\Phi (\varvec{x})$ is as in Eq. 62. Then, for the dissipation functions in Eq. 68, the pseudo-Hilbert isosceles orthogonality $\varvec{f}_{S}(\varvec{x})\perp _{H} \varvec{f}_{A}$ holds for all $\varvec{x} \in \mathcal {X}$. In addition, $\mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})=\mathcal {M}^{\textrm{CB}}$ holds.

Proof

We can prove the orthogonality by direct computation. The orthogonality condition is

$$\begin{aligned} \Psi ^{*}_{\varvec{x}}[\varvec{f}_{S}(\varvec{x})+\varvec{f}_{A}]&=\Psi ^{*}_{\varvec{x}}[\varvec{f}_{S}(\varvec{x})-\varvec{f}_{A}] \end{aligned}$$

(145)

$$\begin{aligned}&\Leftrightarrow \left\langle \varvec{j}^{+}_{MA}(\varvec{x}),\left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{-\mathbb {S}^{T}} -\varvec{1}\right\rangle \nonumber \\&\quad +\left\langle \varvec{j}^{-}_{MA}(\varvec{x}),\! \left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\mathbb {S}^{T}} \!-\!\varvec{1}\right\rangle \!=\!0. \end{aligned}$$

(146)

Consider the following equality:

$$\begin{aligned} \left\langle \varvec{j}^{\pm }_{MA}(\varvec{x}),\left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\mp \mathbb {S}^{T}} -\varvec{1}\right\rangle&=\sum _{e=1}^{N_{\mathbb {e}}}k_{e}^{\pm }\tilde{\varvec{x}}_{CB}^{\varvec{\gamma }_{e}^{\pm }} \left[ \left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\varvec{\gamma }_{e}^{\mp }} -\left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\varvec{\gamma }_{e}^{\pm }}\right] , \end{aligned}$$

(147)

where . By using this, we have the following:

$$\begin{aligned} \text{ Eq. }\,146&=\sum _{e=1}^{N_{\mathbb {e}}}\left( k_{e}^{+}\tilde{\varvec{x}}_{CB}^{\varvec{\gamma }_{e}^{+}} -k_{e}^{-}\tilde{\varvec{x}}_{CB}^{\varvec{\gamma }_{e}^{-}}\right) \left[ \left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\varvec{\gamma }_{e}^{-}} -\left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\varvec{\gamma }_{e}^{+}}\right] \end{aligned}$$

(148)

(149)

(150)

Thus, Eq. 145 holds for any $\varvec{x}\in \mathcal {X}$ if $\mathbb {B}\varvec{j}_{MA}(\tilde{\varvec{x}}_{CB})=\varvec{0}$ holds.^{Footnote 79} The equality $\mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})=\mathcal {M}^{\textrm{CB}}$ can be proved by obtaining the parametric representation of $\mathcal {M}^{\textrm{CB}}$ as $\mathcal {M}^{\textrm{CB}}=\{\varvec{x}\in \mathcal {X}| \ln \varvec{x}-\ln \tilde{\varvec{x}}_{CB} \in \textrm{Ker}\mathbb {S}^{T} \}$ via solving $\varvec{j}_{MA}(\varvec{x})=\varvec{0}$.^{Footnote 80} This representation is identical to that of $\mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})$ (Eq. 104). $\square $
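The orthogonality in Proposition 9 can be illustrated on a small nonlinear network. The sketch below uses a hypothetical reversible triangle of complexes $2A \rightleftharpoons A+B \rightleftharpoons 2B \rightleftharpoons 2A$ whose rate constants are chosen so that a prescribed state is complex balanced but not detailed balanced. Two points are assumptions rather than statements of the paper: the sign convention $\mathbb {S}=\varvec{\gamma }^{+}-\varvec{\gamma }^{-}$ (consistent with Eq. 147) and the cosh-type form $\Psi ^{*}_{\varvec{x}}[\varvec{f}]=2\sum _{e}\omega _{e}[\cosh (f_{e}/2)-1]$ with $\omega _{e}=2\sqrt{j_{e}^{+}j_{e}^{-}}$ for the dissipation function of Eq. 68.

```python
import math

# Reaction e converts complex gamma_plus[e] into gamma_minus[e] (species A, B).
# The reversible triangle 2A <-> A+B <-> 2B <-> 2A is a hypothetical example.
gamma_plus = [(2, 0), (1, 1), (0, 2)]
gamma_minus = [(1, 1), (0, 2), (2, 0)]
x_cb = (0.5, 2.0)  # prescribed complex-balanced state (illustrative)

def monomial(x, gamma):
    """Return prod_i x_i ** gamma_i."""
    return math.prod(xi ** g for xi, g in zip(x, gamma))

# Rates chosen so that at x_cb every forward flux is 2 and every backward flux
# is 1: a net current 1 circulates around the triangle of complexes, so each
# complex is balanced although no reaction is detailed balanced.
k_plus = [2.0 / monomial(x_cb, g) for g in gamma_plus]
k_minus = [1.0 / monomial(x_cb, g) for g in gamma_minus]

def fluxes(x):
    jp = [k * monomial(x, g) for k, g in zip(k_plus, gamma_plus)]
    jm = [k * monomial(x, g) for k, g in zip(k_minus, gamma_minus)]
    return jp, jm

def eq146_lhs(x):
    """Left-hand side of Eq. 146, assuming S = gamma^+ - gamma^-."""
    r = [xi / ci for xi, ci in zip(x, x_cb)]
    jp, jm = fluxes(x)
    total = 0.0
    for gp, gm, a, b in zip(gamma_plus, gamma_minus, jp, jm):
        s = [p - m for p, m in zip(gp, gm)]
        total += a * (monomial(r, [-v for v in s]) - 1.0)
        total += b * (monomial(r, s) - 1.0)
    return total

def psi_star_gap(x):
    """Psi*[f_S + f_A] - Psi*[f_S - f_A] for the assumed cosh-type Psi*."""
    r = [xi / ci for xi, ci in zip(x, x_cb)]
    jp, jm = fluxes(x)
    gap = 0.0
    for gp, gm, kp, km, a, b in zip(gamma_plus, gamma_minus,
                                    k_plus, k_minus, jp, jm):
        s = [p - m for p, m in zip(gp, gm)]
        f_s = sum(v * math.log(ri) for v, ri in zip(s, r))
        f_a = math.log(kp / km) + sum(v * math.log(ci) for v, ci in zip(s, x_cb))
        omega = 2.0 * math.sqrt(a * b)
        gap += 2.0 * omega * (math.cosh((f_s + f_a) / 2)
                              - math.cosh((f_s - f_a) / 2))
    return gap

test_points = [(0.3, 1.7), (2.0, 0.4), (1.1, 1.1)]
residuals = [(eq146_lhs(x), psi_star_gap(x)) for x in test_points]
print(residuals)
```

Both residuals vanish up to rounding at every test point, because the net currents at the complex-balanced state telescope around the cycle of complexes.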

Remark 10

(Algebraic structure of detailed balanced and complex balanced manifolds) Here we comment on the underlying algebraic reason why $\mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})=\mathcal {M}^{\textrm{CB}}$ holds. First, we already showed that $\mathcal {M}^{\textrm{DB}}=\mathcal {M}^{eq}(\tilde{\varvec{x}})$ holds generally if $\mathcal {M}^{\textrm{DB}} \ne \emptyset $. Under LMA kinetics (Eq. 12), the DB condition $\varvec{j}_{MA}(\varvec{x})=\varvec{0}$ is nothing but a set of binomial equations because $\varvec{j}^{\pm }_{MA}(\varvec{x})$ are vectors of monomials of $\varvec{x}$. Owing to this, $\mathcal {M}^{\textrm{DB}}$ becomes a toric variety.^{Footnote 81} In contrast, the CB condition $\mathbb {B}\varvec{j}_{MA}(\varvec{x}_{CB})=\varvec{0}$ is a set of polynomial equations for LMA kinetics. Nonetheless, it was shown that $\mathcal {M}^{\textrm{CB}}$ is binomially generated and has the same structural matrix $\mathbb {S}^{T}$ as the equilibrium manifold [100]. Consequently, the two are equivalent as manifolds.

Because rLDG (Eq. 3) is a subclass of CRN where and thus $\mathbb {S}=\mathbb {B}$ holds, the complex-balanced condition is always satisfied for rLDG.

Corollary 1

(rLDG is unconditionally complex-balanced [8]) All the steady states of rLDG are complex-balanced states, i.e., $\mathcal {M}^{\textrm{ST}} = \mathcal {M}^{\textrm{CB}}$ independently of the parameter values $\varvec{k}^{\pm }$ of the flux (Eq. 3).^{Footnote 82} Thus, KL divergence (Eq. 64) always works as a Lyapunov function of rLDG.^{Footnote 83}

The properties described in this corollary are well known for rMJP and are usually obtained by using the Perron–Frobenius theorem for linear operators. The framework of the generalized flow enables us to extend them to the nonlinear regime.
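For the linear case, the Lyapunov property of the KL divergence can be observed directly. The following is a minimal sketch for a three-state ring whose rates are constructed so that a prescribed distribution is stationary while a nonzero current circulates, i.e., the steady state is not detailed balanced. The state labels, rates, and step size are illustrative assumptions.

```python
import math

p_st = [0.5, 0.3, 0.2]             # prescribed stationary distribution
edges = [(0, 1), (1, 2), (2, 0)]   # ring 0 -> 1 -> 2 -> 0

# Rates chosen so that, at p_st, each forward one-way flux is 2 and each
# backward one is 1: p_st is stationary with circulating current 1.
k_plus = [2.0 / p_st[src] for src, tgt in edges]
k_minus = [1.0 / p_st[tgt] for src, tgt in edges]

def velocity(p):
    """dp/dt = -S j(p) with j_e = k+_e p_src - k-_e p_tgt."""
    v = [0.0] * len(p)
    for (src, tgt), kp, km in zip(edges, k_plus, k_minus):
        j = kp * p[src] - km * p[tgt]
        v[src] -= j
        v[tgt] += j
    return v

def kl(p, q):
    """KL divergence between two normalized, strictly positive distributions."""
    return sum(a * math.log(a / b) for a, b in zip(p, q))

p = [0.9, 0.05, 0.05]
dt, n_steps = 1e-3, 3000
history = [kl(p, p_st)]
for _ in range(n_steps):
    p = [a + dt * b for a, b in zip(p, velocity(p))]   # explicit Euler step
    history.append(kl(p, p_st))

print(history[0], history[-1])
```

With this step size each Euler update is a multiplication by a stochastic matrix that leaves the stationary distribution invariant, so the recorded divergence is non-increasing at every step, not only in the continuous-time limit.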

9.3 Effective flux of the nonequilibrium flow by the primal information-geometric projection

In general, the nonequilibrium force or flux has redundant degrees of freedom in terms of generating a specific vector field or trajectory $\{\varvec{x}_{t}\}$ on the density space. By using the extended HHK projective decomposition (Theorem 1), we can carve out the effective part of the flux for the trajectory $\{\varvec{x}_{t}\}$. In addition, we can obtain an effective time-dependent equilibrium flux that mimics the trajectory $\{\varvec{x}_{t}\}$:

Lemma 6

(Effective time-dependent equilibrium flux) Suppose that $\varvec{j}(\varvec{x})$ is the flux of a generalized flow (Eq. 45), and define the corresponding effective equilibrium flux by $\varvec{j}_{eq}(\varvec{x})=\partial \Psi ^{*}_{\varvec{x}}[-\mathbb {S}^{T}\varvec{u}_{eq}(\varvec{x})]$ where $\varvec{u}_{eq}(\varvec{x}):=\partial \tilde{\Psi }_{\varvec{x}}[-\mathbb {S}\varvec{j}(\varvec{x})]$. Then, $\varvec{j}_{eq}(\varvec{x})$ induces the same velocity as $\varvec{j}(\varvec{x})$ does.^{Footnote 84} Furthermore, for a given trajectory $\{\varvec{x}_{t}\}$ of $\varvec{j}(\varvec{x})$, we can construct a time-dependent equilibrium flux $\varvec{j}_{eq}(t,\varvec{x})$ that generates the same $\{\varvec{x}_{t}\}$ as follows:

$$\begin{aligned} \varvec{j}_{eq}(t,\varvec{x})&:=\partial \Psi ^{*}_{\varvec{x}}\left[ \textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{t}]\right] ,&\tilde{\varvec{x}}_{t} :=\partial \Phi ^{*}[\partial \Phi (\varvec{x}_{t})+\varvec{u}_{eq}(\varvec{x}_{t})]. \end{aligned}$$

(151)

Proof

From Theorem 1, $\varvec{j}(\varvec{x})$ can be decomposed as $\varvec{j}(\varvec{x})=\varvec{j}_{eq}(\varvec{x})+(\varvec{j}(\varvec{x})-\varvec{j}_{eq}(\varvec{x}))$. Because $\varvec{f}_{eq}(\varvec{x})=\partial \Psi _{\varvec{x}}[\varvec{j}_{eq}(\varvec{x})] \in \mathcal {P}^{fr}(\varvec{0})$, there exists $\varvec{u}_{eq}(\varvec{x})$ satisfying $-\mathbb {S}^{T}\varvec{u}_{eq}(\varvec{x})=\varvec{f}_{eq}(\varvec{x})$. By employing the duality introduced in Theorem 2, $\varvec{u}_{eq}(\varvec{x})$ can be represented as $\varvec{u}_{eq}(\varvec{x})=\partial \tilde{\Psi }_{\varvec{x}}[\varvec{v}(\varvec{x})]$ where $\varvec{v}(\varvec{x})=-\mathbb {S}\varvec{j}(\varvec{x})$. Because $-\mathbb {S}\varvec{j}_{eq}(\varvec{x})=\varvec{v}(\varvec{x})$, $\varvec{j}_{eq}(\varvec{x})$ generates the same dynamics or vector field as $\varvec{j}(\varvec{x})$ does. By solving $-\varvec{u}_{eq}(\varvec{x}_{t})=\partial \Phi (\varvec{x}_{t})-\partial \Phi (\tilde{\varvec{x}}_{t})$, we have Eq. 151. $\square $
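The construction of the effective equilibrium flux becomes fully explicit in the quadratic special case. The sketch below is not the general nonlinear construction of Lemma 6: it assumes a quadratic dissipation function $\Psi ^{*}[\varvec{f}]=\frac{1}{2}\sum _{e}w_{e}f_{e}^{2}$ on a triangle graph with a single cycle vector, for which the projection reduces to the classical Helmholtz–Hodge decomposition. The weights, the flux, and the function names are illustrative.

```python
edges = [(0, 1), (1, 2), (2, 0)]   # triangle graph; edge e runs src -> tgt
cycle = [1.0, 1.0, 1.0]            # the single cycle vector V (S V = 0)
w = [1.0, 2.0, 0.5]                # assumed quadratic Psi*[f] = sum_e w_e f_e^2 / 2
j = [3.0, -1.0, 2.0]               # an arbitrary (nonequilibrium) flux

def div(flux):
    """S j with columns S_e = e_src - e_tgt, so that dx/dt = -S j."""
    out = [0.0, 0.0, 0.0]
    for (src, tgt), je in zip(edges, flux):
        out[src] += je
        out[tgt] -= je
    return out

def effective_equilibrium_flux(flux):
    """Subtract the cycle component so that the force j_eq / w has zero curl."""
    num = sum(v * je / we for v, je, we in zip(cycle, flux, w))
    den = sum(v * v / we for v, we in zip(cycle, w))
    c = num / den
    return [je - c * v for je, v in zip(flux, cycle)]

j_eq = effective_equilibrium_flux(j)
f_eq = [je / we for je, we in zip(j_eq, w)]        # f = dPsi[j], quadratic case
curl = sum(v * fe for v, fe in zip(cycle, f_eq))   # V^T f_eq
# Potential u_eq with f_eq = -S^T u_eq, fixed by the gauge u_eq[0] = 0.
u_eq = [0.0, f_eq[0], f_eq[0] + f_eq[1]]

print(div(j), div(j_eq), curl, u_eq)
```

The flux `j_eq` has the same divergence as `j`, and hence induces the same velocity, while its force is a pure gradient generated by the potential `u_eq`.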

The effective time-dependent equilibrium flux is obtained more explicitly for CRN with LMA kinetics:

Corollary 2

(Effective equilibrium force and flux of LMA kinetics) Consider the following quantities of CRN with LMA kinetics:

$$\begin{aligned}&\text{Flux defined in Eq. 12:} \quad \varvec{j}_{MA}(\varvec{x};\varvec{k}^{\pm }) \end{aligned}$$

(152)

$$\begin{aligned}&\text{Force defined in Eq. 70:} \quad \varvec{f}_{\textrm{MA}}(\varvec{x};\varvec{K}) \end{aligned}$$

(153)

$$\begin{aligned}&\text{Thermodynamic function defined in Eq. 62:} \quad \Phi (\varvec{x}) \end{aligned}$$

(154)

$$\begin{aligned}&\text{Dissipation functions defined in Eq. 68 with Eq. 70:} \quad \Psi ^{*}_{\varvec{x},\varvec{\kappa }}[\varvec{f}] = \Psi ^{*}_{\varvec{\omega }_{\textrm{MA}}(\varvec{x};\varvec{\kappa })}[\varvec{f}], \end{aligned}$$

(155)

where $\varvec{k}^{\pm }=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}$ holds. For a trajectory $\{\varvec{x}_{t}\}$ generated by $\varvec{j}_{MA}(\varvec{x};\varvec{k}^{\pm })$, the effective time-dependent equilibrium force $\varvec{f}_{eq}(t,\varvec{x})$ can be described as $\varvec{f}_{MA}(\varvec{x}; \varvec{K}_{eq}(t))$ where $\varvec{K}_{eq}(t)$ is determined by

$$\begin{aligned} \varvec{K}_{eq}(t)&:=\exp \left[ -\mathbb {S}^{T}\left( \varvec{u}_{eq}(\varvec{x}_{t};\varvec{\kappa })+\partial \Phi (\varvec{x}_{t}) \right) \right] , \nonumber \\&\varvec{u}_{eq}(\varvec{x}_{t};\varvec{\kappa }) :=\partial \tilde{\Psi }_{\varvec{x}_{t},\varvec{\kappa }}[-\mathbb {S}\varvec{j}_{MA}(\varvec{x}_{t})] \end{aligned}$$

(156)

Thus, the effective time-dependent equilibrium flux $\varvec{j}_{eq}(t,\varvec{x})$ is represented as $\varvec{j}_{eq}(t,\varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{eq}(t))$ where $\varvec{k}^{\pm }_{eq}(t)=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}_{eq}(t)$.

This corollary means that the effective time-dependent flux of LMA kinetics is always obtained by a time-dependent modulation of the kinetic parameters $\varvec{k}^{\pm }$. More specifically, the modulation of the force part $\varvec{K}_{eq}(t)$ is sufficient while the activity part $\varvec{\kappa }$ is kept constant.^{Footnote 85}

Example 3

(Simplified Brusselator CRN [8, 104] (continued)) By using the Brusselator CRN (Ex. 1), we numerically obtained a nonequilibrium trajectory $\{\varvec{x}_{t}\}$ (Fig. 8a, d top left panel). By using Cor. 2, we also computed the corresponding time-dependent kinetic parameter set $\varvec{k}^{\pm }_{eq}(t)$ that generates the time-dependent equilibrium flux $\varvec{j}_{eq}(t,\varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{eq}(t))$ (Fig. 8b, c). Figure 8d shows the vector field $\varvec{v}_{eq}(t, \varvec{x})=-\mathbb {S}\varvec{j}_{eq}(t,\varvec{x})$ induced by the time-dependent equilibrium flux $\varvec{j}_{eq}(t,\varvec{x})$ and the contours of $\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{t}]$ where $\tilde{\varvec{x}}_{t}$ follows from Eq. 151. From Fig. 8, we can see that the trajectory $\{\varvec{x}_{t}\}$ originally generated by the nonequilibrium flux $\varvec{j}(\varvec{x})$ can be traced by the time-dependent equilibrium flux $\varvec{j}_{eq}(t,\varvec{x})$ and also that $\varvec{j}_{eq}(t,\varvec{x})$ can be physically realized by the modulation of the kinetic parameters $\varvec{k}^{\pm }_{eq}(t)$.

9.4 Characterization of the nonequilibrium flow by the dual information geometric projection

The nonequilibrium flow is redundant in terms of generating a specific vector field or trajectory $\{\varvec{x}_{t}\}$. Such redundancy is crucial to characterize the extent of nonequilibrium. One approach for the characterization is to investigate the cycle force or flux, which has been employed in the linear theory of dynamics on graphs and also in graph-theoretic approaches to nonequilibrium phenomena [89, 155,156,157,158]. To extract such cyclic components, we can use $\mathbb {V}^{T}=\textrm{curl}_{\mathbb {V}}$, its adjoint $\mathbb {V}=\textrm{curl}^{*}_{\mathbb {V}}$, the associated cycle subspaces $C^{2}(\mathbb {H})$ and $C_{2}(\mathbb {H})$, and also the generalized HHK decomposition (Theorem 1).

Definition 34

(Cycle spaces) The cycle spaces at $\varvec{x}\in \mathcal {X}$ are defined as $\mathcal {Z}_{\varvec{x}} = C_{2}(\mathbb {H})=\textrm{Ker}[\mathbb {V}]$ and $\mathfrak {Z}_{\varvec{x}} = C^{2}(\mathbb {H})=\textrm{Im}[\mathbb {V}^{T}]$.

For a given nonequilibrium force $\varvec{f}(\varvec{x})$, we can obtain its cycle component $\varvec{\zeta } = \textrm{curl}_{\mathbb {V}}\varvec{f}(\varvec{x})=\mathbb {V}^{T}\varvec{f}(\varvec{x}) \in \mathfrak {Z}_{\varvec{x}}$. The component $\varvec{\zeta }$ contains the information to categorize the force because $\mathcal {P}^{fr}(\varvec{f})=\mathcal {P}^{fr}(\varvec{\zeta })$ is the quotient space of force by the equilibrium force. For each $\varvec{\zeta } \in \mathfrak {Z}_{\varvec{x}}$, we obtain the representative force $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })$ via the following variational problem:

Lemma 7

(Steady (zero-velocity) force as the minimizer of the dual dissipation function) For a given $\varvec{\zeta }\in \mathfrak {Z}_{\varvec{x}}$, we define the force $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })$ minimizing the dual dissipation function:

$$\begin{aligned} \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }):=\arg \min _{\varvec{f}}\Psi _{\varvec{x}}^{*}[\varvec{f}],\quad \text{ s.t. } \textrm{curl}_{\mathbb {V}} \varvec{f}=\varvec{\zeta }. \end{aligned}$$

(157)

Then, $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })$ is the steady (zero-velocity) force, i.e., $\varvec{j}^{\lozenge }=\partial \Psi _{\varvec{x}}^{*}[\varvec{f}^{\lozenge }] \in \mathcal {P}^{vl}(\varvec{0})$.

Proof

From Eq. 115 in Theorem 1, $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {P}^{fr}(\varvec{\zeta })\cap \mathcal {M}^{vl}(\varvec{0})$. Thus, $\varvec{j}^{\lozenge }\in \mathcal {P}^{vl}(\varvec{0})$. $\square $

Among the various forces $\varvec{f}$ that have the same cyclic component $\varvec{\zeta }$, the force $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })$ is the one that induces no dynamics of $\varvec{x}$. Because any dynamics of $\varvec{x}$ can be represented by the effective equilibrium flux as in Lemma 6, $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })$ can be regarded as the force being purely relevant to the cycle.
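The variational characterization in Lemma 7 can be made concrete in the quadratic special case. The sketch below assumes a quadratic dual dissipation function $\Psi ^{*}[\varvec{f}]=\frac{1}{2}\sum _{e}w_{e}f_{e}^{2}$ on a triangle graph with one cycle, so that the constrained minimizer follows from a single Lagrange multiplier. The weights, the value of $\zeta $, and the test potential are illustrative and not taken from the paper.

```python
edges = [(0, 1), (1, 2), (2, 0)]   # triangle graph; edge e runs src -> tgt
cycle = [1.0, 1.0, 1.0]            # single cycle vector V
w = [1.0, 2.0, 0.5]                # assumed quadratic weights

def psi_star(f):
    """Assumed quadratic dual dissipation function."""
    return 0.5 * sum(we * fe * fe for we, fe in zip(w, f))

def div(flux):
    """S j with columns S_e = e_src - e_tgt."""
    out = [0.0, 0.0, 0.0]
    for (src, tgt), je in zip(edges, flux):
        out[src] += je
        out[tgt] -= je
    return out

def steady_force(zeta):
    """Minimizer of psi_star subject to V^T f = zeta (multiplier lam)."""
    lam = zeta / sum(v * v / we for v, we in zip(cycle, w))
    return [lam * v / we for v, we in zip(cycle, w)]

zeta = 1.4
f_dia = steady_force(zeta)
j_dia = [we * fe for we, fe in zip(w, f_dia)]   # j = dPsi*[f], quadratic case

# Adding an equilibrium force -S^T u keeps the curl but cannot lower Psi*.
u = [0.3, -0.2, 0.5]
f_other = [fe - (u[src] - u[tgt]) for (src, tgt), fe in zip(edges, f_dia)]

print(div(j_dia), psi_star(f_dia), psi_star(f_other))
```

The minimizing force produces a flux proportional to the cycle vector, which has zero divergence and therefore induces no change of $\varvec{x}$.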

Using $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })$, we can establish the induced duality between $\mathcal {Z}_{\varvec{x}}$ and $\mathfrak {Z}_{\varvec{x}}$ spaces:

Theorem 3

(Induced dually flat structure on cycle spaces) On the cycle spaces, $\mathcal {Z}_{\varvec{x}}$ and $\mathfrak {Z}_{\varvec{x}}$, we have the Legendre conjugate dissipation functions $\hat{\Psi }_{\varvec{x}}: \mathcal {Z}_{\varvec{x}} \rightarrow \mathbb {R}$ and $\hat{\Psi }_{\varvec{x}}^{*}: \mathfrak {Z}_{\varvec{x}} \rightarrow \mathbb {R}$ induced by the dissipation functions on the edge spaces (Fig. 9).

Proof

For each $\varvec{\zeta }\in \mathfrak {Z}_{\varvec{x}}$, we can uniquely determine $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {F}_{\varvec{x}}$ and $\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })\in \mathcal {J}_{\varvec{x}}$ as

$$\begin{aligned} \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })&:=\mathcal {P}^{fr}(\varvec{\zeta })\cap \mathcal {M}^{vl}_{\varvec{x}}(\varvec{0})=\arg \min _{\varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta })} \Psi _{\varvec{x}}^{*}[\varvec{f}], \end{aligned}$$

(158)

$$\begin{aligned} \varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })&:=\partial \Psi ^{*}_{\varvec{x}}[\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })] \in \mathcal {P}^{vl}(\varvec{0}). \end{aligned}$$

(159)

In addition, $\varvec{z}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {Z}_{\varvec{x}}$ satisfying $\mathbb {V}\varvec{z}^{\lozenge }(\varvec{x},\varvec{\zeta }) = \varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })$ is also uniquely determined because $\mathbb {V}: \mathcal {Z}_{\varvec{x}} \rightarrow \mathcal {J}_{\varvec{x}}$, $\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {P}^{vl}(\varvec{0})=\textrm{Im}[\mathbb {V}]$ and $\textrm{Ker}\mathbb {V}=\{\varvec{0}\}$. For these quantities,

$$\begin{aligned} \varvec{\zeta }&= \mathbb {V}^{T} \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }),&\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })&= \partial \Psi _{\varvec{x}}[\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })],&\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })&= \mathbb {V}\varvec{z}^{\lozenge }(\varvec{x},\varvec{\zeta }), \end{aligned}$$

(160)

hold. Conversely, for a given $\varvec{z}\in \mathcal {Z}_{\varvec{x}}$, we have $\varvec{j}^{\blacklozenge }(\varvec{z})$, $\varvec{f}^{\blacklozenge }(\varvec{x}, \varvec{z})$, and $\varvec{\zeta }^{\blacklozenge }(\varvec{x},\varvec{z})$ as follows:

$$\begin{aligned} \varvec{\zeta }^{\blacklozenge }(\varvec{x},\varvec{z})&= \mathbb {V}^{T} \varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z}),&\varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z})&= \partial \Psi _{\varvec{x}}[\varvec{j}^{\blacklozenge }(\varvec{z})],&\varvec{j}^{\blacklozenge }(\varvec{z})&= \mathbb {V}\varvec{z}. \end{aligned}$$

(161)

For a pair $(\varvec{z},\varvec{\zeta })_{\varvec{x}}$ that satisfies $\varvec{z}=\varvec{z}^{\lozenge }(\varvec{x},\varvec{\zeta })$, the relations $\varvec{\zeta }=\varvec{\zeta }^{\blacklozenge }(\varvec{x},\varvec{z})$, $\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })=\varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z})$, and $\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })=\varvec{j}^{\blacklozenge }(\varvec{z})$ hold. This pairing establishes a bijection between $\mathcal {Z}_{\varvec{x}}$ and $\mathfrak {Z}_{\varvec{x}}$. This bijection is realized by the Legendre transformations of the following induced dissipation functions on $\mathcal {Z}_{\varvec{x}}$ and $\mathfrak {Z}_{\varvec{x}}$:

$$\begin{aligned} \hat{\Psi }_{\varvec{x}}(\varvec{z})&:=\Psi _{\varvec{x}}(\varvec{j}^{\blacklozenge }(\varvec{z})),&\hat{\Psi }_{\varvec{x}}^{*}(\varvec{\zeta })&:=\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })). \end{aligned}$$

(162)

These functions are Legendre conjugate as follows:

$$\begin{aligned} \max _{\varvec{z}'\in \mathcal {Z}_{\varvec{x}}}&\left[ \langle \varvec{z}', \varvec{\zeta }\rangle -\hat{\Psi }_{\varvec{x}}(\varvec{z}') \right] =\max _{\begin{array}{c} \varvec{z}'\in \mathcal {Z}_{\varvec{x}}\\ \varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta }) \end{array}}\left[ \langle \varvec{z}', \mathbb {V}^{T}\varvec{f}\rangle -\hat{\Psi }_{\varvec{x}}(\varvec{z}') \right] \\&=\max _{\begin{array}{c} \varvec{z}'\in \mathcal {Z}_{\varvec{x}}\\ \varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta }) \end{array}}\left[ \langle \mathbb {V}\varvec{z}', \varvec{f}\rangle -\Psi _{\varvec{x}}(\mathbb {V}\varvec{z}') \right] \\&=\max _{\begin{array}{c} \varvec{j}'\in \mathcal {P}^{vl}(\varvec{0})\\ \varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta }) \end{array}}\left[ \langle \varvec{j}', \varvec{f}\rangle -\Psi _{\varvec{x}}(\varvec{j}') \right] \\&=\max _{\begin{array}{c} \varvec{j}'\in \mathcal {P}^{vl}(\varvec{0})\\ \varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta }) \end{array}}\left[ \langle \varvec{j}', \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\rangle -\Psi _{\varvec{x}}(\varvec{j}') + \langle \varvec{j}', (\varvec{f}-\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }))\rangle \right] \\&=\max _{\begin{array}{c} \varvec{j}'\in \mathcal {P}^{vl}(\varvec{0}) \end{array}}\left[ \langle \varvec{j}', \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\rangle -\Psi _{\varvec{x}}(\varvec{j}') \right] =\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }))\nonumber \\ {}&=\hat{\Psi }_{\varvec{x}}^{*}(\varvec{\zeta }), \end{aligned}$$

where we used $\langle \varvec{j}', (\varvec{f}-\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }))\rangle =0$ because $\varvec{j}' \in \mathcal {P}^{vl}(\varvec{0})=\textrm{Im}\mathbb {V}$ and $\varvec{f}-\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {P}^{fr}(\varvec{0})=\textrm{Ker}\mathbb {V}^{T}$. The inverse is also shown:

$$\begin{aligned} \max _{\varvec{\zeta }'\in \mathfrak {Z}_{\varvec{x}}}&\left[ \langle \varvec{z}, \varvec{\zeta }'\rangle -\hat{\Psi }_{\varvec{x}}^{*}(\varvec{\zeta }') \right] =\max _{\begin{array}{c} \varvec{\zeta }'\in \mathfrak {Z}_{\varvec{x}} \end{array}}\left[ \langle \varvec{z}, \mathbb {V}^{T}\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }')\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }')) \right] \\&=\max _{\begin{array}{c} \varvec{f}^{\lozenge }\in \mathcal {M}^{vl}_{\varvec{x}}(\varvec{0}) \end{array}}\left[ \langle \mathbb {V}\varvec{z}, \varvec{f}^{\lozenge }\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }) \right] =\max _{ \varvec{f}^{\lozenge }\in \mathcal {M}^{vl}_{\varvec{x}}(\varvec{0})} \left[ \langle \varvec{j}^{\blacklozenge }(\varvec{z}), \varvec{f}^{\lozenge }\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }) \right] \\&=\Psi _{\varvec{x}}(\varvec{j}^{\blacklozenge }(\varvec{z})) =\hat{\Psi }_{\varvec{x}}(\varvec{z}) \end{aligned}$$

The pair $(\varvec{z},\varvec{\zeta })_{\varvec{x}}$ is Legendre dual with respect to these functions:

$$\begin{aligned} \partial _{\varvec{\zeta }} \hat{\Psi }_{\varvec{x}}^{*}(\varvec{\zeta })&= \left[ \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }}\right] ^{T}\left. \frac{\partial \Psi _{\varvec{x}}^{*}(\varvec{f})}{\partial \varvec{f}}\right| _{\varvec{f} =\varvec{f}^{\lozenge }(\varvec{x}, \varvec{\zeta })}\nonumber \\ {}&=\left[ \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }} \right] ^{T}\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta }) \end{aligned}$$

(163)

$$\begin{aligned}&=\left[ \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }}\right] ^{T}\mathbb {V}\varvec{z}^{\lozenge } (\varvec{x},\varvec{\zeta })=\varvec{z}, \end{aligned}$$

(164)

$$\begin{aligned} \partial _{\varvec{z}} \hat{\Psi }_{\varvec{x}}(\varvec{z})&= \left[ \frac{\partial \varvec{j}^{\blacklozenge }(\varvec{z})}{\partial \varvec{z}}\right] ^{T}\left. \frac{\partial \Psi _{\varvec{x}}(\varvec{j})}{\partial \varvec{j}}\right| _{\varvec{j} =\varvec{j}^{\blacklozenge }(\varvec{z})} =\left[ \frac{\partial \varvec{j}^{\blacklozenge }(\varvec{z})}{\partial \varvec{z}}\right] ^{T}\varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z})\nonumber \\ {}&=\mathbb {V}^{T}\varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z})=\varvec{\zeta }, \end{aligned}$$

(165)

where we used $\left[ \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }}\right] ^{T}\mathbb {V}=I$ from $\frac{\partial }{\partial \varvec{\zeta }}[\varvec{\zeta }-\mathbb {V}^{T} \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })]=I- \mathbb {V}^{T} \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }}=0$ and $\frac{\partial \varvec{j}^{\blacklozenge }(\varvec{z})}{\partial \varvec{z}} =\mathbb {V}$. The induced functions are dissipation functions: strict convexity and 1-coercivity follow from those of the original dissipation functions. Also, we have

$$\begin{aligned}&\text{ Symmetry: } \hat{\Psi }_{\varvec{x}}(-\varvec{z}) = \Psi _{\varvec{x}}(\varvec{j}^{\blacklozenge }(-\varvec{z})) =\Psi _{\varvec{x}}(-\varvec{j}^{\blacklozenge }(\varvec{z})) =\hat{\Psi }_{\varvec{x}}(\varvec{z}) \end{aligned}$$

(166)

$$\begin{aligned}&\text{ Bounded } \text{ by } 0\hbox { at }\varvec{0}: \hat{\Psi }_{\varvec{x}}(\varvec{z}=\varvec{0})=\Psi _{\varvec{x}} (\varvec{j}^{\blacklozenge }(\varvec{0}))=\Psi _{\varvec{x}}(\varvec{0})=0. \end{aligned}$$

(167)

$\square $
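The Legendre duality of Theorem 3 can be checked numerically in a nonquadratic setting. The sketch below assumes a cosh-type pair of dissipation functions, $\Psi ^{*}[\varvec{f}]=2\sum _{e}\omega _{e}[\cosh (f_{e}/2)-1]$ and its conjugate $\Psi [\varvec{j}]$, on a triangle graph with a single cycle. This choice, the weights $\omega _{e}$, and the function names are assumptions made for illustration. Starting from $z$, it follows Eq. 161 to obtain $\zeta $ and then tests Eq. 165 by a finite difference and the conjugacy of Eq. 162 through the Fenchel–Young equality.

```python
import math

cycle = [1.0, 1.0, 1.0]    # single cycle vector V of a triangle graph
omega = [1.0, 2.5, 0.4]    # illustrative edge weights

def psi(j):
    """Assumed primal dissipation function (conjugate of the cosh form)."""
    return sum(2 * je * math.asinh(je / we)
               - 2 * math.sqrt(we * we + je * je) + 2 * we
               for je, we in zip(j, omega))

def psi_star(f):
    """Assumed cosh-type dual dissipation function."""
    return sum(2 * we * (math.cosh(fe / 2) - 1) for fe, we in zip(f, omega))

def psi_hat(z):
    """Induced function on the cycle space: Psi(V z), as in Eq. 162."""
    return psi([v * z for v in cycle])

def zeta_of(z):
    """Follow Eq. 161: j = V z, f = dPsi[j], zeta = V^T f."""
    f = [2 * math.asinh(v * z / we) for v, we in zip(cycle, omega)]
    return sum(v * fe for v, fe in zip(cycle, f)), f

z = 0.7
zeta, f = zeta_of(z)

h = 1e-5
dpsi_hat = (psi_hat(z + h) - psi_hat(z - h)) / (2 * h)   # approximates Eq. 165
psi_hat_star = psi_star(f)    # Eq. 162 evaluated at the paired force
fenchel_gap = psi_hat(z) + psi_hat_star - z * zeta

print(zeta, dpsi_hat, fenchel_gap)
```

The finite-difference derivative of the induced function reproduces $\zeta $, and the Fenchel–Young gap vanishes, as expected for a Legendre dual pair $(z,\zeta )$.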

For a given force $\varvec{f}(\varvec{x})$ of the generalized flow (Eq. 45), $\varvec{f}_{st}(\varvec{x})=\varvec{f}^{\lozenge }(\varvec{x}, \mathbb {V}^{T}\varvec{f}(\varvec{x}))$ works as the effective cycle force for each $\varvec{x} \in \mathcal {X}$. Similarly to Cor. 2, we obtain the effective cycle force and the flux for LMA kinetics by parametric modulation:

Corollary 3

(Effective cycle force and flux for LMA kinetics) Consider CRN with LMA kinetics as in Cor. 2. For each $\varvec{x}\in \mathcal {X}$, the effective cycle force $\varvec{f}_{st}(\varvec{x})$ associated with $\varvec{j}_{MA}(\varvec{x};\varvec{k}^{\pm })$ can be described as $\varvec{f}_{st}(\varvec{x})=\varvec{f}_{MA}(\varvec{x}; \varvec{K}_{st}(\varvec{x}))$ where $\varvec{K}_{st}(\varvec{x})$ is determined by

$$\begin{aligned} \varvec{K}_{st}(\varvec{x})&:=\exp \left[ \varvec{f}^{\lozenge }(\varvec{x};\mathbb {V}^{T} \varvec{f}_{MA}(\varvec{x};\varvec{K}))-\mathbb {S}^{T}\partial \Phi (\varvec{x}) \right] . \end{aligned}$$

(168)

Thus, the effective cycle flux $\varvec{j}_{st}(\varvec{x})$ is represented as $\varvec{j}_{st}(\varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{st}(\varvec{x}))$ where $\varvec{k}^{\pm }_{st}(\varvec{x})=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}_{st}(\varvec{x})$. For a given trajectory $\{\varvec{x}_{t}\}$, which is generated by a generalized flow, we have the effective time-dependent cycle flux $\varvec{j}_{st}(t, \varvec{x})$ as $\varvec{j}_{st}(t, \varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{st}(\varvec{x}_{t}))$. By construction, this time-dependent cycle flux makes $\varvec{x}_{t}$ a steady state for each $t$, i.e., $\mathbb {S}\varvec{j}_{st}(t, \varvec{x}_{t})=\varvec{0}$ holds for any $t$.

Example 4

(Simplified Brusselator CRN [8, 104] (continued)) For the nonequilibrium trajectory of the Brusselator CRN in Fig. 10a, we numerically obtained the effective cycle flux $\varvec{j}_{st}(t,\varvec{x})$ and the corresponding time-dependent kinetic parameter set $\varvec{k}_{st}^{\pm }(t)$ (Fig. 10b, c). Figure 10d shows the vector field $\varvec{v}_{st}(t,\varvec{x})=-\mathbb {S}\varvec{j}_{st}(t,\varvec{x})$ induced by the effective cycle flux $\varvec{j}_{st}(t,\varvec{x})$. From Fig. 10, we can see that any point on the trajectory $\{\varvec{x}_{t}\}$ originally generated by the nonequilibrium flux $\varvec{j}(\varvec{x}_{t})$ can be kept steady with the time-dependent cycle flux $\varvec{j}_{st}(t,\varvec{x})$ realized by the modulation of the kinetic parameter $\varvec{k}^{\pm }_{st}(t)$.

In modern nonequilibrium thermodynamics, it has been a great challenge to establish thermodynamic characterizations for nonequilibrium phenomena. To this end, the dissection of dynamics and the corresponding flux and force has been attempted [104, 159, 160, 161, 162]. For a given trajectory $\{\varvec{x}_{t}\}$, the effective time-dependent equilibrium flux $\varvec{j}_{eq}(t, \varvec{x})$ generates exactly the same trajectory, which dissects and mimics the dynamic aspect of the trajectory. On the other hand, the effective time-dependent cycle flux $\varvec{j}_{st}(t, \varvec{x})$ makes each point on the trajectory steady, which can be recognized as the nonequilibrium aspect of the trajectory. Moreover, these two types of fluxes can be realized by appropriately modulating the kinetic parameter set $\varvec{k}^{\pm }$, which makes the dissected fluxes physically meaningful and accessible. More specifically, the modulation of the force part $\varvec{K}$ of the kinetic parameter is sufficient for realization (Eq. 156 and Eq. 168), while the activity part $\varvec{\kappa }$ is kept constant. In the case of CRN, the former is linked to the free energy difference between reactants and products of each reaction, and the latter is associated with the height of the energy barrier between them. This clear separation of different physical parameters in our framework is advantageous for further investigating the physical aspects of dynamics on graphs and hypergraphs. Thus, the dually flat structure on the edge space and the HHK decomposition provide a new and promising way to characterize the nonequilibrium flow.^{Footnote 86}

10 Summary and discussion

In this work, we have shown that the doubly dual flat structure of the vertex and edge spaces on graphs and hypergraphs provides the information-geometric basis for the dynamics on graphs and hypergraphs. Two notions of orthogonality, pseudo-Hilbert isosceles orthogonality and information-geometric orthogonality, have been introduced and shown to dissect the equilibrium and nonequilibrium aspects of the dynamics into the induced structures on the tangent and cotangent spaces and the cycle spaces. The doubly dual flat structure naturally connects the topological information of underlying discrete manifolds, i.e., graphs and hypergraphs, with the dynamics on them and thus endows the information-geometric modeling of dynamics with more flexibility and representational power. Furthermore, the generalized equilibrium and nonequilibrium flows, as well as the generalized flow, accommodate a sufficiently wide range of models, which include the reversible Markov jump processes on finite graphs and CRN with LMA kinetics (a class of PDS). These results could substantially extend the applicability of information geometry to dynamical problems.

10.1 Extension of other relations involving information measures

While we demonstrated that the generalized flow and the doubly dual flat structure can extend several results known for FPE and diffusion processes, we still have potentially relevant results and problems that could be explained and extended in our framework. For example, for FPE and diffusion processes, the Fisher information number $\mathbb {I}_{F}$ was extended to the relative Fisher information (also known as Hyvärinen divergence [120, 144]). The relative Fisher information of two trajectories $p^{(1)}_{t}(\varvec{r})$ and $p^{(2)}_{t}(\varvec{r})$ is known to satisfy information-theoretic relations such as the de Bruijn identity [54] and its extensions [56, 62]. In addition, the logarithmic Sobolev inequality also constitutes a relationship between the Fisher information number and the KL divergence (or Shannon information) [63, 64]. It would be an important future problem to associate these results with the doubly dual flat structure.
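
As a reminder of the kind of relation meant here, the classical (non-relative) de Bruijn identity states that, for $X$ perturbed by independent Gaussian noise $\sqrt{t}Z$, the time derivative of the differential entropy equals half the Fisher information number. The following Python sketch checks this only in the simplest case, assuming a one-dimensional Gaussian $X$ with variance `sigma2`, for which both quantities are available in closed form. The function names and parameter values are hypothetical, and the sketch is not tied to the doubly dual flat structure.

```python
import math

def gaussian_entropy(var):
    """Differential entropy of a 1-D Gaussian with variance var."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)

def gaussian_fisher_number(var):
    """Fisher information number (location Fisher information) of a 1-D Gaussian."""
    return 1.0 / var

def de_bruijn_gap(sigma2, t, h=1e-6):
    """Return d/dt h(X + sqrt(t) Z) - (1/2) J(X + sqrt(t) Z) for Gaussian X.

    X + sqrt(t) Z is Gaussian with variance sigma2 + t; the derivative is
    approximated by a central finite difference with step h.
    """
    var = sigma2 + t
    dh_dt = (gaussian_entropy(var + h) - gaussian_entropy(var - h)) / (2.0 * h)
    return dh_dt - 0.5 * gaussian_fisher_number(var)

# The gap should vanish up to finite-difference error.
print(de_bruijn_gap(sigma2=1.0, t=0.5))
```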

Moreover, several relations potentially related to De Giorgi’s formulation (Eq. 52) are known for mutual information in filtering and control theories. For example, Guo, Shamai, and Verdú found a relation between mutual information and the minimum mean square error (MMSE) in Gaussian channels [163]. Similar relations have also been reported by Mayer-Wolf and Zakai [164, 165]. Our framework may offer a unified perspective behind these different types of relations involving information measures.

10.2 Extensions of the doubly dual flat structure

There is also room for extensions of the doubly dual flat structure. While we consider only strictly convex thermodynamic functions and dissipation functions, the strict convexity is not necessarily required, at least for defining the generalized flow and the equilibrium and nonequilibrium flows^{Footnote 87}. Actually, in terms of thermodynamics, the thermodynamic function can be non-strictly convex when a phase transition of the system occurs [117]. The loss of bijectivity via the loss of the strict convexity can happen in complicated and degenerate statistical models [166]. Techniques from algebraic geometry could be employed to address such situations [167].

Moreover, the structure introduced for rLDG and CRN may be extended to irreversible cases, where some edges have only either forward or reverse jumps or reactions. For this purpose, we may take advantage of several results about the CB states obtained in CRN theory [8], where reversibility is not necessarily assumed, and of those in stochastic thermodynamics for absolutely irreversible processes [168].

While the nonequilibrium flow is general enough to cover at least all reversible CRN with LMA kinetics, the classes of nonlinear dynamics other than CRN are much wider in general. To further extend the range of models that can be covered, GENERIC (General Equation for Non-Equilibrium Reversible–Irreversible Coupling) would be a good candidate [169]. GENERIC is a theoretical framework to integrate dissipative dynamics (gradient flow dynamics) and conservative dynamics (Hamiltonian dynamics). The extension of the generalized flow to GENERIC has already been attempted but is still ongoing [77, 79]. One might also consider Hamiltonian-type dynamics, which differs from the GENERIC structure mentioned above. In the doubly dual flat structure, dual spaces are statically coupled by the Legendre duality. However, we could consider the coupling of two dynamics, each of which is defined on the primal and the dual spaces. Such coupling has been investigated in relation to accelerations of gradient flows [170], optimal control problems [171], and also mean field game problems [172]. It would be an interesting problem to formulate this dynamic coupling in relation to our results and also the results of GENERIC. The information geometry could offer new insights and techniques to achieve these missions.

10.3 Homological algebra and differential geometric formulations

From the viewpoint of standard homological algebra, the doubly dual flat structure that we introduced is an extension of chain and cochain complexes with inner product structure. Because the homological algebra used here is an abstraction of the differential form, the doubly dual flat structure can also be viewed as an extension of the differential form and might be called a dually flat form. It would be an interesting task to characterize this structure under a more rigorous mathematical formulation and to investigate whether the Legendre duality can be consistently introduced for chains and cochains higher than those of the edge space. From the viewpoint of differential geometry, the dually flat structure can be defined independently of the specific coordinate by the Hessian geometry [145]. While we stick to the standard basis of the graph or hypergraph on which the convex thermodynamic function is defined, we can formulate it more generally. It would be an important future work to clarify how the doubly dual flat structure can be formulated from a differential geometric perspective.

10.4 Consistency and persistence

Finally, we would like to mention the problem of consistency and persistence. In this work, we presume that the flux $\varvec{j}(\varvec{x})$ is consistent with $\mathbb {H}$ and that the trajectory is persistent. The explicit conditions under which these are satisfied are still elusive. Actually, the condition for consistency is intricate, even for the separable cases. For an illustrative example, suppose that the ith molecule is involved as a reactant in the $e\hbox {th}$ reaction of a CRN with LMA kinetics. For $x_{i} \rightarrow 0$, $j_{e}^{+}(\varvec{x}) \rightarrow 0$ holds. However, their Legendre duals diverge as $y_{i} \rightarrow -\infty $ and $f_{e}(\varvec{x}) \rightarrow -\infty $. The flux $j_{e}(\varvec{x})$ stays finite because $\omega _{e}(\varvec{x})\rightarrow 0$ holds. This example suggests that we have to consider a certain limit of relevant quantities to appropriately address the consistency condition.
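
The limit described above can be observed numerically. The following Python sketch is a minimal illustration, not part of the original analysis: it assumes a single hypothetical reversible reaction $X_{1} \rightleftharpoons X_{2}$ with LMA one-way fluxes $j^{+}=k^{+}x_{1}$ and $j^{-}=k^{-}x_{2}$, and it assumes that the force and activity take the forms $f=\ln (j^{+}/j^{-})$ and $\omega =2\sqrt{j^{+}j^{-}}$, so that $j=\omega \sinh (f/2)=j^{+}-j^{-}$. The function name and all numerical values are illustrative choices.

```python
import math

def lma_one_step(x1, x2, k_plus=1.0, k_minus=1.0):
    """Force, activity and flux of a hypothetical reaction X1 <-> X2 with LMA kinetics."""
    j_plus = k_plus * x1                              # forward one-way flux
    j_minus = k_minus * x2                            # reverse one-way flux
    force = math.log(j_plus / j_minus)                # assumed form f = ln(j+/j-)
    activity = 2.0 * math.sqrt(j_plus * j_minus)      # assumed form w = 2 sqrt(j+ j-)
    flux = activity * math.sinh(force / 2.0)          # equals j_plus - j_minus
    return j_plus, force, activity, flux

# As x1 -> 0 the force diverges to -infinity and the activity vanishes,
# while their combination, the flux, stays finite (it approaches -k_minus * x2).
for x1 in (1e-2, 1e-4, 1e-8):
    j_plus, force, activity, flux = lma_one_step(x1, x2=1.0)
    print(f"x1={x1:.0e}  j+={j_plus:.1e}  f={force:.2f}  w={activity:.1e}  j={flux:.6f}")
```

The divergence of the force and the vanishing of the activity compensate each other exactly in this toy case, which is the kind of limit the consistency condition would have to handle.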

The persistence of the nonequilibrium flow would be a much harder problem. While persistence has been approached for CB CRN with LMA kinetics using techniques from algebraic geometry [112], its connection to information geometry has yet to be clarified. Moreover, from an information-geometric viewpoint, the loss of persistence means a change in the support of probability or positive density, which effectively results in a change in the topology of the underlying graph or hypergraph. To resolve the problem, we may need a deeper understanding of the interrelationship among dynamics, information-geometric structure, and the underlying topology.

Data availability statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Notes

Also known as Markov chains.
The relation between the two forms of Fisher information has been explained in multiple ways. For example, they are related as the shift of the base space via parameters [36, 61]. The Fisher information number was introduced by Rao [58].
Ordered alphabetically.
This means that we exclude self loops.
We may consider other functional forms for $\varvec{j}(\varvec{x})$, which can induce nonlinear dynamics on the graph. In this work, we focus mainly on the linear case.
Also known as the Hadamard product or Schur product of vectors.
This interpretation is because Eq. 2 is associated with the continuity equation on a Euclidean space or on a Riemannian manifold where we have the divergence operator $\nabla \cdot $ instead of $\mathbb {B}$. However, divergence on a Riemannian manifold implicitly includes the information of the metric via the Hodge operator. On the contrary, $\mathbb {B}$ does not. From the viewpoint of homological algebra, $\mathbb {B}$ should be regarded as the adjoint (transpose) of the discrete exterior derivative operator $\delta ^{0}:=\mathbb {B}^{T}$, which is also often called a discrete gradient operator [73]. For a Euclidean space, they are the same.
We use the word ‘reversible’ in this work to mean that each edge allows both forward and reverse jumps, whereas the term reversible Markov jump process sometimes means that the detailed balance condition is satisfied. We introduce the notion of equilibrium later to designate the detailed balanced situation.
If we allow $k_{e}^{\pm }$ to be 0, we can include the irreversible MJP and also LDG in this formulation. We leave this extension for future work because it should require additional assumptions on the Legendre duality introduced in the subsequent sections.
Here, we have abused the notation $\mathbb {v}^{+}_{e}$ to indicate the index of the vertex $\mathbb {v}^{+}_{e}$. Equation 3 is reduced to Eq. 4 because $b^{\pm }_{i,e}=+1$ only when i is the index of the tail (head) vertex $\mathbb {v}^{\pm }_{e}$ of $\mathbb {e}_{e}$ and 0 otherwise.
For some of these applications, the relevant state space is $\mathbb {R}^{N_{\mathbb {v}}}$ instead of $\mathbb {R}^{N_{\mathbb {v}}}_{\ge 0}$.
This means that we exclude self loop hyperedges. However, the head and tail hypervertices are allowed to contain the same vertices as long as $\hat{\mathbb {v}}^{+}_{e} \ne \hat{\mathbb {v}}^{-}_{e}$ holds.
Our definition of CRN hypergraph differs in a couple of aspects from the conventional definition because of the additional information required to define CRN. For example, while the definition of edges is usually extended from those of graphs [87], our definition extends vertices instead.
Because we reserve the term complex for the cell complex in homological algebra, we use hypervertices to indicate the complexes of the CRN theory.
Even though the functional form of $j_{e}^{\pm }(\varvec{x})$ is automatically determined in the case of LDG because of the linearity, we have multiple possibilities to define nonlinear $j_{e}^{\pm }(\varvec{x})$.
We should note an important relation, , which holds because every column vector of $\mathbb {B}^{\pm }$ contains only one $+1$ and the others are 0.
There exists another type of extension known as generalized LMA kinetics where the monomials are replaced with fractional monomials, i.e., powers of $\varvec{x}$ with nonnegative real-valued exponents [102].
The complex used here should not be confused with complexes used in CRN theory [8].
We follow the terminology in [73]. While we use “cell”, we do not presume any N-dimensional topological manifold underlying the graph. The graph is just treated algebraically as in algebraic graph theory and homological algebra.
Depending on the choice of which elements of a graph are considered, the content of the complex changes. For example, vertices and edges are the major ingredients of the complex of a graph. The faces of a graph are often included in the complex. The definition of the higher-order elements than edges requires additional structural information to the incidence matrix of the graph, e.g., the edge-face incidence matrix.
In algebraic graph theory, the chain of a graph $\mathbb {G}$ is defined as an integer-valued vector space $\mathbb {Z}^{N_{\mathbb {v}}}$ to represent the discrete and combinatorial nature of $\mathbb {G}$ and also to specify the domain of integration. Here, we use $\mathbb {R}$ as the field of the vector space.
In algebraic graph theory, $\mathbb {B}$ is also identical to the discrete boundary operator from $C_{1}(\mathbb {G};\mathbb {Z})$ to $C_{0}(\mathbb {G};\mathbb {Z})$.
$N_{\mathbb {z}}=0$ when $\mathbb {G}$ is a set of trees.
The spanning tree chosen specifies a fundamental cycle and cocycle bases.
$\mathbb {V}^{T}$ is called the fundamental tieset matrix in graph theory.
We should note that the sequence is not canonical because $\mathbb {U}$ and $\mathbb {V}$ depend on the choice of bases.
Upon necessity, we can consider the harmonic components by employing an under-complete basis for $\mathbb {V}$.
As far as we know, there is not a systematic and widely-appreciated way to define these bases because we have multiple ways to extend the notion of spanning tree of a graph to a hypergraph.
These notations are consistent with those in Sect. 2.
Here $\varvec{x}$ is not density but a vector in $\mathbb {R}^{N_{\mathbb {v}}}$.
The persistence of a dynamical system is a hard problem, and the persistence for a subclass of CRN is an open problem [111, 112], which has been known as the Global Attractor Conjecture since 1974.
In information geometry, the convex function inducing duality is often called a potential function. We avoid using the word “potential” to distinguish it from an element of the dual vertex affine space, which is called a potential (field) or chemical potential in physics and chemistry.
We may consider a convex function $\Phi (\varvec{x})$, which does not induce a bijection between $\mathcal {X}$ and $\mathcal {Y}$, e.g., the one which is not strictly convex. Such a situation can arise if a phase transition occurs. It would be an important direction to include this class of functions in this framework.
$\mathcal {Y}$ is not only associated with but also isomorphic to the 0-cochain. This condition is important when we consider information-geometric projections in the later sections. In the theory of differential forms, a 0-form is often described as a potential field on a manifold. Our choice of the potential space is consistent with this convention.
$\mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\Phi ,\Phi ^{*}}[\varvec{x}; \varvec{y}']$ is also called Fenchel-Young divergence [116].
We here used the Legendre-Fenchel-Young identity (Eq. 27).
When we work on Hessian matrices, we always suppose additionally that they are twice-differentiable.
One may further generalize the separability so that $\phi (x)$ depends on i as $\phi _{i}(x)$.
The definition of dissipation functions is stricter than those used in the previous works, e.g., [75]. This is because we define extended projections in this space as in [83].
We do not use differentiation of $\Psi ^{*}_{\varvec{x}}(\varvec{f})$ and $\Psi _{\varvec{x}}(\varvec{j})$ with respect to $\varvec{x}$ in this work.
From the physical point of view, these conditions are consistent with the thermodynamic requirement that, if the force is zero, the corresponding flux becomes zero, and vice versa, and that a sign-reversed force induces the sign-reversed flux.
In the context of thermodynamics, the nonnegativity of $\langle \varvec{j},\varvec{f}\rangle $ is linked to the nonnegativity of the entropy production rate and thus the second law of thermodynamics.
This correspondence illustrates that the dependency of $\Psi ^{q,*}_{\varvec{x}}(\varvec{f})$ on $\varvec{x}$ is a formal generalization of the Riemannian metric. But for this case, the relevant state space for $\varvec{x}$ is not the positive orthant but the vector space $\mathbb {R}^{N_{\mathbb {v}}}$.
The consistency is required because of our choice of $\mathbb {R}_{\ge 0}^{N_{\mathbb {X}}}$ as the density space.
The consistency with $\mathbb {H}$ is assumed to hold.
$\mathcal {M}^{\textrm{ST}}=\mathcal {M}^{DB}=\emptyset $ can hold, e.g., when $\mathcal {F}(\varvec{x})$ is a strictly monotonic function.
In addition, there exists the possibility that $\varvec{x}(t)$ converges to the boundary of $\mathcal {X}$.
The metric here means a general metric, which is not restricted to one associated with the inner product.
These inequality and equality conditions are usually derived by using the Cauchy-Schwarz inequality [127]. Within the information-geometric framework, they follow directly from the non-negativity of the Bregman divergence.
In physics, such $\varvec{f}_{NE}$ can be identified with a nonequilibrium force applied externally to the system.
For CRN, these forms of the thermodynamic functions are derived from the conventional thermodynamics of ideal gas or dilute solution with non-reactive solvent [48]. Mathematically, we may employ other functions as we introduce different information geometric structures onto a family of probabilities depending on the purpose. Such exploitation is an interesting open problem.
As we will see later, the condition $\varvec{1}^{T}\varvec{p}(t)=1$ need not be assumed but is automatically satisfied due to the topological constraint of the graph and the initial condition $\varvec{1}^{T}\varvec{p}(0)=1$ when we work on rMJP.
This is parallel to the problem of how to define the dual of $\varvec{x}$. The choice of logarithm is contingent on the domain and knowledge of physics and statistics.
Some relations were obtained empirically through experiments and others were computed theoretically from microscopic models.
LDB assumption is different from the DB condition in Def. 22.
The validity of LDB was shown for rMJP and CRN with LMA kinetics via large deviation theory for the corresponding microscopic Markovian models or via its consistency with the macroscopic chemical thermodynamics [130, 131].
The dissipation functions in Eq. 68 and the induced Legendre transformation in Eq. 69 are not necessarily restricted to these particular types of force and activity. Actually, the extended LMA kinetics (Eq. 13) can also be represented by replacing $\varvec{\omega }_{\textrm{MA}}(\varvec{x})$ with $\varvec{\omega }_{\textrm{eMA}}(\varvec{x})=\varvec{g}(\varvec{x})\circ \varvec{\omega }_{\textrm{MA}}(\varvec{x})$. Thus, Eq. 68 could be applied to a wider class of kinetics than Eq. 12.
For CRN, $\varvec{K}$ is referred to as the equilibrium constant in chemistry.
It should be noted that, while $\tilde{\varvec{y}}$ is not uniquely determined by $\varvec{K}$ in general, it does not cause problems. This is clarified in the following section (Sect. 8) by introducing appropriate affine subspaces.
Historically, the equilibrium chemical systems were characterized by macroscopic thermodynamics. The equilibrium condition was derived as the necessary and sufficient condition that the flux of the LMA kinetics (Eq. 12) should satisfy to have consistent properties with the thermodynamic equilibrium systems. It was found only recently that the equilibrium properties are mathematically attributed to the generalized gradient flow structure.
The DB condition is conventionally adopted because, for example, it makes it easy to obtain an MCMC with a desirable stationary distribution. This nice property comes from the gradient-flow property of the equilibrium flow.
It should be noted that this representation holds only when $\varvec{k}^{+}=\varvec{k}^{-}$ holds.
Even if we restrict the dynamics to $\mathcal {X}=\mathcal {Y}=\mathbb {R}^{N_{\mathbb {v}}}_{>0}$, no problem arises for defining the generalized flow as long as we do not consider projections that we are going to introduce.
The locality and separability may sound natural. However, from the physical viewpoint, the Onsager matrix can have nondiagonal components, which implies nonseparable dissipation functions. In addition, equilibrium thermodynamics does not preclude thermodynamic functions from being nonseparable.
This inequality is obtained directly from the inequality $2(a-b)^{2}/(a+b)\le (a-b)\ln a/b$ for $a,b>0$.
The base measure is omitted because this is just a formal one.
In CRN theory, a stoichiometric subspace is called stoichiometric compatibility class [8].
Because $\mathcal {P}^{sc}(\varvec{x}_{0})$ is restricted within the positive orthant $\mathcal {X}$, $\mathcal {P}^{sc}(\varvec{x}_{0})$ is a polyhedron. If bounded, it is called a polytope in discrete geometry and also in combinatorial optimization [146]. However, we abuse the word (affine) subspace for $\mathcal {P}^{sc}(\varvec{x}_{0})$, and use polyhedron or polytope when we care about the boundary.
In this work, orthogonality always means the orthogonal complement in dual vector spaces unless otherwise stated.
We abuse the notation $\varvec{x}^{\dagger } = \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$ because the intersection $\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})$ is a unique point.
In algebraic statistics, $\mathbb {U}$ is explicitly given as constraints of a statistical model. In the dynamics on graphs and hypergraphs, $\mathbb {S}$ is explicitly given, and $\mathbb {U}$ is implicitly defined as a complete basis of $\textrm{Ker}\mathbb {S}^{T}$. As a result, their connection is not immediately apparent.
It should be noted that, in general, $\varvec{j}_{st}\ne \varvec{j}-\varvec{j}_{eq}$, $\varvec{f}_{eq}\ne \varvec{f}-\varvec{f}_{st}$, $\varvec{j} \ne \varvec{j}_{eq}+\varvec{j}_{st}$, and $\varvec{f} \ne \varvec{f}_{eq}+\varvec{f}_{st}$ hold due to the nonlinearity of Legendre transformation, except when the dissipation functions are quadratic under which the Legendre dual relation is reduced to the linear inner-product relation.
While we introduce the central affine manifolds only on the edge space, they can be defined on the vertex space as well. The central affine manifolds on the vertex space of CRN become fundamental when we work on the isobaric processes in which the volume changes in conjunction with the reactions [85]. In this case, the volume is a global variable affecting all the reactions simultaneously.
If $\Vert \cdot \Vert ^{2}$ is a squared norm that is not necessarily induced by an inner product, the orthogonality is called isosceles or James orthogonality [147]. Here, $\Vert \cdot \Vert ^{2}$ is further replaced with the dissipation function, which does not satisfy some conditions required to be a norm.
Actually, this result is independent of the detail of the dissipation functions.
We have multiple types of H-theorems. The most famous one is Boltzmann’s H-theorem, in which the H function is derived from the microscopic dynamics of a system. In Gibbs’ H-theorem, the H function is obtained by coarse-graining the microscopic system [44].
Because $\varvec{f}^{\dagger }(\varvec{x},\varvec{v}) \in \mathcal {P}^{fr}(\varvec{0})=\textrm{Im}[\mathbb {S}^{T}]$ and $\tilde{\mathcal {T}}_{\varvec{x}}^{*} \mathcal {X}:=\mathcal {Y}/\textrm{Ker}\mathbb {S}^{T}$, $\varvec{u}^{\dagger }(\varvec{x},\varvec{v})$ is uniquely determined.
To have global stability, we have to consider the boundary of $\mathcal {X}$.
The transformation of Eq. 146 here is strongly dependent on the specific functional form of $\varvec{j}_{MA}(\varvec{x})$.
We skip the derivation because it is involved. See the original derivation [100] or our rephrased version [84].
The real variety generated by a toric ideal, i.e., a binomial and prime ideal [100].
Such a situation is called unconditionally complex balanced.
If $\textrm{Rank}[\mathbb {S}]=\textrm{Rank}[\mathbb {B}]$, CRN is unconditionally complex-balanced. This condition is called the deficiency zero condition [8].
$\varvec{j}_{eq}(\varvec{x})$ may not be equilibrium flux because $\varvec{u}_{eq}(\varvec{x})$ is not necessarily represented by the gradient of a certain function $\mathcal {F}(\varvec{x})$ as $\varvec{u}_{eq}(\varvec{x})=\partial _{\varvec{x}} \mathcal {F}(\varvec{x})$.
While this result may not sound very significant mathematically, for physics and chemistry it means that the effective flux is physically realizable and testable.
It should be noted that various definitions of housekeeping and excess EPRs have been proposed in the research field of nonequilibrium thermodynamics. The definition here is just one of them. It is an open problem how to define the notion of effective entropy for nonequilibrium dynamics.
The strict convexity of the dissipation functions is assumed in order to work on the projections in the edge spaces, where the bijection between $\mathcal {J}_{\varvec{x}}$ and $\mathcal {F}_{\varvec{x}}$ is important. In addition, $\mathcal {Y}=\mathbb {R}^{N_{\mathbb {v}}}$ is assumed to make the induced dually flat structure well-defined for all given $\varvec{v}$. To define only the generalized flow and the equilibrium and nonequilibrium flows, injectivity of $\partial \Psi ^{*}_{\varvec{x}}$ and $\partial \Phi $ is sufficient and $\mathcal {Y}=\mathbb {R}^{N_{\mathbb {v}}}$ is not required.

References

Amari, S.-I.: Information Geometry and Its Applications. Springer, New York (2016)
Ay, N., Gibilisco, P., Matúš, F.: Information Geometry and Its Applications: On the Occasion of Shun-ichi Amari’s 80th Birthday, IGAIA IV Liblice, Czech Republic, June 2016. Springer, New York (2018)
Risken, H., Frank, T.: The Fokker-Planck Equation: Methods of Solution And Applications. Springer Science & Business Media, New York (1996)
Horsthemke, W., Lefever, R.: Noise-Induced Transitions: Theory and Applications in Physics, Chemistry, and Biology. Springer Science & Business Media, New York (2006)
Gardiner, C.: Stochastic Methods: A Handbook for the Natural and Social Sciences. Springer, Berlin (2010)
Murray, J.D.: Mathematical Biology: I. An Introduction. Springer Science & Business Media, New York (2007)
Beard, D.A., Qian, H.: Chemical Biophysics: Quantitative Analysis of Cellular Systems. Cambridge Texts in Biomedical Engineering. Cambridge University Press, Cambridge (2008). https://doi.org/10.1017/CBO9780511803345
Feinberg, M.: Foundations of Chemical Reaction Network Theory. Springer, New York (2019)
Amari, S.: Differential Geometry in Statistical Inference. Institute of Mathematical Statistics, Hayward, Calif (1987)
Ravishanker, N., Melnick, E.L., Tsai, C.-L.: Differential Geometry of Arma Models. J. Time Ser. Anal. 11(3), 259–274 (1990). https://doi.org/10.1111/j.1467-9892.1990.tb00057.x
Tanaka, F., Komaki, F.: Asymptotic expansion of the risk difference of the Bayesian spectral density in the autoregressive moving average model. Sankhya A 73(1), 162–184 (2011). https://doi.org/10.1007/s13171-011-0005-1
Amari, S.-I: Differential geometry of a parametric family of invertible linear systems-Riemannian metric, dual affine connections, and divergence. Math. Systems Theory 20(1), 53–82 (1987). https://doi.org/10.1007/BF01692059
Amari, S.-I.: Information geometry on hierarchy of probability distributions. IEEE Trans. Inf. Theory 47(5), 1701–1711 (2001). https://doi.org/10.1109/18.930911
Nakagawa, K., Kanaya, F.: On the converse theorem in statistical hypothesis testing for Markov chains. IEEE Trans. Inf. Theory 39(2), 629–633 (1993). https://doi.org/10.1109/18.212294
Takeuchi, J., Barron, A.R.: Asymptotically minimax regret by Bayes mixtures. In: Proc. 1998 IEEE Int. Symp. Inf. Theory Cat No98CH36252, p. 318 (1998). https://doi.org/10.1109/ISIT.1998.708923
Nagaoka, H.: The Exponential Family of Markov Chains and Its Information Geometry. arXiv (2017). https://doi.org/10.48550/arXiv.1701.06119
Takeuchi, J., Kawabata, T.: Exponential Curvature of Markov Models. In: 2007 IEEE Int. Symp. Inf. Theory, pp. 2891–2895 (2007). https://doi.org/10.1109/ISIT.2007.4557657
Hayashi, M., Watanabe, S.: Information geometry approach to parameter estimation in Markov chains. Ann. Stat. 44(4), 1495–1535 (2016). https://doi.org/10.1214/15-AOS1420
Wolfer, G., Watanabe, S.: Information Geometry of Reversible Markov Chains. Info. Geo. 4(2), 393–433 (2021). https://doi.org/10.1007/s41884-021-00061-7
Pistone, G., Rogantin, M.P.: The algebra of reversible Markov chains. Ann. Inst. Stat. Math. 65(2), 269–293 (2013). https://doi.org/10.1007/s10463-012-0368-7
Obata, T., Hara, H., Endo, K.: Differential geometry of nonequilibrium processes. Phys. Rev. A 45(10), 6997–7001 (1992). https://doi.org/10.1103/PhysRevA.45.6997
Ohara, A.: Geometric study for the Legendre duality of generalized entropies and its application to the porous medium equation. Eur. Phys. J. B 70(1), 15–28 (2009). https://doi.org/10.1140/epjb/e2009-00170-y
Ohara, A., Zhang, X.: Properties of Nonlinear Diffusion Equations on Networks and Their Geometric Aspects. In: Nielsen, F., Barbaresco, F. (eds.) Geom. Sci. Inf. Lecture Notes in Computer Science, pp. 736–743. Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-80209-7_79
Nakamura, Y.: Completely integrable gradient systems on the manifolds of Gaussian and multinomial distributions. Japan J. Indust. Appl. Math. 10(2), 179 (1993). https://doi.org/10.1007/BF03167571
Article MathSciNet Google Scholar
Fujiwara, A., Amari, S.-I.: Gradient systems in view of information geometry. Physica D 80(3), 317–327 (1995). https://doi.org/10.1016/0167-2789(94)00175-P
Article MathSciNet Google Scholar
Felice, D., Ay, N.: Dynamical Systems Induced by Canonical Divergence in Dually Flat Manifolds. arXiv (2018). https://doi.org/10.48550/arXiv.1812.04461
Goto, S.-I., Wada, T.: Hessian–information geometric formulation of Hamiltonian systems and generalized Toda’s dual transform. J. Phys. A: Math. Theor. 51(32), 324001 (2018). https://doi.org/10.1088/1751-8121/aacbdf
Article MathSciNet Google Scholar
Chirco, G., Malagò, L., Pistone, G.: Lagrangian and Hamiltonian Mechanics for Probabilities on the Statistical Manifold. Int. J. Geom. Methods Mod. Phys., 2250214 (2022) https://doi.org/10.1142/S0219887822502140 arxiv:2009.09431 [hep-th, stat]
Ihara, S.: Information Theory for Continuous Systems. World Scientific (1993)
Brigo, D., Hanzon, B., Gland, F.L.: Approximate nonlinear filtering by projection on exponential manifolds of densities. Bernoulli 5(3), 495–534 (1999)
Article MathSciNet Google Scholar
Newton, N.J.: Nonlinear Filtering and Information Geometry: A Hilbert Manifold Approach. In: Ay, N., Gibilisco, P., Matúš, F. (eds.) Inf. Geom. Its Appl. Springer Proceedings in Mathematics & Statistics, pp. 189–208. Springer International Publishing, Cham (2018). https://doi.org/10.1007/978-3-319-97798-0_7
Fleming, W.H., Mitter, S.K.: Optimal control and nonlinear filtering for nondegenerate diffusion processes. Stochastics 8(1), 63–77 (1982). https://doi.org/10.1080/17442508208833228
Article MathSciNet Google Scholar
Todorov, E.: Efficient computation of optimal actions. Proc. Natl. Acad. Sci. 106(28), 11478–11483 (2009). https://doi.org/10.1073/pnas.0710743106
Article Google Scholar
Theodorou, E.A., Todorov, E.: Relative entropy and free energy dualities: Connections to Path Integral and KL control. In: 2012 IEEE 51st IEEE Conf. Decis. Control CDC, pp. 1466–1473 (2012). https://doi.org/10.1109/CDC.2012.6426381
Jaynes, E.T.: Information Theory and Statistical Mechanics. Phys. Rev. 106(4), 620–630 (1957). https://doi.org/10.1103/PhysRev.106.620
Article MathSciNet Google Scholar
Frieden, B.R.: Science from Fisher Information: A Unification. Cambridge University Press, Cambridge (2004). https://doi.org/10.1017/CBO9780511616907
Book Google Scholar
Sagawa, T.: Thermodynamics of Information Processing in Small Systems. Springer Science & Business Media (2012)
Google Scholar
Kullback, S., Leibler, R.A.: On Information and Sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951). https://doi.org/10.1214/aoms/1177729694
Article MathSciNet Google Scholar
Lebowitz, J.L., Bergmann, P.G.: Irreversible gibbsian ensembles. Ann. Phys. 1(1), 1–23 (1957). https://doi.org/10.1016/0003-4916(57)90002-7
Article MathSciNet Google Scholar
Shear, D.: An analog of the Boltzmann H-theorem (a Liapunov function) for systems of coupled chemical reactions. J. Theor. Biol. 16(2), 212–228 (1967). https://doi.org/10.1016/0022-5193(67)90005-7
Article Google Scholar
Horn, F., Jackson, R.: General mass action kinetics. Arch. Rational Mech. Anal. 47(2), 81–116 (1972). https://doi.org/10.1007/BF00251225
Article MathSciNet Google Scholar
Goh, B.S.: Global Stability in Many-Species Systems. Am. Nat. 111(977), 135–143 (1977). https://doi.org/10.1086/283144
Article Google Scholar
Figueiredo, A., Gléria, I.M., Rocha Filho, T.M.: Boundedness of solutions and Lyapunov functions in quasi-polynomial systems. Phys. Lett. A 268(4), 335–341 (2000). https://doi.org/10.1016/S0375-9601(00)00175-4
Article MathSciNet Google Scholar
Gibbs, J.W.: Elementary Principles in Statistical Mechanics: Developed with Especial Reference to the Rational Foundation of Thermodynamics. Cambridge Library Collection - Mathematics. Cambridge University Press, Cambridge (2010). https://doi.org/10.1017/CBO9780511686948
Waage, P., Gulberg, C.M.: Studies concerning affinity. J. Chem. Educ. 63(12), 1044 (1986). https://doi.org/10.1021/ed063p1044
Article Google Scholar
Ge, H., Qian, H.: Nonequilibrium thermodynamic formalism of nonlinear chemical reaction systems with Waage–Guldberg’s law of mass action. Chem. Phys. 472, 241–248 (2016). https://doi.org/10.1016/j.chemphys.2016.03.026
Article Google Scholar
Rao, R., Esposito, M.: Nonequilibrium Thermodynamics of Chemical Reaction Networks: Wisdom from Stochastic Thermodynamics. Phys. Rev. X 6(4), 041064 (2016). https://doi.org/10.1103/PhysRevX.6.041064
Article Google Scholar
Sughiyama, Y., Loutchko, D., Kamimura, A., Kobayashi, T.J.: Hessian geometric structure of chemical thermodynamic systems with stoichiometric constraints. Phys. Rev. Research 4(3), 033065 (2022). https://doi.org/10.1103/PhysRevResearch.4.033065
Article Google Scholar
Seifert, U.: Stochastic thermodynamics, fluctuation theorems and molecular machines. Rep. Prog. Phys. 75(12), 126001 (2012). https://doi.org/10.1088/0034-4885/75/12/126001
Article Google Scholar
Ito, S.: Stochastic Thermodynamic Interpretation of Information Geometry. Phys. Rev. Lett. 121(3), 030605 (2018). https://doi.org/10.1103/PhysRevLett.121.030605
Article Google Scholar
Kolchinsky, A., Wolpert, D.H.: Work, Entropy Production, and Thermodynamics of Information under Protocol Constraints. Phys. Rev. X 11(4), 041024 (2021). https://doi.org/10.1103/PhysRevX.11.041024
Article Google Scholar
Yoshimura, K., Ito, S.: Information geometric inequalities of chemical thermodynamics. Phys. Rev. Research 3(1), 013175 (2021). https://doi.org/10.1103/PhysRevResearch.3.013175
Article Google Scholar
Ohga, N., Ito, S.: Information-geometric Legendre duality in stochastic thermodynamics. ArXiv211211008 Cond-Mat (2021) arxiv:2112.11008 [cond-mat]
Stam, A.J.: Some inequalities satisfied by the quantities of information of Fisher and Shannon. Inf. Control 2(2), 101–112 (1959). https://doi.org/10.1016/S0019-9958(59)90348-1
Article MathSciNet Google Scholar
Plastino, A.R., Casas, M., Plastino, A.: Fisher’s information, Kullback’s measure, and H-theorems. Phys. Lett. A 246(6), 498–504 (1998). https://doi.org/10.1016/S0375-9601(98)00567-2
Article Google Scholar
Wibisono, A., Jog, V., Loh, P.-L.: Information and estimation in Fokker-Planck channels. In: 2017 IEEE Int. Symp. Inf. Theory ISIT, pp. 2673–2677 (2017). https://doi.org/10.1109/ISIT.2017.8007014
Fisher, R.A., Russell, E.J.: On the mathematical foundations of theoretical statistics. Philos. Trans. R. Soc. Lond. Ser. Contain. Pap. Math. Phys. Character 222(594–604), 309–368 (1922). https://doi.org/10.1098/rsta.1922.0009
Article Google Scholar
Rao, B.R.: On an analogue of Cramér-Rao’s inequality. Scand. Actuar. J. 1958(1–2), 57–67 (1958). https://doi.org/10.1080/03461238.1958.10405982
Article Google Scholar
Papaioannou, T., Ferentinos, K.: On Two Forms of Fisher’s Measure of Information. Commun. Stat. - Theory Methods 34(7), 1461–1470 (2005). https://doi.org/10.1081/STA-200063386
Article MathSciNet Google Scholar
Kharazmi, O., Asadi, M.: On the time-dependent Fisher information of a density function. Braz. J. Probab. Stat. 32(4), 795–814 (2018). https://doi.org/10.1214/17-BJPS366
Article MathSciNet Google Scholar
Johnson, O.: Information Theory and the Central Limit Theorem. World Scientific (2004)
Yamano, T.: De Bruijn-type identity for systems with flux. Eur. Phys. J. B 86(8), 363 (2013). https://doi.org/10.1140/epjb/e2013-40634-9
Article Google Scholar
Gross, L.: Logarithmic Sobolev Inequalities. Am. J. Math. 97(4), 1061–1083 (1975). https://doi.org/10.2307/2373688
Article MathSciNet Google Scholar
Gross, L.: Hypercontractivity and logarithmic Sobolev inequalities for the Clifford-Dirichlet form. Duke Math. J. 42(3), 383–396 (1975). https://doi.org/10.1215/S0012-7094-75-04237-4
Article MathSciNet Google Scholar
Otto, F.: The Geometry of Dissipative Evolution Equations: The Porous Medium Equation. Commun. Partial Differ. Equ. 26(1–2), 101–174 (2001). https://doi.org/10.1081/PDE-100002243
Article MathSciNet Google Scholar
Villani, C.: Topics in Optimal Transportation. American Mathematical Soc, New York (2003)
Book Google Scholar
Amari, S.-I: Natural Gradient Works Efficiently in Learning. Neural Comput. 10(2), 251–276 (1998). https://doi.org/10.1162/089976698300017746
Beck, A., Teboulle, M.: Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett. 31(3), 167–175 (2003). https://doi.org/10.1016/S0167-6377(02)00231-6
Article MathSciNet Google Scholar
Raskutti, G., Mukherjee, S.: The Information Geometry of Mirror Descent. IEEE Trans. Inf. Theory 61(3), 1451–1457 (2015). https://doi.org/10.1109/TIT.2015.2388583
Article MathSciNet Google Scholar
Ollivier, Y., Arnold, L., Auger, A., Hansen, N.: Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles. J. Mach. Learn. Res. 18(18), 1–65 (2017)
MathSciNet Google Scholar
Hino, H., Akaho, S., Murata, N.: Geometry of EM and related iterative algorithms. Info. Geo. (2022). https://doi.org/10.1007/s41884-022-00080-y
Article Google Scholar
Pistone, G.: Information Geometry of Smooth Densities on the Gaussian Space: Poincaré Inequalities. In: Nielsen, F. (ed.) Progress in Information Geometry: Theory and Applications. Signals and Communication Technology, pp. 1–17. Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-65459-7_1
Grady, L.J., Polimeni, J.R.: Discrete Calculus: Applied Analysis on Graphs for Computational Science. Springer Science & Business Media, New York (2010)
Book Google Scholar
Musielak, J.: Orlicz Spaces and Modular Spaces. Springer, New York (1983)
Book Google Scholar
Mielke, A., Peletier, M.A., Renger, D.R.M.: On the Relation between Gradient Flows and the Large-Deviation Principle, with Applications to Markov Chains and Diffusion. Potential Anal 41(4), 1293–1327 (2014). https://doi.org/10.1007/s11118-014-9418-5
Article MathSciNet Google Scholar
Mielke, A., Patterson, R.I.A., Peletier, M.A., Michiel Renger, D.R.: Non-equilibrium Thermodynamical Principles for Chemical Reactions with Mass-Action Kinetics. SIAM J. Appl. Math. 77(4), 1562–1585 (2017). https://doi.org/10.1137/16M1102240
Article MathSciNet Google Scholar
Renger, D.R.M.: Gradient and GENERIC Systems in the Space of Fluxes. Applied to Reacting Particle Systems. Entropy 20(8), 596 (2018). https://doi.org/10.3390/e20080596
Article Google Scholar
Kaiser, M., Jack, R.L., Zimmer, J.: Canonical Structure and Orthogonality of Forces and Currents in Irreversible Markov Chains. J. Stat. Phys. 170(6), 1019–1050 (2018). https://doi.org/10.1007/s10955-018-1986-0
Article MathSciNet Google Scholar
Patterson, R.I.A., Renger, D.R.M., Sharma, U.: Variational structures beyond gradient flows: A macroscopic fluctuation-theory perspective. ArXiv210314384 Math-Ph (2021) arxiv:2103.14384 [math-ph]
Peletier, M.A., Rossi, R., Savaré, G., Tse, O.: Jump processes as generalized gradient flows. Calc. Var. 61(1), 33 (2022). https://doi.org/10.1007/s00526-021-02130-2
Article MathSciNet Google Scholar
Renger, D.R.M., Zimmer, J.: Orthogonality of fluxes in general nonlinear reaction networks. Discrete Contin. Dyn. Syst. - S 14(1), 205 (2021). https://doi.org/10.3934/dcdss.2020346
Article MathSciNet Google Scholar
Peletier, M.A., Schlichting, A.: Cosh gradient systems and tilting. Nonlinear Anal. 231, 113094 (2023). https://doi.org/10.1016/j.na.2022.113094
Article MathSciNet Google Scholar
Kobayashi, T.J., Loutchko, D., Kamimura, A., Sughiyama, Y.: Hessian geometry of nonequilibrium chemical reaction networks and entropy production decompositions. Phys. Rev. Research 4(3), 033208 (2022). https://doi.org/10.1103/PhysRevResearch.4.033208
Article Google Scholar
Kobayashi, T.J., Loutchko, D., Kamimura, A., Sughiyama, Y.: Kinetic derivation of the Hessian geometric structure in chemical reaction networks. Phys. Rev. Research 4(3), 033066 (2022). https://doi.org/10.1103/PhysRevResearch.4.033066
Article Google Scholar
Sughiyama, Y., Kamimura, A., Loutchko, D., Kobayashi, T.J.: Chemical thermodynamics for growing systems. Phys. Rev. Research 4(3), 033191 (2022). https://doi.org/10.1103/PhysRevResearch.4.033191
Article Google Scholar
Godsil, C., Royle, G.F.: Algebraic Graph Theory. Springer Science & Business Media, New York (2013)
Google Scholar
Bretto, A.: Hypergraph Theory: An Introduction. Springer Science & Business Media, New York (2013)
Book Google Scholar
Meyn, S., Tweedie, R.L.: Markov Chains and Stochastic Stability. Cambridge University Press, Cambridge (2009)
Book Google Scholar
Schnakenberg, J.: Network theory of microscopic and macroscopic behavior of master equation systems. Rev. Mod. Phys. 48(4), 571–585 (1976). https://doi.org/10.1103/RevModPhys.48.571
Article MathSciNet Google Scholar
Craciun, G.: Polynomial Dynamical Systems, Reaction Networks, and Toric Differential Inclusions. arXiv (2019). https://doi.org/10.48550/arXiv.1901.02544
Biggs, N.: Algebraic Potential Theory on Graphs. Bull. Lond. Math. Soc. 29(6), 641–682 (1997). https://doi.org/10.1112/S0024609397003305
Article MathSciNet Google Scholar
Chung, F.R.K., Graham, F.C.: Spectral Graph Theory. American Mathematical Soc (1997)
Dörfler, F., Simpson-Porco, J.W., Bullo, F.: Electrical Networks and Algebraic Graph Theory: Models, Properties, and Applications. Proc. IEEE 106(5), 977–1005 (2018). https://doi.org/10.1109/JPROC.2018.2821924
Article Google Scholar
Saber, R.O., Murray, R.M.: Consensus protocols for networks of dynamic agents. In: Proc. 2003 Am. Control Conf. 2003, vol. 2, pp. 951–956 (2003). https://doi.org/10.1109/ACC.2003.1239709
Veerman, J.J.P., Lyons, R.: A Primer on Laplacian Dynamics in Directed Graphs. arXiv (2020). https://doi.org/10.48550/arXiv.2002.02605
Qian, H., Ge, H.: Stochastic Chemical Reaction Systems in Biology. Springer Nature, New York (2021)
Book Google Scholar
Keener, J., Sneyd, J.: Mathematical Physiology: I: Cellular Physiology. Springer Science & Business Media, New York (2008)
Google Scholar
Sottile, F.: Toric ideals, real toric varieties, and the algebraic moment map. arXiv:math/0212044 (2008) arxiv:math/0212044
Cox, D.A., Little, J.B., Schenck, H.K.: Toric Varieties. American Mathematical Soc, New York (2011)
Book Google Scholar
Craciun, G., Dickenstein, A., Shiu, A., Sturmfels, B.: Toric dynamical systems. J. Symb. Comput. 44(11), 1551–1565 (2009). https://doi.org/10.1016/j.jsc.2008.08.006
Article MathSciNet Google Scholar
Rapallo, F.: Toric statistical models: Parametric and binomial representations. AISM 59(4), 727–740 (2007). https://doi.org/10.1007/s10463-006-0079-z
Article MathSciNet Google Scholar
Müller, S., Regensburger, G.: Generalized Mass Action Systems: Complex Balancing Equilibria and Sign Vectors of the Stoichiometric and Kinetic-Order Subspaces. SIAM J. Appl. Math. 72(6), 1926–1947 (2012). https://doi.org/10.1137/110847056
Article MathSciNet Google Scholar
Noor, E., Flamholz, A., Liebermeister, W., Bar-Even, A., Milo, R.: A note on the kinetics of enzyme action: A decomposition that highlights thermodynamic effects. FEBS Lett. 587(17), 2772–2777 (2013). https://doi.org/10.1016/j.febslet.2013.07.028
Article Google Scholar
Yoshimura, K., Kolchinsky, A., Dechant, A., Ito, S.: Housekeeping and excess entropy production for general nonlinear dynamics. Phys. Rev. Res. 5(1), 013017 (2023). https://doi.org/10.1103/PhysRevResearch.5.013017
Article Google Scholar
Lim, L.-H.: Hodge Laplacians on Graphs. SIAM Rev. 62(3), 685–715 (2020). https://doi.org/10.1137/18M1223101
Article MathSciNet Google Scholar
Sunada, T.: Topological Crystallography: With a View Towards Discrete Geometric Analysis. Springer Science & Business Media, New York (2012)
Google Scholar
Desbrun, M., Hirani, A.N., Leok, M., Marsden, J.E.: Discrete Exterior Calculus. arXiv (2005). https://doi.org/10.48550/arXiv.math/0508341
Hirani, A.N.: Discrete Exterior Calculus. PhD thesis, California Institute of Technology (2003). https://doi.org/10.7907/ZHY8-V329
Knauer, U.: Algebraic Graph Theory: Morphisms. Monoids and Matrices. Walter de Gruyter, New York (2011)
Book Google Scholar
Chen, W.-K.: Applied Graph Theory. Elsevier, Amsterdam (2012)
Google Scholar
Craciun, G., Nazarov, F., Pantea, C.: Persistence and Permanence of Mass-Action and Power-Law Dynamical Systems. SIAM J. Appl. Math. 73(1), 305–329 (2013). https://doi.org/10.1137/100812355
Article MathSciNet Google Scholar
Craciun, G.: Toric Differential Inclusions and a Proof of the Global Attractor Conjecture (2016). https://doi.org/10.48550/arXiv.1501.02860
Article Google Scholar
Bregman, L.M.: The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3), 200–217 (1967). https://doi.org/10.1016/0041-5553(67)90040-7
Article MathSciNet Google Scholar
Rockafellar, R.T.: Convex Analysis. Princeton University Press (1997)
Mitroi, F.-C., Niculescu, C.P.: An Extension of Young’s Inequality. Abstr. Appl. Anal. 2011, 162049 (2011). https://doi.org/10.1155/2011/162049
Article MathSciNet Google Scholar
Nielsen, F.: On Geodesic Triangles with Right Angles in a Dually Flat Space. In: Nielsen, F. (ed.) Progress in Information Geometry: Theory and Applications. Signals and Communication Technology, pp. 153–190. Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-65459-7_7
Callen, H.B., Callen, H.B.: Thermodynamics and an Introduction to Thermostatistics. Wiley, New York (1985)
Google Scholar
Hiriart-Urruty, J.-B., Lemarechal, C.: Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods. Springer Science & Business Media, New York (1996)
Google Scholar
Krasnosel’skij, M.A., Rutickij, J.B.: Convex Functions and Orlicz Spaces. Hindustan Publ, New York (1962)
Google Scholar
Lods, B., Pistone, G.: Information Geometry Formalism for the Spatially Homogeneous Boltzmann Equation. Entropy 17(6), 4323–4363 (2015). https://doi.org/10.3390/e17064323
Article MathSciNet Google Scholar
Pistone, G.: Information Geometry of the Gaussian Space. In: Ay, N., Gibilisco, P., Matúš, F. (eds.) Inf. Geom. Its Appl. Springer Proceedings in Mathematics & Statistics, pp. 119–155. Springer International Publishing, Cham (2018). https://doi.org/10.1007/978-3-319-97798-0_5
Ambrosio, L., Gigli, N., Savare, G.: Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Springer Science & Business Media, New York (2006)
Google Scholar
Liero, M., Mielke, A., Savaré, G.: Optimal Transport in Competition with Reaction: The Hellinger-Kantorovich Distance and Geodesic Curves. SIAM J. Math. Anal. 48(4), 2869–2911 (2016). https://doi.org/10.1137/15M1041420
Article MathSciNet Google Scholar
Onsager, L.: Reciprocal Relations in Irreversible Processes. I. Phys. Rev. 37(4), 405–426 (1931). https://doi.org/10.1103/PhysRev.37.405
Article Google Scholar
Onsager, L.: Reciprocal Relations in Irreversible Processes. II. Phys. Rev. 38(12), 2265–2279 (1931). https://doi.org/10.1103/PhysRev.38.2265
Article Google Scholar
Machlup, S., Onsager, L.: Fluctuations and Irreversible Process. II. Systems with Kinetic Energy. Phys. Rev. 91(6), 1512–1515 (1953). https://doi.org/10.1103/PhysRev.91.1512
Article MathSciNet Google Scholar
Lisini, S.: Nonlinear diffusion equations with variable coefficients as gradient flows in Wasserstein spaces. ESAIM: COCV 15(3), 712–740 (2009). https://doi.org/10.1051/cocv:2008044
Article MathSciNet Google Scholar
Peletier, M.A.: Variational Modelling: Energies, Gradient Flows, and Large Deviations. arXiv (2014). https://doi.org/10.48550/arXiv.1402.1990
Truesdell, C.A., Truesdell, C., Noll, W., Antman, S., Noll, W.: The Non-Linear Field Theories of Mechanics. Springer Science & Business Media (2004)
Bergmann, P.G., Lebowitz, J.L.: New Approach to Nonequilibrium Processes. Phys. Rev. 99(2), 578–587 (1955). https://doi.org/10.1103/PhysRev.99.578
Article MathSciNet Google Scholar
Maes, C.: Local detailed balance. SciPost Phys. Lect. Notes, 032 (2021) https://doi.org/10.21468/SciPostPhysLectNotes.32
Maes, C.: Non-Dissipative Effects in Nonequilibrium Systems. Springer, New York (2017)
Google Scholar
Patterson, R., Renger, M.: Large deviations of reaction fluxes. Math. Phys. Anal. Geom. 22(3), 21 (2019). https://doi.org/10.1007/s11040-019-9318-4. arxiv:1802.02512 [math]
Article Google Scholar
Wegscheider, R.: Über simultane Gleichgewichte und die Beziehungen zwischen Thermodynamik und Reaktionskinetik homogener Systeme. Z. Für Phys. Chem. 39U(1), 257–303 (1902). https://doi.org/10.1515/zpch-1902-3919
Article Google Scholar
Pistone, G., Sempi, C.: An Infinite-Dimensional Geometric Structure on the Space of all the Probability Measures Equivalent to a Given One. Ann. Stat. 23(5), 1543–1561 (1995). https://doi.org/10.1214/aos/1176324311
Article MathSciNet Google Scholar
Pistone, G.: Examples of the Application of Nonparametric Information Geometry to Statistical Physics. Entropy 15(10), 4042–4065 (2013). https://doi.org/10.3390/e15104042
Article MathSciNet Google Scholar
Shiraishi, N.: Optimal Thermodynamic Uncertainty Relation in Markov Jump Processes. J. Stat. Phys. 185(3), 19 (2021). https://doi.org/10.1007/s10955-021-02829-8
Article MathSciNet Google Scholar
Chow, S.-N., Huang, W., Li, Y., Zhou, H.: Fokker-Planck Equations for a Free Energy Functional or Markov Process on a Graph. Arch Rational Mech Anal 203(3), 969–1008 (2012). https://doi.org/10.1007/s00205-011-0471-6
Article MathSciNet Google Scholar
Maas, J.: Gradient flows of the entropy for finite Markov chains. J. Funct. Anal. 261(8), 2250–2292 (2011). https://doi.org/10.1016/j.jfa.2011.06.009
Article MathSciNet Google Scholar
Mielke, A.: Geodesic convexity of the relative entropy in reversible Markov chains. Calc. Var. 48(1), 1–31 (2013). https://doi.org/10.1007/s00526-012-0538-8
Article MathSciNet Google Scholar
Liero, M., Mielke, A.: Gradient structures and geodesic convexity for reaction–diffusion systems. Philos. Trans. R. Soc. Math. Phys. Eng. Sci. 371(2005), 20120346 (2013). https://doi.org/10.1098/rsta.2012.0346
Article MathSciNet Google Scholar
Van Vu, T., Saito, K.: Thermodynamic Unification of Optimal Transport: Thermodynamic Uncertainty Relation, Minimum Dissipation, and Thermodynamic Speed Limits. Phys. Rev. X 13(1), 011013 (2023). https://doi.org/10.1103/PhysRevX.13.011013
Article Google Scholar
Yamano, T.: Phase space gradient of dissipated work and information: A role of relative Fisher information. J. Math. Phys. 54(11), 113301 (2013). https://doi.org/10.1063/1.4828855
Article MathSciNet Google Scholar
Hyvärinen, A.: Estimation of Non-Normalized Statistical Models by Score Matching. J. Mach. Learn. Res. 6, 695–709 (2005)
MathSciNet Google Scholar
Shima, H.: The Geometry of Hessian Structures. World Scientific, Singapore (2007)
Book Google Scholar
Wolsey, L.A., Nemhauser, G.L.: Integer and Combinatorial Optimization. Wiley, New York (1999)
Google Scholar
Davis, H.F.: On Isosceles Orthogonality. Math. Mag. 32(3), 129–131 (1959). https://doi.org/10.2307/3029494
Amari, S.-I.: Neural learning in structured parameter spaces: Natural Riemannian gradient. In: Proc. 9th Int. Conf. Neural Inf. Process. Syst. NIPS’96, pp. 127–133. MIT Press, Cambridge, MA, USA (1996)
Gunasekar, S., Woodworth, B., Srebro, N.: Mirrorless Mirror Descent: A Natural Derivation of Mirror Descent. In: Proc. 24th Int. Conf. Artif. Intell. Stat., pp. 2305–2313. PMLR (2021)
Li, W., Montúfar, G.: Natural gradient via optimal transport. Info. Geo. 1(2), 181–214 (2018). https://doi.org/10.1007/s41884-018-0015-3
Article MathSciNet Google Scholar
Amari, S.-I., Karakida, R., Oizumi, M.: Information geometry connecting Wasserstein distance and Kullback-Leibler divergence via the entropy-relaxed transportation problem. Info. Geo. 1(1), 13–37 (2018). https://doi.org/10.1007/s41884-018-0002-8
Article MathSciNet Google Scholar
Bain, A., Crisan, D.: Fundamentals of Stochastic Filtering. Springer Science & Business Media, New York (2008)
Google Scholar
Brigo, D., Hanzon, B., LeGland, F.: A differential geometric approach to nonlinear filtering: The projection filter. IEEE Trans. Autom. Control 43(2), 247–252 (1998). https://doi.org/10.1109/9.661075
Article MathSciNet Google Scholar
Li, Y., Cheng, Y., Li, X., Wang, H., Hua, X., Qin, Y.: Information geometric approach for nonlinear filtering. In: 2017 36th Chin. Control Conf. CCC, pp. 1211–1216 (2017). https://doi.org/10.23919/ChiCC.2017.8027514
Hill, T.L.: Free Energy Transduction and Biochemical Cycle Kinetics. Courier Corporation, New York (2005)
Google Scholar
Altaner, B., Grosskinsky, S., Herminghaus, S., Katthän, L., Timme, M., Vollmer, J.: Network representations of nonequilibrium steady states: Cycle decompositions, symmetries, and dominant paths. Phys. Rev. E 85(4), 041133 (2012). https://doi.org/10.1103/PhysRevE.85.041133
Article Google Scholar
Polettini, M., Esposito, M.: Irreversible thermodynamics of open chemical networks. I. Emergent cycles and broken conservation laws. J. Chem. Phys. 141(2), 024117 (2014). https://doi.org/10.1063/1.4886396
Article Google Scholar
Strang, A.: Applications of the Helmholtz-Hodge Decomposition to Networks and Random Processes. PhD thesis, Ann Arbor, United States (August 2020)
Oono, Y., Paniconi, M.: Steady State Thermodynamics. Prog. Theor. Phys. Suppl. 130, 29–44 (1998). https://doi.org/10.1143/PTPS.130.29
Article MathSciNet Google Scholar
Komatsu, T.S., Nakagawa, N., Sasa, S.-I., Tasaki, H.: Steady-State Thermodynamics for Heat Conduction: Microscopic Derivation. Phys. Rev. Lett. 100(23), 230602 (2008). https://doi.org/10.1103/PhysRevLett.100.230602
Article Google Scholar
Maes, C., Netočný, K.: A Nonequilibrium Extension of the Clausius Heat Theorem. J. Stat. Phys. 154(1), 188–203 (2014). https://doi.org/10.1007/s10955-013-0822-9
Article MathSciNet Google Scholar
Dechant, A., Sasa, S.-I., Ito, S.: Geometric decomposition of entropy production in out-of-equilibrium systems. ArXiv210912817 Cond-Mat (2022) arxiv:2109.12817 [cond-mat]
Guo, D., Shamai, S., Verdu, S.: Mutual information and minimum mean-square error in Gaussian channels. IEEE Trans. Inf. Theory 51(4), 1261–1282 (2005). https://doi.org/10.1109/TIT.2005.844072
Article MathSciNet Google Scholar
Mayer-Wolf, E., Zakai, M.: On a formula relating the Shannon information to the fisher information for the filtering problem. In: Korezlioglu, H., Mazziotto, G., Szpirglas, J. (eds.) Filter. Control Random Process. Lecture Notes in Control and Information Sciences, pp. 164–171. Springer, Berlin, Heidelberg (1984). https://doi.org/10.1007/BFb0006569
Mayer-Wolf, E., Zakai, M.: Some relations between mutual information and estimation error in Wiener space. Ann. Appl. Probab. 17(3), 1102–1116 (2007). https://doi.org/10.1214/105051607000000131
Article MathSciNet Google Scholar
Amari, S.-I., Park, H., Ozeki, T.: Singularities Affect Dynamics of Learning in Neuromanifolds. Neural Comput. 18(5), 1007–1065 (2006). https://doi.org/10.1162/neco.2006.18.5.1007
Article MathSciNet Google Scholar
Watanabe, S.: Algebraic Geometry and Statistical Learning Theory. Cambridge University Press, Cambridge (2009)
Book Google Scholar
Murashita, Y., Funo, K., Ueda, M.: Nonequilibrium equalities in absolutely irreversible processes. Phys. Rev. E 90(4), 042110 (2014). https://doi.org/10.1103/PhysRevE.90.042110
Article Google Scholar
Öttinger, H.C.: Beyond Equilibrium Thermodynamics. Wiley, New York (2005)
Book Google Scholar
Wang, Y., Li, W.: Accelerated Information Gradient Flow. J. Sci. Comput. 90(1), 11 (2021). https://doi.org/10.1007/s10915-021-01709-3
Article MathSciNet Google Scholar
Li, W., Liu, S., Osher, S.: Controlling conservation laws I: Entropy–entropy flux. J. Comput. Phys. 480(C), (2023) https://doi.org/10.1016/j.jcp.2023.112019
Gao, Y., Li, W., Liu, J.-G.: Master Equations for Finite State Mean Field Games with Nonlinear Activations. arXiv (2022). https://doi.org/10.48550/arXiv.2212.05675

Acknowledgements

This research is supported by JST (JPMJCR2011, JPMJCR1927) and JSPS (19H05799). The authors thank Hideyuki Miyahara and Tomonari Sei for their critical comments. We also express our sincere thanks to the anonymous reviewer who carefully read our manuscript and provided valuable comments.

Funding

Open access funding provided by The University of Tokyo.

Author information

Authors and Affiliations

Institute of Industrial Science, The University of Tokyo, 4-6-1, Komaba, Meguro-ku, Tokyo, 153-8505, Japan
Tetsuya J. Kobayashi, Dimitri Loutchko, Atsushi Kamimura & Yuki Sughiyama
Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
Tetsuya J. Kobayashi & Shuhei A. Horiguchi
Universal Biology Institute, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan
Tetsuya J. Kobayashi

Corresponding author

Correspondence to Tetsuya J. Kobayashi.

Ethics declarations

Conflict of interest

The authors have no conflict of interest, financial or otherwise.

Additional information

Communicated by Noboru Murata.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Symbols, Notations, and Abbreviations

Table 1 List of symbols and notations related to graph, hypergraph, and homological algebra

Table 2 List of symbols and notations related to the dynamics on graphs and hypergraphs

Table 3 List of symbols and notations related to the information-geometric structure

Table 4 List of abbreviations

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kobayashi, T.J., Loutchko, D., Kamimura, A. et al. Information geometry of dynamics on graphs and hypergraphs. Info. Geo. 7, 97–166 (2024). https://doi.org/10.1007/s41884-023-00125-w

Received: 01 December 2022
Revised: 22 September 2023
Accepted: 05 November 2023
Published: 22 December 2023
Issue Date: June 2024
DOI: https://doi.org/10.1007/s41884-023-00125-w
