1 Introduction

Information geometry is finding and establishing a firm position as a geometric language in various scientific disciplines [1, 2]. Information geometry enables us to gain an intuitive understanding of the structures behind complicated problems of inference and estimation, for which Euclidean or Riemannian geometry may not be sufficient. In addition, it can provide ways to devise new solutions and approaches for the problems [1]. While information geometry was originally developed for statistics, its applicability now reaches far beyond statistical problems. Whenever the notions of probability, information, or positive density appear in a problem, it is natural to consider its information–geometric structure.

1.1 Information geometry of dynamics

Dynamical systems and phenomena can be naturally analyzed with information geometric methods, as conventionally one considers the dynamics of probability distributions [3,4,5], e.g., via the Fokker–Planck equation (FPE) and the master equation, or those of positive densities, e.g., via population dynamics, epidemic models, diffusion dynamics on networks, and chemical reaction dynamics [6,7,8]. Although the application of information geometry to dynamical systems has been attempted almost since its birth, information geometry for dynamics is much less organized and principled than that for static problems in statistics, optimization, and others [1]. In connection with statistical inference, information geometry was employed by Amari and others to investigate Gaussian time series and autoregressive moving average (ARMA) models by representing their power spectra as parametric manifolds [9,10,11]. This idea was also used to investigate linear systems [12]. Markov jump processes on finite statesFootnote 1 were investigated information-geometrically by considering the hierarchical structure of joint or conditional probabilities at different time points, e.g., \(\mathbb {P}_{\varvec{\theta }}(x_{1},x_{2},\ldots , x_{t})\) [13], or by introducing exponential families of Markov kernels (transition matrices), \(\mathbb {T}_{\varvec{\theta }}(x|x')\), via exponential tilting of the kernels [14,15,16,17,18,19,20]. Furthermore, information geometry was applied to studies of random walks, nonlinear diffusion equations of porous media, and networks [21,22,23]. In relation to mechanics, integrable systems were associated with the dualistic gradient flow of information geometry in the seminal works [24, 25], and other connections of information geometry with Lagrangian or Hamiltonian mechanics have been pursued [26,27,28].

1.2 Information measures for dynamics

Concurrently with and almost independently of these attempts within the community of information geometry, information measures relevant to information geometry have been employed in various problems of dynamical systems and stochastic processes in information theory [29], filtering theory [30, 31], control theory [32,33,34], and non-equilibrium physics and chemistry [35,36,37]. The Kullback–Leibler (KL) divergence [38] for probabilities and positive densities was shown to be a Lyapunov function of Markov jump processes (MJP) [5], FPE [3, 39], deterministic chemical reaction networks (CRN) [40, 41], and other dynamical systems [42, 43], the origin of which can be dated back to Gibbs’ H-theorem [44]. Among those topics, since the establishment of chemical thermodynamics by Gibbs [44] and chemical kinetics by Guldberg and Waage [45], CRN has played the role of a seedbed for cultivating the theory between dynamics and divergences owing to its close connection with thermodynamics [46,47,48]. More recently, it was also clarified that the divergences and information geometry are fundamental in stochastic thermodynamics [49,50,51,52,53].

In addition to the KL divergence, the Fisher-information-like quantity

$$\begin{aligned} \mathbb {I}_{F}[p] :=\int p(\varvec{r})(\nabla _{\varvec{r}} \ln p(\varvec{r}))^{2}\textrm{d}\varvec{r} \in \mathbb {R}_{\ge 0} \end{aligned}$$
(1)

was also revealed to play an important role in characterizing dynamics for densities on a continuous space, e.g., Gaussian convolution, diffusion processes, and FPE [54,55,56]. Various governing equations in physics were claimed to be derived in a unified way from this quantity [36]. The quantity \(\mathbb {I}_{F}\) looks like the Fisher information [57] but is different from the conventional Fisher information matrix [58,59,60] because the derivative \(\nabla _{\varvec{r}} \ln p(\varvec{r})\) is not for the parameters but for the base space variable of \(p(\varvec{r})\).Footnote 2 Because \(\mathbb {I}_{F}\) is a scalar, we follow [59] and call it Fisher information number. The Fisher information number \(\mathbb {I}_{F}\) is related to the KL divergence in additive Gaussian channels [54] and other systems [56, 62], which is known as the De Bruijn identity [54]. In addition, the logarithmic Sobolev inequality also provides a relation between the Fisher information number and the KL divergence (or Shannon information) [63, 64]. These results have recently been associated with the formal Riemannian geometric structure induced by the \(L^{2}\)-Wasserstein geometry [65, 66].
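As a concrete check of Eq. 1, the following minimal sketch evaluates the Fisher information number numerically for a one-dimensional zero-mean Gaussian, for which the integral equals \(1/\sigma ^{2}\). The function names, the grid, and the finite-difference and trapezoidal schemes are illustrative assumptions and are not part of the original text.

```python
import math

def gaussian_pdf(r, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return math.exp(-0.5 * (r / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def fisher_information_number(pdf, lo, hi, n=20001):
    """Approximate Eq. 1 in one dimension on the grid [lo, hi].

    The score d/dr ln p(r) is taken by central differences and the integral
    by the trapezoidal rule; both are illustrative numerical choices.
    """
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        r = lo + i * h
        p = pdf(r)
        if p <= 0.0:
            continue  # skip points where ln p is undefined
        score = (math.log(pdf(r + h)) - math.log(pdf(r - h))) / (2.0 * h)
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoidal end-point weights
        total += weight * p * score ** 2 * h
    return total

sigma = 2.0
i_f = fisher_information_number(lambda r: gaussian_pdf(r, sigma), -20.0, 20.0)
print(i_f, 1.0 / sigma ** 2)  # numerical value next to the closed form 1/sigma^2
```

The derivative here is taken with respect to the base-space variable \(r\), not a model parameter, which is exactly the distinction between \(\mathbb {I}_{F}\) and the Fisher information matrix.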

1.3 Information geometry and dynamics in machine learning

On top of these traditional trends, information geometry is now playing a pivotal role in machine learning for designing and evaluating online optimization algorithms (dynamics) in the space of model parameters such as natural gradient [67] and mirror descent [68, 69] as well as evolutionary computation (information-geometric optimization) [70]. Geometric interpretation allows us to understand the behaviors and efficiency of algorithms and their dynamics more intuitively in a principled manner [69,70,71].

1.4 Aim and contributions of this work

Despite the wide applicability and the long history of information geometry, we still lack a solid theoretical framework to unify these outcomes that spread across different fields from the viewpoint of information geometry. In this work, we introduce a new information geometric structure for the dynamics of probability and positive densities. In this structure, we consider not only the single dually flat structure built on the space of densities as in [24, 25] but also another structure constructed on the space of fluxes. These two structures are linked algebraically and topologically via the continuity equation and the gradient equation as illustrated in Fig. 1.

Under this doubly dual flat structure, we can consider the dynamics of densities as a generalized flow, and various previous results can be unified in this framework. We exclusively consider dynamics of densities on finite-dimensional discrete manifolds, i.e., finite graphs or hypergraphs, because the structure introduced here can be explicitly manifested in this setup and also because we do not need the mathematically elaborated setup for infinite-dimensional information geometry on a smooth manifold [72]. For the case of FPE in a continuous state space, the dually flat structure built on the flux space can be reduced to the formal Riemannian geometric structure of \(L^{2}\) Wasserstein geometry where the convex functions that induce the dually flat structure become quadratic. Our structure generalizes the linear inner product on the tangent and cotangent spaces with the nonlinear Legendre transform, thereby requiring information geometry. By elucidating this information geometric structure, we can easily see that some quantities such as the bilinear product, convex thermodynamic potential functions, the Fisher information matrix, and the Fisher information number are consolidated into one quantity for FPE with the quadratic convex functions (see Sect. 5.3 and Sect. 5.4). Therefore, our structure provides a way to unify the dualistic gradient flow mentioned in Sect. 1.1 and also the information-number related topics in Sect. 1.2.

Fig. 1 Diagrammatic illustration of the doubly dual flat structure established on the vertex affine spaces (left) and the edge spaces (right), which are topologically related via the underlying graph or hypergraph

From the viewpoint of homological algebra, the structure we work on is a modification of the chain and cochain complexes of graphs or hypergraphs, which replaces the usual inner product duality [73] on each pair of chains and cochains with Legendre duality. Moreover, the dually flat space built on the flux space is linked to a finite-dimensional version of Orlicz spaces [74], which have been employed for constructing infinite-dimensional information geometry [72]. From the nice properties of the doubly dual flat structures, we can obtain information-geometric extensions of the Helmholtz-Hodge-Kodaira (HHK) decomposition (Theorem 1), the Otto calculus (Theorem 2), and its induction to cycle spaces (Theorem 3).

Our construction of an information geometry for dynamics is heavily based on the idea of using Legendre duality for the force and flux relation, proposed in recent work on large deviation theory and macroscopic fluctuation theory for MJP and CRN led by A. Mielke, R.I.A. Patterson, M.A. Peletier, D.R.M. Renger, J. Zimmer, and othersFootnote 3 [75,76,77,78,79,80,81,82]. We clarified its information-geometric aspects in the context of CRN and thermodynamics in our previous work [83]. We also concurrently elucidated the intimate link between equilibrium chemical thermodynamics and information geometry on the density state space [48, 84, 85]. In light of those, the contribution of this work is three-fold. First, we integrate these results in terms of information geometry, which clarifies the underlying geometric nature of the problem, provides transparent interpretations for known results, and leads to new information geometric results and insights (Theorem 1–Theorem 3). Second, this structure substantially extends the applicability of information geometry to a wide variety of dynamical problems. Lastly, the structure links information geometry to algebraic graph theory, discrete calculus, and homological algebra, a link that has not yet been fully appreciated but provides a versatile way to consider the topology of the base manifold in information geometry.

1.5 Organization of this paper

This work is organized as follows: In Sect. 2, we introduce a range of models of dynamics on graphs and hypergraphs. In Sect. 3, we outline the homological algebra of graphs and hypergraphs. In Sect. 4, we abstractly introduce the doubly dual flat structures on the density and flux spaces and define the generalized flow associated with these structures. In Sect. 5, we clarify that the introduced structures include a wide class of dynamics on graphs and hypergraphs. In Sect. 6 and Sect. 7, we further define information-geometric objects and quantities, which naturally appear from this setup and play an integral role in the subsequent analysis of dynamics. In Sect. 8 and Sect. 9, we derive several results for equilibrium and nonequilibrium flows, respectively. Finally, we provide a summary and prospects of our work in Sect. 10. The notations and symbols are listed in the appendix.

2 Classes of models for density dynamics on graphs and hypergraphs

In this work, we focus on linear and nonlinear dynamics defined on graphs [86] and hypergraphs [87].

The linear dynamics of densities on graphs (LDG) includes Markov jump processes (MJP) [88], monomolecular chemical reaction networks [89], and others [86]. We consider an extension of LDG to hypergraphs and nonlinear dynamics, common instances of which are chemical reaction networks (CRN) with the law of mass action (LMA) kinetics [8] and polynomial dynamical systems (PDS) [90]. Because the extension we deal with in this work is a subclass of nonlinear dynamical systems on hypergraphs, we use CRN to designate this subclass.

In the following subsections, LDG and CRN are introduced using the language of algebraic graph theory [86, 91]. Then, we also give a brief and formal introduction of the Fokker-Planck equation (FPE) [3], a linear dynamics of probability densities defined in Euclidean space. We use the FPE throughout this paper only to contrast our results with the previous ones obtained for the FPE.

2.1 Reversible linear dynamics of densities on graphs

Definition 1

(Edge-weighted finite graph \(\mathbb {G}_{\varvec{k}^{\pm }}\)) A finite graph \(\mathbb {G}:=(\{\mathbb {v}_{i}\},\{\mathbb {e}_{e}\},\mathbb {B})\) consists of \(N_{\mathbb {v}} \in \mathbb {Z}_{> 0}\) vertices, \(\{\mathbb {v}_{i}\}_{i\in [1,N_{\mathbb {v}}]}\), and \(N_{\mathbb {e}} \in \mathbb {Z}_{> 0}\) oriented edges, \(\{\mathbb {e}_{e}\}_{e\in [1,N_{\mathbb {e}}]}\), each of which connects two different verticesFootnote 4 (Fig. 2a). The incidence relation is represented by the incidence matrix \(\mathbb {B}\in \{0,\pm 1\}^{N_{\mathbb {v}}\times N_{\mathbb {e}}}\) where, for \(\mathbb {B}=(b_{i,e})\),

$$\begin{aligned} b_{i,e}&:=+1{} & {} \text{ if } \mathbb {v}_{i} \text{ is } \text{ the } \text{ tail } \text{ of } \text{ edge } \mathbb {e}_{e}, \\ b_{i,e}&:=-1{} & {} \text{ if } \mathbb {v}_{i} \text{ is } \text{ the } \text{ head } \text{ of } \text{ edge } \mathbb {e}_{e}, \\ b_{i,e}&:=0{} & {} \text{ otherwise }. \end{aligned}$$

An edge-weighted finite graph \(\mathbb {G}_{\varvec{k}^{\pm }}:=(\{\mathbb {v}_{i}\},\{\mathbb {e}_{e}\},\mathbb {B}, \{k_{e}^{\pm }\})\) has two positive weighting parameters \(k_{e}^{\pm }=(k_{e}^{+},k_{e}^{-})\in \mathbb {R}_{> 0}^{2}\) for each edge \(\mathbb {e}_{e}\). The parameters \(k_{e}^{+}\) and \(k_{e}^{-}\) are called the forward and reverse rates or weights of edge \(\mathbb {e}_{e}\), respectively.

Fig. 2 Schematic diagrams of a reversible finite graph \(\mathbb {G}\) (a) and a CRN-hypergraph \(\mathbb {H}\) (b). Each pair of thick and thin arrows represents the pair of forward and reverse orientations of the corresponding edge. The CRN-hypergraph \(\mathbb {H}\) in (b) corresponds to the simplified Brusselator reaction. The hypervertex \(\hat{\mathbb {v}}_{1}\) contains no vertices

A reversible linear dynamics on a graph (rLDG) is defined on the edge-weighted finite graph \(\mathbb {G}_{\varvec{k}^{\pm }}\):

Definition 2

(Reversible linear dynamics of density on graph \(\mathbb {G}_{\varvec{k}^{\pm }}\)) The reversible linear dynamics of non-negative density \(\varvec{x}(t)=(x_{1}(t), \cdots , x_{N_{\mathbb {v}}}(t))^{T}\in \mathbb {R}_{\ge 0}^{N_{\mathbb {v}}}\) on \(\mathbb {G}_{\varvec{k}^{\pm }}\) is defined by the continuity equation

$$\begin{aligned} \dot{\varvec{x}}&=-\mathbb {B}\varvec{j}(\varvec{x})=-\mathbb {B}[\varvec{j}^{+}(\varvec{x})-\varvec{j}^{-}(\varvec{x})], \end{aligned}$$
(2)

and linear forward and reverse one-way fluxes \(\varvec{j}^{\pm }(\varvec{x})=(j^{\pm }_{1}(\varvec{x}),\cdots , j^{\pm }_{N_{\mathbb {e}}}(\varvec{x}))^{T}\in \mathbb {R}^{N_{\mathbb {e}}}_{\ge 0}\) with the following specific functional formFootnote 5:

$$\begin{aligned} \varvec{j}^{\pm }(\varvec{x})&=\varvec{k}^{\pm }\circ (\mathbb {B}^{\pm })^{T}\varvec{x} , \end{aligned}$$
(3)

where \(\varvec{j}(\varvec{x}):=\varvec{j}^{+}(\varvec{x})-\varvec{j}^{-}(\varvec{x})\) is the total flux, the symbol \(\circ \) denotes the component-wise product of two vectors,Footnote 6 and \(\mathbb {B}^{+}\) and \(\mathbb {B}^{-}\) are the head and tail incidence matrices defined respectively as \(\mathbb {B}^{+}:=\max [\mathbb {B},0]\) and \(\mathbb {B}^{-}:=\max [-\mathbb {B},0]\). The incidence matrix \(\mathbb {B}\) in Eq. 2 is often regarded as the discrete divergence operator on a graph [73] and denoted also by \(\textrm{div}_{\mathbb {B}}=\mathbb {B}\) to emphasize this interpretation in this work.Footnote 7

Reversible Markov jump processes (rMJP) are a representative class of the rLDG describing random jumps of noninteracting particles on \(\mathbb {G}_{\varvec{k}^{\pm }}\).Footnote 8 The weighting parameter \(k_{e}^{+}\) is interpreted as the forward jump rate from the tail of the oriented edge \(\mathbb {e}_{e}\) to its head, whereas \(k_{e}^{-}\) is the reverse jump rate from the head to the tail of \(\mathbb {e}_{e}\).Footnote 9 For infinitely many such particles, we consider \(p_{i}(t)\in [0,1]\), the fraction of particles on vertex \(\mathbb {v}_{i}\) at time t, which is a non-negative density on vertices. Then, the forward and reverse one-way fluxes on the eth edge defined by Eq. 3 are represented as

$$\begin{aligned} j^{+}_{e}(\varvec{p})&=k_{e}^{+}p_{\mathbb {v}^{+}_{e}}\in \mathbb {R}_{\ge 0},&j^{-}_{e}(\varvec{p})&=k_{e}^{-}p_{\mathbb {v}^{-}_{e}}\in \mathbb {R}_{\ge 0}, \end{aligned}$$
(4)

where \(\mathbb {v}^{+}_{e}\) and \(\mathbb {v}^{-}_{e}\) are the head and tail vertices of edge \(\mathbb {e}_{e}\)Footnote 10. The linearity of \(j^{\pm }_{e}(\varvec{p})\) with respect to \(\varvec{p}\) comes from the independence of particles on the graph. Then, the continuity equation (Eq. 2) with the state vector \(\varvec{p}(t):=(p_{1}(t), \cdots , p_{N_{\mathbb {v}}}(t))^{T}\in \mathbb {R}^{N_{\mathbb {v}}}_{\ge 0}\) is reduced to the master equation: \(\dot{\varvec{p}}=-\mathbb {B}\varvec{j}(\varvec{p})\).
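The continuity equation (Eq. 2) and the linear one-way fluxes (Eq. 3) can be sketched for a small example as follows. The three-vertex cycle graph, the rate values, and all function names are hypothetical choices made for illustration; the only convention taken from the text is that \(b_{i,e}=+1\) marks the vertex that the forward flux of edge \(\mathbb {e}_{e}\) leaves, so that \(\mathbb {B}^{+}=\max [\mathbb {B},0]\) selects it.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Hypothetical 3-vertex cycle; rows are vertices, columns are edges.
# b = +1: vertex the forward flux leaves, b = -1: vertex it enters.
B = [[+1, 0, -1],
     [-1, +1, 0],
     [0, -1, +1]]
B_plus = [[max(b, 0) for b in row] for row in B]    # B^+ = max[B, 0]
B_minus = [[max(-b, 0) for b in row] for row in B]  # B^- = max[-B, 0]
k_plus = [2.0, 1.0, 0.5]   # forward rates (illustrative values)
k_minus = [1.0, 3.0, 1.5]  # reverse rates (illustrative values)

def one_way_fluxes(x):
    """Eq. 3: j^± = k^± ∘ (B^±)^T x."""
    j_plus = [k * s for k, s in zip(k_plus, matvec(transpose(B_plus), x))]
    j_minus = [k * s for k, s in zip(k_minus, matvec(transpose(B_minus), x))]
    return j_plus, j_minus

def x_dot(x):
    """Eq. 2: dx/dt = -B (j^+ - j^-)."""
    j_plus, j_minus = one_way_fluxes(x)
    j = [a - b for a, b in zip(j_plus, j_minus)]
    return [-v for v in matvec(B, j)]

p = [0.5, 0.3, 0.2]  # a probability vector on the vertices
j_plus, j_minus = one_way_fluxes(p)
dp = x_dot(p)
print(j_plus, j_minus, dp, sum(dp))
```

Because every column of \(\mathbb {B}\) sums to zero, the components of \(\dot{\varvec{p}}\) sum to zero as well, so the master equation conserves total probability in this sketch.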

Definition 3

(Weighted asymmetric graph Laplacian [91, 92]) For \(\mathbb {G}_{\varvec{k}^{\pm }}\), the corresponding weighted asymmetric graph Laplacian is defined by

$$\begin{aligned} \mathcal {L}_{\varvec{\theta }}:=\mathbb {B}\left[ \textrm{diag}[\varvec{k}^{+}] (\mathbb {B}^{+})^{T}-\textrm{diag}[\varvec{k}^{-}] (\mathbb {B}^{-})^{T} \right] , \end{aligned}$$
(5)

where \(\varvec{\theta }:=(\varvec{k}^{+},\varvec{k}^{-})\) and \(\textrm{diag}[\varvec{k}^{+}]\) is the diagonal matrix whose diagonal elements are \(\varvec{k}^{+}\). Using \(\mathcal {L}_{\varvec{\theta }}\), Eq. 2 and Eq. 3 are represented as

$$\begin{aligned} \dot{\varvec{x}}&=-\mathbb {B}\varvec{j}(\varvec{x})=-\mathbb {B}\left[ \textrm{diag}[\varvec{k}^{+}] (\mathbb {B}^{+})^{T}-\textrm{diag}[\varvec{k}^{-}] (\mathbb {B}^{-})^{T} \right] \varvec{x}=-\mathcal {L}_{\varvec{\theta }}\varvec{x}. \end{aligned}$$
(6)

The operator \(\mathcal {L}_{\varvec{\theta }}\) is reduced to the weighted symmetric graph Laplacian if \(\varvec{k}^{+}=\varvec{k}^{-}\) and also to the conventional graph Laplacian if \(\varvec{k}^{+}=\varvec{k}^{-}=\varvec{1}\) [91, 92]. Equation 6 can also cover linear transport on graphs, a class of linear electric circuits [93], consensus dynamics on graphs [94], and other linear dynamics on graphs [86, 95].Footnote 11
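The construction in Eq. 5 and the reduction to the conventional graph Laplacian can be illustrated with a short sketch. The three-vertex cycle graph and the rate values are hypothetical, and the helper names are assumptions introduced only for this example.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def graph_laplacian(B, k_plus, k_minus):
    """Eq. 5: L = B [diag(k^+) (B^+)^T - diag(k^-) (B^-)^T]."""
    B_plus_T = transpose([[max(b, 0) for b in row] for row in B])
    B_minus_T = transpose([[max(-b, 0) for b in row] for row in B])
    # Multiplying by diag(k) from the left scales row e by k_e.
    inner = [
        [kp * bp - km * bm for bp, bm in zip(row_p, row_m)]
        for kp, km, row_p, row_m in zip(k_plus, k_minus, B_plus_T, B_minus_T)
    ]
    return matmul(B, inner)

# Hypothetical 3-vertex cycle; rows are vertices, columns are edges.
B = [[+1, 0, -1],
     [-1, +1, 0],
     [0, -1, +1]]

L_theta = graph_laplacian(B, [2.0, 1.0, 0.5], [1.0, 3.0, 1.5])
L_unit = graph_laplacian(B, [1, 1, 1], [1, 1, 1])  # k^+ = k^- = 1

column_sums = [sum(col) for col in zip(*L_theta)]
print(L_theta)
print(L_unit)       # equals B B^T, i.e., degree matrix minus adjacency matrix
print(column_sums)  # zero columns sums reflect conservation of total density
```

With unit rates, \(\mathbb {B}^{+}-\mathbb {B}^{-}=\mathbb {B}\) gives \(\mathcal {L}_{\varvec{\theta }}=\mathbb {B}\mathbb {B}^{T}\), which is the conventional graph Laplacian mentioned above.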

2.2 Chemical reaction network and polynomial dynamical systems on hypergraphs

Next, we introduce a class of nonlinear dynamics on hypergraphs, which includes the rLDG (Eq. 2 and Eq. 3) as a special case. The most common instance is deterministic chemical reaction networks (CRN) with the law of mass action (LMA) kinetics [7, 8, 45, 96], and this class is sometimes referred to as polynomial dynamical systems (PDS). Because the major part of the PDS theory has been developed for CRN, we use CRN to introduce and specify this class in this work.

Definition 4

(Reversible edge-weighted CRN hypergraph \(\mathbb {H}_{\varvec{k}^{\pm }}\)) The reversible CRN hypergraph \(\mathbb {H}\) consists of a finite number of vertices \(\{\mathbb {X}_{i}\}_{i\in [1,N_{\mathbb {X}}]}\) and hyperedges \(\{\mathbb {e}_{e}\}_{e\in [1,N_{\mathbb {e}}]}\) where \(N_{\mathbb {X}}, N_{\mathbb {e}} \in \mathbb {Z}_{>0}\) (Fig. 2b). Each hyperedge \(\mathbb {e}_{e}\) connects two different hypervertices \(\hat{\mathbb {v}}^{+}_{e}\) and \(\hat{\mathbb {v}}^{-}_{e}\) where \(\hat{\mathbb {v}}^{+}_{e} \ne \hat{\mathbb {v}}^{-}_{e}\).Footnote 12 The hypervertices are multisets of vertices \(\{\mathbb {X}_{i}\}_{i\in [1,N_{\mathbb {X}}]}\), each of which is defined as \(\hat{\mathbb {v}}_{\ell }=\sum _{i=1}^{N _{\mathbb {X}}}\gamma _{i,\ell }\mathbb {X}_{i}\) where \(\gamma _{i,\ell }\in \mathbb {Z}_{\ge 0}\) is the number of copies of the ith vertex included in the \(\ell \)th hypervertex.Footnote 13 Thus, the nonnegative integer vector \(\varvec{\gamma }_{\ell }:=(\gamma _{1,\ell }, \cdots , \gamma _{N_{\mathbb {X}},\ell })^{T}\in \mathbb {Z}_{\ge 0}^{N_{\mathbb {X}}}\) defines the \(\ell \)th hypervertex. Let \(N_{\hat{\mathbb {v}}} \in \mathbb {Z}_{>0}\) be the total number of the hypervertices and \(\Gamma :=(\varvec{\gamma }_{1},\cdots ,\varvec{\gamma }_{N_{\hat{\mathbb {v}}}})\in \mathbb {Z}_{\ge 0}^{N_{\mathbb {X}}\times N_{\hat{\mathbb {v}}}}\) be the hypervertex matrix. The matrix \(\mathbb {B}\in \{0,\pm 1\}^{N_{\hat{\mathbb {v}}}\times N_{\mathbb {e}}}\) is the incidence matrix encoding the incidence relations among the hypervertices and the hyperedges. The hypergraph incidence matrix \(\mathbb {S}\in \mathbb {Z}^{N_{\mathbb {X}} \times N_{\mathbb {e}}}\) is then defined as

$$\begin{aligned} \mathbb {S}:=\Gamma \mathbb {B}. \end{aligned}$$
(7)

If \(\Gamma =I\) where \(I\) is the identity matrix, then \(\mathbb {H}\) is reduced to \(\mathbb {G}=(\{\mathbb {v}_{\ell }\}_{\ell \in [1,N_{\mathbb {X}}]},\{\mathbb {e}_{e}\}_{e\in [1,N_{\mathbb {e}}]},\mathbb {B})\) where \(\mathbb {v}_{\ell }=\mathbb {X}_{\ell }\). An edge-weighted CRN hypergraph \(\mathbb {H}_{\varvec{k}^{\pm }}\) has forward and reverse rates \(k_{e}^{\pm }> 0\) as the weights of edge \(\mathbb {e}_{e}\).

In the context of CRN theory, the vertices \(\{\mathbb {X}_{i}\}\) correspond to the molecular species involved in a CRN, and each hyperedge \(\mathbb {e}_{e}\) represents a pair of forward and reverse reactions:

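$$\begin{aligned} \gamma ^{+}_{1,e}\mathbb {X}_{1}+\cdots +\gamma ^{+}_{N_{\mathbb {X}},e}\mathbb {X}_{N_{\mathbb {X}}} \underset{k_{e}^{-}}{\overset{k_{e}^{+}}{\rightleftharpoons }} \gamma ^{-}_{1,e}\mathbb {X}_{1}+\cdots +\gamma ^{-}_{N_{\mathbb {X}},e}\mathbb {X}_{N_{\mathbb {X}}}, \end{aligned}$$
(8)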

where the forward and reverse reactions are from left to right and from right to left, respectively. Head and tail hypervertices \(\hat{\mathbb {v}}_{e}^{+}:=(\gamma ^{+}_{1,e}\mathbb {X}_{1}+\cdots +\gamma ^{+}_{N_{\mathbb {X}},e}\mathbb {X}_{N_{\mathbb {X}}})\) and \(\hat{\mathbb {v}}_{e}^{-}:=(\gamma ^{-}_{1,e}\mathbb {X}_{1}+\cdots +\gamma ^{-}_{N_{\mathbb {X}},e}\mathbb {X}_{N_{\mathbb {X}}})\) in Eq. 8 are the sets of reactants and products of the eth forward reaction, respectively. More specifically, \(\gamma ^{+}_{i,e}\in \mathbb {Z}_{\ge 0}\) and \(\gamma ^{-}_{i,e}\in \mathbb {Z}_{\ge 0}\) are the numbers of the molecule \(\mathbb {X}_{i}\) involved as the reactants and products of the eth forward reaction, respectively. For the reverse reaction, \(\hat{\mathbb {v}}_{e}^{-}\) and \(\hat{\mathbb {v}}_{e}^{+}\) are the reactants and products. Some head and tail hypervertices are overlapping among different reactions (hyperedges) as in Fig. 2b. As a result, \(\{\hat{\mathbb {v}}_{\ell }\}_{\ell \in N_{\hat{\mathbb {v}}}}\) is the union of the head and tail hypervertices, \(\{\hat{\mathbb {v}}_{\ell }\}_{\ell \in N_{\hat{\mathbb {v}}}}=\bigcup _{e \in N_{\mathbb {e}}}\{\hat{\mathbb {v}}_{e}^{+}, \hat{\mathbb {v}}_{e}^{-}\}\).

The hypervertices are called complexes in CRN theory [8]Footnote 14. From \(\{\gamma ^{+}_{i,e}\}\) and \(\{\gamma ^{-}_{i,e}\}\), we can define

$$\begin{aligned} \varvec{s}_{e}:=(\gamma ^{+}_{1,e}-\gamma ^{-}_{1,e},\cdots , \gamma ^{+}_{N_{\mathbb {X}},e}-\gamma ^{-}_{N_{\mathbb {X}},e})^{T}\in \mathbb {Z}^{N_{\mathbb {X}}}, \end{aligned}$$
(9)

where \(\mp \varvec{s}_{e}\) specify the change in the number of molecules induced when the eth forward and reverse reaction occurs just once, respectively. The hypergraph incidence matrix \(\mathbb {S}\) defined in Eq. 7 is represented as \(\mathbb {S}= (\varvec{s}_{1},\cdots , \varvec{s}_{N_{\mathbb {e}}})\in \mathbb {Z}^{N_{\mathbb {X}} \times N_{\mathbb {e}}}\). In chemistry, the negative of \(\varvec{s}_{e}\) and \(\mathbb {S}\), i.e., \(-\varvec{s}_{e}\) and \(-\mathbb {S}\), are called the stoichiometric vector and matrix, respectively [8].

Remark 1

To define a reversible CRN hypergraph, the hypergraph incidence matrix \(\mathbb {S}\) is not sufficient. If the head and tail hypervertices of a hyperedge contain the same vertex (molecule), the contributions of this shared vertex cancel out and reduce the corresponding element of \(\mathbb {S}\). Thus, the existence of shared vertices (molecules) is invisible in \(\mathbb {S}\), and the pair of the hypervertex matrix \(\Gamma \) and the incidence matrix \(\mathbb {B}\) is required to define \(\mathbb {H}\). Such shared molecules are called catalysts in CRN.
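The point of this remark can be made concrete with a minimal sketch based on Eq. 9. The two reactions compared below are illustrative choices (the first is the autocatalytic step of the Brusselator in Eq. 16, the second a plain isomerization), and the function name is hypothetical.

```python
def stoichiometric_column(gamma_plus, gamma_minus):
    """Eq. 9: s_e = gamma^+_e - gamma^-_e, component by component."""
    return [gp - gm for gp, gm in zip(gamma_plus, gamma_minus)]

# (gamma^+, gamma^-) over the species (X1, X2)
autocatalytic = ([2, 1], [3, 0])  # 2 X1 + X2 <-> 3 X1
plain = ([0, 1], [1, 0])          # X2 <-> X1

s_auto = stoichiometric_column(*autocatalytic)
s_plain = stoichiometric_column(*plain)

# Species that appear on both sides of the autocatalytic reaction.
shared = [i for i, (gp, gm) in enumerate(zip(*autocatalytic)) if min(gp, gm) > 0]

print(s_auto, s_plain)  # identical columns although the reactions differ
print(shared)           # index 0, i.e., X1 is shared between both sides
```

Both reactions produce the same column of \(\mathbb {S}\), so the shared molecule \(\mathbb {X}_{1}\) can be recovered only from the hypervertex data, not from \(\mathbb {S}\) alone.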

For a CRN hypergraph, the continuity equation for CRN is defined:

Definition 5

(CRN continuity equation) Let a vector of nonnegative densities \(\varvec{x}=(x_{1},\cdots ,x_{N_{\mathbb {X}}})^{T}\in \mathbb {R}_{\ge 0}^{N_{\mathbb {X}}}\) represent the concentrations of molecules \(\{\mathbb {X}_{i}\}\). The CRN continuity equation is defined as

$$\begin{aligned} \dot{\varvec{x}}=-\mathbb {S}\varvec{j}(\varvec{x})= -\textrm{div}_{\mathbb {S}} \varvec{j}(\varvec{x}), \end{aligned}$$
(10)

where \(j_{e}^{+}(\varvec{x})\in \mathbb {R}_{\ge 0}\) and \(j_{e}^{-}(\varvec{x})\in \mathbb {R}_{\ge 0}\) are the one-way fluxes of the eth forward and reverse reactions, \(\varvec{j}^{\pm }(\varvec{x}):=(j_{1}^{\pm }(\varvec{x}),\cdots ,j_{N_{\mathbb {e}}}^{\pm }(\varvec{x}))^{T}\in \mathbb {R}_{\ge 0}^{N_{\mathbb {e}}}\) are their vector representations, and \(\varvec{j}(\varvec{x}):=\varvec{j}^{+}(\varvec{x})-\varvec{j}^{-}(\varvec{x})\in \mathbb {R}^{N_{\mathbb {e}}}\) is the total reaction flux [7, 8, 96]. The hypergraph divergence operator \(\textrm{div}_{\mathbb {S}} :=\mathbb {S}\) is defined accordingly.

To define the dynamics of a CRN, the functional form of \(j_{e}^{\pm }(\varvec{x})\) is required.Footnote 15 Before introducing specific forms, we define two important properties of the fluxes and also other functions defined on edges:

Definition 6

(Consistency of fluxes \(\varvec{j}^{\pm }(\varvec{x})\) with hypergraph \(\mathbb {H}\)) One-way fluxes \(\varvec{j}^{\pm }(\varvec{x})\) are consistent with the hypergraph \(\mathbb {H}\) if, for all \(e\in [1,N_{\mathbb {e}}]\), \(j^{\pm }_{e}(\varvec{x})\) becomes 0 when \(x_{i}=0\) where \(\mathbb {X}_{i}\) is any reactant of \(j^{\pm }_{e}(\varvec{x})\), respectively. In other words, \(j^{\pm }_{e}(\varvec{x})\) satisfies \(\gamma _{i,e}^{\pm }j_{e}^{\pm }(\varvec{x})=0\) if \(x_{i}=0\) for any \(i\in [1,N_{\mathbb {X}}]\).

Definition 7

(Locality of function on edges over \(\mathbb {H}\)) A vector function \(\varvec{g}(\varvec{x})\in \mathbb {R}^{N_{\mathbb {e}}}\) defined on edges is local on \(\mathbb {H}\) if, for all \(e\in [1,N_{\mathbb {e}}]\), \(g_{e}(\varvec{x})\) is a function only of the elements of \(\varvec{x}\) incident to the edge \(\mathbb {e}_{e}\) on \(\mathbb {H}\), i.e., \(g_{e}(\varvec{x})=g_{e}(\bar{\gamma }_{1,e}^{+} x_{1},\cdots , \bar{\gamma }_{N_{\mathbb {X}},e}^{+}x_{N_{\mathbb {X}}},\bar{\gamma }_{1,e}^{-} x_{1},\cdots , \bar{\gamma }_{N_{\mathbb {X}},e}^{-} x_{N_{\mathbb {X}}})\) where \(\bar{\gamma }_{i,e}^{\pm }:=\min [1, \gamma _{i,e}^{\pm }] \in \{0,1\}\).

The consistency condition is indispensable to prohibit a reaction that can decrease \(x_{i}\) from occurring when \(x_{i}=0\). For \(\varvec{j}^{\pm }(\varvec{x})\), the locality means that the fluxes of the eth reaction depend only on the concentrations of their reactants and products. The local flux is determined solely by the information stored on the vertices incident to the edge and plays a crucial role when we regard the structure introduced in this work as an extension of differential forms on continuous manifolds to graphs and hypergraphs. When we work on specific forms of fluxes in this work, we consider only local fluxes consistent with the given hypergraph \(\mathbb {H}\).

In chemistry, we have a variety of candidates for the functional form of flux, e.g., the Michaelis-Menten function, Hill’s function, and others [7, 97]. Among others, the LMA kinetics is the most basic and well-established one.

Definition 8

(Waage–Guldberg’s law of mass action kinetics (LMA kinetics)) A CRN follows the LMA kinetics if, for all \(e\in [1,N_{\mathbb {e}}]\), the eth forward and reverse reaction fluxes are represented as

$$\begin{aligned} j^{\pm }_{e}(\varvec{x})=k^{\pm }_{e} \prod _{j=1}^{N_{\mathbb {X}}}x_{j}^{\gamma ^{\pm }_{j,e}}=k^{\pm }_{e} \sum _{\ell =1}^{N_{\hat{\mathbb {v}}}}b_{\ell ,e}^{\pm }\prod _{j=1} ^{N_{\mathbb {X}}}x_{j}^{\gamma _{j,\ell }}, \end{aligned}$$
(11)

where \(k_{e}^{+}\in \mathbb {R}_{> 0}\) and \(k_{e}^{-}\in \mathbb {R}_{> 0}\) are the reaction rate constants of the eth forward and reverse reactions, respectively. The fluxes under LMA kinetics can be compactly represented as

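$$\begin{aligned} \varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x})=\varvec{k}^{\pm }\circ (\mathbb {B}^{\pm })^{T}\varvec{x}^{\Gamma ^{T}}, \end{aligned}$$
(12)

where \(\varvec{x}^{\Gamma ^{T}}:=(\varvec{x}^{\varvec{\gamma }_{1}},\cdots ,\varvec{x}^{\varvec{\gamma }_{N_{\hat{\mathbb {v}}}}})^{T}\in \mathbb {R}_{\ge 0}^{N_{\hat{\mathbb {v}}}}\) and \(\varvec{x}^{\varvec{\gamma }_{\ell }}:=\prod _{j=1}^{N_{\mathbb {X}}}x_{j}^{\gamma _{j,\ell }}\).Footnote 16 We use the subscript \(\textrm{MA}\) as in \(\varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x})\) to discriminate this specific form of the fluxes from others. We can easily observe that \(\varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x})\) is consistent and local with respect to \(\mathbb {H}\). Furthermore, \(\varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x})\) is specified by the edge-weighted CRN hypergraph \(\mathbb {H}_{\varvec{k}^{\pm }}\).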
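The first expression in Eq. 11 can be evaluated directly, as in the following minimal sketch. The single reaction \(\mathbb {X}_{1}+\mathbb {X}_{2}\rightleftharpoons 2\mathbb {X}_{3}\), the rate constants, and the function name are hypothetical and serve only to illustrate the monomial form and the consistency property of Definition 6.

```python
import math

def lma_fluxes(x, gamma, k):
    """Eq. 11 (first form): j_e = k_e * prod_j x_j ** gamma_{j,e}.

    `gamma` lists, for each reaction, the exponent of every species.
    """
    return [
        k_e * math.prod(x_j ** g for x_j, g in zip(x, gamma_e))
        for k_e, gamma_e in zip(k, gamma)
    ]

# Hypothetical single reaction X1 + X2 <-> 2 X3 over species (X1, X2, X3).
gamma_plus = [[1, 1, 0]]   # reactants of the forward reaction
gamma_minus = [[0, 0, 2]]  # reactants of the reverse reaction
k_plus, k_minus = [2.0], [0.5]

x = [1.0, 3.0, 2.0]
j_plus = lma_fluxes(x, gamma_plus, k_plus)
j_minus = lma_fluxes(x, gamma_minus, k_minus)
j_total = [a - b for a, b in zip(j_plus, j_minus)]

# Consistency: the forward flux vanishes once a forward reactant is absent.
j_plus_without_x1 = lma_fluxes([0.0, 3.0, 2.0], gamma_plus, k_plus)

print(j_plus, j_minus, j_total, j_plus_without_x1)
```

Python evaluates `0.0 ** 0` as `1.0`, so species that do not take part in a reaction contribute a factor of one even when their concentration is zero.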

Remark 2

(Algebraic aspect of LMA kinetics) Because each component \(\prod _{j=1}^{N_{\mathbb {X}}}x_{j}^{\gamma _{j,\ell }}\) of \(\varvec{x}^{\Gamma ^{T}}\) is a monomial of \(\varvec{x}\), each one-way flux, \(j^{\pm }_{e}(\varvec{x})\), is a monomial of \(\varvec{x}\) under Eq. 12 and thus the total flux \(j_{e}(\varvec{x})=j^{+}_{e}(\varvec{x})-j^{-}_{e}(\varvec{x})\) is a binomial. This fact links the real algebraic geometry of toric varieties [98, 99] to CRN [84, 100] as it does in algebraic statistics [48, 101].

Remark 3

(Extended LMA kinetics) While we mainly work on the normal LMA kinetics, we can extend it. The extended LMA kinetics defined on \(\mathbb {H}\) is defined as

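$$\begin{aligned} \varvec{j}^{\pm }(\varvec{x})=\varvec{g}(\varvec{x})\circ \varvec{j}^{\pm }_{\textrm{MA}}(\varvec{x}), \end{aligned}$$
(13)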

where \(\varvec{g}(\varvec{x}) \in \mathbb {R}_{> 0}^{N_{\mathbb {e}}}\) and is local with respect to \(\mathbb {H}\).Footnote 17 An example of the extended LMA kinetics is reversible Michaelis-Menten kinetics [103].

By combining the continuity equation (Eq. 10) and the LMA kinetics (Eq. 12), we have the following chemical rate equation:

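$$\begin{aligned} \dot{\varvec{x}}=-\mathbb {S}\varvec{j}_{\textrm{MA}}(\varvec{x})=-\Gamma \mathbb {B}\left[ \textrm{diag}[\varvec{k}^{+}] (\mathbb {B}^{+})^{T}-\textrm{diag}[\varvec{k}^{-}] (\mathbb {B}^{-})^{T} \right] \varvec{x}^{\Gamma ^{T}}=-\Gamma \mathcal {L}_{\varvec{\theta }}\varvec{x}^{\Gamma ^{T}}, \end{aligned}$$
(14)

where \(\mathcal {L}_{\varvec{\theta }}\) is the weighted asymmetric graph Laplacian defined as in Eq. 5. Now, we can see that CRN contains rLDG (Eq. 6) as a special case if the hypervertex matrix is the identity matrix, i.e., \(\Gamma =I\). Owing to this inclusion relation, CRN with LMA kinetics is a mathematically sound generalization of rLDG. Because LDG has been used in various fields of social science, network science, machine learning, and so on, CRN theory is potentially important for extending the results there.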

Example 1

(Simplified Brusselator CRN [8, 104]) The Brusselator is a representative CRN, which can generate non-trivial dynamic behaviors such as oscillations. We use a reversible CRN version of the simplified Brusselator [8, 104], whose CRN-hypergraph depicted in Fig. 2b has the following structural information:

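$$\begin{aligned} \Gamma =\begin{pmatrix} 0 &{} 1 &{} 0 &{} 2 &{} 3 \\ 0 &{} 0 &{} 1 &{} 1 &{} 0 \end{pmatrix}, \qquad \mathbb {B}=\begin{pmatrix} +1 &{} 0 &{} 0 \\ -1 &{} +1 &{} 0 \\ 0 &{} -1 &{} 0 \\ 0 &{} 0 &{} +1 \\ 0 &{} 0 &{} -1 \end{pmatrix}, \end{aligned}$$
(15)

where the hypervertices are ordered as \(\hat{\mathbb {v}}_{1}=\emptyset \), \(\hat{\mathbb {v}}_{2}=\mathbb {X}_{1}\), \(\hat{\mathbb {v}}_{3}=\mathbb {X}_{2}\), \(\hat{\mathbb {v}}_{4}=2\mathbb {X}_{1}+\mathbb {X}_{2}\), and \(\hat{\mathbb {v}}_{5}=3\mathbb {X}_{1}\).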

The rate equation (Eq. 14) can be represented as

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\begin{pmatrix}x_{1}\\ x_{2}\end{pmatrix}&=- \overbrace{\begin{pmatrix} -1 &{} +1 &{} -1 \\ 0 &{} -1 &{}+1 \end{pmatrix}}^{\mathbb {S}} \left[ \overbrace{ \begin{pmatrix} k_{1}^{+} \\ k_{2}^{+}x_{1} \\ k_{3}^{+} x_{1}^{2}x_{2} \end{pmatrix}}^{\varvec{j}^{+}(\varvec{x})} - \overbrace{\begin{pmatrix} k_{1}^{-}x_{1} \\ k_{2}^{-}x_{2} \\ k_{3}^{-} x_{1}^{3} \end{pmatrix}}^{\varvec{j}^{-}(\varvec{x})} \right] . \end{aligned}$$
(16)
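The explicit form in Eq. 16 can be cross-checked against the factorized form built from a hypervertex matrix (whose columns are the vectors \(\varvec{\gamma }_{\ell }\) of Definition 4, called `Gamma` below) and the incidence matrix \(\mathbb {B}\). In this sketch, the ordering of the hypervertices beyond the empty one, the rate constants, the state, and all function names are assumptions made for illustration.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Assumed hypervertex ordering: (empty, X1, X2, 2 X1 + X2, 3 X1).
Gamma = [[0, 1, 0, 2, 3],   # copies of X1 in each hypervertex
         [0, 0, 1, 1, 0]]   # copies of X2 in each hypervertex
B = [[+1, 0, 0],            # rows: hypervertices, columns: reactions
     [-1, +1, 0],
     [0, -1, 0],
     [0, 0, +1],
     [0, 0, -1]]
S = matmul(Gamma, B)        # hypergraph incidence matrix

def factorized_rhs(x, k_plus, k_minus):
    """dx/dt = -S [k^+ ∘ (B^+)^T m - k^- ∘ (B^-)^T m], m the hypervertex monomials."""
    m = [math.prod(x_i ** g for x_i, g in zip(x, col)) for col in zip(*Gamma)]
    B_plus_T = transpose([[max(b, 0) for b in row] for row in B])
    B_minus_T = transpose([[max(-b, 0) for b in row] for row in B])
    j_plus = [k * v for k, v in zip(k_plus, matvec(B_plus_T, m))]
    j_minus = [k * v for k, v in zip(k_minus, matvec(B_minus_T, m))]
    j = [a - b for a, b in zip(j_plus, j_minus)]
    return [-v for v in matvec(S, j)]

def explicit_rhs(x, k_plus, k_minus):
    """Right-hand side of Eq. 16 written out by hand."""
    x1, x2 = x
    j = [k_plus[0] - k_minus[0] * x1,
         k_plus[1] * x1 - k_minus[1] * x2,
         k_plus[2] * x1 ** 2 * x2 - k_minus[2] * x1 ** 3]
    return [-(-j[0] + j[1] - j[2]), -(-j[1] + j[2])]

k_plus, k_minus = [1.0, 2.0, 3.0], [0.5, 0.25, 0.125]  # illustrative rates
x = [2.0, 3.0]
rhs_factorized = factorized_rhs(x, k_plus, k_minus)
rhs_explicit = explicit_rhs(x, k_plus, k_minus)
print(S, rhs_factorized, rhs_explicit)
```

The product of `Gamma` and `B` reproduces the matrix \(\mathbb {S}\) displayed in Eq. 16, and the two right-hand sides agree for the chosen state.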

2.3 Fokker–Planck equations

While our main focus is the dynamics on graphs and hypergraphs, we use FPE as a representative class of density dynamics on a continuous Euclidean space. Specifically, we use FPE only to demonstrate the relation of our results with previous ones obtained for FPE in various contexts. Because FPE is infinite-dimensional, we treat it here only formally.

Let \(\varvec{r}\in \mathbb {R}^{d}\) be a vector in a d-dimensional Euclidean space. We consider infinitely many noninteracting particles randomly walking in the space and describe the dynamics by a probability density \(p_{t}(\varvec{r})\in \mathbb {R}_{\ge 0}\) of the particles. The continuity equation for \(p_{t}(\varvec{r})\) is

$$\begin{aligned} \partial _{t}p_{t}(\varvec{r})&=-\nabla \cdot \varvec{j}_{\textrm{FP}}[p_{t}(\varvec{r})] \end{aligned}$$
(17)

where \(\varvec{j}_{\textrm{FP}}[p_{t}(\varvec{r})]\in \mathbb {R}^{d}\) is the probability flux, \(\nabla :=(\partial /\partial r_{1}, \cdots , \partial /\partial r_{d})^{T}\) is the gradient operator on the Euclidean space, and \((\nabla \cdot ): \nabla \cdot \varvec{F}(\varvec{r}):=\sum _{i=1}^{d}\partial F_{i}(\varvec{r})/\partial r_{i} \in \mathbb {R}\) is the divergence. The flux of the FPE is defined as

$$\begin{aligned} \varvec{j}_{\textrm{FP}}[p(\varvec{r})]&= \left[ \varvec{F}(\varvec{r})p(\varvec{r}) - D_{0} \nabla p(\varvec{r})\right] , \end{aligned}$$
(18)

where \(\varvec{F}(\varvec{r})\in \mathbb {R}^{d}\) is the drift force, and \(D_{0} \in \mathbb {R}_{>0}\) is the diffusion constant.
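The flux in Eq. 18 can be illustrated with a short one-dimensional sketch. Assuming, purely for illustration, a gradient drift \(F(r)=-\textrm{d}U/\textrm{d}r\) with the quadratic potential \(U(r)=r^{2}/2\), the density proportional to \(\exp [-U(r)/D_{0}]\) makes the flux vanish. The grid, the value of \(D_{0}\), and the function names are assumptions of this sketch.

```python
import numpy as np

D0 = 0.5                          # diffusion constant (assumed value)
r = np.linspace(-3.0, 3.0, 601)   # uniform grid on a truncated domain
dr = r[1] - r[0]

def drift(r):
    """Drift F(r) = -dU/dr for the assumed potential U(r) = r**2 / 2."""
    return -r

def fp_flux(p, r):
    """One-dimensional finite-difference version of Eq. 18: j = F p - D0 dp/dr."""
    return drift(r) * p - D0 * np.gradient(p, r)

# Candidate stationary density, proportional to exp(-U(r) / D0).
p_stat = np.exp(-r**2 / (2.0 * D0))
p_stat /= p_stat.sum() * dr       # normalize on the grid

j_stat = fp_flux(p_stat, r)       # close to zero up to discretization error
```

For a density that is not of this form, e.g., a shifted Gaussian, the same function returns a flux of order one, which then drives the density through Eq. 17.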

3 Discrete calculus and homological algebra of graphs and hypergraphs

The algebraic and topological structure of the dynamics on graphs and hypergraphs can be explicitly and abstractly treated using the language of discrete calculus and homological algebra. The discrete versions of the gradient and divergence mentioned in Sect. 2 are also characterized. In this section, we briefly introduce the chain and cochain complexes defined for a finite graph or a hypergraph and discrete calculus [73, 91, 105, 106]. We first introduce the complexes for a graph \(\mathbb {G}\) and then extend them to a hypergraph \(\mathbb {H}\) algebraically.Footnote 18 Note that conventional discrete calculus (the discrete version of the theory for differential forms) presumes a Riemannian metric in the dual space of chains and cochains or that of cochains on primal and dual complexes [107, 108]. However, we introduce Legendre duality instead. For this purpose, our introduction of chain and cochain complexes depends only on the topological (algebraic) information of the underlying graph and hypergraph [73] without specifying the metric information.

3.1 Chain and cochain complexes on graphs

The elements of a graph \(\mathbb {G}\) are called cells in discrete calculus.Footnote 19 A vertex and an edge are, respectively, called a 0-cell and a 1-cell, and the graph \(\mathbb {G}\) is denoted as a cell-complex.Footnote 20 For each type of cell, we consider vectors (chains and cochains) defined on the cells. For \(\mathbb {G}\), a 0-chain with field \(\mathbb {R}\) is an \(N_{\mathbb {v}}\)-tuple of real scalars, each of which is assigned to a vertex, i.e., a 0-cell. Thus, a 0-chain is a real vector defined on the vertices of \(\mathbb {G}\) with the basis \(\{\mathbb {v}_{i}\}\). This basis is called the standard basis. The vector space of real 0-chains is called the vertex space here and denoted as \(C_{0}(\mathbb {G})=\mathbb {R}^{N_{\mathbb {v}}}\) [91].Footnote 21 The components of the vector \(\varvec{x} \in C_{0}(\mathbb {G})\) are given as \(\varvec{x}(\mathbb {v}_{i}):=x_{i}\). Similarly, a real 1-chain is a real vector defined on the edges of \(\mathbb {G}\). The real vector space of 1-chains is called the edge space and denoted as \(C_{1}(\mathbb {G})=\mathbb {R}^{N_{\mathbb {e}}}\). The standard basis is introduced by using edges \(\{\mathbb {e}_{e}\}\), accordingly. A flux \(\varvec{j}\) is a 1-chain: \(\varvec{j}(\mathbb {e}_{e}):=j_{e}\). The graph incidence matrix \(\mathbb {B}\) induces the discrete differential \(\delta _{1}: C_{1}(\mathbb {G}) \rightarrow C_{0}(\mathbb {G})\) as \(\delta _{1}\varvec{j}:=\mathbb {B}\varvec{j}\).Footnote 22

To obtain an exact sequence, we algebraically define the \((-1)\)-chains and 2-chains and the corresponding differentials \(\delta _{0}\) and \(\delta _{2}\). Let \(C_{2}(\mathbb {G})=\mathbb {R}^{N_{\mathbb {z}}}\) where \(N_{\mathbb {z}}=\textrm{dim}[\textrm{Ker}\mathbb {B}]\) and \(\{\varvec{v}_{i}\}_{i\in [1,N_{\mathbb {z}}]}\) is a complete basis of \(\textrm{Ker}\mathbb {B}\) where \(\varvec{v}_{i} \in \{0,+1,-1\}^{N_{\mathbb {e}}}\)Footnote 23. In algebraic graph theory, \(\textrm{Ker}\mathbb {B}\) is called the cycle subspace [86, 91, 109]. For a graph \(\mathbb {G}\), we can construct \(\{\varvec{v}_{i}\}_{i\in [1,N_{\mathbb {z}}]}\) by, for example, using the fundamental cycle basis of \(\mathbb {G}\) obtained from a fixed spanning tree of \(\mathbb {G}\)Footnote 24 [86]. Thus, \(C_{2}(\mathbb {G})\) is the vector space defined on the cycles of \(\mathbb {G}\) and isomorphic to the cycle subspace. We define a matrix, \(\mathbb {V}:=(\varvec{v}_{1},\cdots , \varvec{v}_{N_{\mathbb {z}}})\)Footnote 25, and the differential \(\delta _{2}: C_{2}(\mathbb {G}) \rightarrow C_{1}(\mathbb {G})\) as \(\delta _{2}:=\mathbb {V}\). From the construction, \(\mathbb {B}\mathbb {V}=\delta _{1}\delta _{2}=0\) and \(\textrm{Im}[\delta _{2}]=\textrm{Ker}[\delta _{1}]\) hold. Similarly, let \(C_{-1}(\mathbb {G})=\mathbb {R}^{N_{\mathbb {l}}}\) where \(N_{\mathbb {l}}=\textrm{dim}[\textrm{Ker}\mathbb {B}^{T}]\) and \(\{\varvec{u}_{\ell }\}_{\ell \in [1,N_{\mathbb {l}}]}\) is a complete basis of \(\textrm{Ker}\mathbb {B}^{T}\) where \(\varvec{u}_{\ell } \in \{0,+1,-1\}^{N_{\mathbb {v}}}\). The subspace \(\textrm{Ker}\mathbb {B}^{T}\) is related to the connected components of \(\mathbb {G}\), and \(\varvec{u}_{\ell }\) can be chosen such that \(u_{i,\ell }=+1\) if the ith vertex is included in the \(\ell \)th connected component and \(u_{i,\ell }=0\) otherwise. Thus, \(C_{-1}(\mathbb {G})\) is the vector space on the connected components. From the matrix \(\mathbb {U}:=(\varvec{u}_{1},\cdots , \varvec{u}_{N_{\mathbb {l}}})^{T}\), the differential \(\delta _{0}: C_{0}(\mathbb {G}) \rightarrow C_{-1}(\mathbb {G})\) is defined as \(\delta _{0}:=\mathbb {U}\). From the construction, \(\mathbb {U}\mathbb {B}=\delta _{0}\delta _{1}=0\) and \(\textrm{Im}[\delta _{1}]=\textrm{Ker}[\delta _{0}]\) hold. Then, we obtain the exact chain sequenceFootnote 26Footnote 27:

$$\begin{aligned} 0 \xleftarrow {} C_{-1}(\mathbb {G})\xleftarrow {\delta _{0} =\mathbb {U}}C_{0}(\mathbb {G})\xleftarrow {\delta _{1} =\mathbb {B}}C_{1}(\mathbb {G})\xleftarrow {\delta _{2}=\mathbb {V}} C_{2}(\mathbb {G})\xleftarrow {}0. \end{aligned}$$
(19)
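The construction of Eq. 19 can be checked on a small example. The sketch below uses a hypothetical toy graph (a directed triangle with a pendant edge, plus a second component consisting of a single edge) and assumes the orientation convention that an entry of \(\mathbb {B}\) is \(+1\) at the tail and \(-1\) at the head of an edge; the kernels do not depend on this choice. The bases \(\mathbb {V}\) and \(\mathbb {U}\) are written down by hand for this graph.

```python
import numpy as np

# Toy graph (assumed example): edges v1->v2, v2->v3, v3->v1, v3->v4, v5->v6.
# Rows are vertices, columns are edges; +1 at the tail, -1 at the head.
B = np.array([
    [ 1,  0, -1,  0,  0],
    [-1,  1,  0,  0,  0],
    [ 0, -1,  1,  1,  0],
    [ 0,  0,  0, -1,  0],
    [ 0,  0,  0,  0,  1],
    [ 0,  0,  0,  0, -1],
])

N_v, N_e = B.shape
rank_B = np.linalg.matrix_rank(B)
N_z = N_e - rank_B   # dim Ker B: number of independent cycles
N_l = N_v - rank_B   # dim Ker B^T: number of connected components

# Cycle basis: the triangle v1->v2->v3->v1 uses the first three edges.
V = np.array([[1], [1], [1], [0], [0]])

# Component basis: indicator vectors of the two connected components.
U = np.array([[1, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 1, 1]])

BV = B @ V   # delta_1 delta_2, expected to vanish
UB = U @ B   # delta_0 delta_1, expected to vanish
```

Exactness at \(C_{1}(\mathbb {G})\) and \(C_{0}(\mathbb {G})\) then reduces to the rank conditions \(\textrm{rank}\,\mathbb {V}=N_{\mathbb {z}}\) and \(\textrm{rank}\,\mathbb {U}=N_{\mathbb {l}}\), which hold here with \(N_{\mathbb {z}}=1\) and \(N_{\mathbb {l}}=2\).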

Because \(C_{p}(\mathbb {G})\) is a vector space for each \(p\in \{-1,0,1,2\}\), we can consider its dual vector space \(C^{p}(\mathbb {G}):=C_{p}^{*}(\mathbb {G})\) consisting of the linear functions on \(C_{p}(\mathbb {G})\). An element of \(C^{p}(\mathbb {G})\) is called a p-cochain. Let \(\langle \cdot , \cdot \rangle : C_{p}(\mathbb {G}) \times C^{p}(\mathbb {G})\rightarrow \mathbb {R}\) be the standard bilinear pairing of the p-chain and p-cochain defined with the standard basis. The transposes of \(\mathbb {U}\), \(\mathbb {B}\), and \(\mathbb {V}\) induce the differentials between cochains as \(\delta ^{-1}:=\mathbb {U}^{T}: C^{-1}(\mathbb {G})\rightarrow C^{0}(\mathbb {G})\), \(\delta ^{0}:=\mathbb {B}^{T}: C^{0}(\mathbb {G})\rightarrow C^{1}(\mathbb {G})\), and \(\delta ^{1}:=\mathbb {V}^{T}: C^{1}(\mathbb {G})\rightarrow C^{2}(\mathbb {G})\). Each differential \(\delta ^{p}\) on cochains is the adjoint of the differential \(\delta _{p+1}\) on chains, and together they induce the exact cochain sequence:

$$\begin{aligned} 0 \xrightarrow {} C^{-1}(\mathbb {G})\xrightarrow {\delta ^{-1} =\mathbb {U}^{T}}C^{0}(\mathbb {G})\xrightarrow {\delta ^{0} =\mathbb {B}^{T}}C^{1}(\mathbb {G})\xrightarrow {\delta ^{1} =\mathbb {V}^{T}}C^{2}(\mathbb {G})\xrightarrow {}0. \end{aligned}$$
(20)

Note that the definitions of chains, cochains, and differential operators are topological in the sense that we do not include any metric information.

3.2 Chain and cochain complexes on hypergraphs

The definitions of chain and cochain complexes introduced above are algebraically extended to hypergraphs \(\mathbb {H}\) simply by replacing the graph incidence matrix \(\mathbb {B}\) with the hypergraph incidence matrix \(\mathbb {S}\).

Definition 9

(Exact chain and cochain sequences on a hypergraph) The chain and cochain complexes on a hypergraph are defined by the following diagram:

$$\begin{aligned} 0&\xrightarrow {}&C^{-1}(\mathbb {H})&\xrightarrow {\delta ^{-1}=\mathbb {U}^{T}}&C^{0}(\mathbb {H})&\xrightarrow {\delta ^{0}=\mathbb {S}^{T}}&C^{1}(\mathbb {H})&\xrightarrow {\delta ^{1} =\mathbb {V}^{T}}&C^{2}(\mathbb {H})&\xrightarrow {}&0&\\ 0&\xleftarrow {}&C_{-1}(\mathbb {H})&\xleftarrow {\delta _{0}=\mathbb {U}}&C_{0} (\mathbb {H})&\xleftarrow {\delta _{1}=\mathbb {S}}&C_{1}(\mathbb {H})&\xleftarrow {\delta _{2}=\mathbb {V}}&C_{2}(\mathbb {H})&\xleftarrow {}&0&. \end{aligned}$$

where \(C^{-1}(\mathbb {H})\simeq C_{-1}(\mathbb {H})\simeq \mathbb {R}^{N_{\mathbb {l}}}\), \(C^{0}(\mathbb {H})\simeq C_{0}(\mathbb {H})\simeq \mathbb {R}^{N_{\mathbb {X}}}\), \(C^{1}(\mathbb {H})\simeq C_{1}(\mathbb {H})\simeq \mathbb {R}^{N_{\mathbb {e}}}\), and \(C^{2}(\mathbb {H})\simeq C_{2}(\mathbb {H})\simeq \mathbb {R}^{N_{\mathbb {z}}}\).

The bases, \(\mathbb {V}\) and \(\mathbb {U}\), are obtained as integral bases, i.e., the components of \(\mathbb {V}\) and \(\mathbb {U}\) can be chosen from \(\mathbb {Z}\) because \(\mathbb {S}\) is an integer-valued matrix.Footnote 28 As we will explain in Sect. 6 and Sect. 9, the meaning of \(C_{2}(\mathbb {H})\) can be retained as the space on generalized cycles. The meaning of \(C_{-1}(\mathbb {H})\) becomes the space of conserved quantities under the dynamics (Eq. 10).
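For the simplified Brusselator, the dimensions of \(C_{2}(\mathbb {H})\) and \(C_{-1}(\mathbb {H})\) and an integral basis of \(\textrm{Ker}\,\mathbb {S}\) can be obtained directly from the matrix \(\mathbb {S}\) in Eq. 16. The following sketch computes them; the kernel vector is found by hand for this small example rather than by a general algorithm.

```python
import numpy as np

# Hypergraph incidence matrix S of the simplified Brusselator (Eq. 16).
S = np.array([[-1,  1, -1],
              [ 0, -1,  1]])

N_X, N_e = S.shape
rank_S = np.linalg.matrix_rank(S)
N_z = N_e - rank_S   # dim Ker S: generalized cycles
N_l = N_X - rank_S   # dim Ker S^T: conserved quantities

# Integer basis of Ker S, found by hand: one unit of reaction 2 and of reaction 3.
V = np.array([[0], [1], [1]])
SV = S @ V           # expected to vanish
```

Here \(N_{\mathbb {z}}=1\): running the second and third reactions once each leaves \(\varvec{x}\) unchanged. Moreover, \(N_{\mathbb {l}}=0\), so \(\mathbb {U}\) is empty and this network has no conserved quantity.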

3.3 Discrete calculus on graphs and hypergraphs

The p-cochain and p-chain introduced above are an algebraic abstraction of the p-differential form and its Hodge dual on a differentiable manifold [73]. Accordingly, the discrete versions of gradient, divergence, and curl are associated with the differentials (exterior derivatives).

Definition 10

(Discrete gradients, divergences, and curls) The discrete gradient is defined as \(\textrm{grad}_{\mathbb {B}}:=\delta ^{0}=\mathbb {B}^{T}\) for \(\mathbb {G}\) and also as \(\textrm{grad}_{\mathbb {S}}:=\delta ^{0}=\mathbb {S}^{T}\) for \(\mathbb {H}\). The adjoints of the gradients are defined with the corresponding adjoint differentials: \(\textrm{grad}^{*}_{\mathbb {B}}:=\delta _{1}=\mathbb {B}\) and \(\textrm{grad}^{*}_{\mathbb {S}}:=\delta _{1}=\mathbb {S}\). They are called discrete divergences and denoted also as \(\textrm{div}_{\mathbb {B}}=\textrm{grad}^{*}_{\mathbb {B}}\) and \(\textrm{div}_{\mathbb {S}}=\textrm{grad}^{*}_{\mathbb {S}}\).Footnote 29 The discrete curl and its adjoint are defined as \(\textrm{curl}_{\mathbb {V}}:=\delta ^{1}=\mathbb {V}^{T}\) and \(\textrm{curl}^{*}_{\mathbb {V}}:=\delta _{2}=\mathbb {V}\), respectively.

3.4 Linear graph Laplacian dynamics and metric structure in discrete calculus

In the theory of graph Laplacians, each space of p-chains is typically endowed with a metric matrix \(M_{p}\) and its associated inner product. To contrast this metric structure with the Legendre duality introduced later, we briefly outline it here. For an edge-weighted graph \(\mathbb {G}_{\varvec{k}^{\pm }}\) and for the case that \(\varvec{k}^{+}=\varvec{k}^{-}=\varvec{k}\in \mathbb {R}_{>0}^{N_{\mathbb {e}}}\), \(M_{0}=I\) and \(M_{1}=\textrm{diag}[1/\varvec{k}]\) are conventionally employed. With these metric matrices, the graph Laplacian introduced in Eq. 5 can be described as

$$\begin{aligned} \mathcal {L}_{\varvec{k}}=\textrm{div}_{\mathbb {B}} M^{1} \textrm{grad}_{\mathbb {B}} M_{0} \end{aligned}$$
(21)

where \(M^{p}:=M_{p}^{-1}\). By including such metric information, the following pair of metric gradient and divergence is often used in graph theory and network theory: \(\textrm{grad}_{M}:=\sqrt{M^{1}}\mathbb {B}^{T}\) and \(\textrm{div}_{M} :=\mathbb {B}\sqrt{M^{1}}\) where \(\sqrt{M^{1}}:=\textrm{diag}[\sqrt{\varvec{k}}]\). This symmetric graph Laplacian \(\mathcal {L}_{\varvec{k}}\) induces linear dynamics of \(\varvec{x}\in \mathbb {R}^{N_{\mathbb {v}}}\) on the graph via Eq. 6Footnote 30:

$$\begin{aligned} \dot{\varvec{x}}=-\mathcal {L}_{\varvec{k}}\varvec{x}. \end{aligned}$$
(22)

The eigenvalues and eigenvectors of \(\mathcal {L}_{\varvec{k}}\) enable us to obtain spectral information of the underlying graph [92]. Even for nonlinear dynamics on a hypergraph as in Eq. 14, the same symmetric Laplacian can provide some information when \(\varvec{k}^{+}=\varvec{k}^{-}=\varvec{k}\). We can also include other information in the metric matrices such as the degree of vertices [110]. Various normalizations of the graph Laplacian can be attributed to the choice of metrics.
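Equations 21 and 22 can be illustrated on a small weighted graph. The sketch below assumes a path graph with three vertices and arbitrary symmetric weights \(\varvec{k}\), builds \(\mathcal {L}_{\varvec{k}}\) from the metric matrices, and integrates Eq. 22 with a crude explicit Euler scheme. The graph, the weights, and the helper `evolve` are assumptions of this sketch.

```python
import numpy as np

# Path graph v1 - v2 - v3 (assumed example); +1 at the tail, -1 at the head.
B = np.array([[ 1,  0],
              [-1,  1],
              [ 0, -1]], dtype=float)
k = np.array([2.0, 0.5])          # symmetric edge weights k^+ = k^- = k

M0 = np.eye(3)                    # vertex metric M_0 = I
M1_inv = np.diag(k)               # M^1 = M_1^{-1} with M_1 = diag[1/k]
L = B @ M1_inv @ B.T @ M0         # Eq. 21: div_B M^1 grad_B M_0

# The same Laplacian from the metric gradient and divergence.
grad_M = np.diag(np.sqrt(k)) @ B.T
div_M = B @ np.diag(np.sqrt(k))

def evolve(x0, t, steps=10000):
    """Explicit Euler integration of Eq. 22, dx/dt = -L x."""
    x = np.array(x0, dtype=float)
    dt = t / steps
    for _ in range(steps):
        x = x - dt * (L @ x)
    return x

x_final = evolve([3.0, 0.0, 0.0], t=50.0)
```

Because the columns of \(\mathcal {L}_{\varvec{k}}\) sum to zero, the total mass is conserved, and on this connected graph the initial mass relaxes to the uniform vector.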

However, such a choice of metric matrices ends up only with linear dynamics on \(\mathbb {R}^{N_{\mathbb {v}}}\) and is relevant only when the weighting is symmetric: \(\varvec{k}^{+}=\varvec{k}^{-}=\varvec{k}\). In addition, it may not always capture important aspects of the density dynamics such as gradient flow properties and information–theoretic properties, because nonlinear terms such as \(\ln \varvec{p}\) appear in information–theoretic quantities. To extend the class of dynamics being covered and to enable the information-geometric characterization of dynamics, we have to generalize the conventional inner product structure by replacing it with the Legendre dual structure induced by convex functions.

4 Dually flat spaces on vertices and edges and generalized flow

In this section, we introduce two pairs of dually flat spaces (Fig. 1): one is associated with the vertex spaces, i.e., the dual spaces of 0-chains and 0-cochains. The other corresponds to the edge spaces, i.e., the dual spaces of 1-chains and 1-cochains. By combining them, the dynamics on graphs and hypergraphs are characterized as a generalized flow.

4.1 Dually flat spaces on vertices and thermodynamic functions

We work on the density \(\varvec{x}\) and the vertex space for CRN because its reduction to rLDG is straightforward. For a probability vector \(\varvec{p}\), the introduction of dually flat spaces of \(\varvec{p}\) and \(\ln \varvec{p}\) is natural from the information-geometric viewpoint. In CRN, \(\varvec{x}\) is the vector of concentrations of molecular species. As we recently clarified [48], the dually flat spaces, in this case, result from the Legendre duality between extensive and intensive variables in thermodynamics, which is also natural from the physical viewpoint.

Definition 11

(Density space (primal vertex affine space)) The density space (also called primal affine vertex space) is the positive orthant \(\mathcal {X}:=\mathbb {R}_{>0}^{N_{\mathbb {X}}}\) of a vector space \(\mathbb {R}^{N_{\mathbb {X}}}\), which is isomorphic to \(C_{0}(\mathbb {H})\); \(\mathbb {R}^{N_{\mathbb {X}}} \simeq C_{0}(\mathbb {H})\) (Fig. 1, lower left).

Remark 4

The density space \(\mathcal {X}\) is defined as the positive orthant rather than as \(\mathbb {R}_{\ge 0}^{N_{\mathbb {X}}}\). This excludes the cases where some elements of \(\varvec{x}\) become 0. From the viewpoint of information geometry, this restriction is necessary to consider densities with the same support (all \(\varvec{x}\) in \(\mathcal {X}\) should be equivalent in terms of absolute continuity of measures). From the viewpoint of dynamical systems, depending on the specific functional form of the flux \(\varvec{j}(\varvec{x})\), the trajectory \(\varvec{x}(t)\) may not be restricted within \(\mathcal {X}\). The property \(\varvec{x}(t)\) in \(\mathcal {X}\) for \(t\in [0,\infty ]\) is known as persistence.Footnote 31 Without going into this intricate problem, we simply assume that \(\varvec{x}(t) \in \mathcal {X}\) for \(t\in [0,\infty ]\). We call \(\partial \mathcal {X}:=\mathbb {R}^{N_{\mathbb {X}}}_{\ge 0}{\setminus }\mathcal {X}\) the boundary of \(\mathcal {X}\).

We define the dual of the density space by the Legendre transformation via the thermodynamic function:

Definition 12

(Primal thermodynamic function) A strictly convex differentiable function \(\Phi : \mathcal {X}\rightarrow \mathbb {R}\) is called the primal thermodynamic functionFootnote 32Footnote 33 if the following two conditions are satisfied: (1) the associated Legendre transformation

$$\begin{aligned} \partial \Phi : \mathcal {X}&\rightarrow \mathbb {R}^{N_{\mathbb {X}}} \end{aligned}$$
(23)
$$\begin{aligned} \varvec{x}&\longmapsto \varvec{y} :=\partial _{\varvec{x}}\Phi (\varvec{x}) = \left( \frac{\partial \Phi (\varvec{x})}{\partial x_{1}}, \cdots , \frac{\partial \Phi (\varvec{x})}{\partial x_{N_{\mathbb {X}}}}\right) ^{T} \end{aligned}$$
(24)

has the image \(\mathcal {Y}:=\left\{ \varvec{y}|\varvec{y}=\partial \Phi (\varvec{x}),\varvec{x}\in \mathcal {X}\right\} \) being equal to \(\mathbb {R}^{N_{\mathbb {X}}}\), i.e., \(\mathcal {Y}=\mathbb {R}^{N_{\mathbb {X}}}\); (2) for any \(\varvec{x}_{in}\in \mathcal {X}\) and any point on the boundary \(\varvec{x}_{bd}\in \partial \mathcal {X}\),

$$\begin{aligned} \lim _{\lambda \rightarrow +0}\frac{\textrm{d}\Phi (\varvec{x}_{\lambda })}{\textrm{d}\lambda } =- \infty \end{aligned}$$
(25)

holds where \(\varvec{x}_{\lambda }:=\lambda \varvec{x}_{in} + (1-\lambda ) \varvec{x}_{bd}\) for \(\lambda \in [0,1]\).

Definition 13

(Potential space (dual affine vertex space) and dual thermodynamic function) The potential (field) space \(\mathcal {Y}= \mathbb {R}^{N_{\mathbb {X}}}\) (also called the dual affine vertex space) is an affine space dual to \(\mathcal {X}\) with the associated vector space \(C^{0}(\mathbb {H})\) (Fig. 1, upper left).Footnote 34 The dual thermodynamic function \(\Phi ^{*}: \mathcal {Y}\rightarrow \mathbb {R}\) is the Legendre-Fenchel conjugate of the primal thermodynamic function:

$$\begin{aligned} \Phi ^{*}: \mathcal {Y}\rightarrow \mathbb {R}, \quad \varvec{y} \mapsto \Phi ^{*}(\varvec{y}) :=\max _{\varvec{x}'\in \mathcal {X}}\left[ \langle \varvec{x}',\varvec{y} \rangle - \Phi (\varvec{x}') \right] , \end{aligned}$$
(26)

where \(\langle \cdot ,\cdot \rangle : \mathcal {X}\times \mathcal {Y}\rightarrow \mathbb {R}\) is the bilinear pairing under the standard basis. From the properties of the primal function, \(\Phi ^{*}(\varvec{y})\) is also a strictly convex differentiable function. From \(\Phi ^{*}(\varvec{y})\), we have the inverse Legendre transformation \(\partial \Phi ^{*}: \mathcal {Y}\rightarrow \mathcal {X},\,\varvec{y}\mapsto \varvec{x} =\partial _{\varvec{y}}\Phi ^{*}(\varvec{y})\).

The Legendre transformations, \(\partial \Phi \) and \(\partial \Phi ^{*}\), are continuous and establish a bijection between \(\mathcal {X}\) and \(\mathcal {Y}\), where \(\partial \Phi ^{*}=(\partial \Phi )^{-1}\). In the following, we regard a pair \((\varvec{x},\varvec{y})\) with the same decoration as a Legendre dual pair satisfying \(\varvec{y}=\partial \Phi (\varvec{x})\). For such a pair, the Legendre-Fenchel-Young identity holds:

$$\begin{aligned} \Phi (\varvec{x})+\Phi ^{*}(\varvec{y})=\langle \varvec{x},\varvec{y}\rangle . \end{aligned}$$
(27)

Different pairs are distinguished by different decorations, e.g., \((\varvec{x}', \varvec{y}')\) or \((\varvec{x}_{p}, \varvec{y}_{p})\).
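As a concrete illustration of Definitions 12 and 13 and of Eq. 27, the following sketch assumes the separable function \(\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})\), for which \(\varvec{y}=\ln \varvec{x}\) and \(\Phi ^{*}(\varvec{y})=\sum _{i}e^{y_{i}}\). This particular choice is an assumption made for illustration; the function names are hypothetical.

```python
import numpy as np

def Phi(x):
    """Assumed primal thermodynamic function: sum_i (x_i ln x_i - x_i)."""
    return np.sum(x * np.log(x) - x)

def dPhi(x):
    """Legendre transformation x -> y = ln x."""
    return np.log(x)

def Phi_star(y):
    """Legendre-Fenchel conjugate of Phi: sum_i exp(y_i)."""
    return np.sum(np.exp(y))

def dPhi_star(y):
    """Inverse Legendre transformation y -> x = exp(y)."""
    return np.exp(y)

x = np.array([0.2, 1.0, 3.5])
y = dPhi(x)                               # Legendre dual of x

identity_gap = Phi(x) + Phi_star(y) - x @ y   # Eq. 27: expected to be zero
```

The gradient \(\ln \varvec{x}\) maps the positive orthant onto all of \(\mathbb {R}^{N_{\mathbb {X}}}\) and diverges to \(-\infty \) toward the boundary, in line with the two conditions of Definition 12.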

For a thermodynamic function, the Bregman divergence can be defined:

Definition 14

(Bregman divergence [1, 113]) The Bregman divergence on \(\mathcal {X}\) with the generating thermodynamic function \(\Phi (\varvec{x})\) is defined as

$$\begin{aligned} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']:=\Phi (\varvec{x})-\Phi (\varvec{x}') - \langle \varvec{x}-\varvec{x}', \partial \Phi (\varvec{x}') \rangle \in \mathbb {R}_{\ge 0}. \end{aligned}$$
(28)

The non-negativity of the Bregman divergence follows from the Fenchel-Young inequality for products [114, 115]. Furthermore, from the strict convexity of the thermodynamic function, \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']\) is also strictly convex with respect to \(\varvec{x}\) and \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']=0\) if and only if \(\varvec{x}=\varvec{x}'\). Bregman divergences are defined for \((\varvec{y},\varvec{y}')\) and also for \((\varvec{x},\varvec{y}')\) as

$$\begin{aligned} \mathcal {D}^{\mathcal {Y}}_{\Phi ^{*}}[\varvec{y}'\Vert \varvec{y}]&:=\Phi ^{*}(\varvec{y}') -\Phi ^{*}(\varvec{y}) - \langle \partial \Phi ^{*}(\varvec{y}), \varvec{y}'-\varvec{y} \rangle , \end{aligned}$$
(29)
$$\begin{aligned} \mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\Phi ,\Phi ^{*}}[\varvec{x}; \varvec{y}']&:=\Phi (\varvec{x})+\Phi ^{*}(\varvec{y}') - \langle \varvec{x}, \varvec{y}' \rangle . \end{aligned}$$
(30)

Because \((\varvec{x},\varvec{y})\) and \((\varvec{x}', \varvec{y}')\) are Legendre pairs, all the three representations are equivalentFootnote 35: \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']=\mathcal {D}^{\mathcal {Y}}_{\Phi ^{*}}[\varvec{y}'\Vert \varvec{y}]=\mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\Phi ,\Phi ^{*}}[\varvec{x}; \varvec{y}']\).Footnote 36 The divergences \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \varvec{x}']\), \( \mathcal {D}^{\mathcal {Y}}_{\Phi ^{*}}[\varvec{y}'\Vert \varvec{y}]\), and \(\mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\Phi ,\Phi ^{*}}[\varvec{x}; \varvec{y}']\) are abbreviated as \(\mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}']\), \(\mathcal {D}^{\mathcal {Y}}[\varvec{y}'\Vert \varvec{y}]\), and \(\mathcal {D}^{\mathcal {X},\mathcal {Y}}[\varvec{x}; \varvec{y}']\), respectively.
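The equivalence of Eqs. 28 to 30 can be verified numerically. The sketch below again assumes \(\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})\) with \(\varvec{y}=\ln \varvec{x}\), for which the Bregman divergence is the generalized Kullback-Leibler divergence \(\sum _{i}[x_{i}\ln (x_{i}/x'_{i})-x_{i}+x'_{i}]\). The function names are hypothetical.

```python
import numpy as np

def Phi(x):
    """Assumed primal function: sum_i (x_i ln x_i - x_i), with y = ln x."""
    return np.sum(x * np.log(x) - x)

def Phi_star(y):
    """Its Legendre-Fenchel conjugate: sum_i exp(y_i), with x = exp(y)."""
    return np.sum(np.exp(y))

def D_X(x, xp):
    """Eq. 28, using d Phi(x') = ln x'."""
    return Phi(x) - Phi(xp) - (x - xp) @ np.log(xp)

def D_Y(yp, y):
    """Eq. 29, using d Phi*(y) = exp(y)."""
    return Phi_star(yp) - Phi_star(y) - np.exp(y) @ (yp - y)

def D_XY(x, yp):
    """Eq. 30, the mixed representation."""
    return Phi(x) + Phi_star(yp) - x @ yp

def generalized_kl(x, xp):
    """Closed form of the divergence for this choice of Phi."""
    return np.sum(x * np.log(x / xp) - x + xp)

x = np.array([0.2, 1.0, 3.5])
xp = np.array([1.0, 0.5, 2.0])
y, yp = np.log(x), np.log(xp)

values = [D_X(x, xp), D_Y(yp, y), D_XY(x, yp), generalized_kl(x, xp)]
```

All four entries of `values` agree up to rounding and are positive because \(\varvec{x}\ne \varvec{x}'\).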

Finally, the Hessian matrices of the primal and dual thermodynamic functions are defined when they are twice differentiableFootnote 37:

Definition 15

(Hessian matrices) The primal and dual Hessian matrices, \(G_{\varvec{x}}\in \mathbb {R}^{N_{\mathbb {X}}\times N_{\mathbb {X}}}\) and \(G_{\varvec{y}}^{*}\in \mathbb {R}^{N_{\mathbb {X}}\times N_{\mathbb {X}}}\), of thermodynamic functions, \(\Phi (\varvec{x})\) and \(\Phi ^{*}(\varvec{y})\), are defined as

$$\begin{aligned} (G_{\varvec{x}})_{i,j}&:=\frac{\partial ^{2}\Phi (\varvec{x})}{\partial x_{i} \partial x_{j}},&(G_{\varvec{y}}^{*})_{i,j}&:=\frac{\partial ^{2}\Phi ^{*}(\varvec{y})}{\partial y_{i} \partial y_{j}}. \end{aligned}$$
(31)

In addition, they are positive definite and \(G_{\varvec{x}}^{-1}=G_{\varvec{y}}^{*}\) holds for a Legendre dual pair \(\varvec{x}\) and \(\varvec{y}\).

The Hessian matrices induce a Riemannian metric over \(\mathcal {X}\). The tangent and cotangent spaces \(\mathcal {T}_{\varvec{x}}\mathcal {X}\) and \(\mathcal {T}^{*}_{\varvec{x}}\mathcal {X}\) are isomorphic to the cotangent and tangent spaces \(\mathcal {T}^{*}_{\varvec{y}}\mathcal {Y}\) and \(\mathcal {T}_{\varvec{y}}\mathcal {Y}\) over \(\mathcal {Y}\), respectively, and also to \(C_{0}(\mathbb {H})\) and \(C^{0}(\mathbb {H})\): \(\mathcal {T}_{\varvec{x}}\mathcal {X}\cong \mathcal {T}^{*}_{\varvec{y}}\mathcal {Y}\cong C_{0}(\mathbb {H})\) and \(\mathcal {T}^{*}_{\varvec{x}}\mathcal {X}\cong \mathcal {T}_{\varvec{y}}\mathcal {Y}\cong C^{0}(\mathbb {H})\).

The typical example of the duality between \(\varvec{x}\) and \(\varvec{y}\) in statistics is that between a probability vector \(\varvec{p}\) and its logarithm \(\ln \varvec{p}\). Beyond this typical example, depending on the purpose, we adopt different forms of thermodynamic functions \((\Phi (\varvec{x}), \Phi ^{*}(\varvec{y}))\), associated dual variables, and Bregman divergences to endow the inference or estimation methods that we are designing with different properties [1]. In the case of CRN, the thermodynamic functions and Legendre duality are associated with equilibrium thermodynamics [117]. Specifically, as we recently demonstrated [48], \(\mathcal {X}\) and \(\mathcal {Y}\) are the conjugate spaces of the extensive and intensive thermodynamic variables (density of molecules and their chemical potential), \(\Phi (\varvec{x})\) is the thermodynamic potential function of the system, and the Bregman divergence becomes the difference of the total entropy. These correspondences are derived directly from the axiomatic formulation of thermodynamics [48, 117]. The explicit functional form of \(\Phi (\varvec{x})\) is then determined by the physical details of the thermodynamic system that we work on.

Before closing this subsection, we introduce the notion of separability, which will be linked to the locality of the flux.

Definition 16

(Separability of a thermodynamic function) A thermodynamic function \(\Phi (\varvec{x})\) is separable if it can be represented as

$$\begin{aligned} \Phi (\varvec{x})=\sum _{i=1}^{N_{\mathbb {X}}}c_{i}\phi (x_{i}/x_{i}^{o}), \end{aligned}$$
(32)

where \(c_{i}>0\), \(x_{i}^{o}>0\), and \(\phi (x):\mathbb {R}_{>0} \rightarrow \mathbb {R}\) is a scalar primal thermodynamic function.

If \(\Phi (\varvec{x})\) is separable, then its conjugate \(\Phi ^{*}(\varvec{y})\) is also separable as \(\Phi ^{*}(\varvec{y})=\sum _{i=1}^{N_{\mathbb {X}}}c_{i}\phi ^{*} (\frac{x_{i}^{o}}{c_{i}}y_{i})\) where \(\phi ^{*}(y): \mathbb {R}\rightarrow \mathbb {R}\) is the Legendre conjugate of \(\phi (x)\).Footnote 38 If a thermodynamic function is separable, then the corresponding Bregman divergence is separable. The Hessian matrices become diagonal for a separable thermodynamic function. Most of our results hold without separability, but common thermodynamic functions and related quantities are typically separable. For example, the Kullback-Leibler divergence is a separable Bregman divergence.

4.2 Dually flat spaces on edges and dissipation functions

Next, we introduce another dually flat structure onto the edge space of graphs and hypergraphs based on the flux-force relation.

Definition 17

(Flux and force spaces (primal and dual edge spaces)) The flux and force spaces on the edges, \(\mathcal {J}_{\varvec{x}}=\mathbb {R}^{N_{\mathbb {e}}}\) and \(\mathcal {F}_{\varvec{x}}=\mathbb {R}^{N_{\mathbb {e}}}\), are a pair of the primal and dual vector spaces defined for each \(\varvec{x}\in \mathcal {X}\), which are isomorphic to \(C_{1}(\mathbb {H})\) and \(C^{1}(\mathbb {H})\), respectively (Fig. 1, right). The bilinear pairing under the standard basis \(\langle \cdot , \cdot \rangle : C_{1}(\mathbb {H}) \times C^{1}(\mathbb {H}) \rightarrow \mathbb {R}\) is inherited by \((\mathcal {J}_{\varvec{x}}, \mathcal {F}_{\varvec{x}})\).

To introduce Legendre duality on \((\mathcal {J}_{\varvec{x}}, \mathcal {F}_{\varvec{x}})\), we use the dissipation functions:

Definition 18

(Dissipation functionFootnote 39) A dissipation function on \(\mathcal {F}_{\varvec{x}}\), \(\Psi ^{*}_{\varvec{x}}:\mathcal {F}_{\varvec{x}} \rightarrow \mathbb {R}, \varvec{f} \mapsto \Psi ^{*}_{\varvec{x}}(\varvec{f})\), is a strictly convex and continuously differentiable function with respect to \(\varvec{f}\) for all \(\varvec{x} \in \mathcal {X}\) that also satisfies the following additional conditions:

$$\begin{aligned} \text{1-coercive: }{} & {} \frac{\Psi ^{*}_{\varvec{x}}(\varvec{f})}{\Vert \varvec{f}\Vert }&\rightarrow \infty \quad \text{ as } \Vert \varvec{f}\Vert \rightarrow \infty , \end{aligned}$$
(33)
$$\begin{aligned} \text{ Symmetric: }{} & {} \Psi ^{*}_{\varvec{x}}(\varvec{f})&=\Psi ^{*}_{\varvec{x}}(-\varvec{f}), \end{aligned}$$
(34)
$$\begin{aligned} \text{ Bounded } \text{ below } \text{ by } 0:{} & {} \Psi ^{*}_{\varvec{x}}(\varvec{0})&=0. \end{aligned}$$
(35)

Proposition 1

(Duality of dissipation functions) The Legendre-Fenchel conjugate of \(\Psi _{\varvec{x}}^{*}(\varvec{f})\), i.e., \(\Psi _{\varvec{x}}(\varvec{j}) :=\max _{\varvec{f}}\left[ \langle \varvec{j},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] \), is also the dissipation function on \(\mathcal {J}_{\varvec{x}}\). \(\Psi _{\varvec{x}}(\varvec{j})\) and \(\Psi ^{*}_{\varvec{x}}(\varvec{f})\) are called primal and dual dissipation functions.

Proof

For each \(\varvec{x}\in \mathcal {X}\), the function \(\Psi _{\varvec{x}}(\varvec{j})\) is strictly convex, continuously differentiable, 1-coercive, and finite, i.e., \(\Psi _{\varvec{x}}(\varvec{j})<+\infty \) for all \(\varvec{j}\in \mathcal {J}_{\varvec{x}}\), because \(\Psi _{\varvec{x}}^{*}(\varvec{f})\) has the same properties (see Corollary 4.1.4 in [118]). For \(\varvec{j}\in \mathcal {J}_{\varvec{x}}\), the symmetry holds as \(\Psi _{\varvec{x}}(-\varvec{j}) = \max _{\varvec{f}}\left[ \langle -\varvec{j},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] = \max _{\varvec{f}}\left[ \langle \varvec{j},-\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] = \max _{\varvec{f}}\left[ \langle \varvec{j},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(-\varvec{f}) \right] = \max _{\varvec{f}}\left[ \langle \varvec{j},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] =\Psi _{\varvec{x}}(\varvec{j})\). From the convexity and symmetry, the minimum of \(\Psi _{\varvec{x}}(\varvec{j})\) is attained at \(\varvec{j}=\varvec{0}\) and \(\min _{\varvec{j}}\Psi _{\varvec{x}}(\varvec{j}) = \Psi _{\varvec{x}}(\varvec{0}) =\max _{\varvec{f}}\left[ \langle \varvec{0},\varvec{f} \rangle - \Psi ^{*}_{\varvec{x}}(\varvec{f}) \right] =-\min _{\varvec{f}}\Psi ^{*}_{\varvec{x}}(\varvec{f})=0\). \(\square \)

From these properties, for each \(\varvec{x}\in \mathcal {X}\), the one-to-one Legendre duality via Legendre transformations is established over the whole of \((\mathcal {J}_{\varvec{x}}, \mathcal {F}_{\varvec{x}})\):

$$\begin{aligned} \varvec{j}&=\partial _{\varvec{f}} \Psi ^{*}_{\varvec{x}}(\varvec{f}),&\varvec{f}&=\partial _{\varvec{j}} \Psi _{\varvec{x}}(\varvec{j}). \end{aligned}$$
(36)

In the following, we abbreviate the Legendre transformations as \(\partial _{\varvec{f}} \Psi ^{*}_{\varvec{x}}(\varvec{f})=\partial \Psi ^{*}_{\varvec{x}}(\varvec{f})\) and \(\partial _{\varvec{j}} \Psi _{\varvec{x}}(\varvec{j})=\partial \Psi _{\varvec{x}}(\varvec{j})\)Footnote 40. Similarly to the Legendre dual pair \((\varvec{x},\varvec{y})\) in \(\mathcal {X}\) and \(\mathcal {Y}\), a pair of flux and force with the same decoration, e.g., \((\varvec{j},\varvec{f})_{\varvec{x}}\) or \((\varvec{j}_{0},\varvec{f}_{0})_{\varvec{x}}\), represents a Legendre dual pair linked by Eq. 36 at \(\varvec{x}\). We omit the \(\varvec{x}\)-dependency for simplicity. The Legendre dual pair \((\varvec{j},\varvec{f})\) satisfies the Legendre-Fenchel-Young identity for each \(\varvec{x}\in \mathcal {X}\):

$$\begin{aligned} \Psi ^{*}_{\varvec{x}}(\varvec{f})+ \Psi _{\varvec{x}}(\varvec{j})-\langle \varvec{j},\varvec{f}\rangle =0. \end{aligned}$$
(37)

Furthermore, the additional conditions of dissipation functions enable the Legendre duality to work as an extension of a Riemannian metric structure:

Proposition 2

([75]) The Legendre transformations satisfy the following properties:

$$\begin{aligned}&\text{ Pairing } \text{ of } \varvec{0}\in \mathcal {J} \text{ and } \varvec{0}\in \mathcal {F}: \varvec{0}=\partial \Psi ^{*}_{\varvec{x}}(\varvec{0}),\quad \varvec{0}=\partial \Psi _{\varvec{x}}(\varvec{0}) \end{aligned}$$
(38)
$$\begin{aligned}&\text{ Symmetry: } -\varvec{f}=\partial \Psi _{\varvec{x}}(-\varvec{j}),\, \quad -\varvec{j}=\partial \Psi ^{*}_{\varvec{x}}(-\varvec{f}). \end{aligned}$$
(39)
$$\begin{aligned}&\text{ Nonnegativity } \text{ of } \text{ bilinear } \text{ pairing: } \langle \varvec{j},\varvec{f}\rangle = \Psi ^{*}_{\varvec{x}}(\varvec{f})+ \Psi _{\varvec{x}}(\varvec{j})\ge 0. \end{aligned}$$
(40)

The first property means that zero force \(\varvec{f}=\varvec{0}\) and zero flux \(\varvec{j}=\varvec{0}\) are always Legendre dual regardless of \(\varvec{x}\), and the second one indicates that if \((\varvec{j},\varvec{f})\) is a Legendre dual pair, then \((-\varvec{j},-\varvec{f})\) is as well.Footnote 41 The third property, together with the nonnegativity of the dissipation functions, enables them to play roles similar to that of the metric-induced norm in Riemannian geometry.Footnote 42
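A non-quadratic example makes Eqs. 36 to 40 concrete. The sketch below assumes the separable function \(\Psi ^{*}(\varvec{f})=\sum _{e}2\omega _{e}[\cosh (f_{e}/2)-1]\) with fixed positive weights \(\omega _{e}\), standing in for any \(\varvec{x}\)-dependence. It is strictly convex, symmetric, 1-coercive, and vanishes at \(\varvec{0}\), and its conjugate has the closed form used in `Psi`. The weights and function names are assumptions of this sketch.

```python
import numpy as np

omega = np.array([0.5, 1.0, 2.0])   # assumed positive edge weights

def Psi_star(f):
    """Assumed dual dissipation function: sum_e 2 w_e (cosh(f_e / 2) - 1)."""
    return np.sum(2.0 * omega * (np.cosh(f / 2.0) - 1.0))

def dPsi_star(f):
    """Legendre transformation f -> j = w sinh(f / 2), first part of Eq. 36."""
    return omega * np.sinh(f / 2.0)

def Psi(j):
    """Conjugate primal dissipation function in closed form."""
    u = j / omega
    return np.sum(2.0 * j * np.arcsinh(u)
                  - 2.0 * omega * (np.sqrt(1.0 + u**2) - 1.0))

def dPsi(j):
    """Inverse Legendre transformation j -> f = 2 arcsinh(j / w)."""
    return 2.0 * np.arcsinh(j / omega)

f = np.array([-1.0, 0.3, 2.0])
j = dPsi_star(f)                             # Legendre dual flux of f

young_gap = Psi_star(f) + Psi(j) - j @ f     # Eq. 37: expected to be zero
pairing = j @ f                              # Eq. 40: expected to be nonnegative
```

Because the map from force to flux is a hyperbolic sine rather than a linear map, this pair is not induced by any inner product, yet it still satisfies the three properties of Proposition 2.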

With the dissipation functions, \(\Psi _{\varvec{x}}(\varvec{j})\) and \(\Psi ^{*}_{\varvec{x}}(\varvec{f})\), we now have the second dually flat structure on the edge spaces \((\mathcal {J}_{\varvec{x}},\mathcal {F}_{\varvec{x}})\). On these dually flat spaces, we define the Bregman divergence and Hessian matrices:

Definition 19

(Bregman divergence and Hessian matrices on the edge spaces) For each \(\varvec{x}\in \mathcal {X}\), the Bregman divergence between \(\varvec{j}\in \mathcal {J}_{\varvec{x}}\) and \(\varvec{f}'\in \mathcal {F}_{\varvec{x}}\) is defined as

$$\begin{aligned} \mathcal {D}_{\varvec{x}}^{\mathcal {J},\mathcal {F}}[\varvec{j};\varvec{f}']:=\Psi _{\varvec{x}}(\varvec{j})+\Psi ^{*}_{\varvec{x}}(\varvec{f}')-\langle \varvec{j}, \varvec{f}'\rangle . \end{aligned}$$
(41)

\(\mathcal {D}_{\varvec{x}}^{\mathcal {J}}[\varvec{j}\Vert \varvec{j}']\) and \(\mathcal {D}_{\varvec{x}}^{\mathcal {F}}[\varvec{f}\Vert \varvec{f}']\) are also defined analogously to the Bregman divergence on the vertex space \((\mathcal {X}, \mathcal {Y})\). For a Legendre conjugate pair of twice differentiable dissipation functions, the Hessian matrices, \(G_{\varvec{x},\varvec{j}}\) and \(G_{\varvec{x},\varvec{f}}^{*}\), are defined as

$$\begin{aligned} (G_{\varvec{x},\varvec{j}})_{e,e'}&:=\frac{\partial ^{2}\Psi _{\varvec{x}}(\varvec{j})}{\partial j_{e} \partial j_{e'}},&(G_{\varvec{x},\varvec{f}}^{*})_{e,e'}&:=\frac{\partial ^{2}\Psi _{\varvec{x}}^{*}(\varvec{f})}{\partial f_{e} \partial f_{e'}}. \end{aligned}$$
(42)

These matrices are positive-definite.

The Legendre dual structure via the dissipation functions provides an extension of a Riemannian metric structure in the following sense. If the dissipation function is a quadratic function, i.e., a positive definite quadratic form as

$$\begin{aligned} \Psi ^{q,*}_{\varvec{x}}(\varvec{f}):=\frac{1}{2}\langle \varvec{f},M^{*}_{\varvec{x}} \varvec{f} \rangle , \end{aligned}$$
(43)

where \(M^{*}_{\varvec{x}}\) is a positive definite \(N_{\mathbb {e}}\times N_{\mathbb {e}}\) matrix, the Legendre transformation is reduced to the linear mapping \(\varvec{j}=\partial \Psi ^{q,*}_{\varvec{x}}(\varvec{f})=M^{*}_{\varvec{x}}\varvec{f}\)Footnote 43. Then, the bilinear pairing, \(\langle \varvec{j},\varvec{f}'\rangle = \langle \varvec{j},M_{\varvec{x}}\varvec{j}'\rangle =\langle M^{*}_{\varvec{x}}\varvec{f},\varvec{f}'\rangle \), becomes the inner product under the metric matrix \(M_{\varvec{x}}\) where \(M_{\varvec{x}}=(M_{\varvec{x}}^{*})^{-1}\). The dissipation functions are associated with the induced norms: \(\Psi ^{*}_{\varvec{x}}(\varvec{f})=\frac{1}{2}\Vert \varvec{f}\Vert _{M^{*}_{\varvec{x}}}^{2}\), \(\Psi _{\varvec{x}}(\varvec{j})=\frac{1}{2}\Vert \varvec{j}\Vert _{M_{\varvec{x}}}^{2}\). The Bregman divergence is reduced to the norm-induced squared distance: \(\mathcal {D}_{\varvec{x}}^{\mathcal {J},\mathcal {F}}[\varvec{j};\varvec{f}']=\frac{1}{2}\Vert \varvec{j}-\varvec{j}'\Vert _{M_{\varvec{x}}}^{2}=\frac{1}{2}\Vert \varvec{f}-\varvec{f}'\Vert _{M^{*}_{\varvec{x}}}^{2}\).
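The reduction of the Bregman divergence (Eq. 41) to a squared distance in the quadratic case can be checked numerically. The following is a minimal sketch, assuming a hypothetical symmetric positive definite \(2\times 2\) matrix `M_star` and arbitrary test vectors; none of these values come from the models discussed in this work.

```python
# Quadratic dissipation functions on a two-edge space (illustrative values only).

def matvec(M, v):
    """Matrix-vector product for a nested-list matrix."""
    return [sum(row[c] * v[c] for c in range(len(v))) for row in M]

def dot(u, v):
    """Bilinear pairing <u, v>."""
    return sum(a * b for a, b in zip(u, v))

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

M_star = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical M*_x (symmetric, positive definite)
M = inverse_2x2(M_star)            # M_x = (M*_x)^{-1}

def psi_star(f):
    """Quadratic dissipation function of Eq. 43."""
    return 0.5 * dot(f, matvec(M_star, f))

def psi(j):
    """Legendre conjugate of psi_star."""
    return 0.5 * dot(j, matvec(M, j))

def bregman(j, f_prime):
    """Bregman divergence of Eq. 41 between a flux and a force."""
    return psi(j) + psi_star(f_prime) - dot(j, f_prime)

f = [1.0, -2.0]
f_prime = [0.3, 0.7]
j = matvec(M_star, f)              # Legendre dual of f
j_prime = matvec(M_star, f_prime)  # Legendre dual of f_prime

dj = [a - b for a, b in zip(j, j_prime)]
df = [a - b for a, b in zip(f, f_prime)]
half_sq_norm_j = 0.5 * dot(dj, matvec(M, dj))
half_sq_norm_f = 0.5 * dot(df, matvec(M_star, df))
divergence = bregman(j, f_prime)
```

In this sketch, `divergence`, `half_sq_norm_j`, and `half_sq_norm_f` agree up to rounding, and `bregman(j, f)` vanishes because \((\varvec{j},\varvec{f})\) is a Legendre dual pair.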

Finally, we also introduce the notion of separability to the dissipation functions:

Definition 20

(Separability and locality of dissipation functions) A dissipation function \(\Psi ^{*}_{\varvec{x}}(\varvec{f})\) is separable if it can be represented as

$$\begin{aligned} \Psi ^{*}_{\varvec{x}}(\varvec{f})=\sum _{e=1}^{N_{\mathbb {e}}}\omega _{e}(\varvec{x}) \psi ^{*}(f_{e}/f^{o}_{e}(\varvec{x})), \end{aligned}$$
(44)

where \(\omega _{e}(\varvec{x})>0\) and \(f^{o}_{e}(\varvec{x})>0\) for \(\varvec{x}\in \mathcal {X}\) are positive weights and \(\psi ^{*}(f):\mathbb {R}\rightarrow \mathbb {R}\) is a scalar dissipation function, i.e., a strictly convex differentiable scalar function satisfying Eq. 34, Eq. 35, and Eq. 33. If \(\omega _{e}(\varvec{x})\) and \(f^{o}_{e}(\varvec{x})\) are additionally local, then the dissipation function is separable and local. If \(\Psi ^{*}_{\varvec{x}}(\varvec{f})\) is separable, then its dual \(\Psi _{\varvec{x}}(\varvec{j})\) is also separable. The same is true for the locality.

Remark 5

(Young functions and N-functions) The scalar dissipation function is an N-function, which appears in the theory of Orlicz spaces. A function \(\tilde{\psi }(j): [0,\infty ) \rightarrow [0,\infty ]\) represented as \(\tilde{\psi }(j)=\int _{0}^{j}\varsigma (j')\textrm{d}j'\) is called a Young function, where \(\varsigma (j):[0,\infty )\rightarrow [0,\infty ]\) is a non-decreasing function that satisfies \(\varsigma (0)=0\) and is left-continuous on \((0,\infty )\). If \(\varsigma (j)\) additionally satisfies \(0<\varsigma (j)<+\infty \) for \(0<j<\infty \), \(\lim _{j\rightarrow +0} \varsigma (j)=0\), and \(\lim _{j\rightarrow \infty } \varsigma (j)=+\infty \), then \(\tilde{\psi }(j)\) is called an N-function. If we define a function \(\psi (j)\) from an N-function \(\tilde{\psi }(j)\) as \(\psi (j)=\tilde{\psi }(|j|)\), it becomes a scalar dissipation function [119]. A separable dissipation function (Eq. 44) is often called a weighted N-function [120, 121]. The dissipation function and the induced Legendre duality are, therefore, related to Birnbaum-Orlicz spaces, which are an extension of \(L^{p}\) spaces.

4.3 Generalized flow on graphs and hypergraphs and its steady state

Because of the one-to-one Legendre duality between \((\varvec{j},\varvec{f})_{\varvec{x}}\), the continuity equation (Eq. 10) can be represented as a generalized flow driven by the force \(\varvec{f}(\varvec{x})\) dual to \(\varvec{j}(\varvec{x})\) [77]:

Definition 21

(Generalized flow) A curve \(\varvec{x}(t)\) is a generalized flow on \(\mathbb {H}\) driven by force \(\varvec{f}(\varvec{x})\) under the dissipation function \(\Psi ^{*}_{\varvec{x}}\) if it can be represented as

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \varvec{j}(\varvec{x})=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]. \end{aligned}$$
(45)

This representation is independent of the specific functional forms of \(\varvec{f}(\varvec{x})\) and \(\Psi ^{*}_{\varvec{x}}(\varvec{f})\), and also of the definition of \(\textrm{div}_{\mathbb {S}}\), as long as the generated \(\varvec{j}(\varvec{x})\) is consistent with \(\mathbb {H}\)Footnote 44\(^{,}\)Footnote 45. Thus, we can potentially apply this framework to various systems by choosing these functions appropriately depending on the system or the problem we work on.

The generalized flow naturally encompasses three types of steady states:

Definition 22

(Steady state, complex-balanced state, and detailed-balanced state) We define the manifolds of steady state \(\mathcal {M}^{\textrm{ST}}\), complex-balanced (CB) state \(\mathcal {M}^{\textrm{CB}}\), and detailed-balanced (DB) state \(\mathcal {M}^{\textrm{DB}}\), respectively, as follows:

$$\begin{aligned} \mathcal {M}^{\textrm{ST}}&:=\{\varvec{x} \in \mathcal {X}| \mathbb {S}\varvec{j}(\varvec{x})=0\}, \end{aligned}$$
(46)
$$\begin{aligned} \mathcal {M}^{\textrm{CB}}&:=\{\varvec{x} \in \mathcal {X}| \mathbb {B}\varvec{j}(\varvec{x})=0\}, \end{aligned}$$
(47)
$$\begin{aligned} \mathcal {M}^{\textrm{DB}}&:=\{\varvec{x} \in \mathcal {X}| \varvec{j}(\varvec{x})=0\}=\{\varvec{x} \in \mathcal {X}| \varvec{f}(\varvec{x})=0\}, \end{aligned}$$
(48)

where we used \(\varvec{j}(\varvec{x})=0\) iff \(\varvec{f}(\varvec{x})=0\) from the properties of the dissipation functions. The relations \(\varvec{j}(\varvec{x})=0\) and \(\mathbb {B}\varvec{j}(\varvec{x})=0\) are called the detailed-balanced (DB) condition and the complex-balanced (CB) condition, respectively. From the decomposition of \(\mathbb {S}\) through \(\mathbb {B}\), under which \(\mathbb {B}\varvec{j}(\varvec{x})=0\) implies \(\mathbb {S}\varvec{j}(\varvec{x})=0\), an inclusion relation holds: \(\mathcal {M}^{\textrm{DB}} \subseteq \mathcal {M}^{\textrm{CB}} \subseteq \mathcal {M}^{\textrm{ST}}\). It should be noted that, depending on the details of \(\varvec{j}(\varvec{x})\), these manifolds can be empty.

A steady state is a state at which \(\dot{\varvec{x}}=0\) holds. The DB condition \(\varvec{j}(\varvec{x})=\varvec{0}\) means that all the fluxes are zero at \(\varvec{x}\). In other words, all the forward and reverse fluxes are balanced at \(\varvec{x}\), i.e., \(j_{e}^{+}(\varvec{x})=j_{e}^{-}(\varvec{x})\). The CB condition is equivalent to the balance of all influx and outflux at each hypervertex of \(\mathbb {H}\). As we will see later, DB states are tightly linked to the equilibrium state and equilibrium flow. The CB state is relevant as an extension of the equilibrium state to nonequilibrium flows.

4.4 Generalized gradient flow and De Giorgi’s formulation

When \(\varvec{f}(\varvec{x})\) can be represented as a gradient, i.e., \(\varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})\) of a function \(\mathcal {F}(\varvec{x})\in \mathbb {R}\) on the density space, Eq. 45 is reduced to the generalized gradient flow of \(\mathcal {F}(\varvec{x})\).

Definition 23

(Generalized gradient flow) \(\varvec{x}(t)\) is a generalized gradient flow when it is a generalized flow driven by a gradient force of \(\mathcal {F}(\varvec{x})\), i.e., \(\varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})\) and

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \varvec{j}(\varvec{x})=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})]. \end{aligned}$$
(49)

The following proposition ensures that the generalized gradient flow behaves like the conventional gradient flow:

Proposition 3

(\(\mathcal {F}(\varvec{x})\) is non-increasing along the trajectory of generalized gradient flow) For a trajectory \(\{\varvec{x}_{t}\}_{t \in [0,\tau ]}\) of the generalized gradient flow of \(\mathcal {F}(\varvec{x})\), \(\mathcal {F}(\varvec{x}_{t})\) is always decreasing except at the DB states \(\mathcal {M}^{\textrm{DB}}\). In addition, all the steady states of the generalized gradient flow are the DB states, i.e., \(\mathcal {M}^{\textrm{ST}}=\mathcal {M}^{\textrm{DB}}\)Footnote 46.

Proof

\(\mathcal {F}(\varvec{x}_{t})\) is non-increasing over time as follows:

$$\begin{aligned} \dot{\mathcal {F}}(\varvec{x}_{t})&=\langle \dot{\varvec{x}} , \!\partial _{\varvec{x}} \mathcal {F}(\varvec{x})\rangle \!=\! -\langle \textrm{div}_{\mathbb {S}} \!\partial \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})] , \!\partial _{\varvec{x}} \mathcal {F}(\varvec{x})\rangle \nonumber \\&= -\langle \partial \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})] , \textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}} \mathcal {F}(\varvec{x})\rangle \nonumber \\&=-\langle \varvec{j}(\varvec{x}),\varvec{f}(\varvec{x})\rangle = -\left( \Psi _{\varvec{x}}[\varvec{j}(\varvec{x})] + \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]\right) \le 0, \end{aligned}$$
(50)

where Eq. 40 is used. The equality holds iff \(\varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})=0\) because \(\Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]=\Psi _{\varvec{x}}[\varvec{j}(\varvec{x})]=0\) iff \(\varvec{f}(\varvec{x})=\varvec{j}(\varvec{x})=0\). Thus, \(\dot{\mathcal {F}}(\varvec{x}_{t})=0\) iff \(\varvec{x}_{t}\in \mathcal {M}^{\textrm{DB}}\). Because \(\dot{\varvec{x}}_{t}=0 \Rightarrow \dot{\mathcal {F}}(\varvec{x}_{t})=0\), \(\mathcal {M}^{\textrm{ST}}=\mathcal {M}^{\textrm{DB}}\). \(\square \)

It should be noted that, even if \(\mathcal {F}(\varvec{x})\) has a single minimum, the steady state \(\varvec{x}_{st}:=\lim _{t \rightarrow \infty }\varvec{x}(t)\) may not be the minimum, because \(\dot{\mathcal {F}}(\varvec{x}_{t})=0\) holds for any \(\varvec{x}\in \mathcal {M}^{\textrm{DB}}\).Footnote 47
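The monotonicity stated in Prop. 3 can be illustrated with a small simulation. The sketch below rests on several assumptions that are not fixed by this section: the divergence and gradient act as \(\textrm{div}_{\mathbb {S}}\varvec{j}=\mathbb {S}\varvec{j}\) and \(\textrm{grad}_{\mathbb {S}}\varvec{y}=\mathbb {S}^{T}\varvec{y}\) for the incidence matrix of a hypothetical three-vertex path graph, \(\mathcal {F}\) is a generalized KL divergence to a reference point `x_tilde`, the scalar dissipation function is the illustrative choice \(\psi ^{*}(f)=\cosh (f)-1\), and time is discretized with an explicit Euler step.

```python
import math

# Assumed conventions: div_S j = S j and grad_S y = S^T y.
# S is the incidence matrix of a hypothetical path graph v1 - v2 - v3.
S = [[1.0, 0.0],
     [-1.0, 1.0],
     [0.0, -1.0]]
x_tilde = [1.0, 1.0, 1.0]  # illustrative reference point

def gen_kl(x, x_ref):
    """Generalized KL divergence, used here as the function F."""
    return sum(xi * math.log(xi / ri) - xi + ri for xi, ri in zip(x, x_ref))

def force(x):
    """Gradient force f = S^T dF/dx with dF/dx = ln(x / x_tilde)."""
    y = [math.log(xi / ri) for xi, ri in zip(x, x_tilde)]
    return [sum(S[i][e] * y[i] for i in range(3)) for e in range(2)]

def flux(f):
    """Legendre dual flux j = d psi*/df for psi*(f) = cosh(f) - 1."""
    return [math.sinh(fe) for fe in f]

def velocity(x):
    """Right-hand side of the generalized gradient flow, dx/dt = -S j."""
    j = flux(force(x))
    return [-sum(S[i][e] * j[e] for e in range(2)) for i in range(3)]

def simulate(x0, dt=1e-3, steps=2000):
    """Explicit Euler integration; returns the final state and the values of F."""
    x = list(x0)
    history = [gen_kl(x, x_tilde)]
    for _ in range(steps):
        v = velocity(x)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        history.append(gen_kl(x, x_tilde))
    return x, history

x_final, F_history = simulate([2.0, 0.5, 0.5])
```

For this step size the recorded values of \(\mathcal {F}\) do not increase along the discrete trajectory, and the total mass \(\varvec{1}^{T}\varvec{x}\) is conserved because the columns of the assumed \(\mathbb {S}\) sum to zero.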

The generalized gradient flow of this form (Eq. 49) was devised in the process to extend the conventional gradient flow to metric spaces [122, 123].Footnote 48 Furthermore, dissipation functions have been recognized since the seminal work of Onsager [124,125,126]. However, only quadratic dissipation functions have been investigated until very recently [75,76,77,78,79,80,81,82]. This may be partly because we lack an adequate geometric language to handle the non-quadratic cases, i.e., information geometry. Actually, if the dissipation function is quadratic \(\Psi ^{q,*}_{\varvec{x}}[\varvec{f}]\) as in Eq. 43, then the generalized flow (Eq. 45) formally reduces to the flow on a Riemannian manifold with the metric \((\mathbb {S}M^{*}_{\varvec{x}}\mathbb {S}^{T})^{-1}\).

The non-positivity of \(\dot{\mathcal {F}}(\varvec{x}_{t})\) is essentially attributed to the fact that \(\dot{\mathcal {F}}(\varvec{x}_{t})=-\langle \varvec{j}(\varvec{x}),\varvec{f}(\varvec{x})\rangle \) holds in Eq. 50 for the generalized gradient flow. The converse also holds.

Proposition 4

(De Giorgi’s formulation of generalized gradient flow [75, 79]) Let \(\varvec{x}_{t}\) be a generalized flow induced by a force \(\varvec{f}(\varvec{x})\). \(\varvec{x}_{t}\) is the generalized gradient flow of \(\mathcal {F}(\varvec{x})\) iff

$$\begin{aligned} \dot{\mathcal {F}}(\varvec{x}_{t})&=-\langle \varvec{j}(\varvec{x}),\varvec{f}(\varvec{x})\rangle = -\left( \Psi _{\varvec{x}}[\varvec{j}(\varvec{x})] + \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]\right) \end{aligned}$$
(51)

holds. The integral form of Eq. 51,

$$\begin{aligned} \mathcal {F}(\varvec{x}_{0})-\mathcal {F}(\varvec{x}_{t})=\int _{0}^{t} \left[ \Psi ^{*}_{\varvec{x}_{t'}}(\varvec{f}(\varvec{x}_{t'}))+ \Psi _{\varvec{x}_{t'}}(\varvec{j}(\varvec{x}_{t'}))\right] \textrm{d}t', \end{aligned}$$
(52)

is called De Giorgi’s \((\Psi ,\Psi ^{*})\)-formulation of generalized gradient flow.

Proof

For a generalized flow \(\varvec{x}_{t}\) driven by force \(\varvec{f}(\varvec{x})\) as in Eq. 45 and for any \(\mathcal {F}(\varvec{x})\), the following inequality holds:

$$\begin{aligned} \dot{\mathcal {F}}(\varvec{x}_{t})=\left\langle \dot{\varvec{x}},\frac{\partial \mathcal {F}(\varvec{x}_{t})}{\partial \varvec{x}} \right\rangle&=\left\langle -\varvec{j}(\varvec{x}_{t}),\textrm{grad}_{\mathbb {S}}\frac{\partial \mathcal {F}(\varvec{x}_{t})}{\partial \varvec{x}} \right\rangle \end{aligned}$$
(53)
$$\begin{aligned}&=-\left[ \Psi _{\varvec{x}_{t}}(\varvec{j}(\varvec{x}_{t})) + \Psi ^{*}_{\varvec{x}_{t}}(\varvec{f}'(\varvec{x}_{t}))\right] \nonumber \\&\quad + \mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}_{t}}[\varvec{j}(\varvec{x}_{t});\varvec{f}'(\varvec{x}_{t})] \end{aligned}$$
(54)
$$\begin{aligned}&\ge -\left[ \Psi _{\varvec{x}_{t}}(\varvec{j}(\varvec{x}_{t})) + \Psi ^{*}_{\varvec{x}_{t}}(\varvec{f}'(\varvec{x}_{t}))\right] , \end{aligned}$$
(55)

where we define \(\varvec{f}'(\varvec{x}):=\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}(\varvec{x})\). The last inequality becomes an equality if and only if \(\varvec{f}'(\varvec{x}_{t})\) is the Legendre dual of \(\varvec{j}(\varvec{x}_{t})\),Footnote 49 i.e.,

$$\begin{aligned} \mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}_{t}}[\varvec{j}(\varvec{x}_{t});\varvec{f}'(\varvec{x}_{t})]=0 \Longleftrightarrow \varvec{j}(\varvec{x}_{t})=\partial \Psi ^{*}_{\varvec{x}_{t}}[\varvec{f}'(\varvec{x}_{t})] \end{aligned}$$
(56)

Thus, Eq. 51 holds only when \(\varvec{x}_{t}\) is the generalized gradient flow of \(\mathcal {F}(\varvec{x})\). \(\square \)

De Giorgi’s formulation is a well-established approach for defining gradient flow in metric spaces [122].
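The integral form (Eq. 52) can be checked approximately on a discretized trajectory. The following sketch assumes a hypothetical system of two vertices joined by one edge, the conventions \(\textrm{div}_{\mathbb {S}}\varvec{j}=\mathbb {S}\varvec{j}\) and \(\textrm{grad}_{\mathbb {S}}\varvec{y}=\mathbb {S}^{T}\varvec{y}\) with \(\mathbb {S}=(1,-1)^{T}\), a generalized KL divergence as \(\mathcal {F}\), and the illustrative Legendre pair \(\psi ^{*}(f)=\cosh (f)-1\), \(\psi (j)=j\sinh ^{-1}(j)-\sqrt{1+j^{2}}+1\). The time integral is replaced by a left Riemann sum along an explicit Euler trajectory, so the two sides agree only up to a discretization error.

```python
import math

# Two vertices joined by one edge; all numerical values are illustrative.
x_tilde = [1.0, 1.0]

def free_energy(x):
    """Generalized KL divergence to x_tilde, playing the role of F."""
    return sum(xi * math.log(xi / ri) - xi + ri for xi, ri in zip(x, x_tilde))

def psi_star(f):
    """Illustrative scalar dissipation function of the force."""
    return math.cosh(f) - 1.0

def psi(j):
    """Legendre conjugate of psi_star."""
    return j * math.asinh(j) - math.sqrt(1.0 + j * j) + 1.0

def edge_force(x):
    """Gradient force on the single edge, f = S^T ln(x / x_tilde)."""
    return math.log(x[0] / x_tilde[0]) - math.log(x[1] / x_tilde[1])

def de_giorgi_balance(x0, dt=1e-3, steps=3000):
    """Return (F(x_0) - F(x_t), Riemann sum of psi + psi*) along the flow."""
    x = list(x0)
    dissipated = 0.0
    for _ in range(steps):
        f = edge_force(x)
        j = math.sinh(f)                     # j = d psi*/df
        dissipated += (psi(j) + psi_star(f)) * dt
        x = [x[0] - dt * j, x[1] + dt * j]   # dx/dt = -S j
    return free_energy(x0) - free_energy(x), dissipated

drop, dissipated = de_giorgi_balance([1.5, 0.5])
```

With this step size the decrease of \(\mathcal {F}\) and the accumulated dissipation differ by a small relative amount, which shrinks as `dt` is reduced.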

4.5 Equilibrium and nonequilibrium flow

In this work, we mainly focus on the case that \(\mathcal {F}(\varvec{x})= \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\) where \(\mathcal {D}^{\mathcal {X}}_{\Phi }\) is the Bregman divergence associated with a thermodynamic function \(\Phi \).

Definition 24

(Equilibrium force, equilibrium flux, and equilibrium flow) The force generated by the gradient of Bregman divergence associated with a thermodynamic function \(\Phi \) is called the (thermodynamic) equilibrium force, and the following equation is denoted as the thermodynamic gradient equation:

$$\begin{aligned} \varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}], \end{aligned}$$
(57)

where \(\tilde{\varvec{x}}\in \mathcal {X}\) is a parameter. The dual of \(\varvec{f}(\varvec{x})\), i.e., \(\varvec{j}(\varvec{x})=\partial \Psi ^{*}_{\varvec{x}}[\varvec{f}(\varvec{x})]\), is called the equilibrium flux. A generalized flow \(\varvec{x}(t)\) is an equilibrium flow if it is driven by the equilibrium force:

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]]. \end{aligned}$$
(58)

Using the relation \(\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]=\partial \Phi (\varvec{x}) - \tilde{\varvec{y}}\) where \(\tilde{\varvec{y}}=\partial \Phi (\tilde{\varvec{x}})\), Eq. 58 can be rewritten as

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}}\left[ \partial \Psi ^{*}_{\varvec{x}}\left[ \textrm{grad}_{\mathbb {S}}\left\{ \partial \Phi (\varvec{x}) - \tilde{\varvec{y}}\right\} \right] \right] , \end{aligned}$$
(59)

which explicitly shows the contribution of both the thermodynamic function and the dissipation function to the dynamics (Fig. 3a).

Various properties of the equilibrium flow (Eq. 58) can be obtained from the doubly dual flat structure as we will see in the following sections. In addition, the equilibrium flow captures the properties that the dynamics of thermodynamic equilibrium systems should have. In this sense, the equilibrium flow is the mathematical representation of the dynamics of equilibrium systems.

Fig. 3 Schematic representation of the equilibrium flow (a) and the nonequilibrium flow (b)

Beyond the gradient equilibrium flow, we also consider the non-gradient nonequilibrium flow of the following type:

Definition 25

(Nonequilibrium force and nonequilibrium flow) The force generated by a shift of the equilibrium force

$$\begin{aligned} \varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}] + \varvec{f}_{NE}, \end{aligned}$$
(60)

is called a nonequilibrium force if \(\varvec{f}_{NE}\not \in \textrm{Im}[\mathbb {S}^{T}]\).Footnote 50 If the shift \(\varvec{f}_{NE}\) satisfies \(\varvec{f}_{NE}\in \textrm{Im}[\mathbb {S}^{T}]\), then \(\varvec{f}(\varvec{x})\) reduces to an equilibrium force (Eq. 57), i.e., to the case \(\varvec{f}_{NE}=\varvec{0}\), by appropriately changing \(\tilde{\varvec{x}}\). The nonequilibrium flow is the flow induced by the nonequilibrium force (Fig. 3b):

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}\left[ \left[ \textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\right] +\varvec{f}_{NE}\right] . \end{aligned}$$
(61)

In the next section, we show that this equation can cover a sufficiently wide class of models, e.g., all types of rLDG and CRN with extended LMA kinetics. Equation 61 can also be associated with nonequilibrium dynamics with a constant environmental force. The techniques in information geometry, Hessian geometry, and convex analysis enable us to investigate such non-gradient dynamics.

Remark 6

(Variational modeling [128]) We introduced and characterized dynamics based on the thermodynamic functions and dissipation functions. While we employed a restricted definition in order to link dynamics to information geometry, we may further generalize this approach by appropriately choosing the state space, \(\varvec{f}(\varvec{x})\), \(\Psi ^{*}_{\varvec{x}}(\varvec{f})\), and \(\textrm{div}_{\mathbb {S}}\). For example, we may consider an \(\varvec{x}\)-dependent and noninteger-valued matrix for \(\mathbb {S}(\varvec{x})\). The equilibrium flow need not be restricted to \(\mathcal {F}(\varvec{x})= \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\), and the nonequilibrium flow may be defined for \(\varvec{x}\)-dependent \(\varvec{f}_{NE}(\varvec{x})\). This type of approach for modeling dissipative dynamics is known as variational modeling.

Before closing this section, we mention that the existence of DB states, i.e., \(\mathcal {M}^{\textrm{DB}}\ne \emptyset \), is necessary and sufficient for a flow of the form of Eq. 61 to be an equilibrium flow.

Proposition 5

(Detailed balance condition and equilibrium flow) Consider a flow given by Eq. 61. If \(\mathcal {M}^{\textrm{DB}}\ne \emptyset \), then the flow is equilibrium, i.e., \(\varvec{f}_{NE}\in \textrm{Im}\mathbb {S}^{T}\).

Proof

\(\mathcal {M}^{\textrm{DB}}\ne \emptyset \) means that there exists \(\varvec{x}_{DB}\) satisfying \(\varvec{j}(\varvec{x}_{DB})=\varvec{0}\), which is equivalent to \(\varvec{f}(\varvec{x}_{DB})=\varvec{0}\). By Eq. 60, \(\varvec{f}(\varvec{x}_{DB})=\varvec{0}\) requires \(\varvec{f}_{NE}=-\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{DB}\Vert \tilde{\varvec{x}}]\), whose right-hand side lies in \(\textrm{Im}[\mathbb {S}^{T}]\). Hence, if \(\varvec{f}_{NE}\not \in \textrm{Im}[\mathbb {S}^{T}]\), then \(\varvec{f}(\varvec{x}) \ne \varvec{0}\) for all \(\varvec{x}\in \mathcal {X}\). Thus, \(\varvec{f}_{NE} \in \textrm{Im}[\mathbb {S}^{T}]\) if \(\mathcal {M}^{\textrm{DB}}\ne \emptyset \).

The necessity follows basically from Prop. 3, but we have to show \(\mathcal {M}^{\textrm{ST}} \ne \emptyset \). This will be shown in the following section (Lemma 1).

5 Explicit form of thermodynamic and dissipation functions

Before investigating the dynamics of the equilibrium (Eq. 58) and nonequilibrium (Eq. 61) flows, we show how the flows can be associated with the dynamics on graphs and hypergraphs via specific forms of the thermodynamic and dissipation functions. The forms of functions depend on the functional form of the flux that we assume: Eq. 3 for rLDG, Eq. 12 for CRN with LMA kinetics, and Eq. 18 for FPE. It should be noted that the choice of the thermodynamic function and the dissipation function is not unique for a given dynamics in general. Depending on the purpose, we should choose or find an appropriate set of functions.

5.1 Explicit form of thermodynamic functions for rLDG and CRN

For rLDG (Eq. 3) and CRN with LMA kinetics (Eq. 12), the following pair of thermodynamic functions is particularly relevantFootnote 51:

$$\begin{aligned} \Phi (\varvec{x})&:=\left[ \ln \varvec{x} - \ln \varvec{x}^{o} - \varvec{1}\right] ^{T}\varvec{x} = \sum _{i=1}^{N_{\mathbb {X}}}\left[ \ln \frac{x_{i}}{x_{i}^{o}}-1\right] x_{i},\nonumber \\ \Phi ^{*}(\varvec{y})&:=(\varvec{x}^{o})^{T}e^{\varvec{y}} =\sum _{i=1}^{N_{\mathbb {X}}}x_{i}^{o}e^{y_{i}}, \end{aligned}$$
(62)

which induce the following Legendre transformation:

$$\begin{aligned} \varvec{y}&=\partial \Phi (\varvec{x}) = \ln \varvec{x}-\ln \varvec{x}^{o},&\varvec{x}&=\partial \Phi ^{*}(\varvec{y}) = \varvec{x}^{o}\circ e^{\varvec{y}}. \end{aligned}$$
(63)

Here, \(\mathcal {Y}=\mathbb {R}^{N_{\mathbb {X}}}\), and \(\varvec{x}^{o}\in \mathcal {X}\) is a parameter determining the point in \(\mathcal {X}\) that is associated with the origin of \(\mathcal {Y}\) via the Legendre transformation. For these thermodynamic functions, the Bregman divergence is reduced to the generalized Kullback-Leibler divergence:

$$\begin{aligned} \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}']=\left( \ln \frac{\varvec{x}}{\varvec{x}'}\right) ^{T}\varvec{x}-\varvec{1}^{T}(\varvec{x}-\varvec{x}'). \end{aligned}$$
(64)

These thermodynamic functions and the generalized KL divergence are separable.

If we choose \(\varvec{x}^{o}=\varvec{1}\), then the conventional dual representation for the probability density \(\varvec{p}\) on a discrete space is recovered:

$$\begin{aligned} \Phi (\varvec{p})&\! = \left[ \ln \varvec{p} \!-\! \varvec{1}\right] ^{T}\varvec{p},&\!\Phi ^{*}(\varvec{y})&= \varvec{1}^{T}e^{\varvec{y}}, \nonumber \\ \!\varvec{y}&\!=\partial _{\varvec{p}}\Phi (\varvec{p}) \!=\! \ln \varvec{p},&\!\varvec{p}&\!=\partial _{\varvec{y}}\Phi ^{*}(\varvec{y}) = e^{\varvec{y}}. \end{aligned}$$
(65)

In this case, \(\mathcal {Y}\) is the space of the logarithm of \(\varvec{p}\). These representations hold even if \(\varvec{p}\) is not a probability density. If \(\varvec{p}\) satisfies \(\varvec{1}^{T}\varvec{p}=1\), the generalized KL divergence becomes the normal KL divergence \(\mathcal {D}^{\mathcal {X}}[\varvec{p}\Vert \varvec{p}']=\left( \ln \frac{\varvec{p}}{\varvec{p}'}\right) ^{T}\varvec{p}\).Footnote 52
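The Legendre transformation (Eq. 63) and the reduction of the Bregman divergence to the generalized KL divergence (Eq. 64) can be verified numerically. The sketch below assumes the standard form \(\mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}']=\Phi (\varvec{x})-\Phi (\varvec{x}')-\langle \varvec{x}-\varvec{x}',\partial \Phi (\varvec{x}')\rangle \) for the Bregman divergence on the vertex space; the vectors `x_o`, `x`, and `x_ref` are arbitrary positive test values.

```python
import math

def phi(x, x_o):
    """Thermodynamic function Phi of Eq. 62."""
    return sum((math.log(xi / oi) - 1.0) * xi for xi, oi in zip(x, x_o))

def phi_star(y, x_o):
    """Conjugate thermodynamic function Phi* of Eq. 62."""
    return sum(oi * math.exp(yi) for yi, oi in zip(y, x_o))

def to_dual(x, x_o):
    """y = dPhi/dx = ln x - ln x_o (Eq. 63)."""
    return [math.log(xi / oi) for xi, oi in zip(x, x_o)]

def to_primal(y, x_o):
    """x = dPhi*/dy = x_o * exp(y) (Eq. 63)."""
    return [oi * math.exp(yi) for yi, oi in zip(y, x_o)]

def bregman(x, x_ref, x_o):
    """Bregman divergence of Phi (standard form, assumed here)."""
    y_ref = to_dual(x_ref, x_o)
    linear = sum((xi - ri) * yi for xi, ri, yi in zip(x, x_ref, y_ref))
    return phi(x, x_o) - phi(x_ref, x_o) - linear

def generalized_kl(x, x_ref):
    """Generalized KL divergence of Eq. 64."""
    return sum(xi * math.log(xi / ri) - xi + ri for xi, ri in zip(x, x_ref))

x_o = [0.5, 2.0, 1.0]    # illustrative reference densities
x = [1.2, 0.3, 2.5]
x_ref = [0.8, 0.9, 1.1]

y = to_dual(x, x_o)
x_back = to_primal(y, x_o)
pairing = sum(xi * yi for xi, yi in zip(x, y))
```

Here `bregman(x, x_ref, x_o)` coincides with `generalized_kl(x, x_ref)` for any positive `x_o`, and for normalized vectors the generalized KL divergence reduces to the usual KL divergence.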

5.2 Explicit form of dissipation functions for rLDG and CRN

To determine the dissipation functions, we need the definition of force, which may depend on the phenomena and purpose.Footnote 53 In physics, the flux-force relations, which are also called constitutive equations [129], are central because they determine what kind of change is induced by an incurred force.Footnote 54 For rMJP and CRNs, the flux and force are conventionally defined using the one-way fluxes, \(\varvec{j}^{+}(\varvec{x})\) and \(\varvec{j}^{-}(\varvec{x})\) as

$$\begin{aligned} \varvec{j}&=\varvec{j}^{+}-\varvec{j}^{-},&\varvec{f}&=\ln \varvec{j}^{+} - \ln \varvec{j}^{-}, \end{aligned}$$
(66)

where the dependency of \(\varvec{j}^{\pm }(\varvec{x})\) on \(\varvec{x}\) is abbreviated for notational simplicity. In physics, assuming this form of force-flux relation goes by the name of the local detailed balance (LDB) assumption,Footnote 55 or the generalized detailed balance assumption.Footnote 56 By defining the frenetic activity [132]:

$$\begin{aligned} \varvec{\omega }:=2 \sqrt{\varvec{j}^{+}\circ \varvec{j}^{-}}\in \mathbb {R}_{{\ge } 0}^{N_{\mathbb {e}}}, \end{aligned}$$
(67)

we have a relation \(\varvec{j}=\varvec{\omega }\circ \left[ \frac{\exp (\varvec{f}/{2})-\exp (-\varvec{f}/{2})}{2}\right] \). For a fixed \(\varvec{\omega }\), this relation between the pair \((\varvec{j},\varvec{f})\) is a one-to-one Legendre duality induced by the following specific form of dissipation functions:

$$\begin{aligned} {\begin{matrix} \Psi ^{*}_{\varvec{\omega }}(\varvec{f})&{}:={2} \varvec{\omega }^{T} \left[ \cosh (\varvec{f}/{2})-\varvec{1}\right] , \\ \Psi _{\varvec{\omega }}(\varvec{j})&{}:={2}\varvec{\omega }^{T}\left( \textrm{diag}\left[ \frac{\varvec{j}}{\varvec{\omega }}\right] \sinh ^{-1}\left( \frac{\varvec{j}}{\varvec{\omega }}\right) - \left[ \sqrt{\varvec{1}+\left( \frac{\varvec{j}}{\varvec{\omega }} \right) ^{2}}-\varvec{1}\right] \right) , \end{matrix}} \end{aligned}$$
(68)

which lead to the Legendre transformation:

$$\begin{aligned} \varvec{j}&=\partial \Psi ^{*}_{\varvec{\omega }}(\varvec{f})=\varvec{\omega }\circ \sinh (\varvec{f}/{2}),&\varvec{f}&=\partial \Psi _{\varvec{\omega }}(\varvec{j})={2} \sinh ^{-1}\left( \frac{\varvec{j}}{\varvec{\omega }}\right) . \end{aligned}$$
(69)

We can easily verify that these functions satisfy the conditions for dissipation functions, i.e., Eq. 34, Eq. 35, and Eq. 33.
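The duality in Eq. 69 can be checked directly from one-way fluxes. The following minimal sketch uses arbitrary positive values for \(\varvec{j}^{+}\) and \(\varvec{j}^{-}\); the function names are illustrative and not part of any library.

```python
import math

def force_and_activity(j_plus, j_minus):
    """Force (Eq. 66) and frenetic activity (Eq. 67) from one-way fluxes."""
    f = [math.log(a / b) for a, b in zip(j_plus, j_minus)]
    omega = [2.0 * math.sqrt(a * b) for a, b in zip(j_plus, j_minus)]
    return f, omega

def psi_star(f, omega):
    """Dissipation function of the force (Eq. 68)."""
    return sum(2.0 * w * (math.cosh(fe / 2.0) - 1.0) for fe, w in zip(f, omega))

def psi(j, omega):
    """Dissipation function of the flux (Eq. 68)."""
    total = 0.0
    for je, w in zip(j, omega):
        u = je / w
        total += 2.0 * w * (u * math.asinh(u) - (math.sqrt(1.0 + u * u) - 1.0))
    return total

def flux_from_force(f, omega):
    """j = omega * sinh(f / 2) (Eq. 69)."""
    return [w * math.sinh(fe / 2.0) for fe, w in zip(f, omega)]

def force_from_flux(j, omega):
    """f = 2 * asinh(j / omega) (Eq. 69)."""
    return [2.0 * math.asinh(je / w) for je, w in zip(j, omega)]

j_plus = [3.0, 0.2, 1.0]   # illustrative one-way fluxes
j_minus = [1.0, 0.8, 1.0]

f, omega = force_and_activity(j_plus, j_minus)
j = flux_from_force(f, omega)
f_back = force_from_flux(j, omega)
pairing = sum(je * fe for je, fe in zip(j, f))
```

For these values `j` reproduces \(\varvec{j}^{+}-\varvec{j}^{-}\), the round trip recovers `f`, and `pairing` equals `psi(j, omega) + psi_star(f, omega)`, in line with Eq. 40. The third edge has balanced one-way fluxes, so both its flux and its force vanish.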

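For the flux \(\varvec{j}_{\textrm{MA}}(\varvec{x})\) of LMA kinetics (Eq. 12), the force and activity becomeFootnote 57

$$\begin{aligned} \varvec{f}_{\textrm{MA}}(\varvec{x};\varvec{K})&=\ln \frac{\varvec{j}^{+}_{\textrm{MA}}(\varvec{x})}{\varvec{j}^{-}_{\textrm{MA}}(\varvec{x})}=\mathbb {S}^{T}\ln \varvec{x}+\ln \varvec{K},&\left[ \varvec{\omega }_{\textrm{MA}}(\varvec{x};\varvec{\kappa })\right] _{e}&=2\sqrt{j^{+}_{\textrm{MA},e}(\varvec{x})j^{-}_{\textrm{MA},e}(\varvec{x})}=2 \kappa _{e}\prod _{i=1}^{N_{\mathbb {X}}}x_{i}^{(\gamma ^{+}_{i,e}+\gamma ^{-}_{i,e})/2}, \end{aligned}$$
(70)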

where we introduced a transformation of the kinetic parameters \((\varvec{k}^{+}, \varvec{k}^{-})\) into the force part \(\varvec{K}\) and activity part \(\varvec{\kappa }\) as \(\varvec{\kappa }:=\sqrt{\varvec{k}^{+}\circ \varvec{k}^{-}}\) and \(\varvec{K}:=\varvec{k}^{+}/\varvec{k}^{-}\).Footnote 58 Because \(\varvec{k}^{\pm }=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}\) holds, \((\varvec{\kappa },\varvec{K})\) has the same information as \((\varvec{k}^{+}, \varvec{k}^{-})\). Moreover, we can verify that the force and activity are dependent only on \(\varvec{K}\) and \(\varvec{\kappa }\), respectively. The dissipation functions of the forms above and their relations to rLDG and CRN were derived from the large deviation function of the corresponding microscopic stochastic models [75, 133]. Actually, the Bregman divergence \(\mathcal {D}_{\varvec{x}}^{\mathcal {J}}[\varvec{j};\varvec{j}_{\textrm{MA}}(\varvec{x})]\) of the dissipation functions is identical to the rate function of the flux for rMJP and CRN. Thus, these dissipation functions are keystones connecting macroscopic and microscopic dynamics.

If there exists \(\tilde{\varvec{y}}\) satisfying \(-\mathbb {S}^{T}\tilde{\varvec{y}}=\ln \varvec{K}\), i.e., \(\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}\), the force in Eq. 70 is represented as

$$\begin{aligned} \varvec{f}_{\textrm{MA}}(\varvec{x};\varvec{K})&=\textrm{grad}_{\mathbb {S}}\left( \ln \frac{\varvec{x}}{\tilde{\varvec{x}}}\right) =\textrm{grad}_{\mathbb {S}} \partial _{\varvec{x}} \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \tilde{\varvec{x}}] \in \textrm{Im}\mathbb {S}^{T}, \end{aligned}$$
(71)

where \(\tilde{\varvec{x}}\) is the Legendre conjugate of \(\tilde{\varvec{y}}\).Footnote 59 Thus, the dynamics of a CRN (and rMJP) is an equilibrium flow of the generalized KL divergence \(\mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \tilde{\varvec{x}}]\) when the parameter \(\varvec{K}\) satisfies \(\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}\). In chemistry, the condition \(\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}\) is called Wegscheider’s equilibrium condition [47, 134], and a CRN satisfying this parametric condition is called an equilibrium CRN.Footnote 60 Even if \(\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}\) is not satisfied, we can represent \(\ln \varvec{K}= -\mathbb {S}^{T}\tilde{\varvec{y}} + \varvec{f}_{NE}\) with \(\varvec{f}_{NE}\not \in \textrm{Im}\mathbb {S}^{T}\). Hence, the force in Eq. 70 can always be represented as

$$\begin{aligned} \varvec{f}_{\textrm{MA}}(\varvec{x};\varvec{K})&=\textrm{grad}_{\mathbb {S}}\left( \ln \frac{\varvec{x}}{\tilde{\varvec{x}}}\right) + \varvec{f}_{NE}=\left[ \textrm{grad}_{\mathbb {S}} \left[ \partial _{\varvec{x}} \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \tilde{\varvec{x}}]\right] + \varvec{f}_{NE}\right] , \end{aligned}$$
(72)

which leads to the nonequilibrium flow (Eq. 61). Thus, CRN with LMA kinetics as well as rLDG are generally within the class of Eq. 61.
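The decomposition \(\ln \varvec{K}= -\mathbb {S}^{T}\tilde{\varvec{y}} + \varvec{f}_{NE}\) can be made concrete on a small example. The sketch below assumes a hypothetical three-state cycle with edges \(1\rightarrow 2\), \(2\rightarrow 3\), \(3\rightarrow 1\) and its incidence matrix as \(\mathbb {S}\). For this graph \(\textrm{Ker}\,\mathbb {S}\) is spanned by \((1,1,1)^{T}\), so \(\textrm{Im}\,\mathbb {S}^{T}\) consists of the vectors whose components sum to zero. The choice of \(\varvec{f}_{NE}\) as the orthogonal projection of \(\ln \varvec{K}\) onto \(\textrm{Ker}\,\mathbb {S}\) is one convenient convention adopted here, since the decomposition itself is not unique.

```python
import math

# Incidence matrix of a hypothetical 3-cycle (rows: vertices, columns: edges).
S = [[1.0, 0.0, -1.0],
     [-1.0, 1.0, 0.0],
     [0.0, -1.0, 1.0]]

def minus_S_transpose(y):
    """Compute -S^T y for the cycle above."""
    return [-sum(S[i][e] * y[i] for i in range(3)) for e in range(3)]

def decompose(log_K):
    """Split ln K into -S^T y_tilde + f_NE with f_NE in Ker S (assumed choice)."""
    mean = sum(log_K) / 3.0
    f_ne = [mean, mean, mean]             # projection onto span{(1, 1, 1)}
    r = [lk - mean for lk in log_K]       # remainder, lies in Im S^T
    y_tilde = [0.0, r[0], r[0] + r[1]]    # one solution of -S^T y = r
    return y_tilde, f_ne

def satisfies_wegscheider(log_K, tol=1e-12):
    """On this cycle, ln K is in Im S^T iff its components sum to zero."""
    return abs(sum(log_K)) < tol

log_K_noneq = [math.log(2.0), math.log(3.0), math.log(0.5)]
log_K_eq = [math.log(2.0), math.log(3.0), -math.log(6.0)]

y_noneq, f_ne_noneq = decompose(log_K_noneq)
y_eq, f_ne_eq = decompose(log_K_eq)
rebuilt = [a + b for a, b in zip(minus_S_transpose(y_noneq), f_ne_noneq)]
```

For `log_K_eq` the product of the \(K_{e}\) around the cycle equals one, so \(\varvec{f}_{NE}\) vanishes and Wegscheider’s condition holds; for `log_K_noneq` the product is three and a nonzero \(\varvec{f}_{NE}\) remains.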

Example 2

(Simplified Brusselator CRN [8, 104] (continued)) For the Brusselator CRN introduced in Ex. 1, the force and activity defined in Eq. 70 can be explicitly represented as

(73)

Remark 7

(Wegscheider’s equilibrium condition and detailed balance condition) We defined the equilibrium flow by the specific functional form of the force and obtained Wegscheider’s equilibrium condition as the necessary and sufficient condition to have the equilibrium force under LMA kinetics. In CRN theory, in contrast, the equilibrium dynamics is often defined by the existence of a steady state satisfying the DB condition (Eq. 48). In addition, the DB condition is also often assumed in statistics when we design or analyze a random walk in parameter spaces, e.g., in Markov Chain Monte Carlo (MCMC) simulations or in other random-walk-based optimization schemes.Footnote 61 These two definitions of equilibrium dynamics are equivalent for (extended) LMA kinetics. Indeed, \(\mathcal {M}^{\textrm{DB}} \ne \emptyset \) means that there exists \(\varvec{x}_{DB} \in \mathcal {X}\) such that \(\varvec{j}_{\textrm{MA}}(\varvec{x}_{DB})=\varvec{0} \Leftrightarrow - \mathbb {S}^{T}\ln \varvec{x}_{DB}=\ln \varvec{K}\). From the Fredholm alternative, we obtain Wegscheider’s equilibrium condition \(\ln \varvec{K}\in \textrm{Im}\mathbb {S}^{T}\) for the existence of \(\varvec{x}_{DB}\).

Remark 8

(Linear graph Laplacian dynamics) The linear graph Laplacian dynamics defined by Eq. 22 can be formally regarded as a generalized flow. From the form of the graph Laplacian (Eq. 21),Footnote 62 it is easy to see that Eq. 22 coincides with Eq. 58 if

$$\begin{aligned} \Phi (\varvec{x})&=\frac{1}{2}\langle \varvec{x},M_{0}\varvec{x} \rangle ,&\Psi ^{q,*}_{\varvec{x}}(\varvec{f})&=\frac{1}{2}\langle \varvec{f},M^{1}\varvec{f}\rangle , \end{aligned}$$
(74)

where \(M_{0}=I\), \(M^{1}=\textrm{diag}[\varvec{k}]\), and \(\tilde{\varvec{x}}=\varvec{0}\). In contrast to rLDG, the natural state space and the corresponding dual space are \(\mathcal {X}=\mathcal {Y}=\mathbb {R}^{N_{\mathbb {v}}}\).Footnote 63 In [23], a general non-quadratic \(\Phi (\varvec{x})\) is considered as a class of nonlinear diffusion on a network from an information-geometric viewpoint.

5.3 Some remarks on the dissipation functions for rLDG and CRN

The dissipation functions in Eq. 68 have several notable properties. First, they are separable:

$$\begin{aligned} \Psi ^{*}_{\varvec{\omega }(\varvec{x})}(\varvec{f})&= \sum _{e=1}^{N_{\mathbb {e}}} \omega _{e}(\varvec{x}) \psi ^{*}(f_{e}),&\Psi _{\varvec{\omega }(\varvec{x})}(\varvec{j})&= \sum _{e=1}^{N_{\mathbb {e}}} \omega _{e}(\varvec{x}) \psi (j_{e}/\omega _{e}(\varvec{x})), \end{aligned}$$
(75)

where

$$\begin{aligned} \psi ^{*}(f)&:={2} \left[ \cosh (f/{2})-1\right] \in [0,\infty ), \end{aligned}$$
(76)
$$\begin{aligned} \psi (j)&:={2}\left( j\sinh ^{-1}\left( j\right) - \left[ \sqrt{1+j^{2}}-1\right] \right) \in [0,\infty ), \end{aligned}$$
(77)

and \(\varvec{\omega }(\varvec{x})\) is local: \(\omega _{e}(\varvec{x})=2 \kappa _{e}\prod _{i=1}^{N_{\mathbb {X}}}x_{i}^{(\gamma ^{+}_{i,e}+\gamma ^{-}_{i,e})/2}\). The thermodynamic functions in Eq. 62 are also separable.Footnote 64

Second, the scalar function \(\psi ^{*}(f)\) is an N-function. The N-function of the \((\cosh (f)-1)\)-type and the associated Orlicz space have been employed for establishing the infinite-dimensional information geometry by Pistone [72, 135, 136]. In functional analysis, the Orlicz space is a generalization of the \(L^{p}\) spaces, which arises naturally when we work on the \(L \log ^{+} L\) space for the divergences and large deviation functions. Hence, the dissipation functions in Eq. 68 are closely related to such topics.

Third, various information geometric measures and quantities are related to the dissipation functions in Eq. 68 and also to the associated quantities as follows:

$$\begin{aligned} \frac{1}{4} \Psi ^{*}_{\varvec{\omega }}(\varvec{f})&=\frac{1}{2} \sum _{e=1}^{N_{\mathbb {e}}}\left[ \sqrt{j_{e}^{+}}- \sqrt{j_{e}^{-}}\right] ^{2} =: \mathcal {D}_{Hel}[\varvec{j}^{+};\varvec{j}^{-}]^{2} \end{aligned}$$
(78)
$$\begin{aligned} \frac{1}{2}\varvec{1}^{T}\varvec{\omega }&= \sum _{e=1}^{N_{\mathbb {e}}}\sqrt{j_{e}^{+}j_{e}^{-}} =:\textrm{BC}[\varvec{j}^{+};\varvec{j}^{-}] \end{aligned}$$
(79)
$$\begin{aligned} \langle \varvec{j},\varvec{f}\rangle&= \sum _{e=1}^{N_{\mathbb {e}}}(j^{+}_{e}-j^{-}_{e})\ln \frac{j^{+}_{e}}{j^{-}_{e}}=: \mathcal {D}_{Jef}[\varvec{j}^{+};\varvec{j}^{-}], \end{aligned}$$
(80)

where \(\mathcal {D}_{Hel}[\varvec{j}^{+};\varvec{j}^{-}]\), \(\textrm{BC}[\varvec{j}^{+};\varvec{j}^{-}]\), and \(\mathcal {D}_{Jef}[\varvec{j}^{+};\varvec{j}^{-}]\) are the Hellinger–Kakutani distance, the Bhattacharyya coefficient, and the Jeffreys divergence (symmetrized KL divergence) for \(\varvec{j}^{+}\) and \(\varvec{j}^{-}\), respectively. In addition, in physics, the bilinear pairing \(\langle \varvec{j},\varvec{f}\rangle \) of a Legendre dual pair and its approximation using the Hessian matrix are often referred to as the entropy production rate (EPR) \(\dot{\Sigma }\) and pseudo-entropy production rate (pEPR) \(\dot{\Sigma }^{p}\), respectively [83, 137]:

$$\begin{aligned} \dot{\Sigma }&:=\langle \varvec{j},\varvec{f}\rangle =\sum _{e=1}^{N_{\mathbb {e}}}(j^{+}_{e}-j^{-}_{e})\ln \frac{j^{+}_{e}}{j^{-}_{e}}, \end{aligned}$$
(81)
$$\begin{aligned} \dot{\Sigma }^{p}&:=\langle \varvec{j}, G_{\varvec{\omega },\varvec{j}}\varvec{j}\rangle = 2\sum _{e=1}^{N_{\mathbb {e}}}\frac{(j^{+}_{e}-j^{-}_{e})^{2}}{j^{+}_{e}+j^{-}_{e}}, \end{aligned}$$
(82)

where we treat \(\varvec{j}\in \mathcal {J}_{\varvec{x}}\) as a member of \(\mathcal {T}_{\varvec{j}}\mathcal {J}_{\varvec{x}}\) via the isomorphism \(\mathcal {J}_{\varvec{x}} \cong \mathcal {T}_{\varvec{j}}\mathcal {J}_{\varvec{x}} \cong C_{1}(\mathbb {H})\). The pEPR \(\dot{\Sigma }^{p}\) approximates the EPR by replacing \(\varvec{f}=\partial \Psi _{\varvec{\omega }}(\varvec{j})\) with \(G_{\varvec{\omega },\varvec{j}}\varvec{j}\) and serves as a lower bound of \(\dot{\Sigma }\): \(\dot{\Sigma }\ge \dot{\Sigma }^{p}\) [137]Footnote 65.
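The identities in Eqs. 78, 81, and 82 can be checked numerically. The following is a minimal sketch, assuming \(\omega _{e}=2\sqrt{j^{+}_{e}j^{-}_{e}}\), \(f_{e}=\ln (j^{+}_{e}/j^{-}_{e})\), and \(j_{e}=j^{+}_{e}-j^{-}_{e}\), as implied by Eqs. 79 and 80. The random one-way fluxes and all variable names are illustrative and are not part of the original formulation.

```python
import numpy as np

# Illustrative one-way fluxes on six edges (random values, not from the text).
rng = np.random.default_rng(0)
j_plus = rng.uniform(0.1, 5.0, size=6)
j_minus = rng.uniform(0.1, 5.0, size=6)

# Assumed relations: omega_e = 2 sqrt(j+ j-), f_e = ln(j+/j-), j_e = j+ - j-.
omega = 2.0 * np.sqrt(j_plus * j_minus)
f = np.log(j_plus / j_minus)
j = j_plus - j_minus

def psi_star(force):
    """Scalar dissipation function psi^*(f) = 2[cosh(f/2) - 1] (Eq. 76)."""
    return 2.0 * (np.cosh(force / 2.0) - 1.0)

def psi(u):
    """Legendre dual of psi_star: 2(u asinh(u) - [sqrt(1 + u^2) - 1])."""
    return 2.0 * (u * np.arcsinh(u) - (np.sqrt(1.0 + u**2) - 1.0))

Psi_star = np.sum(omega * psi_star(f))   # separable form of Eq. 75 (force side)
Psi = np.sum(omega * psi(j / omega))     # separable form of Eq. 75 (flux side)

hellinger_sq = 0.5 * np.sum((np.sqrt(j_plus) - np.sqrt(j_minus)) ** 2)
epr = np.sum(j * f)                              # Eq. 81
pepr = 2.0 * np.sum(j**2 / (j_plus + j_minus))   # Eq. 82

print(np.isclose(Psi_star / 4.0, hellinger_sq))  # Eq. 78
print(np.isclose(Psi + Psi_star, epr))           # Legendre pair: Psi + Psi^* = <j, f>
print(epr >= pepr)                               # pEPR is a lower bound of EPR
```

The second check uses the fact that \((\varvec{j},\varvec{f})\) is a Legendre dual pair, so that the Fenchel–Young inequality holds with equality.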

Finally, the dissipation functions in Eq. 68 are not the unique choice to reproduce the force-flux relation in Eq. 66. The quadratic dissipation functions \(\Psi ^{q,*}_{\varvec{x}}(\varvec{f}):=\frac{1}{2}\langle \varvec{f},M^{*}_{\varvec{x}} \varvec{f} \rangle \) in Eq. 43 with the following diagonal metric tensor can reproduce the relation in Eq. 66:

$$\begin{aligned} M^{*}_{\varvec{x}}=\textrm{diag}\left[ \frac{\varvec{j}^{+}(\varvec{x}) -\varvec{j}^{-}(\varvec{x})}{\ln \varvec{j}^{+}(\varvec{x})- \ln \varvec{j}^{-}(\varvec{x})} \right] =\textrm{diag}\left[ \left( \frac{j^{+}_{e}(\varvec{x})-j^{-}_{e}(\varvec{x})}{\ln j^{+}_{e}(\varvec{x})- \ln j^{-}_{e}(\varvec{x})}\right) _{e} \right] . \end{aligned}$$
(83)

This type of quadratic dissipation function was proposed even earlier than the non-quadratic ones [138,139,140] and has been investigated since [104, 141, 142]. Its advantage is that the induced geometry is Riemannian, and thus the information geometric argument is not necessarily required. In addition, this Riemannian geometric structure is analogous to the formal Riemannian geometric structure of FPE and other diffusion processes on continuous manifolds induced via the \(L^{2}\)-Wasserstein geometry [65, 66] (Fig. 4). Thus, this quadratic dissipation function provides a consistent extension of these results for FPE and diffusion processes to graphs and hypergraphs. Nevertheless, the doubly dual flat structure with the non-quadratic dissipation functions that we introduce is another sound generalization of the formal Riemannian geometry of FPE, as we see in the next subsection.

As long as we focus only on the trajectory of the generalized flow (Eq. 45), the difference between the quadratic and non-quadratic functions does not matter because both induce the same dynamics. However, the Bregman divergence of the quadratic dissipation functions is not directly related to the rate function of the microscopic stochastic models, whereas that of the non-quadratic ones in Eq. 68 is [133]. Thus, if we consider projections of fluxes and forces in the edge spaces, different choices of dissipation functions lead to different projections. In addition, for non-quadratic dissipation functions, the contributions of the kinetic parameters \(\varvec{k}^{\pm }\) can be clearly separated into the force part \(\varvec{K}\) and the activity part \(\varvec{\kappa }\) in the case of CRN with the LMA kinetics (Eq. 70). This separation enables a physical realization of the projected flux as we derive in the following section.
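As a small numerical illustration of this point, the sketch below compares the quadratic dissipation function with the logarithmic-mean metric of Eq. 83 against the non-quadratic one built from \(\psi ^{*}(f)=2[\cosh (f/2)-1]\). It assumes \(\omega _{e}=2\sqrt{j^{+}_{e}j^{-}_{e}}\) and \(f_{e}=\ln (j^{+}_{e}/j^{-}_{e})\); the flux values and variable names are hypothetical.

```python
import numpy as np

# Hypothetical one-way fluxes on three edges.
j_plus = np.array([3.0, 0.5, 1.2])
j_minus = np.array([1.0, 2.0, 1.1])
f = np.log(j_plus / j_minus)          # force on each edge

# Quadratic case: diagonal of M^*_x is the logarithmic mean of j+ and j- (Eq. 83).
log_mean = (j_plus - j_minus) / (np.log(j_plus) - np.log(j_minus))
j_quadratic = log_mean * f            # gradient of (1/2) <f, M^* f>

# Non-quadratic case: gradient of sum_e omega_e * 2[cosh(f_e/2) - 1].
omega = 2.0 * np.sqrt(j_plus * j_minus)
j_nonquadratic = omega * np.sinh(f / 2.0)

# Both gradients reproduce the same flux j = j+ - j- ...
print(np.allclose(j_quadratic, j_plus - j_minus))
print(np.allclose(j_nonquadratic, j_plus - j_minus))

# ... but the values of the two dissipation functions differ.
Psi_q_star = 0.5 * np.sum(log_mean * f**2)
Psi_star = np.sum(omega * 2.0 * (np.cosh(f / 2.0) - 1.0))
print(Psi_q_star, Psi_star)
```

The two functions therefore agree on the force-flux relation at this point while defining different Bregman divergences, which is why projections based on them can differ.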

Fig. 4

A relationship between Wasserstein geometry and information geometry. The formal Riemannian geometric structure appears at their intersection. It should be noted that, while the regions of \(L^{p}\)-Wasserstein distance for \(p\ne 2\) and nonquadratic dissipation function are not overlapping in this figure, this does not mean that they are unrelated. There may be undiscovered relations between these two regions

5.4 Explicit forms of thermodynamic and dissipation functions for FPE

For FPE, the dualistic representation of the density \(p(\varvec{r})\) and its logarithm \(y(\varvec{r})=\ln p(\varvec{r})\) is also relevant. This duality is induced formally by the following thermodynamic functionsFootnote 66:

$$\begin{aligned} \Phi [p]&= \int [\ln p(\varvec{r}) - 1]p(\varvec{r})\textrm{d}\varvec{r},&\Phi ^{*}[y]&= \int e^{y(\varvec{r})}\textrm{d}\varvec{r}, \end{aligned}$$
(84)

the Legendre transformations of which are

$$\begin{aligned} y(\varvec{r})&=\frac{\delta \Phi [p]}{\delta p} = \ln p(\varvec{r}),&p(\varvec{r})&=\frac{\delta \Phi ^{*}[y]}{\delta y} = e^{y(\varvec{r})}. \end{aligned}$$
(85)

The Bregman divergence becomes the KL divergence \(\mathcal {D}_{\mathcal {X}}[p\Vert p']=\int \textrm{d}\varvec{r}p(\varvec{r})\ln \frac{p(\varvec{r})}{p'(\varvec{r})}\). In physics, the flux and force for FPE are defined conventionally as

$$\begin{aligned} \varvec{j}_{\textrm{FP}}[p(\varvec{r})]&=D_{0}p(\varvec{r})\left\{ \varvec{F}(\varvec{r})/D_{0}-\nabla \ln p(\varvec{r}) \right\} , \end{aligned}$$
(86)
$$\begin{aligned} \varvec{f}_{\textrm{FP}}[p(\varvec{r})]&=D^{-1}_{0}\varvec{F}(\varvec{r})- \nabla \ln p(\varvec{r}) . \end{aligned}$$
(87)

The dissipation functions associated with the force-flux relation above are

$$\begin{aligned} \Psi ^{\textrm{FP},*}_{\varvec{\omega }[p]}[\varvec{f}]&= \frac{1}{2}\int \varvec{f}(p(\varvec{r}))^{T}M^{*}_{p(\varvec{r})}\varvec{f}(p(\varvec{r}))\textrm{d}\varvec{r},\nonumber \\ \Psi ^{\textrm{FP}}_{\varvec{\omega }[p]}[\varvec{j}]&= \frac{1}{2}\int \varvec{j}(p(\varvec{r}))^{T}M_{p(\varvec{r})} \varvec{j}(p(\varvec{r}))\textrm{d}\varvec{r}, \end{aligned}$$
(88)

where \(M^{*}_{p(\varvec{r})}:=\textrm{diag}[ \varvec{\omega }[p(\varvec{r})] ] \), \(M_{p(\varvec{r})}:=(M^{*}_{p(\varvec{r})})^{-1}\), and \(\omega _{i}[p(\varvec{r})]=D_{0} p(\varvec{r})\). Thus, the dissipation functions are formally quadratic and positive definite. If \(\varvec{F}(\varvec{r})\) is a gradient of \(U(\varvec{r})\) as \(\varvec{F}(\varvec{r})=D_{0}\nabla U(\varvec{r})\), \(\varvec{f}_{\textrm{FP}}[p(\varvec{r})]=- \nabla \ln \frac{p(\varvec{r})}{\tilde{p}(\varvec{r})}\) holds where \(\tilde{p}(\varvec{r}):=\exp [U(\varvec{r})]\). Then, the dissipation functions, the bilinear pairing \(\langle \varvec{j}_{\textrm{FP}},\varvec{f}_{\textrm{FP}}\rangle \), the EPR \(\dot{\Sigma }_{\textrm{FP}}\) in Eq. 81, and the pEPR \(\dot{\Sigma }^{p}_{\textrm{FP}}\) in Eq. 82 formally consolidate into the same quantity:

$$\begin{aligned}&2\Psi ^{\textrm{FP},*}_{\varvec{\omega }[p]}[\varvec{f}_{\textrm{FP}}] =2\Psi ^{\textrm{FP}}_{\varvec{\omega }[p]}[\varvec{j}_{\textrm{FP}}] =\langle \varvec{j}_{\textrm{FP}},\varvec{f}_{\textrm{FP}}\rangle =\dot{\Sigma }_{\textrm{FP}}=\dot{\Sigma }^{p}_{\textrm{FP}}\nonumber \\&= D_{0}\int p(\varvec{r})\left( \nabla _{\varvec{r}} \ln \frac{p(\varvec{r})}{\tilde{p}(\varvec{r})}\right) ^{2}\textrm{d}\varvec{r}. \end{aligned}$$
(89)

The last quantity without \(D_{0}\) is known as the relative Fisher information [66, 143] and the Hyvärinen divergence [120, 144] between p and \(\tilde{p}\). For \(U(\varvec{r})=0\), it reduces to the Fisher information number \(\mathbb {I}_{F}[p]\) in Eq. 1. This consolidation is a source of confusion, because the same quantity for FPE or linear diffusion processes has different names in different contexts and disciplines. However, the consolidated quantities actually have different definitions, roles, and meanings, which become explicit in the information-geometric formulation.
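The consolidation in Eq. 89 can be illustrated in one dimension. The sketch below assumes a Gaussian density \(p=\mathcal {N}(0,\sigma ^{2})\) and a quadratic potential \(U(r)=-(r-\mu )^{2}/2\), so that \(\tilde{p}=\exp [U]\) is an unnormalized Gaussian; these choices, the grid, and the parameter values are illustrative assumptions, and the integrals are approximated by a plain Riemann sum.

```python
import numpy as np

# Illustrative parameters (not from the text).
D0, sigma, mu = 0.7, 1.5, 0.4
r = np.linspace(-12.0, 12.0, 20001)
dr = r[1] - r[0]

p = np.exp(-r**2 / (2.0 * sigma**2)) / np.sqrt(2.0 * np.pi * sigma**2)
grad_ln_p = -r / sigma**2            # analytic gradient of ln p
grad_U = -(r - mu)                   # analytic gradient of U(r) = -(r - mu)^2 / 2

f_fp = grad_U - grad_ln_p            # Eq. 87 with F = D0 * grad U
j_fp = D0 * p * f_fp                 # Eq. 86

# Quadratic dissipation functions of Eq. 88 with omega = D0 * p.
two_psi_star = np.sum(D0 * p * f_fp**2) * dr
two_psi = np.sum(j_fp**2 / (D0 * p)) * dr
pairing = np.sum(j_fp * f_fp) * dr

# Closed form of D0 * E_p[(grad ln(p / p_tilde))^2] for this Gaussian pair.
closed_form = D0 * ((1.0 - 1.0 / sigma**2) ** 2 * sigma**2 + mu**2)

print(two_psi_star, two_psi, pairing, closed_form)
```

All four numbers coincide up to discretization error, which is the formal consolidation stated in Eq. 89 for this toy case.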

6 Orthogonal subspaces, dual foliations, and Pythagorean relation

To investigate the behaviors and properties of the equilibrium (Eq. 58) and nonequilibrium (Eq. 61) flows, especially the topological and algebraic constraints imposed by the graph or hypergraph structure, information geometry provides ideal tools. In particular, the four affine subspaces associated with the cycle and cocycle subspaces of the chain and cochain complexes (Fig. 5) form dual foliations via the Legendre transformation, whose geometric properties are captured by information geometry [1, 145]. It should be noted that the results of this section do not assume the specific forms of the thermodynamic and dissipation functions introduced in Sect. 5.

Fig. 5

Diagrammatic representation of the four subspaces and their relationship with the chain and cochain complexes of \(\mathbb {H}\)

6.1 Four affine subspaces

Two families of orthogonally complementary affine subspaces are naturally introduced on \(\mathcal {X}\) and \(\mathcal {Y}\), respectively, from the topological structure of the graph and hypergraph, i.e., \(\mathbb {B}\) and \(\mathbb {S}\).

Definition 26

(Stoichiometric subspaces in \(\mathcal {X}\)) The stoichiometric subspaces are defined asFootnote 67

$$\begin{aligned} \mathcal {P}^{sc}(\varvec{x}_{0})&:=\{\varvec{x}\in \mathcal {X}| \varvec{x}-\varvec{x}_{0} \in \textrm{Im}\mathbb {S}\}, \quad \varvec{x}_{0}\in \mathcal {X} \end{aligned}$$
(90)

where \(\varvec{x}_{0}\) is a parameter to specify the position of the subspace (Fig. 5, lower left)Footnote 68.

Definition 27

(Equilibrium subspaces in \(\mathcal {Y}\)) The equilibrium subspaces (Fig. 5, upper left) are defined as

$$\begin{aligned} \mathcal {P}^{eq}(\tilde{\varvec{y}})&:=\left\{ \varvec{y}\in \mathcal {Y}| \varvec{y}-\tilde{\varvec{y}} \in \textrm{Ker}\mathbb {S}^{T}\right\} , \quad \tilde{\varvec{y}}\in \mathcal {Y}. \end{aligned}$$
(91)

\(\mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\mathcal {P}^{eq}(\tilde{\varvec{y}})\) are orthogonal complements of each other: \(\langle \varvec{x}-\varvec{x}_{0},\varvec{y}'-\tilde{\varvec{y}}\rangle =0\) for \(\varvec{x}\in \mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\varvec{y}' \in \mathcal {P}^{eq}(\tilde{\varvec{y}})\).Footnote 69 Because \(\mathbb {S}\) and \(\mathbb {S}^{T}\) are the discrete differentials, \(\delta _{1}\) and \(\delta ^{0}\), \(\mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\mathcal {P}^{eq}(\tilde{\varvec{y}})\) are associated with the 0-cycle and 0-cocycle spaces, respectively.

Two other families of orthogonal-complement subspaces are introduced on \(\mathcal {J}_{\varvec{x}}\) and \(\mathcal {F}_{\varvec{x}}\).

Definition 28

(Iso-velocity subspaces in \(\mathcal {J}_{\varvec{x}}\)) The iso-velocity subspaces (Fig. 5, lower right) are defined as

$$\begin{aligned} \mathcal {P}^{vl}(\hat{\varvec{j}})=\left\{ \varvec{j}\in \mathcal {J}_{\varvec{x}}|\varvec{j}-\hat{\varvec{j}} \in \textrm{Ker}\mathbb {S}\right\} , \qquad \hat{\varvec{j}}\in \mathcal {J}_{\varvec{x}}. \end{aligned}$$
(92)

Definition 29

(Iso-force subspaces in \(\mathcal {F}_{\varvec{x}}\)) The iso-external-force subspaces, iso-force subspaces in short, (Fig. 5, upper right) are defined as

$$\begin{aligned} \mathcal {P}^{fr}(\varvec{f}'):=\left\{ \varvec{f}\in \mathcal {F}_{\varvec{x}}|\varvec{f}-\varvec{f}'\in \textrm{Im}\mathbb {S}^{T} \right\} , \qquad \varvec{f}'\in \mathcal {F}_{\varvec{x}}. \end{aligned}$$
(93)

Again, from the correspondence of \(\delta _{1}=\mathbb {S}\) and \(\delta ^{0}=\mathbb {S}^{T}\), \(\mathcal {P}^{vl}(\hat{\varvec{j}})\) and \(\mathcal {P}^{fr}(\varvec{f}')\) are associated with the 1-cycle and 1-cocycle spaces, respectively. We specifically call \(\mathcal {P}^{vl}(\varvec{0})\) and \(\mathcal {P}^{fr}(\varvec{0})\) zero-velocity subspace and equilibrium force subspace, respectively.

6.2 Meaning of the subspaces

All four subspaces are natural constituents of algebraic graph theory and homological algebra. Here, we explain their meaning in terms of the dynamics on graphs and hypergraphs.

The stoichiometric and iso-velocity subspaces, \(\mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\mathcal {P}^{vl}(\hat{\varvec{j}})\), are related by the continuity equation (Eq. 10). From the continuity equation, \(\mathcal {P}^{vl}(\hat{\varvec{j}})\) is the set of fluxes that induce the same velocity as a reference \(\hat{\varvec{j}}\) does: \(\varvec{j}\in \mathcal {P}^{vl}(\hat{\varvec{j}}) \Longleftrightarrow \dot{\varvec{x}}=-\mathbb {S}\hat{\varvec{j}}=-\mathbb {S}\varvec{j} \). Thereby, \(\mathcal {P}^{vl}(\hat{\varvec{j}})\) is parametrized as follows:

$$\begin{aligned} \mathcal {P}^{vl}(\dot{\varvec{x}})&= \{\varvec{j}\in \mathcal {J}_{\varvec{x}}| - \mathbb {S}\varvec{j} =\dot{\varvec{x}}\},\quad \dot{\varvec{x}}\in \textrm{Im}[\mathbb {S}]=\textrm{Ker}[\mathbb {U}]. \end{aligned}$$
(94)

This subspace is crucial to characterize fluxes that can realize the same dynamics as the reference one.

The stoichiometric subspace \(\mathcal {P}^{sc}(\varvec{x}_{0})\) determines the subspace in which the dynamics are algebraically constrained via the topology of the underlying graph or hypergraph. Because \(\dot{\varvec{x}}=-\mathbb {S}\varvec{j}(\varvec{x}(t))\), for an initial state \(\varvec{x}(0)=\varvec{x}_{0}\), \(\varvec{x}(t)-\varvec{x}_{0} \in \textrm{Im}[\mathbb {S}]\) should hold, meaning that \(\varvec{x}(t)\in \mathcal {P}^{sc}(\varvec{x}_{0})\). Thus, \(\mathcal {P}^{sc}(\varvec{x}_{0})\) is the subspace in which the dynamics are restricted by the initial condition \(\varvec{x}_{0}\). \(\mathcal {P}^{sc}(\varvec{x}_{0})\) can also be represented parametrically by the quantities which are conserved by the dynamics. For any vector \(\varvec{u} \in \textrm{Ker}\mathbb {S}^{T}\), \(\eta (t):=\varvec{u}^{T}\varvec{x}(t)\) is constant over time:

$$\begin{aligned} \dot{\eta }(t)=\frac{\textrm{d}\varvec{u}^{T}\varvec{x}(t)}{\textrm{d}t}=\varvec{u}^{T}\frac{\textrm{d}\varvec{x}(t)}{\textrm{d}t}=-\varvec{u}^{T}\mathbb {S}\varvec{j}(\varvec{x})=0. \end{aligned}$$
(95)

In Sect. 3.1, we defined a matrix \(\mathbb {U}\) by a complete basis of \(\textrm{Ker}\mathbb {S}^{T}\) so that \(\textrm{Im}\mathbb {U}^{T}=\textrm{Ker}\mathbb {S}^{T}\). Using \(\mathbb {U}\), the conserved quantities for a given initial condition \(\varvec{x}_{0}\) are obtained as \(\varvec{\eta }=\mathbb {U}\varvec{x}_{0}=\mathbb {U}\varvec{x}(t)\). Because \(\textrm{Im}\mathbb {U}\) is isomorphic to \(C_{-1}(\mathbb {H})\), the stoichiometric subspace is explicitly parametrized by the conserved quantities (an element of \(C_{-1}(\mathbb {H})\)):

$$\begin{aligned} \mathcal {P}^{sc}(\varvec{\eta })&= \{\varvec{x}\in \mathcal {X}| \mathbb {U}\varvec{x} =\varvec{\eta }\},\quad \varvec{\eta }\in C_{-1}(\mathbb {H}). \end{aligned}$$
(96)

For rMJP, the conserved quantity is reduced to the conservation of probability \(\varvec{1}^{T}\varvec{p}(t)=1\) and \(\mathcal {P}^{sc}(\varvec{p}_{0})\) becomes the probability simplex. Because \(\textrm{Ker}\mathbb {B}^{T}\) determines the connected components of the graph \(\mathbb {G}\) and we conventionally assume that the underlying graph is connected in rMJP, we only have the one-dimensional cokernel space and one conserved quantity, which is \(\eta =1\). Thus, the conservation of probability or, equivalently, the restriction of \(\varvec{p}\) in the probability simplex is automatically guaranteed from the topological constraint of the dynamics if we start from an initial state satisfying \(\varvec{1}^{T}\varvec{p}_{0}=1\).
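The conservation law \(\varvec{\eta }=\mathbb {U}\varvec{x}_{0}=\mathbb {U}\varvec{x}(t)\) can be sketched numerically as follows. The example assumes a toy hypergraph with three species and two reactions, mass-action fluxes with unit rate constants, and an explicit Euler discretization of \(\dot{\varvec{x}}=-\mathbb {S}\varvec{j}(\varvec{x})\); the matrix `S`, the helper `cokernel_basis`, and the function `flux` are hypothetical names introduced only for this illustration.

```python
import numpy as np

# Assumed toy stoichiometric matrix (species x reactions): 2A <-> B, B <-> C,
# written for the sign convention x_dot = -S j.
S = np.array([[ 2.0,  0.0],
              [-1.0,  1.0],
              [ 0.0, -1.0]])

def cokernel_basis(S, tol=1e-10):
    """Return a matrix whose rows span Ker S^T, computed from the SVD of S^T."""
    _, sing, vt = np.linalg.svd(S.T)
    rank = int(np.sum(sing > tol))
    return vt[rank:]

U = cokernel_basis(S)    # here a single row proportional to (1, 2, 2)

def flux(x):
    """Illustrative mass-action fluxes with unit rate constants."""
    return np.array([x[0] ** 2 - x[1], x[1] - x[2]])

x0 = np.array([1.0, 0.5, 0.2])
x, dt = x0.copy(), 1e-3
for _ in range(5000):
    x = x - dt * (S @ flux(x))   # explicit Euler step of x_dot = -S j(x)

eta0, eta_t = U @ x0, U @ x
print(np.allclose(U @ S, 0.0))   # U annihilates every column of S
print(np.allclose(eta0, eta_t))  # conserved quantity is unchanged along the flow
```

Because \(\mathbb {U}\mathbb {S}=0\), each Euler step preserves \(\mathbb {U}\varvec{x}\) regardless of the form of the flux, so the check does not depend on the kinetics chosen here.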

The iso-force subspace \(\mathcal {P}^{fr}(\varvec{f}')\) and the equilibrium subspace \(\mathcal {P}^{eq}(\tilde{\varvec{y}})\) are related to the equilibrium and nonequilibrium force equations, Eq. 57 and Eq. 60. The equilibrium force defined in Eq. 57 satisfies \(\varvec{f}(\varvec{x})\in \textrm{Im}\mathbb {S}^{T}=\mathcal {P}^{fr}(\varvec{0})\). Thus, the equilibrium-force subspace \(\mathcal {P}^{fr}(\varvec{0})\) is literally the set of equilibrium forces. \(\mathcal {P}^{fr}(\varvec{f}')\) is its shift by \(\varvec{f}'\in \mathcal {F}_{\varvec{x}}\). Using \(\mathbb {V}\) defined in Sect. 3.1, we can represent \(\mathcal {P}^{fr}\) parametrically as

$$\begin{aligned} \mathcal {P}^{fr}(\varvec{\zeta })&= \{\varvec{f}\in \mathcal {F}_{\varvec{x}}| \mathbb {V}^{T}\varvec{f} =\varvec{\zeta }\},\quad \varvec{\zeta }\in C^{2}(\mathbb {H}) \end{aligned}$$
(97)

because \(\mathcal {F}_{\varvec{x}}/\textrm{Im}\mathbb {S}^{T}\cong \mathcal {F}_{\varvec{x}}/\textrm{Ker}\mathbb {V}^{T}\cong \textrm{Im}\mathbb {V}^{T}\cong C^{2}(\mathbb {H})\). Thus, \(\varvec{\zeta }\) characterizes the type of nonequilibrium force modulo the equilibrium forces.
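For a graph, the parameter \(\varvec{\zeta }=\mathbb {V}^{T}\varvec{f}\) collects the sums of forces around cycles (often called cycle affinities in physics). The sketch below assumes a directed triangle graph, for which \(\textrm{Ker}\mathbb {S}\) is spanned by the single cycle vector; the incidence matrix, its sign convention, and the numerical values are illustrative assumptions.

```python
import numpy as np

# Assumed incidence matrix of a directed triangle (vertices x edges).
S = np.array([[ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0],
              [ 0.0, -1.0,  1.0]])
V = np.ones((3, 1))              # column spans Ker S (the single cycle)

# An equilibrium force is generated by a potential and has zeta = 0.
y = np.array([0.3, -1.2, 0.7])
f_eq = S.T @ y
zeta_eq = V.T @ f_eq

# Two nonequilibrium forces with the same zeta lie in the same iso-force subspace.
f1 = f_eq + np.array([1.5, 0.0, 0.0])
f2 = np.array([0.5, 0.5, 0.5])
zeta1, zeta2 = V.T @ f1, V.T @ f2

# Their difference should be an equilibrium force, i.e. an element of Im S^T.
diff = f1 - f2
y_fit, *_ = np.linalg.lstsq(S.T, diff, rcond=1e-10)
residual = np.linalg.norm(S.T @ y_fit - diff)

print(zeta_eq, zeta1, zeta2, residual)
```

A vanishing residual indicates that `f1` and `f2` differ only by an equilibrium force, consistent with \(\textrm{Im}\mathbb {S}^{T}=\textrm{Ker}\mathbb {V}^{T}\).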

Finally, the equilibrium subspace \(\mathcal {P}^{eq}(\tilde{\varvec{y}})\) can also be regarded as the set of potentials \(\varvec{y}\) that generate the same equilibrium force because any \(\varvec{y}\in \mathcal {P}^{eq}(\tilde{\varvec{y}})\) satisfies \(\varvec{f}'=\mathbb {S}^{T}\varvec{y}=\mathbb {S}^{T}\tilde{\varvec{y}} \in \mathcal {P}^{fr}(\varvec{0})\). Due to this, the equilibrium subspace \(\mathcal {P}^{eq}\) is parameterized as

$$\begin{aligned} \mathcal {P}^{eq}(\varvec{f}')&= \{\varvec{y}\in \mathcal {Y}| \mathbb {S}^{T}\varvec{y} =\varvec{f}'\},\quad \varvec{f}'\in \textrm{Im}[\mathbb {S}^{T}]. \end{aligned}$$
(98)

The parametric forms of the subspaces are summarized as follows:

$$\begin{aligned} \mathcal {P}^{vl}(\dot{\varvec{x}})&= \{\varvec{j}\in \mathcal {J}_{\varvec{x}}| - \mathbb {S}\varvec{j} =\dot{\varvec{x}}\},&\dot{\varvec{x}}&\in \textrm{Im}[\mathbb {S}]=\textrm{Ker}[\mathbb {U}], \end{aligned}$$
(99)
$$\begin{aligned} \mathcal {P}^{sc}(\varvec{\eta })&= \{\varvec{x}\in \mathcal {X}| \mathbb {U}\varvec{x} =\varvec{\eta }\},&\varvec{\eta }&\in \textrm{Im}[\mathbb {U}] =C_{-1}(\mathbb {H}), \end{aligned}$$
(100)
$$\begin{aligned} \mathcal {P}^{fr}(\varvec{\zeta })&= \{\varvec{f}\in \mathcal {F}_{\varvec{x}}| \mathbb {V}^{T}\varvec{f} =\varvec{\zeta }\},&\varvec{\zeta }&\in \textrm{Im}[\mathbb {V}^{T}]=C^{2}(\mathbb {H}), \end{aligned}$$
(101)
$$\begin{aligned} \mathcal {P}^{eq}(\varvec{f}')&= \{\varvec{y}\in \mathcal {Y}| \mathbb {S}^{T}\varvec{y} =\varvec{f}'\},&\varvec{f}'&\in \textrm{Im}[\mathbb {S}^{T}]=\textrm{Ker}[\mathbb {V}^{T}]. \end{aligned}$$
(102)

From these subspaces, we can obtain dual foliations on the vertex and edge spaces.

6.3 Dual manifolds, dual foliations, and Pythagorean relation in vertex spaces

For the subspaces \(\mathcal {P}^{sc}\) and \(\mathcal {P}^{eq}\) in the density and potential spaces, we introduce their Legendre transforms via the thermodynamic functions, \(\Phi (\varvec{x})\) and \(\Phi ^{*}(\varvec{y})\); together with the orthogonally complementary subspaces, these form the dual foliations (Fig. 6, left).

Fig. 6

Diagrammatic representation of the dual foliations in \(\mathcal {X}\), \(\mathcal {Y}\), \(\mathcal {J}_{\varvec{x}}\), and \(\mathcal {F}_{\varvec{x}}\) spaces

Definition 30

(Stoichiometric manifold in \(\mathcal {Y}\) and equilibrium manifold in \(\mathcal {X}\)) The stoichiometric and equilibrium manifolds (Fig. 6, left) are defined respectively as

$$\begin{aligned} \mathcal {M}^{sc}(\varvec{y}_{0})&:=\partial \Phi [\mathcal {P}^{sc}(\varvec{x}_{0})]\subset \mathcal {Y}, \quad \varvec{y}_{0}=\partial \Phi (\varvec{x}_{0}), \end{aligned}$$
(103)
$$\begin{aligned} \mathcal {M}^{eq}(\tilde{\varvec{x}})&:=\partial \Phi ^{*}[\mathcal {P}^{eq}(\tilde{\varvec{y}})] \subset \mathcal {X}, \quad \tilde{\varvec{x}}=\partial \Phi ^{*}(\tilde{\varvec{y}}). \end{aligned}$$
(104)

Lemma 1

(Dual foliations in density and potential spaces [48]) \(\mathcal {P}^{sc}\) and \(\mathcal {M}^{eq}\) are foliations of \(\mathcal {X}\), and \(\mathcal {M}^{sc}\) and \(\mathcal {P}^{eq}\) are foliations of \(\mathcal {Y}\). For each pair \((\varvec{x}_{0},\tilde{\varvec{x}})\), the intersection of \(\mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\mathcal {M}^{eq}(\tilde{\varvec{x}})\) is unique and transversal. The same applies to \(\mathcal {M}^{sc}(\varvec{y}_{0})\) and \(\mathcal {P}^{eq}(\tilde{\varvec{y}})\). Then, \((\mathcal {P}^{sc}, \mathcal {M}^{eq})\) and \((\mathcal {M}^{sc}, \mathcal {P}^{eq})\) form dual foliations (nonlinear coordinate systems) in \(\mathcal {X}\) and \(\mathcal {Y}\) spaces, respectively.

Proof

The polyhedron \(\mathcal {P}^{sc}(\varvec{x}_{0})\) and the affine subspace \(\mathcal {P}^{eq}(\tilde{\varvec{y}})\) can cover the whole \(\mathcal {X}\) and \(\mathcal {Y}\) by changing \(\varvec{x}_{0}\) and \(\tilde{\varvec{y}}\), respectively. Similarly, \(\mathcal {M}^{eq}(\tilde{\varvec{x}})\) and \(\mathcal {M}^{sc}(\varvec{y}_{0})\) can cover the whole \(\mathcal {X}\) and \(\mathcal {Y}\) because Legendre transformations by the thermodynamic functions are one-to-one between \(\mathcal {X}\) and \(\mathcal {Y}\). Consider the intersection of \(\mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\mathcal {M}^{eq}(\tilde{\varvec{x}})\) in \(\mathcal {X}\) space. The condition that \(\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}}) \ne \emptyset \) is related to the existence of \(\varvec{x}^{\dagger }\) defined by the following convex optimization problem:

$$\begin{aligned} \varvec{x}^{\dagger }:=\arg \min _{\varvec{x}\in \overline{\mathcal {P}^{sc}(\varvec{x}_{0})}} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]. \end{aligned}$$
(105)

Because of the properties of \(\Phi (\varvec{x})\), \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\) and its restriction to \(\overline{\mathcal {P}^{sc}(\varvec{x}_{0})}\) are strictly convex with respect to \(\varvec{x}\). Thus, \(\varvec{x}^{\dagger }\) is unique and either satisfies the stationarity condition \(\varvec{x}^{\dagger } \in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})\) if \(\varvec{x}^{\dagger }\in \mathcal {P}^{sc}(\varvec{x}_{0})\) or lies on the boundary \(\partial \mathcal {X}\) if \(\varvec{x}^{\dagger }\not \in \mathcal {P}^{sc}(\varvec{x}_{0})\), where we used \(\mathbb {S}^{T}\frac{\partial \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \tilde{\varvec{x}}]}{\partial \varvec{x}} =\varvec{0} \Leftrightarrow \mathbb {S}^{T}(\varvec{y}-\tilde{\varvec{y}}) =\varvec{0} \Leftrightarrow \varvec{y}\in \mathcal {P}^{eq}(\tilde{\varvec{y}})\Leftrightarrow \varvec{x}\in \mathcal {M}^{eq}(\tilde{\varvec{x}})\). Let \(\varvec{x}_{bd}\in \partial \mathcal {X}\) and \(\varvec{x}_{in}\in \mathcal {X}\) be arbitrary points on the boundary and in the interior of \(\mathcal {X}\), respectively. From condition Eq. 25 on the thermodynamic function, for \(\varvec{x}_{\lambda }:=\lambda \varvec{x}_{in} + (1-\lambda )\varvec{x}_{bd}\) where \(\lambda \in [0,1]\),

$$\begin{aligned} \lim _{\lambda \rightarrow +0}\frac{\textrm{d}\mathcal {D}^{\mathcal {X}}[\varvec{x}_{\lambda }\Vert \tilde{\varvec{x}}]}{\textrm{d}\lambda }=\lim _{\lambda \rightarrow +0}\left[ \frac{\textrm{d}\Phi (\varvec{x}_{\lambda })}{\textrm{d}\lambda } - \left\langle \tilde{\varvec{y}}, \frac{\textrm{d}\varvec{x}_{\lambda }}{\textrm{d}\lambda } \right\rangle \right] = -\infty . \end{aligned}$$
(106)

Thus, \(\varvec{x}^{\dagger }\not \in \mathcal {X}\) is excluded, and the intersection exists, i.e., \(\varvec{x}^{\dagger }\in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})\). The intersection \(\varvec{x}^{\dagger }\) is unique and transversal because \(\langle \varvec{x}_{sc}-\varvec{x}^{\dagger }, \varvec{y}_{eq}-\varvec{y}^{\dagger }\rangle =0\) holds for any \(\varvec{x}_{sc}\in \mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\varvec{x}_{eq} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})\), where \(\varvec{y}_{eq}=\partial \Phi (\varvec{x}_{eq})\) and \(\varvec{y}^{\dagger }=\partial \Phi (\varvec{x}^{\dagger })\), and because the dimensions of \(\mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\mathcal {M}^{eq}(\tilde{\varvec{x}})\) are complementary, as \(\mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\mathcal {P}^{eq}(\tilde{\varvec{y}})\) are orthogonal complements of each other (see also the proof in [83]). As a result, \(\varvec{x}^{\dagger } \in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})\) always exists, and \((\mathcal {P}^{sc}, \mathcal {M}^{eq})\) forms a dual foliation in \(\mathcal {X}\). So does \((\mathcal {M}^{sc}, \mathcal {P}^{eq})\) in \(\mathcal {Y}\) because they are the bijective Legendre duals of \((\mathcal {P}^{sc}, \mathcal {M}^{eq})\). \(\square \)

This result reduces to Birch’s theorem [48, 100] and the seminal result by Horn and Jackson [41] when the Bregman divergence of the thermodynamic function is the generalized KL divergence.

With the dual foliation, we can consider the generalized Pythagorean relations and orthogonal decomposition. For any three points satisfying \(\varvec{x} \in \mathcal {P}^{sc}(\varvec{x}_{0})\), \(\varvec{x}_{q} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})\), and \(\varvec{x}^{\dagger } = \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})\)Footnote 70, we have the generalized Pythagorean relation:

$$\begin{aligned} \mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}_{q}]=\mathcal {D}^{\mathcal {X}}[\varvec{x}\Vert \varvec{x}^{\dagger }] + \mathcal {D}^{\mathcal {X}}[\varvec{x}^{\dagger }\Vert \varvec{x}_{q}]. \end{aligned}$$
(107)

In \(\mathcal {Y}\) space, we also have the dual version of the relations as

$$\begin{aligned} \mathcal {D}^{\mathcal {Y}}[\varvec{y}_{q}\Vert \varvec{y}]=\mathcal {D}^{\mathcal {Y}}[\varvec{y}_{q}\Vert \varvec{y}^{\dagger }] + \mathcal {D}^{\mathcal {Y}}[\varvec{y}^{\dagger }\Vert \varvec{y}]. \end{aligned}$$
(108)

These relations are used to characterize the steady state of equilibrium and nonequilibrium flow geometrically and also variationally.
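The generalized Pythagorean relation in Eq. 107 can be verified on a small example. The sketch below assumes the generalized KL divergence as the Bregman divergence (so that \(\varvec{y}=\ln \varvec{x}\)) and a toy hypergraph with a single reversible reaction \(2A \rightleftharpoons B\), for which the intersection \(\varvec{x}^{\dagger }\) has a closed form; the matrices and numerical values are illustrative assumptions.

```python
import numpy as np

def gen_kl(x, xp):
    """Generalized KL divergence, the Bregman divergence of sum(x ln x - x)."""
    return float(np.sum(x * np.log(x / xp) - x + xp))

# Assumed toy hypergraph: one reversible reaction 2A <-> B.
S = np.array([[-2.0], [1.0]])    # stoichiometric matrix (species x reactions)
U = np.array([[1.0, 2.0]])       # row spans Ker S^T, so U @ S = 0

x0 = np.array([1.0, 0.5])        # fixes the stoichiometric subspace P^sc(x0)
x_tilde = np.array([0.3, 0.7])   # fixes the equilibrium manifold M^eq(x_tilde)

# Points of M^eq(x_tilde) have the form x_tilde * exp(U^T s) = x_tilde * (z, z^2)
# with z = exp(s). Imposing U x = U x0 gives a quadratic equation in z.
eta = (U @ x0)[0]
a, b = 2.0 * x_tilde[1], x_tilde[0]
z = (-b + np.sqrt(b * b + 4.0 * a * eta)) / (2.0 * a)
x_dagger = x_tilde * np.array([z, z**2])

# Another point on P^sc(x0) and another point on M^eq(x_tilde).
x = x0 + 0.2 * S[:, 0]
x_q = x_tilde * np.exp(-0.4 * U[0])

lhs = gen_kl(x, x_q)
rhs = gen_kl(x, x_dagger) + gen_kl(x_dagger, x_q)
print(lhs, rhs)
```

The two printed values agree up to rounding because \(\varvec{x}-\varvec{x}^{\dagger }\in \textrm{Im}\mathbb {S}\) and \(\ln \varvec{x}^{\dagger }-\ln \varvec{x}_{q}\in \textrm{Ker}\mathbb {S}^{T}\) are orthogonal.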

Remark 9

(Interpretation in terms of statistical inference) The meaning of the equilibrium manifold in statistics can be clarified more explicitly by considering the specific form of thermodynamic function (Eq. 65). For this thermodynamic function, the equilibrium manifold \(\mathcal {M}^{eq}(\tilde{\varvec{p}})\) is represented as

$$\begin{aligned} \mathcal {M}^{eq}(\tilde{\varvec{p}})&=\left\{ \varvec{p}\in \mathcal {X}| \ln \varvec{p}-\ln \tilde{\varvec{p}} \in \textrm{Ker}\mathbb {S}^{T}\right\} \nonumber \\ {}&=\left\{ \varvec{p}\in \mathcal {X}| \varvec{p}=\tilde{\varvec{p}}\circ \exp \left[ \mathbb {U}^{T}\varvec{\eta }^{*}\right] , \varvec{\eta }^{*}\in C^{-1}(\mathbb {H}) \right\} \end{aligned}$$
(109)

where we use the fact \(\mathbb {S}^{T}\mathbb {U}^{T}=0\). Thus, \(\mathcal {M}^{eq}(\tilde{\varvec{p}})\) is an exponential family with algebraic constraints via \(\mathbb {U}^{T}\). In contrast, \(\mathcal {P}^{sc}(\varvec{\eta })\) can be regarded as the data manifold, which constrains \(\varvec{p}\) by \(\varvec{\eta }=\mathbb {U}\varvec{p}\), because \(\mathbb {U}\varvec{p}\) can be interpreted as expectation of observables \(\{\varvec{u}_{\ell }\}_{\ell \in [1,N_{\mathbb {l}}]}\). Thus, the intersection \(\varvec{p}^{\dagger }=\mathcal {P}^{sc}(\varvec{\eta })\cap \mathcal {M}^{eq}(\tilde{\varvec{p}})\) is the maximum likelihood estimator. The exponential family with linear algebraic constraints as in Eq. 109 appears in algebraic statistics where \(\mathbb {U}\) is sometimes called the design matrix [48].Footnote 71

6.4 Dual manifolds, dual foliations in edge spaces and information-geometric extension of Helmholtz-Hodge-Kodaira decomposition

For the edge spaces, we similarly introduce the iso-velocity and iso-force manifolds, which are the duals of \(\mathcal {P}^{fr}(\varvec{f}')\) and \(\mathcal {P}^{vl}(\hat{\varvec{j}})\), respectively, via the Legendre transformations by the dissipation functions, \(\Psi _{\varvec{x}}(\varvec{j})\) and \(\Psi ^{*}_{\varvec{x}}(\varvec{f})\) (Fig. 6, right):

Definition 31

(Iso-velocity manifold in \(\mathcal {F}_{\varvec{x}}\) and iso-force manifold in \(\mathcal {J}_{\varvec{x}}\)) The iso-velocity and iso-force manifolds (Fig. 6, right) are defined as follows:

$$\begin{aligned} \mathcal {M}^{vl}_{\varvec{x}}(\hat{\varvec{f}})&:=\partial \Psi _{\varvec{x}}[\mathcal {P}^{vl}(\hat{\varvec{j}})]\subset \mathcal {F}_{\varvec{x}},&\hat{\varvec{f}}&=\partial \Psi _{\varvec{x}}(\hat{\varvec{j}}), \end{aligned}$$
(110)
$$\begin{aligned} \mathcal {M}^{fr}_{\varvec{x}}(\varvec{j}')&:=\partial \Psi ^{*}_{\varvec{x}}[\mathcal {P}^{fr}(\varvec{f}')] \subset \mathcal {J}_{\varvec{x}},&\varvec{j}'&=\partial \Psi ^{*}_{\varvec{x}}(\varvec{f}'). \end{aligned}$$
(111)

It should be noted that \(\mathcal {M}^{vl}_{\varvec{x}}(\hat{\varvec{f}})\) and \(\mathcal {M}^{fr}_{\varvec{x}}(\varvec{j}')\) are dependent on \(\varvec{x}\) via the \(\varvec{x}\) dependence of the dissipation functions. We obtain the intersections in \(\mathcal {J}_{\varvec{x}}\) and \(\mathcal {F}_{\varvec{x}}\):

$$\begin{aligned} \varvec{j}^{\dagger }&:=\mathcal {P}^{vl}(\hat{\varvec{j}}) \cap \mathcal {M}^{fr}_{\varvec{x}}(\varvec{j}'),&\varvec{f}^{\dagger }&:=\mathcal {M}^{vl}_{\varvec{x}}(\hat{\varvec{f}}) \cap \mathcal {P}^{fr}(\varvec{f}'), \end{aligned}$$
(112)

which are also unique and transversal for each \(\varvec{x}\in \mathcal {X}\) because \(\mathcal {J}_{\varvec{x}}\) and \(\mathcal {F}_{\varvec{x}}\) are whole vector spaces and the Legendre transformations are one-to-one. Thus, similarly to the case of vertex space, we have the dual foliation:

Lemma 2

(Dual foliations in edge spaces [83]) For each \(\varvec{x}\in \mathcal {X}\), \((\mathcal {P}^{vl}, \mathcal {M}^{fr}_{\varvec{x}})\) and \((\mathcal {M}^{vl}_{\varvec{x}}, \mathcal {P}^{fr})\) form dual foliations in \(\mathcal {J}_{\varvec{x}}\) and \(\mathcal {F}_{\varvec{x}}\) spaces, respectively.

For \(\hat{\varvec{j}}\) and \(\varvec{f}'\), and their intersections \(\varvec{j}^{\dagger }\) and \(\varvec{f}^{\dagger }\) defined in Eq. 112, \(\langle \hat{\varvec{j}}-\varvec{j}^{\dagger },\varvec{f}^{\dagger }-\varvec{f}' \rangle =0\) holds. Thus, we have the generalized Pythagorean relations:

$$\begin{aligned} \mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\hat{\varvec{j}}\Vert \varvec{j}']&=\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\hat{\varvec{j}}\Vert \varvec{j}^{\dagger }] +\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}^{\dagger }\Vert \varvec{j}'], \nonumber \\ \mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}'\Vert \hat{\varvec{f}}]&=\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}'\Vert \varvec{f}^{\dagger }] +\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}^{\dagger }\Vert \hat{\varvec{f}}]. \end{aligned}$$
(113)

In contrast to the thermodynamic functions \((\Phi , \Phi ^{*})\), the dissipation functions have symmetry, which makes the origins \(\varvec{0}\) in \(\mathcal {J}_{\varvec{x}}\) and \(\mathcal {F}_{\varvec{x}}\) special and leads to an extension of Helmholtz-Hodge-Kodaira decomposition.

Theorem 1

(Information-geometric extension of Helmholtz-Hodge-Kodaira (HHK) decomposition [83]) For a given flux-force Legendre pair \((\varvec{j},\varvec{f})\in (\mathcal {J}_{\varvec{x}},\mathcal {F}_{\varvec{x}})\), we have their unique \(\varvec{x}\)-dependent decompositions:

$$\begin{aligned} \varvec{j}&=\varvec{j}_{eq}(\varvec{x})+(\varvec{j}-\varvec{j}_{eq}(\varvec{x})),&\varvec{f}&=\varvec{f}_{st}(\varvec{x})+(\varvec{f}-\varvec{f}_{st}(\varvec{x})), \end{aligned}$$
(114)

such that \(\varvec{f}_{eq}(\varvec{x}) \in \mathcal {P}^{fr}(\varvec{0})\), \(\varvec{j}-\varvec{j}_{eq}(\varvec{x}) \in \mathcal {P}^{vl}(\varvec{0})\), \(\varvec{f}-\varvec{f}_{st}(\varvec{x}) \in \mathcal {P}^{fr}(\varvec{0})\), and \(\varvec{j}_{st}(\varvec{x}) \in \mathcal {P}^{vl}(\varvec{0})\) hold, where \(\varvec{f}_{eq}\) and \(\varvec{j}_{st}\) denote the Legendre duals of \(\varvec{j}_{eq}\) and \(\varvec{f}_{st}\), respectively. In addition, \(\varvec{j}_{eq}(\varvec{x})\) and \(\varvec{f}_{st}(\varvec{x})\) are characterized geometrically as

$$\begin{aligned} \varvec{j}_{eq}(\varvec{x})&:=\mathcal {P}^{vl}(\varvec{j})\cap \mathcal {M}^{fr}_{\varvec{x}}(\varvec{0}),&\varvec{f}_{st}(\varvec{x})&:=\mathcal {M}^{vl}_{\varvec{x}}(\varvec{0})\cap \mathcal {P}^{fr}(\varvec{f}). \end{aligned}$$
(115)

Furthermore, \(\varvec{j}_{eq}\) and \(\varvec{f}_{st}\) are also characterized variationally as the minimizers of dissipation functions:

$$\begin{aligned} \varvec{j}_{eq}(\varvec{x})&= \arg \min _{\varvec{j}' \in \mathcal {P}^{vl}(\varvec{j})} \Psi _{\varvec{x}}(\varvec{j}'),&\varvec{f}_{st}(\varvec{x})&= \arg \min _{\varvec{f}'' \in \mathcal {P}^{fr}(\varvec{f})} \Psi _{\varvec{x}}^{*}(\varvec{f}''). \end{aligned}$$
(116)

Proof

The uniqueness of \(\varvec{j}_{eq}(\varvec{x})\) and \(\varvec{f}_{st}(\varvec{x})\) as intersections in Eq. 115 follows immediately from the property of the dual foliations. Because, for any \(\varvec{j}' \in \mathcal {P}^{vl}(\varvec{j})\) and \(\varvec{f}'' \in \mathcal {P}^{fr}(\varvec{f})\), \(\langle \varvec{j}'-\varvec{j}_{eq},\varvec{f}_{eq} \rangle =0\) and \(\langle \varvec{j}_{st}, \varvec{f}''-\varvec{f}_{st} \rangle =0\) hold, the generalized Pythagorean relations lead to

$$\begin{aligned} \begin{aligned} \mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}'\Vert \varvec{0}]&=\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}'\Vert \varvec{j}_{eq}] +\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}_{eq}\Vert \varvec{0}],&\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}''\Vert \varvec{0}]&=\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}''\Vert \varvec{f}_{st}] +\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}_{st}\Vert \varvec{0}]. \end{aligned} \end{aligned}$$

Because \(\mathcal {D}^{\mathcal {J}}_{\varvec{x}}[\varvec{j}'\Vert \varvec{0}]=\Psi _{\varvec{x}}(\varvec{j}')\) and \(\mathcal {D}^{\mathcal {F}}_{\varvec{x}}[\varvec{f}''\Vert \varvec{0}]=\Psi ^{*}_{\varvec{x}}(\varvec{f}'')\) hold, the relations are reduced to

$$\begin{aligned} \Psi _{\varvec{x}}(\varvec{j}')&=\mathcal {D}_{\varvec{x}}^{\mathcal {J}}[\varvec{j}'\Vert \varvec{j}_{eq}]+\Psi _{\varvec{x}}(\varvec{j}_{eq}),&\Psi ^{*}_{\varvec{x}}(\varvec{f}'')&=\mathcal {D}_{\varvec{x}}^{\mathcal {F}}[\varvec{f}''\Vert \varvec{f}_{st}]+\Psi ^{*}_{\varvec{x}}(\varvec{f}_{st}). \end{aligned}$$
(117)

Then Eq. 116 follows. \(\square \)

The decomposed flux \(\varvec{j}_{eq}\) and force \(\varvec{f}_{st}\) play a particularly important role in dynamics. By definition, \(\varvec{j}_{eq}\) is the equilibrium flux that induces the same instantaneous velocity \(\dot{\varvec{x}}\) as \(\varvec{j}\) does, i.e., \(\dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}}\varvec{j}=-\textrm{div}_{\mathbb {S}}\varvec{j}_{eq}\). In other words, \(\varvec{j}_{eq}\) mimics the instantaneous dynamics induced by the nonequilibrium flux \(\varvec{j}\). This equilibrium flux is uniquely determined owing to the information-geometric orthogonality of \(\mathcal {P}^{vl}(\varvec{j})\) and \(\mathcal {M}^{fr}_{\varvec{x}}(\varvec{0})\). Moreover, the decomposition \(\varvec{j}=\varvec{j}_{eq}+(\varvec{j}-\varvec{j}_{eq})\) can be regarded as an information-geometric extension of the Helmholtz-Hodge-Kodaira decomposition in vector calculus and differential forms, because \((\varvec{j}-\varvec{j}_{eq})\) is divergence free, i.e., \(\textrm{div}_{\mathbb {S}}(\varvec{j}-\varvec{j}_{eq})=0\), and \(\varvec{f}_{eq}\) is a curl-free equilibrium force, i.e., \(\varvec{f}_{eq}\in \mathcal {P}^{fr}(\varvec{0})=\textrm{Im}[\mathbb {S}^{T}]=\textrm{Ker}[\mathbb {V}^{T}]\).

In contrast, by definition, \(\varvec{j}_{st}\in \mathcal {P}^{vl}(\varvec{0})\) is the flux that makes the state \(\varvec{x}\) a steady state, i.e., \(\dot{\varvec{x}}=0\), and is induced by a force in the same quotient set \(\mathcal {P}^{fr}(\varvec{f})\) as \(\varvec{f}\). The decomposition \(\varvec{f}=\varvec{f}_{st}+(\varvec{f}-\varvec{f}_{st})\) is also an HHK decomposition because \(\varvec{j}_{st}\in \mathcal {P}^{vl}(\varvec{0})\) is divergence free, i.e., \(\textrm{div}_{\mathbb {S}}\varvec{j}_{st}=0\), and \(\varvec{f}-\varvec{f}_{st}\) is a curl-free equilibrium force, i.e., \(\varvec{f}-\varvec{f}_{st}\in \mathcal {P}^{fr}(\varvec{0})=\textrm{Im}[\mathbb {S}^{T}]=\textrm{Ker}[\mathbb {V}^{T}]\) (see Footnote 72). These decompositions are used in the subsequent sections (Sect. 8 and Sect. 9).
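The two decompositions of Theorem 1 can be checked numerically in a special case. The following is a minimal sketch, not the general construction: it assumes a hypothetical triangle graph, takes \(\textrm{div}_{\mathbb {S}}\varvec{j}=\mathbb {S}\varvec{j}\), and uses a quadratic dissipation function \(\Psi _{\varvec{x}}(\varvec{j})=\frac{1}{2}\varvec{j}^{T}W^{-1}\varvec{j}\) with assumed positive weights `w`, for which the minimizers in Eq. 116 have closed forms via a pseudo-inverse. All names (`S`, `w`, `psi`, `psi_star`, `cycle`) are illustrative.

```python
import numpy as np

# Hypothetical toy setting: a triangle graph (3 vertices, 3 oriented edges).
# Each column of S is an edge with -1 at its tail and +1 at its head,
# and div_S j is taken to be S @ j.
S = np.array([[-1.0,  0.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 0.0,  1.0, -1.0]])
w = np.array([1.0, 2.0, 0.5])          # assumed positive edge weights at a fixed x
W, W_inv = np.diag(w), np.diag(1.0 / w)

def psi(j):
    """Assumed quadratic primal dissipation function."""
    return 0.5 * j @ W_inv @ j

def psi_star(f):
    """Legendre conjugate of psi."""
    return 0.5 * f @ W @ f

j = np.array([1.0, -0.3, 2.0])         # arbitrary nonequilibrium flux
f = W_inv @ j                          # its Legendre dual force, f = d psi / d j

L_pinv = np.linalg.pinv(S @ W @ S.T)   # pseudo-inverse of the weighted Laplacian

# j_eq minimizes psi over P^vl(j) = {j' : S j' = S j}   (Eq. 116, left)
j_eq = W @ S.T @ L_pinv @ S @ j
f_eq = W_inv @ j_eq                    # lies in Im S^T, i.e. P^fr(0)

# f_st minimizes psi_star over P^fr(f) = f + Im S^T      (Eq. 116, right)
f_st = f - S.T @ L_pinv @ S @ W @ f
j_st = W @ f_st                        # lies in Ker S, i.e. P^vl(0)

cycle = np.ones(3)                     # spans Ker S for the triangle
print("S (j - j_eq)      =", S @ (j - j_eq))
print("S j_st            =", S @ j_st)
print("<cycle, f_eq>     =", cycle @ f_eq)
print("<cycle, f - f_st> =", cycle @ (f - f_st))
```

In this quadratic case the two decompositions coincide, i.e., \(\varvec{j}_{st}=\varvec{j}-\varvec{j}_{eq}\), which recovers the classical orthogonal decomposition; for non-quadratic dissipation functions this coincidence is not expected in general.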

7 Central affine manifold and Hilbert orthogonality

The dual foliation is an essential geometric object in information geometry. While less common than the dual foliation, the central affine manifold defined by a convex function also plays an integral role in information geometry [145].

Definition 32

(Central affine manifolds in \(\mathcal {J}_{\varvec{x}}\) and \(\mathcal {F}_{\varvec{x}}\)) The central affine manifolds in \(\mathcal {J}_{\varvec{x}}\) and \(\mathcal {F}_{\varvec{x}}\) are defined as the level sets of \(\Psi _{\varvec{x}}(\varvec{j})\) and \(\Psi ^{*}_{\varvec{x}}(\varvec{f})\), respectively (see Footnote 73):

$$\begin{aligned} \mathcal {C}_{\varvec{x}}^{\Psi }(c)&:=\{\varvec{j}|\Psi _{\varvec{x}}(\varvec{j})=c\} \subset \mathcal {J}_{\varvec{x}},&\mathcal {C}_{\varvec{x}}^{\Psi ^{*}}(c)&:=\{\varvec{f}|\Psi ^{*}_{\varvec{x}}(\varvec{f})=c\}\subset \mathcal {F}_{\varvec{x}}, \end{aligned}$$
(118)

where \(c\in \mathbb {R}_{\ge 0}\). For a given \(\varvec{j}'\in \mathcal {J}_{\varvec{x}}\) or \(\varvec{f}'\in \mathcal {F}_{\varvec{x}}\), the manifolds are also denoted as

$$\begin{aligned} \mathcal {C}_{\varvec{x}}^{\Psi }(\varvec{j}')&:=\{\varvec{j}|\Psi _{\varvec{x}}(\varvec{j})=\Psi _{\varvec{x}}(\varvec{j}')\},&\mathcal {C}_{\varvec{x}}^{\Psi ^{*}}(\varvec{f}')&:=\{\varvec{f}|\Psi ^{*}_{\varvec{x}}(\varvec{f})=\Psi ^{*}_{\varvec{x}}(\varvec{f}')\}. \end{aligned}$$
(119)

Their Legendre transformations are also called (dual) central affine manifolds:

$$\begin{aligned} \mathcal {M}_{\varvec{x}}^{\Psi }(c)&:=\partial \Psi _{\varvec{x}}[\mathcal {C}_{\varvec{x}}^{\Psi }(c)]\subset \mathcal {F}_{\varvec{x}},&\mathcal {M}_{\varvec{x}}^{\Psi ^{*}}(c)&:=\partial \Psi ^{*}_{\varvec{x}}[\mathcal {C}_{\varvec{x}}^{\Psi ^{*}}(c)]\subset \mathcal {J}_{\varvec{x}}. \end{aligned}$$
(120)

7.1 Pseudo-Hilbert-isosceles orthogonality and decomposition

By employing the central affine manifold, we can introduce another type of generalized orthogonality:

Definition 33

(Pseudo-Hilbert-isosceles orthogonality [78, 79, 81]) Pseudo-Hilbert-isosceles orthogonality between \(\varvec{j}_{S}, \varvec{j}_{A} \in \mathcal {J}_{\varvec{x}}\) and between \(\varvec{f}_{S}, \varvec{f}_{A} \in \mathcal {F}_{\varvec{x}}\) is defined as follows:

$$\begin{aligned} \varvec{j}_{S}\perp _{H} \varvec{j}_{A}&\Longleftrightarrow \Psi _{\varvec{x}}(\varvec{j}_{S}+\varvec{j}_{A})=\Psi _{\varvec{x}}(\varvec{j}_{S} -\varvec{j}_{A}) \end{aligned}$$
(121)
$$\begin{aligned} \varvec{f}_{S}\perp _{H} \varvec{f}_{A}&\Longleftrightarrow \Psi _{\varvec{x}}^{*}(\varvec{f}_{S}+\varvec{f}_{A})=\Psi _{\varvec{x}}^{*}(\varvec{f}_{S} -\varvec{f}_{A}). \end{aligned}$$
(122)

This orthogonality is motivated by the relation \(\Vert \varvec{j}_{S}+\varvec{j}_{A}\Vert ^{2}=\Vert \varvec{j}_{S}-\varvec{j}_{A}\Vert ^{2}\) satisfied by an orthogonal pair \(\varvec{j}_{S}\perp \varvec{j}_{A}\) under a usual inner product structure and its induced norm \(\Vert \cdot \Vert \) (see Footnote 74). By employing this orthogonality, we obtain pseudo-Hilbert-isosceles decompositions of \(\varvec{j}\) and \(\varvec{f}\) as follows:

Lemma 3

(Positive decompositions of the bilinear pairing via pseudo-Hilbert-isosceles orthogonality [78, 79, 81, 83]) For a given \(\varvec{j}\in \mathcal {J}_{\varvec{x}}\) and any \(\varvec{j}'\) on the same central affine manifold as \(\varvec{j}\), i.e., \(\varvec{j}' \in \mathcal {C}_{\varvec{x}}^{\Psi }(\varvec{j})\), we obtain the pseudo-Hilbert-isosceles orthogonal decomposition \(\varvec{j}=\varvec{j}_{S}+\varvec{j}_{A}\):

$$\begin{aligned} \varvec{j}_{S}&:=\frac{1}{2}(\varvec{j}+\varvec{j}'),&\varvec{j}_{A}&:=\frac{1}{2}(\varvec{j}-\varvec{j}'), \end{aligned}$$
(123)

where \(\varvec{j}_{S} \perp _{H}\varvec{j}_{A}\) and \(\varvec{j}'=\varvec{j}_{S}-\varvec{j}_{A}\) hold. In addition, this decomposition induces a positive decomposition of the bilinear product \(\langle \varvec{j},\varvec{f}\rangle =\langle \varvec{j}_{S},\varvec{f}\rangle +\langle \varvec{j}_{A},\varvec{f}\rangle \) where

$$\begin{aligned} \langle \varvec{j}_{S}, \varvec{f}\rangle&=\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j}'; -\varvec{f}]\ge 0,&\langle \varvec{j}_{A}, \varvec{f}\rangle&=\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j}'; \varvec{f}]\ge 0, \end{aligned}$$
(124)

hold. Similarly, for \(\varvec{f} \in \mathcal {F}_{\varvec{x}}\) and \(\varvec{f}''\in \mathcal {C}_{\varvec{x}}^{\Psi ^{*}}(\varvec{f})\), a positive orthogonal decomposition \(\varvec{f}=\varvec{f}_{S}+\varvec{f}_{A}\) is obtained by \(\varvec{f}_{S}:=\frac{1}{2}(\varvec{f}+\varvec{f}'')\) and \(\varvec{f}_{A}:=\frac{1}{2}(\varvec{f}-\varvec{f}'')\), which satisfy the associated relations:

$$\begin{aligned} \langle \varvec{j}, \varvec{f}_{S}\rangle&=\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j}; -\varvec{f}'']\ge 0,&\langle \varvec{j}, \varvec{f}_{A}\rangle&=\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j}; \varvec{f}'']\ge 0. \end{aligned}$$
(125)

These decompositions were introduced in [78] for rMJP and extended to CRN in [79, 81], whereas we pointed out their information-geometric aspect in [83]. The decomposition plays a role in characterizing the gradient-flow-like property of non-gradient flows.
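The identities in Eq. 124 can be verified numerically for a non-quadratic dissipation function. The sketch below is illustrative only: it assumes a separable cosh-type dual dissipation function \(\Psi ^{*}_{\varvec{x}}(\varvec{f})=\sum _{e}2\omega _{e}[\cosh (f_{e}/2)-1]\) with hypothetical weights `omega`, takes \(\varvec{f}\) to be the Legendre dual of \(\varvec{j}\), and constructs a point \(\varvec{j}'\) on the level set \(\mathcal {C}_{\varvec{x}}^{\Psi }(\varvec{j})\) by rescaling an arbitrary direction with bisection. The helper names are not taken from the source.

```python
import numpy as np

omega = np.array([1.0, 2.0, 0.5])   # assumed positive edge weights at a fixed state x

def psi_star(f):
    """Assumed dual dissipation function (cosh type, non-quadratic)."""
    return np.sum(2.0 * omega * (np.cosh(f / 2.0) - 1.0))

def grad_psi(j):
    """Legendre map j -> f, the inverse of f -> omega * sinh(f / 2)."""
    return 2.0 * np.arcsinh(j / omega)

def psi(j):
    """Primal dissipation function, <j, f> - psi_star(f) at f = grad_psi(j)."""
    f = grad_psi(j)
    return j @ f - psi_star(f)

def divergence(j, f):
    """Bregman divergence in mixed coordinates: psi(j) + psi_star(f) - <j, f>."""
    return psi(j) + psi_star(f) - j @ f

def on_same_level_set(j, direction, iters=200):
    """Rescale `direction` by bisection so that psi(s * direction) = psi(j)."""
    target, lo, hi = psi(j), 0.0, 1.0
    while psi(hi * direction) < target:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if psi(mid * direction) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * direction

j = np.array([1.0, -0.3, 2.0])
f = grad_psi(j)                                   # Legendre dual force of j
j_prime = on_same_level_set(j, np.array([0.5, 1.0, -1.0]))

j_S = 0.5 * (j + j_prime)                         # Eq. 123
j_A = 0.5 * (j - j_prime)

print("psi(j_S + j_A), psi(j_S - j_A):", psi(j_S + j_A), psi(j_S - j_A))
print("<j_S, f> vs D[j'; -f]/2:", j_S @ f, 0.5 * divergence(j_prime, -f))
print("<j_A, f> vs D[j';  f]/2:", j_A @ f, 0.5 * divergence(j_prime, f))
```

Both pairings are non-negative because each equals half of a Bregman divergence, which is the content of the positive decomposition.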

8 Information-geometric properties of equilibrium flow

In this section, we describe several properties of the equilibrium flow (Eq. 58) from the viewpoint of information geometry by employing the objects introduced in the previous sections. Such properties include the existence and uniqueness of the steady state (static property), convergence to the state (kinetic property), and the balance between information-geometric quantities associated with the steady state and convergence along the trajectory (the connection between static and kinetic properties). These properties are consistent with those that thermodynamic equilibrium systems should have. In addition, several results are extensions of the results obtained for FPE in the context of functional analysis, partial differential equations, and optimal transport.

8.1 Properties of equilibrium flow

The following property of the equilibrium state characterizes the static aspect of the equilibrium flow and is fundamentally ascribed to the dually flat structure of density and potential spaces (see Footnote 75):

Proposition 6

(Equilibrium state and its geometric and variational characterizations) The steady state of the equilibrium flow \(\varvec{x}_{t}\) (Eq. 58) starting from \(\varvec{x}(0)=\varvec{x}_{0}\) is called the equilibrium state \(\varvec{x}_{eq}\). For each \(\varvec{x}_{0}\), the equilibrium state is identical to the intersection \(\varvec{x}^{\dagger }=\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})\), i.e., \(\varvec{x}_{eq}=\varvec{x}^{\dagger }\), and thus uniquely exists for a given pair of the initial state \(\varvec{x}_{0}\) and the parameter of equilibrium flow \(\tilde{\varvec{x}}\). The equilibrium state \(\varvec{x}_{eq}\) is also characterized variationally as

$$\begin{aligned} \varvec{x}_{eq}&= \arg \min _{\varvec{x}\in \mathcal {P}^{sc}(\varvec{x}_{0})}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}] =\arg \min _{\varvec{x}_{q}\in \mathcal {M}^{eq}(\tilde{\varvec{x}})}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{0}\Vert \varvec{x}_{q}]. \end{aligned}$$
(126)

Moreover, \(\mathcal {M}^{eq}(\tilde{\varvec{x}})=\mathcal {M}^{\textrm{DB}}\) holds.

Proof

From Prop. 3, \(\varvec{x}_{eq} \in \mathcal {M}^{\textrm{DB}}\), from which \(\varvec{f}(\varvec{x}_{eq})=0\) follows. For the equilibrium force (Eq. 57), \( \mathcal {M}^{\textrm{DB}}=\{\varvec{x}|\varvec{f}(\varvec{x})=0\}=\{\varvec{x}|\mathbb {S}^{T}(\partial \Phi [\varvec{x}]-\partial \Phi [\tilde{\varvec{x}}])=0\}=\mathcal {M}^{eq}(\tilde{\varvec{x}})\) holds. Thus, \(\varvec{x}_{eq} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})\). Because the initial state is \(\varvec{x}_{0}\), \(\varvec{x}_{eq}\in \mathcal {P}^{sc}(\varvec{x}_{0})\). Thus, \(\varvec{x}_{eq} = \varvec{x}^{\dagger } \in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})\). The first equality of Eq. 126 is obvious from the proof of the dual foliation (Lemma 1). The second equality is from the generalized Pythagorean relation (Eq. 107). \(\square \)

The second property of the equilibrium flow is kinetic in nature and characterizes the Bregman divergence as the generalized driving potential, which ensures the convergence of \(\varvec{x}_{t}\) to the equilibrium state. This property is attributed to the dually flat structure on the edge spaces.

Proposition 7

(Bregman divergence and Gibbs’ H-Theorem) For the trajectory of the equilibrium flow \(\varvec{x}_{t}\) (Eq. 58) starting from \(\varvec{x}(0)=\varvec{x}_{0}\), the thermodynamic function \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]\) decreases, that is, \(\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t< 0\) except at \(\varvec{x}_{t} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})\) where \(\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t=0\) holds. Thus, the equilibrium state \(\varvec{x}_{eq} \in \mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}})\) is locally and asymptotically stable.

Proof

By replacing \(\mathcal {F}(\varvec{x})\) in Prop. 3 with \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\), we obtain \(\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t=-\left[ \Psi ^{*}_{\varvec{x}_{t}}(\varvec{f}(\varvec{x}_{t}))+ \Psi _{\varvec{x}_{t}}(\varvec{j}(\varvec{x}_{t}))\right] \le 0\), and the equality holds if and only if \(\varvec{f}(\varvec{x}_{t})=0\) (\(\Leftrightarrow \varvec{x}_{t} \in \mathcal {M}^{eq}(\tilde{\varvec{x}})\)). \(\square \)

Because \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\) can be identified with the difference of total entropy between \(\varvec{x}\) and \(\tilde{\varvec{x}}\) for thermodynamic systems such as CRN [85], \(\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t\le 0\) corresponds to the nondecreasing property of thermodynamic entropy, which is also referred to as Gibbs’ H-theorem (see Footnote 76).

The third property provides a connection between the thermodynamic function and the dissipation function, which is immediately obtained from De Giorgi’s formulation of the generalized gradient flow (Eq. 52):

Proposition 8

(Balancing of thermodynamic function and dissipation function) For the trajectory of the equilibrium flow \(\varvec{x}_{t}\) (Eq. 58) starting from \(\varvec{x}(0)=\varvec{x}_{0}\), the following relation holds for the thermodynamic function and the dissipation function:

$$\begin{aligned} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{0}\Vert \tilde{\varvec{x}}]-\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]= \int _{t'=0}^{t}\left[ \Psi ^{*}_{\varvec{x}_{t'}}(\varvec{f}(\varvec{x}_{t'}))+ \Psi _{\varvec{x}_{t'}}(\varvec{j}(\varvec{x}_{t'}))\right] \textrm{d}t' = \int _{t'=0}^{t}\dot{\Sigma }_{t'}\textrm{d}t'. \end{aligned}$$
(127)

In physics and chemistry, this relation means that, for equilibrium systems, the difference in the thermodynamic (potential) function between \(\varvec{x}_{t}\) and \(\varvec{x}_{0}\) (the left-hand side), i.e., the change in total entropy, is equal to the integral of dissipation along \(\varvec{x}_{t}\) (the right-hand side), i.e., the entropy production.

All these results indicate that the equilibrium flow and its properties mathematically abstract the properties of physical equilibrium systems. The equilibrium state \(\varvec{x}_{eq}\) is characterized algebraically by the unique intersection of \(\mathcal {P}^{sc}(\varvec{x}_{0})\) and \(\mathcal {M}^{eq}(\tilde{\varvec{x}})\) and also variationally by Eq. 126. The convergence to \(\varvec{x}_{eq}\) is guaranteed by \(\textrm{d}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}]/\textrm{d}t\le 0\). Furthermore, the entropy-dissipation balance relation (Eq. 127) itself defines the equilibrium system abstractly, as De Giorgi’s formulation (Eq. 52) does.
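Props. 6 to 8 can be illustrated with a short simulation. The following is a rough sketch under strong simplifying assumptions that are not part of the source: a hypothetical triangle graph, \(\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})\) so that \(\mathcal {D}^{\mathcal {X}}_{\Phi }\) is the generalized Kullback-Leibler divergence, a quadratic dual dissipation function with state-independent unit weights, and the conventions \(\dot{\varvec{x}}=-\mathbb {S}\varvec{j}\) and \(\varvec{f}=\mathbb {S}^{T}(\partial \Phi [\varvec{x}]-\partial \Phi [\tilde{\varvec{x}}])\). An explicit Euler scheme is used, so the balance relation (Eq. 127) is reproduced only up to a discretization error of order `dt`.

```python
import numpy as np

# Assumptions for illustration only: triangle graph, Phi(x) = sum(x log x - x),
# Psi*_x(f) = 0.5 * f.W.f with x-independent weights, xdot = -S j,
# and f = S^T (dPhi(x) - dPhi(x_tilde)).
S = np.array([[-1.0,  0.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 0.0,  1.0, -1.0]])
W = np.eye(3)
x_tilde = np.array([0.5, 1.0, 1.5])   # parameter of the equilibrium flow
x0 = np.array([3.0, 1.0, 2.0])        # initial state

def bregman(x):
    """D_Phi[x || x_tilde], here the generalized Kullback-Leibler divergence."""
    return np.sum(x * np.log(x / x_tilde) - x + x_tilde)

def force(x):
    return S.T @ (np.log(x) - np.log(x_tilde))

dt, n_steps = 1e-3, 20000
x = x0.copy()
D = [bregman(x)]
dissipated = 0.0
for _ in range(n_steps):
    f = force(x)
    j = W @ f                     # j = d Psi* / d f
    dissipated += dt * (j @ f)    # Psi(j) + Psi*(f) = <j, f> for a Legendre pair
    x = x - dt * (S @ j)          # explicit Euler step of xdot = -S j
    D.append(bregman(x))
D = np.array(D)

# Intersection of P^sc(x0) and M^eq(x_tilde): for a connected graph and this Phi,
# it is x_tilde rescaled to the total mass of x0.
x_eq = x_tilde * x0.sum() / x_tilde.sum()

print("final state     :", x)
print("predicted x_eq  :", x_eq)
print("D drop          :", D[0] - D[-1])
print("total dissipated:", dissipated)
```

Note that the trajectory converges to \(\varvec{x}_{eq}\ne \tilde{\varvec{x}}\) here, because the initial state fixes the stoichiometric compatibility class while \(\tilde{\varvec{x}}\) only fixes the equilibrium manifold.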

8.2 Induced dually flat structure on tangent–cotangent spaces

The equilibrium state is characterized geometrically and variationally via the information-geometric structure on the vertex spaces (\(\mathcal {X}\), \(\mathcal {Y}\)) as in Prop. 6. Similarly, the flux (kinetic law) of equilibrium systems (gradient systems) can be obtained variationally as the flux minimizing the dissipation function under the restriction of the continuity equation.

Fig. 7 The induced dually flat structure on the restricted tangent and cotangent spaces from the dissipation functions on the edge spaces (gray region). The relationship is compared with the Riemannian metric on tangent and cotangent spaces via Fisher information matrices induced by the thermodynamic functions (dotted box).

Lemma 4

(Equilibrium force as the minimizer of primal dissipation function) For a given trajectory \(\{\varvec{x}_{t}\}\), we define the trajectory of the flux \(\{\varvec{j}^{\dagger }_{t}\}\) minimizing the primal dissipation:

$$\begin{aligned} \{\varvec{j}^{\dagger }_{t}\}:=\arg \min _{\{\varvec{j}_{t}\}}\int _{0}^{t}\Psi _{\varvec{x}_{t'}}[\varvec{j}_{t'}]\textrm{d}t',\quad \text{ s.t. } \dot{\varvec{x}}_{t'}+\textrm{div}_{\mathbb {S}} \varvec{j}_{t'}=0\hbox { for all }t'\in [0,t]. \end{aligned}$$
(128)

Then, \(\varvec{j}^{\dagger }_{t}\) is generated by the equilibrium force, \(\varvec{f}_{t}^{\dagger }=\partial \Psi _{\varvec{x}}[\varvec{j}^{\dagger }_{t}] \in \mathcal {P}^{fr}(\varvec{0})\). Thus, the minimum primal dissipation flux that generates the given \(\{\varvec{x}_{t}\}\) is the equilibrium flux.

Proof

Because the minimization in Eq. 128 can be carried out pointwise for each \(t'\in [0,t]\) and \(\dot{\varvec{x}}_{t'}+\textrm{div}_{\mathbb {S}} \varvec{j}_{t'}=0\Longleftrightarrow \varvec{j}_{t'}\in \mathcal {P}^{vl}(\dot{\varvec{x}}_{t'})\), we have

$$\begin{aligned} \varvec{j}_{t'}^{\dagger }=\varvec{j}^{\dagger }(\varvec{x}_{t'},\dot{\varvec{x}}_{t'})= \arg \min _{\varvec{j}\in \mathcal {P}^{vl}(\dot{\varvec{x}}_{t'}) }\Psi _{\varvec{x}_{t'}}[\varvec{j}]=\mathcal {P}^{vl}(\dot{\varvec{x}}_{t'}) \cap \mathcal {M}^{fr}_{\varvec{x}_{t'}}(\varvec{0}), \end{aligned}$$
(129)

where we used Eq. 115 and Eq. 116. Thus, from \(\varvec{j}_{t'}^{\dagger } \in \mathcal {M}^{fr}_{\varvec{x}_{t'}}(\varvec{0}) \Longleftrightarrow \varvec{f}_{t'}^{\dagger } \in \mathcal {P}^{fr}(\varvec{0})\), the minimum dissipation flux \(\{\varvec{j}^{\dagger }_{t}\}\) is generated by the equilibrium force, \(\{\varvec{f}_{t}^{\dagger }\}=\{\varvec{f}^{\dagger }(\varvec{x}_{t'},\dot{\varvec{x}}_{t'})\}\in \mathcal {P}^{fr}(\varvec{0})\) where \(\varvec{f}^{\dagger }(\varvec{x},\dot{\varvec{x}}):=\partial \Psi _{\varvec{x}}[\varvec{j}^{\dagger }(\varvec{x},\dot{\varvec{x}})]\). \(\square \)

By exploiting this unique pairing between \(\dot{\varvec{x}}_{t'}\) and \(\varvec{j}_{t'}^{\dagger }\) or \(\dot{\varvec{x}}_{t'}\) and \(\varvec{f}_{t'}^{\dagger }\), we can obtain an induced dually flat structure on the restricted tangent and cotangent spaces of \(\mathcal {X}\) and \(\mathcal {Y}\) (Fig. 7), which can be regarded as an information-geometric extension of the Otto structure.

Theorem 2

(Induced dually flat structure on tangent and cotangent spaces) Let \(\tilde{\mathcal {T}}_{\varvec{x}} \mathcal {X}:=\textrm{Im}\mathbb {S}\cong \mathcal {P}^{sc}(\varvec{0})\subset \mathcal {T}_{\varvec{x}}\mathcal {X}\) and \(\tilde{\mathcal {T}}_{\varvec{x}}^{*} \mathcal {X}:=\mathcal {T}_{\varvec{x}}^{*}\mathcal {X}/\textrm{Ker}\mathbb {S}^{T}\) be tangent and cotangent spaces on \(\mathcal {X}\) restricted by \(\mathbb {S}\). On \(\tilde{\mathcal {T}}_{\varvec{x}} \mathcal {X}\) and \(\tilde{\mathcal {T}}_{\varvec{x}}^{*} \mathcal {X}\), we have the Legendre conjugate dissipation functions \(\tilde{\Psi }_{\varvec{x}}: \tilde{\mathcal {T}}_{\varvec{x}} \mathcal {X}\rightarrow \mathbb {R}\) and \(\tilde{\Psi }_{\varvec{x}}^{*}: \tilde{\mathcal {T}}_{\varvec{x}}^{*} \mathcal {X}\rightarrow \mathbb {R}\) induced by the dissipation functions on the edge spaces (Fig. 7).

Proof

By employing Eq. 129, for each \(\varvec{v}\in \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}\), we can uniquely determine \(\varvec{j}^{\dagger }(\varvec{x},\varvec{v})\), \(\varvec{f}^{\dagger }(\varvec{x},\varvec{v})\in \mathcal {P}^{fr}(\varvec{0})\), and \(\varvec{u}^{\dagger }(\varvec{x},\varvec{v}) \in \tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}\) (see Footnote 77). They satisfy

$$\begin{aligned} \varvec{v}&= - \mathbb {S}\varvec{j}^{\dagger }(\varvec{x},\varvec{v}), \quad \varvec{j}^{\dagger }(\varvec{x},\varvec{v}) = \partial \Psi ^{*}_{\varvec{x}}[\varvec{f}^{\dagger }(\varvec{x},\varvec{v})],\nonumber \\ \varvec{f}^{\dagger }(\varvec{x},\varvec{v})&= - \mathbb {S}^{T} \varvec{u}^{\dagger }(\varvec{x},\varvec{v}). \quad \end{aligned}$$
(130)

Conversely, for a given \(\varvec{u}\in \tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}\), we have \(\varvec{f}^{\ddagger }(\varvec{u})\), \(\varvec{j}^{\ddagger }(\varvec{x},\varvec{u})\), and \(\varvec{v}^{\ddagger }(\varvec{x},\varvec{u})\) as follows:

$$\begin{aligned} \varvec{v}^{\ddagger }(\varvec{x},\varvec{u}) = - \mathbb {S}\varvec{j}^{\ddagger }(\varvec{x},\varvec{u}),\quad \varvec{j}^{\ddagger }(\varvec{x},\varvec{u}) = \partial \Psi ^{*}_{\varvec{x}}[\varvec{f}^{\ddagger }(\varvec{u})], \quad \varvec{f}^{\ddagger }(\varvec{u}) = - \mathbb {S}^{T} \varvec{u}. \end{aligned}$$
(131)

Thus, for a pair of \((\varvec{v},\varvec{u})_{\varvec{x}}\) satisfying \(\varvec{u}=\varvec{u}^{\dagger }(\varvec{x},\varvec{v})\), we have \(\varvec{v}=\varvec{v}^{\ddagger }(\varvec{x},\varvec{u})\), \(\varvec{j}^{\dagger }(\varvec{x},\varvec{v})=\varvec{j}^{\ddagger }(\varvec{x},\varvec{u})\), and \(\varvec{f}^{\dagger }(\varvec{x},\varvec{v})=\varvec{f}^{\ddagger }(\varvec{x},\varvec{u})\). This pairing establishes a bijection between \(\tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}\) and \(\tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}\). Moreover, this bijection is realized by the Legendre transformations of the following induced dissipation functions on \( \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}\) and \(\tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X}\):

$$\begin{aligned} \tilde{\Psi }_{\varvec{x}}(\varvec{v})&:=\Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},\varvec{v})),&\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u})&:=\Psi _{\varvec{x}}^{*}(\varvec{f}^{\ddagger }(\varvec{u})). \end{aligned}$$
(132)

These functions are Legendre conjugate as follows:

$$\begin{aligned} \max _{\varvec{u}'\in \tilde{\mathcal {T}}^{*}_{\varvec{x}}\mathcal {X}}&\left[ \langle \varvec{v},\varvec{u}'\rangle -\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u}') \right] \\&=\max _{\begin{array}{c} \varvec{u}'\in \tilde{\mathcal {T}}^{*}_{\varvec{x}}\mathcal {X}\\ \varvec{j}\in \mathcal {P}^{vl}(\varvec{v}) \end{array}}\left[ \langle -\mathbb {S}\varvec{j},\varvec{u}'\rangle -\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u}') \right] =\max _{\begin{array}{c} \varvec{u}'\in \tilde{\mathcal {T}}^{*}_{\varvec{x}}\mathcal {X}\\ \varvec{j}\in \mathcal {P}^{vl}(\varvec{v}) \end{array}}\left[ \langle \varvec{j},-\mathbb {S}^{T}\varvec{u}'\rangle -\Psi _{\varvec{x}}^{*}(-\mathbb {S}^{T}\varvec{u}') \right] \\&=\max _{\begin{array}{c} \varvec{f}'\in \mathcal {P}^{fr}(\varvec{0})\\ \varvec{j}\in \mathcal {P}^{vl}(\varvec{v}) \end{array}}\left[ \langle \varvec{j},\varvec{f}'\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}') \right] \\&=\max _{\begin{array}{c} \varvec{f}'\in \mathcal {P}^{fr}(\varvec{0})\\ \varvec{j}\in \mathcal {P}^{vl}(\varvec{v}) \end{array}}\left[ \langle \varvec{j}^{\dagger }(\varvec{x},\varvec{v}),\varvec{f}'\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}') + \langle (\varvec{j}-\varvec{j}^{\dagger }(\varvec{x},\varvec{v})),\varvec{f}'\rangle \right] \\&=\max _{\begin{array}{c} \varvec{f}'\in \mathcal {P}^{fr}(\varvec{0}) \end{array}}\left[ \langle \varvec{j}^{\dagger }(\varvec{x},\varvec{v}),\varvec{f}'\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}') \right] =\Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},\varvec{v}))=\tilde{\Psi }_{\varvec{x}}(\varvec{v}), \end{aligned}$$

where we used \(\langle \varvec{j}-\varvec{j}^{\dagger }(\varvec{x},\varvec{v}),\varvec{f}'\rangle =0\) because \(\varvec{f}' \in \mathcal {P}^{fr}(\varvec{0})=\textrm{Im}\mathbb {S}^{T}\) and \((\varvec{j}-\varvec{j}^{\dagger }(\varvec{x},\varvec{v})) \in \textrm{Ker}\mathbb {S}\). The inverse is also shown:

$$\begin{aligned} \max _{\varvec{v}'\in \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}}&\left[ \langle \varvec{v}',\varvec{u}\rangle -\tilde{\Psi }_{\varvec{x}}(\varvec{v}') \right] =\max _{\begin{array}{c} \varvec{v}'\in \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X} \end{array}}\left[ \langle -\mathbb {S}\varvec{j}^{\dagger }(\varvec{x},\varvec{v}'),\varvec{u}\rangle -\Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},\varvec{v}')) \right] \\&=\max _{\begin{array}{c} \varvec{j}^{\dagger }\in \mathcal {M}^{fr}_{\varvec{x}}(\varvec{0}) \end{array}}\left[ \langle \varvec{j}^{\dagger },-\mathbb {S}^{T}\varvec{u}\rangle -\Psi _{\varvec{x}}(\varvec{j}^{\dagger }) \right] =\max _{ \varvec{j}^{\dagger }\in \mathcal {M}^{fr}_{\varvec{x}}(\varvec{0})}\left[ \langle \varvec{j}^{\dagger },\varvec{f}^{\ddagger }(\varvec{u})\rangle -\Psi _{\varvec{x}}(\varvec{j}^{\dagger }) \right] \\&=\Psi _{\varvec{x}}^{*}(\varvec{f}^{\ddagger }(\varvec{u}))=\tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u}), \end{aligned}$$

where we used the fact that \(\{\varvec{j}^{\dagger }(\varvec{x},\varvec{v}')\}_{\varvec{v}'\in \tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}}=\mathcal {M}^{fr}_{\varvec{x}}(\varvec{0})\) in the second line. The pair \((\varvec{v},\varvec{u})_{\varvec{x}}\) is a Legendre dual pair with respect to these functions:

$$\begin{aligned} \partial _{\varvec{v}} \tilde{\Psi }_{\varvec{x}}(\varvec{v})&= \left[ \frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}\right] ^{T}\left. \frac{\partial \Psi _{\varvec{x}}(\varvec{j})}{\partial \varvec{j}}\right| _{\varvec{j}=\varvec{j}^{\dagger }(\varvec{x},\varvec{v})} =\left[ \frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}\right] ^{T}\varvec{f}^{\dagger }(\varvec{x},\varvec{v}) \end{aligned}$$
(133)
$$\begin{aligned}&=-\left[ \frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}\right] ^{T}\mathbb {S}^{T}\varvec{u}^{\dagger } (\varvec{x},\varvec{v})=\varvec{u}, \end{aligned}$$
(134)
$$\begin{aligned} \partial _{\varvec{u}} \tilde{\Psi }^{*}_{\varvec{x}}(\varvec{u})&= \left[ \frac{\partial \varvec{f}^{\ddagger }(\varvec{u})}{\partial \varvec{u}}\right] ^{T}\left. \frac{\partial \Psi ^{*}_{\varvec{x}}(\varvec{f})}{\partial \varvec{f}}\right| _{\varvec{f}=\varvec{f}^{\ddagger }(\varvec{u})}=\left[ \frac{\partial \varvec{f}^{\ddagger }(\varvec{u})}{\partial \varvec{u}}\right] ^{T}\varvec{j}^{\ddagger }(\varvec{x},\varvec{u}) \end{aligned}$$
(135)
$$\begin{aligned}&=-\mathbb {S}\varvec{j}^{\ddagger }(\varvec{x},\varvec{u})=\varvec{v}, \end{aligned}$$
(136)

where we used \(\left[ \frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}\right] ^{T}\mathbb {S}^{T}=-I\) from \(\frac{\partial }{\partial \varvec{v}}[\varvec{v}+\mathbb {S}\varvec{j}^{\dagger }(\varvec{x},\varvec{v})]=I+ \mathbb {S}\frac{\partial \varvec{j}^{\dagger }(\varvec{x},\varvec{v})}{\partial \varvec{v}}=0\) and \(\frac{\partial \varvec{f}^{\ddagger }(\varvec{u})}{\partial \varvec{u}} =-\mathbb {S}^{T}\). They are dissipation functions; strict convexity and 1-coercivity follow from those of the original dissipation functions. Also, we have

$$\begin{aligned}&\text{ Symmetry }:\quad \tilde{\Psi }_{\varvec{x}}(-\varvec{v}) = \Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},-\varvec{v}))=\Psi _{\varvec{x}} (-\varvec{j}^{\dagger }(\varvec{x},\varvec{v}))=\tilde{\Psi }_{\varvec{x}}(\varvec{v}) \end{aligned}$$
(137)
$$\begin{aligned}&\text{ Bounded } \text{ by } 0 \text{ at } \varvec{0}: \quad \tilde{\Psi }_{\varvec{x}}(\varvec{v}=\varvec{0}) =\Psi _{\varvec{x}}(\varvec{j}^{\dagger }(\varvec{x},\varvec{0}))=\Psi _{\varvec{x}}(\varvec{0})=0. \end{aligned}$$
(138)

\(\square \)

Using the induced dissipation functions, we define the Bregman divergence on \((\tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}, \tilde{\mathcal {T}}^{*}_{\varvec{x}}\mathcal {X})\), which is associated with the Bregman divergence on \((\mathcal {J}_{\varvec{x}},\mathcal {F}_{\varvec{x}})\):

$$\begin{aligned} \mathcal {D}^{\mathcal {X},\mathcal {Y}}_{\tilde{\Psi }_{\varvec{x}}}[\varvec{v}\Vert \varvec{u}']&:=\tilde{\Psi }_{\varvec{x}}(\varvec{v}) + \tilde{\Psi }_{\varvec{x}}^{*}(\varvec{u}') -\langle \varvec{v}, \varvec{u}'\rangle \end{aligned}$$
(139)
$$\begin{aligned}&=\Psi _{\varvec{x}}(\varvec{j}^{\dagger }) + \Psi _{\varvec{x}}^{*}(\varvec{f}'^{\ddagger })-\langle \varvec{j}^{\dagger }, \varvec{f}'^{\ddagger }\rangle =\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}} [\varvec{j}^{\dagger }\Vert \varvec{f}'^{\ddagger }], \end{aligned}$$
(140)

where \(\varvec{j}^{\dagger }=\varvec{j}^{\dagger }(\varvec{x},\varvec{v})\) and \(\varvec{f}'^{\ddagger }=\varvec{f}^{\ddagger }(\varvec{x},\varvec{u}')\). Therefore, we have the induced dually flat structure on \((\tilde{\mathcal {T}}_{\varvec{x}}\mathcal {X}, \tilde{\mathcal {T}}_{\varvec{x}}^{*}\mathcal {X})\). This induced structure can be regarded as an extension to discrete manifolds of the Otto structure [65, 66]: the formal Riemannian structure induced by the \(L^{2}\)-Wasserstein distance. This is also related to Pistone’s infinite-dimensional information geometry [72, 121].
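The induced dissipation functions of Eq. 132 become explicit when the edge dissipation function is quadratic. The sketch below assumes this special case on a hypothetical triangle graph with assumed weights `w`; under that assumption \(\tilde{\Psi }^{*}_{\varvec{x}}(\varvec{u})=\frac{1}{2}\varvec{u}^{T}\mathbb {S}W\mathbb {S}^{T}\varvec{u}\) is the quadratic form of a weighted graph Laplacian and \(\tilde{\Psi }_{\varvec{x}}(\varvec{v})\) is that of its pseudo-inverse. The code checks the Legendre duality of the pair \((\varvec{v},\varvec{u})_{\varvec{x}}\) and the invariance of \(\tilde{\Psi }^{*}_{\varvec{x}}\) under shifts in \(\textrm{Ker}\mathbb {S}^{T}\); the function names are illustrative.

```python
import numpy as np

# Hypothetical triangle graph and assumed quadratic edge dissipation functions.
S = np.array([[-1.0,  0.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 0.0,  1.0, -1.0]])
w = np.array([1.0, 2.0, 0.5])
W, W_inv = np.diag(w), np.diag(1.0 / w)
L = S @ W @ S.T                      # weighted graph Laplacian
L_pinv = np.linalg.pinv(L)

def psi(j):
    return 0.5 * j @ W_inv @ j

def psi_star(f):
    return 0.5 * f @ W @ f

def j_dagger(v):
    """Minimum-dissipation flux satisfying v = -S j (Eq. 129, quadratic case)."""
    return -W @ S.T @ L_pinv @ v

def f_ddagger(u):
    """Force generated by a cotangent vector, f = -S^T u (Eq. 131)."""
    return -S.T @ u

def psi_tilde(v):
    return psi(j_dagger(v))          # Eq. 132, left

def psi_tilde_star(u):
    return psi_star(f_ddagger(u))    # Eq. 132, right

v = -S @ np.array([1.0, -0.3, 2.0])  # a tangent vector in Im S
u = L_pinv @ v                       # a representative of its dual cotangent vector

print("v                 :", v)
print("-S j_dagger(v)    :", -S @ j_dagger(v))
print("d psi_tilde* / du :", L @ u)
print("Legendre identity :", psi_tilde(v) + psi_tilde_star(u), "vs", v @ u)
```

Adding a constant vector to `u` changes neither `psi_tilde_star(u)` nor the pairing with `v`, which reflects the quotient by \(\textrm{Ker}\mathbb {S}^{T}\) in the definition of the restricted cotangent space.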

8.3 Fisher information, natural gradient, mirror descent, evolutionary computation, and optimal transport

In information geometry, it is conventional to use the Fisher information matrices, i.e., the Hessian matrices \(G_{\varvec{x}}\) and \(G^{*}_{\varvec{y}}\) (Eq. 31) as the metric tensor (Fisher–Rao metric) on \((\mathcal {T}_{\varvec{x}}\mathcal {X}, \mathcal {T}_{\varvec{x}}^{*}\mathcal {X})\) or equivalently on \((\mathcal {T}_{\varvec{y}}\mathcal {Y}, \mathcal {T}_{\varvec{y}}^{*}\mathcal {Y})\) (Fig. 7). Gradient systems have been defined information-geometrically [25] as a Riemannian gradient flow using the Bregman divergence and the Fisher information matrix of \(\Phi (\varvec{x})\) as the gradient function and the metric tensor, respectively: \(\dot{\varvec{x}}=-G_{\varvec{x}}^{-1} \partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\). Because both \(G_{\varvec{x}}\) and \(\mathcal {D}^{\mathcal {X}}_{\Phi }\) are derived from \(\Phi (\varvec{x})\), this gradient flow becomes a geodesic in \(\mathcal {Y}\) space: \(\dot{\varvec{y}}=- (\varvec{y}-\tilde{\varvec{y}})\). In natural gradient descent [67, 69, 148], the Fisher information matrix is used to find the steepest descent gradient of a function \(\mathcal {F}(\varvec{\theta })\) on a parameter space \(\Theta \) as \(\dot{\varvec{\theta }}=-G_{\varvec{\theta }}^{-1} \partial \mathcal {F}(\varvec{\theta })\), where \(G_{\varvec{\theta }}\) is determined independently of \(\mathcal {F}(\varvec{\theta })\) by considering the underlying model parameter space. In optimization, the natural gradient is fundamental in information-geometric optimization algorithms, which contain various evolutionary optimization schemes [70]. In relation to machine learning, the mirror descent is identified with the natural gradient descent by a naive continuous limit [69, 149]. Furthermore, optimal transport has recently been employed to replace or integrate the Fisher-Rao metric with the Wasserstein metric [150, 151]. 
Because the Wasserstein metric can take the information of the base manifold into account, their integration may provide more amenable ways to accommodate various prior and structural information.
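
The statement that the information-geometric gradient flow becomes a geodesic in \(\mathcal {Y}\) can be checked numerically. The following is a minimal sketch, assuming \(\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})\), so that \(G_{\varvec{x}}^{-1}=\textrm{diag}(\varvec{x})\), \(\varvec{y}=\ln \varvec{x}\), and the Bregman divergence is the generalized KL divergence. The function names, initial state, and reference state are hypothetical choices made only for illustration.

```python
import math

def natural_gradient_flow(x0, x_ref, t_end, n_steps):
    """Euler-integrate dx/dt = -G_x^{-1} dD/dx for Phi(x) = sum_i (x_i ln x_i - x_i)."""
    x = list(x0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        # dD/dx_i = ln(x_i / x_ref_i) and G_x^{-1} = diag(x_i) for this Phi.
        x = [xi - dt * xi * math.log(xi / ri) for xi, ri in zip(x, x_ref)]
    return x

def dual_geodesic(x0, x_ref, t):
    """Closed-form solution of dy/dt = -(y - y_ref) in the dual coordinate y = ln x."""
    return [math.log(r) + (math.log(a) - math.log(r)) * math.exp(-t)
            for a, r in zip(x0, x_ref)]

# Hypothetical initial state, reference state, and final time.
x0, x_ref, t_end = [0.2, 3.0], [1.0, 0.5], 1.5
y_numeric = [math.log(v) for v in natural_gradient_flow(x0, x_ref, t_end, 20000)]
y_exact = dual_geodesic(x0, x_ref, t_end)
max_gap = max(abs(a - b) for a, b in zip(y_numeric, y_exact))
print(f"max |y_numeric - y_exact| = {max_gap:.2e}")
```

The numerically integrated flow in \(\varvec{x}\) agrees, up to the discretization error of the Euler scheme, with the straight line \(\varvec{y}(t)=\tilde{\varvec{y}}+e^{-t}(\varvec{y}(0)-\tilde{\varvec{y}})\) in the dual coordinates.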

The doubly dual flat structure introduced in this work provides a way to generalize these results and to address the associated problems. The base space \(\mathcal {X}\) with the dually flat structure and the associated Fisher information matrix accommodates the conventional natural gradient. The graph or hypergraph structure endows the base space \(\mathcal {X}\) with an additional topological relation. The dissipation functions on the edge spaces, or their induced versions, offer a more flexible way than the Fisher–Rao metric to represent the loss of the potential function, i.e., the dissipation, at each point in the state space. If necessary, we may combine both of them (Fig. 7), for example, as \(\dot{\varvec{x}}=-G_{\varvec{x}}^{-1} \partial \mathcal {F}^{(1)}(\varvec{x})-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}^{(2)}(\varvec{x})]\), where \(\mathcal {F}^{(1)}(\varvec{x})\) and \(\mathcal {F}^{(2)}(\varvec{x})\) may differ. This flexibility may contribute to the design of new algorithms for machine learning. Moreover, this integrated representation is closely related to the filtering equations [152] in sequential inference, where the first term, i.e., \(-G_{\varvec{x}}^{-1} \partial \mathcal {F}^{(1)}(\varvec{x})\), can usually be associated with the update of the posterior probability by observation, and the second term, \(-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}[\textrm{grad}_{\mathbb {S}}\partial \mathcal {F}^{(2)}(\varvec{x})]\), can represent the prediction based on the prior information about the dynamics. Our framework may provide a unified perspective on various information-geometric analyses and extensions of filtering, e.g., projection filters [153], information-geometric nonlinear filtering [154], and information-geometric optimization [70]. 
Furthermore, the generalized gradient flow can be regarded as a continuous time limit of the mirror descent where the nonlinear Legendre duality between primal and dual spaces is preserved at the limit. This fact may be employed to design new gradient-based algorithms via the doubly dual flat structure.
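
The continuous-time limit of mirror descent can be illustrated in the simplest setting. The sketch below assumes \(\Phi (\varvec{x})=\sum _{i}(x_{i}\ln x_{i}-x_{i})\) and the objective \(\mathcal {F}(\varvec{x})=\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\), for which the mirror-descent update is a plain gradient step in the dual coordinate \(\varvec{y}=\ln \varvec{x}\). It does not include any graph structure or dissipation function; the step sizes and states are hypothetical.

```python
import math

def mirror_descent(x0, x_ref, eta, n_steps):
    """Mirror descent for F(x) = D_Phi[x || x_ref] with Phi(x) = sum_i (x_i ln x_i - x_i)."""
    y = [math.log(v) for v in x0]          # dual coordinate y = dPhi(x) = ln x
    y_ref = [math.log(v) for v in x_ref]
    for _ in range(n_steps):
        grad = [a - b for a, b in zip(y, y_ref)]      # dF/dx evaluated at x = exp(y)
        y = [a - eta * g for a, g in zip(y, grad)]    # gradient step taken in the dual space
    return [math.exp(v) for v in y]        # map back to the primal space: x = exp(y)

# Hypothetical states; the total time t_end = eta * n_steps is held fixed.
x0, x_ref, t_end = [0.2, 3.0], [1.0, 0.5], 1.0
# Solution of the limiting flow dy/dt = -(y - y_ref), written in x coordinates.
x_flow = [r * (a / r) ** math.exp(-t_end) for a, r in zip(x0, x_ref)]
gaps = []
for n_steps in (10, 100, 1000):
    x_md = mirror_descent(x0, x_ref, t_end / n_steps, n_steps)
    gaps.append(max(abs(math.log(a) - math.log(b)) for a, b in zip(x_md, x_flow)))
print(gaps)
```

As the step size shrinks at fixed total time, the gap between the mirror-descent iterate and the continuous flow decreases, while the Legendre map between \(\varvec{x}\) and \(\varvec{y}\) is used unchanged at every step.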

9 Information-geometric properties of generalized nonequilibrium flow

In this section, we consider the nonequilibrium flow defined by Eq. 61, i.e.,

$$\begin{aligned} \dot{\varvec{x}}=-\textrm{div}_{\mathbb {S}} \partial \Psi ^{*}_{\varvec{x}}\left[ \left[ \textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\right] +\varvec{f}_{NE}\right] , \end{aligned}$$
(141)

with \(\varvec{f}_{NE}\not \in \textrm{Im}\mathbb {S}^{T}\), and show how information geometry can be employed to analyze such dynamics. While several properties of the equilibrium flow hold independently of the details of the thermodynamic function and the dissipation function, these two functions must be suitably related for the nonequilibrium flow to have comparably nice properties. We will observe that the thermodynamic function and the dissipation function of LMA kinetics indeed satisfy such a relation.

9.1 Gradient-flow-like property and Lyapunov function of nonequilibrium flow

For the equilibrium flow (Eq. 58), the Bregman divergence \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]\) is a Lyapunov function. The Bregman divergence can still be a Lyapunov function even for the nonequilibrium flow (Eq. 141) under the following conditions:

Lemma 5

Suppose that, for all \(\varvec{x}\in \mathcal {X}\), the force \(\varvec{f}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}]+\varvec{f}_{NE}\) is orthogonally decomposed as \(\varvec{f}(\varvec{x})=\varvec{f}_{S}(\varvec{x}) + \varvec{f}_{A}(\varvec{x})\) where \(\varvec{f}_{S}(\varvec{x}):=\textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{CB}]\) and \(\varvec{f}_{A}(\varvec{x})\in \mathcal {F}_{\varvec{x}}\) satisfy the pseudo-Hilbert-isosceles orthogonality \(\varvec{f}_{S}(\varvec{x}) \perp _{H} \varvec{f}_{A}(\varvec{x})\). Then \(\frac{\textrm{d}}{\textrm{d}t}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}]\le 0\) holds. In addition, \(\varvec{x}_{CB}=\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})\) is the unique steady state of Eq. 141 with the initial state \(\varvec{x}_{0}\) that attains \(\frac{\textrm{d}}{\textrm{d}t}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}]=0\). Thus, \(\varvec{x}_{CB}\) is locally and asymptotically stable.Footnote 78

Proof

We can directly verify \(\frac{\textrm{d}}{\textrm{d}t}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}]\le 0\) as follows:

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}]&=\langle \dot{\varvec{x}}, \partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}] \rangle =-\langle \textrm{div}_{\mathbb {S}}\varvec{j}(\varvec{x}), \partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}] \rangle \end{aligned}$$
(142)
$$\begin{aligned}&=-\langle \varvec{j}(\varvec{x}), \textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}} \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}_{t}\Vert \tilde{\varvec{x}}_{CB}] \rangle \nonumber \\&=-\langle \varvec{j}(\varvec{x}), \varvec{f}_{S}(\varvec{x}) \rangle =-\frac{1}{2}\mathcal {D}^{\mathcal {J},\mathcal {F}}_{\varvec{x}}[\varvec{j} (\varvec{x})\Vert -\varvec{f}''(\varvec{x})] \le 0 \end{aligned}$$
(143)

where we used Eq. 125 and \(\varvec{f}''(\varvec{x}):=\varvec{f}_{S}(\varvec{x})-\varvec{f}_{A}(\varvec{x})\). The equality holds if and only if \(\varvec{f}(\varvec{x})=-\varvec{f}''(\varvec{x})\), which means that

$$\begin{aligned} \varvec{f}(\varvec{x})&=-\varvec{f}''(\varvec{x}) \Longleftrightarrow \varvec{f}_{S}(\varvec{x})=0 \Longleftrightarrow \textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{CB}]=0\nonumber \\&\qquad \Longleftrightarrow \varvec{x} \in \mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB}). \end{aligned}$$
(144)

Because \(\varvec{x}_{t}\in \mathcal {P}^{sc}(\varvec{x}_{0})\), \(\varvec{x}_{CB}=\mathcal {P}^{sc}(\varvec{x}_{0}) \cap \mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})\) holds. \(\square \)

Thus, if the pseudo-Hilbert-isosceles orthogonal decomposition exists, then the nonequilibrium flow behaves like the equilibrium flow.

9.2 Complex-balanced state and pseudo-Hilbert-isosceles orthogonality

The general conditions under which the orthogonal decomposition in Lemma 5 exists remain an open problem. However, for CRN with LMA kinetics, the decomposition exists whenever a complex-balanced steady state exists.

Proposition 9

(Complex-balanced steady state and orthogonal decomposition for CRN with LMA kinetics [78, 79, 81, 83]) Suppose that a complex balanced steady state exists, i.e., \(\mathcal {M}^{\textrm{CB}}\ne \emptyset \) for CRN with LMA kinetics (Eq. 12). Using any \(\tilde{\varvec{x}}_{CB} \in \mathcal {M}^{\textrm{CB}}\), consider a decomposition of the force \(\varvec{f}_{MA}(\varvec{x})\) as \(\varvec{f}_{MA}(\varvec{x})=\varvec{f}_{S}(\varvec{x})+\varvec{f}_{A}\) where \(\varvec{f}_{S}(\varvec{x})=\textrm{grad}_{\mathbb {S}}\partial _{\varvec{x}}\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{CB}]\), \(\varvec{f}_{A}=\ln \varvec{K}+\mathbb {S}^{T}\ln \tilde{\varvec{x}}_{CB}\), and \(\Phi (\varvec{x})\) is as in Eq. 62. Then, for the dissipation functions in Eq. 68, the pseudo-Hilbert isosceles orthogonality \(\varvec{f}_{S}(\varvec{x})\perp _{H} \varvec{f}_{A}\) holds for all \(\varvec{x} \in \mathcal {X}\). In addition, \(\mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})=\mathcal {M}^{\textrm{CB}}\) holds.

Proof

We can prove the orthogonality by direct computation. The orthogonality condition is

$$\begin{aligned} \Psi ^{*}_{\varvec{x}}[\varvec{f}_{S}(\varvec{x})+\varvec{f}_{A}]&=\Psi ^{*}_{\varvec{x}}[\varvec{f}_{S}(\varvec{x})-\varvec{f}_{A}] \end{aligned}$$
(145)
$$\begin{aligned}&\Leftrightarrow \left\langle \varvec{j}^{+}_{MA}(\varvec{x}),\left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{-\mathbb {S}^{T}} -\varvec{1}\right\rangle \nonumber \\&\quad +\left\langle \varvec{j}^{-}_{MA}(\varvec{x}),\! \left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\mathbb {S}^{T}} \!-\!\varvec{1}\right\rangle \!=\!0. \end{aligned}$$
(146)

Consider the following equality:

$$\begin{aligned} \left\langle \varvec{j}^{\pm }_{MA}(\varvec{x}),\left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\mp \mathbb {S}^{T}} -\varvec{1}\right\rangle&=\sum _{e=1}^{N_{\mathbb {e}}}k_{e}^{\pm }\tilde{\varvec{x}}_{CB}^{\varvec{\gamma }_{e}^{\pm }} \left[ \left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\varvec{\gamma }_{e}^{\mp }} -\left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\varvec{\gamma }_{e}^{\pm }}\right] , \end{aligned}$$
(147)

By using this equality, we have the following:

$$\begin{aligned} \text{ Eq. }\,146&=\sum _{e=1}^{N_{\mathbb {e}}}\left( k_{e}^{+}\tilde{\varvec{x}}_{CB}^{\varvec{\gamma }_{e}^{+}} -k_{e}^{-}\tilde{\varvec{x}}_{CB}^{\varvec{\gamma }_{e}^{-}}\right) \left[ \left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\varvec{\gamma }_{e}^{-}} -\left( \frac{\varvec{x}}{\tilde{\varvec{x}}_{CB}}\right) ^{\varvec{\gamma }_{e}^{+}}\right] \end{aligned}$$
(148)
(149)
(150)

Thus, Eq. 145 holds for any \(\varvec{x}\in \mathcal {X}\) if \(\mathbb {B}\varvec{j}_{MA}(\tilde{\varvec{x}}_{CB})=\varvec{0}\) holds.Footnote 79 The identity \(\mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})=\mathcal {M}^{\textrm{CB}}\) can be proved by obtaining the parametric representation of \(\mathcal {M}^{\textrm{CB}}\) as \(\mathcal {M}^{\textrm{CB}}=\{\varvec{x}\in \mathcal {X}| \ln \varvec{x}-\ln \tilde{\varvec{x}}_{CB} \in \textrm{Ker}\mathbb {S}^{T} \}\) via solving \(\varvec{j}_{MA}(\varvec{x})=\varvec{0}\).Footnote 80 This representation is identical to that of \(\mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})\) (Eq. 104). \(\square \)
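
The orthogonality condition and its consequence in Lemma 5 can be checked numerically on a small example. The sketch below uses a hypothetical reversible CRN whose complexes \(2A\), \(A+B\), and \(2B\) form a cycle with identical rate constants; for this network \(\tilde{\varvec{x}}_{CB}=(1,1)\) is complex-balanced but not detailed-balanced. It assumes the convention \(\mathbb {S}_{e}=\varvec{\gamma }_{e}^{+}-\varvec{\gamma }_{e}^{-}\) and the generalized KL divergence as \(\mathcal {D}^{\mathcal {X}}_{\Phi }\). The code evaluates the left-hand side of Eq. 146 at several states and tracks the divergence along an Euler-integrated trajectory.

```python
import math

# Hypothetical reversible CRN on the complexes 2A, A+B, 2B arranged in a cycle.
# Each entry is (gamma_plus, gamma_minus): the forward reaction consumes gamma_plus.
REACTIONS = [((2, 0), (1, 1)), ((1, 1), (0, 2)), ((0, 2), (2, 0))]
K_PLUS, K_MINUS = 2.0, 1.0   # identical rate constants on every reaction (assumed)
X_CB = (1.0, 1.0)            # complex-balanced, not detailed-balanced, for these rates

def monomial(x, gamma):
    return math.prod(xi ** g for xi, g in zip(x, gamma))

def one_way_fluxes(x):
    return [(K_PLUS * monomial(x, gp), K_MINUS * monomial(x, gm)) for gp, gm in REACTIONS]

def orthogonality_residual(x):
    """Left-hand side of Eq. 146 evaluated at x."""
    ratio = [xi / ci for xi, ci in zip(x, X_CB)]
    total = 0.0
    for (gp, gm), (jp, jm) in zip(REACTIONS, one_way_fluxes(x)):
        r = monomial(ratio, gm) / monomial(ratio, gp)  # e-th entry of (x / x_CB)^(-S^T)
        total += jp * (r - 1.0) + jm * (1.0 / r - 1.0)
    return total

def velocity(x):
    """dx/dt = -S j with S_e = gamma_plus - gamma_minus (assumed sign convention)."""
    v = [0.0, 0.0]
    for (gp, gm), (jp, jm) in zip(REACTIONS, one_way_fluxes(x)):
        for i in range(2):
            v[i] += (gm[i] - gp[i]) * (jp - jm)
    return v

def divergence(x):
    """Generalized KL divergence D[x || X_CB]."""
    return sum(xi * math.log(xi / ci) - xi + ci for xi, ci in zip(x, X_CB))

residuals = [orthogonality_residual(p) for p in [(0.3, 2.0), (1.7, 0.4), (5.0, 5.0)]]

x, dt, history = [0.5, 1.5], 1e-3, []
for _ in range(2000):
    history.append(divergence(x))
    x = [xi + dt * vi for xi, vi in zip(x, velocity(x))]
print(max(abs(r) for r in residuals), history[0], history[-1])
```

In this example the residual vanishes at every tested state because the sum over the reaction cycle telescopes, and the divergence decreases along the trajectory even though the net flux on each reaction stays nonzero at \(\tilde{\varvec{x}}_{CB}\).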

Remark 10

(Algebraic structure of detailed balanced and complex balanced manifolds) We comment here on the algebraic reason why \(\mathcal {M}^{eq}(\tilde{\varvec{x}}_{CB})=\mathcal {M}^{\textrm{CB}}\) holds. First, we already showed that \(\mathcal {M}^{\textrm{DB}}=\mathcal {M}^{eq}(\tilde{\varvec{x}})\) holds generally if \(\mathcal {M}^{\textrm{DB}} \ne \emptyset \). Under LMA kinetics (Eq. 12), the DB condition \(\varvec{j}_{MA}(\varvec{x})=\varvec{0}\) is nothing but a set of binomial equations because \(\varvec{j}^{\pm }_{MA}(\varvec{x})\) are vectors of monomials of \(\varvec{x}\). Owing to this, \(\mathcal {M}^{\textrm{DB}}\) becomes a toric variety.Footnote 81 In contrast, the CB condition \(\mathbb {B}\varvec{j}_{MA}(\varvec{x}_{CB})=\varvec{0}\) is a set of polynomial equations for LMA kinetics. Nonetheless, it was shown that \(\mathcal {M}^{\textrm{CB}}\) is binomially generated and has the same structural matrix \(\mathbb {S}^{T}\) as the equilibrium manifold [100]. Because of this, the two are equivalent as manifolds.

Because rLDG (Eq. 3) is a subclass of CRN for which \(\mathbb {S}=\mathbb {B}\) holds, the complex-balanced condition is always satisfied for rLDG.

Corollary 1

(rLDG is unconditionally complex-balanced [8]) All the steady states of rLDG are complex-balanced states, i.e., \(\mathcal {M}^{\textrm{ST}} = \mathcal {M}^{\textrm{CB}}\) independently of the parameter values \(\varvec{k}^{\pm }\) of the flux (Eq. 3).Footnote 82 Thus, KL divergence (Eq. 64) always works as a Lyapunov function of rLDG.Footnote 83

The properties described in this corollary are well-known for rMJP and are usually obtained by using the Perron-Frobenius theorem for linear operators. The framework of the generalized flow enables us to extend them to the nonlinear regime.
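
For the linear case, the Lyapunov property in Corollary 1 can be illustrated on a small graph. The sketch below assumes a three-node ring with one-way fluxes \(k^{+}x_{\textrm{tail}}\) and \(k^{-}x_{\textrm{head}}\) on each edge, with hypothetical rate constants that break detailed balance. It tracks the KL divergence to the uniform steady state along an Euler-integrated trajectory.

```python
import math

EDGES = [(0, 1), (1, 2), (2, 0)]   # (tail, head) of each edge of a 3-node ring
K_PLUS, K_MINUS = 2.0, 1.0          # forward and backward rate constants (assumed equal on all edges)

def edge_fluxes(x):
    """Net flux on each edge: forward minus backward one-way flux."""
    return [K_PLUS * x[t] - K_MINUS * x[h] for t, h in EDGES]

def velocity(x):
    v = [0.0] * len(x)
    for (t, h), j in zip(EDGES, edge_fluxes(x)):
        v[t] -= j   # flux leaves the tail node
        v[h] += j   # and enters the head node
    return v

def kl_divergence(x, x_ref):
    return sum(a * math.log(a / b) - a + b for a, b in zip(x, x_ref))

x_st = [1.0 / 3.0] * 3              # steady state of this symmetric ring
x, dt, kl_history = [0.7, 0.2, 0.1], 0.01, []
for _ in range(1000):
    kl_history.append(kl_divergence(x, x_st))
    x = [a + dt * v for a, v in zip(x, velocity(x))]
steady_fluxes = edge_fluxes(x_st)   # nonzero: the steady state carries a cycle flux
print(kl_history[0], kl_history[-1], steady_fluxes)
```

The steady state carries a nonzero flux around the ring, so it is not detailed-balanced, yet the KL divergence to it is non-increasing along the trajectory.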

9.3 Effective flux of the nonequilibrium flow by the primal information-geometric projection

In general, the nonequilibrium force or flux has redundant degrees of freedom in terms of generating a specific vector field or trajectory \(\{\varvec{x}_{t}\}\) on the density space. By using the extended HHK projective decomposition (Theorem 1), we can carve out the effective part of the flux for the trajectory \(\{\varvec{x}_{t}\}\). In addition, we can obtain an effective time-dependent equilibrium flux that mimics the trajectory \(\{\varvec{x}_{t}\}\):

Lemma 6

(Effective time-dependent equilibrium flux) Suppose that \(\varvec{j}(\varvec{x})\) is the flux of a generalized flow (Eq. 45), and define the corresponding effective equilibrium flux by \(\varvec{j}_{eq}(\varvec{x})=\partial \Psi ^{*}_{\varvec{x}}[-\mathbb {S}^{T}\varvec{u}_{eq}(\varvec{x})]\) where \(\varvec{u}_{eq}(\varvec{x}):=\partial \tilde{\Psi }_{\varvec{x}}[-\mathbb {S}\varvec{j}(\varvec{x})]\). Then, \(\varvec{j}_{eq}(\varvec{x})\) induces the same velocity as \(\varvec{j}(\varvec{x})\) does.Footnote 84 Furthermore, for a given trajectory \(\{\varvec{x}_{t}\}\) of \(\varvec{j}(\varvec{x})\), we can construct a time-dependent equilibrium flux \(\varvec{j}_{eq}(t,\varvec{x})\) that generates the same \(\{\varvec{x}_{t}\}\) as follows:

$$\begin{aligned} \varvec{j}_{eq}(t,\varvec{x})&:=\partial \Psi ^{*}_{\varvec{x}}\left[ \textrm{grad}_{\mathbb {S}}\partial \mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{t}]\right] ,&\tilde{\varvec{x}}_{t} :=\partial \Phi ^{*}[\partial \Phi (\varvec{x}_{t})+\varvec{u}_{eq}(\varvec{x}_{t})]. \end{aligned}$$
(151)

Proof

From Theorem 1, \(\varvec{j}(\varvec{x})\) can be decomposed as \(\varvec{j}(\varvec{x})=\varvec{j}_{eq}(\varvec{x})+(\varvec{j}(\varvec{x})-\varvec{j}_{eq}(\varvec{x}))\). Because \(\varvec{f}_{eq}(\varvec{x})=\partial \Psi _{\varvec{x}}[\varvec{j}_{eq}(\varvec{x})] \in \mathcal {P}^{fr}(\varvec{0})\), there exists \(\varvec{u}_{eq}(\varvec{x})\) satisfying \(-\mathbb {S}^{T}\varvec{u}_{eq}(\varvec{x})=\varvec{f}_{eq}(\varvec{x})\). By employing the duality introduced in Theorem 2, \(\varvec{u}_{eq}(\varvec{x})\) can be represented as \(\varvec{u}_{eq}(\varvec{x})=\partial \tilde{\Psi }_{\varvec{x}}[\varvec{v}(\varvec{x})]\) where \(\varvec{v}(\varvec{x})=-\mathbb {S}\varvec{j}(\varvec{x})\). Because \(-\mathbb {S}\varvec{j}_{eq}(\varvec{x})=\varvec{v}(\varvec{x})\), \(\varvec{j}_{eq}(\varvec{x})\) generates the same dynamics or vector field as \(\varvec{j}(\varvec{x})\) does. By solving \(-\varvec{u}_{eq}(\varvec{x}_{t})=\partial \Phi (\varvec{x}_{t})-\partial \Phi (\tilde{\varvec{x}}_{t})\), we have Eq. 151. \(\square \)

The effective time-dependent equilibrium flux is obtained more explicitly for CRN with LMA kinetics:

Corollary 2

(Effective equilibrium force and flux of LMA kinetics) Consider the following quantities of CRN with LMA kinetics:

$$\begin{aligned}&\text{Flux defined in Eq. 12:} \quad \varvec{j}_{MA}(\varvec{x};\varvec{k}^{\pm }) \end{aligned}$$
(152)
$$\begin{aligned}&\text{Force defined in Eq. 70:} \quad \varvec{f}_{\textrm{MA}}(\varvec{x};\varvec{K}) \end{aligned}$$
(153)
$$\begin{aligned}&\text{Thermodynamic function defined in Eq. 62:} \quad \Phi (\varvec{x}) \end{aligned}$$
(154)
$$\begin{aligned}&\text{Dissipation functions defined in Eq. 68 with Eq. 70:} \quad \Psi ^{*}_{\varvec{x},\varvec{\kappa }}[\varvec{f}] = \Psi ^{*}_{\varvec{\omega }_{\textrm{MA}}(\varvec{x};\varvec{\kappa })}[\varvec{f}], \end{aligned}$$
(155)

where \(\varvec{k}^{\pm }=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}\) holds. For a trajectory \(\{\varvec{x}_{t}\}\) generated by \(\varvec{j}_{MA}(\varvec{x};\varvec{k}^{\pm })\), the effective time-dependent equilibrium force \(\varvec{f}_{eq}(t,\varvec{x})\) can be described as \(\varvec{f}_{MA}(\varvec{x}; \varvec{K}_{eq}(t))\) where \(\varvec{K}_{eq}(t)\) is determined by

$$\begin{aligned} \varvec{K}_{eq}(t)&:=\exp \left[ -\mathbb {S}^{T}\left( \varvec{u}_{eq}(\varvec{x}_{t};\varvec{\kappa })+\partial \Phi (\varvec{x}_{t}) \right) \right] , \nonumber \\&\varvec{u}_{eq}(\varvec{x}_{t};\varvec{\kappa }) :=\partial \tilde{\Psi }_{\varvec{x}_{t},\varvec{\kappa }}[-\mathbb {S}\varvec{j}_{MA}(\varvec{x}_{t})] \end{aligned}$$
(156)

Thus, the effective time-dependent equilibrium flux \(\varvec{j}_{eq}(t,\varvec{x})\) is represented as \(\varvec{j}_{eq}(t,\varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{eq}(t))\) where \(\varvec{k}^{\pm }_{eq}(t)=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}_{eq}(t)\).

This corollary means that the effective time-dependent flux of LMA kinetics can always be obtained by a time-dependent modulation of the kinetic parameters \(\varvec{k}^{\pm }\). More specifically, modulating the force part \(\varvec{K}_{eq}(t)\) is sufficient, while the activity part \(\varvec{\kappa }\) is kept constant.Footnote 85
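
The parametric modulation in Cor. 2 can be sketched at a single state \(\varvec{x}\). The code below is an illustration under several assumptions that are not restated in this section: \(\partial \Phi (\varvec{x})=\ln \varvec{x}\), \(\varvec{f}_{MA}(\varvec{x};\varvec{K})=\ln \varvec{K}+\mathbb {S}^{T}\ln \varvec{x}\) with \(\mathbb {S}_{e}=\varvec{\gamma }_{e}^{+}-\varvec{\gamma }_{e}^{-}\), and \(\partial \Psi ^{*}_{\varvec{x}}[\varvec{f}]=\varvec{\omega }\circ \sinh (\varvec{f}/2)\) with \(\varvec{\omega }=2\sqrt{\varvec{j}^{+}\circ \varvec{j}^{-}}\). The CRN, the rate constants, and the bisection solver are hypothetical; for this network the equilibrium force depends on \(\varvec{u}\) only through the scalar \(d=u_{1}-u_{2}\), which is why a one-dimensional root search suffices.

```python
import math

# Hypothetical CRN and rate constants, chosen only for illustration.
REACTIONS = [((2, 0), (1, 1)), ((1, 1), (0, 2)), ((0, 2), (2, 0))]  # (gamma_plus, gamma_minus)
K_PLUS, K_MINUS = [2.0, 2.0, 2.0], [1.0, 1.0, 1.0]

def monomial(x, gamma):
    return math.prod(xi ** g for xi, g in zip(x, gamma))

def flux_ma(x, k_plus, k_minus):
    return [kp * monomial(x, gp) - km * monomial(x, gm)
            for (gp, gm), kp, km in zip(REACTIONS, k_plus, k_minus)]

def velocity(j):
    """v = -S j with S_e = gamma_plus - gamma_minus (assumed sign convention)."""
    return [sum((gm[i] - gp[i]) * je for (gp, gm), je in zip(REACTIONS, j)) for i in range(2)]

def equilibrium_force(u):
    """f = -S^T u."""
    return [-sum((gp[i] - gm[i]) * u[i] for i in range(2)) for gp, gm in REACTIONS]

def effective_equilibrium(x):
    kappa = [math.sqrt(kp * km) for kp, km in zip(K_PLUS, K_MINUS)]
    omega = [2.0 * ka * math.sqrt(monomial(x, gp) * monomial(x, gm))
             for (gp, gm), ka in zip(REACTIONS, kappa)]
    target = velocity(flux_ma(x, K_PLUS, K_MINUS))[0]

    def v1(d):
        f = equilibrium_force((d / 2.0, -d / 2.0))
        return velocity([w * math.sinh(fe / 2.0) for w, fe in zip(omega, f)])[0]

    lo, hi = -50.0, 50.0            # v1 is increasing in d for this network
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if v1(mid) < target:
            lo = mid
        else:
            hi = mid
    d = 0.5 * (lo + hi)
    u_eq = (d / 2.0, -d / 2.0)
    f_eq = equilibrium_force(u_eq)
    log_x = [math.log(xi) for xi in x]
    # Eq. 156: K_eq = exp[-S^T (u_eq + ln x)] = exp[f_eq - S^T ln x]
    k_eq = [math.exp(fe - sum((gp[i] - gm[i]) * log_x[i] for i in range(2)))
            for fe, (gp, gm) in zip(f_eq, REACTIONS)]
    k_plus = [ka * math.sqrt(k) for ka, k in zip(kappa, k_eq)]
    k_minus = [ka / math.sqrt(k) for ka, k in zip(kappa, k_eq)]
    x_tilde = [xi * math.exp(ui) for xi, ui in zip(x, u_eq)]   # Eq. 151 with dPhi = ln
    return k_plus, k_minus, x_tilde

x = (0.5, 1.5)
k_plus_eq, k_minus_eq, x_tilde = effective_equilibrium(x)
v_original = velocity(flux_ma(x, K_PLUS, K_MINUS))
v_effective = velocity(flux_ma(x, k_plus_eq, k_minus_eq))
residual_flux = flux_ma(x_tilde, k_plus_eq, k_minus_eq)
print(v_original, v_effective, x_tilde)
```

Under these assumptions the modulated rate constants reproduce the original velocity at \(\varvec{x}\), while every reaction flux vanishes at \(\tilde{\varvec{x}}\), so the modulated system is detailed-balanced there. Only \(\varvec{K}\) changes; \(\varvec{\kappa }\) is reused as is.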

Fig. 8

A nonequilibrium trajectory of the Brusselator CRN (Ex. 1) and the associated time-dependent effective equilibrium flux \(\varvec{j}_{eq}(t,\varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{eq}(t))\). (a) A nonequilibrium trajectory \(\{\varvec{x}_{t}\}\) of the Brusselator CRN for the kinetic parameters \(k_{1}^{+}=1.0\), \(k_{1}^{-}=1.0\), \(k_{2}^{+}=3.0\), \(k_{2}^{-}=0.1\), \(k_{3}^{+}=1.0\), \(k_{3}^{-}=0.1\), and the initial state \((\varvec{x}_{1}(0),\varvec{x}_{2}(0))=(1.0, 4.0)\). (b,c) The time-dependent kinetic parameter set \(\varvec{k}^{\pm }_{eq}(t)\), which generates the time-dependent equilibrium flux \(\varvec{j}_{eq}(t,\varvec{x})\). (d) The top left panel shows the nonequilibrium trajectory \(\{\varvec{x}_{t}\}\) (black curve) and the vector field \(\varvec{v}(\varvec{x})= - \mathbb {S}\varvec{j}(\varvec{x})\) (arrows). The other panels show the time-dependent vector field \(\varvec{v}_{eq}(t, \varvec{x})\) induced by \(\varvec{j}_{eq}(t,\varvec{x})\): \(\varvec{v}_{eq}(t, \varvec{x})= - \mathbb {S}\varvec{j}_{eq}(t,\varvec{x})\). The nonequilibrium trajectory \(\{\varvec{x}_{t}\}\) is also depicted as a reference (the black curve). The color indicates the value of \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{t}]\). In each panel, the white circle on the trajectory is \(\varvec{x}_{t}\) at which \(\varvec{j}_{eq}(t,\varvec{x})\) is computed, and the black circle with a white border corresponds to \(\tilde{\varvec{x}}_{t}\) at which \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{t}]\) is 0.

Example 3

(Simplified Brusselator CRN [8, 104] (continued)) By using the Brusselator CRN (Ex. 1), we numerically obtained a nonequilibrium trajectory \(\{\varvec{x}_{t}\}\) (Fig. 8a, d top left panel). By using Cor. 2, we also computed the corresponding time-dependent kinetic parameter set \(\varvec{k}^{\pm }_{eq}(t)\) that generates the time-dependent equilibrium flux \(\varvec{j}_{eq}(t,\varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{eq}(t))\) (Fig. 8b, c). Figure 8d shows the vector field \(\varvec{v}_{eq}(t, \varvec{x})=-\mathbb {S}\varvec{j}_{eq}(t,\varvec{x})\) induced by the time-dependent equilibrium flux \(\varvec{j}_{eq}(t,\varvec{x})\) and the contours of \(\mathcal {D}^{\mathcal {X}}_{\Phi }[\varvec{x}\Vert \tilde{\varvec{x}}_{t}]\) where \(\tilde{\varvec{x}}_{t}\) follows from Eq. 151. From Fig. 8, we can see that the trajectory \(\{\varvec{x}_{t}\}\) originally generated by the nonequilibrium flux \(\varvec{j}(\varvec{x})\) can be traced by the time-dependent equilibrium flux \(\varvec{j}_{eq}(t,\varvec{x})\) and also that \(\varvec{j}_{eq}(t,\varvec{x})\) can be physically realized by the modulation of the kinetic parameters \(\varvec{k}^{\pm }_{eq}(t)\).

9.4 Characterization of the nonequilibrium flow by the dual information geometric projection

The nonequilibrium flow is redundant in terms of generating a specific vector field or trajectory \(\{\varvec{x}_{t}\}\). Such redundancy is crucial for characterizing the extent of nonequilibrium. One approach to this characterization is to investigate the cycle force or flux, which has been employed in the linear theory of dynamics on graphs and also in graph-theoretic approaches to nonequilibrium phenomena [89, 155,156,157,158]. To extract such cyclic components, we can use \(\mathbb {V}^{T}=\textrm{curl}_{\mathbb {V}}\), its adjoint \(\mathbb {V}=\textrm{curl}^{*}_{\mathbb {V}}\), the associated cycle subspaces \(C^{2}(\mathbb {H})\) and \(C_{2}(\mathbb {H})\), and also the generalized HHK decomposition (Theorem 1).

Definition 34

(Cycle spaces) The cycle spaces at \(\varvec{x}\in \mathcal {X}\) are defined as \(\mathcal {Z}_{\varvec{x}} = C_{2}(\mathbb {H})=\textrm{Ker}[\mathbb {V}]\) and \(\mathfrak {Z}_{\varvec{x}} = C^{2}(\mathbb {H})=\textrm{Im}[\mathbb {V}^{T}]\).

For a given nonequilibrium force \(\varvec{f}(\varvec{x})\), we can obtain its cycle component \(\varvec{\zeta } = \textrm{curl}_{\mathbb {V}}\varvec{f}(\varvec{x})=\mathbb {V}^{T}\varvec{f}(\varvec{x}) \in \mathfrak {Z}_{\varvec{x}}\). The cycle component \(\varvec{\zeta }\) contains the information needed to categorize the force because \(\mathcal {P}^{fr}(\varvec{f})=\mathcal {P}^{fr}(\varvec{\zeta })\) is the coset of \(\varvec{f}\) in the quotient space of forces by the equilibrium forces. For each \(\varvec{\zeta } \in \mathfrak {Z}_{\varvec{x}}\), we obtain the representative force \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\) via the following variational problem:

Lemma 7

(Steady (zero-velocity) force as the minimizer of the dual dissipation function) For a given \(\varvec{\zeta }\in \mathfrak {Z}_{\varvec{x}}\), we define the force \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\) minimizing the dual dissipation function:

$$\begin{aligned} \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }):=\arg \min _{\varvec{f}}\Psi _{\varvec{x}}^{*}[\varvec{f}],\quad \text{ s.t. } \textrm{curl}_{\mathbb {V}} \varvec{f}=\varvec{\zeta }. \end{aligned}$$
(157)

Then, \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\) is the steady (zero-velocity) force, i.e., \(\varvec{j}^{\lozenge }=\partial \Psi _{\varvec{x}}^{*}[\varvec{f}^{\lozenge }] \in \mathcal {P}^{vl}(\varvec{0})\).

Proof

From Eq. 115 in Theorem 1, \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {P}^{fr}(\varvec{\zeta })\cap \mathcal {M}^{vl}(\varvec{0})\). Thus, \(\varvec{j}^{\lozenge }\in \mathcal {P}^{vl}(\varvec{0})\). \(\square \)

Among the various forces \(\varvec{f}\) that have the same cyclic component \(\varvec{\zeta }\), the force \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\) is the one that induces no dynamics of \(\varvec{x}\). Because any dynamics of \(\varvec{x}\) can be represented by the effective equilibrium flux as in Lemma 6, \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\) can be regarded as the force that is purely relevant to the cycle.
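
The variational problem in Lemma 7 can be solved explicitly on a single cycle. The sketch below assumes a three-edge ring with \(\mathbb {V}=(1,1,1)^{T}\) and a dual dissipation function of the form \(\Psi ^{*}_{\varvec{x}}[\varvec{f}]=\sum _{e}2\omega _{e}[\cosh (f_{e}/2)-1]\); the activities \(\omega _{e}\), the test force, and the bisection solver are hypothetical. Stationarity of the constrained problem requires a uniform flux \(j_{e}=\lambda \) on all edges, which leaves one scalar equation for \(\lambda \).

```python
import math

OMEGA = [1.0, 2.0, 0.5]   # hypothetical edge activities omega_e(x) at a fixed state x

def dual_dissipation(f):
    """Psi*_x[f] = sum_e 2 omega_e (cosh(f_e / 2) - 1) (assumed form)."""
    return sum(2.0 * w * (math.cosh(fe / 2.0) - 1.0) for w, fe in zip(OMEGA, f))

def flux_from_force(f):
    """j = dPsi*_x[f], i.e. j_e = omega_e sinh(f_e / 2)."""
    return [w * math.sinh(fe / 2.0) for w, fe in zip(OMEGA, f)]

def node_velocity(j):
    """-S j on a 3-node ring whose edge e runs from node e to node (e + 1) % 3."""
    return [j[(i - 1) % 3] - j[i] for i in range(3)]

def steady_force(zeta):
    """Minimize Psi*_x[f] subject to f_0 + f_1 + f_2 = zeta."""
    # Stationarity gives j_e = lam on every edge, hence f_e = 2 asinh(lam / omega_e).
    def affinity(lam):
        return sum(2.0 * math.asinh(lam / w) for w in OMEGA)
    lo, hi = -1e6, 1e6            # affinity is increasing in lam
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if affinity(mid) < zeta:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [2.0 * math.asinh(lam / w) for w in OMEGA], lam

f_given = [1.5, -0.2, 0.9]         # some nonequilibrium force (hypothetical)
zeta = sum(f_given)                # cycle component V^T f
f_steady, z = steady_force(zeta)
j_steady = flux_from_force(f_steady)
print(f_steady, j_steady, node_velocity(j_steady))
```

The resulting force has the same cycle component as the given one, induces a uniform cycle flux and therefore zero velocity at every node, and has a smaller dual dissipation than the given force.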

Fig. 9

The induced dually flat structure on the cycle spaces (right gray region) in comparison with the induced structure on the restricted tangent and cotangent spaces (left gray region)

Using \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\), we can establish the induced duality between \(\mathcal {Z}_{\varvec{x}}\) and \(\mathfrak {Z}_{\varvec{x}}\) spaces:

Theorem 3

(Induced dually flat structure on cycle spaces) On the cycle spaces, \(\mathcal {Z}_{\varvec{x}}\) and \(\mathfrak {Z}_{\varvec{x}}\), we have the Legendre conjugate dissipation functions \(\hat{\Psi }_{\varvec{x}}: \mathcal {Z}_{\varvec{x}} \rightarrow \mathbb {R}\) and \(\hat{\Psi }_{\varvec{x}}^{*}: \mathfrak {Z}_{\varvec{x}} \rightarrow \mathbb {R}\) induced by the dissipation functions on the edge spaces (Fig. 9).

Proof

For each \(\varvec{\zeta }\in \mathfrak {Z}_{\varvec{x}}\), we can uniquely determine \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {F}_{\varvec{x}}\) and \(\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })\in \mathcal {J}_{\varvec{x}}\) as

$$\begin{aligned} \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })&:=\mathcal {P}^{fr}(\varvec{\zeta })\cap \mathcal {M}^{vl}_{\varvec{x}}(\varvec{0})=\arg \min _{\varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta })} \Psi _{\varvec{x}}^{*}[\varvec{f}], \end{aligned}$$
(158)
$$\begin{aligned} \varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })&:=\partial \Psi ^{*}_{\varvec{x}}[\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })] \in \mathcal {P}^{vl}(\varvec{0}). \end{aligned}$$
(159)

In addition, \(\varvec{z}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {Z}_{\varvec{x}}\) satisfying \(\mathbb {V}\varvec{z}^{\lozenge }(\varvec{x},\varvec{\zeta }) = \varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })\) is also uniquely determined because \(\mathbb {V}: \mathcal {Z}_{\varvec{x}} \rightarrow \mathcal {J}_{\varvec{x}}\), \(\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {P}^{vl}(\varvec{0})=\textrm{Im}[\mathbb {V}]\) and \(\textrm{Ker}\mathbb {V}=\{\varvec{0}\}\). For these quantities,

$$\begin{aligned} \varvec{\zeta }&= \mathbb {V}^{T} \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }),&\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })&= \partial \Psi _{\varvec{x}}[\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })],&\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })&= \mathbb {V}\varvec{z}^{\lozenge }(\varvec{x},\varvec{\zeta }), \end{aligned}$$
(160)

hold. Conversely, for a given \(\varvec{z}\in \mathcal {Z}_{\varvec{x}}\), we have \(\varvec{j}^{\blacklozenge }(\varvec{z})\), \(\varvec{f}^{\blacklozenge }(\varvec{x}, \varvec{z})\), and \(\varvec{\zeta }^{\blacklozenge }(\varvec{x},\varvec{z})\) as follows:

$$\begin{aligned} \varvec{\zeta }^{\blacklozenge }(\varvec{x},\varvec{z})&= \mathbb {V}^{T} \varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z}),&\varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z})&= \partial \Psi _{\varvec{x}}[\varvec{j}^{\blacklozenge }(\varvec{z})],&\varvec{j}^{\blacklozenge }(\varvec{z})&= \mathbb {V}\varvec{z}. \end{aligned}$$
(161)

For a pair \((\varvec{z},\varvec{\zeta })_{\varvec{x}}\) that satisfies \(\varvec{z}=\varvec{z}^{\lozenge }(\varvec{x},\varvec{\zeta })\), the relations \(\varvec{\zeta }=\varvec{\zeta }^{\blacklozenge }(\varvec{x},\varvec{z})\), \(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })=\varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z})\), and \(\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta })=\varvec{j}^{\blacklozenge }(\varvec{z})\) hold. This pairing establishes a bijection between \(\mathcal {Z}_{\varvec{x}}\) and \(\mathfrak {Z}_{\varvec{x}}\). This bijection is realized by the Legendre transformations of the following induced dissipation functions on \(\mathcal {Z}_{\varvec{x}}\) and \(\mathfrak {Z}_{\varvec{x}}\):

$$\begin{aligned} \hat{\Psi }_{\varvec{x}}(\varvec{z})&:=\Psi _{\varvec{x}}(\varvec{j}^{\blacklozenge }(\varvec{z})),&\hat{\Psi }_{\varvec{x}}^{*}(\varvec{\zeta })&:=\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })). \end{aligned}$$
(162)

These functions are Legendre conjugate as follows:

$$\begin{aligned} \max _{\varvec{z}'\in \mathcal {Z}_{\varvec{x}}}&\left[ \langle \varvec{z}', \varvec{\zeta }\rangle -\hat{\Psi }_{\varvec{x}}(\varvec{z}') \right] =\max _{\begin{array}{c} \varvec{z}'\in \mathcal {Z}_{\varvec{x}}\\ \varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta }) \end{array}}\left[ \langle \varvec{z}', \mathbb {V}^{T}\varvec{f}\rangle -\hat{\Psi }_{\varvec{x}}(\varvec{z}') \right] \\&=\max _{\begin{array}{c} \varvec{z}'\in \mathcal {Z}_{\varvec{x}}\\ \varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta }) \end{array}}\left[ \langle \mathbb {V}\varvec{z}', \varvec{f}\rangle -\Psi _{\varvec{x}}(\mathbb {V}\varvec{z}') \right] \\&=\max _{\begin{array}{c} \varvec{j}'\in \mathcal {P}^{vl}(\varvec{0})\\ \varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta }) \end{array}}\left[ \langle \varvec{j}', \varvec{f}\rangle -\Psi _{\varvec{x}}(\varvec{j}') \right] \\&=\max _{\begin{array}{c} \varvec{j}'\in \mathcal {P}^{vl}(\varvec{0})\\ \varvec{f}\in \mathcal {P}^{fr}(\varvec{\zeta }) \end{array}}\left[ \langle \varvec{j}', \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\rangle -\Psi _{\varvec{x}}(\varvec{j}') + \langle \varvec{j}', (\varvec{f}-\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }))\rangle \right] \\&=\max _{\begin{array}{c} \varvec{j}'\in \mathcal {P}^{vl}(\varvec{0}) \end{array}}\left[ \langle \varvec{j}', \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })\rangle -\Psi _{\varvec{x}}(\varvec{j}') \right] =\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }))\nonumber \\ {}&=\hat{\Psi }_{\varvec{x}}^{*}(\varvec{\zeta }), \end{aligned}$$

where we used \(\langle \varvec{j}', (\varvec{f}-\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }))\rangle =0\) because \(\varvec{j}' \in \mathcal {P}^{vl}(\varvec{0})=\textrm{Im}\mathbb {V}\) and \(\varvec{f}-\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }) \in \mathcal {P}^{fr}(\varvec{0})=\textrm{Ker}\mathbb {V}^{T}\). The inverse is also shown:

$$\begin{aligned} \max _{\varvec{\zeta }'\in \mathfrak {Z}_{\varvec{x}}}&\left[ \langle \varvec{z}, \varvec{\zeta }'\rangle -\hat{\Psi }_{\varvec{x}}^{*}(\varvec{\zeta }') \right] =\max _{\begin{array}{c} \varvec{\zeta }'\in \mathfrak {Z}_{\varvec{x}} \end{array}}\left[ \langle \varvec{z}, \mathbb {V}^{T}\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }')\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta }')) \right] \\&=\max _{\begin{array}{c} \varvec{f}^{\lozenge }\in \mathcal {M}^{vl}_{\varvec{x}}(\varvec{0}) \end{array}}\left[ \langle \mathbb {V}\varvec{z}, \varvec{f}^{\lozenge }\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }) \right] =\max _{ \varvec{f}^{\lozenge }\in \mathcal {M}^{vl}_{\varvec{x}}(\varvec{0})} \left[ \langle \varvec{j}^{\blacklozenge }(\varvec{z}), \varvec{f}^{\lozenge }\rangle -\Psi _{\varvec{x}}^{*}(\varvec{f}^{\lozenge }) \right] \\&=\Psi _{\varvec{x}}(\varvec{j}^{\blacklozenge }(\varvec{z})) =\hat{\Psi }_{\varvec{x}}(\varvec{z}) \end{aligned}$$

The pair \((\varvec{z}\), \(\varvec{\zeta })_{\varvec{x}}\) is Legendre dual with respect to these functions:

$$\begin{aligned} \partial _{\varvec{\zeta }} \hat{\Psi }_{\varvec{x}}^{*}(\varvec{\zeta })&= \left[ \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }}\right] ^{T}\left. \frac{\partial \Psi _{\varvec{x}}^{*}(\varvec{f})}{\partial \varvec{f}}\right| _{\varvec{f} =\varvec{f}^{\lozenge }(\varvec{x}, \varvec{\zeta })}\nonumber \\ {}&=\left[ \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }} \right] ^{T}\varvec{j}^{\lozenge }(\varvec{x},\varvec{\zeta }) \end{aligned}$$
(163)
$$\begin{aligned}&=\left[ \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }}\right] ^{T}\mathbb {V}\varvec{z}^{\lozenge } (\varvec{x},\varvec{\zeta })=\varvec{z}, \end{aligned}$$
(164)
$$\begin{aligned} \partial _{\varvec{z}} \hat{\Psi }_{\varvec{x}}(\varvec{z})&= \left[ \frac{\partial \varvec{j}^{\blacklozenge }(\varvec{z})}{\partial \varvec{z}}\right] ^{T}\left. \frac{\partial \Psi _{\varvec{x}}(\varvec{j})}{\partial \varvec{j}}\right| _{\varvec{j} =\varvec{j}^{\blacklozenge }(\varvec{z})} =\left[ \frac{\partial \varvec{j}^{\blacklozenge }(\varvec{z})}{\partial \varvec{z}}\right] ^{T}\varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z})\nonumber \\ {}&=\mathbb {V}^{T}\varvec{f}^{\blacklozenge }(\varvec{x},\varvec{z})=\varvec{\zeta }, \end{aligned}$$
(165)

where we used \(\left[ \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }}\right] ^{T}\mathbb {V}=I\) from \(\frac{\partial }{\partial \varvec{\zeta }}[\varvec{\zeta }-\mathbb {V}^{T} \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })]=I- \mathbb {V}^{T} \frac{\partial \varvec{f}^{\lozenge }(\varvec{x},\varvec{\zeta })}{\partial \varvec{\zeta }}=0\) and \(\frac{\partial \varvec{j}^{\blacklozenge }(\varvec{z})}{\partial \varvec{z}} =\mathbb {V}\). The induced functions are dissipation functions: strict convexity and 1-coercivity follow from those of the original dissipation functions. Also, we have

$$\begin{aligned}&\text{ Symmetry: } \hat{\Psi }_{\varvec{x}}(-\varvec{z}) = \Psi _{\varvec{x}}(\varvec{j}^{\blacklozenge }(-\varvec{z})) =\Psi _{\varvec{x}}(-\varvec{j}^{\blacklozenge }(\varvec{z})) =\hat{\Psi }_{\varvec{x}}(\varvec{z}) \end{aligned}$$
(166)
$$\begin{aligned}&\text{ Bounded } \text{ by } 0\hbox { at }\varvec{0}: \hat{\Psi }_{\varvec{x}}(\varvec{z}=\varvec{0})=\Psi _{\varvec{x}} (\varvec{j}^{\blacklozenge }(\varvec{0}))=\Psi _{\varvec{x}}(\varvec{0})=0. \end{aligned}$$
(167)

\(\square \)
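
The relations used in the proof above can be checked numerically in a simple special case. The following is a minimal sketch, assuming a quadratic dissipation function \(\Psi (\varvec{j})=\frac{1}{2}\sum _{e} r_{e} j_{e}^{2}\) at a fixed \(\varvec{x}\) and a single cycle over three edges; the weights `r`, the cycle vector `V`, and all function names are hypothetical choices made for illustration and do not appear in the text. In this case \(\varvec{j}^{\blacklozenge }(z)=\mathbb {V}z\) and \(\varvec{f}^{\lozenge }(\zeta )=R\mathbb {V}\zeta /(\mathbb {V}^{T}R\mathbb {V})\) with \(R=\textrm{diag}(r_{e})\).

```python
# Quadratic special case (an assumption, not the general setting of the text):
# Psi(j) = 1/2 * sum_e r_e j_e^2 and Psi*(f) = 1/2 * sum_e f_e^2 / r_e.

def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))

V = [1.0, 1.0, 1.0]   # hypothetical single cycle over three edges
r = [2.0, 0.5, 1.5]   # hypothetical positive edge weights (diagonal of R)

def psi(j):
    return 0.5 * sum(re * je * je for re, je in zip(r, j))

def psi_star(f):
    return 0.5 * sum(fe * fe / re for re, fe in zip(r, f))

vrv = sum(re * ve * ve for re, ve in zip(r, V))   # scalar V^T R V

def j_black(z):
    """Flux on the cycle space: j = V z."""
    return [ve * z for ve in V]

def f_loz(zeta):
    """Force with V^T f = zeta whose conjugate flux f_e / r_e lies in Im V."""
    return [re * ve * zeta / vrv for re, ve in zip(r, V)]

def psi_hat(z):
    return psi(j_black(z))

def psi_hat_star(zeta):
    return psi_star(f_loz(zeta))

def central_difference(func, point, step=1e-6):
    """Second-order finite-difference estimate of a scalar derivative."""
    return (func(point + step) - func(point - step)) / (2.0 * step)

z = 0.7
zeta = vrv * z        # conjugate variable: d psi_hat / dz at z

print("V^T f_loz(zeta) - zeta    :", dot(V, f_loz(zeta)) - zeta)
print("d psi_hat_star/d zeta - z :", central_difference(psi_hat_star, zeta) - z)
print("Fenchel-Young gap         :", psi_hat(z) + psi_hat_star(zeta) - z * zeta)
print("symmetry gap              :", psi_hat(-z) - psi_hat(z))
```

All four printed quantities should vanish up to rounding, which mirrors Eqs. 164 to 167 for this special case only; the general statement relies on the argument given in the proof.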

For a given force \(\varvec{f}(\varvec{x})\) of the generalized flow (Eq. 45), \(\varvec{f}_{st}(\varvec{x})=\varvec{f}^{\lozenge }(\varvec{x}, \mathbb {V}^{T}\varvec{f}(\varvec{x}))\) works as the effective cycle force for each \(\varvec{x} \in \mathcal {X}\). Similarly to Cor. 2, we obtain the effective cycle force and the flux for LMA kinetics by parametric modulation:

Corollary 3

(Effective cycle force and flux for LMA kinetics) Consider a CRN with LMA kinetics as in Cor. 2. For each \(\varvec{x}\in \mathcal {X}\), the effective cycle force \(\varvec{f}_{st}(\varvec{x})\) associated with \(\varvec{j}_{MA}(\varvec{x};\varvec{k}^{\pm })\) can be written as \(\varvec{f}_{st}(\varvec{x})=\varvec{f}_{MA}(\varvec{x}; \varvec{K}_{st}(\varvec{x}))\), where \(\varvec{K}_{st}(\varvec{x})\) is determined by

$$\begin{aligned} \varvec{K}_{st}(\varvec{x})&:=\exp \left[ \varvec{f}^{\lozenge }(\varvec{x};\mathbb {V}^{T} \varvec{f}_{MA}(\varvec{x};\varvec{K}))-\mathbb {S}^{T}\partial \Phi (\varvec{x}) \right] . \end{aligned}$$
(168)

Thus, the effective cycle flux \(\varvec{j}_{st}(\varvec{x})\) is represented as \(\varvec{j}_{st}(\varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{st}(\varvec{x}))\) where \(\varvec{k}^{\pm }_{st}(\varvec{x})=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}_{st}(\varvec{x})\). For a given trajectory \(\{\varvec{x}_{t}\}\), which is generated by a generalized flow, we have the effective time-dependent cycle flux \(\varvec{j}_{st}(t, \varvec{x})\) as \(\varvec{j}_{st}(t, \varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{st}(\varvec{x}_{t}))\). From the construction, this time-dependent cycle flux makes \(\varvec{x}_{t}\) a steady state for each t, i.e., \(\mathbb {S}\varvec{j}_{st}(t, \varvec{x}_{t})=\varvec{0}\) holds for any t.
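
The parametrization \(\varvec{k}^{\pm }=\varvec{\kappa }\circ \varvec{K}^{\pm 1/2}\) used above can be inverted elementwise as \(\varvec{\kappa }=\sqrt{\varvec{k}^{+}\circ \varvec{k}^{-}}\) and \(\varvec{K}=\varvec{k}^{+}/\varvec{k}^{-}\), which follows algebraically from the stated relation. The sketch below illustrates how modulating only the force part \(\varvec{K}\) changes \(\varvec{k}^{\pm }\) while leaving \(\varvec{\kappa }\) fixed. The function names and all numerical values are hypothetical, and `K_new` is an arbitrary illustrative vector rather than a value computed from Eq. 168.

```python
import math

def split_rates(k_plus, k_minus):
    """Return (kappa, K) with kappa = sqrt(k+ * k-) and K = k+ / k- per reaction."""
    kappa = [math.sqrt(p * m) for p, m in zip(k_plus, k_minus)]
    K = [p / m for p, m in zip(k_plus, k_minus)]
    return kappa, K

def join_rates(kappa, K):
    """Return (k+, k-) with k+ = kappa * K^(1/2) and k- = kappa * K^(-1/2)."""
    k_plus = [a * math.sqrt(b) for a, b in zip(kappa, K)]
    k_minus = [a / math.sqrt(b) for a, b in zip(kappa, K)]
    return k_plus, k_minus

# Hypothetical forward and reverse rate constants for three reactions.
k_plus = [2.0, 0.5, 3.0]
k_minus = [0.5, 0.5, 12.0]
kappa, K = split_rates(k_plus, k_minus)

# Replace only the force part K; the activity part kappa is kept constant.
K_new = [1.0, 4.0, 0.25]
k_plus_new, k_minus_new = join_rates(kappa, K_new)

# Splitting the modulated rates again recovers the same kappa and the new K.
kappa_check, K_check = split_rates(k_plus_new, k_minus_new)
print(kappa, kappa_check)
print(K_new, K_check)
```

In this sketch the modulated rate constants differ from the original ones, yet `kappa_check` agrees with `kappa`, so the modulation acts on the force part alone.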

Fig. 10

A nonequilibrium trajectory of the Brusselator CRN (Ex. 1) and the associated time-dependent cycle flux \(\varvec{j}_{st}(t,\varvec{x})=\varvec{j}_{MA}(\varvec{x}; \varvec{k}^{\pm }_{st}(t))\). a A nonequilibrium trajectory \(\{\varvec{x}_{t}\}\) of the Brusselator CRN obtained with the same parameter values as in Fig. 8a. b, c The time-dependent kinetic parameter set \(\varvec{k}^{\pm }_{st}(t)\), which generates the time-dependent cycle flux \(\varvec{j}_{st}(t,\varvec{x})\). d The top left panel shows the nonequilibrium trajectory \(\{\varvec{x}_{t}\}\) (black curve) and the vector field \(\varvec{v}(\varvec{x})= - \mathbb {S}\varvec{j}(\varvec{x})\) (arrows). The other panels show the time-dependent vector field \(\varvec{v}_{st}(t, \varvec{x})\) induced by \(\varvec{j}_{st}(t,\varvec{x})\): \(\varvec{v}_{st}(t, \varvec{x})= - \mathbb {S}\varvec{j}_{st}(t,\varvec{x})\). The nonequilibrium trajectory \(\{\varvec{x}_{t}\}\) is also shown for reference (black curve). In each panel, the white circle on the trajectory marks the point \(\varvec{x}_{t}\) at which \(\varvec{j}_{st}(t,\varvec{x})\) is computed

Example 4

(Simplified Brusselator CRN [8, 104] (continued)) For the nonequilibrium trajectory of the Brusselator CRN in Fig. 10a, we numerically obtained the effective cycle flux \(\varvec{j}_{st}(t,\varvec{x})\) and the corresponding time-dependent kinetic parameter set \(\varvec{k}_{st}^{\pm }(t)\) (Fig. 10b, c). Figure 10d shows the vector field \(\varvec{v}_{st}(t,\varvec{x})=-\mathbb {S}\varvec{j}_{st}(t,\varvec{x})\) induced by the effective cycle flux \(\varvec{j}_{st}(t,\varvec{x})\). From Fig. 10, we can see that any point on the trajectory \(\{\varvec{x}_{t}\}\) originally generated by the nonequilibrium flux \(\varvec{j}(\varvec{x}_{t})\) can be kept steady with the time-dependent cycle flux \(\varvec{j}_{st}(t,\varvec{x})\) realized by the modulation of the kinetic parameter \(\varvec{k}^{\pm }_{st}(t)\).

In modern nonequilibrium thermodynamics, it has been a great challenge to establish thermodynamic characterizations of nonequilibrium phenomena. To this end, the dissection of dynamics and of the corresponding flux and force has been attempted [104, 159,160,161,162]. For a given trajectory \(\{\varvec{x}_{t}\}\), the effective time-dependent equilibrium flux \(\varvec{j}_{eq}(t, \varvec{x})\) generates exactly the same trajectory, and thus dissects and mimics the dynamic aspect of the trajectory. On the other hand, the effective time-dependent cycle flux \(\varvec{j}_{st}(t, \varvec{x})\) makes each point on the trajectory steady, which can be recognized as the nonequilibrium aspect of the trajectory. Moreover, these two types of fluxes can be realized by appropriately modulating the kinetic parameter set \(\varvec{k}^{\pm }\), which makes the dissected fluxes physically meaningful and accessible. More specifically, modulating the force part \(\varvec{K}\) of the kinetic parameter is sufficient for this realization (Eq. 156 and Eq. 168), while the activity part \(\varvec{\kappa }\) is kept constant. In the case of CRN, the former is linked to the free energy difference between the reactants and products of each reaction, and the latter is associated with the height of the energy barrier between them. This clear separation of different physical parameters in our framework is advantageous for further investigating the physical aspects of dynamics on graphs and hypergraphs. Thus, the dually flat structure on the edge space and the HHK decomposition provide a new and promising way to characterize the nonequilibrium flow.Footnote 86

10 Summary and discussion

In this work, we have shown that the doubly dual flat structure of the vertex and edge spaces on graphs and hypergraphs provides the information-geometric basis for the dynamics on graphs and hypergraphs. Two notions of orthogonality, pseudo-Hilbert isosceles orthogonality and information-geometric orthogonality, have been introduced and shown to dissect the equilibrium and nonequilibrium aspects of the dynamics into the induced structures on the tangent and cotangent spaces and the cycle spaces. The doubly dual flat structure naturally connects the topological information of the underlying discrete manifolds, i.e., graphs and hypergraphs, with the dynamics on them and thus gives the information-geometric modeling of dynamics more flexibility and representational power. Furthermore, the generalized equilibrium and nonequilibrium flows, as well as the generalized flow, accommodate a sufficiently wide range of models, which include the reversible Markov jump processes on finite graphs and CRN with LMA kinetics (a class of PDS). These results could substantially extend the applicability of information geometry to dynamical problems.

10.1 Extension of other relations involving information measures

While we demonstrated that the generalized flow and the doubly dual flat structure can extend several results known for FPE and diffusion processes, there remain potentially relevant results and problems that could be explained and extended in our framework. For example, for FPE and diffusion processes, the Fisher information number \(\mathbb {I}_{F}\) was extended to the relative Fisher information (also known as Hyvärinen divergence [120, 144]). The relative Fisher information of two trajectories \(p^{(1)}_{t}(\varvec{r})\) and \(p^{(2)}_{t}(\varvec{r})\) is known to satisfy information–theoretic relations such as the de Bruijn identity [54] and its extensions [56, 62]. In addition, the logarithmic Sobolev inequality also constitutes a relationship between the Fisher information number and the KL divergence (or Shannon information) [63, 64]. It would be an important future problem to associate these results with the doubly dual flat structure.

Moreover, several relations potentially related to De Giorgi’s formulation (Eq. 52) are known for mutual information in filtering and control theories. For example, Guo, Shamai, and Verdú found a relation between mutual information and the minimum mean square error (MMSE) in Gaussian channels [163]. Similar relations have also been reported by Mayer-Wolf and Zakai [164, 165]. Our framework may offer a unified perspective behind these different types of relations involving information measures.

10.2 Extensions of the doubly dual flat structure

There is also room for extensions of the doubly dual flat structure. While we consider only strictly convex thermodynamic functions and dissipation functions, the strict convexity is not necessarily required, at least for defining the generalized flow and the equilibrium and nonequilibrium flowsFootnote 87. Actually, in terms of thermodynamics, the thermodynamic function can be non-strictly convex when a phase transition of the system occurs [117]. The loss of bijectivity via the loss of the strict convexity can happen in complicated and degenerate statistical models [166]. Techniques from algebraic geometry could be employed to address such situations [167].

Moreover, the structure introduced for rLDG and CRN may be extended to irreversible cases, where some edges have only forward or only reverse jumps or reactions. For this purpose, we may take advantage of several results about the CB states obtained in CRN theory [8], where reversibility is not necessarily assumed, and of those in stochastic thermodynamics for absolutely irreversible processes [168].

While the nonequilibrium flow is general enough to cover at least all reversible CRN with LMA kinetics, the classes of nonlinear dynamics other than CRN are much wider in general. To further extend the range of models that can be covered, GENERIC (General Equation for Non-Equilibrium Reversible–Irreversible Coupling) would be a good candidate [169]. GENERIC is a theoretical framework that integrates dissipative dynamics (gradient flow dynamics) and conservative dynamics (Hamiltonian dynamics). The extension of the generalized flow to GENERIC has already been attempted but is still ongoing [77, 79]. One might also consider Hamiltonian-type dynamics, which differs from the GENERIC structure mentioned above. In the doubly dual flat structure, dual spaces are statically coupled by the Legendre duality. However, we could instead consider the coupling of two dynamics, one defined on the primal space and the other on the dual space. Such coupling has been investigated in relation to accelerations of gradient flows [170], optimal control problems [171], and also mean field game problems [172]. It would be an interesting problem to formulate this dynamic coupling in relation to our results and also the results of GENERIC. Information geometry could offer new insights and techniques for addressing these problems.

10.3 Homological algebra and differential geometric formulations

From the viewpoint of standard homological algebra, the doubly dual flat structure that we introduced is an extension of chain and cochain complexes with inner product structure. Because the homological algebra used here is an abstraction of the differential form, the doubly dual flat structure can also be viewed as an extension of the differential form and might be called a dually flat form. It would be an interesting problem to characterize this structure under a more rigorous mathematical formulation and to investigate whether the Legendre duality can be consistently introduced for chains and cochains higher than those of the edge space. From the viewpoint of differential geometry, the dually flat structure can be defined independently of the specific coordinates by the Hessian geometry [145]. While we stick to the standard basis of the graph or hypergraph on which the convex thermodynamic function is defined, we can formulate it more generally. It would be important future work to clarify how the doubly dual flat structure can be formulated from a differential geometric perspective.

10.4 Consistency and persistence

Finally, we would like to mention the problem of consistency and persistence. In this work, we presume that the flux \(\varvec{j}(\varvec{x})\) is consistent with \(\mathbb {H}\) and that the trajectory is persistent. The explicit conditions under which these are satisfied are still elusive. Actually, the condition for consistency is intricate, even for the separable cases. As an illustrative example, suppose that the ith molecule is involved as a reactant in the \(e\hbox {th}\) reaction of a CRN with LMA kinetics. For \(x_{i} \rightarrow 0\), \(j_{e}^{+}(\varvec{x}) \rightarrow 0\) holds. However, their Legendre duals diverge as \(y_{i} \rightarrow -\infty \) and \(f_{e}(\varvec{x}) \rightarrow -\infty \). The flux \(j_{e}(\varvec{x})\) stays finite because \(\omega _{e}(\varvec{x})\rightarrow 0\) holds. This example suggests that we have to consider a certain limit of the relevant quantities to appropriately address the consistency condition.
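
The limiting behavior in this example can be illustrated numerically. The following is a minimal sketch for a single hypothetical reversible reaction with one reactant and one product, assuming the convention \(f_{e}=\frac{1}{2}\ln (j_{e}^{+}/j_{e}^{-})\), \(\omega _{e}=2\sqrt{j_{e}^{+}j_{e}^{-}}\), and \(j_{e}=\omega _{e}\sinh (f_{e})\), which reproduces \(j_{e}=j_{e}^{+}-j_{e}^{-}\). The normalization of the force and activity here is an assumption and may differ by constant factors from the definitions used elsewhere in this work; the rate constants, concentrations, and function name are illustrative only.

```python
import math

def lma_quantities(x_reactant, x_product=1.0, k_plus=1.0, k_minus=1.0):
    """One-way fluxes, force, activity, and net flux of a reaction A <-> B.

    Assumed convention (hypothetical normalization):
    f = 0.5 * ln(j+ / j-), omega = 2 * sqrt(j+ * j-), j = omega * sinh(f).
    """
    j_plus = k_plus * x_reactant     # forward one-way flux, linear in x_reactant
    j_minus = k_minus * x_product    # reverse one-way flux, held fixed here
    force = 0.5 * math.log(j_plus / j_minus)
    activity = 2.0 * math.sqrt(j_plus * j_minus)
    flux = activity * math.sinh(force)
    return j_plus, j_minus, force, activity, flux

# Let the reactant concentration approach zero.
concentrations = [1e-2, 1e-4, 1e-8]
results = [lma_quantities(x) for x in concentrations]

for x, (j_plus, j_minus, force, activity, flux) in zip(concentrations, results):
    print(f"x={x:.0e}  j+={j_plus:.1e}  f={force:.3f}  omega={activity:.1e}  j={flux:.8f}")
```

As the reactant concentration decreases, the force becomes increasingly negative and the activity shrinks toward zero, while their combination stays close to \(-j_{e}^{-}\), in line with the finite limit of the flux described above.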

The persistence of the nonequilibrium flow would be a much harder problem. While persistence has been approached for CB CRN with LMA kinetics using techniques from algebraic geometry [112], its connection to information geometry has yet to be clarified. Moreover, from an information-geometric viewpoint, the loss of persistence means a change in the support of probability or positive density, which effectively results in a change in the topology of the underlying graph or hypergraph. To resolve the problem, we may need a deeper understanding of the interrelationship among dynamics, information-geometric structure, and the underlying topology.