
Here we construct a general theory of the relationship between information and stochastic thermodynamics. Characterizing complex nonequilibrium dynamics by causal networks, we derive a generalized second law of thermodynamics with information flow. This chapter is a refinement of our result [Ito S., & Sagawa T., Phys. Rev. Lett. 111, 180603 (2013)] [1].

6.1 Entropy on Causal Networks

6.1.1 Entropy Production on Causal Networks

First of all, we clarify how to introduce the entropy production on causal networks. Let \(\mathcal {V} = \{ a_1, \dots , a_{N_{\mathcal {V}}} \}\) be the set of nodes of a causal network, where each \(a_k\) represents a random variable. We introduce a set of random variables \(X= \{ x_1, \dots , x_N \}\), which represents the time evolution of the target system; \(x_k\) denotes the state of the target system X at time k. X is a subset of \(\mathcal {V}\), i.e., \(X \subseteq \mathcal {V}\). We assume the following properties of \(x_k\):

$$\begin{aligned} x_{k'-1}&\in \mathrm{pa} (x_k) \, \, \, (k' = k), \end{aligned}$$
(6.1)
$$\begin{aligned} x_{k' -1}&\notin \mathrm{pa} (x_k) \, \, \, (k' \ne k), \end{aligned}$$
(6.2)

with \(k \ge 2\). The former assumption indicates that the time evolution of the target system X is characterized by the sequence of edges \(x_1 \rightarrow x_2 \rightarrow \cdots \rightarrow x_N\). The latter assumption corresponds to the Markov property of the physical dynamics. We stress that the latter assumption does not prohibit non-Markovian dynamics of \(\mathcal {V}\). Next, we define the other system as

$$\begin{aligned} \mathcal {C}= \{ c_1, \dots , c_{N'}\} := \mathcal {V} \setminus X. \end{aligned}$$
(6.3)

Because \(c_l\) is an element of \(\mathcal {V}\), we can write \(c_l = a_j\) for some j. We can then introduce the time ordering of the \(c_l\)'s from the topological ordering of \(\mathcal {V}\): we assume that \(l' < l\) if \(j' < j\), where \(c_l = a_j\) and \(c_{l'} = a_{j'}\). This assumption indicates that \(c_1, \dots , c_{N'}\) are arranged in time order.

The joint probability \(p(\mathcal {V})\) is given by the chain rule of Bayesian networks, Eq. (5.4):

$$\begin{aligned} p(\mathcal {V})&= p(X, \mathcal {C}) \nonumber \\&= \prod _{k=1}^N p(x_k| \mathrm{pa} (x_k)) \prod _{l=1}^{N'} p(c_l | \mathrm{pa}(c_l)). \end{aligned}$$
(6.4)

The conditional probabilities \(\prod _{k=1}^N p(x_k| \mathrm{pa} (x_k))\) represent the path probability of the target system X, and the conditional probabilities \(\prod _{l=1}^{N'} p(c_l| \mathrm{pa} (c_l))\) represent the path probability of the other system \(\mathcal {C}\).
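As an illustration of the chain rule (6.4), the joint probability \(p(\mathcal {V})\) can be evaluated numerically once the parent sets and conditional probabilities are specified. The following is a minimal sketch in Python; the three-node network and its conditional probability tables are hypothetical and chosen only for illustration.

```python
import itertools

# Hypothetical three-node causal network x1 -> m1, (x1, m1) -> x2, with binary states.
# parents[node] lists the parent nodes; cpt[node] maps parent values to p(node = 1 | parents).
parents = {"x1": (), "m1": ("x1",), "x2": ("x1", "m1")}
cpt = {
    "x1": {(): 0.3},                              # p(x1 = 1)
    "m1": {(0,): 0.1, (1,): 0.8},                 # p(m1 = 1 | x1)
    "x2": {(0, 0): 0.2, (0, 1): 0.6,              # p(x2 = 1 | x1, m1)
           (1, 0): 0.4, (1, 1): 0.9},
}

def joint_probability(values):
    """Chain rule of Bayesian networks, Eq. (6.4): p(V) = prod_a p(a | pa(a))."""
    p = 1.0
    for node, pa in parents.items():
        p1 = cpt[node][tuple(values[q] for q in pa)]
        p *= p1 if values[node] == 1 else 1.0 - p1
    return p

# The joint distribution obtained from the chain rule is normalized.
total = sum(joint_probability(dict(zip(("x1", "m1", "x2"), v)))
            for v in itertools.product((0, 1), repeat=3))
print(total)  # -> 1.0
```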

Fig. 6.1 A schematic of X and \(\mathcal {B}_{k+1}\) on causal networks. \(\mathcal {B}_{k+1}\) represents a set of random variables that can affect the time evolution of X from time k to \(k+1\)

We introduce a set of random variables \(\mathcal {B}_{k+1} := \mathrm{pa} (x_{k+1} ) \setminus \{ x_k\}\), which affect the time evolution of the target system X from state \(x_k\) to \(x_{k+1}\) at time k (see also Fig. 6.1). \(\mathcal {B}_{k+1}\) is a subset of the variables in the other system, i.e., \(\mathcal {B}_{k+1} \subseteq \mathcal {C}\). By definition of \(\mathcal {B}_{k+1}\), the transition probability in X at time k is rewritten as

$$\begin{aligned} p(x_{k+1} | \mathrm{pa} (x_{k+1}))= p(x_{k+1} | x_k , \mathcal {B}_{k+1}), \end{aligned}$$
(6.5)

which indicates that, in the time evolution from state \(x_k\) to \(x_{k+1}\), \(\mathcal {B}_{k+1}\) plays the role of a set of external parameters (e.g., a memory in a feedback system). Thus, the entropy change in the heat baths at time k is given by

$$\begin{aligned} \Delta s^k_\mathrm{bath} = \ln \frac{p(x_{k+1} |x_k, \mathcal {B}_{k+1})}{p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1})}, \end{aligned}$$
(6.6)

which is a modification of the detailed fluctuation theorem [e.g., Eq. (4.3)]. Here \(p_B\) describes the probability of the backward process. The backward probability is defined by \(p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1})= p(x_{k}^+, -x_{k}^-|x_{k+1}^+, -x_{k+1}^-, \mathcal {B}_{k+1}^+, -\mathcal {B}_{k+1}^-)\), where \(x_{k}^+\) (\(\mathcal {B}_{k+1}^+\)) denotes the variables that are even functions of the momenta, and \(x_{k}^-\) (\(\mathcal {B}_{k+1}^-\)) denotes the variables that are odd functions of the momenta. The entropy production \(\sigma \) in the target system X from time \(k=1\) to \(k=N\) is defined as

$$\begin{aligned} \sigma&:= \ln p(x_1) - \ln p(x_N) + \sum _{k=1}^{N-1} \Delta s_\mathrm{bath}^k \nonumber \\&= \ln \left[ \frac{p(x_1)}{p(x_N)} \prod _{k=1}^{N-1} \frac{p(x_{k+1} |x_k, \mathcal {B}_{k+1})}{p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1})} \right] . \end{aligned}$$
(6.7)

6.1.2 Examples of Entropy Production on Causal Networks

We here show that the definition of the entropy production \(\sigma \) reproduces the known expressions in three examples (the Markov chain, feedback control with a single measurement, and coupled Langevin equations).

Fig. 6.2 Examples of X and \(\mathcal {C}\) on causal networks. Example 1 Markov chain. Example 2 Feedback control with a single measurement. Example 3 Coupled Langevin equations.

Example 1: Markov Chain

The causal network corresponding to the Markov chain is given by \(\mathcal {V} = \{x_1, \dots , x_N \}\), \(\mathrm{pa} (x_k) = x_{k-1}\) with \(k \ge 2\), and \(\mathrm{pa} (x_1) = \emptyset \) (see also Fig. 6.2). We set \(X = \{x_1, \dots , x_N \} \) and \(\mathcal {C} = \emptyset \), so that we have \(\mathcal {B}_{k+1} = \mathrm{pa} (x_{k+1}) \setminus \{ x_k\} = \emptyset \). Thus, the entropy production on causal networks Eq. (6.7) gives the entropy production for the Markov chain Eq. (3.15):

$$\begin{aligned} \sigma = \ln \left[ \frac{p(x_1)}{p(x_N)} \prod _{k=1}^{N-1} \frac{p(x_{k+1} |x_k)}{p_B(x_{k} |x_{k+1})} \right] . \end{aligned}$$
(6.8)

Example 2: Feedback control with a single measurement

The causal network corresponding to the system under feedback control with a single measurement is given by \(\mathcal {V} = \{x_1, m_1, x_2, \dots , x_N \}\), \(\mathrm{pa} (x_k) = \{m_1, x_{k-1} \}\) with \(k \ge 2\), \(\mathrm{pa} (x_1) = \emptyset \), and \(\mathrm{pa} (m_1) = \{ x_1 \}\) (see also Fig. 6.2). We set \(X = \{x_1, \dots , x_N \} \) and \(\mathcal {C} = \{c_1 :=m_1 \}\), so that we have \(\mathcal {B}_{k+1} = \mathrm{pa} (x_{k+1}) \setminus \{ x_k\} = \{ m_1\}\) with \(k \ge 1\). Thus, the entropy production on causal networks Eq. (6.7) gives the entropy production for feedback control Eq. (4.4):

$$\begin{aligned} \sigma = \ln \left[ \frac{p(x_1)}{p(x_N)} \prod _{k=1}^{N-1} \frac{p(x_{k+1} |x_k, m_1)}{p_B(x_{k} |x_{k+1}, m_1)} \right] . \end{aligned}$$
(6.9)

Example 3: Coupled Langevin equations

Here we discuss the following coupled Langevin equations

$$\begin{aligned} \dot{x}(t)&= f_x(x (t), y(t)) + \xi ^x (t), \nonumber \\ \dot{y} (t)&= f_y (x (t) , y(t)) + \xi ^y (t), \nonumber \\ \langle \xi ^x (t) \rangle&=0, \nonumber \\ \langle \xi ^y (t) \rangle&=0, \nonumber \\ \langle \xi ^x (t) \xi ^x (t') \rangle&= 2 T^x \delta (t-t'), \nonumber \\ \langle \xi ^y (t) \xi ^y (t') \rangle&= 2 T^y \delta (t-t'), \nonumber \\ \langle \xi ^x (t) \xi ^y (t') \rangle&= 0, \end{aligned}$$
(6.10)

where \(x_t := x(t)\) (\(y_t := y(t)\)) is the dynamical variable of the system X (Y). The corresponding Bayesian network is given by \(\mathcal {V} = \{x_t, y_t, x_{t+dt}, y_{t+dt} \}\), \(\mathrm{pa} (x_t) = \emptyset \), \(\mathrm{pa} (y_t) = x_t\), \(\mathrm{pa} (x_{t+dt}) = \{x_t, y_t \}\) and \(\mathrm{pa} (y_{t+dt}) = \{ x_t, y_t \}\) (see also Fig. 6.2). The entropy production on causal networks Eq. (6.7) gives

$$\begin{aligned} \sigma = \ln \left[ \frac{p(x_t)}{p(x_{t+dt})} \frac{p(x_{t+dt} |x_t, y_t)}{p_B(x_{t} |x_{t+dt}, y_{t} )} \right] , \end{aligned}$$
(6.11)

where we set \(X = \{x_1 := x_t, x_2 := x_{t+dt}\}\), \(\mathcal {C} = \{c_1 := y_t, c_2 := y_{t+dt} \}\), and \(\mathcal {B}_2 = y_t\). For the coupled Langevin dynamics, we can explicitly calculate the entropy change in heat baths \(\Delta s_\mathrm{bath}^{k=1}\). The conditional probability \(p(x_{t+dt}|x_t, y_t)\) is given by

$$\begin{aligned} p(x_{t+dt}|x_t, y_t) = \mathcal {N}_x \exp \left[ - \frac{(x_{t+dt} - x_t -f_x (x_t, y_t) dt )^2}{4 T^x dt}\right] , \end{aligned}$$
(6.12)

and the backward probability \(p_B(x_{t} |x_{t+dt}, y_{t} )\) is defined as

$$\begin{aligned} p_B(x_{t} |x_{t+dt}, y_{t} ) := \mathcal {N}_x \exp \left[ - \frac{(x_{t} - x_{t+dt} -f_x (x_{t+dt}, y_t) dt )^2}{4 T^x dt}\right] , \end{aligned}$$
(6.13)

where we assume that \(x_t\) and \(y_t\) are even functions of the momentum. Up to the order o(dt), the entropy change in heat baths \(\Delta s_\mathrm{bath}^{k=1}\) is calculated as

$$\begin{aligned} \Delta s_\mathrm{bath}^{k=1}&:= \ln \frac{p(x_{t+dt} |x_t, y_t)}{p_B(x_{t} |x_{t+dt}, y_{t} )} \nonumber \\&=\frac{f_x (x_t, y_t) + f_x (x_{t+dt}, y_t)}{2T^x} (x_{t+dt} -x_t) \nonumber \\&= \frac{f_x (x_t, y_t) + f_x (x_{t+dt}, y_{t+dt})}{2T^x} (x_{t+dt} -x_t) \nonumber \\&= -\frac{ (\xi ^x (t) -\dot{x}(t)) \circ \dot{x}(t)}{T^x} dt. \end{aligned}$$
(6.14)

Here, \((\xi ^x (t) -\dot{x}(t)) \circ \dot{x}(t)\) is Sekimoto’s definition of the heat flow into the system X for the Langevin equations [2], so that \(\Delta s_\mathrm{bath}^{k=1}\) is minus the heat absorbed by X divided by \(T^x\). We add that, up to the order o(dt), \(\Delta s_\mathrm{bath}^{k=1}\) can be rewritten as

$$\begin{aligned} \Delta s_\mathrm{bath}^{k=1 }&= \frac{f_x (x_t, y_t) + f_x (x_{t+dt}, y_{t+dt})}{2T^x} (x_{t+dt} -x_t) \nonumber \\&= \ln \frac{p(x_{t+dt} |x_t, y_t)}{p_B(x_{t} |x_{t+dt}, y_{t+dt} )}, \end{aligned}$$
(6.15)

where the backward probability is defined as

$$\begin{aligned} p_B(x_{t} |x_{t+dt}, y_{t+dt} ) :=\mathcal {N}_x \exp \left[ - \frac{(x_{t} - x_{t+dt} -f_x (x_{t+dt}, y_{t+dt}) dt )^2}{4 T^x dt}\right] . \end{aligned}$$
(6.16)

This fact indicates that it does not matter whether we condition the backward probability on \(y_t\) or on \(y_{t+dt}\) when the dynamical variables are discretized with an infinitesimal time interval dt.
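As a numerical sanity check of Eq. (6.14), the log-ratio of the Gaussian propagators (6.12) and (6.13) can be compared with the Stratonovich (midpoint) expression for a single small time step. A minimal sketch, in which the drift \(f_x\), the temperature, and the time step are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
Tx, dt = 1.0, 1e-4                                # hypothetical temperature and time step
f_x = lambda x, y: -x + 0.5 * y                   # hypothetical drift of the x-dynamics

x_t, y_t = 0.3, -0.2
x_next = x_t + f_x(x_t, y_t) * dt + rng.normal(0.0, np.sqrt(2 * Tx * dt))

# log-ratio of the forward and backward Gaussian propagators, Eqs. (6.12)-(6.13);
# the normalization factors N_x cancel
fw = -(x_next - x_t - f_x(x_t, y_t) * dt) ** 2 / (4 * Tx * dt)
bw = -(x_t - x_next - f_x(x_next, y_t) * dt) ** 2 / (4 * Tx * dt)
ds_bath_exact = fw - bw

# Stratonovich (midpoint) expression of Eq. (6.14)
ds_bath_strat = (f_x(x_t, y_t) + f_x(x_next, y_t)) / (2 * Tx) * (x_next - x_t)

print(ds_bath_exact, ds_bath_strat)               # the two values agree up to o(dt)
```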

6.1.3 Transfer Entropy on Causal Networks

We here discuss the transfer entropy on causal networks. On causal networks, we have two time series, \(X= \{ x_1, \dots , x_N \}\) and \(\mathcal {C} =\{ c_1, \dots , c_{N'} \}\). The transfer entropy is a measure of causal dependence in the dynamics. Thus, the most natural definition of the transfer entropy from X to \(\mathcal {C}\) is based on the set of parents \(\mathrm{pa} (c_l)\) appearing in the transition probability \(p(c_l|\mathrm{pa} (c_l))\).

The set of parents \(\mathrm{pa} (c_l)\) generally includes elements of both X and \(\mathcal {C}\). We define the intersection of X (\(\mathcal {C}\)) with \(\mathrm{pa} (c_l)\) as \(\mathrm{pa}_X (c_l) := X \cap \mathrm{pa} (c_l)\) (\(\mathrm{pa}_{\mathcal {C}} (c_l) := \mathcal {C} \cap \mathrm{pa} (c_l)\)), where \(\cap \) denotes intersection. The set of parents \(\mathrm{pa} (c_l) \) is rewritten as \(\mathrm{pa} (c_l) = \{\mathrm{pa}_X (c_l) , \mathrm{pa}_{\mathcal {C}} (c_l) \}\), so that the transition probability \(p(c_l|\mathrm{pa} (c_l))\) is calculated as

$$\begin{aligned} p(c_l|\mathrm{pa} (c_l))&= p(c_l|\mathrm{pa}_X (c_l) , \mathrm{pa}_{\mathcal {C}} (c_l)) \nonumber \\&=p(c_l|\mathrm{pa}_X (c_l) , c_{l-1}, \dots , c_1), \end{aligned}$$
(6.17)

where we used the property of the Bayesian network Eq. (5.5). In the transition probability \(p(c_l|\mathrm{pa} (c_l))\), the set \(\mathrm{pa}_X (c_l)\) indicates the causal dependence on the target system X in the dynamics from \(\{ c_{1}, \dots , c_{l-1} \}\) to \(c_l\). By comparing the transition probability in \(\mathcal {C}\) with that under the condition \(\mathrm{pa}_X (c_l)\), we introduce the transfer entropy from X to \(\mathcal {C}\) at step l as

$$\begin{aligned} I^l_\mathrm{tr}&:= \langle \ln p(c_l|\mathrm{pa}_X (c_l) , c_{l-1}, \dots , c_1) - \ln p (c_l|c_{l-1}, \dots , c_1)\rangle \nonumber \\&= \langle \ln p(c_l|\mathrm{pa} (c_l) ) - \ln p (c_l|c_{l-1}, \dots , c_1)\rangle . \end{aligned}$$
(6.18)

This transfer entropy can be rewritten as the conditional mutual information

$$\begin{aligned} I^l_\mathrm{tr} = I(c_l: \mathrm{pa}_X (c_l) | c_{l-1}, \dots , c_1). \end{aligned}$$
(6.19)

From the nonnegativity of the mutual information, we have \(I^l_\mathrm{tr} \ge 0\) with equality if and only if \( p(c_l|\mathrm{pa}_X (c_l) , c_{l-1}, \dots , c_1) = p (c_l|c_{l-1}, \dots , c_1)\) [or equivalently \(\mathrm{pa}_X (c_l) = \emptyset \)]. We also define the stochastic transfer entropy \(i^l_\mathrm{tr}\) as

$$\begin{aligned} i^l_\mathrm{tr} = \ln p(c_l|\mathrm{pa} (c_l)) - \ln p (c_l|c_{l-1}, \dots , c_1). \end{aligned}$$
(6.20)

The sum of the transfer entropy \(\sum _l I^l_\mathrm{tr} \) is a quantity similar to the directed information \(I^{DI}\), Eq. (2.25).
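The transfer entropy (6.18)–(6.19) is straightforward to evaluate once the joint distribution is available. The following sketch computes \(I^2_\mathrm{tr} = I(c_2 : \mathrm{pa}_X(c_2) | c_1)\) for a small binary network in which \(x_1 \rightarrow x_2\) is the target system and \(m_1\), \(m_2\) are noisy measurements of \(x_1\), \(x_2\); all probability tables are hypothetical.

```python
import numpy as np
import itertools

# Binary network: x1 -> x2 is the target system X; m1 measures x1 and m2 measures x2,
# so C = {c1 := m1, c2 := m2}, pa_X(m2) = {x2}, and I_tr^2 = I(m2 : x2 | m1).
p_x1 = np.array([0.5, 0.5])
p_m1_x1 = np.array([[0.9, 0.1], [0.2, 0.8]])      # p(m1 | x1)
p_x2_x1 = np.array([[0.7, 0.3], [0.1, 0.9]])      # p(x2 | x1)
p_m2_x2 = np.array([[0.8, 0.2], [0.3, 0.7]])      # p(m2 | x2)

# joint p(x1, m1, x2, m2) from the chain rule, Eq. (6.4)
p = np.zeros((2, 2, 2, 2))
for x1, m1, x2, m2 in itertools.product((0, 1), repeat=4):
    p[x1, m1, x2, m2] = p_x1[x1] * p_m1_x1[x1, m1] * p_x2_x1[x1, x2] * p_m2_x2[x2, m2]

# marginals needed for I(m2 : x2 | m1), Eq. (6.19)
p_m1x2m2 = p.sum(axis=0)
p_m1x2, p_m1m2, p_m1 = p.sum(axis=(0, 3)), p.sum(axis=(0, 2)), p.sum(axis=(0, 2, 3))

I_tr2 = sum(
    p_m1x2m2[m1, x2, m2]
    * np.log(p_m1x2m2[m1, x2, m2] * p_m1[m1] / (p_m1x2[m1, x2] * p_m1m2[m1, m2]))
    for m1, x2, m2 in itertools.product((0, 1), repeat=3)
)
print("I_tr^2 = I(m2 : x2 | m1) =", I_tr2)        # nonnegative, as guaranteed by Eq. (6.19)
```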

6.1.4 Initial and Final Correlations on Causal Networks

We here define two types of mutual information which represent the initial and final correlations between the target system X and the outside world \(\mathcal {C}\).

First, we define the initial correlation on causal networks. The initial state \(x_1\) is initially correlated with its parents \(\mathrm{pa} (x_1)\), because the state of \(x_1\) is given by the transition probability \(p(x_1| \mathrm{pa} (x_1) )\). \(\mathrm{pa} (x_1)\) is a set of variables in the outside world, i.e., \(\mathrm{pa} (x_1) \subseteq \mathcal {C}\). A natural quantification of the initial correlation between X and \(\mathcal {C}\) is the mutual information between \(x_1\) and its parents:

$$\begin{aligned} I_\mathrm{ini} := I(x_1 : \mathrm{pa} (x_1)). \end{aligned}$$
(6.21)

From the nonnegativity of the mutual information, we have \(I_\mathrm{ini} \ge 0\) with the equality satisfied if and only if \(p(x_1| \mathrm{pa} (x_1) )= p(x_1)\) [or equivalently \(\mathrm{pa} (x_1) = \emptyset \)].

Next, we define the final correlation on causal networks. The dynamics of the target system X generally depends on the ancestors of the final state \(x_N\), \(\mathrm{an} (x_N)\). We introduce the set \(\mathcal {C}' :=\mathrm{an} (x_N) \cap \mathcal {C}\), which is the history of the outside world \(\mathcal {C}\) that can affect the final state \(x_N\). Thus, a natural quantification of the final correlation between X and \(\mathcal {C}\) is given by the mutual information between \(x_N\) and \(\mathcal {C}'\):

$$\begin{aligned} I_\mathrm{fin} := I(x_N : \mathcal {C}'). \end{aligned}$$
(6.22)

We also define the stochastic initial correlation and the stochastic final correlation as

$$\begin{aligned} i_\mathrm{ini}&:= i(x_1:\mathrm{pa} (x_1) ) \nonumber \\&= \ln \frac{p(x_1|\mathrm{pa} (x_1))}{p(x_1)},\end{aligned}$$
(6.23)
$$\begin{aligned} i_\mathrm{fin}&:= i(x_N : \mathcal {C}') \nonumber \\&= \ln \frac{p(x_N , \mathcal {C}')}{p(x_N)p( \mathcal {C}')}, \end{aligned}$$
(6.24)

respectively.

6.2 Generalized Second Law on Causal Networks

We now state the main result of this thesis. In the foregoing setup, we have the generalized second law for subsystem X in the presence of the other system \(\mathcal {C}\).

6.2.1 Relative Entropy and Generalized Second Law

Here, we define the key informational quantity \(\Theta \) characterized by the topology of the causal network:

$$\begin{aligned} \Theta := i_\mathrm{fin} -i_\mathrm{ini} - \sum _{l |c_l \in \mathcal {C}'} i^l_\mathrm{tr}. \end{aligned}$$
(6.25)

This quantity \(\Theta \) represents the total stochastic information flow from the target system X to the outside world \(\mathcal {C}'\) in the dynamics from \(x_1\) to \(x_N\), where \(i_\mathrm{fin}\) and \(i_\mathrm{ini}\) are the boundary terms. Its ensemble average \(\langle \Theta \rangle \) gives the total information flow in terms of the mutual information and the transfer entropy.

We show that the difference between the entropy production and the informational quantity \(\sigma - \Theta \) can be rewritten as the stochastic relative entropy

$$\begin{aligned} \sigma -\Theta&= \ln \left[ \frac{p(x_1)}{p(x_N)} \prod _{k=1}^{N-1} \frac{p(x_{k+1} |x_k, \mathcal {B}_{k+1})}{p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1})} \right] -\ln \frac{p(x_N , \mathcal {C}')}{p(x_N)p( \mathcal {C}')} + \ln \frac{p(x_1|\mathrm{pa} (x_1))}{p(x_1)} \nonumber \\&\quad + \sum _{l |c_l \in \mathcal {C}'} \ln \frac{p(c_l|\mathrm{pa} (c_l)) }{p(c_l|c_{l-1}, \dots , c_1)} \nonumber \\&= \ln \left[ \frac{\prod _{k=1}^{N} p(x_{k} |\mathrm{pa} (x_k)) \prod _{{l |c_l \in \mathcal {C}'}} p(c_l|\mathrm{pa} (c_l)) }{\prod _{k=1}^{N-1} p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1}) p(x_N , \mathcal {C}') } \right] \nonumber \\&= \ln \left[ \frac{p (\mathcal {V})}{\prod _{k=1}^{N-1} p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1}) p(x_N , \mathcal {C}') \prod _{{l |c_l \notin \mathcal {C}'}} p(c_l|\mathrm{pa} (c_l)) } \right] \nonumber \\&= d_\mathrm{KL} (p (\mathcal {V})|| p_B (\mathcal {V})), \end{aligned}$$
(6.26)

where we define the backward path probability \(p_B (\mathcal {V})\) as

$$\begin{aligned} p_B (\mathcal {V})= \prod _{k=1}^{N-1} p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1}) p(x_N , \mathcal {C}') \prod _{{l |c_l \notin \mathcal {C}'}} p(c_l|\mathrm{pa} (c_l)). \end{aligned}$$
(6.27)

The backward path probability satisfies the normalization of the probability such as

$$\begin{aligned} \sum _{\mathcal {V}} p_B (\mathcal {V})&= \sum _{X, \mathcal {C}'} \prod _{k=1}^{N-1} p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1}) p(x_N , \mathcal {C}') \nonumber \\&=\sum _{x_N, \mathcal {C}'}p(x_N , \mathcal {C}') \nonumber \\&=1. \end{aligned}$$
(6.28)

The definition of the backward path probability \(p_B (\mathcal {V})\) indicates that the conditional probability of the target system X is given by the backward transition probabilities (i.e., \(\prod _{k=1}^{N-1} p_B(x_{k} |x_{k+1}, \mathcal {B}_{k+1})\)), while the probability of the other system \(\mathcal {C}\) is given by the distribution of the forward process (i.e., \( p(x_N , \mathcal {C}') \prod _{{l |c_l \notin \mathcal {C}'}} p(c_l|\mathrm{pa}(c_l))\)). In other words, we consider the backward path only for the target system X under the condition of the stochastic variables \(\mathcal {C}\), whose distribution is given by that of the forward process \(p(\mathcal {V})\).

From the identity Eq. (3.28) and the nonnegativity of the relative entropy \(D_\mathrm{KL} (p (\mathcal {V})|| p_B (\mathcal {V})) = \langle d_\mathrm{KL} (p (\mathcal {V})|| p_B (\mathcal {V})) \rangle \ge 0\), we have the generalizations of the integral fluctuation theorem and the second law of thermodynamics,

$$\begin{aligned} \langle \exp (- \sigma + \Theta ) \rangle =1, \end{aligned}$$
(6.29)
$$\begin{aligned} \langle \sigma \rangle \ge I_\mathrm{fin} - I_\mathrm{ini} - \sum _{{l |c_l \in \mathcal {C}'}} I_\mathrm{tr}^l. \end{aligned}$$
(6.30)
Fig. 6.3 Schematic of the generalized second law on causal networks. We consider two fluctuating subsystems X and \(\mathcal {C}\). The entropy production of X is generally bounded by the informational quantity \(\langle \Theta \rangle \), which includes the initial correlation \(I_\mathrm{ini}\) between X and \(\mathcal {C}\), the final correlation \(I_\mathrm{fin}\) between them, and the transfer entropy \(I_\mathrm{tr}\) from X to \(\mathcal {C}'\) during the dynamics. We can automatically calculate the informational quantity \(\langle \Theta \rangle \) using the graphical representation of causal networks.

The equality in Eq. (6.30) holds if and only if a kind of reversibility \(p (\mathcal {V}) =p_B (\mathcal {V})\) holds. Application of the generalized second law to specific problems is straightforward using the graphical representation of causal networks (see also Fig. 6.3). We next show several applications to stochastic models.

6.2.2 Examples of Generalized Second Law on Causal Networks

We here illustrate that the generalized integral fluctuation theorem Eq. (6.29) and the generalized second law Eq. (6.30) can reproduce known nonequilibrium relations in a unified way, and moreover can lead to novel results.

Example 1: Markov Chain

We consider the causal network corresponding to the Markov chain: \(\mathcal {V} := \{ x_1, \dots , x_N\}\), \(\mathrm{pa} (x_k) = \{ x_{k-1}\}\) with \(k\ge 2\), and \(\mathrm{pa} (x_1) = \emptyset \) (see also Fig. 6.4). We here set \(X= \{ x_1, \dots , x_N\}\) and \(\mathcal {C} = \emptyset \). We have \(i_\mathrm{fin} =0\), \(i_\mathrm{ini} = 0\) and \(i_\mathrm{tr}^l =0\). From the generalized integral fluctuation theorem Eq. (6.29) and the generalized second law Eq. (6.30), we reproduce the conventional integral fluctuation theorem and second law, Eqs. (3.29) and (3.31):

$$\begin{aligned} \langle \exp (-\sigma ) \rangle&=1, \end{aligned}$$
(6.31)
$$\begin{aligned} \langle \sigma \rangle&\ge 0. \end{aligned}$$
(6.32)
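A minimal numerical check of Eqs. (6.31) and (6.32) for a binary Markov chain with \(N=3\) is sketched below; the transition matrices are hypothetical, and the backward transition \(p_B\) is assumed to be the forward one with the initial and final states exchanged.

```python
import numpy as np
import itertools

p1 = np.array([0.7, 0.3])                          # p(x1), hypothetical
T = [np.array([[0.8, 0.2], [0.4, 0.6]]),           # p(x2 | x1), hypothetical
     np.array([[0.6, 0.4], [0.1, 0.9]])]           # p(x3 | x2), hypothetical

p2 = p1 @ T[0]
p3 = p2 @ T[1]
marg = [p1, p2, p3]

def sigma(path):
    """Eq. (6.8); p_B(x_k | x_{k+1}) is assumed to be the forward rule with the states exchanged."""
    s = np.log(p1[path[0]]) - np.log(marg[-1][path[-1]])
    for k in range(2):
        s += np.log(T[k][path[k], path[k + 1]] / T[k][path[k + 1], path[k]])
    return s

def p_path(path):
    return p1[path[0]] * T[0][path[0], path[1]] * T[1][path[1], path[2]]

paths = list(itertools.product((0, 1), repeat=3))
print("<exp(-sigma)> =", sum(p_path(w) * np.exp(-sigma(w)) for w in paths))   # = 1, Eq. (6.31)
print("<sigma>       =", sum(p_path(w) * sigma(w) for w in paths))            # >= 0, Eq. (6.32)
```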
Fig. 6.4 Examples of the generalized second law on causal networks. Example 1 Markov chain. Example 2 Feedback control with a single measurement

Example 2: Feedback Control with a Single Measurement

We consider the causal network corresponding to the system under feedback control with a single measurement: \(\mathcal {V} := \{ x_1, m_1, x_2, \dots , x_N\}\), \(\mathrm{pa} (x_k) = \{ x_{k-1}, m_1\}\) with \(k\ge 2\), \(\mathrm{pa} (m_1) = \{ x_{1} \}\), and \(\mathrm{pa} (x_1) = \emptyset \) (see also Fig. 6.4). We here set \(X= \{ x_1, \dots , x_N\}\) and \(\mathcal {C} =\{ m_1 \}\). We have \(i_\mathrm{fin} =i (x_N: m_1)\), \(i_\mathrm{ini} = 0\) and \(i_\mathrm{tr}^1 =i (x_1: m_1)\). From the generalized integral fluctuation theorem Eq. (6.29) and the generalized second law Eq. (6.30), we reproduce Sagawa–Ueda relations Eqs. (4.8) and (4.10):

$$\begin{aligned} \langle \exp [-\sigma + i(x_N: m_1) - i(x_1: m_1)] \rangle =1,\end{aligned}$$
(6.33)
$$\begin{aligned} \langle \sigma \rangle \ge I(x_N: m_1) - I(x_1:m_1). \end{aligned}$$
(6.34)
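The relations (6.33) and (6.34) can likewise be checked numerically for binary states with \(N=2\). In the following sketch the probability tables are hypothetical, and the backward transition is again assumed to be the forward one with \(x_1\) and \(x_2\) exchanged, which satisfies the normalization (6.28).

```python
import numpy as np
import itertools

# Hypothetical binary probability tables for the network x1 -> m1, (x1, m1) -> x2.
p_x1 = np.array([0.6, 0.4])                            # p(x1)
p_m1_x1 = np.array([[0.95, 0.05], [0.1, 0.9]])          # p(m1 | x1): a noisy measurement
p_x2_x1m1 = np.array([[[0.9, 0.1], [0.3, 0.7]],         # p(x2 | x1, m1): feedback depending on m1
                      [[0.6, 0.4], [0.1, 0.9]]])

states = list(itertools.product((0, 1), repeat=3))      # (x1, m1, x2)
P = np.zeros((2, 2, 2))
for x1, m1, x2 in states:
    P[x1, m1, x2] = p_x1[x1] * p_m1_x1[x1, m1] * p_x2_x1m1[x1, m1, x2]

p_x2, p_m1 = P.sum(axis=(0, 1)), P.sum(axis=(0, 2))
p_x1m1, p_m1x2 = P.sum(axis=2), P.sum(axis=0)

def sigma(x1, m1, x2):
    """Eq. (6.7) with B_2 = {m1}; p_B(x1 | x2, m1) is assumed to be the forward rule applied to x2."""
    ds_bath = np.log(p_x2_x1m1[x1, m1, x2] / p_x2_x1m1[x2, m1, x1])
    return np.log(p_x1[x1]) - np.log(p_x2[x2]) + ds_bath

def theta(x1, m1, x2):
    """Eq. (6.25) for Example 2: i(x2 : m1) - i(x1 : m1), since i_ini = 0 here."""
    return (np.log(p_m1x2[m1, x2] / (p_x2[x2] * p_m1[m1]))
            - np.log(p_x1m1[x1, m1] / (p_x1[x1] * p_m1[m1])))

print("<exp(-sigma + Theta)> =",
      sum(P[v] * np.exp(-sigma(*v) + theta(*v)) for v in states))   # = 1, Eq. (6.33)
print("<sigma> =", sum(P[v] * sigma(*v) for v in states),
      " <Theta> =", sum(P[v] * theta(*v) for v in states))          # Eq. (6.34)
```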
Fig. 6.5 Examples of the generalized second law on causal networks. Example 3 Repeated feedback control with multiple measurements. Example 4 Coupled Langevin equations

Example 3: Repeated Feedback Control with Multiple Measurements

We consider the causal network corresponding to the system under feedback control with multiple measurements: \(\mathcal {V} := \{ x_1, m_1, x_2, m_2, \dots , x_N, m_N\}\), \(\mathrm{pa} (x_k) = \{ x_{k-1}, m_{k-1}, \dots , m_1\}\) with \(k\ge 2\), \(\mathrm{pa} (m_l) = \{ x_{l} \}\), and \(\mathrm{pa} (x_1) = \emptyset \) (see also Fig. 6.5). We here set \(X= \{ x_1, \dots , x_N\}\), \(\mathcal {C} =\{ m_1, \dots , m_N \}\), and \(\mathcal {C}' =\{ m_1, \dots , m_{N-1} \}\). We have \(i_\mathrm{fin} =i (x_N: \{m_1, \dots , m_{N-1} \} )\), \(i_\mathrm{ini} = 0\) and \(i_\mathrm{tr}^l =i (x_l: m_l | m_{l-1}, \dots , m_1)\). From the generalized integral fluctuation theorem Eq. (6.29) and the generalized second law Eq. (6.30), we have the following relations:

$$\begin{aligned} \left\langle \exp \left[ -\sigma + i(x_N: \{m_1, \dots , m_{N-1} \}) - \sum _{l=1}^{N-1} i(x_l: m_l | m_{l-1}, \dots , m_1)\right] \right\rangle =1,\end{aligned}$$
(6.35)
$$\begin{aligned} \langle \sigma \rangle \ge I(x_N: \{m_1, \dots , m_{N-1} \}) - \sum _{l=1}^{N-1} I(x_l: m_l | m_{l-1}, \dots , m_1). \end{aligned}$$
(6.36)

On the other hand, Horowitz and Vaikuntanathan [3] derived the following information-thermodynamic equality for repeated feedback control:

$$\begin{aligned} \left\langle \exp \left[ - \beta W_{d}- \sum _{l=1}^{N-1} i(x_l: m_l | m_{l-1}, \dots , m_1)\right] \right\rangle =1, \end{aligned}$$
(6.37)

where \(\beta \) is the inverse temperature of the heat bath, and \(W_d\) is the dissipated work defined as \(\beta W_d := \sum _{k=1}^{N-1} \Delta s_\mathrm{bath}^k + \ln p_\mathrm{eq} (x_1) - \ln p_\mathrm{eq}(x_N| m_1, \dots , m_{N-1})\) [\(p_\mathrm{eq}\) indicates the equilibrium distribution]. If the initial and final states of the system X are in thermal equilibrium, \(\beta W_d\) is equivalent to \(\sigma - i_\mathrm{fin}\) such that

$$\begin{aligned} \beta W_d&:= \sum _{k=1}^{N-1} \Delta s_\mathrm{bath}^k + \ln p_\mathrm{eq} (x_1) - \ln p_\mathrm{eq}(x_N| m_1, \dots , m_{N-1}) \nonumber \\&= \sum _{k=1}^{N-1} \Delta s_\mathrm{bath}^k + \ln p_\mathrm{eq} (x_1) - \ln p_\mathrm{eq} (x_N) - i (x_N : \{ m_1, \dots , m_{N-1} \}) \nonumber \\&=\sigma - i_\mathrm{fin}, \end{aligned}$$
(6.38)

where we use a thermal equilibrium condition, i.e., \(i (x_N : \{ m_1, \dots , m_{N-1} \} ) = \ln p_\mathrm{eq}(x_N| m_1, \dots , m_{N-1}) -\ln p_\mathrm{eq} (x_N)\). Thus our general results Eqs. (6.29) and (6.30) can reproduce the known result for the system under feedback control with multiple measurements.

Example 4: Coupled Langevin Equations

We consider the causal network corresponding to the coupled Langevin equations Eq. (6.10): \(\mathcal {V} := \{ x_t, y_t, x_{t+dt}, y_{t+dt} \}\), \(\mathrm{pa} (x_{t+dt}) = \{ x_{t}, y_{t}\}\), \(\mathrm{pa} (y_{t+dt}) = \{ x_{t}, y_{t}\}\), \(\mathrm{pa} (x_{t}) = \emptyset \), and \(\mathrm{pa} (y_{t}) = \{ x_{t}\}\) (see also Fig. 6.5). We here set \(X= \{ x_1 := x_t, x_2 := x_{t+dt}\}\), and \(\mathcal {C}' = \mathcal {C} =\{ c_1 := y_t, c_2 := y_{t+dt} \}\). We have \(i_\mathrm{fin} =i (x_{t+dt}: \{y_t, y_{t+dt} \} )\), \(i_\mathrm{ini} = 0\), \(i_\mathrm{tr}^1 =i (x_t: y_t)\), and \(i_\mathrm{tr}^2 =i (x_t: y_{t+dt}| y_{t})\). The informational quantity \(\Theta \) is calculated as

$$\begin{aligned} \Theta&= i (x_{t+dt}: \{y_t, y_{t+dt} \} ) - i (x_t: y_{t+dt}| y_{t}) -i (x_t: y_t) \nonumber \\&= i (x_{t+dt}: \{y_t, y_{t+dt} \} ) - i (x_t: \{ y_t, y_{t+dt} \} ) \nonumber \\&= i (x_{t+dt}: y_{t+dt} ) - i(x_t:y_t) + i (x_{t+dt}: y_{t}| y_{t+dt}) - i (x_t: y_{t+dt}| y_{t}). \end{aligned}$$
(6.39)

From the generalized integral fluctuation theorem Eq. (6.29) and the generalized second law Eq. (6.30), we have the following relations:

$$\begin{aligned} \left\langle \exp \left[ -\sigma + i (x_{t+dt}: \{y_t, y_{t+dt} \} ) - i (x_t: \{ y_t, y_{t+dt} \} )\right] \right\rangle =1,\end{aligned}$$
(6.40)
$$\begin{aligned} \langle \sigma \rangle \ge I(x_{t+dt}: \{y_t, y_{t+dt} \} ) - I (x_t: \{ y_t, y_{t+dt} \} ), \end{aligned}$$
(6.41)

or equivalently,

$$\begin{aligned} \left\langle \exp \left[ -\sigma + i (x_{t+dt}: y_{t+dt} ) - i(x_t:y_t) + i (x_{t+dt}: y_{t}| y_{t+dt}) - i (x_t: y_{t+dt}| y_{t})\right] \right\rangle =1, \end{aligned}$$
(6.42)
$$\begin{aligned} \langle \sigma \rangle \ge I (x_{t+dt}: y_{t+dt} ) - I(x_t:y_t) + I (x_{t+dt}: y_{t}| y_{t+dt}) - I (x_t: y_{t+dt}| y_{t}). \end{aligned}$$
(6.43)

Equation (6.14) gives the entropy production \(\sigma \) as

$$\begin{aligned} \sigma&=- \frac{j^x(t) dt}{T^x} + \ln p(x_{t}) - \ln p(x_{t+dt}) \end{aligned}$$
(6.44)
$$\begin{aligned} j^x(t)&:= (\xi ^x(t) - \dot{x} (t)) \circ \dot{x} (t). \end{aligned}$$
(6.45)

The generalized second law Eq. (6.30) can be rewritten as

$$\begin{aligned} - \frac{\langle j^x(t) \rangle dt}{T^x} +d S_{x|y} (t)&\ge I (x_{t+dt}: y_{t}| y_{t+dt}) - I (x_t: y_{t+dt}| y_{t}), \end{aligned}$$
(6.46)

where \(d S_{x|y} (t) := \langle \ln p(x_{t}|y_t) -\ln p(x_{t+dt}|y_{t+dt}) \rangle \) is the Shannon entropy difference of the system X under the condition of the system Y. The equality holds if and only if the local reversibility

$$\begin{aligned}&p(x_{t+dt} |x_t , y_t) p(y_{t+dt}|x_{t}, y_{t})p(x_{t}, y_{t})\nonumber \\&\quad = p_B( x_{t} |x_{t+dt}, y_{t+dt}) p(y_{t}|x_{t+dt}, y_{t+dt})p(x_{t+dt}, y_{t+dt}) \end{aligned}$$
(6.47)

holds.

In a stationary state, we have \(p(x_{t+dt}, y_{t+dt} ) =p(x_{t}, y_{t}) \), and the Shannon entropy difference vanishes, i.e., \(d S_{x|y} (t) =0\). Even in a stationary state, the transfer entropy from X to Y, \(I (x_t: y_{t+dt}| y_{t})\), and the term \(I (x_{t+dt}: y_{t}| y_{t+dt})\) still remain. We here call \(I (x_{t+dt}: y_{t}| y_{t+dt})\) the “backward transfer entropy”, i.e., the conditional mutual information conditioned on the future variable. From the nonnegativity of the conditional mutual information, the transfer entropy \(I (x_t: y_{t+dt}| y_{t})\) gives an upper bound of the stationary entropy reduction in the target system X, and the backward transfer entropy \(I (x_{t+dt}: y_{t}| y_{t+dt})\) gives a lower bound of the stationary dissipation in the target system X. Thus, for the coupled dynamics, the information flow from the target system to the outside world, defined by the transfer entropy and the backward transfer entropy, gives a bound on the stationary heat flow \(\langle j^x(t) \rangle \) in the target system.
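The stationary version of Eq. (6.46) (with \(dS_{x|y}=0\)) can be illustrated numerically for linear coupled Langevin equations, for which all variables are jointly Gaussian and the (conditional) mutual informations follow from covariance determinants. The sketch below uses hypothetical drift coefficients, temperatures, and time step; the printed numbers are statistical estimates and carry sampling errors.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, c, k = 2.0, 1.0, 1.0, 0.5                   # dx/dt = -a x + b y + xi_x,  dy/dt = -c y + k x + xi_y
Tx, Ty, dt, M = 1.0, 2.0, 0.01, 200_000           # hypothetical temperatures, time step, ensemble size
f_x = lambda x, y: -a * x + b * y
f_y = lambda x, y: -c * y + k * x

# relax an ensemble of M trajectories to the stationary state
x, y = rng.normal(0, 1, M), rng.normal(0, 1, M)
for _ in range(1000):
    x, y = (x + f_x(x, y) * dt + rng.normal(0, np.sqrt(2 * Tx * dt), M),
            y + f_y(x, y) * dt + rng.normal(0, np.sqrt(2 * Ty * dt), M))

# one further step, keeping (x_t, y_t, x_{t+dt}, y_{t+dt})
xn = x + f_x(x, y) * dt + rng.normal(0, np.sqrt(2 * Tx * dt), M)
yn = y + f_y(x, y) * dt + rng.normal(0, np.sqrt(2 * Ty * dt), M)

# bath entropy production of X over one step, Eq. (6.14) (Stratonovich midpoint)
ds_bath = (f_x(x, y) + f_x(xn, yn)) / (2 * Tx) * (xn - x)

def gauss_cmi(A, B, C):
    """I(A : B | C) for jointly Gaussian samples, from log-determinants of sample covariances."""
    def logdet(cols):
        cov = np.atleast_2d(np.cov(np.vstack(cols)))
        return np.linalg.slogdet(cov)[1]
    return 0.5 * (logdet([A, C]) + logdet([B, C]) - logdet([A, B, C]) - logdet([C]))

I_tr = gauss_cmi(x, yn, y)                        # transfer entropy I(x_t : y_{t+dt} | y_t)
I_bwd = gauss_cmi(xn, y, yn)                      # backward transfer entropy I(x_{t+dt} : y_t | y_{t+dt})

print("<Delta s_bath> =", ds_bath.mean())         # stationary bound: <Delta s_bath> >= I_bwd - I_tr
print("I_bwd - I_tr   =", I_bwd - I_tr)
```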

Fig. 6.6 Examples of the generalized second law on causal networks. Example 5 Coupled dynamics with a time delay. Example 6 Complex dynamics

Example 5: Coupled Dynamics with a Time Delay

We here consider the causal network given in Fig. 6.6: \(\mathcal {V} := \{ y_{t - \Delta \tau }, x_t, y_t, x_{t+dt}, y_{t+dt} \}\), \(\mathrm{pa} (x_{t+dt}) = \{ x_t, y_{t - \Delta \tau }\}\), \(\mathrm{pa} (y_{t+dt}) = \{ x_{t}, y_{t}\}\), \(\mathrm{pa} (x_{t}) = \{y_{t- \Delta \tau } \}\), \(\mathrm{pa} (y_{t}) = \{y_{t- \Delta \tau } , x_{t}\}\) and \(\mathrm{pa} (y_{t- \Delta \tau }) = \emptyset \). We set \(X = \{ x_1:= x_t, x_2:= x_{t+dt}\} \), and \(\mathcal {C} =\mathcal {C}' = \{ c_1:= y_{t-\Delta \tau }, c_2:= y_{t}, c_3 := y_{t+dt} \}\). We have \(i_\mathrm{ini} = i(x_t: y_{t- \Delta \tau })\), \(i_\mathrm{fin} = i(x_{t+dt}: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \}) \), \(i_\mathrm{tr}^1 =0\), \(i_\mathrm{tr}^2 = i(x_t: y_t| y_{t-\Delta \tau })\), and \(i_\mathrm{tr}^3 = i(x_t: y_{t+dt}| y_t, y_{t-\Delta \tau } )\). In this case, the informational quantity \(\Theta \) is calculated as

$$\begin{aligned} \Theta&= i(x_{t+dt}: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \})- i(x_t: y_{t- \Delta \tau }) -i(x_t: y_t| y_{t-\Delta \tau }) - i(x_t: y_{t+dt}| y_t, y_{t-\Delta \tau } ) \nonumber \\&= i(x_{t+dt}: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \}) - i (x_t: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \} ). \end{aligned}$$
(6.48)

From the generalized integral fluctuation theorem Eq. (6.29) and the generalized second law Eq. (6.30), we have the following relations:

$$\begin{aligned} \left\langle \exp \left[ -\sigma + i(x_{t+dt}: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \}) - i (x_t: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \} ) \right] \right\rangle =1, \end{aligned}$$
(6.49)
$$\begin{aligned} \langle \sigma \rangle \ge&I(x_{t+dt}: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \}) - I (x_t: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \} ) \nonumber \\ =&I(x_{t+dt}: \{ y_t, y_{t+dt} \}) - I (x_t: \{y_t, y_{t+dt} \} ) \nonumber \\&+I(x_{t+dt} : y_{t- \Delta \tau } | y_t, y_{t+dt} ) - I(x_{t} : y_{t- \Delta \tau } | y_t, y_{t+dt} ). \end{aligned}$$
(6.50)

The crucial difference between this model and the coupled Langevin equations [Example 4] is the dependence on the time-delayed variable \(y_{t-\Delta \tau }\). In the case of time-delayed dynamics, the mutual information difference \(I(x_{t+dt}: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \}) - I (x_t: \{ y_{t- \Delta \tau }, y_t, y_{t+dt} \} )\), which gives a bound on the entropy production, includes the variable \(y_{t-\Delta \tau }\). Equation (6.50) expresses the effect of the time delay on the entropy production in X as the difference \(I(x_{t+dt} : y_{t- \Delta \tau } | y_t, y_{t+dt} ) - I(x_{t} : y_{t- \Delta \tau } | y_t, y_{t+dt} )\).

Example 6: Complex Dynamics

We here consider the causal network corresponding to complex dynamics given in Fig. 6.6: \(\mathcal {V} := \{ y_1, x_1, z_1, x_2, z_2, y_2, x_3, z_3 \}\), \(\mathrm{pa} (y_1) = \emptyset \), \(\mathrm{pa} (x_1) = y_1\), \(\mathrm{pa} (z_1) =y_1\), \(\mathrm{pa} (x_2) = \{x_1, z_1 \}\), \(\mathrm{pa} (z_2) = \{ x_1, z_1 \}\), \(\mathrm{pa} (y_2) = \{y_1, x_2, z_2\}\), \(\mathrm{pa} (x_3) = \{ x_2, y_2\}\) and \(\mathrm{pa} (z_3) = \{ z_2, x_2\}\). We set \(X = \{ x_1, x_2, x_3\} \), \(\mathcal {C} = \{ c_1:= y_1, c_2:= z_1, c_3 := z_2, c_4 :=y_2, c_5 :=z_3 \}\), and \(\mathcal {C}' = \{ y_1, z_1, z_2, y_2 \}\). We have \(i_\mathrm{ini} = i(x_1: y_1)\), \(i_\mathrm{fin} = i(x_3: \{ y_1, z_1, z_2, y_2 \}) \), \(i_\mathrm{tr}^1 =0\), \(i_\mathrm{tr}^2 = 0\), \(i_\mathrm{tr}^3 = i(x_1: z_2| y_1, z_1)\), and \(i_\mathrm{tr}^4 = i(x_2: y_2| y_1, z_1,z_2)\). From the generalized integral fluctuation theorem Eq. (6.29) and the generalized second law Eq. (6.30), we have the following relations:

$$\begin{aligned} \langle \exp (-\sigma +\Theta ) \rangle =1, \end{aligned}$$
(6.51)
$$\begin{aligned}&\Theta =i(x_3: \{ y_1, z_1, z_2, y_2 \}) - i(x_1: y_1)- i(x_1: z_2| y_1, z_1) -i(x_2: y_2| y_1, z_1,z_2), \end{aligned}$$
(6.52)
$$\begin{aligned}&\quad \langle \sigma \rangle \ge I(x_3: \{ y_1, z_1, z_2, y_2 \}) - I(x_1: y_1)- I(x_1: z_2| y_1, z_1) -I(x_2: y_2| y_1, z_1,z_2). \end{aligned}$$
(6.53)

6.2.3 Coupled Chemical Reaction Model with Time-Delayed Feedback Loop

We here discuss an application of our general result to coupled chemical reaction systems with a time-delayed feedback loop. The model is characterized by a feedback loop between two systems: output system O and memory system M. We assume that each of O and M has a binary state described by 0 or 1. The model is driven by the following master equation:

$$\begin{aligned} \frac{d}{dt} p_0^X (t)&= - \omega _{0,1}^X (t) p_0^X (t) + \omega ^X_{1, 0} (t) p_1^X (t),\end{aligned}$$
(6.54)
$$\begin{aligned} \frac{d}{dt} p_1^X (t)&= - \omega ^X_{1, 0} (t) p_1^X (t) +\omega _{0,1}^X (t) p_0^X (t). \end{aligned}$$
(6.55)

where \(p_0^X (t)\) (\(p_1^X (t)\)) is the probability of the state 0 (1) of the system \(X=O, M\) at time t. The normalization of the probability is satisfied, i.e., \(p_0^X (t) + p_1^X (t) =1\). The transition rate of the chemical reaction from state \(i'\) to state \(j'\), \(\omega _{i', j'}^X\) (\(i', j' =0,1\)), is given by

$$\begin{aligned} \omega _{i', j'}^X = \frac{1}{\tau ^X} \exp [-\beta ^X (D_{i'j'}^X - F_{i'}^X (t) )], \end{aligned}$$
(6.56)

where \(\tau ^X\) is a time constant of the system X, \(\beta ^X\) is the inverse temperature of a heat bath coupled to the system X, \(F_{i'}^X(t)\) is the effective free energy of the state \(i'\) at time t, and \(D^X_{i' j'}\) is the barrier of X between states \(i'\) and \(j'\) that satisfies \(D^X_{i' j'} = D^X_{j' i'} \). This transition rate is well-established in chemical reaction models [2].

Fig. 6.7 Schematic of the coupled chemical reaction model with a time-delayed feedback loop. The previous states of O and M determine the effective free energy landscapes \(F^O\) and \(F^M\). The blue directed arrow indicates the effect of the time-delayed feedback loop. This time-delayed effect is introduced by the \(m_1\)-dependence of \(F^O\)

Fig. 6.8 The causal network corresponding to the coupled chemical reaction model with a time-delayed feedback loop

Here we consider the feedback loop between O and M (see also Fig. 6.7). We introduce the random variables \((o_1,o_2, m_1, m_2)\), where \(o_1\) is the state of O at time t, \(o_2\) is the state of O at time \(t+\Delta t\), \(m_1\) is the state of M at time \(t- \Delta t'\), and \(m_2\) is the state of M at time \(t + \Delta t- \Delta t'\) with \( \Delta t > \Delta t'\). The feedback loop between O and M is described by the dependence of the effective free energies \(F_{\mu }^X (t)\) on \(o_k\) and \(m_k\). From time t to \(t+\Delta t\), the effective free energy \(F^O_{\mu } (t)\) depends on \(m_1\) and \(m_2\), where the \(m_1\)-dependence indicates the effect of a time-delayed feedback control. \(F_{i'}^O (m_1, m_2)\) denotes the effective free energy of the state \(i'\) in O under the condition of \((m_1, m_2)\). From time \(t- \Delta t'\) to \(t + \Delta t- \Delta t'\), the effective free energy \(F^M_{\mu } (t)\) depends on \(o_1\). \(F_{i'}^M (o_1)\) denotes the effective free energy of the state \(i'\) in M under the condition of \(o_1\). The joint probability distribution of this model is given by

$$\begin{aligned} p(m_1, o_1, m_2, o_2) = p(m_1, o_1) p(m_2| o_1, m_1) p(o_2|o_1, m_1, m_2). \end{aligned}$$
(6.57)

The chain rule \(p(m_1, o_1) = p(m_1) p(o_1|m_1)\) gives the causal network corresponding to this model as \(\mathcal {V} = \{ m_1, o_1, m_2, o_2\}\), \(\mathrm{pa} (o_2) = \{ o_1, m_1, m_2\}\), \(\mathrm{pa} (m_2) = \{ o_1, m_1\}\), \(\mathrm{pa} (o_1) = \{ m_1\}\), and \(\mathrm{pa} (m_1) = \emptyset \) (see also Fig. 6.8).

Information Thermodynamics in the Memory System \({\varvec{M}}\)

We first treat the memory system M as the target system X. If we set \(X = \{x_1 := m_1, x_2 := m_2 \}\), \(\mathcal {C} = \{c_1:= o_1, c_2 := o_2 \} \), and \(\mathcal {C}' =\{o_1\}\), the entropy change \(\Delta s_\mathrm{bath}^{k=1}\) in a heat bath attached to the system M is given by

$$\begin{aligned} \Delta s_\mathrm{bath}^{k=1} = \ln \frac{p(m_2|m_1, o_1)}{p_B (m_1|m_2,o_1)}, \end{aligned}$$
(6.58)

where we used \(\mathcal {B}_2= \mathrm{pa} (m_2) \setminus \{m_1\} =\{ o_1\}\), and the backward probability is defined as \(p_B (m_1 = i'| m_2=j', o_1) := p(m_2=i'| m_1=j', o_1 )\). From time \(t -\Delta t'\) to \(t+\Delta t -\Delta t'\), the master equation of the system M can be rewritten as

$$\begin{aligned} \frac{d}{dt} p_0^M (t)&= - [\omega _{0,1}^M(o_1) +\omega ^M_{1, 0}(o_1) ] p_0^M (t) +\omega ^M_{1, 0}(o_1) , \end{aligned}$$
(6.59)
$$\begin{aligned} \omega ^M_{i', j'}(o_1)&= \frac{1}{\tau ^M} \exp [-\beta ^M (D_{i'j'}^M - F_{i'}^M (o_1) )], \end{aligned}$$
(6.60)

and we get the solution of Eq. (6.59) as

$$\begin{aligned} p^M_0 (t+\Delta t -\Delta t')&= p_{0, \mathrm{eq}}^M (o_1) + (p_{0}^M ( t-\Delta t') - p_{0, \mathrm{eq}}^M(o_1)) \exp [- \omega ^M (o_1)\Delta t], \end{aligned}$$
(6.61)
$$\begin{aligned} p^M_1 (t+\Delta t -\Delta t')&= 1- p^M_0 (t+\Delta t -\Delta t'), \end{aligned}$$
(6.62)

where \( \omega ^M (o_1):= \omega ^M_{0,1}(o_1) +\omega ^M_{1,0}(o_1) \), and \( p^M_{0, \mathrm{eq}}(o_1) \) is an equilibrium distribution of the state 0 in M under the condition of \(o_1\) defined as

$$\begin{aligned} p^M_{0, \mathrm{eq}}(o_1) := \frac{\exp [- \beta ^M F_0^M (o_1) ]}{\exp [- \beta ^M F_0^M (o_1)] + \exp [- \beta ^M F_1^M (o_1)] }. \end{aligned}$$
(6.63)

Substituting \(p_{0}^M (t-\Delta t')=1, 0\) (corresponding to \(m_1 = 0, 1\)) into the solutions Eqs. (6.61) and (6.62), we have the conditional probability \(p(m_2|m_1, o_1)\):

$$\begin{aligned} p(m_{2}=0|m_{1}=0, o_{1})&= p_{0, \mathrm{eq}}^M (o_1) + (1 - p_{0, \mathrm{eq}}^M(o_1)) \exp [- \omega ^M (o_1)\Delta t], \end{aligned}$$
(6.64)
$$\begin{aligned} p(m_{2}=0|m_{1}=1,o_{1})&= p_{0, \mathrm{eq}}^M (o_1) - p_{0, \mathrm{eq}}^M(o_1) \exp [- \omega ^M (o_1)\Delta t],\end{aligned}$$
(6.65)
$$\begin{aligned} p(m_{2}=1|m_{1}=i',o_{1})&=1- p(m_{2}=0|m_{1}=i' ,o_1). \end{aligned}$$
(6.66)

From Eqs. (6.64), (6.65) and (6.66), we have

$$\begin{aligned} \Delta s_\mathrm{bath}^{k=1}&=\ln \frac{p(m_{2}|m_{1},o_{1}) }{p_B(m_{1}|m_{2},o_{1}) } \nonumber \\&= \left\{ \begin{array}{ll} 0 &{}\quad (m_{1} = m_{2} )\\ \ln [1- p_{0, \mathrm{eq}}^M (o_1)] -\ln p_{0, \mathrm{eq}}^M (o_1) &{}\quad (m_{1}=0, m_{2}=1 )\\ \ln p_{0, \mathrm{eq}}^M (o_1) -\ln [1-p_{0, \mathrm{eq}}^M (o_1)] &{}\quad (m_{1}=1, m_{2}=0 )\\ \end{array} \right. \nonumber \\&= -\beta ^M \Delta F^M, \end{aligned}$$
(6.67)

where \(\Delta F^M\) is the effective free energy difference defined as \(\Delta F^M := F_{m_2}^M (o_1)- F_{m_1}^M (o_1)\). The entropy change in the heat bath is thus given by the effective free energy difference in the memory system M.
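The identity (6.67) can be checked directly from the solution (6.61)–(6.66): the factors \(1-\exp[-\omega^M(o_1)\Delta t]\) cancel in the ratio, so \(\Delta s_\mathrm{bath}^{k=1}\) reduces to \(-\beta^M \Delta F^M\) for any \(\Delta t\). A minimal sketch with hypothetical free energies and parameters:

```python
import numpy as np
import itertools

beta_M, omega_dt = 0.5, 0.7                        # hypothetical beta^M and omega^M(o1) * Delta t
F_M = {0: (1.0, 0.4), 1: (0.2, 1.3)}               # hypothetical F^M_{i'}(o1):  F_M[o1][i']

def p_eq0(o1):                                     # equilibrium distribution, Eq. (6.63)
    w0, w1 = np.exp(-beta_M * F_M[o1][0]), np.exp(-beta_M * F_M[o1][1])
    return w0 / (w0 + w1)

def p_m2(m2, m1, o1):                              # conditional probabilities, Eqs. (6.64)-(6.66)
    peq, decay = p_eq0(o1), np.exp(-omega_dt)      # the value of omega_dt drops out of the ratio below
    p0 = peq + (1 - peq) * decay if m1 == 0 else peq - peq * decay
    return p0 if m2 == 0 else 1 - p0

# Delta s_bath from the ratio of forward and backward probabilities equals -beta^M Delta F^M,
# Eq. (6.67); the backward probability is p_B(m1 | m2, o1) := p(m2 = m1-value | m1 = m2-value, o1).
for m1, m2, o1 in itertools.product((0, 1), repeat=3):
    ds_bath = np.log(p_m2(m2, m1, o1) / p_m2(m1, m2, o1))
    print((m1, m2, o1), round(ds_bath, 10),
          round(-beta_M * (F_M[o1][m2] - F_M[o1][m1]), 10))   # the two columns agree
```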

Fig. 6.9 The generalized second law in M on the causal network corresponding to the coupled chemical reaction model with a time-delayed feedback loop

On the causal network corresponding to this model, we have \(i_\mathrm{fin} = i(m_2: o_1)\), \(i_\mathrm{ini} =0\), and \(i_\mathrm{tr}^1 = i ( m_1:o_1 )\) (see also Fig. 6.9). The informational quantity \(\Theta \) is calculated as

$$\begin{aligned} \Theta&= i(m_2: o_1) - i(m_1:o_1) \nonumber \\&=\ln p(m_1) - \ln p(m_2) + \ln p( m_2, o_1) - \ln p(m_1, o_1). \end{aligned}$$
(6.68)

From the generalized second law Eq. (6.30), we have the following relation

$$\begin{aligned} \langle \sigma \rangle \ge I(m_2: o_1) -I(m_1:o_1), \end{aligned}$$
(6.69)

or equivalently

$$\begin{aligned} - \langle \beta ^M \Delta F^M \rangle \ge \langle \ln p( m_2, o_1) \rangle - \langle \ln p( m_1, o_1) \rangle . \end{aligned}$$
(6.70)

This result is equivalent to the Sagawa–Ueda relation, which is valid for a system under feedback control. A bound on the effective free energy difference \(\langle \Delta F^M \rangle \) is given by the two-body Shannon entropy difference.

Information Thermodynamics in the Output System \({\varvec{O}}\)

We next treat the output system O as the target system X. If we set \(X = \{x_1 := o_1, x_2 := o_2 \}\) and \(\mathcal {C} =\mathcal {C}' = \{c_1:= m_1, c_2 := m_2 \}\), the entropy change \(\Delta s_\mathrm{bath}^{k=1}\) in a heat bath attached to the system O is given by

$$\begin{aligned} \Delta s_\mathrm{bath}^{k=1} = \ln \frac{p(o_2|o_1, m_1, m_2)}{p_B (o_1|o_2, m_1, m_2)}, \end{aligned}$$
(6.71)

where we used \(\mathcal {B}_2= \mathrm{pa} (x_2) \setminus \{ x_1\} =\{ m_1, m_2\}\), and the backward probability is defined as \(p_B (o_1 = i'| o_2=j', m_1, m_2) := p(o_2=i'| o_1=j', m_1, m_2 )\). To obtain the analytical expression of \(\Delta s_\mathrm{bath}^{k=1}\), we here calculate the conditional probability \(p(o_2|o_1, m_1, m_2)\). From time t to \(t+\Delta t\), the master equation of the system O can be rewritten as

$$\begin{aligned} \frac{d}{dt} p_0^O (t)&= - [\omega _{0,1}^O(m_1, m_2) +\omega ^O_{1, 0}(m_1, m_2) ] p_0^O (t) + \omega ^O_{1, 0} (m_1, m_2) , \end{aligned}$$
(6.72)
$$\begin{aligned} \omega ^O_{i', j'}(m_1, m_2)&= \frac{1}{\tau ^O} \exp [-\beta ^O (D_{i'j'}^O - F_{i'}^O (m_1, m_2) )], \end{aligned}$$
(6.73)

and we get the solution of Eq. (6.72) as

$$\begin{aligned} p^O_0 (t+\Delta t)&= p_{0, \mathrm{eq}}^O (m_1, m_2) + (p_{0}^O (t) - p_{0, \mathrm{eq}}^O(m_1, m_2)) \exp [- \omega ^O (m_1, m_2)\Delta t], \end{aligned}$$
(6.74)
$$\begin{aligned} p^O_1 (t+\Delta t)&= 1- p^O_0 (t+\Delta t), \end{aligned}$$
(6.75)

where \( \omega ^O (m_1, m_2):= \omega ^O_{0,1}(m_1, m_2) +\omega ^O_{1,0}(m_1, m_2) \), and \( p^O_{0, \mathrm{eq}}(m_1, m_2) \) is an equilibrium distribution of the state 0 in O under the condition of \((m_1, m_2)\) defined as

$$\begin{aligned} p^O_{0, \mathrm{eq}}(m_1, m_2) := \frac{\exp [- \beta ^O F_0^O (m_1, m_2) ]}{\exp [- \beta ^O F_0^O (m_1, m_2)] + \exp [- \beta ^O F_1^O (m_1, m_2)] }. \end{aligned}$$
(6.76)
Fig. 6.10 The generalized second law in O on the causal network corresponding to the coupled chemical reaction model with a time-delayed feedback loop

Substituting \(p_{0}^O (t)=1, 0\) (corresponding to \(o_1 = 0, 1\)) into the solutions Eqs. (6.74) and (6.75), we have the conditional probability \(p(o_2|o_1, m_1, m_2)\):

$$\begin{aligned} p(o_{2}=0|o_{1}=0,m_{1} ,m_{2})&= p_{0, \mathrm{eq}}^O (m_1, m_2)+ \left( 1 - p_{0, \mathrm{eq}}^O (m_1, m_2)\right) \exp \left[ -\omega ^O (m_1, m_2)\Delta t \right] \end{aligned}$$
(6.77)
$$\begin{aligned} p(o_{2}=0|o_{1}=1,m_{1} ,m_{2})&= p_{0, \mathrm{eq}}^O (m_1, m_2)- p_{0, \mathrm{eq}}^O (m_1, m_2) \exp \left[ -\omega ^O (m_1, m_2)\Delta t \right] \end{aligned}$$
(6.78)
$$\begin{aligned} p(o_{2}=1|o_{1}=i',m_{1} ,m_{2})&=1- p(o_{2}=0|o_{1}=i' ,m_{1},m_{2}). \end{aligned}$$
(6.79)

From Eqs. (6.77)–(6.79), we have

$$\begin{aligned} \Delta s_\mathrm{bath}^{k=1}&=\ln \frac{p(o_{2}|o_{1},m_{1}, m_{2}) }{p_B(o_{1}|o_{2},m_{1}, m_{2}) } \nonumber \\&= \left\{ \begin{array}{ll} 0 &{} (o_{1} = o_{2} )\\ \ln [1- p_{0, \mathrm{eq}}^O (m_1, m_2)] -\ln p_{0, \mathrm{eq}}^O (m_1, m_2) &{}(o_{1}=0, o_{2}=1 )\\ \ln p_{0, \mathrm{eq}}^O (m_1, m_2) -\ln [1-p_{0, \mathrm{eq}}^O (m_1, m_2)] &{} (o_{1}=1, o_{2}=0 )\\ \end{array} \right. \nonumber \\&= -\beta ^O \Delta F^O, \end{aligned}$$
(6.80)

where \(\Delta F^O\) is the effective free energy difference defined as \(\Delta F^O := F_{o_2}^O (m_1, m_2)- F_{o_1}^O (m_1, m_2)\). The entropy change in a heat bath gives the effective free energy difference in the output system O.

Fig. 6.11 Numerical illustration of the nonnegativity of \(\langle \sigma \rangle - \langle \Theta \rangle =- \langle \beta ^O \Delta F^O \rangle + \langle \ln p( o_1, m_1, m_2) \rangle - \langle \ln p( o_2, m_1, m_2) \rangle \). We here assume that \(o_1\) and \(m_1\) are independent, i.e., \(p(o_1, m_1) = p(o_1) p( m_1)\). We set the parameters as follows: \(\Delta t =0.5\), \(\beta ^O = \beta ^M = 0.01\), \(\tau ^O = \tau ^M =0.001\), \(D^O_{01} = D^M_{01} = 100\), \(F^M_{0} (o_1=0) = F^M_{0} (o_1=1) =100\), \(F^M_{1} (o_1=0) = 10\), \( F^M_{1} (o_1=1) =30,\) \(F^O_{0} (m_1=0, m_2=0) = F^O_{0} (m_1=1, m_2=0) =F^O_{0} (m_1=0, m_2=1) =F^O_{0} (m_1=1, m_2=1) =100\), \(F^O_{1} (m_1=0, m_2=0) = 30\), \(F^O_{1} (m_1=1, m_2=0) = 10\), \(F^O_{1} (m_1=0, m_2=1) = 20\), and \(F^O_{1} (m_1=1, m_2=1) = 5\)

On the causal network corresponding to this model, we have \(i_\mathrm{fin} = i(o_2: \{ m_1, m_2\})\), \(i_\mathrm{ini} = i (o_1: m_1)\), \(i_\mathrm{tr}^1 =0\), and \(i_\mathrm{tr}^2 = i (o_1: m_2| m_1 )\) (see also Fig. 6.10). The informational quantity \(\Theta \) is calculated as

$$\begin{aligned} \Theta&= i(o_2: \{ m_1, m_2\})-i (o_1: m_1) - i (o_1: m_2| m_1 )\nonumber \\&=i(o_2: \{ m_1, m_2\}) - i(o_1: \{ m_1, m_2\})\nonumber \\&=\ln p(o_1) - \ln p(o_2) + \ln p( o_2, m_1, m_2) - \ln p( o_1, m_1, m_2). \end{aligned}$$
(6.81)

From the generalized second law Eq. (6.30), we have the following relation

$$\begin{aligned} - \langle \beta ^O \Delta F^O \rangle \ge \langle \ln p( o_2, m_1, m_2) \rangle - \langle \ln p( o_1, m_1, m_2) \rangle . \end{aligned}$$
(6.82)

The right-hand side of Eq. (6.82) is the change in the three-body Shannon entropy, not in the two-body Shannon entropy. This three-body Shannon entropy includes the states \(m_1\) and \(m_2\) at different times. This is a crucial difference between conventional thermodynamics and our general result. Our general result is applicable to non-Markovian dynamics, such as the time-delayed feedback loop, where the conventional second law is not valid. In our general result, the Shannon entropy that includes states at different times plays an important role in the generalized second law for non-Markovian dynamics.

Here we numerically illustrate the validity of Eq. (6.82) in Fig. 6.11. In this model, the equilibrium distribution is numerically calculated as \(p_{0, \mathrm{eq}}^O (m_1=0, m_2=0)= 0.332\), \(p_{0, \mathrm{eq}}^O (m_1=0, m_2=1)= 0.310\), \(p_{0, \mathrm{eq}}^O (m_1=1, m_2=0)= 0.289\), and \(p_{0, \mathrm{eq}}^O (m_1=1, m_2=1)= 0.278\). We note that the value of \(\langle \sigma \rangle - \langle \Theta \rangle \) in Fig. 6.11 is close to 0 when the initial distribution of the output system is close to the equilibrium distribution \(p_{0, \mathrm{eq}}^O (m_1, m_2)\).
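The numerical check of Fig. 6.11 can be reproduced along the following lines. The sketch uses the parameters quoted in the caption of Fig. 6.11 and sweeps the initial distributions \(p(o_1)\) and \(p(m_1)\) over a small hypothetical grid (with \(o_1\) and \(m_1\) independent, as in the caption).

```python
import numpy as np
import itertools

beta, tau, Dt, D = 0.01, 0.001, 0.5, 100.0               # beta^O = beta^M, tau, Delta t, barrier
F_M = {0: (100.0, 10.0), 1: (100.0, 30.0)}               # (F^M_0, F^M_1) as functions of o1
F_O = {(0, 0): (100.0, 30.0), (1, 0): (100.0, 10.0),     # (F^O_0, F^O_1) as functions of (m1, m2)
       (0, 1): (100.0, 20.0), (1, 1): (100.0, 5.0)}

def transition(F):
    """p(next | prev) over the interval Delta t from the two-state master equation, Eqs. (6.54)-(6.56)."""
    w01 = np.exp(-beta * (D - F[0])) / tau               # rate 0 -> 1
    w10 = np.exp(-beta * (D - F[1])) / tau               # rate 1 -> 0
    peq0 = w10 / (w01 + w10)                             # equilibrium occupation of state 0
    decay = np.exp(-(w01 + w10) * Dt)
    p0 = {0: peq0 + (1 - peq0) * decay, 1: peq0 - peq0 * decay}      # p(next = 0 | prev)
    return {(prev, nxt): p0[prev] if nxt == 0 else 1 - p0[prev]
            for prev in (0, 1) for nxt in (0, 1)}

def sigma_minus_theta(q_o1, q_m1):
    """<sigma> - <Theta> of the output system for p(o1 = 0) = q_o1 and p(m1 = 0) = q_m1."""
    p_o1, p_m1 = (q_o1, 1 - q_o1), (q_m1, 1 - q_m1)
    P = np.zeros((2, 2, 2, 2))                           # joint p(o1, m1, m2, o2)
    for o1, m1, m2, o2 in itertools.product((0, 1), repeat=4):
        P[o1, m1, m2, o2] = (p_o1[o1] * p_m1[m1]
                             * transition(F_M[o1])[m1, m2]
                             * transition(F_O[m1, m2])[o1, o2])
    p_o1mm = P.sum(axis=3)                               # p(o1, m1, m2)
    p_o2mm = P.sum(axis=0).transpose(2, 0, 1)            # p(o2, m1, m2)
    val = 0.0
    for o1, m1, m2, o2 in itertools.product((0, 1), repeat=4):
        dF_O = F_O[m1, m2][o2] - F_O[m1, m2][o1]
        val += P[o1, m1, m2, o2] * (-beta * dF_O
                                    + np.log(p_o1mm[o1, m1, m2]) - np.log(p_o2mm[o2, m1, m2]))
    return val

vals = [sigma_minus_theta(q1, q2) for q1 in (0.1, 0.3, 0.5, 0.7, 0.9) for q2 in (0.1, 0.5, 0.9)]
print("min of <sigma> - <Theta> over the grid:", min(vals))      # nonnegative, Eq. (6.82)
```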