1 Introduction

Let us first provide a simple illustrative example of some of our main results. Perform a random walk on the network

where every time you cross one specific edge in one direction you get 1€, and when you cross it in the opposite direction you pay 1€. All other transitions also give and take credit, but in liras. Initially your pockets are empty of euros, and while you have a virtually infinite reservoir of liras your bank would not convert them into euros.

The question we pose and answer in this paper is: assuming that you live forever, what is the probability \(\mathfrak {f}_-\) that you eventually go broke (reach \(-1\)€, which you cannot actually afford to pay)?

In the special case where you are initially in \(\star \) (so that at least you get a chance not to go immediately broke!), we find, on the assumption \(F > 0\),

$$\begin{aligned} \mathfrak {f}_- = \exp - F, \end{aligned}$$
(1)

with

(2)

where each diagram implies multiplication by the corresponding rates of the Markov chain. If instead \(F < 0\), the probability to go bankrupt is 1.

In the case where the diagonal transition has vanishing rates in both directions, the diagrams containing diagonal terms disappear from the last expression and we are left with the log-ratio of two cyclic contributions

(3)

This object is known as cycle affinity, a measure of the probability of performing the cycle in one direction relative to the opposite. For unicyclic networks (“norias”) Eq. (1) boils down to a remarkable formula first obtained by Bauer and Cornu [1] (which in fact is slightly more general in that the initial state can be chosen arbitrarily). As is reasonable, the cycle is completed more often in the favourable direction (\(A > 0\)) than in the unfavourable one; the Bauer-Cornu formula then yields the probability of the rare event that the cycle is ever completed more often in the unfavourable direction. Generalizations have been proven for the total entropy production (a weighted balance of euros and liras) using the tools of martingale theory [2,3,4], and discussed in the light of first-exit time problems [5].

Beyond the probabilistic interpretation, cycle affinities also afford two different thermodynamic interpretations, one global and one local, of which we give here an intuitive sketch (but see Sec. 3.2 for details), on the assumption that forward and backward rates of a transition \(x \leftrightarrow y\) have ratio \(\exp \delta q_{xy} / T_{xy}\), with \(\delta q\) the heat (in our example: currency) exchange along that transition and T the temperature (in our example this could be the interest rate) of that transition, i.e. a measure of how inconvenient it is to perform that transition (to borrow money). Then the global one is Carnot’s entropy production along a cyclic process [6]

$$\begin{aligned} A = \oint \frac{\delta q}{T} \end{aligned}$$
(4)

where \(\oint \) is a shorthand for the sum over cyclic transitions. This takes into account all contributions, e.g. from liras and from euros, even if for the problem at hand liras do not play any role. Thus this latter interpretation has little operational value. The local one is

(5)

in terms of the specific transaction in euros, its temperature, and the value \(T^\varnothing \) of such temperature at which forward and backward euro transactions happen with the same frequency (that is: the rest of the world does not care much about whatever you do with your euros).

Going back to our main result Eq. (1) (which, we remind, holds for arbitrary, not necessarily unicyclic, networks), here the effective affinity does not have such a simple global interpretation, but it still retains the local interpretation. What is lost with respect to norias is the independence from the initial state: while it does not matter where you are initially along a cycle to complete that cycle, it does matter where you are in a network to perform an arbitrary composition of cycles that touch the initial state. We generalize Eq. (1) appropriately.

We then show by computational examples that first-passage and extreme value problems such as the one above might give a better estimate of the effective affinity than do fluctuation and fluctuation-dissipation relations at fixed stopping time. Finally, Eq. (1) is reminiscent of Boltzmann’s formula, but turned upside down. We linger on this analogy towards the conclusions.

2 Framework

2.1 State-Space Processes

Consider an irreducible continuous-time Markov chain on finite state space \(\mathcal {X}\). We characterize it in terms of the probability \(p^\mathcal {X}_t = \{p^{\mathcal {X}}_t(x), x \in \mathcal {X}\}\) of being at x at time t, which satisfies the continuous-time master equation

$$\begin{aligned} \frac{d}{dt} p^{\mathcal {X}}_t(x) = \sum _{x' (\ne x) \in \mathcal {X}} \left[ {r}(x\vert x') \,p^{\mathcal {X}}_t(x') - {r}(x'\vert x) \,p^{\mathcal {X}}_t(x) \right] \end{aligned}$$
(6)

starting from a given initial distribution \(p^{\mathcal {X}}_0\), with non-negative rates \({r}(x\vert x')\) of jumping from \(x'\) to x. We also define the continuous-time (adjoint) generator R with matrix entries

$$\begin{aligned} R_{x,x'} = {r}(x\vert x') - \delta _{x,x'} r(x') \end{aligned}$$
(7)

where \(\delta \) is Kronecker’s delta and

$$\begin{aligned} r(x) = \sum _{x' \in \mathcal {X}} {r}(x'\vert x) \end{aligned}$$
(8)

is the exit rate out of a state. The master equation reads in vector form \(\frac{d}{dt} p^{\mathcal {X}}_t = R p^{\mathcal {X}}_t\) and its stationary distribution solves \(R p^{\mathcal {X}}_\infty = 0\). From now on we do not specify the range of summation unless necessary.

We now focus on a pair of connected states, namely \(x,x' = 1,2\) without loss of generality. We assume that edge \(1 \leftrightarrow 2\) is not a bridge, that is, that its removal does not disconnect the system, and denote by \(\mathcal {X}_\varnothing \) (or simply \(\varnothing \)) the system where edge \(1 \leftrightarrow 2\) is removed. Transitions between these states are deemed to be visible to an external observer. Let \(\ell \in \mathcal {L} = \{ {\rightarrow }= 2 \rightarrow 1, {\leftarrow }= 1 \rightarrow 2\}\) denote the transitions between such states, in either direction, and \(\texttt{t}(\cdot ), \texttt{s}(\cdot ) \in \{1,2\}\) the target and source states of a transition, i.e. \(\texttt{t}({\rightarrow }) = \texttt{s}({\leftarrow }) = 1\) and \(\texttt{t}({\leftarrow }) = \texttt{s}({\rightarrow }) = 2\).

Letting \(n(\ell )\) be the number of times transition \(\ell \) occurs along a realization of the process, we define the visible activity and the cumulated current respectively as

$$\begin{aligned} n&= n({\rightarrow }) + n({\leftarrow }), \end{aligned}$$
(9)
$$\begin{aligned} c&= n({\rightarrow }) - n({\leftarrow }). \end{aligned}$$
(10)

Notice that they typically grow linearly in time; thus we denote the mean stationary current (i.e. cumulated current per unit time) as

$$\begin{aligned} \langle \dot{c} \rangle = {r}(1\vert 2) p^\mathcal {X}_\infty (2) - {r}(2\vert 1) p^\mathcal {X}_\infty (1). \end{aligned}$$
(11)

Our final goal is to compute the probability \(\mathfrak {f}_\pm \) that the cumulated current c takes value \(\pm 1\) at least once as the process unfolds from time \(t = 0\) to time \(t \rightarrow +\infty \).

2.2 Transition-State Processes

Our strategy is to lift the description of the process from state space \(\mathcal {X}\) to transition space \(\mathcal {L}\), following the treatment of Ref. [7]. The central objects to be calculated are the trans-transition probabilities \(p(\ell \vert \ell ')\) that the next observable transition is \(\ell \) given that the previous was \(\ell '\). An intuitive way to go about this would be by brute-force coarse-graining of a stochastic trajectory \(\varvec{x},\varvec{\tau } = x_0,\tau _0 \rightarrow x_1,\tau _1 \rightarrow \ldots \rightarrow x_k,\tau _k\) in state space, where \(x_j\) are the visited states and \(\tau _j\) are the permanence times. Then \(p(\ell \vert \ell ')\) can be computed by marginalization of the probability density function at fixed number of jumps k

$$\begin{aligned} f_k(\varvec{x},\varvec{\tau }) = \prod _{i = 1}^{k-1} r(x_{i+1}\vert x_i) \, e^{- r(x_i) \tau _i} \end{aligned}$$
(12)

by integrating away the intermediate times and summing over all trajectories between the target state of \(\ell '\) and the source state of \(\ell \) that otherwise do not contain observable transitions, and multiplying by the rate of this latter transition. This direct procedure is illustrated in the Supplemental Material of Ref. [7].

A more elegant line is the following. Notice that with probability one the visible activity eventually takes any given positive integer value, and that at any given time t it does not depend on future information. Thus the time when the activity reaches a certain value n for the first time is a valid stopping time. Then by the strong Markov property [8] the state occupied after the n-th visible transition is itself a Markov process in state space. Let \(p^\mathcal {L}_n(\ell )\) be the probability that the n-th visible transition is \(\ell \). Notice that the probability that the next transition is \(\ell \) given that the previous was \(\ell '\) only depends on the target state of \(\ell '\). Thus we conclude that \(p^\mathcal {L}_n\) satisfies a discrete-time Markov chain in transition space

$$\begin{aligned} p^\mathcal {L}_{n+1}(\ell ) = \sum _{\ell ' \in \mathcal {L}} {p}(\ell \vert \ell ') \, p^\mathcal {L}_{n}(\ell ') \end{aligned}$$
(13)

evolving from some initial probability \(p^\mathcal {L}_1(\ell )\) that the first transition is \(\ell \). The \({p}(\ell \vert \ell ')\) are the so-called trans-transition probabilities; we arrange them in a trans-transition matrix P with entries \(P_{\ell ,\ell '} = {p}(\ell \vert \ell ')\).

Both the initial transition probability and the trans-transition probabilities can be obtained from the initial state probability \(p^\mathcal {X}_0\) and the transition rates \({r}(x\vert x')\) by solving first-transition time problems. In particular the probability that, starting from x, \(\ell \) is the first visible transition and that it occurs in the time interval \([t,t+dt)\) is given by

$$\begin{aligned} {r}(\texttt{t}(\ell )\vert \texttt{s}(\ell )) \left[ \exp tS\right] _{\texttt{s}(\ell ),x} dt \end{aligned}$$
(14)

where S is the matrix obtained from R by setting to zero the off-diagonal entries corresponding to the visible transition, namely

$$\begin{aligned} \begin{aligned} S_{x,x'}&= R_{x,x'}, \qquad \textrm{for}\,(x,x') \ne (1,2), (2, 1), \\ S_{1,2} = S_{2,1}&= 0. \end{aligned} \end{aligned}$$
(15)

By integrating Eq. (14) from \(t = 0\) to infinity and evaluating at \(x = \texttt{t}(\ell ')\) we find, for all \(\ell ,\ell ' \in \mathcal {L}\), the trans-transition probabilities

$$\begin{aligned} {p}(\ell \vert \ell ') = - {r}(\texttt{t}(\ell )\vert \texttt{s}(\ell )) [S^{-1}]_{\texttt{s}(\ell ) \texttt{t}(\ell ')}, \end{aligned}$$
(16)

and, for all \(\ell \in \mathcal {L}\), the probability of the first transition

$$\begin{aligned} p^\mathcal {L}_1(\ell ) = - {r}(\texttt{t}(\ell )\vert \texttt{s}(\ell )) \sum _{x} [S^{-1}]_{\texttt{s}(\ell ),x} \, p^\mathcal {X}_0(x). \end{aligned}$$
(17)

The invertibility of matrix S is granted by the fact that, as a corollary of the Perron-Frobenius theorem [9], its Perron root is strictly smaller than that of R, which is zero (see [7, Appendix] for more details). There it was also proven that trans-transition probabilities and the initial transition probability are positive and normalized, as they should be:

$$\begin{aligned} 1 = p^\mathcal {L}_1({\rightarrow }) + p^\mathcal {L}_1({\leftarrow }) = {p}({\rightarrow }\vert \ell ) + {p}({\leftarrow }\vert \ell ), \quad \textrm{for}\;\ell \in \mathcal {L}. \end{aligned}$$
(18)
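As an illustration, the following minimal sketch (hypothetical rates, not taken from the paper) evaluates Eqs. (16)–(17), checks the normalization of Eq. (18), and cross-checks \(p({\rightarrow }\vert \cdot )\) against an elementary first-step analysis.

```python
# Minimal sketch (hypothetical rates, not from the paper): trans-transition
# probabilities via Eq. (16), the first-transition probability of Eq. (17),
# the normalization of Eq. (18), and an independent first-step cross-check.
import numpy as np

rng = np.random.default_rng(1)
n_states = 4
r = rng.uniform(0.5, 2.0, size=(n_states, n_states))   # r[x, x'] = r(x|x')
np.fill_diagonal(r, 0.0)
exit_rate = r.sum(axis=0)
R = r - np.diag(exit_rate)

# Eq. (15): survival generator with the visible entries 1<->2 (indices 0, 1) set to zero
S = R.copy(); S[0, 1] = 0.0; S[1, 0] = 0.0
Si = np.linalg.inv(S)

# Eq. (16), with "->" = 2 -> 1 (indices 1 -> 0) and "<-" = 1 -> 2 (indices 0 -> 1)
P = np.array([[-r[0, 1] * Si[1, 0], -r[0, 1] * Si[1, 1]],    # p(->|->), p(->|<-)
              [-r[1, 0] * Si[0, 0], -r[1, 0] * Si[0, 1]]])   # p(<-|->), p(<-|<-)
assert np.allclose(P.sum(axis=0), 1.0)                       # Eq. (18)

# Eq. (17): first-transition probability for a uniform initial state distribution
p0 = np.full(n_states, 1.0 / n_states)
p1 = np.array([-r[0, 1] * (Si[1] @ p0), -r[1, 0] * (Si[0] @ p0)])
assert np.isclose(p1.sum(), 1.0)

# independent check: h[x] = P(next visible transition is "->" | currently at x)
q = r / exit_rate                 # embedded jump probabilities q[x', x]
q[0, 1] = 0.0; q[1, 0] = 0.0      # exclude the visible jumps from the recursion
s = np.zeros(n_states); s[1] = r[0, 1] / exit_rate[1]
h = np.linalg.solve(np.eye(n_states) - q.T, s)
print(P[0], [h[0], h[1]])         # p(->|->), p(->|<-) vs first-step analysis
```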

Explicitly, the trans-transition matrix is given by

$$\begin{aligned} P = \frac{1}{\nu _{{\rightarrow }} + \nu _{{\leftarrow }} - \nu _\circ } \left( \begin{array}{cc} \nu _{{\rightarrow }} - \nu _\circ & \nu _{{\rightarrow }} \\ \nu _{{\leftarrow }} & \nu _{{\leftarrow }} - \nu _\circ \end{array}\right) \end{aligned}$$
(19)

where, letting \(A_{\setminus (x_1, \ldots , x_n \vert x'_1, \ldots , x'_n)}\) be a matrix from which rows \(x_1, \ldots , x_n\) and columns \(x'_1, \ldots , x'_n\) are removed, we have

$$\begin{aligned} \nu _\circ&= {r}(1\vert 2) {r}(2\vert 1) \det R_{\setminus (1,2 \vert 2, 1)} \nonumber \\ \nu _{{\rightarrow }}&= {r}(1\vert 2) \det R_{\setminus (2\vert 1)} \\ \nu _{{\leftarrow }}&= {r}(2\vert 1) \det R_{\setminus (1\vert 2)}. \nonumber \end{aligned}$$
(20)

A proof of these expressions is given in Appendix A.1.

3 Results

3.1 Statement and Derivation of the Main Result

We can now formulate our problem of calculating the probability that the cumulated current ever hits value \(-1\) (case \(+1\) for later) as

$$\begin{aligned} \mathfrak {f}_- = \sum _{n = 1}^\infty \mathfrak {f}^{(n)}_- \end{aligned}$$
(21)

where \(\mathfrak {f}^{(n)}_-\) is the probability that the cumulated current c takes value \(-1\) for the first time at the \(n\)-th visible transition. The first is just the probability that the transition occurs right away:

$$\begin{aligned} \mathfrak {f}^{(1)}_-&= p^\mathcal {L}_1({\leftarrow }). \end{aligned}$$
(22)

Notice instead that the cumulated current cannot be \(-1\) after two visible transitions:

$$\begin{aligned} \mathfrak {f}^{(2)}_-&= 0. \end{aligned}$$
(23)

For the cumulated current to be \(-1\) for the first time at the third visible transition, we need that the first visible transition is \({\rightarrow }\) and the second and third are \({\leftarrow }\), therefore:

$$\begin{aligned} \mathfrak {f}^{(3)}_-&= {p}({\leftarrow }\vert {\leftarrow }) {p}({\leftarrow }\vert {\rightarrow }) p^\mathcal {L}_1({\rightarrow }). \end{aligned}$$
(24)

To go beyond, first notice that all probabilities at an even number of visible transitions vanish. For an odd number \(2n+1\), we need to count all different paths of \(2n+1\) steps that perform a \({\leftarrow }\) transition leading from \(c = 0\) to \(c = -1\) for the first time as the last step, and multiply each path by the corresponding probability. Namely, we need to count all different sequences \((\ell _1,\ell _2, \ldots , \ell _{2n+1})\) such that:

1) \(\ell _{1} = {\rightarrow }\) and \(\ell _{2n+1} = {\leftarrow }\);

2) at any intermediate step the number of \({\leftarrow }\) is never greater than the number of \({\rightarrow }\), and at step 2n the number of \({\leftarrow }\) is exactly equal to the number of \({\rightarrow }\);

3) they have a given number \(k \le n\) of \({\leftarrow }\vert {\rightarrow }\) trans-transitions (which also fixes the number of \({\leftarrow }\vert {\leftarrow }\), \({\rightarrow }\vert {\rightarrow }\), and \({\rightarrow }\vert {\leftarrow }\) trans-transitions).

Fig. 1 A mountain with \(n = 5\) up and down slopes, 3 peaks and 2 valleys

In fact, this question maps to a well-known enumeration problem: if we replace \({\rightarrow }\) with a \(45^\circ \) unit segment and \({\leftarrow }\) with a \(-45^\circ \) unit segment, we need to count all “mountains” of length 2n that can be drawn without lifting the pencil and that have exactly k peaks (see Fig. 1). This problem is well-known to be solved by the Narayana numbers [10]

$$\begin{aligned} N(n,k) = \frac{1}{n} \binom{n}{k} \binom{n}{k-1}. \end{aligned}$$
(25)

Therefore for \(n \ge 1\) the solution to our problem is

$$\begin{aligned} \mathfrak {f}^{(2n+1)}_- = p^\mathcal {L}_1({\rightarrow }) \frac{{p}({\leftarrow }\vert {\leftarrow })}{{p}({\rightarrow }\vert {\leftarrow })} \sum _{k = 1}^n N(n,k) [{p}({\leftarrow }\vert {\leftarrow }) {p}({\rightarrow }\vert {\rightarrow })]^{n-k} [{p}({\leftarrow }\vert {\rightarrow }) {p}({\rightarrow }\vert {\leftarrow })]^{k}. \end{aligned}$$
(26)

Notice that the prefactor in Eq. (26) accounts for the initial transition (which must be \({\rightarrow }\)), for the last transition (which must be \({\leftarrow }\) given that the previous was also \({\leftarrow }\)), and for the fact that in such mountains valleys are one less than peaks.
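The counting argument can be checked directly. The sketch below (assumed trans-transition probabilities and an assumed initial transition distribution; none of the numbers come from the paper) enumerates all admissible sequences by brute force and compares their total probability with the Narayana expression of Eq. (26).

```python
# Minimal sketch (assumed trans-transition probabilities and initial distribution,
# none taken from the paper): brute-force check of Eq. (26).
from itertools import product
from math import comb

a, b = 0.55, 0.35            # a = p(->|->), b = p(<-|<-)
p1 = {+1: 0.7, -1: 0.3}      # initial transition probability, +1 = "->", -1 = "<-"
p = {(+1, +1): a, (-1, +1): 1 - a,    # p(next | previous)
     (-1, -1): b, (+1, -1): 1 - b}

def narayana(n, k):
    return comb(n, k) * comb(n, k - 1) // n       # Eq. (25)

def f_brute(m):
    """Probability that the current first hits -1 at the m-th visible transition."""
    total = 0.0
    for seq in product((+1, -1), repeat=m):
        c, ok = 0, True
        for i, step in enumerate(seq):
            c += step
            if c == -1 and i < m - 1:             # hit too early
                ok = False
                break
        if ok and c == -1:
            w = p1[seq[0]]
            for prev, nxt in zip(seq, seq[1:]):
                w *= p[(nxt, prev)]
            total += w
    return total

def f_narayana(n):
    """Eq. (26), the probability of first hitting -1 at the (2n+1)-th transition."""
    s = sum(narayana(n, k) * (a * b) ** (n - k) * ((1 - a) * (1 - b)) ** k
            for k in range(1, n + 1))
    return p1[+1] * b / (1 - b) * s

for n in range(1, 5):
    print(2 * n + 1, f_brute(2 * n + 1), f_narayana(n))
```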

We now use the fact that Narayana numbers admit the generating function [11]

$$\begin{aligned} G(x,y)&= \sum _{n \ge 1} \sum _{k = 1}^n N(n,k) \,x^n y^k \nonumber \\&= \frac{1 + x(1-y) - \sqrt{1-2 x(1+y) + x^2 (1-y)^2} }{2x} - 1. \end{aligned}$$
(27)

Then, letting

$$\begin{aligned} x_*&= {p}({\rightarrow }\vert {\rightarrow }) {p}({\leftarrow }\vert {\leftarrow }) \end{aligned}$$
(28)
$$\begin{aligned} y_*&= \frac{{p}({\rightarrow }\vert {\leftarrow }) {p}({\leftarrow }\vert {\rightarrow })}{x_*}, \end{aligned}$$
(29)

we find that

$$\begin{aligned} \mathfrak {f}_- = p^\mathcal {L}_1({\leftarrow }) + p^\mathcal {L}_1({\rightarrow }) \frac{{p}({\leftarrow }\vert {\leftarrow })}{{p}({\rightarrow }\vert {\leftarrow })} G(x_*,y_*). \end{aligned}$$
(30)

Now notice that, using normalization of the trans-transition probabilities Eq. (18), we have

$$\begin{aligned} x_*(1 -y_*)&= {p}({\rightarrow }\vert {\rightarrow }) + {p}({\leftarrow }\vert {\leftarrow }) - 1, \nonumber \\ x_*( 1+y_*)&= 2{p}({\rightarrow }\vert {\rightarrow }) {p}({\leftarrow }\vert {\leftarrow }) - {p}({\rightarrow }\vert {\rightarrow }) - {p}({\leftarrow }\vert {\leftarrow }) + 1. \end{aligned}$$
(31)

After some tedious but revealing calculation (see Appendix A.2 for details) one obtains that the square root in Eq. (27) has the real-valued solution

$$\begin{aligned} \sqrt{1-2 x_*(1+y_*) + {x_*}^2 (1-y_*)^2} = \big \vert {p}({\rightarrow }\vert {\rightarrow }) - {p}({\leftarrow }\vert {\leftarrow }) \big \vert \end{aligned}$$
(32)

in terms of the absolute value \(\vert \,\cdot \,\vert \). We then find the remarkably simple expression

$$\begin{aligned} G(x_*,y_*) = \left\{ \begin{array}{ll} {p}({\rightarrow }\vert {\leftarrow })/{p}({\leftarrow }\vert {\leftarrow }), &{} \textrm{if}\,{p}({\rightarrow }\vert {\rightarrow }) < {p}({\leftarrow }\vert {\leftarrow }) \\ {p}({\leftarrow }\vert {\rightarrow })/{p}({\rightarrow }\vert {\rightarrow }), &{} \textrm{if}\,{p}({\rightarrow }\vert {\rightarrow }) \ge {p}({\leftarrow }\vert {\leftarrow }). \end{array} \right. \end{aligned}$$
(33)

Plugging this latter into Eq. (30) we find our central result

$$\begin{aligned} \mathfrak {f}_- = \min \,\left\{ 1, p^\mathcal {L}_1({\leftarrow }) + p^\mathcal {L}_1({\rightarrow }) \frac{{p}({\leftarrow }\vert {\leftarrow })}{{p}({\rightarrow }\vert {\rightarrow })} \frac{{p}({\leftarrow }\vert {\rightarrow })}{{p}({\rightarrow }\vert {\leftarrow })} \right\} , \end{aligned}$$
(34)

where the two values are obtained respectively for \({p}({\rightarrow }\vert {\rightarrow }) < {p}({\leftarrow }\vert {\leftarrow })\) and for \({p}({\rightarrow }\vert {\rightarrow }) \ge {p}( {\leftarrow }\vert {\leftarrow })\). To express \(\mathfrak {f}_-\) as a minimum between two values we used the fact that, because \({p}({\leftarrow }\vert {\rightarrow })/ {p}({\rightarrow }\vert {\leftarrow }) = [1 - {p}({\rightarrow }\vert {\rightarrow })]/[1- {p}({\leftarrow }\vert {\leftarrow }) ]\), the second value is monotonically increasing in \({p}({\leftarrow }\vert {\leftarrow })\) and decreasing in \({p}({\rightarrow }\vert {\rightarrow })\), and equals 1 only for \({p}({\rightarrow }\vert {\rightarrow }) = {p}({\leftarrow }\vert {\leftarrow })\).
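As a sanity check of the closed form, here is a sketch (with assumed values of the trans-transition probabilities, not from the paper) comparing the series of Eq. (21), built from Eqs. (22) and (26) and truncated at a large order, with Eq. (34).

```python
# Minimal sketch (assumed values): the closed form Eq. (34) versus a truncation of
# the series Eq. (21) built from Eqs. (22) and (26).
from math import comb

a, b = 0.7, 0.4              # p(->|->), p(<-|<-)
p1 = {"r": 0.6, "l": 0.4}    # p_1(->), p_1(<-)

def narayana(n, k):
    return comb(n, k) * comb(n, k - 1) // n

def series(n_max=400):
    total = p1["l"]                          # Eq. (22), the n = 1 term
    for n in range(1, n_max + 1):            # odd orders 2n+1, Eq. (26)
        s = sum(narayana(n, k) * (a * b) ** (n - k) * ((1 - a) * (1 - b)) ** k
                for k in range(1, n + 1))
        total += p1["r"] * b / (1 - b) * s
    return total

closed = min(1.0, p1["l"] + p1["r"] * (b / a) * ((1 - a) / (1 - b)))   # Eq. (34)
print(series(), closed)
```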

Now notice that the stationary distribution in transition space (eigenvector of the trans-transition matrix relative to eigenvalue 1, \(Pp^\mathcal {L}_\infty = p^\mathcal {L}_\infty \)) is easily found to be \(p^\mathcal {L}_\infty (\ell ) \propto {p}(\ell \vert \overline{\ell })\), where \(\overline{\ell }\) denotes the reverse transition of \(\ell \) (i.e. \(\overline{{\rightarrow }} = {\leftarrow }\), \(\overline{{\leftarrow }} = {\rightarrow }\)). Therefore we can rewrite the above expression as

$$\begin{aligned} \mathfrak {f}_- = \min \,\left\{ 1, p^\mathcal {L}_1({\leftarrow }) + p^\mathcal {L}_1({\rightarrow }) \frac{{p}({\leftarrow }\vert {\leftarrow }) p^\mathcal {L}_\infty ({\leftarrow })}{{p}({\rightarrow }\vert {\rightarrow }) p^\mathcal {L}_\infty ({\rightarrow })} \right\} . \end{aligned}$$
(35)

Now consider the probability \(\mathfrak {f}_+\) that the cumulated current ever reaches value \(+1\). A quick review of the above derivation promptly leads to

$$\begin{aligned} \mathfrak {f}_+ = \min \,\left\{ p^\mathcal {L}_1({\rightarrow }) + p_1^\mathcal {L}({\leftarrow }) \frac{{p}({\rightarrow }\vert {\rightarrow }) p^\mathcal {L}_\infty ({\rightarrow })}{{p}({\leftarrow }\vert {\leftarrow }) p^\mathcal {L}_\infty ({\leftarrow })}, 1 \right\} , \end{aligned}$$
(36)

where the two values are taken respectively for \({p}({\rightarrow }\vert {\rightarrow }) < {p}({\leftarrow }\vert {\leftarrow })\) and for \({p}({\rightarrow }\vert {\rightarrow }) \ge {p}({\leftarrow }\vert {\leftarrow })\).

3.2 The Effective Affinity

Let us define

$$\begin{aligned} F&= \log \frac{{p}({\rightarrow }\vert {\rightarrow })}{{p}({\leftarrow }\vert {\leftarrow })}. \end{aligned}$$
(37)

This quantity has been given an operational thermodynamic interpretation in Refs.  [12, 13] as follows. By Eqs. (19) and (20) we have

$$\begin{aligned} F&= \log \frac{{r}(1\vert 2) [ \det R_{\setminus (2\vert 1)} - {r}(2\vert 1) \det R_{\setminus (1,2 \vert 2, 1)}]}{ {r}(2\vert 1) [\det R_{\setminus (1\vert 2)} - {r}(1\vert 2) \det R_{\setminus (1,2 \vert 2, 1)}]} \nonumber \\&= \log \frac{{r}(1\vert 2) p_\infty ^\varnothing (2) }{ {r}(2\vert 1) p_\infty ^\varnothing (1)} \end{aligned}$$
(38)

where in the second expression \(p^\varnothing _\infty = \lim _{t \rightarrow \infty } p^\mathcal {X_\varnothing }_{t}\) is the stationary probability of the system where transition \(1 \leftrightarrow 2\) is removed, i.e. \(R^\varnothing p^\varnothing _\infty = 0\) (see Appendix A.2 for a direct proof; the distribution is unique by the assumption that edge \(1 \leftrightarrow 2\) is not a bridge). Notice that this is a stalling system, that is, one where (by non-existence of the transition!) the mean stationary current \(\langle \dot{c} \rangle ^\varnothing = {r}(1\vert 2) p^\varnothing _\infty (2) - {r}(2\vert 1) p^\varnothing _\infty (1)\) vanishes.
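The equivalence between the transition-based expression Eq. (37) and the stalling-system expression in the second line of Eq. (38) is easy to probe numerically; the following sketch (hypothetical four-state rates chosen at random, not a model from the paper) computes the effective affinity along both routes.

```python
# Minimal sketch (hypothetical rates): the effective affinity computed once from
# the trans-transition probabilities of Eq. (37) and once from the stationary
# distribution of the system with edge 1<->2 removed, second line of Eq. (38).
import numpy as np

rng = np.random.default_rng(3)
n_states = 4
r = rng.uniform(0.5, 2.0, size=(n_states, n_states))   # r[x, x'] = r(x|x')
np.fill_diagonal(r, 0.0)
R = r - np.diag(r.sum(axis=0))

# route 1: F = log p(->|->)/p(<-|<-) via the survival matrix S of Eq. (15)
S = R.copy(); S[0, 1] = 0.0; S[1, 0] = 0.0
Si = np.linalg.inv(S)
F_transition = np.log((r[0, 1] * Si[1, 0]) / (r[1, 0] * Si[0, 1]))

# route 2: stationary distribution of the stalling (edge-removed) generator
r0 = r.copy(); r0[0, 1] = 0.0; r0[1, 0] = 0.0
R0 = r0 - np.diag(r0.sum(axis=0))
w, v = np.linalg.eig(R0)
p0 = np.real(v[:, np.argmin(np.abs(w))]); p0 /= p0.sum()
F_stalling = np.log(r[0, 1] * p0[1] / (r[1, 0] * p0[0]))

print(F_transition, F_stalling)    # the two routes should agree
```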

Let us now parametrize rates according to the principle of local detailed balance [14, 15]

$$\begin{aligned} \frac{{r}(x\vert x')}{{r}(x'\vert x)} = \exp \frac{\delta q_{xx'}}{T_{xx'}} \end{aligned}$$
(39)

in terms of an energy increment \(\delta q_{xx'} = - \delta q_{x'x}\) and a temperature profile \(T_{xx'} = T_{x'x}\) describing the influence of a local bath’s degrees of freedom. We assume that temperature \(T_{12}\) is specific to transition \(1 \leftrightarrow 2\) (that is, its variation does not affect other rates). Then it was proven [12] that there exists a value \(T^\varnothing _{12}\) for which the mean current stalls (but here the transition is possible!). Nevertheless, a simple argument shows that the stationary values \(p^\varnothing _\infty (2)\) and \(p^\varnothing _\infty (1)\) are the same as in the system where the transition is removed altogether (see Appendix A.3). We therefore have

$$\begin{aligned} 0 = \langle \dot{c} \rangle ^\varnothing = {r}^\varnothing (1\vert 2) p^\varnothing _\infty (2) - {r}^\varnothing (2\vert 1) p^\varnothing _\infty (1) \end{aligned}$$
(40)

leading to \(\log \left[ p_\infty ^\varnothing (2)/p_\infty ^\varnothing (1)\right] = -\delta q_{12} / T^\varnothing _{12}\) and

$$\begin{aligned} F = \delta q_{12} \left( \frac{1}{T_{12}} - \frac{1}{T_{12}^\varnothing } \right) . \end{aligned}$$
(41)

This latter local expression grants an operational procedure to measure F, on the assumption that \(\delta q_{12}\) is measured or theoretically determined by a microphysical theory of the system describing energy levels, that \(T_{12}\) is tunable, and that the mean current \(\langle \dot{c} \rangle \) is observable. The procedure consists in tuning \(T_{12}\) to the value \(T^\varnothing _{12}\) for which the observable mean current vanishes. Then, if \(\delta q_{12}\) is known, F is determined in terms of the inverse temperature difference.
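A numerical caricature of this procedure is sketched below (a hypothetical four-state model with assumed values of \(\delta q_{12}\) and of the bare attempt frequency; not a model from the paper): the stalling inverse temperature is located by bisection on the visible mean current, and Eq. (41) is compared with Eq. (37).

```python
# Minimal sketch (hypothetical four-state model, assumed parameters): the
# operational recipe for F. The edge 1<->2 obeys local detailed balance, Eq. (39);
# the inverse temperature at which the visible mean current stalls is located by
# bisection, and Eq. (41) is compared with the trans-transition expression Eq. (37).
import numpy as np

rng = np.random.default_rng(4)
n_states = 4
base = rng.uniform(0.5, 2.0, size=(n_states, n_states))   # rates of all other edges
np.fill_diagonal(base, 0.0)
dq, gamma = 1.3, 1.0      # heat increment and bare frequency of edge 1<->2 (assumed)

def rates(beta12):
    """Full rate matrix with r(1|2)/r(2|1) = exp(dq * beta12), Eq. (39)."""
    r = base.copy()
    r[0, 1] = gamma * np.exp(+dq * beta12 / 2)   # r(1|2)
    r[1, 0] = gamma * np.exp(-dq * beta12 / 2)   # r(2|1)
    return r

def stationary(R):
    w, v = np.linalg.eig(R)
    p = np.real(v[:, np.argmin(np.abs(w))])
    return p / p.sum()

def mean_current(beta12):
    r = rates(beta12)
    p = stationary(r - np.diag(r.sum(axis=0)))
    return r[0, 1] * p[1] - r[1, 0] * p[0]       # Eq. (11)

# bisection for the stalling inverse temperature beta^0 = 1/T^0
lo, hi = -30.0, 30.0
assert mean_current(lo) * mean_current(hi) < 0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if mean_current(lo) * mean_current(mid) <= 0:
        hi = mid
    else:
        lo = mid
beta_stall = 0.5 * (lo + hi)

# compare Eq. (41) with Eq. (37) at a working inverse temperature
beta_work = 1.25
r = rates(beta_work)
R = r - np.diag(r.sum(axis=0))
S = R.copy(); S[0, 1] = 0.0; S[1, 0] = 0.0
Si = np.linalg.inv(S)
F_transition = np.log((r[0, 1] * Si[1, 0]) / (r[1, 0] * Si[0, 1]))   # Eq. (37)
F_local = dq * (beta_work - beta_stall)                              # Eq. (41)
print(F_transition, F_local)
```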

As regards the global acceptation of affinity mentioned in the introduction, for systems containing a single oriented cycle \(\mathcal {C}\) it is easily shown [16, 17] that \(F = A\) is the cycle affinity, namely the logarithm of the ratio of the products of rates along the cycle in opposite directions

$$\begin{aligned} A = \log \prod _{(xx') \in \mathcal {C}} \frac{{r}(x\vert x')}{{r}(x'\vert x)} = \sum _{(xx') \in \mathcal {C}} \frac{\delta q_{xx'}}{T_{xx'}} = \oint \frac{\delta q}{T}. \end{aligned}$$
(42)

For vanishing A (Kolmogorov condition) one finds an equilibrium state with vanishing mean current. From the above relation one immediately finds for the equilibrium temperature the relation

$$\begin{aligned} \frac{\delta q_{12}}{T^\varnothing _{12}} = - \sum _{(12) \ne (xx') \in \mathcal {C}} \frac{\delta q_{xx'}}{T_{xx'}}. \end{aligned}$$
(43)

For generic multicyclic systems, this latter identification with a specific thermodynamic cycle is not possible. However, the cumulated current \(c = \sum _{\mathcal {C}} c(\mathcal {C})\) can in fact be envisioned as the sum of the winding numbers over all cycles that include the visible transition (see Refs. [18, 19] for some insights on such winding numbers). Notice that a stalling mean current does not imply global equilibrium, as these cycles may have circulation even if overall the visible mean current stalls. An explicit expression of F in terms of such cycles is

$$\begin{aligned} F = \log \frac{\sum _{\mathcal {C} \ni {\rightarrow }} w(\mathcal {C}) \prod _{(xx') \in \mathcal {C}} {r}(x\vert x')}{\sum _{\mathcal {C} \ni {\rightarrow }} w(\mathcal {C}) \prod _{(xx') \in \mathcal {C}} {r}(x'\vert x)} \end{aligned}$$
(44)

where \(w(\mathcal {C})\) is some cycle weight, independent of the cycle’s orientation [12]. Nevertheless, defining entropy production as the Kullback–Leibler distance of random processes from their time-reversed, it has been shown that \(F \langle \dot{c} \rangle \) is indeed the entropy production estimated by an external observer who only has access to the sequence of visible transitions [16, 17].

3.3 Special Cases and the Noria, and a Generalization

We consider two special cases where our main results can be written in terms of the effective affinity. Here we make explicit the dependency of the stopping probability on the probability \(p_1^\mathcal {L}\) of the first transition, \(\mathfrak {f}_\pm = \mathfrak {f}_\pm \left[ p_1^\mathcal {L}\right] \). Remember that such probability can in turn be computed from the initial probability in state space \(p_0^\mathcal {X}\) via Eq. (17). Finally we generalize the above results to the probability of hitting arbitrarily low values.

3.3.1 Stationary Case

In the first case we sample the initial transition from the stationary distribution. We easily find from Eqs. (35) and (36)

$$\begin{aligned} \mathfrak {f}_-\left[ p^\mathcal {L}_\infty \right]&= \min \, \left\{ 1, p^\mathcal {L}_\infty ({\leftarrow }) (1 + \exp -F) \right\} , \end{aligned}$$
(45)
$$\begin{aligned} \mathfrak {f}_+\left[ p^\mathcal {L}_\infty \right]&= \min \, \left\{ p^\mathcal {L}_\infty ({\rightarrow }) (1 + \exp +F), 1 \right\} . \end{aligned}$$
(46)

From an operational point of view this is particularly simple because it only requires waiting long enough for the system to reach stationarity. Then \(p^\mathcal {L}_\infty \) can be computed explicitly from the time series of the transitions, by just counting the relative frequency of \({\rightarrow }\)’s and \({\leftarrow }\)’s.
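The following sketch illustrates this protocol on a hypothetical four-state model (random rates, assumed parameters; not a model from the paper): since only the order of the jumps matters for the arrow frequencies, the embedded jump chain is simulated in place of the continuous-time process, and Eqs. (45)–(46) are then evaluated with the empirical \(p^\mathcal {L}_\infty \).

```python
# Minimal sketch (hypothetical four-state rates): estimate p^L_inf by counting
# arrow frequencies along a long trajectory, then evaluate Eqs. (45)-(46).
# Only the order of the jumps matters, so the embedded jump chain is simulated.
import numpy as np

rng = np.random.default_rng(5)
n_states = 4
r = rng.uniform(0.5, 2.0, size=(n_states, n_states))   # r[x, x'] = r(x|x')
np.fill_diagonal(r, 0.0)
jump_prob = r / r.sum(axis=0)        # destination probabilities, column by column

x = 0
n_right = n_left = 0                 # "->" = 1 -> 0 and "<-" = 0 -> 1 (paper's 2->1, 1->2)
for _ in range(100_000):
    x_new = rng.choice(n_states, p=jump_prob[:, x])
    if (x, x_new) == (1, 0):
        n_right += 1
    elif (x, x_new) == (0, 1):
        n_left += 1
    x = x_new

pi_right = n_right / (n_right + n_left)     # empirical p^L_inf(->)
pi_left = 1.0 - pi_right

# effective affinity from Eq. (37), computed exactly here for reference
R = r - np.diag(r.sum(axis=0))
S = R.copy(); S[0, 1] = 0.0; S[1, 0] = 0.0
Si = np.linalg.inv(S)
F = np.log((r[0, 1] * Si[1, 0]) / (r[1, 0] * Si[0, 1]))

f_minus = min(1.0, pi_left * (1.0 + np.exp(-F)))     # Eq. (45)
f_plus = min(pi_right * (1.0 + np.exp(+F)), 1.0)     # Eq. (46)
print(f_minus, f_plus)
```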

3.3.2 Cyclic Case

In the second case, we prepare the system just after a visible transition is performed and then wait for the same transition to occur again, thus completing a cycle. Therefore for \(c = +1\) we prepare the system at the tipping point of \({\rightarrow }\), which gives \(p_0^\mathcal {X}(x) = \delta _{x,1}\) so that, after Eq. (17) is applied, \(p^\mathcal {L}_1(\ell ) = {p}(\ell \vert {\rightarrow })\). For \(c = -1\) we prepare the system at the tipping point of \({\leftarrow }\), which gives \(p_0^\mathcal {X}(x) = \delta _{x,2}\) and \(p^\mathcal {L}_1(\ell ) = {p}(\ell \vert {\leftarrow })\). Using a manipulation such as

$$\begin{aligned} {p}({\rightarrow }\vert {\rightarrow }) \left[ 1+ \frac{{p}({\rightarrow }\vert {\leftarrow })}{{p}({\leftarrow }\vert {\leftarrow })} \right] = {p}({\rightarrow }\vert {\rightarrow }) \left[ 1+ \frac{1- {p}({\leftarrow }\vert {\leftarrow })}{{p}({\leftarrow }\vert {\leftarrow })} \right] = \exp F \end{aligned}$$
(47)

we find

$$\begin{aligned} \mathfrak {f}_-\left[ {p}(\cdot \vert {\leftarrow })\right]&= \min \,\left\{ 1, \exp - F \right\} , \end{aligned}$$
(48)
$$\begin{aligned} \mathfrak {f}_+\left[ {p}(\cdot \vert {\rightarrow })\right]&= \min \,\left\{ \exp + F, 1\right\} . \end{aligned}$$
(49)

This result is analogous to the one derived in Ref. [1] for unicyclic systems, with the exception that in the unicyclic case the choice of initial state (or, equivalently, of the final transition) is not relevant: all states share the same cycle, the explicit dependency on the initial state drops, and the above result simplifies to

$$\begin{aligned} \mathfrak {f}_\pm [\,\cdot \,]&= \min \,\left\{ 1, \exp \pm A \right\} \end{aligned}$$
(50)

where \(\mathfrak {f}_\pm [\,\cdot \,]\) is just the probability that the cycle is ever completed in either direction, independently of the initial state.
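A quick Monte Carlo illustration of Eq. (48) is sketched below (assumed trans-transition probabilities, not from the paper, and a finite cutoff on the number of visible transitions, so a small downward bias is expected).

```python
# Minimal sketch (assumed trans-transition probabilities, chosen so that F > 0):
# Monte Carlo check of Eq. (48), i.e. the probability of ever hitting c = -1
# when prepared just after a "<-" transition.
import numpy as np

rng = np.random.default_rng(6)
a, b = 0.7, 0.45                  # p(->|->), p(<-|<-)
F = np.log(a / b)                 # Eq. (37)

M, N_max = 20_000, 500            # trajectories and cutoff on visible transitions
hits = 0
for _ in range(M):
    prev, c = -1, 0               # previous transition is "<-"; c counts +1 per "->"
    for _ in range(N_max):
        p_right = a if prev == +1 else 1.0 - b    # probability that next is "->"
        step = +1 if rng.random() < p_right else -1
        c += step
        prev = step
        if c == -1:
            hits += 1
            break
print(hits / M, np.exp(-F))       # should agree up to sampling error and the cutoff bias
```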

3.3.3 Hitting \(-n\)

The above hitting result for the cumulated current ever reaching \(-1\) (for \(F > 0\)) lends itself to a simple generalization to the case of the cumulated current hitting value \(-n\), for \(n \in \mathbb {N}\). Intuitively (given that denumerable + denumerable = denumerable) this is just obtained by reiterating the hitting problem (renewal property), with the initial condition reset, after each new hit, to just after the latest occurrence of \({\leftarrow }\). One immediately obtains

$$\begin{aligned} \mathfrak {f}_{-n}[p^{\mathcal {L}}_1]&= \mathfrak {f}_{-1}[p^{\mathcal {L}}_1] \; \mathfrak {f}_{-1}[{p}(\cdot \vert {\leftarrow })]^{n-1} = \left[ p^{\mathcal {L}}_1({\leftarrow }) e^{F} + p^{\mathcal {L}}_1({\rightarrow }) e^{ -F^\leftrightarrow } \right] e^{-nF}, \end{aligned}$$
(51)

where we rewrote \({p}({\rightarrow }\vert {\leftarrow })/{p}({\leftarrow }\vert {\rightarrow }) = \exp F^{\leftrightarrow }\) as the effective affinity of a system whose trans-transition matrix \(P^\leftrightarrow \) has the columns swapped with respect to P; interestingly, this auxiliary dynamics also plays a role in formulating the transient fluctuation relation in Ref. [7], but its physical interpretation has yet to be clarified.

3.4 Fluctuation Relations

In the unicyclic case, one easily finds the fluctuation relation

$$\begin{aligned} \frac{\mathfrak {f}_+[\,\cdot \,] }{\mathfrak {f}_-[\,\cdot \,]} = \exp A. \end{aligned}$$
(52)

In the multicyclic case, from Eqs. (49), (48) we have

$$\begin{aligned} \frac{\mathfrak {f}_+\left[ {p}(\cdot \vert {\rightarrow })\right] }{\mathfrak {f}_-\left[ {p}(\cdot \vert {\leftarrow })\right] } = \exp F. \end{aligned}$$
(53)

This looks formally like a fluctuation relation, with a caveat: in fluctuation relations the probabilities being compared should belong to the same process, while in this case they are different probabilities, as they are conditioned on two different initial distributions, viz. \({p}(\cdot \vert {\rightarrow })\) and \({p}(\cdot \vert {\leftarrow })\). This, as we will see, has consequences on the computational or experimental interpretation of data, given that one should prepare different experiments for the forward and backward processes and post-select their outcomes, which is not desirable. In the next section we comment further on this aspect, arguing that Eq. (53) may in fact be the best candidate for an estimator of nonequilibrium, despite the approximations involved.

Furthermore, in Ref. [7] (Eq. (21) therein) it was proven that, by sampling the initial transition from distribution \(p_1^\mathcal {L}(\ell ) \propto {p}(\ell \vert \ell )\), the following fluctuation relation holds

$$\begin{aligned} \frac{p_n(c)}{p_n(-c)} = \exp c F, \end{aligned}$$
(54)

where we remind that c is given by Eq. (10) and \(p_n(c)\) is the probability that the cumulated current is a certain value \(c \in \mathbb {Z}\) after n visible transitions. One can then further derive the relation

$$\begin{aligned} \frac{\sum _{n \in \mathcal {N}} p_n(+1)}{\sum _{n \in \mathcal {N}} p_n(-1)} = \exp F \end{aligned}$$
(55)

where \(\mathcal {N}\) is any subset of \(\mathbb {N}\). This is reminiscent of Eq. (53), but notice that these latter are not independent probabilities.

Finally, fluctuation relations for single edge currents at stopping times different from the total number of visible transitions (in particular at “clock time” t) do not generally hold (except in the unicyclic case), because the statistics of a specific current depends on all other currents flowing through the network. This is what makes relations such as Eqs. (53) and (55) particularly appealing, as they are local and phenomenological, and do not depend on knowledge of the whole system.

3.5 Estimation of the Effective Affinity

Many of the above expressions can be used to build estimators of the effective affinity. We will focus on the ones coming from cyclic processes.

Consider M independent realizations of a trajectory performing N visible transitions:

$$\begin{aligned} \ell _1^{(m)},\ell _2^{(m)},\ldots ,\ell _{N}^{(m)}, \quad m \in [1,M] . \end{aligned}$$
(56)

Define the cumulated current after the n-th visible transition

$$\begin{aligned} \hat{c}^{(m)}_n = \sum _{k = 1}^n \left( \delta _{\ell _k^{(m)}, {\rightarrow }} - \delta _{\ell _k^{(m)}, {\leftarrow }}\right) . \end{aligned}$$
(57)

It has empirical distribution

$$\begin{aligned} \hat{p}_n(c) = \frac{1}{M} \sum _{m = 1}^M \delta _{\hat{c}^{(m)}_n, c}, \quad \textrm{for} \, c \in [-n,n] \end{aligned}$$
(58)

and empirical mean and variance

$$\begin{aligned} \langle \hat{c}_n \rangle&= \frac{1}{M} \sum _{m = 1}^M \hat{c}^{(m)}_n = \sum _{c \in [-n,n]} c \, \hat{p}_n(c), \end{aligned}$$
(59)
$$\begin{aligned} \langle \!\langle \hat{c}_n^2 \rangle \!\rangle&= \frac{1}{M} \sum _{m = 1}^M (\hat{c}^{(m)}_n - \langle \hat{c}_n \rangle )^2 = \sum _{c \in [-n,n]} c^2 \, \hat{p}_n(c) - \langle \hat{c}_n \rangle ^2. \end{aligned}$$
(60)

Define the empirical stopping times

$$\begin{aligned} \hat{N}_{\pm }^{(m)} = \inf \left( \{n \in [0,N] \;\mathrm {s.t.}\; \hat{c}^{(m)}_n = \pm 1\} \cup \{N+1\} \right) \end{aligned}$$
(61)

and the estimators of the stopping probabilities

$$\begin{aligned} \hat{\mathfrak {f}}_\pm&= \max \left\{ 1 - \frac{1}{M} \sum _{m = 1}^M \delta _{\hat{N}_{\pm }^{(m)}, N+1}, \frac{1}{M} \right\} \end{aligned}$$
(62)

where the maximum is introduced to avoid possible divergences in the case \(\hat{\mathfrak {f}}_\pm = 0\) (see also Eq. (25) in Ref. [20]).

Notice that, due to the finite cutoff on the number of transitions and in view of Eq. (21), these latter estimators are biased. In particular they systematically underestimate (on average) the true stopping probability, because all occurrences of \(c = \pm 1\) after N visible transitions are discarded.

Assuming that we can ignore the initial conditions, we can invert Eqs. (48) and (49) to obtain an estimator of the effective affinity

$$\begin{aligned} \hat{F}_{\textrm{cy}}&= \left\{ \begin{array}{ll} \log \hat{\mathfrak {f}}_+, &{} \mathrm {if\;} \hat{\mathfrak {f}}_+ \le \hat{\mathfrak {f}}_-, \\ -\log \hat{\mathfrak {f}}_-, &{} \mathrm {if\;} \hat{\mathfrak {f}}_+ > \hat{\mathfrak {f}}_-. \end{array} \right. \end{aligned}$$
(63)

We can compare this to the estimator coming from the stopping fluctuation relation

$$\begin{aligned} \hat{F}_{\textrm{fr}}&= \log \hat{\mathfrak {f}}_+ - \log \hat{\mathfrak {f}}_-, \end{aligned}$$
(64)

which is generally biased due to the different initial conditions in Eq. (53).

We complement these stopping-problem estimators with an estimator coming from the theory of linear response out of stalling states [21]

$$\begin{aligned} \hat{F}_{\textrm{lr}}&= \frac{2 \langle \hat{c}_N \rangle }{\langle \!\langle \hat{c}_N^2 \rangle \!\rangle } \end{aligned}$$
(65)

and with an estimator obtained from the standard entropy production expression as a Kullback–Leibler divergence (properly regularized to avoid taking \(\log 0\))

$$\begin{aligned} \hat{F}_{\textrm{kl}}&= \frac{1}{ \langle \hat{c}_N \rangle }\sum _{\begin{array}{c} c \in [-N,N] \\ \hat{p}_N(c)\hat{p}_N(-c) \ne 0 \end{array}} \hat{p}_N(c) \log \frac{\hat{p}_N(c)}{\hat{p}_N(-c)}. \end{aligned}$$
(66)

This latter is well-known to be a biased estimator, and better practices in evaluating relative entropies correct these biases but also greatly increase the running time (see Supplementary Material in Ref.  [16]). We do not concern ourselves with this issue here.
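To make the comparison concrete, the following self-contained sketch (a toy directly in transition space with assumed probabilities, which is not the state-space model of Fig. 2) computes the four estimators \(\hat{F}_{\textrm{cy}}\), \(\hat{F}_{\textrm{fr}}\), \(\hat{F}_{\textrm{lr}}\), \(\hat{F}_{\textrm{kl}}\) from M simulated sequences of N visible transitions.

```python
# Minimal sketch (toy directly in transition space with assumed probabilities,
# not the state-space model of Fig. 2): the four estimators computed from M
# simulated sequences of N visible transitions.
import numpy as np

rng = np.random.default_rng(7)
a, b = 0.65, 0.5            # p(->|->), p(<-|<-)
F_true = np.log(a / b)      # Eq. (37)
p1_right = 0.55             # assumed probability that the first transition is "->"
M, N = 10_000, 20

# simulate the visible-transition chain; steps are +1 ("->") or -1 ("<-")
steps = np.empty((M, N), dtype=int)
u = rng.random((M, N))
steps[:, 0] = np.where(u[:, 0] < p1_right, 1, -1)
for n in range(1, N):
    p_right = np.where(steps[:, n - 1] == 1, a, 1.0 - b)
    steps[:, n] = np.where(u[:, n] < p_right, 1, -1)
c = steps.cumsum(axis=1)                     # cumulated current, Eq. (57)

# stopping probabilities, Eq. (62), floored at 1/M to avoid log 0
f_plus = max((c.max(axis=1) >= 1).mean(), 1.0 / M)
f_minus = max((c.min(axis=1) <= -1).mean(), 1.0 / M)

F_cy = np.log(f_plus) if f_plus <= f_minus else -np.log(f_minus)   # Eq. (63)
F_fr = np.log(f_plus) - np.log(f_minus)                            # Eq. (64)
F_lr = 2.0 * c[:, -1].mean() / c[:, -1].var()                      # Eq. (65)

# Eq. (66): Kullback-Leibler estimator from the empirical distribution of c_N
vals, counts = np.unique(c[:, -1], return_counts=True)
p_hat = dict(zip(vals, counts / M))
F_kl = sum(p_hat[v] * np.log(p_hat[v] / p_hat[-v])
           for v in vals if -v in p_hat) / c[:, -1].mean()

print(F_true, F_cy, F_fr, F_lr, F_kl)
```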

Fig. 2 For a fully-connected four-state model with all unit rates except for \(R_{1,2} = \exp F\) and initial state \(x=1\) (that is \(p^{\mathcal {X}}_0(x) = \delta _{1,x}\)): (dashed) the effective affinity F; (continuous) estimator \(\hat{F}_{\textrm{fr}}\) of \(\log \mathfrak {f}_+/ \mathfrak {f}_-\); (crossed) estimator \(\hat{F}_{\textrm{cy}}\) of \(\textrm{sign}\, (\mathfrak {f}_+ - \mathfrak {f}_-) \log \mathfrak {f}_\sigma \); (bullets) the linear regime estimator \(\hat{F}_{\textrm{lr}}\); (triangles) the entropy production estimator \(\hat{F}_{\textrm{kl}}\). The ultimate stopping time was set to \(N = 20\) and the number of samples to \(M = 10{,}000\)

In Fig. 2 we compare the behaviour of these estimators in a simple model. The linear regime estimator \(\hat{F}_{\textrm{lr}}\) performs better near the stalling condition \(F = 0\), while it deviates significantly from the true value away from stalling. On the contrary, the cyclic estimator \(\hat{F}_{\textrm{cy}}\) converges far from stalling, but it systematically suffers from the finite-N cutoff. The entropy production estimator \(\hat{F}_{\textrm{kl}}\) is also biased and noisy due to the tails of the cumulated current’s distribution. The stopping fluctuation-relation estimator \(\hat{F}_{\textrm{fr}}\) instead appears not to be affected by these issues, despite the approximation due to the bias from the different initial states in Eq. (53).

The left-right asymmetry in the above plot is due to the fact that for simplicity we decided to perturb only one rate \(R_{1,2} = \exp F\) and keep all others fixed. This choice is useful to show that entropy production estimators can suffer from noise in the tails, depending on the time-scale separation between rates. Had we distributed the perturbation among \(R_{1,2}\) and \(R_{2,1}\), we would have obtained a more symmetric plot.

4 Discussion

4.1 A Nonequilibrium Boltzmann Formula?

Our Eq. (1) can be seen as a “nonequilibrium Boltzmann formula” given its similarities with \(S = \log W\) connecting entropy S and probability 1/W (W being the volume of state space), elaborated by Boltzmann and refined by Planck (Boltzmann’s constant set to unity). But with some precautions.

Einstein wrote about the Boltzmann formula: «To be able to calculate W, one needs a complete theory of the system under consideration. If considered from a phenomenological point of view [this] equation appears devoid of content». Einstein then inverted the equation to make it a rule for inferring probabilities from measured entropy differences between equilibrium states – which better capture the dynamical nature of processes – and used it to perfect Smoluchowski’s theory of critical opalescence [22]. However, at least since Kant, philosophers warn us that observations are not independent of conceptions, and therefore deduction from measurements needs theory (the fluctuations of what?), and theory needs the human touch. Still now we don’t know which came first, whether the chickens of gases and thermal machines or the eggs of thermodynamics and statistical mechanics [23]. At equilibrium the situation is aggravated by the fact that the construction of thermodynamic potentials requires many arbitrary choices by the observer [24], while the pursuit of objectivity requires a description of processes in terms of invariant quantities.

Far from equilibrium, flows of heat to and from the environment are not quantified by differences of a state function, but by “inexact differences”. By the so-called principle of local detailed balance, ratios of probabilities of forward-to-backward processes have been connected to so-called affinities that quantify the entropy production along cyclic processes, and which are invariant upon the redefinition of the fundamental degrees of freedom [24]. However, until recently it has proven difficult to directly connect probabilities and meaningful physical quantities. In fact, despite some claims, there are no predictive variational principles far from equilibrium [25, 26].

However, a concern still plagues our result. What comes first: the egg of F, or the chicken of \(\mathfrak {f}_-\)? Only circumstances can tell.

4.2 Relation to a Companion Publication

The present manuscript is strictly related to a companion work [27] by the same Authors that addresses similar questions. Let us clarify in which ways.

Equation (51) is strictly related to Eq. (14) in Ref. [27]. There the normalized probability \(\mathfrak {p}_{-n}\) of the cumulated current taking minimum value \(-n\) is addressed, while in our case \(\mathfrak {f}_{-n}\) allows that, after hitting value \(-n\), the cumulated current may take even more negative values. Therefore we have, intuitively, that this latter is the cumulative distribution of the former

$$\begin{aligned} \mathfrak {f}_{-n} = \sum _{k \ge n} \mathfrak {p}_{-k}. \end{aligned}$$
(67)

Given that \(\mathfrak {p}_{-k}\) is normalized, this identification allows one to estimate the escape probability that the cumulated current never actually attains a negative value as \(\mathfrak {p}_0 = 1 - \mathfrak {f}_{-1}\), which in view of Eq. (34), the explicit expression for the trans-transition probabilities Eqs. (19) and (20), and the explicit expression for the probability of the first transition Eq. (17) allows one to express \(\mathfrak {p}_0\) in terms of (the distribution of) the initial state (see below the explicit expression).

The other main difference between the two works is methodological. Here we follow a constructive but specific approach based on first-transition time techniques and combinatorics, while Ref. [27] is rooted in the more general theory of martingales. In particular in Ref. [27] it is shown that, upon a proper choice of initial state, \(\exp -Fc\) is a martingale, and in particular its expected value \(\langle \exp -Fc \rangle \) is constant in time. Doob’s optional stopping theorem then grants that this constancy extends to any proper stopping time. By choosing the moment when the cumulated current hits the boundary values \(n_+ > 0\) or \(n_- < 0\) for the first time, and given that c starts from value 0, one obtains

$$\begin{aligned} 1 = \left\langle \exp -Fc \right\rangle = \mathfrak {f}^{(n_-)}_{n_+} e^{-Fn_+} + \mathfrak {f}^{(n_+)}_{n_-} e^{-Fn_-} \end{aligned}$$
(68)

where \(\mathfrak {f}^{(n_+)}_{n_-}\) is the probability of hitting \(n_-\) whilst not hitting \(n_+\). The nonequilibrium Boltzmann formula follows by taking \(n_- = -1\) and \(n_+ \rightarrow \infty \), in which limit \(\mathfrak {f}^{(n_+)}_{-1} \rightarrow \mathfrak {f}_{-1}\). Interestingly, similar formulas were derived in an optimization context in Ref.  [28].

Finally, here is a short dictionary of equivalent terms and concepts in the two papers: transition rates \({r}(x\vert x')\) here are \(k_{u \rightarrow v}\) there; the observed edge \(1 \leftrightarrow 2\) is \(y \leftrightarrow x\); “cumulated” currents c are “integrated” currents J; for stationary probabilities we have \(\infty \) instead of “\(\textrm{ss}\)”; the effective affinity F is \(a^*\); the extremum probability \(\mathfrak {p}_{-n}[p^\mathcal {L}_1]\), given Eq. (17), is \(p_{J^{\textrm{inf}}_{x \rightarrow y}} (-\ell \vert X(0) = x_0)\); the escape probability \(\mathfrak {p}_0\) is

$$\begin{aligned} p_{\textrm{esc}}(x_0) = 1 + \left( k_{x \rightarrow y} \frac{ [S^{-1}]_{x,y} [S^{-1}]_{y,x_0}}{ [S^{-1}]_{y,x} } + k_{y \rightarrow x} \frac{ [S^{-1}]_{y,y} [S^{-1}]_{x,x_0}}{ [S^{-1}]_{x,x} } \right) e^{-a^*} \end{aligned}$$
(69)

where S is the matrix with entries \(S_{u,v} = k_{v \rightarrow u} - \delta _{u,v} \sum _{w \in \mathcal {X}; w \ne u} k_{u \rightarrow w}\) if \((u,v) \ne (x,y), (y,x)\), else \(S_{x,y} = S_{y,x} = 0\), and we used the explicit expression of the effective affinity Eq. (37), that now translates into

$$\begin{aligned} a^*= \log \frac{k_{x \rightarrow y} [S^{-1}]_{x,y}}{k_{y \rightarrow x} [S^{-1}]_{y,x}}. \end{aligned}$$
(70)

When \(x_0 = x\) we find

$$\begin{aligned} p_{\textrm{esc}}(x)&= 1 + \left( k_{x \rightarrow y}[S^{-1}]_{x,y} + k_{y \rightarrow x} [S^{-1}]_{y,y} \right) e^{-a^*} \nonumber \\&= 1 - \left( \frac{ k_{x \rightarrow y} \det S_{\setminus (y\vert x)} - k_{y \rightarrow x} \det S_{\setminus (y\vert y)}}{\det S} \right) e^{-a^*} \nonumber \\&= 1 - e^{-a^*}, \end{aligned}$$
(71)

where this latter passage follows from the algebraic manipulations in Appendix A.1. We thus recover Eq. (16) in the companion paper. We checked computationally the more general equivalence (implied by the theory) of Eq. (69) with Eq. (80) in the companion paper, but a direct proof has remained elusive.
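The chain of identities above can be checked numerically; the sketch below (fixed toy rates in the companion paper's notation, chosen so that \(a^* > 0\); not taken from either paper) evaluates Eq. (69) for every initial state and verifies Eq. (71).

```python
# Minimal sketch (fixed toy rates, companion-paper notation with k[u, v] = k_{u->v};
# values chosen so that a* > 0): the escape probability of Eq. (69) for every
# initial state, and the check of Eq. (71) that p_esc(x) = 1 - exp(-a*).
import numpy as np

n_states = 3
k = np.zeros((n_states, n_states))
x, y, z = 1, 0, 2             # observed edge x <-> y, plus a third state z
k[x, y], k[y, x] = 5.0, 2.0   # the observed transition rates
k[y, z], k[z, y] = 3.0, 4.0
k[x, z], k[z, x] = 5.0, 6.0

# S as defined below Eq. (69): S_{u,v} = k_{v->u} - delta_{u,v} sum_w k_{u->w},
# with the two observed off-diagonal entries set to zero
S = k.T - np.diag(k.sum(axis=1))
S[x, y] = S[y, x] = 0.0
Si = np.linalg.inv(S)

a_star = np.log(k[x, y] * Si[x, y] / (k[y, x] * Si[y, x]))    # Eq. (70)

def p_esc(x0):
    """Escape probability of Eq. (69) for initial state x0."""
    bracket = (k[x, y] * Si[x, y] * Si[y, x0] / Si[y, x]
               + k[y, x] * Si[y, y] * Si[x, x0] / Si[x, x])
    return 1.0 + bracket * np.exp(-a_star)

for x0 in range(n_states):
    print("p_esc from state", x0, "=", p_esc(x0))
print("Eq. (71):", p_esc(x), "vs", 1.0 - np.exp(-a_star))
```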

4.3 Conclusions

Both martingale and first-transition methods are having a revival in connection to thermodynamic considerations [2,3,4, 7, 16, 17, 29], and they may lead to independent generalizations and applications of our results. In both approaches, the main open question is the generalization to an arbitrary subset of currents – neither the full entropy production nor a single edge current.

As regards the first-transition approach followed here, as soon as one steps out of the single-edge case the Markov property of the process in transition space is lost. Here the combinatorial approach may allow some exploration.

Since any Radon-Nikodym derivative of two probability distributions over realizations of the process is a martingale, martingales can be used to generalise the results in this paper. From this approach it would seem that one can generate an arbitrary number of first-hitting results by building ad hoc auxiliary dynamics. However, the physical interpretation of this class of results may not be clear: it is crucial in our approach that the effective affinity has a clear operational interpretation. In particular, if one could tune its value by just “turning a knob”, then the effective affinity is just the difference between the knob’s value (in proper physical units) at which one wants to perform the experiment and the value at which the observable current vanishes on average. This local operational interpretation dispenses one from computing the effective affinity from knowledge of all the inner details of the fundamental thermodynamic cycles that influence that particular current.

On a more speculative side, notice that in our derivation we made an arbitrary restriction of the solution of the Narayana generating function, based on the assumption that we expect probabilities to be real-valued. It may be interesting to explore the meaning of the complex-valued solution.

Famously, Boltzmann’s epitaph is his formula. But it took a whole community (including Einstein, Planck, etc.) to digest it. So whose formula is it?