The results below focus on one or more states, within a potentially larger state transition model, for which we would like to assume a particular dwell time distribution and derive a corresponding system of mean field ODEs using the LCT or a generalization of the LCT. In particular, the results below describe how to construct those mean field ODEs directly from stochastic model assumptions without needing to derive them from equivalent mean field integral equations (which themselves might need to be derived from an explicit continuous-time stochastic model).
Preliminaries
Before presenting extensions of the LCT, we first illustrate in Sect. 3.1.1 how mean field ODEs include terms that reflect underlying Poisson process rates. In Sect. 3.1.2, we highlight a key property of these Poisson process 1st event time distributions that we refer to as a weak memorylessness property since it is a generalization of the well known memorylessness property of the exponential and geometric distributions.
Transition rates in ODEs reflect underlying Poisson process rates
To build upon the intuition spelled out in Sect. 1.1, where particles exit X following an exponentially distributed dwell time, we now assume that particles exit X following the 1st event time under nonhomogeneous Poisson processes with rate r(t) (recall the 1st event time distribution is exponential if \(r(t)=r\) is constant). As illustrated by the corresponding mean field equations given below, the rate function r(t) can be viewed as either the intensity function (see Footnote 5) for the Poisson process governing when individuals leave state X, or as the (mean field) per-capita rate of loss from state X as shown in Eq. (11). This dual perspective provides valuable intuition for critically evaluating mean field ODEs.
Example 1
(Equivalence between Poisson process rates and per capita rates in mean field ODEs) Consider the scenario described above. The survival function for the dwell time distribution for a particle entering X at time \(\tau \) is \(S(t,\tau )=\exp (-\int _{\tau }^{t} r(u)\,du)\), and it follows that the expected proportion of such particles remaining in X at time \(t>\tau \) is given by \(S(t,\tau )\). Let x(t) be the total amount in state X at time t, \(x(0)=x_0\), and assume that \({\mathcal {I}}(t)\) and r(t) are integrable, non-negative functions of t. Then the corresponding mean field integral equation for this scenario is
$$\begin{aligned} x(t) =\; x_0\,S(t,0) + \int ^t_{0} {\mathcal {I}}(\tau )\, S(t,\tau ) d\tau \end{aligned}$$
(10)
and the integral equation (10) above is equivalent to
$$\begin{aligned} \frac{d}{dt}{x}(t) =\; {\mathcal {I}}(t) - r(t)\, x(t), \; \text { with } x(0)=x_0. \end{aligned}$$
(11)
Proof
The Leibniz rule for differentiating integrals, Eq. (10), and Lemma 1 yield
$$\begin{aligned} \begin{aligned} \frac{d}{dt}{x}(t)&=\; x_0\frac{d}{dt}S(t,0) + \frac{d}{dt}\int ^t_{0} {\mathcal {I}}(\tau )\,S(t,\tau )d\tau \\&=\; -r(t)\,x_0\,e^{-\int _{0}^{t}r(u)\,du} +{\mathcal {I}}(t) -r(t) \int ^t_{0}{\mathcal {I}}(\tau ) e^{-\int _{\tau }^{t}r(u)\,du} d\tau \\&=\; {\mathcal {I}}(t) - r(t)\bigg [x_0\,e^{-\int _{0}^{t}r(u)\,du} + \int ^t_{0} {\mathcal {I}}(\tau )e^{-\int _{\tau }^{t}r(u)\,du} d\tau \bigg ] \\&=\; {\mathcal {I}}(t) - r(t) x(t). \end{aligned} \end{aligned}$$
(12)
\(\square \)
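The equivalence in Example 1 can also be checked numerically. The following is a minimal sketch, not part of the original derivation: the rate \(r(t)=1+0.5\sin t\), the constant inflow, and the helper names (`r`, `inflow`, `m`, `x_integral`, `x_ode`) are all assumptions chosen for illustration. It evaluates Eq. (10) by the trapezoid rule and Eq. (11) by a fixed-step Runge-Kutta scheme, and compares the two at a single time.

```python
import math

def r(t):
    # Hypothetical time-varying per-capita loss rate (an assumption)
    return 1.0 + 0.5 * math.sin(t)

def inflow(t):
    # Hypothetical constant inflow rate I(t) (an assumption)
    return 2.0

def m(t, tau):
    # Closed form of the integral of r(u) from tau to t for the r above
    return (t - tau) - 0.5 * (math.cos(t) - math.cos(tau))

def x_integral(t, x0, n=2000):
    # Eq. (10): x0*S(t,0) + integral of I(tau)*S(t,tau), trapezoid rule in tau
    h = t / n
    total = 0.0
    for i in range(n + 1):
        tau = i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * inflow(tau) * math.exp(-m(t, tau))
    return x0 * math.exp(-m(t, 0.0)) + h * total

def x_ode(t_end, x0, n=2000):
    # Eq. (11): dx/dt = I(t) - r(t) x, integrated with classical RK4
    f = lambda t, x: inflow(t) - r(t) * x
    h = t_end / n
    t, x = 0.0, x0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

# Difference between the two representations at t = 3 with x0 = 1
diff = abs(x_integral(3.0, 1.0) - x_ode(3.0, 1.0))
```

Under these assumptions the two values should agree up to the discretization error of the quadrature and of the ODE solver.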
Weak memoryless property of Poisson process 1st event time distributions
The intuition behind the LCT, and the history-independent nature of ODEs, are related to the memorylessness property of exponential distributions. For example, when particles accumulate in a state with an exponentially distributed dwell time distribution, then at any given time all particles currently in that state have iid exponentially distributed amounts of time left before they leave that state regardless of the duration of time already spent in that state. Each of these can be viewed as a reflection of the history-independent nature of Poisson processes.
Accordingly, the familiar memorylessness property of exponential and geometric distributions can, in a sense, be generalized to (nonhomogeneous) Poisson process 1st event time distributions. Recall that if an exponentially distributed (rate r) random variable T represents the time until some event, then if the event has not occurred by time s the remaining duration of time until the event occurs is also exponential with rate r. The analogous weak memorylessness property of nonhomogeneous Poisson process 1st event time distributions is detailed in the following definition.
Definition 1
(Weak memorylessness property of Poisson process 1st event times) Assume T is a Poisson process 1st event time starting at time \(\tau \), which has rate r(t) and CDF \(H_r^1(t,\tau )=1-\exp (-m(t,\tau ))\) [see Eqs. (9) and (9c)]. If the event has not occurred by time \(s>\tau \) the distribution of the remaining time \(T_s \equiv T-s\;|\;T>s\) follows a shifted but otherwise identical Poisson process 1st event time distribution with CDF \(P(T_s \le t)=H_r^1(t+s,s)\). If \(r(t)=r\) is a positive constant we recover the memorylessness property of the exponential distribution.
Proof
The CDF of \(T_s\) (for \(t>\tau \)) is given by
$$\begin{aligned} \begin{aligned} P(T_s \le t)&=\; P(T - s \le t \;|\; T>s) =\; \frac{P(s< T \le s+ t)}{P(s<T)} \\&=\; \frac{H_r^1(t+s,\tau ) - H_r^1(s,\tau )}{1-H_r^1(s,\tau )} =\; 1-\frac{1-H_r^1(t+s,\tau )}{1-H_r^1(s,\tau )}\\&=\; 1-\frac{e^{-m(t+s,\tau )}}{e^{-m(s,\tau )}} =\; 1-e^{-m(t+s,s)} =\; H_r^1(t+s,s). \\ \end{aligned} \end{aligned}$$
(13)
\(\square \)
In other words, Poisson process 1st event time distributions are memoryless up to a time shift in their rate functions. Suppose multiple particles enter a given state X at different times and leave according to independent Poisson process 1st event times with identical rates r(t) (i.e., t is absolute time, not time since entry into X). Then, for all particles in state X at a given time, the time remaining in state X is (1) independent of how much time each particle has already spent in X and (2) iid according to a Poisson process 1st event time distribution with rate r(t).
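As a small numerical illustration of Definition 1, the sketch below assumes the rate \(r(t)=1+t\) (so that \(m(t,\tau )\) has a simple closed form) and arbitrary values of \(\tau \), s and t. These choices and the function names are hypothetical. It compares the conditional CDF from the first line of Eq. (13) with the shifted CDF \(H_r^1(t+s,s)\).

```python
import math

def m(t, tau):
    # Cumulative rate: integral of the assumed r(u) = 1 + u from tau to t
    return (t - tau) + 0.5 * (t ** 2 - tau ** 2)

def H(t, tau):
    # CDF of the 1st event time for a Poisson process started at time tau
    return 1.0 - math.exp(-m(t, tau))

tau, s, t = 0.5, 1.2, 0.7  # arbitrary illustrative values with s > tau

# P(T - s <= t | T > s), computed from the CDF of T started at tau
conditional = (H(t + s, tau) - H(s, tau)) / (1.0 - H(s, tau))

# Shifted 1st event time CDF, as claimed in Definition 1
shifted = H(t + s, s)
```

The two quantities coincide up to floating point rounding because \(m(t+s,\tau )-m(s,\tau )=m(t+s,s)\).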
Simple case of the LCT
To illustrate how the LCT follows from Lemma 1, consider the simple case of the LCT illustrated in Fig. 1, where a higher dimensional model includes a state transition into, then out of, a focal state X. Assume the time spent in that state (\(T_X\)) is Erlang(r, k) distributed [i.e., \(T_X \sim \) Erlang(r, k)]. Then the LCT provides a system of ODEs equivalent to the mean field integral equations for this process as detailed in the following theorem:
Theorem 1
(Simple LCT) Consider a continuous time state transition model with inflow rate \({\mathcal {I}}(t)\ge 0\) (an integrable function of t) into state X which has an Erlang(r, k) distributed dwell time [with survival function \(S_r^k\) from Eq. (7c)]. Let x(t) be the (mean field) amount in state X at time t and assume \(x(0)=x_0\).
The mean field integral equation for this scenario (see Fig. 1a) is
$$\begin{aligned} x(t) =\; x_0S_r^k(t) + \int ^t_{0} {\mathcal {I}}(s)\,S_r^k(t-s) ds. \end{aligned}$$
(14)
State X can be partitioned (see Fig. 1b) into k sub-states X\(_i\), \(i=1,\ldots ,k\), where particles in X\(_i\) are those awaiting the ith event as the next event under a homogeneous Poisson process with rate r. Let \(x_i(t)\) be the amount in X\(_i\) at time t, and \(x(t) = \sum _{j=1}^{k} x_j(t)\). Equation (14) is equivalent to the mean field ODEs
$$\begin{aligned} \frac{d}{dt}{x_1}(t)&=\; {\mathcal {I}}(t) -r\, x_1(t) \end{aligned}$$
(15a)
$$\begin{aligned} \frac{d}{dt}{x_j}(t)&=\; r\, x_{j-1}(t) - r\, x_j(t), \quad j=2,\ldots ,k \end{aligned}$$
(15b)
with initial conditions \(x_1(0)=x_0\), \(x_j(0)=0\) for \(j\ge 2\), and
$$\begin{aligned} x_j(t)= x_0\,\frac{1}{r}\,g_r^j(t) + \int ^t_{0} {\mathcal {I}}(s) \frac{1}{r}\,g_r^j(t-s)ds. \end{aligned}$$
(16)
Proof
Substituting Eq. (7c) into Eq. (14) and then substituting Eq. (16) yields
$$\begin{aligned} \begin{aligned} x(t)&=\; x_0\,S_r^k(t) + \int ^t_{0} {\mathcal {I}}(s)\,S_r^k(t-s) \,ds \\&=\; x_0\,\sum _{j=1}^{k} \frac{1}{r}\,g_r^j(t) + \int ^t_{0}{\mathcal {I}}(s)\,\sum _{j=1}^{k} \frac{1}{r}\,g_r^j(t-s) \,ds \\&= \sum _{j=1}^{k} \left( x_0\, \frac{1}{r}\,g_r^j(t) + \int ^t_{0} {\mathcal {I}}(s)\;\frac{1}{r}\,g_r^j(t-s) \,ds \right) = \sum _{j=1}^{k} x_j(t). \end{aligned} \end{aligned}$$
(17)
Differentiating Eq. (16) (for \(j=1,\ldots ,k\)) yields Eq. (15) as follows.
For \(j=1\), Eq. (16) reduces to
$$\begin{aligned} x_1(t) = x_0 e^{-r\,t} + \int ^t_{0} {\mathcal {I}}(s) e^{-r(t-s)}ds. \end{aligned}$$
(18)
Differentiating \(x_1(t)\) using the Leibniz integral rule and substituting (18) yields
$$\begin{aligned} \frac{d}{dt}{x_1}(t) =\; - r x_0 e^{-r\,t} - r\int ^t_{0} {\mathcal {I}}(s) e^{-r(t-s)}ds + {\mathcal {I}}(t) \;=\; {\mathcal {I}}(t) - r x_1(t). \end{aligned}$$
(19)
Similarly, for \(j\ge 2\), Lemma 1 yields
$$\begin{aligned} \begin{aligned} \frac{d}{dt}{x_j}(t)&=\; x_0\,\frac{1}{r}\,\frac{d}{dt}g_r^j(t) + \int ^t_{0} {\mathcal {I}}(s) \frac{d}{dt} \left( \frac{1}{r}\,g_r^j(t-s)\right) \,ds \\&=\; x_0\,\left( g_r^{j-1}(t)-g_r^j(t)\right) + \int ^t_{0} {\mathcal {I}}(s) \left( g_r^{j-1}(t-s)-g_r^j(t-s)\right) \,ds \\&=\; r\,\bigg ( \frac{x_0}{r}\,g_r^{j-1}(t) + \int ^t_{0} {\mathcal {I}}(s) \frac{1}{r} g_r^{j-1}(t-s)\,ds \bigg ) -\; r\,\bigg ( \frac{x_0}{r}\,g_r^{j}(t)\\&\quad + \int ^t_{0} {\mathcal {I}}(s) \frac{1}{r} g_r^j(t-s)\,ds\bigg ) =\; r\,x_{j-1}(t) - r\,x_j(t). \end{aligned} \end{aligned}$$
(20)
\(\square \)
The X\(_j\) dwell time distributions are exponential with rate r. To see why, let \(\chi _i\)(t) (where \(1\le i \le k\)) be the (mean field) number of particles in state X at time t that have not reached the ith event. Then
$$\begin{aligned} \chi _i(t) = x_0\,S_r^i(t) + \int _0^t {\mathcal {I}}(s)\,S_r^i(t-s)\,ds \end{aligned}$$
(21)
and by Eq. (7) we see from Eqs. (16) and (21) that \(x_j(t) = \chi _j(t) - \chi _{j-1}(t)\). That is, particles in state X\(_j\) are those for which the \((j-1)\)th event has occurred, but not the jth event. Thus, by properties of Poisson processes the dwell time in state X\(_j\) must be exponential with rate r.
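Theorem 1 can be illustrated numerically in the special case of no inflow. The sketch below is an illustration under stated assumptions rather than part of the proof: the parameter values, the zero inflow, and the helper names (`rk4`, `erlang_survival`, `linear_chain`) are hypothetical, and the survival function is written in its standard closed form \(e^{-rt}\sum _{j=0}^{k-1}(rt)^j/j!\). It integrates Eq. (15) and compares \(\sum _j x_j(t)\) with \(x_0\,S_r^k(t)\) from Eq. (14).

```python
import math

def rk4(f, x, t0, t1, n):
    # Classical fixed-step RK4 for dx/dt = f(t, x), with x a list of floats
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(x, k3)])
        x = [a + h / 6 * (p + 2 * q + 2 * u + v)
             for a, p, q, u, v in zip(x, k1, k2, k3, k4)]
        t += h
    return x

def erlang_survival(t, r, k):
    # Standard Erlang(r, k) survival function
    return math.exp(-r * t) * sum((r * t) ** j / math.factorial(j)
                                  for j in range(k))

def linear_chain(r, k, inflow):
    # Right-hand side of Eqs. (15a)-(15b); x[0] is x_1, x[k-1] is x_k
    def f(t, x):
        dx = [inflow(t) - r * x[0]]
        dx += [r * x[j - 1] - r * x[j] for j in range(1, k)]
        return dx
    return f

r, k, x0, t_end = 2.0, 4, 1.0, 1.5       # illustrative values (assumptions)
no_inflow = lambda t: 0.0                # assumed: I(t) = 0
x = rk4(linear_chain(r, k, no_inflow), [x0] + [0.0] * (k - 1), 0.0, t_end, 1500)

total_from_odes = sum(x)                               # x(t) from Eq. (15)
total_from_survival = x0 * erlang_survival(t_end, r, k)  # x(t) from Eq. (14)
```

With a nonzero inflow the same comparison would require evaluating the convolution integral in Eq. (14) numerically.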
Standard LCT
The following theorem and corollary together provide a more general, formal statement of the standard Linear Chain Trick (LCT) as used in practice. These extend Theorem 1 (compare Figs. 1, 2) to explicitly include that particles leaving X enter state Y then remain in Y according to an arbitrary distribution with survival function S, and include transitions into Y from other sources.
Theorem 2
(Standard LCT) Consider a continuous time dynamical system model of mass transitioning among various states, with inflow rate \({\mathcal {I}}_X(t)\) to a state X and an Erlang(r, k) distributed delay before entering state Y. Let x(t) and y(t) be the amount in each state, respectively, at time t. Further assume an inflow rate \({\mathcal {I}}_Y(t)\) into state Y from other non-X states, and that the underlying stochastic model assumes that the duration of time spent in state Y is determined by survival function \(S_Y(t,\tau )\). Assume \({\mathcal {I}}_i(t)\) are integrable non-negative functions of t, and assume non-negative initial conditions \(x(0)=x_0\) and \(y(0)=y_0\).
The mean field integral equations for this scenario are
$$\begin{aligned} x(t)&=\; x_0\,S_r^k(t) + \int ^t_{0} {\mathcal {I}}_X(s)\,S_r^k(t-s) ds \end{aligned}$$
(22a)
$$\begin{aligned} y(t)&=\; y_0S_Y(t,0) + \int ^t_{0} \bigg ({\mathcal {I}}_Y(\tau ) + x_0\,g_r^k(\tau ) \nonumber \\&\quad +\int ^\tau _0 {\mathcal {I}}_X(s)\,g^k_r(\tau -s)ds\bigg )S_Y(t,\tau )d\tau . \end{aligned}$$
(22b)
Equations (22) are equivalent to
$$\begin{aligned} \frac{d}{dt}{x_1}(t)&=\; {\mathcal {I}}_X(t) -r x_1(t) \end{aligned}$$
(23a)
$$\begin{aligned} \frac{d}{dt}{x_j}(t)&=\; r x_{j-1}(t) - r x_j(t), \quad j=2,\ldots ,k \end{aligned}$$
(23b)
$$\begin{aligned} y(t)&=\; y_0S_Y(t,0) + \int ^t_{0} \underbrace{\left( {\mathcal {I}}_Y(\tau ) + r\,x_k(\tau )\right) }_{\text {Net input rate at time } \tau } S_Y(t,\tau )d\tau \end{aligned}$$
(23c)
where \(x(t) = \sum _{j=1}^{k} x_j(t)\), \(x_1(0)=x_0\), \(x_j(0)=0\) for \(j\ge 2\), and
$$\begin{aligned} x_j(t)= x_0\,\frac{1}{r}\,g_r^j(t) + \int ^t_{0} {\mathcal {I}}_X(s) \frac{1}{r}\,g_r^j(t-s)ds. \end{aligned}$$
(24)
Proof
Eqs. (23a), (23b) and (24) follow from Theorem 1. Equation (23c) follows from substituting (24) into (22b). The definition of \(x_j\) and initial condition \(x(0)=x_0\) together imply \(x_1(0)=x_0\) and \(x_j(0)=0\) for the remaining \(j\ge 2\). \(\square \)
Corollary 1
Integral equations like Eq. (23c) may have ODE representations depending on the Y dwell time distribution, i.e., \(S_Y(t,\tau )\), for example:
1. If particles leave Y after a Poisson process 1st event time distributed dwell time [i.e., the per-capita loss rate from Y is \(\mu (t)\)], then \(S_Y(t,\tau )=\exp (-\int _{\tau }^{t}\mu (u)\,du)\), and letting \({\mathcal {I}}(t)={\mathcal {I}}_Y(t)+rx_k(t)\), Theorem 2 yields
$$\begin{aligned} \frac{d}{dt}{y}(t) =\; {\mathcal {I}}_Y(t) + r x_{k}(t) - \mu (t) y(t).\end{aligned}$$
(25)
2. If particles leave Y after an Erlang(\(\mu ,\kappa \)) delay, then \(S_Y(t,\tau )=S_\mu ^\kappa (t-\tau )\) and letting \({\mathcal {I}}(t)={\mathcal {I}}_Y(t)+rx_k(t)\) Theorem 2 gives that \(y=\sum _{i=1}^\kappa y_i\) and
$$\begin{aligned} \frac{d}{dt}{y_1}(t)&=\; {\mathcal {I}}_Y(t) + r x_{k}(t) -\mu \, y_1(t) \end{aligned}$$
(26a)
$$\begin{aligned} \frac{d}{dt}{y_i}(t)&=\; \mu \, y_{i-1}(t) - \mu \, y_i(t), \quad i=2,\ldots ,\kappa . \end{aligned}$$
(26b)
3. As implied by 1 and 2 above, if the per-capita loss rate \(\mu (t)=\mu \) is constant or time spent in Y is otherwise exponentially distributed, then
$$\begin{aligned} \frac{d}{dt}{y}(t) =\; {\mathcal {I}}_Y(t) + r x_{k}(t) - \mu \, y(t).\end{aligned}$$
(27)
4. Any of the more general cases considered in the sections below.
Example 2
To illustrate how the Standard LCT can be used to substitute an implicit exponential dwell time distribution with an Erlang distribution, consider the SIR example discussed in the Introduction [Eqs. (2) and (3), see also Anderson and Watson 1980; Lloyd 2001a, b], but assume the dwell time distribution for the infected state I is Erlang (still with mean \(1/\gamma \)) with variance \(\sigma ^2\) (see Footnote 6), i.e., by Eq. (6), Erlang with a rate \(r=\gamma k\) and shape \(k=1/(\sigma ^2\gamma ^2)\).
By Theorem 2 and Corollary 1, with \({\mathcal {I}}_I(t)=\lambda (t)\,S(t)\) and \(I(t)=\sum _{j=1}^k I_j(t)\), the corresponding mean field ODEs are
$$\begin{aligned} \frac{d}{dt}S(t)&=\; -\lambda (t)\,S(t) \end{aligned}$$
(28a)
$$\begin{aligned} \frac{d}{dt}I_1(t)&=\; \lambda (t)\,S(t) - \gamma k\,I_1(t) \end{aligned}$$
(28b)
$$\begin{aligned} \frac{d}{dt}I_j(t)&=\; \gamma k\,I_{j-1}(t) - \gamma k\,I_j(t), \quad \text { for } j=2,\ldots ,k \end{aligned}$$
(28c)
$$\begin{aligned} \frac{d}{dt}R(t)&=\; \gamma k\,I_k(t). \end{aligned}$$
(28d)
Notice that if \(\sigma ^2=1/\gamma ^2\) (i.e., \(k=1\)), the dwell time in I is exponentially distributed with rate \(\gamma \), \(I(t)=I_1(t)\), and Eqs. (28) reduce to Eqs. (2).
This example nicely illustrates how using Theorem 2 to relax an exponential dwell time assumption implicit in a system of mean field ODEs is much more straightforward than constructing them after first deriving the integral equations, like Eq. (3), and then differentiating them using Lemma 1. In the sections below, we present similar theorems intended to be used for constructing mean field ODEs directly from stochastic model assumptions.
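Equations (28) translate directly into code. The following sketch is only an illustration: it assumes a force of infection of the form \(\lambda (t)=\beta \,I(t)\) (the form of \(\lambda \) is not specified in this section), and the parameter values, initial conditions, and function names are hypothetical. Since the right-hand sides of Eq. (28) sum to zero, the total \(S+\sum _j I_j+R\) should be conserved, which provides a simple sanity check.

```python
def rk4(f, x, t0, t1, n):
    # Classical fixed-step RK4 for dx/dt = f(t, x), with x a list of floats
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(x, k3)])
        x = [a + h / 6 * (p + 2 * q + 2 * u + v)
             for a, p, q, u, v in zip(x, k1, k2, k3, k4)]
        t += h
    return x

def sir_erlang(beta, gamma, k):
    # State layout: [S, I_1, ..., I_k, R]
    def f(t, u):
        S, I = u[0], u[1:k + 1]
        lam = beta * sum(I)                 # assumed form: lambda(t) = beta * I(t)
        dS = -lam * S                                          # Eq. (28a)
        dI = [lam * S - gamma * k * I[0]]                      # Eq. (28b)
        dI += [gamma * k * (I[j - 1] - I[j]) for j in range(1, k)]  # Eq. (28c)
        dR = gamma * k * I[k - 1]                              # Eq. (28d)
        return [dS] + dI + [dR]
    return f

beta, gamma, k = 1.5, 0.5, 3            # illustrative values (assumptions)
u0 = [0.99, 0.01] + [0.0] * (k - 1) + [0.0]
u = rk4(sir_erlang(beta, gamma, k), u0, 0.0, 30.0, 3000)

total = sum(u)          # S + I_1 + ... + I_k + R at the final time
susceptible = u[0]
```

Setting `k = 1` in this sketch recovers a single infected compartment with an exponentially distributed dwell time.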
Extended LCT for Poisson process kth event time distributed dwell times
The Standard LCT assumes an Erlang(r, k) distributed dwell time, i.e., a kth event time distribution under a homogeneous Poisson process with rate r. Here we generalize the Standard LCT by assuming the X dwell time follows a more general kth event time distribution under a Poisson process with rate r(t).
First, observe that Eqs. (8) in Lemma 1 are more practical when written in terms of \(\frac{1}{r} g^j_r(t)\) (see the proof of Theorem 1),
$$\begin{aligned} \frac{d}{dt}\bigg [\frac{1}{r} g^1_{r}(t)\bigg ]&=\; -\,r \bigg [\frac{1}{r} g^1_{r}(t)\bigg ], \end{aligned}$$
(29a)
$$\begin{aligned} \frac{d}{dt}\bigg [\frac{1}{r} g^j_{r}(t)\bigg ]&=\; r\bigg [\frac{1}{r} g^{j-1}_{r}(t)-\frac{1}{r} g^{j}_{r}(t)\bigg ], \qquad j\ge 2 \end{aligned}$$
(29b)
where \(\frac{1}{r} g^1_{r}(0)=1\) and \(\frac{1}{r} g^j_{r}(0)=0\) for \(j\ge 2\).
Lemma 2
A similar relationship to Eq. (29) above (i.e., to Lemma 1) holds true for the Poisson process jth event time distribution density functions \(h_{r}^j\) given by Eq. (9a). Specifically,
$$\begin{aligned} \frac{d}{dt}\bigg [\frac{1}{r(t)} h^1_{r}(t,\tau )\bigg ]&=\; -\,r(t) \bigg [\frac{1}{r(t)} h^1_{r}(t,\tau )\bigg ], \end{aligned}$$
(30a)
$$\begin{aligned} \frac{d}{dt}\bigg [\frac{1}{r(t)} h^j_{r}(t,\tau )\bigg ]&=\; r(t)\bigg [\frac{1}{r(t)} h^{j-1}_{r}(t,\tau )-\frac{1}{r(t)} h^{j}_{r}(t,\tau )\bigg ], \end{aligned}$$
(30b)
where \(\frac{1}{r(\tau )}h^1_{r}(\tau ,\tau )=1\) and \(\frac{1}{r(\tau )} h^j_{r}(\tau ,\tau )=0\) for \(j\ge 2\). Note that, if \(r(t)=0\) for some t, this relationship can be written in terms of
$$\begin{aligned} u_r^k(t,\tau )\ \equiv \frac{m(t,\tau )^{k-1}}{(k-1)!}\,e^{-m(t,\tau )}, \end{aligned}$$
(31)
as shown in the proof below, where \(h_r^k(t,\tau )=r(t)\,u_r^k(t,\tau )\), \(u^1_{r}(\tau ,\tau )=1\), and \(u^j_{r}(\tau ,\tau )=0\) for \(j\ge 2\).
Proof
For \(j=1\),
$$\begin{aligned} \frac{d}{dt}\bigg [u^1_{r}(t,\tau )\bigg ] =\; \frac{d}{dt}e^{-m(t,\tau )} =\; -r(t)\,e^{-m(t,\tau )} =\; -r(t)\,u^1_{r}(t,\tau ). \end{aligned}$$
(32)
Likewise, for \(j\ge 2\), we have
$$\begin{aligned} \begin{aligned} \frac{d}{dt}\bigg [u^j_{r}(t,\tau )\bigg ]&=\; \frac{d}{dt} \frac{m(t,\tau )^{j-1}}{(j-1)!}\,e^{-m(t,\tau )} \\&=\; r(t)\, \frac{m(t,\tau )^{j-2}}{(j-2)!}\,e^{-m(t,\tau )} - r(t)\, \frac{m(t,\tau )^{j-1}}{(j-1)!}\,e^{-m(t,\tau )} \\&=\; r(t)\bigg [u^{j-1}_{r}(t,\tau )-u^{j}_{r}(t,\tau )\bigg ]. \end{aligned} \end{aligned}$$
(33)
\(\square \)
Lemma 2 allows us to generalize Erlang-based results like Theorem 2 to their time-varying counterparts with a time-dependent (or state-dependent) rate r(t), as in the following generalization of the Standard LCT (Theorem 2).
Theorem 3
(Extended LCT for dwell times distributed as Poisson process kth event times) Consider the Standard LCT in Theorem 2 but assume the X dwell time is a Poisson process kth event time with rate r(t). The corresponding mean field integral equations, where \(h_r^j\) and \({\mathcal {S}}_r^j\) are given in Eq. (9), are
$$\begin{aligned} x(t)&=\; x_0\,{\mathcal {S}}_r^k(t,0) + \int ^t_{0} {\mathcal {I}}_X(s)\,{\mathcal {S}}_r^k(t,s) ds \end{aligned}$$
(34a)
$$\begin{aligned} y(t)&=\; y_0S_Y(t,0) + \int ^t_{0} \bigg ({\mathcal {I}}_Y(\tau ) + x_0\,h_r^k(\tau ,0) \nonumber \\&\quad +\int ^\tau _0 {\mathcal {I}}_X(s)\,h^k_r(\tau ,s)ds\bigg )S_Y(t,\tau )d\tau . \end{aligned}$$
(34b)
The above Eq. (34), with \(x(t) = \sum _{j=1}^{k} x_j(t)\), are equivalent to
$$\begin{aligned} \frac{d}{dt}{x_1}(t)&=\; {\mathcal {I}}_X(t) -r(t)\, x_1(t) \end{aligned}$$
(35a)
$$\begin{aligned} \frac{d}{dt}{x_j}(t)&=\; r(t)\, x_{j-1}(t) - r(t)\, x_j(t), \quad j=2,\ldots ,k \end{aligned}$$
(35b)
$$\begin{aligned} y(t)&=\; y_0\,S_Y(t,0) + \int ^t_{0} \left( {\mathcal {I}}_Y(\tau ) + r(\tau )\,x_k(\tau )\right) S_Y(t,\tau )d\tau \end{aligned}$$
(35c)
with initial conditions \(x_1(0)=x_0\), \(x_j(0)=0\) for \(j\ge 2\) and
$$\begin{aligned} x_j(t)= x_0\,\frac{1}{r(t)}\,h_r^j(t,0) + \int ^t_{0} {\mathcal {I}}_X(s) \frac{1}{r(t)}\,h_r^j(t,s)ds. \end{aligned}$$
(36)
Equation (35c) may be further reduced to ODEs, e.g., via Corollary 1.
Proof
Substituting Eq. (9b) into Eq. (34a) and substituting Eq. (36) yields \(x(t) = \sum _{j=1}^k x_j(t)\). Differentiating Eq. (36) with \(j=1\) using the Leibniz integral rule as well as Eq. (30a) from Lemma 2 yields Eq. (35a). Likewise, for \(j\ge 2\), differentiation of Eq. (36) and Lemma 2 yields
$$\begin{aligned} \begin{aligned} \frac{d}{dt}x_j(t)&=\; x_0\,r(t)\,\bigg [ \frac{1}{r(t)} h^{j-1}_{r}(t,0) - \frac{1}{r(t)} h^{j}_{r}(t,0) \bigg ] \\&\quad + \int _0^t {\mathcal {I}}_X(s)\,r(t)\bigg [ \frac{1}{r(t)} h^{j-1}_{r}(t,s) - \frac{1}{r(t)} h^{j}_{r}(t,s) \bigg ] ds \\&=\; r(t) \big (x_{j-1}(t) - x_j(t)\big ). \end{aligned} \end{aligned}$$
(37)
Equation (35c) follows from substituting (36) into (34b). The definition of \(x_j\) and initial condition \(x(0)=x_0\) together imply \(x_1(0)=x_0\) and \(x_j(0)=0\) for the remaining \(j\ge 2\). If \(r(t)=0\) for some t, Eq. (35) still hold, since Eqs. (36) and (37) can be rewritten using u as in the proof of Lemma 2. \(\square \)
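The time-varying chain in Theorem 3 can be checked numerically against Eq. (36). The sketch below assumes zero inflow, the rate \(r(t)=1+t\) and \(k=3\), all of which are hypothetical choices, as are the function names. With \({\mathcal {I}}_X=0\), Eq. (36) reduces to \(x_j(t)=x_0\,u_r^j(t,0)\), with \(u_r^j\) as in Eq. (31), and the code compares this closed form with a numerical solution of Eqs. (35a) and (35b).

```python
import math

def rk4(f, x, t0, t1, n):
    # Classical fixed-step RK4 for dx/dt = f(t, x), with x a list of floats
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(x, k3)])
        x = [a + h / 6 * (p + 2 * q + 2 * w + v)
             for a, p, q, w, v in zip(x, k1, k2, k3, k4)]
        t += h
    return x

def rate(t):
    # Hypothetical time-varying rate r(t) (an assumption)
    return 1.0 + t

def m(t, tau):
    # Integral of the assumed rate from tau to t
    return (t - tau) + 0.5 * (t * t - tau * tau)

def u(j, t, tau):
    # Eq. (31): u_r^j(t, tau) = m^(j-1) e^(-m) / (j-1)!
    mm = m(t, tau)
    return mm ** (j - 1) / math.factorial(j - 1) * math.exp(-mm)

def chain(k):
    # Eqs. (35a)-(35b) with zero inflow (an assumption)
    def f(t, x):
        dx = [-rate(t) * x[0]]
        dx += [rate(t) * (x[j - 1] - x[j]) for j in range(1, k)]
        return dx
    return f

k, x0, t_end = 3, 1.0, 1.0
x = rk4(chain(k), [x0] + [0.0] * (k - 1), 0.0, t_end, 2000)
expected = [x0 * u(j, t_end, 0.0) for j in range(1, k + 1)]
max_err = max(abs(a - b) for a, b in zip(x, expected))
```

The same structure applies when the rate depends on other state variables, although a closed form for \(m(t,\tau )\) is then generally unavailable.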
Having generalized the Standard LCT (Lemma 1 and Theorem 2) to include Poisson process kth event time distributed dwell times, we may now address more complex stochastic model assumptions and how they are reflected in the structure of corresponding mean field ODEs.
Transitions to multiple states
Modeling the transition from one state to multiple states following a distributed delay (as illustrated in Fig. 3) can be done under different sets of assumptions about the underlying stochastic processes, particularly with respect to the rules governing how individuals are distributed across multiple recipient states upon exiting X, and how those rules depend on the dwell time distribution(s) for individuals in that state. Importantly, those different sets of assumptions can yield very different mean field models (e.g., see Feng et al. 2016) and so care must be taken to make those assumptions appropriately for a given application.
While modelers have some flexibility to choose appropriate assumptions, in practice modelers sometimes make unintended and undesirable implicit assumptions, especially when constructing ODE models using “rules of thumb” instead of deriving them from first principles. In this section we present results aimed at helping guide (a) the process of picking appropriate dwell time distribution assumptions, and (b) directly constructing corresponding systems of ODEs without deriving them from explicit stochastic models or intermediate integral equations.
Each of the three cases detailed below yields a different mean field ODE model for the scenario depicted in Fig. 3.
First, in Sect. 3.5.1, we consider the extension of Theorem 3 where upon leaving X particles are distributed across \(m\ge 1\) recipient states according to a generalized Bernoulli distribution with (potentially time varying) probabilities/proportions \(p_{j}(t)\), \(j=1,\ldots ,m\). Here the outcome of which state a particle transitions to is independent of the time spent in the first state.
Second, in Sects. 3.5.2 and 3.5.3, particles entering the first state (X) do not all follow the same dwell time distribution in X. Instead, upon entering X they are distributed across \(n\ge 2\) sub-states of X, X\(_i\), according to a generalized Bernoulli distribution, and each sub-state X\(_i\) has a dwell time given by a Poisson process \(k_i\)th event time distribution with rate \(r_i(t)\). That is, the X dwell time is a finite mixture of Poisson process event time distributions. Particles transition out of X into m subsequent states Y\(_j\) according to the probabilities/proportions \(p_{ij}(t)\), the probability of going to Y\(_j\) from X\(_i\), \(i=1,\ldots ,n\) and \(j=1,\ldots ,m\). Here the determination of which recipient state Y\(_j\) a particle transitions to depends on which sub-state of X the particle was assigned to upon entering X (see Fig. 5).
Third, in Sect. 3.5.4, the outcome of which recipient state a particle transitions to upon leaving X is determined by a “race” between multiple competing Poisson process kth event time distributions, and is therefore not independent of the time spent in the first state (as in Sect. 3.5.1), nor is it pre-determined upon entry into X (as in Sects. 3.5.2 and 3.5.3). This result is obtained using yet another novel extension of Lemma 1 in which the dwell time in state X is the minimum of \(n\ge 2\) independent Poisson process event time distributions.
Lastly (Sect. 3.5.5), we describe an equivalence between (1) the more complex case addressed in Sect. 3.5.4 assuming a dwell time that obeys the minimum of Poisson process 1st event times, before being distributed across m recipient states, and (2) the conceptually simpler case in Sect. 3.5.1 where the dwell time follows a single Poisson process 1st event time distribution before being distributed among m recipient states. This is key to understanding the scope of the Generalized Linear Chain Trick in Sect. 3.7.
Transition to multiple states independent of the X dwell time distribution
Here we extend the case in the previous section and assume that, upon leaving state X, particles can transition to one of m states (call them \(Y_i\), \(i=1,\ldots ,m\)), and that a particle leaving X at time t enters state \(Y_i\) with probability \(p_i(t)\), where \(\sum _{i=1}^m p_i(t)=1\) [i.e., particles are distributed across all Y\(_i\) following a generalized Bernoulli distribution with parameter vector \({\mathbf {p}}(t)=(p_1(t),\ldots ,p_m(t))\)]. See Fig. 4 for a simple example with constant \({\mathbf {p}}\) and \(m=2\). An important assumption in this case is that the determination about which state a particle goes to after leaving X is made once it leaves X, and thus the state it transitions to upon exiting X is determined independent of the dwell time in X. Examples from the literature include Model II in Feng et al. (2016), where infected individuals (state X) either recovered (Y\(_0\)) or died (Y\(_1\)) after an Erlang distributed time delay.
Theorem 4
(Extended LCT with proportional output to multiple states) Consider the case addressed by Theorem 3, and further assume particles go to one of m states (call them Y\(_j\)) with \(p_j(t)\) being the probability of going to Y\(_j\). Let \(S_j\) be the survival functions for the dwell times in Y\(_j\).
The mean field integral equations for this case, with \(x(0)=x_0\) and \(y_j(0)=y_{j0}\), are
$$\begin{aligned} x(t)&= x_0\,{\mathcal {S}}_{r}^{k}(t,0) + \int _0^t{\mathcal {I}}_X(s)\,{\mathcal {S}}_{r}^{k}(t,s)\,ds \end{aligned}$$
(38a)
$$\begin{aligned} y_j(t)&= y_j(0)S_j(t,0) + \int _0^t\bigg ({\mathcal {I}}_j(\tau )+p_j(\tau )\bigg ( x_0\,h_{r}^{k}(\tau ,0) \nonumber \\&\quad + \int _0^\tau {\mathcal {I}}_X(s)h_{r}^{k}(\tau ,s)\,ds \bigg )\bigg )S_j(t,\tau ) d\tau \end{aligned}$$
(38b)
These integral equations are equivalent to the following system of equations:
$$\begin{aligned} \frac{d}{dt}{x_1}(t)&=\; {\mathcal {I}}_X(t) -r(t)\, x_1(t) \end{aligned}$$
(39a)
$$\begin{aligned} \frac{d}{dt}{x_i}(t)&=\; r(t)\, x_{i-1}(t) - r(t)\, x_i(t), \quad i=2,\ldots ,k \end{aligned}$$
(39b)
$$\begin{aligned} y_j(t)&=\; y_j(0)S_j(t,0) + \int _0^t\bigg ({\mathcal {I}}_j(\tau )+p_j(\tau )\;r(\tau )\,x_k(\tau )\bigg )S_j(t,\tau ) d\tau \end{aligned}$$
(39c)
where \(x(t)=\sum _{i=1}^k x_i(t)\), \(x_1(0)=x_0\), \(x_i(0)=0\) for \(i\ge 2\), and
$$\begin{aligned} x_i(t)= x_0 \frac{1}{r(t)} h_{r}^{i}(t,0) + \int _0^t{\mathcal {I}}_X(s) \frac{1}{r(t)} h_{r}^{i}(t,s)\,ds. \end{aligned}$$
(40)
Equations (39c) may be further reduced to ODEs, e.g., via Corollary 1.
Proof
Equations (39a), (39b) and (40) follow from Theorem 3. Equation (39c) follows from substitution of Eq. (40) into (38b). The derivation of Eq. (38b) is similar to the derivation in Appendix A.1 but accounts for the expected proportion entering each Y\(_j\) at time t being equal to \(p_j(t)\). \(\square \)
Example 3
Consider the example shown in Fig. 4, where the dwell time distribution for X is Erlang(r, k) and the dwell times in Y and Z follow 1st event times under nonhomogeneous Poisson processes with respective rates \(\mu _Y(t)\) and \(\mu _Z(t)\). By Theorem 4 the corresponding mean field ODEs are
$$\begin{aligned} \frac{d}{dt}{x_1}(t)&=\; {\mathcal {I}}_X(t) -r\, x_1(t) \end{aligned}$$
(41a)
$$\begin{aligned} \frac{d}{dt}{x_i}(t)&=\; r\, x_{i-1}(t) - r\, x_i(t), \quad i=2,\ldots ,k \end{aligned}$$
(41b)
$$\begin{aligned} \frac{d}{dt}{y}(t)&=\; {\mathcal {I}}_Y(t) + p\,r\, x_{k}(t) - \mu _Y(t) y(t) \end{aligned}$$
(41c)
$$\begin{aligned} \frac{d}{dt}{z}(t)&=\; {\mathcal {I}}_Z(t) + (1-p)\,r\, x_{k}(t) - \mu _Z(t) z(t). \end{aligned}$$
(41d)
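Equations (41) can be simulated as follows. This is a minimal sketch under simplifying assumptions, not a general implementation: the external inflows \({\mathcal {I}}_Y\) and \({\mathcal {I}}_Z\) are set to zero, the loss rates \(\mu _Y\) and \(\mu _Z\) are set to zero in the demonstration so that all mass leaving X accumulates in Y and Z, and the parameter values and function names are hypothetical. Under these assumptions the long-run amounts in Y and Z should approach \(p\,x_0\) and \((1-p)\,x_0\).

```python
def rk4(f, x, t0, t1, n):
    # Classical fixed-step RK4 for dx/dt = f(t, x), with x a list of floats
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(x, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(x, k3)])
        x = [a + h / 6 * (p + 2 * q + 2 * u + v)
             for a, p, q, u, v in zip(x, k1, k2, k3, k4)]
        t += h
    return x

def example3_rhs(r, k, p, inflow_x, mu_y, mu_z):
    # State layout: [x_1, ..., x_k, y, z]; I_Y = I_Z = 0 is assumed here
    def f(t, s):
        x, y, z = s[:k], s[k], s[k + 1]
        dx = [inflow_x(t) - r * x[0]]                       # Eq. (41a)
        dx += [r * (x[i - 1] - x[i]) for i in range(1, k)]  # Eq. (41b)
        out = r * x[k - 1]                                  # flux leaving X
        dy = p * out - mu_y(t) * y                          # Eq. (41c)
        dz = (1 - p) * out - mu_z(t) * z                    # Eq. (41d)
        return dx + [dy, dz]
    return f

zero = lambda t: 0.0
r, k, p, x0 = 2.0, 3, 0.3, 1.0          # illustrative values (assumptions)
f = example3_rhs(r, k, p, zero, zero, zero)
state = rk4(f, [x0] + [0.0] * (k - 1) + [0.0, 0.0], 0.0, 20.0, 4000)

y_final, z_final = state[k], state[k + 1]
```

Passing nonzero callables for `mu_y` and `mu_z` gives the time-varying loss terms of Eqs. (41c) and (41d).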
Transition from sub-states of X with differing dwell time distributions and differing output distributions across states Y\(_j\)
In this second case, particles in state X can be treated as belonging to a heterogeneous population, where each remains in that state according to one of N possible dwell time distributions, the ith of these being the \(k_i\)th event time distribution under a Poisson process with rate \(r_i(t)\). Each particle is assigned one of these N dwell time distributions (i.e., it is assigned to sub-state X\(_i\)) upon entry into X according to a generalized Bernoulli distribution with a (potentially time varying) probability vector \(\varvec{\rho }(t)=(\rho _1(t),\ldots ,\rho _N(t))\), \(\sum _{i=1}^N \rho _i(t) =1\). In contrast to the previous case, here the outcome of which recipient state a particle transitions to is not necessarily independent of the dwell time distribution.
Note that the dwell time distribution in this case is a finite mixture of N independent Poisson process event time distributions. If a random variable T is a mixture of Erlang distributions, or more generally a mixture of N independent Poisson process event time distributions, then the corresponding density function (f) and survival function (\({\varPhi }\)) are
$$\begin{aligned} f_\theta (t,\tau )&=\; \sum _{i=1}^N \rho _i(\tau )\, h_{r_i}^{k_i}(t,\tau ) \end{aligned}$$
(42a)
$$\begin{aligned} {\varPhi }_\theta (t,\tau )&=\; \sum _{i=1}^N \rho _i(\tau )\, {\mathcal {S}}_{r_i}^{k_i}(t,\tau ) =\; \sum _{i=1}^N \rho _i(\tau ) \sum _{j=1}^{k_i} \frac{1}{r_{i}(t)}\,h_{r_{i}}^j(t,\tau ) \end{aligned}$$
(42b)
where \(\varvec{\theta }(t)=(\rho _1(t), r_1(t), k_1, \ldots , \rho _N(t), r_N(t), k_N)\) is the potentially time varying parameter vector for the N distributions that constitute the mixture distribution. Note that if all \(r_i(t)=r_i\) are constant, this is a mixture of Erlang distributions, or if also all \(k_i=1\), a mixture of exponentials.
Theorem 5
(Extended LCT for dwell times given by mixtures of Poisson process event time distributions and outputs to multiple states) Consider a continuous time state transition model with inflow rate \({\mathcal {I}}_X(t)\) into state X. Assume that the duration of time spent in state X follows a finite mixture of N independent Poisson process event time distributions. That is, X can be partitioned into N sub-states X\(_i\), \(i=1,\ldots ,N\), each with a dwell time distribution given by a Poisson process \(k_i\)th event time distribution with rate \(r_i(t)\). Suppose the inflow to state X at time t is distributed among this partition according to a generalized Bernoulli distribution with probabilities \(\rho _i(t)\), where \(\sum _{i=1}^{N} \rho _i(t) = 1\), so that the input rate to X\(_i\) is \(\rho _i(t){\mathcal {I}}_X(t)\). Assume that particles leaving sub-state X\(_i\) then transition to state Y\(_\ell \) with probability \(p_{i\ell }(t)\), \(\ell =1,\ldots ,m\), where the duration of time spent in state Y\(_\ell \) follows a delay distribution given by survival function \(S_\ell \). Then we can partition each X\(_i\) into X\(_{ij}\), \(j=1,\ldots ,k_i\), according to Theorem 3 and let x(t), \(x_i(t)\), \(x_{ij}(t)\), and \(y_\ell (t)\) be the amounts in states X, X\(_i\), X\(_{ij}\), and Y\(_\ell \) at time t, respectively. Assume non-negative initial conditions \(x(0)=x_0\), \(x_i(0)=\rho _i(0)x_0\), \(x_{i1}(0)=\rho _i(0)\,x_0\), \(x_{ij}(0)=0\) for \(j\ge 2\), and \(y_\ell (0)\ge 0\).
The mean field integral equations for this scenario are
$$\begin{aligned} x(t)&=\; x_0\,{\varPhi }_{\theta }(t,0) + \int ^t_{0} {\mathcal {I}}_X(s)\,{\varPhi }_{\theta }(t,s) ds \end{aligned}$$
(43a)
$$\begin{aligned} y_\ell (t)&=\; y_\ell (0) S_\ell (t,0) + \int ^t_{0} \bigg ({\mathcal {I}}_\ell (\tau ) + \sum _{i=1}^N p_{i\ell }(\tau ) \bigg ( x_0\, \rho _i(0) \,h_{r_i}^{k_i}(\tau ,0) \nonumber \\&\quad +\int ^\tau _0 \rho _i(s)\, {\mathcal {I}}_X(s)\,h_{r_i}^{k_i}(\tau , s) ds\bigg )\bigg ) S_\ell (t,\tau )d\tau . \end{aligned}$$
(43b)
The above system of equations (43) is equivalent to
$$\begin{aligned} \frac{d}{dt}{x_{i1}}(t)&=\; \rho _i(t)\, {\mathcal {I}}_X(t) - r_i(t)\,x_{i1}(t), \quad i=1,\ldots ,N \end{aligned}$$
(44a)
$$\begin{aligned} \frac{d}{dt}{x_{ij}}(t)&=\; r_i(t) \big (x_{i,j-1}(t) - x_{ij}(t)\big ), \quad i=1,\ldots ,N; \; j=2,\ldots ,k_i \end{aligned}$$
(44b)
$$\begin{aligned} y_\ell (t)&=\; y_\ell (0) S_\ell (t,0) + \int ^t_{0} \bigg ({\mathcal {I}}_\ell (\tau ) \nonumber \\&\quad + \sum _{i=1}^N r_i(\tau )\,x_{ik_i}(\tau )\,p_{i\ell }(\tau ) \bigg ) S_\ell (t,\tau )d\tau \end{aligned}$$
(44c)
with initial conditions \(x_{i1}(0)=\rho _i(0)\,x_0\), \(x_{ij}(0)=0\) for \(j\ge 2\), where \(x(t) = \sum _{i=1}^{N} x_i(t)\), and \(x_i(t) = \sum _{j=1}^{k_i} x_{ij}(t)\). The amounts in X\(_i\) and X\(_{ij}\) are
$$\begin{aligned} x_i(t)&=\; \rho _i(0)\,x_0\,{\mathcal {S}}_{r_i}^{k_i}(t,0) + \int ^t_{0} \rho _i(s)\,{\mathcal {I}}_X(s)\,{\mathcal {S}}_{r_i}^{k_i}(t,s) ds \end{aligned}$$
(45)
$$\begin{aligned} x_{ij}(t)&=\; \rho _i(0)\,x_0 \frac{h_{r_i}^j(t,0)}{r_i(t)} + \int ^t_{0} \rho _i(s)\,{\mathcal {I}}_X(s)\;\frac{h_{r_i}^j(t,s)}{r_i(t)} \,ds. \end{aligned}$$
(46)
Equations (44c) may be reduced to ODEs, e.g., via Corollary 1.
Proof
Substituting Eq. (42b) into Eq. (43a) and then substituting Eq. (45) yields \(x(t) = \sum _{i=1}^{N} x_{i}(t)\). Applying Theorem 3 to each X\(_i\) [i.e., to each Eq. (45)] then yields Eqs. (46), (44a) and (44b). (Alternatively, one could prove this directly by differentiating Eq. (46) using Eq. (30) from Lemma 2). The \(y_\ell (t)\) equations (44c) are obtained from (43b) by substitution of Eq. (46). \(\square \)
Example 4
Consider the scenario in Fig. 5, where particles entering state X at rate \({\mathcal {I}}_X(t)\) enter sub-state X\(_i\) with probability \(\rho _i\), \(\rho _1+\rho _2+\rho _3=1\), and the X\(_i\) dwell time is Erlang\((r_i,k_i)\) distributed. Particles exiting X\(_1\) and X\(_2\) transition to Y with probability 1, while particles exiting X\(_3\) transition either to state Y or Z with probabilities \(p_Y\) and \(p_Z=1-p_Y\). Assume particles may also enter states Y and Z from sources other than state X (at rates \({\mathcal {I}}_Y(t)\) and \({\mathcal {I}}_Z(t)\), respectively), and the dwell times in those two states follow the 1st event times of independent nonhomogeneous Poisson processes with rates \(\mu _Y(t)\) and \(\mu _Z(t)\). Theorem 5 yields the following mean field ODEs (see Fig. 5).
$$\begin{aligned} \frac{d}{dt}{x_{i,1}}(t)&=\; \rho _i\, {\mathcal {I}}_X(t) - r_i x_{i,1}(t), \quad i=1,\ldots ,3, \end{aligned}$$
(47a)
$$\begin{aligned} \frac{d}{dt}{x_{i,j}}(t)&=\; r_i \big (x_{i,j-1}(t) - x_{ij}(t)\big ), \quad j=2,\ldots ,k_i \end{aligned}$$
(47b)
$$\begin{aligned} \frac{d}{dt}{y}(t)&=\; {\mathcal {I}}_Y(t) + r_1 \, x_{1,k_1}(t) + r_2\, x_{2,k_2}(t) \nonumber \\&\quad + r_3 \,p_Y\, x_{3,k_3}(t) - \mu _Y(t) y(t) \end{aligned}$$
(47c)
$$\begin{aligned} \frac{d}{dt}{z}(t)&=\; {\mathcal {I}}_Z(t) + r_3\,p_Z\,x_{3,k_3}(t) - \mu _Z(t) z(t) . \end{aligned}$$
(47d)
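The following sketch checks the X sub-state equations (47a) and (47b) numerically under simplifying assumptions that are ours rather than the text's: no inflow (\({\mathcal {I}}_X(t)=0\)), a unit initial amount \(x_0=1\), and hypothetical values for \(\rho_i\), \(r_i\) and \(k_i\). In that setting the total amount in X should track the Erlang mixture survival function, as in Eq. (43a). The hand-written fourth-order Runge-Kutta stepper is only a convenient stand-in for any ODE solver.

```python
import math

def erlang_survival(r, k, t):
    """P(T > t) for an Erlang(rate r, shape k) dwell time."""
    return sum(math.exp(-r * t) * (r * t) ** j / math.factorial(j) for j in range(k))

def rhs(state, rhos, rates, inflow):
    """Eqs. (47a)-(47b); state[i][j] is the amount in the (j+1)th sub-state of X_{i+1}."""
    derivs = []
    for rho, r, chain in zip(rhos, rates, state):
        d = [rho * inflow - r * chain[0]]
        d += [r * (chain[j - 1] - chain[j]) for j in range(1, len(chain))]
        derivs.append(d)
    return derivs

def axpy(state, derivs, h):
    """Return state + h * derivs for nested lists."""
    return [[s + h * d for s, d in zip(sc, dc)] for sc, dc in zip(state, derivs)]

def rk4_step(state, dt, rhos, rates, inflow):
    """One classical Runge-Kutta step (inflow treated as constant over the step)."""
    k1 = rhs(state, rhos, rates, inflow)
    k2 = rhs(axpy(state, k1, dt / 2), rhos, rates, inflow)
    k3 = rhs(axpy(state, k2, dt / 2), rhos, rates, inflow)
    k4 = rhs(axpy(state, k3, dt), rhos, rates, inflow)
    out = axpy(state, k1, dt / 6)
    out = axpy(out, k2, dt / 3)
    out = axpy(out, k3, dt / 3)
    return axpy(out, k4, dt / 6)

# Hypothetical parameters: three Erlang sub-states, no inflow, x0 = 1.
rhos, rates, shapes, x0 = [0.5, 0.3, 0.2], [1.0, 2.0, 0.5], [2, 3, 1], 1.0
state = [[rho * x0] + [0.0] * (k - 1) for rho, k in zip(rhos, shapes)]
dt, n_steps = 0.001, 2000
for _ in range(n_steps):
    state = rk4_step(state, dt, rhos, rates, 0.0)
t_end = dt * n_steps

x_ode = sum(sum(chain) for chain in state)
x_exact = x0 * sum(rho * erlang_survival(r, k, t_end)
                   for rho, r, k in zip(rhos, rates, shapes))
```

Here `x_ode` and `x_exact` should agree to within the integration error, which illustrates the equivalence between the ODE form and the integral form for this special case.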
Extended LCT for dwell times given by finite mixtures of Poisson process event time distributions
It’s worth noting that in some applied contexts one may want to approximate a non-Erlang delay distribution with a mixture of Erlang distributions (see Sect. 3.7.1 and Appendix B for more details on making such approximations). Theorem 5 above details how assuming such a mixture distribution would be reflected in the structure of the corresponding mean field ODEs. This case can also be addressed in the more general context provided in Sect. 3.7.1.
Transition to multiple states following “competing” Poisson processes
We now consider the case where T, the time a particle spends in a given state X, follows the distribution given by \(T=\min _i T_i\), the minimum of \(n\ge 2\) independent random variables \(T_i\), where \(T_i\) has either an Erlang(\(r_i,k_i\)) distribution or, more generally, a Poisson process \(k_i\)th event time distribution with rate \(r_i(t)\). Upon leaving state X, particles have the possibility of transitioning to any of m recipient states \(Y_\ell \), \(\ell =1,\ldots ,m\), where the probability of transitioning to state Y\(_\ell \) depends on which of the n random variables \(T_i\) was the minimum. That is, if a particle leaves X at time \(T=T_i=t\), then the probability of entering state Y\(_\ell \) is \(p_{i\ell }(t)\).
The distribution associated with T is not itself an Erlang distribution or a Poisson process event time distribution; however, its survival function is the product of such survival functions (see Footnote 7), i.e.,
$$\begin{aligned} {\mathscr {S}}(t,\tau )\equiv \prod _{i=1}^{n}{\mathcal {S}}_{r_i}^{k_i}(t,\tau ). \end{aligned}$$
(48)
As detailed below, we can further generalize the recursion relation in Lemma 1 for the distributions just described above, which can then be used to produce a mean field system of ODEs based on appropriately partitioning X into sub-states.
Before considering this case in general, it is helpful to first describe the sub-states of X imposed by assuming the dwell time distribution described above, particularly the case where the distribution for each \(T_i\) is based on 1st event times (i.e., all \(k_i=1\)). Recall that the minimum of n exponential random variables (which we may think of as 1st event times under a homogeneous Poisson process) is exponential with a rate that is the sum of the individual rates \(r=\sum _{i=1}^n r_i\). More generally, the minimum of n 1st event times under independent Poisson processes with rates \(r_i(t)\) is itself distributed as the 1st event time under a single Poisson process with rate \(r(t)\equiv \sum _{i=1}^n r_i(t)\), thus in this case \({\mathscr {S}}(t,\tau )= \prod _{i=1}^{n}{\mathcal {S}}_{r_i}^{1}(t,\tau )={\mathcal {S}}_{r}^{1}(t,\tau )\). Additionally, if particles leaving state X are then distributed across the recipient states Y\(_\ell \) as described above, then this scenario is equivalent to the proportional outputs case described in Theorem 4 with a dwell time that follows a Poisson process 1st event time distribution with rate \(r(t)\equiv \sum _{i=1}^n r_i(t)\) and a probability vector \(p_\ell (t) = \sum _{i=1}^n p_{i\ell }(t)r_i(t)/r(t)\), since \(P(T=T_i)=r_i(T)/r(T)\). (The mean field equivalence of these two cases is detailed in Sect. 3.5.5.) Thus, the natural partitioning of X in this case is into sub-states with dwell times that follow iid 1st event time distributions with rate \(r(t)\equiv \sum _{i=1}^{n} r_i(t)\).
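The two facts recalled above for the homogeneous case (the minimum of independent exponentials is exponential with rate \(r=\sum_i r_i\), and \(P(T=T_i)=r_i/r\)) can be illustrated with a small Monte Carlo sketch. The function name `simulate_min`, the rates, the sample size and the seed are all illustrative assumptions.

```python
import random

def simulate_min(rates, n_samples, seed=0):
    """Draw independent exponential times T_i and record which one is the minimum.

    Returns (win_freq, mean_min): the empirical frequency with which each T_i
    is the minimum, and the sample mean of T = min_i T_i.
    """
    rng = random.Random(seed)
    counts = [0] * len(rates)
    total = 0.0
    for _ in range(n_samples):
        draws = [rng.expovariate(r) for r in rates]
        t_min = min(draws)
        counts[draws.index(t_min)] += 1
        total += t_min
    return [c / n_samples for c in counts], total / n_samples

rates = [0.5, 1.0, 2.0]              # hypothetical constant rates r_i
r_total = sum(rates)
win_freq, mean_min = simulate_min(rates, 100_000)
expected_freq = [r / r_total for r in rates]   # r_i / r
expected_mean = 1.0 / r_total                  # mean of an exponential with rate r
```

Up to sampling error, `win_freq` should be close to `expected_freq` and `mean_min` close to `expected_mean`.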
We may now describe the mean field ODEs for the more general case using the following notation. To index the sub-states of X, consider the ith Poisson process and its \(k_i\)th event time distribution which defines the distribution of \(T_i\). Let \(a_i\in \{1,\ldots ,k_i\}\) denote the event number a particle is awaiting under the ith Poisson process. Then we can describe the particle’s progress through X according to its progress along each of these n Poisson processes using the index vector \(\alpha \in {\mathcal {K}}\), where
$$\begin{aligned} {\mathcal {K}}=\{(a_1,a_2,\ldots ,a_n)\;|\;a_j\in \{1,\ldots ,k_j\}\}. \end{aligned}$$
(49)
Let \({\mathcal {K}}_{i}\subset {\mathcal {K}}\) denote the subset of indices where \(a_i=k_i\) (where we think of particles in these sub-states as being poised to reach the \(k_i\)th event related to the ith Poisson process, and thus poised to transition out of state X).
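To make the index sets concrete, the sketch below enumerates \({\mathcal {K}}\) and \({\mathcal {K}}_i\) for a small hypothetical case with \(n=2\), \(k_1=2\) and \(k_2=3\). The helper names `index_set` and `poised_subset` are ours, and the process index is 0-based in the code.

```python
from itertools import product

def index_set(ks):
    """All index vectors (a_1, ..., a_n) with a_j in {1, ..., k_j}, as in Eq. (49)."""
    return list(product(*(range(1, k + 1) for k in ks)))

def poised_subset(ks, i):
    """K_i: index vectors with a_i = k_i (i is 0-based here)."""
    return [alpha for alpha in index_set(ks) if alpha[i] == ks[i]]

ks = (2, 3)                   # hypothetical event numbers k_1 = 2, k_2 = 3
K = index_set(ks)             # k_1 * k_2 = 6 sub-states of X
K_1 = poised_subset(ks, 0)    # sub-states awaiting the k_1-th event of process 1
K_2 = poised_subset(ks, 1)    # sub-states awaiting the k_2-th event of process 2
```

The number of sub-states is \(\prod_i k_i\), so the size of the resulting ODE system grows multiplicatively with the number of competing processes.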
To extend Lemma 2 for these distributions, define
$$\begin{aligned} u(t,\tau ,\alpha ) \equiv \prod _{i=1}^{n} e^{-m_i(t,\tau )} \frac{m_i(t,\tau )^{a_i-1}}{(a_i-1)!} \end{aligned}$$
(50)
where \(m_i(t,\tau )=\int _{\tau }^t r_i(s)\,ds\), so that \(u(\tau ,\tau ,\alpha )=1\) if \(\alpha =(1,\ldots ,1)\) and \(u(\tau ,\tau ,\alpha )=0\) otherwise. Note that \(\prod _{i=1}^{n} h_{r_i}^{a_i}(t,\tau ) = u(t,\tau ,\alpha )\prod _{i=1}^n r_i(t)\) (cf. Lemma 2). Applying Eq. (9b) to each factor of \({\mathscr {S}}(t,\tau )\) in Eq. (48) [cf. Eqs. (31) and (9b)], the survival function can be written
$$\begin{aligned} {\mathscr {S}}(t,\tau ) = \,\sum _{\alpha \in {\mathcal {K}}}u(t,\tau ,\alpha ). \end{aligned}$$
(51)
We will also refer to the quantities u and \({\mathscr {S}}\) with the jth element of each product (in u) removed using the notation
$$\begin{aligned} u_{{\setminus } j}(t,\tau ,\alpha )&\equiv \; \prod _{i=1, i\ne j}^n e^{-m_i(t,\tau )} \frac{m_i(t,\tau )^{a_i-1}}{(a_i-1)!} \end{aligned}$$
(52a)
$$\begin{aligned} {\mathscr {S}}_{{\setminus } j}(t,\tau )&\equiv \; \sum _{\alpha \in {\mathcal {K}}_j} u_{{\setminus } j}(t,\tau ,\alpha ). \end{aligned}$$
(52b)
This brings us to the following lemma, which generalizes Lemma 1 and Lemma 2 to distributions that are the minimum of n different (independent) Poisson process event times. As with the above lemmas, Lemma 3 will allow one to partition X into sub-states corresponding to each of the event indices in \({\mathcal {K}}\) describing the various stages of progress along each Poisson process prior to the first of them reaching the target event number.
Lemma 3
For u as defined in Eq. (50), differentiation with respect to t yields
$$\begin{aligned} \frac{d}{dt}u(t,\tau ,\alpha ) = \; \sum _{j=1}^n r_j(t)\,u(t,\tau ,\alpha _{j,-1})\,\mathbb {1}_{[a_j>1]}(\alpha ) - \sum _{j=1}^n r_j(t) u(t,\tau ,\alpha ) \end{aligned}$$
(53)
where the notation \(\alpha _{j,-1}\) denotes the index vector generated by decrementing the jth element of \(\alpha \), \(a_j\) [assuming \(a_j>1\); for example, \(\alpha _{2,-1}=(a_1,a_2-1,\ldots ,a_n)\)], and the indicator function \(\mathbb {1}_{[a_j>1]}(\alpha )\) is 1 if \(a_j>1\) and 0 otherwise.
Proof
Using the definition of u in Eq. (50) above, it follows that
$$\begin{aligned} \begin{aligned}&\frac{d}{dt}u(t,\tau ,\alpha ) =\; \frac{d}{dt}\prod _{i=1}^{n} e^{-m_i(t,\tau )} \frac{m_i(t,\tau )^{a_i-1}}{(a_i-1)!}\\&\quad =\sum _{j=1}^n\bigg (\prod _{\begin{array}{c} i=1\\ i\ne j \end{array}}^n e^{-m_i(t,\tau )} \frac{m_i(t,\tau )^{a_i-1}}{(a_i-1)!}\bigg )\bigg [-r_j(t)e^{-m_j(t,\tau )} \frac{m_j(t,\tau )^{a_j-1}}{(a_j-1)!} \\&\qquad + \mathbb {1}_{[a_j>1]}(\alpha )\,r_j(t)\,e^{-m_j(t,\tau )}\frac{m_j(t,\tau )^{a_j-2}}{(a_j-2)!}\bigg ] \\&\quad =\sum _{j=1}^n -r_j(t)\,\prod _{i=1}^{n} e^{-m_i(t,\tau )} \frac{m_i(t,\tau )^{a_i-1}}{(a_i-1)!} \; \\&\qquad + \sum _{j=1}^n \mathbb {1}_{[a_j>1]}(\alpha )\,r_j(t)\,e^{-m_j(t,\tau )}\frac{m_j(t,\tau )^{a_j-2}}{(a_j-2)!} \prod _{\begin{array}{c} i=1\\ i\ne j \end{array}}^n e^{-m_i(t,\tau )} \frac{m_i(t,\tau )^{a_i-1}}{(a_i-1)!} \\&\quad = \sum _{j=1}^n r_j(t)\,u(t,\tau ,\alpha _{j,-1})\,\mathbb {1}_{[a_j>1]}(\alpha ) - \sum _{j=1}^n r_j(t) u(t,\tau ,\alpha ). \end{aligned} \end{aligned}$$
(54)
\(\square \)
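Lemma 3 and Eq. (51) can be spot-checked numerically. The sketch below assumes constant rates, so that the cumulative rate is \(m_i(t,\tau)=r_i\,(t-\tau)\) (consistent with \(\frac{d}{dt}m_i=r_i\) as used in the proof), and compares the right-hand side of Eq. (53) with a central finite difference. The parameter values and function names are illustrative assumptions.

```python
import math
from itertools import product

def u(t, tau, alpha, rates):
    """Eq. (50) with constant rates, i.e. m_i(t, tau) = r_i * (t - tau)."""
    val = 1.0
    for a, r in zip(alpha, rates):
        m = r * (t - tau)
        val *= math.exp(-m) * m ** (a - 1) / math.factorial(a - 1)
    return val

def du_dt_lemma3(t, tau, alpha, rates):
    """Right-hand side of Eq. (53)."""
    gain = 0.0
    for j, r in enumerate(rates):
        if alpha[j] > 1:  # indicator 1_[a_j > 1]
            prev = alpha[:j] + (alpha[j] - 1,) + alpha[j + 1:]  # alpha_{j,-1}
            gain += r * u(t, tau, prev, rates)
    return gain - sum(rates) * u(t, tau, alpha, rates)

def erlang_survival(r, k, t):
    return sum(math.exp(-r * t) * (r * t) ** j / math.factorial(j) for j in range(k))

rates, ks = (0.7, 1.5), (2, 3)       # hypothetical r_i and k_i
t, tau, alpha, h = 1.3, 0.2, (2, 3), 1e-6

# Central finite difference approximation of d/dt u(t, tau, alpha).
fd = (u(t + h, tau, alpha, rates) - u(t - h, tau, alpha, rates)) / (2 * h)
exact = du_dt_lemma3(t, tau, alpha, rates)

# Eq. (51): summing u over K recovers the product of Erlang survival functions.
K = product(*(range(1, k + 1) for k in ks))
sum_u = sum(u(t, tau, a, rates) for a in K)
prod_S = math.prod(erlang_survival(r, k, t - tau) for r, k in zip(rates, ks))
```

Agreement between `fd` and `exact`, and between `sum_u` and `prod_S`, is a sanity check on the formulas rather than a proof.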
The next theorem details the LCT extension that follows from Lemma 3.
Theorem 6
(Extended LCT for dwell times given by competing Poisson processes) Consider a continuous time dynamical system model of mass transitioning among multiple states, with inflow rate \({\mathcal {I}}_X(t)\) to a state X. The distribution of time spent in state X (call it T) is the minimum of n random variables, i.e., \(T=\min _{i}(T_i)\), \(i=1,\ldots ,n\), where \(T_i\) are either Erlang(\(r_i,k_i\)) distributed or follow the more general (nonhomogeneous) Poisson process \(k_i\)th event time distribution with rate \(r_i(t)\). Assume particles leaving X can enter one of m states Y\(_\ell \), \(\ell =1,\ldots ,m\). If a particle leaves X at time \(T_i\) (i.e., \(T_i\) occurred first, so \(T=T_i\)), then the particle transitions into state \(Y_\ell \) with probability \(p_{i\ell }(T)\). Let x(t) and \(y_\ell (t)\) be the amounts in states X and Y\(_\ell \), respectively, at time t, and assume non-negative initial conditions.
The mean field integral equations for this scenario, for \(\ell =1,\ldots ,m\) and \(i=1,\ldots ,n\), are
$$\begin{aligned} x(t)&=\; x_0\,{\mathscr {S}}(t,0) + \int ^t_{0} {\mathcal {I}}_X(s)\,{\mathscr {S}}(t,s)ds \end{aligned}$$
(55a)
$$\begin{aligned} y_\ell (t)&=\; y_{\ell }(0)S_\ell (t,0) + \int ^t_{0} \bigg ({\mathcal {I}}_\ell (\tau ) + \sum _{i=1}^n p_{i\ell }(\tau ) \bigg ( x_0\,{\mathscr {S}}_{{\setminus } i}(\tau ,0)\,h_{r_i}^{k_i}(\tau ,0) \; \nonumber \\&\quad + \int _{0}^{\tau } {\mathcal {I}}_X(s) {\mathscr {S}}_{{\setminus } i}(\tau ,s)\,h_{r_i}^{k_i}(\tau ,s) ds\bigg )\bigg )S_\ell (t,\tau )d\tau . \end{aligned}$$
(55b)
Equations (55) above are equivalent to
$$\begin{aligned} \frac{d}{dt}x_{(1,\ldots ,1)}(t)&=\; {\mathcal {I}}_X(t) - r(t)\,x_{(1,\ldots ,1)}(t), \end{aligned}$$
(56a)
$$\begin{aligned} \frac{d}{dt}x_\alpha (t)&=\;\sum _{i=1}^{n} r_i(t)\,x_{\alpha _{i,-1}}(t)\,\mathbb {1}_{[a_i>1]}(\alpha ) \;-\; r(t) \, x_{\alpha }(t) \end{aligned}$$
(56b)
$$\begin{aligned} y_\ell (t)&=\; y_{\ell }(0) S_\ell (t,0) \nonumber \\&\quad + \int ^t_{0} \bigg ({\mathcal {I}}_\ell (\tau ) + \sum _{i=1}^n p_{i\ell }(\tau )\sum _{\alpha \in {\mathcal {K}}_i} r_i(\tau )\,x_\alpha (\tau ) \bigg )S_\ell (t,\tau )d\tau \end{aligned}$$
(56c)
for all \(\alpha \in {\mathcal {K}}{\setminus }(1,\ldots ,1)\), \(r(t)=\sum _{i=1}^{n}r_i(t)\), \(x(t)=\sum _{\alpha \in {\mathcal {K}}} x_\alpha (t)\), and
$$\begin{aligned} x_\alpha (t) = x_0\,u(t,0,\alpha ) + \int _0^t{\mathcal {I}}_X(s)\,u(t,s,\alpha )\,ds. \end{aligned}$$
(57)
The \(y_\ell (t)\) equations (56c) may be further reduced to a system of ODEs, e.g., via Corollary 1.
Proof
Substituting Eq. (51) into Eq. (55a) yields
$$\begin{aligned} \begin{aligned} x(t)&= x_0\,\sum _{\alpha \in {\mathcal {K}}} u(t,0,\alpha ) + \int ^t_{0} {\mathcal {I}}_X(s)\,\sum _{\alpha \in {\mathcal {K}}}u(t,s,\alpha )ds \\&= \sum _{\alpha \in {\mathcal {K}}} \bigg ( x_0\, u(t,0,\alpha ) + \int ^t_{0} {\mathcal {I}}_X(s)\, u(t,s,\alpha )ds \bigg ) = \sum _{\alpha \in {\mathcal {K}}} x_\alpha (t). \end{aligned} \end{aligned}$$
(58)
Differentiating Eq. (57) yields Eqs. (56a) and (56b) as follows. First, if \(\alpha =(1,\ldots ,1)\) then by Lemma 3
$$\begin{aligned} \begin{aligned} \frac{d}{dt}{x_{(1,\ldots ,1)}}(t)&=\; -x_0\,\sum _{i=1}^n r_i(t)\,u(t,0,\alpha ) \\&\quad - \sum _{i=1}^n r_i(t)\,\int ^t_{0} {\mathcal {I}}_X(s)\, u(t,s,\alpha )ds + {\mathcal {I}}_X(t) \\&=\; {\mathcal {I}}_X(t) - \sum _{i=1}^n r_i(t)\, x_{(1,\ldots ,1)}(t). \end{aligned} \end{aligned}$$
(59)
Next, if \(\alpha \) has any \(a_i>1\), differentiating Eq. (57) and applying Lemma 3 yields
$$\begin{aligned} \begin{aligned}&\frac{d}{dt}{x_\alpha }(t) =\; x_0\, \frac{d}{dt}u(t,0,\alpha ) + \int ^t_{0} {\mathcal {I}}_X(s)\; \frac{d}{dt}u(t,s,\alpha ) \,ds \\&\quad =\; x_0\, \bigg ( \sum _{i=1}^{n} r_i(t)u(t,0,\alpha _{i,-1})\,\mathbb {1}_{[a_i>1]}(\alpha ) - \sum _{i=1}^{n}r_i(t)u(t,0,\alpha ) \bigg ) \; \\&\qquad + \int ^t_{0} {\mathcal {I}}_X(s)\bigg ( \sum _{i=1}^{n} r_i(t)\,u(t,s,\alpha _{i,-1})\,\mathbb {1}_{[a_i>1]}(\alpha ) - \sum _{i=1}^{n}r_i(t)\,u(t,s,\alpha )\bigg )\,ds \\&\quad =\; \sum _{i=1}^{n} r_i(t)\,x_{\alpha _{i,-1}}(t)\,\mathbb {1}_{[a_i>1]}(\alpha ) - \sum _{i=1}^{n} r_i(t)\,x_{\alpha }(t). \end{aligned} \end{aligned}$$
(60)
Note that, by the definitions of \(x_\alpha \) and u, the initial condition \(x(0)=x_0\) becomes \(x_{(1,\ldots ,1)}(0)=x_0\) and \(x_\alpha (0)=0\) for the remaining \(\alpha \in {\mathcal {K}}\).
Eqs. (55b) become (56c), where \({\mathcal {K}}_i=\{\alpha \;|\;\alpha \in {\mathcal {K}},\;a_i=k_i\}\), by substituting Eqs. (52), (57), and \({\mathscr {S}}_{{\setminus } i}(t,\tau )\,h_{r_i}^{k_i}(t,\tau ) = \sum _{\alpha \in {\mathcal {K}}_i} r_i(t)\,u(t,\tau ,\alpha )\), which yields
$$\begin{aligned} \begin{aligned}&x_0\,{\mathscr {S}}_{{\setminus } i}(\tau ,0)h_{r_i}^{k_i}(\tau ,0) + \int _{0}^{\tau } {\mathcal {I}}_X(s) {\mathscr {S}}_{{\setminus } i}(\tau ,s)h_{r_i}^{k_i}(\tau ,s) ds \; \\&\quad = x_0 \sum _{\alpha \in {\mathcal {K}}_i} r_i(\tau )\,u(\tau ,0,\alpha ) + \int _{0}^{\tau } {\mathcal {I}}_X(s) \sum _{\alpha \in {\mathcal {K}}_i} r_i(\tau )\,u(\tau ,s,\alpha ) ds \; \\&\quad = r_i(\tau ) \sum _{\alpha \in {\mathcal {K}}_i} \bigg (x_0\, u(\tau ,0,\alpha ) + \int _{0}^{\tau } {\mathcal {I}}_X(s) u(\tau ,s,\alpha ) ds \bigg ) =\; \sum _{\alpha \in {\mathcal {K}}_i}r_i(\tau ) \, x_\alpha (\tau ). \\ \end{aligned} \end{aligned}$$
(61)
\(\square \)
Example 5
See Fig. 6. Suppose the X dwell time is \(T=\min (T_1,T_2)\) where \(T_1\) and \(T_2\) are the \(k_1\)th and \(k_2\)th event times under independent Poisson processes (call these PP1 and PP2) with rates \(r_1(t)\) and \(r_2(t)\), respectively. Assume that, upon leaving X, particles transition to Y\(_1\) if \(T=T_1\) or to Y\(_2\) if \(T=T_2\). By Theorem 6, we can partition X into sub-states defined by which event (under each Poisson process) particles are awaiting next. Upon entry into X, all particles enter sub-state X\(_{1,1}\) where they each await the 1st events under PP1 or PP2. If the next event to occur for a given particle is from PP1, the particle transitions to X\(_{2,1}\) where it awaits a 2nd event from PP1 or 1st event from PP2 (hence the subscript notation). Likewise, if PP2’s 1st event occurs before PP1’s 1st event, the particle would transition to X\(_{1,2}\), and so on. Particles would leave these two states to either X\(_{2,2}\), Y\(_1\), or Y\(_2\) depending on which event occurs next. Under these assumptions, with \(k_1=k_2=2\) and exponential Y\(_i\) dwell times with rate \(\mu \), the corresponding mean field equations (using \(r(t) = r_1(t)+r_2(t)\)) are
$$\begin{aligned} \frac{dx_{11}}{dt}&=\; {\mathcal {I}}_X(t) - r(t)\,x_{11}(t) \end{aligned}$$
(62a)
$$\begin{aligned} \frac{dx_{21}}{dt}&=\; r_1(t)\,x_{11}(t) - r(t)\,x_{21}(t) \end{aligned}$$
(62b)
$$\begin{aligned} \frac{dx_{12}}{dt}&=\; r_2(t)\,x_{11}(t) - r(t)\,x_{12}(t) \end{aligned}$$
(62c)
$$\begin{aligned} \frac{dx_{22}}{dt}&=\; r_1(t)\,x_{12}(t) + r_2(t)\,x_{21}(t) - r(t)\,x_{22}(t) \end{aligned}$$
(62d)
$$\begin{aligned} \frac{dy_{1}}{dt}&=\; r_1(t)\,x_{21}(t) + r_1(t)\,x_{22}(t) - \mu (t)\,y_{1}(t) \end{aligned}$$
(62e)
$$\begin{aligned} \frac{dy_{2}}{dt}&=\; r_2(t)\,x_{12}(t) + r_2(t)\,x_{22}(t) - \mu (t)\,y_{2}(t). \end{aligned}$$
(62f)
It’s worth pointing out that the dwell times for the above sub-states of X are all identically distributed Poisson process 1st event times (note the loss rates in Eqs. (62a)–(62d), and recall the weak memorylessness property from Sect. 3.1.2). All particles in an X sub-state at time \(\tau \) will spend a remaining amount of time in that state that follows a 1st event time distribution under a Poisson process with rate \(r(t)=r_1(t)+r_2(t)\). This is a slight generalization of the familiar fact that the minimum of n independent exponentially distributed random variables (with respective rates \(r_i\)) is itself an exponential random variable (with rate \(r\equiv \sum _{i=1}^n r_i\)). The next section addresses the generality of this observation about the sub-states of X.
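A numerical sketch of Example 5 can make the sub-state bookkeeping concrete. It applies Theorem 6 with \(k_1=k_2=2\), so that each Y\(_i\) receives outflow from every sub-state in \({\mathcal {K}}_i\), and it makes several simplifying assumptions of ours: constant rates, no inflow, unit initial mass in X\(_{1,1}\), and \(\mu =0\) so that Y\(_1\) and Y\(_2\) simply accumulate mass. Under those assumptions total mass should be conserved and the amount remaining in X should equal the product of the two Erlang survival functions in Eq. (48).

```python
import math

def erlang_survival(r, k, t):
    """P(T > t) for an Erlang(rate r, shape k) dwell time."""
    return sum(math.exp(-r * t) * (r * t) ** j / math.factorial(j) for j in range(k))

def rhs(s, r1, r2, mu, inflow):
    """Theorem 6 with k1 = k2 = 2; s = [x11, x21, x12, x22, y1, y2]."""
    x11, x21, x12, x22, y1, y2 = s
    r = r1 + r2
    return [
        inflow - r * x11,
        r1 * x11 - r * x21,
        r2 * x11 - r * x12,
        r1 * x12 + r2 * x21 - r * x22,
        r1 * (x21 + x22) - mu * y1,   # PP1 reaches its 2nd event first
        r2 * (x12 + x22) - mu * y2,   # PP2 reaches its 2nd event first
    ]

def rk4_step(s, dt, *params):
    """One classical Runge-Kutta step for a flat state list."""
    def add(a, b, h):
        return [x + h * y for x, y in zip(a, b)]
    k1 = rhs(s, *params)
    k2 = rhs(add(s, k1, dt / 2), *params)
    k3 = rhs(add(s, k2, dt / 2), *params)
    k4 = rhs(add(s, k3, dt), *params)
    return [x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(s, k1, k2, k3, k4)]

r1, r2, mu = 0.8, 1.7, 0.0           # hypothetical constant rates; mu = 0 for the check
state = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
dt, n_steps = 0.001, 1500
for _ in range(n_steps):
    state = rk4_step(state, dt, r1, r2, mu, 0.0)
t_end = dt * n_steps

x_total = sum(state[:4])
x_expected = erlang_survival(r1, 2, t_end) * erlang_survival(r2, 2, t_end)
total_mass = sum(state)
```

Setting `mu` to a positive value recovers exponential dwell times in Y\(_1\) and Y\(_2\), at which point `total_mass` is no longer conserved.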
Mean field equivalence of proportional outputs and competing Poisson processes for 1st event time distributions
The scenarios described in Sect. 3.5.1 (proportional distribution across multiple states \(Y_\ell \) after an Erlang dwell time in X) and Sect. 3.5.4 (proportional distribution across multiple states based upon competing Poisson processes) can lead to equivalent mean field equations when the X dwell times follow Poisson process 1st event time distributions, as in Example 5. This equivalence is detailed in Theorem 7, and is an important aspect of the GLCT detailed in Sect. 3.7 because it helps to show how sub-states with dwell times distributed as Poisson process 1st event times are the fundamental building blocks of the GLCT.
Theorem 7
(Mean field equivalence of proportional outputs and competing Poisson processes for 1st event time distributions) Consider the special case of Theorem 6 (the Extended LCT for competing Poisson processes) where X has a dwell time given by \(T=\min _i T_i\), where each \(T_i\) is a Poisson process 1st event time with rate \(r_i(t)\), \(i=1,\ldots ,n\) and particles transition to Y\(_\ell \) with probability \(p_{i\ell }(T)\) when \(T=T_i\). The corresponding mean field model is equivalent to the special case of Theorem 4 (the Extended LCT for multiple outputs) where the X dwell time is a Poisson process 1st event time distribution with rate \(r(t)=\sum _{i=1}^n r_i(t)\), and the transition probability vector for leaving X and entering state Y\(_\ell \) is given by \(p_\ell (t)=\sum _{i=1}^n p_{i\ell }(t)\,r_i(t)/r(t)\).
Proof
First, in this case \({\mathscr {S}}(t,\tau ) = \prod _{i=1}^{n}{\mathcal {S}}_{r_i}^{1}(t,\tau )={\mathcal {S}}_{r}^{1}(t,\tau )\). Since all \(k_i=1\), the probability that \(T=T_i\) is \(r_i(T)/r(T)\), thus the probability that a particle leaving X at t goes to Y\(_\ell \) is \(p_\ell (t)=\sum _{i=1}^n \frac{r_i(t)}{r(t)}p_{i\ell }(t)\). Substituting the above equalities into the mean field Eq. (56a) (where there’s only one possible index in \({\mathcal {K}}=\{(1,1,\ldots ,1)\}\)) and (56c) gives
$$\begin{aligned} \frac{d}{dt}x(t)&=\; {\mathcal {I}}_X(t) - r(t)\,x(t) \end{aligned}$$
(63a)
$$\begin{aligned} y_j(t)&=\; y_{j}(0)S_j(t,0)\; + \int ^t_{0} \bigg ({\mathcal {I}}_j(\tau ) + r(\tau )\,p_j(\tau )\,x(\tau )\bigg ) S_j(t,\tau ) d\tau \end{aligned}$$
(63b)
which are the mean field equations for the aforementioned special case of Theorem 4. \(\square \)
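The aggregation in Theorem 7 amounts to a weighted average of the rows of the transition probabilities \(p_{i\ell }\). The short sketch below evaluates it at a single fixed time for hypothetical rates and probabilities; the function name `aggregate` and the nested-list layout `P[i][l]` are our own conventions.

```python
def aggregate(rates, P):
    """Collapse n competing 1st event time processes into one, as in Theorem 7.

    rates[i] is r_i at a fixed time t and P[i][l] is p_{il} at that time.
    Returns (r, p) with r = sum_i r_i and p[l] = sum_i p_{il} r_i / r.
    """
    r = sum(rates)
    n_out = len(P[0])
    p = [sum(P[i][l] * rates[i] for i in range(len(rates))) / r for l in range(n_out)]
    return r, p

# Hypothetical values: three competing processes, two recipient states.
rates = [0.5, 1.5, 2.0]
P = [[1.0, 0.0],
     [0.25, 0.75],
     [0.5, 0.5]]
r, p = aggregate(rates, P)

# Per-capita flow into each Y_l is the same in both formulations.
flows_competing = [sum(rates[i] * P[i][l] for i in range(3)) for l in range(2)]
flows_aggregated = [r * p_l for p_l in p]
```

Because each row of `P` sums to one, the aggregated vector `p` also sums to one.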
Modeling intermediate state transitions: reset the clock, or not?
We next describe how to apply extensions of the LCT in two similar but distinctly different scenarios (see Fig. 7) where the transition to one or more intermediate sub-states either resets an individual’s overall dwell time in state X (by assuming the time spent in an intermediate sub-state X\(_{I_i}\) is independent of time already spent in X\(_0\); see Sect. 3.6.1), or instead leaves the overall dwell time distribution for X unchanged (by conditioning the time spent in intermediate state X\(_{I_i}\) on the time already spent in X\(_0\); see Sect. 3.6.2).
An example of these different assumptions leading to important differences in practice comes from Feng et al. (2016) where individuals infected with Ebola can either leave the infected state (X) directly (either to a recovery or death), or after first transitioning to an intermediate hospitalized state (X\(_I\)) which needs to be explicitly modeled in order to incorporate a quarantine effect into the rates of disease transmission (i.e., the force of infection should depend on the number of non-quarantined individuals, i.e., X\(_0\)). As shown in Feng et al. (2016), the epidemic model output depends strongly upon whether or not it is assumed that moving into the hospitalized sub-state impacts the distribution of time spent in the infected state X.
To most simply illustrate these two scenarios, consider the simple case in Fig. 7 where a single intermediate sub-state \(X_I\) is being modeled, and particles enter X into sub-state X\(_0\) at rate \({\mathcal {I}}_X(t)\). Let X\(=\)X\(_0\cup \)X\(_I\). Both cases assume particles transition out of X\(_0\) either to sub-state X\(_I\) or they leave state X directly and enter state Y. Both cases also assume the distribution of time spent in X\(_0\) is \(T_*=\min (T_0,T_1)\), where particles transition to X\(_I\) if \(T_1<T_0\) (i.e., if \(T_*=T_1\)) or to Y if \(T_0<T_1\), and where each \(T_i\) is the \(k_i\)th event time under a Poisson process with rate \(r_i(t)\), \(i=0,1\) (see Sects. 3.5.4 and 3.5.5). Let \(T_I\) denote the distribution of time spent in intermediate state X\(_I\). The first case assumes \(T_I\) is independent of time spent in X\(_0\) (i.e., the transition to X\(_I\) ‘resets the clock’; see Sect. 3.6.1). The second case assumes \(T_I\) is conditional on time already spent in X\(_0\) (call it \(t_0\)), such that the total amount of time spent in X, \(t_0+T_I\), is equivalent in distribution to \(T_0\) (i.e., the transition to X\(_I\) does not change the overall distribution of time spent in X; see Sect. 3.6.2).
In the next two sections, we provide extensions of the LCT for generalizations of these two scenarios, extended to multiple possible intermediate states with eventual transitions out of X into multiple recipient states.
Intermediate states that reset dwell time distributions
First, we consider the case in which the time spent in the intermediate state X\(_I\) is independent of the time already spent in X (i.e., in the base state X\(_0\)). This is arguably the more commonly encountered (implicit) assumption found in ODE models that aren’t explicitly derived from a stochastic model and/or mean field integro-differential delay equations.
The construction of mean field ODEs for this case is a straightforward application of Theorem 6 from the previous section, combined with the extended LCT with output to multiple states (Theorem 4). Here we have extended this scenario to include \(M_X\) intermediate sub-states X\(_{I_j}\) where the transition to those sub-states from base state X\(_0\) is based on the outcome of N competing Poisson process event time distributions (\(T_i\)), and upon leaving the intermediate states particles transition out of state X into one of \(M_Y\) possible recipient states Y\(_\ell \).
Theorem 8
(Extended LCT with dwell time altering intermediate sub-state transitions) Suppose particles enter X at rate \({\mathcal {I}}_X(t)\) into a base sub-state X\(_0\). Assume particles remain in X\(_0\) according to a dwell time distribution given by T, the minimum of \(N+1\) independent Poisson process \(k_i\)th event time distributions with rates \(r_i(t)\), \(i=0,\ldots ,N\), i.e., \(T=\min _i(T_i)\). Particles leaving X\(_0\) transition to one of \(M_X\ge 1\) intermediate sub-states X\(_{I_j}\) or to one of \(M_Y\ge 1\) recipient states \(Y_\ell \) according to which \(T_i=T\). If \(T_0=T\) then the particle leaves X and the probability of transitioning to Y\(_\ell \) is \(p_{0\ell }(T)\), where \(\sum _{\ell =1}^{M_Y} p_{0\ell }(T)=1\). If \(T_i=T\) for \(i\ge 1\) then the particle transitions to X\(_{I_j}\) with probability \(p_{ij}(T)\), where \(\sum _{j=1}^{M_X} p_{ij}(T)=1\). Particles in intermediate state \(X_{I_j}\) remain there according to the \(\kappa _j\)th event time under a Poisson process with rate \(\varrho _j(t)\), and then transition to state Y\(_\ell \) with probability \(q_{j\ell }(t)\), where (for fixed t) \(\sum _{\ell =1}^{M_Y} q_{j\ell }(t)=1\), and they remain in Y\(_\ell \) according to a dwell time with survival function \(S_\ell (t,\tau )\).
In this case the corresponding mean field equations are
$$\begin{aligned} \frac{d}{dt}x_{0(1,\ldots ,1)}(t)&=\; {\mathcal {I}}_X(t) - \sum _{i=0}^N r_i(t) \,x_{0(1,\ldots ,1)}(t) \end{aligned}$$
(64a)
$$\begin{aligned} \frac{d}{dt}x_{0\alpha }(t)&=\; \sum _{i=0}^{N} r_i(t) \bigg ( x_{0\alpha _{i,-1}}(t)\,\mathbb {1}_{[a_i>1]}(\alpha ) - x_{0\alpha }(t)\bigg ) \end{aligned}$$
(64b)
$$\begin{aligned} \frac{d}{dt}x_{I_{j1}}(t)&=\; {\mathcal {I}}_{X_{Ij}}(t) + \sum _{i=1}^{N} p_{ij}(t)\bigg (\sum _{\alpha \in {\mathcal {K}}_i} r_i(t)\,x_{0\alpha }(t)\bigg ) - \varrho _j(t)\,x_{I_{j1}}(t) \end{aligned}$$
(64c)
$$\begin{aligned} \frac{d}{dt}x_{I_{jk}}(t)&=\; \varrho _j(t)\big (x_{I_{j,k-1}}(t) - x_{I_{jk}}(t)\big ), \;\qquad k=2,\ldots ,\kappa _j \end{aligned}$$
(64d)
$$\begin{aligned} y_\ell (t)&=\; y_{\ell }(0)\,S_\ell (t,0) + \int ^t_{0} \bigg ({\mathcal {I}}_{Y_\ell }(\tau ) + p_{0\ell }(\tau ) \sum _{\alpha \in {\mathcal {K}}_0} r_0(\tau )\,x_{0\alpha }(\tau ) \nonumber \\&\quad + \sum _{j=1}^{M_X} \varrho _j(\tau )\,x_{I_{j\kappa _j}}(\tau )\, q_{j\ell }(\tau ) \bigg ) S_\ell (t,\tau )\,d\tau \end{aligned}$$
(64e)
where the amount in base sub-state X\(_0\) is \(x_0(t)=\sum _{\alpha \in {\mathcal {K}}}x_{0\alpha }(t)\), the amount in the jth intermediate state X\(_{I_j}\) is \(x_{Ij}(t)=\sum _{k=1}^{\kappa _j} x_{I_{jk}}(t)\) (see Theorem 6 for notation), \({\mathcal {K}}=\{(a_0,a_1,\ldots ,a_N)\;|\;a_j\in \{1,\ldots ,k_j\}\}\), \({\mathcal {K}}_i=\{\alpha \in {\mathcal {K}} | a_i=k_i\}\), \(i=0,\ldots ,N\), \(j=1,\ldots ,M_X\), \(\ell = 1,\ldots ,M_Y\), and in Eq. (64b) \(\alpha =(a_0,\ldots ,a_N) \in {\mathcal {K}}{\setminus }(1,\ldots ,1)\). Note that the \(y_\ell (t)\) equations (64e) may be further reduced to a system of ODEs, e.g., via Corollary 1, and that more complicated distributions for dwell times in intermediate states X\(_{I_j}\) (e.g., an Erlang mixture) could be similarly modeled according to other cases addressed in this manuscript.
Proof
This follows from applying Theorem 6 to X\(_0\) and treating the intermediate states X\(_{I_j}\) as recipient states, then applying Theorem 4 to each intermediate state to partition each X\(_{I_j}\) into X\(_{I_{jk}}\), \(k=1,\ldots ,\kappa _j\), yielding Eq. (64). \(\square \)
Example 6
Consider the scenario in Fig. 7. Let the X\(_0\) dwell time be the minimum of \(T_0\sim \)Erlang(\(r_0,2\)) and \(T_1\sim \)Erlang(\(r_1,2\)), with intermediate state dwell time \(T_{I_1}\sim \)Erlang(\(\varrho _1,\kappa _1=3\)) and an exponential (rate \(\mu \)) dwell time in Y. Assume the only inputs into X are into X\(_0\) at rate \({\mathcal {I}}_X(t)\). By Theorem 8 the corresponding mean field ODEs (see Fig. 8) are Eq. (65), where \(x_0(t)=x_{0(1,1)}(t)+x_{0(2,1)}(t)+x_{0(1,2)}(t)+x_{0(2,2)}(t)\) and \(x_{I_1}(t)=x_{I_{11}}(t)+x_{I_{12}}(t)+x_{I_{13}}(t)\).
$$\begin{aligned} \frac{d}{dt}x_{0(1,1)}(t)&=\; {\mathcal {I}}_X(t) - (r_0+r_1)\,x_{0(1,1)}(t) \end{aligned}$$
(65a)
$$\begin{aligned} \frac{d}{dt}x_{0(2,1)}(t)&=\; r_0 x_{0(1,1)}(t) - (r_0+r_1) x_{0(2,1)}(t) \end{aligned}$$
(65b)
$$\begin{aligned} \frac{d}{dt}x_{0(1,2)}(t)&=\; r_1 x_{0(1,1)}(t) - (r_0+r_1) x_{0(1,2)}(t) \end{aligned}$$
(65c)
$$\begin{aligned} \frac{d}{dt}x_{0(2,2)}(t)&=\; r_0 x_{0(1,2)}(t) + r_1 x_{0(2,1)}(t) - (r_0+r_1) x_{0(2,2)}(t) \end{aligned}$$
(65d)
$$\begin{aligned} \frac{d}{dt}x_{I_{11}}(t)&=\; r_1\,x_{0(1,2)}(t) + r_1\,x_{0(2,2)}(t) - \varrho _1\,x_{I_{11}}(t) \end{aligned}$$
(65e)
$$\begin{aligned} \frac{d}{dt}x_{I_{12}}(t)&=\;\varrho _1\,x_{I_{11}}(t) - \varrho _1\,x_{I_{12}}(t) \end{aligned}$$
(65f)
$$\begin{aligned} \frac{d}{dt}x_{I_{13}}(t)&=\;\varrho _1\,x_{I_{12}}(t) - \varrho _1\,x_{I_{13}}(t) \end{aligned}$$
(65g)
$$\begin{aligned} \frac{d}{dt}y(t)&=\; {\mathcal {I}}_Y(t) + r_0\,x_{0(2,1)}(t) + r_0\,x_{0(2,2)}(t) + \varrho _1\,x_{I_{13}}(t) - \mu \,y(t). \end{aligned}$$
(65h)
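Equations (65a) to (65h) can be integrated directly as a sanity check. The sketch below assumes hypothetical constant rates, no inflows, unit initial mass in X\(_{0(1,1)}\), and \(\mu =0\) so that Y accumulates everything that leaves X; the Runge-Kutta stepper is only a stand-in for any ODE solver. Under these assumptions total mass should be conserved, and the amount in the base sub-state X\(_0\) should equal the product of the two Erlang survival functions for \(T_0\) and \(T_1\).

```python
import math

def erlang_survival(r, k, t):
    """P(T > t) for an Erlang(rate r, shape k) dwell time."""
    return sum(math.exp(-r * t) * (r * t) ** j / math.factorial(j) for j in range(k))

def rhs(s, r0, r1, rho1, mu, inflow_x, inflow_y):
    """Eqs. (65a)-(65h); s = [x0_11, x0_21, x0_12, x0_22, xI1, xI2, xI3, y]."""
    x11, x21, x12, x22, xi1, xi2, xi3, y = s
    r = r0 + r1
    return [
        inflow_x - r * x11,
        r0 * x11 - r * x21,
        r1 * x11 - r * x12,
        r0 * x12 + r1 * x21 - r * x22,
        r1 * (x12 + x22) - rho1 * xi1,   # entry into X_I when T = T_1
        rho1 * (xi1 - xi2),
        rho1 * (xi2 - xi3),
        inflow_y + r0 * (x21 + x22) + rho1 * xi3 - mu * y,
    ]

def rk4_step(s, dt, *params):
    """One classical Runge-Kutta step for a flat state list."""
    def add(a, b, h):
        return [x + h * v for x, v in zip(a, b)]
    k1 = rhs(s, *params)
    k2 = rhs(add(s, k1, dt / 2), *params)
    k3 = rhs(add(s, k2, dt / 2), *params)
    k4 = rhs(add(s, k3, dt), *params)
    return [x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(s, k1, k2, k3, k4)]

r0, r1, rho1, mu = 1.2, 0.6, 2.0, 0.0   # hypothetical rates; mu = 0 for the check
state = [1.0] + [0.0] * 7
dt, n_steps = 0.001, 2000
for _ in range(n_steps):
    state = rk4_step(state, dt, r0, r1, rho1, mu, 0.0, 0.0)
t_end = dt * n_steps

x0_total = sum(state[:4])                # amount in base sub-state X_0
x0_expected = erlang_survival(r0, 2, t_end) * erlang_survival(r1, 2, t_end)
xI_total = sum(state[4:7])               # amount in intermediate state X_I
total_mass = sum(state)
```

Because the clock is reset on entry to X\(_I\), the total amount in X, `x0_total + xI_total`, generally decays more slowly than the Erlang(\(r_0,2\)) survival function alone would predict.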
Intermediate states that preserve dwell time distributions
In this section we address how to modify the outcome in the previous section to instead construct mean field ODE models that incorporate ‘dwell time neutral’ sub-state transitions, i.e., where the distribution of time spent in X is the same regardless of whether or not particles transition (within X) from some base sub-state X\(_0\) to one or more intermediate sub-states X\(_{I_j}\). This is done by conditioning the dwell time distributions in X\(_{I_i}\) on time spent in X\(_0\) in a way that leverages the weak memorylessness property discussed in Sect. 3.1.2.
In applications, this case (in contrast to the previous case) is perhaps the more commonly desired assumption, since modelers often seek to partition states into sub-states where key characteristics, like the overall X dwell time distribution, remain unchanged, but where the different sub-states have functional differences elsewhere in the model. An example is the reduction in transmission rates due to quarantine in SIR type infectious disease models.
One approach to deriving such a model is to condition the dwell time distribution for an intermediate state X\(_{I_i}\) on the time already spent in X\(_0\) (as in Feng et al. 2016). We take a slightly different approach and exploit the weak memorylessness property of Poisson process 1st event time distributions (see Theorem 1 in Sect. 3.1.2, and the notation used in the previous section) to instead condition the dwell time distribution for intermediate states X\(_{I_j}\) on how many of the \(k_0\) events have already occurred when a particle transitions from X\(_0\) to X\(_{I_j}\) (rather than conditioning on the exact elapsed time spent in X\(_0\)). In this case, since each sub-state of X\(_0\) has iid dwell time distributions that are Poisson process 1st event times with rate \(r(t)=\sum _{i=0}^N r_i(t)\), if i of the \(k_0\) events had occurred prior to the transition out of X\(_0\), then the weak memorylessness property of Poisson process 1st event time distributions implies that the remaining time spent in X\(_{I_j}\) should follow a \((k_0-i)\)th event time distribution under a Poisson process with rate \(r_0(t)\), thus ensuring that the total time spent in X follows a \(k_0\)th event time distribution with rate \(r_0(t)\). With this realization in hand, one can then apply Theorem 6 and Theorem 4 as in the previous section to obtain the desired mean field ODEs, as detailed in the following Theorem, and as illustrated in Fig. 8.
Theorem 9
(Extended LCT with dwell time preserving intermediate states) Consider the mean field equations for a system of particles entering state X (into sub-state X\(_0\)) at rate \({\mathcal {I}}_X(t)\). As in the previous case, assume the time spent in X\(_0\) follows the minimum of \(N+1\) independent Poisson process \(k_i\)th event time distributions with respective rates \(r_i(t)\), \(i=0,\ldots ,N\) (i.e., \(T=\min _i(T_i)\)). Particles leaving X\(_0\) at time T transition to recipient state Y\(_\ell \) with probability \(p_{0\ell }(T)\) if \(T=T_0\), or if \(T=T_i\) (\(i=1,\ldots ,N\)) to the jth of \(M_X\) intermediate sub-states, X\(_{I_j}\), with probability \(p_{ij}(T)\). If \(T<T_0\), we may define a random variable \(K\in \{0,\ldots ,k_0-1\}\) indicating how many events had occurred under the Poisson process associated with \(T_0\) at the time of the transition out of X\(_0\) (at time T). In order to ensure the overall time spent in X follows a Poisson process \(k_0\)th event time distribution with rate \(r_0(t)\), particles entering state X\(_{I_j}\) remain there for a duration of time that is conditioned on \(K=k\), such that the conditional dwell time for that particle in X\(_{I_j}\) is given by a Poisson process \((k_0-k)\)th event time with rate \(r_0(t)\). Finally, assume that particles leaving X via intermediate sub-state X\(_{I_j}\) at time t transition to Y\(_\ell \) with probability \(q_{j\ell }(t)\), where they remain according to a dwell time determined by survival function \(S_\ell (t,\tau )\).
The corresponding mean field equations are
$$\begin{aligned}&\frac{d}{dt}x_{0(1,\ldots ,1)}(t) =\; {\mathcal {I}}_X(t) -\sum _{i=0}^{N}r_i(t)\,x_{0(1,\ldots ,1)}(t) \end{aligned}$$
(66a)
$$\begin{aligned}&\frac{d}{dt}x_{0\alpha }(t) =\; \sum _{i=0}^{N} r_i(t)\, x_{0\alpha _{i,-1}}(t)\,\mathbb {1}_{[a_i>1]}(\alpha ) - \sum _{i=0}^{N} r_i(t)\, x_{0\alpha }(t) \end{aligned}$$
(66b)
$$\begin{aligned}&\frac{d}{dt}x_{I_{jk}}(t) =\; r_0(t)\,\big ( x_{I_{j,k-1}}(t)\,\mathbb {1}_{[k>1]} - x_{I_{jk}}(t) \big ) + \sum _{i=1}^{N}\sum _{\alpha \in {\mathcal {K}}_{ik}} r_i(t)\,x_{0\alpha }(t)\,p_{ij}(t) \end{aligned}$$
(66c)
$$\begin{aligned}&y_\ell (t) =\; y_\ell (0)S_\ell (t,0) + \int ^t_{0} \bigg ({\mathcal {I}}_{Y_\ell }(\tau ) + \sum _{\alpha \in {\mathcal {K}}_0} r_0(\tau )\,x_{0\alpha }(\tau )\,p_{0\ell }(\tau ) \nonumber \\&\quad + \sum _{j=1}^{M_X} r_0(\tau )\,x_{I_{jk_0}}(\tau )\,q_{j\ell }(\tau ) \bigg ) S_\ell (t,\tau )d\tau \end{aligned}$$
(66d)
where \({\mathcal {K}}=\{(a_0,a_1,\ldots ,a_N)\;|\;a_j\in \{1,\ldots ,k_j\}\}\), \(\alpha \in {\mathcal {K}}{\setminus }\{(1,\ldots ,1)\}\), \(j=1,\ldots ,M_X\), \(k=1,\ldots ,k_0\), \(\ell =1,\ldots ,M_Y\), \({\mathcal {K}}_i\subset {\mathcal {K}}\) is the subset of indices where \(a_i=k_i\), \({\mathcal {K}}_{ik}\subset {\mathcal {K}}_i\) is the subset of indices where \(a_i=k_i\) and \(a_0=k\), \(x_0(t)=\sum _{\alpha \in {\mathcal {K}}} x_{0\alpha }(t)\), \(x_{I_j}(t)=\sum _{k=1}^{k_0} x_{I_{jk}}(t)\), and \(x(t)=x_0(t)+\sum _{j=1}^{M_X} x_{I_j}(t)\). The \(y_\ell (t)\) equations (66d) may be reduced to ODEs, e.g., via Corollary 1.
Proof
The proof of Theorem 9 parallels the proof of Theorem 8, but with the following modifications. First, each sub-state of X\(_{I_j}\) (for all j) has the same dwell time distribution, namely, they are all 1st event time distributions under a Poisson process with rate \(r_0(t)\). Second, upon leaving X\(_0\) where \(T=T_i\) and \(K=k < k_0\) (i.e., when only \(k<k_0\) events have occurred under the 0th Poisson process; see the definition of K in the text above), particles will enter (with probability \(p_{ij}(T)\)) the jth intermediate state X\(_{I_j}\) by entering sub-state X\(_{I_{j,k+1}}\) (the sub-state awaiting the \((k+1)\)th event), which (due to the weak memorylessness property described in Theorem 1) ensures that, upon leaving X\(_{I_j}\), particles will have spent a duration of time in X that follows the Poisson process \(k_0\)th event time distribution with rate \(r_0(t)\). \(\square \)
Example 7
Consider Example 6 in the previous section, but instead assume that the transition to the intermediate state does not impact the overall time spent in state X, as detailed in Theorem 9. The corresponding mean field ODEs for this case are given by Eq. (67) below [compare Eqs. (67e)–(67g) to Eqs. (65e)–(65h)].
$$\begin{aligned} \frac{d}{dt}x_{0(1,1)}(t)&=\; {\mathcal {I}}_X(t) - (r_0+r_1)\,x_{0(1,1)}(t) \end{aligned}$$
(67a)
$$\begin{aligned} \frac{d}{dt}x_{0(2,1)}(t)&=\; r_0 x_{0(1,1)}(t) - (r_0+r_1) x_{0(2,1)}(t) \end{aligned}$$
(67b)
$$\begin{aligned} \frac{d}{dt}x_{0(1,2)}(t)&=\; r_1 x_{0(1,1)}(t) - (r_0+r_1) x_{0(1,2)}(t) \end{aligned}$$
(67c)
$$\begin{aligned} \frac{d}{dt}x_{0(2,2)}(t)&=\; r_0 x_{0(1,2)}(t) + r_1 x_{0(2,1)}(t) - (r_0+r_1) x_{0(2,2)}(t) \end{aligned}$$
(67d)
$$\begin{aligned} \frac{d}{dt}x_{I_{11}}(t)&=\; r_1\,x_{0(1,2)}(t) - r_0\,x_{I_{11}}(t) \end{aligned}$$
(67e)
$$\begin{aligned} \frac{d}{dt}x_{I_{12}}(t)&=\; r_1\,x_{0(2,2)}(t) + r_0\,x_{I_{11}}(t) - r_0\,x_{I_{12}}(t) \end{aligned}$$
(67f)
$$\begin{aligned} \frac{d}{dt}y(t)&=\; {\mathcal {I}}_Y(t) + r_0\,x_{0(2,1)}(t) + r_0\,x_{0(2,2)}(t) + r_0\,x_{I_{12}}(t) - \mu \,y(t). \end{aligned}$$
(67g)
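The dwell time neutrality in Example 7 can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it takes constant rates and a constant inflow, sets \({\mathcal {I}}_Y=0\), and uses hypothetical function names (`rhs_eq67`, `steady_state_eq67`). At equilibrium the total amount in X equals the inflow rate times the mean dwell time, so if the X dwell time is a 2nd event time with rate \(r_0\) (mean \(2/r_0\)), that total should not depend on \(r_1\).

```python
def rhs_eq67(state, inflow, r0, r1, mu):
    """Right-hand side of Eqs. (67a)-(67g) with constant rates and inflow.

    state = (x11, x21, x12, x22, xI1, xI2, y), where xab stands for
    x_{0(a,b)} and xIk for x_{I_{1k}}. Assumption: I_Y(t) = 0.
    """
    x11, x21, x12, x22, xI1, xI2, y = state
    r = r0 + r1
    return (
        inflow - r * x11,                   # (67a)
        r0 * x11 - r * x21,                 # (67b)
        r1 * x11 - r * x12,                 # (67c)
        r0 * x12 + r1 * x21 - r * x22,      # (67d)
        r1 * x12 - r0 * xI1,                # (67e)
        r1 * x22 + r0 * xI1 - r0 * xI2,     # (67f)
        r0 * (x21 + x22 + xI2) - mu * y,    # (67g) with I_Y = 0
    )


def steady_state_eq67(inflow, r0, r1, mu):
    """Equilibrium of Eq. (67), obtained by setting each derivative to zero."""
    r = r0 + r1
    x11 = inflow / r
    x21 = r0 * x11 / r
    x12 = r1 * x11 / r
    x22 = (r0 * x12 + r1 * x21) / r
    xI1 = r1 * x12 / r0
    xI2 = (r1 * x22 + r0 * xI1) / r0
    y = r0 * (x21 + x22 + xI2) / mu
    return (x11, x21, x12, x22, xI1, xI2, y)


# Total amount in X at equilibrium for several values of r1 (r0 held fixed).
totals = []
for r1 in (0.0, 0.3, 2.0, 10.0):
    eq = steady_state_eq67(inflow=3.0, r0=0.5, r1=r1, mu=1.0)
    totals.append(sum(eq[:6]))
print(totals)
```

Under these assumptions every entry of `totals` should agree with `inflow * 2 / r0`, whereas the analogous total for the model of Example 6 would generally change with \(r_1\).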
Generalized Linear Chain Trick (GLCT)
In the preceding sections we have provided various extensions of the Linear Chain Trick (LCT) that describe how the structure of mean field ODE models reflects the assumptions that define corresponding continuous time stochastic state transition models. Each case above can be viewed as a special case of the following more general framework for constructing mean field ODEs, which we refer to as the Generalized Linear Chain Trick (GLCT).
The cases we have addressed thus far share the following stochastic model assumptions, which constitute the major assumptions of the GLCT.
A1. A focal state (which we call state X) can be partitioned into a finite number of sub-states (e.g., X\(_1,\ldots ,\)X\(_n\)), each with independent (across states and particles) dwell time distributions that are either exponentially distributed with rates \(r_i\) or, more generally, are distributed as independent 1st event times under nonhomogeneous Poisson processes with rates \(r_i(t)\), \(i=1,\ldots ,n\). Recall the equivalence relation in Sect. 3.5.5.
A2. Inflow rates into X can be described by non-negative, integrable inflow rates into each of these sub-states (e.g., \({\mathcal {I}}_{X_1}(t),\ldots ,{\mathcal {I}}_{X_n}(t)\)), some or all of which may be zero. This includes the case where particles enter X at rate \({\varLambda }(t)\) and are distributed across sub-states X\(_i\) according to the probabilities \(\varvec{\rho }(t) = [\rho _1(t),\ldots ,\rho _n(t)]^{{\textit{T}}}\) (i.e., we let \({\mathcal {I}}_{X_i}(t)\equiv \rho _i(t)\,{\varLambda }(t)\)) where \(\sum _i \rho _i =1\).
A3. Particles that transition out of a sub-state X\(_i\) at time t transition either into sub-state X\(_j\) with probability \(p_{ij}(t)\), or into one of recipient states Y\(_\ell \), \(\ell =1,\ldots ,m\), with probability \(p_{i,n+\ell }\). That is, let \(p_{ij}(t)\) denote the probability that a particle leaving state X\(_i\) at time t enters either X\(_j\) if \(j\le n\) or Y\(_{j-n}\) if \(j>n\), where \(i=1,\ldots ,n\), \(j=1,\ldots ,n,n+1,\ldots ,n+m\).
A4. Recipient states Y\(_\ell \), \(\ell =1,\ldots ,m\), also have dwell time distributions defined by survival functions \(S_{Y_\ell }(t,\tau )\) and integrable, non-negative inflow rates \({\mathcal {I}}_{Y_\ell }(t)\) that describe inputs from all other non-X sources.
The GLCT (Theorem 10) describes how to construct mean field ODEs for states X and Y for state transition models satisfying the above assumptions.
Theorem 10
(Generalized Linear Chain Trick) Consider a stochastic, continuous time state transition model of particles entering state X and transitioning to states Y\(_\ell \), \(\ell =1,\ldots ,m\), according to the above assumptions A1-A4. Then the corresponding mean field model is given by
$$\begin{aligned} \frac{d}{dt}x_i(t)&=\; {\mathcal {I}}_{X_i}(t) + \sum _{j=1}^n p_{ji}(t)\,r_j(t)\,x_j(t) - r_i(t)\,x_i(t), \quad i=1,\ldots ,n, \end{aligned}$$
(68a)
$$\begin{aligned} y_\ell (t)&=\; y_\ell (0)S_{Y_\ell }(t,0) + \int _{0}^{t} \bigg ( {\mathcal {I}}_{Y_\ell }(\tau ) \nonumber \\&\quad + \sum _{j=1}^{n} r_j(\tau )\,x_j(\tau )\,p_{j,n+\ell }(\tau ) \bigg ) S_{Y_\ell }(t,\tau )\,d\tau \end{aligned}$$
(68b)
where \(x(t)=\sum _{i=1}^n x_i(t)\), and we assume non-negative initial conditions \(x_i(0)=x_{i0}\), \(y_\ell (0)=y_{\ell 0}\). Note that the \(y_\ell (t)\) equations might be reducible to ODEs, e.g., via Corollary 1 or other results presented above.
Furthermore, Eq. (68a) may be written in vector form where \({\mathbf {P}}_X(t)=(p_{ij}(t))\) (\(i,j\in \{1,\ldots ,n\}\)) is the \(n\times n\) matrix of (potentially time-varying) probabilities describing which transitions out of X\(_i\) at time t go to X\(_j\) (likewise, one can define \({\mathbf {P}}_Y(t)=(p_{ij}(t))\), \(i\in \{1,\ldots ,n\}\), \(j\in \{n+1,\ldots ,n+m\}\), which is the \(n\times m\) matrix of probabilities describing which transitions from X\(_i\) at time t go to Y\(_{j-n}\)), \(\mathbf {{\mathcal {I}}_X}(t) = [{\mathcal {I}}_{X_1}, \ldots , {\mathcal {I}}_{X_n}]^\text {T}\), \({\mathbf {R}}(t)=[r_1(t),\ldots ,r_n(t)]^\text {T}\), and \({\mathbf {x}}(t)=[x_1(t),\ldots ,x_n(t)]^\text {T}\) which yields
$$\begin{aligned} \frac{d}{dt}{\mathbf {x}}(t) =\; \mathbf {{\mathcal {I}}_X}(t) + {\mathbf {P}}_X(t)^\text {T}\,({\mathbf {R}}(t)\circ {\mathbf {x}}(t)) - {\mathbf {R}}(t)\circ {\mathbf {x}}(t). \end{aligned}$$
(69)
where \(\circ \) indicates the Hadamard (element-wise) product.
Proof
The proof of the theorem above follows directly from applying Theorem 4 to each sub-state. \(\square \)
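As a concrete reading of Eq. (69), the following minimal sketch evaluates its right-hand side at a single time point using plain Python lists. The function name `glct_rhs` and the argument layout are illustrative assumptions, not notation from the text; for time-varying \({\mathbf {P}}_X(t)\), \({\mathbf {R}}(t)\) and \(\mathbf {{\mathcal {I}}_X}(t)\), the caller is assumed to pass their values at the current time.

```python
def glct_rhs(x, inflow, P_X, R):
    """Eq. (69): dx/dt = I_X + P_X^T (R o x) - R o x at one time point.

    x, inflow and R are length-n sequences. P_X is an n-by-n nested list
    whose (i, j) entry is the probability that a particle leaving X_i
    enters X_j.
    """
    n = len(x)
    flux = [R[i] * x[i] for i in range(n)]   # R o x: outflow from each sub-state
    return [
        inflow[i] + sum(P_X[j][i] * flux[j] for j in range(n)) - flux[i]
        for i in range(n)
    ]


# Usage: two sub-states in series with a common rate (the standard LCT with k = 2).
example = glct_rhs(x=[1.0, 2.0], inflow=[5.0, 0.0],
                   P_X=[[0.0, 1.0], [0.0, 0.0]], R=[3.0, 3.0])
print(example)
```

Passing the result to any ODE solver then integrates Eq. (68a); the recipient-state equations (68b) are not covered by this sketch.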
Example 8
(Dwell time given by the maximum of independent Erlang random variables) We here illustrate how the GLCT can provide a conceptually simpler framework for deriving ODEs relative to derivation from mean field integral equations by assuming the X dwell time obeys the maximum of multiple Erlang distributions. While the survival function for this distribution is not straightforward to write down, it is fairly straightforward to construct a Markov Chain that yields such a dwell time distribution (see Fig. 9).
Recall that, in Sect. 3.5.4, we considered a dwell time given by the minimum of N Erlang distributions. Here we instead consider the case where the dwell time distribution is given by the maximum of multiple Erlang distributions, \(T=\max (T_1,T_2)\) where \(T_i\sim \)Erlang(\(r_i,2\)). For simplicity, assume the dwell time in a single recipient state Y is exponential with rate \(\mu \). We again partition X according to which events (under the two independent homogeneous Poisson processes associated with each of \(T_1\) and \(T_2\)) particles are awaiting, and index those sub-states accordingly (see Fig. 9). These sub-states are X\(_{11}\), X\(_{21}\), X\(_{12}\), X\(_{*1}\), X\(_{22}\), X\(_{1*}\), X\(_{*2}\), and X\(_{2*}\), where a ‘\(*\)’ in the ith index position indicates that particles in that sub-state have already had the ith Poisson process reach the \(k_i\)th event (in this case, the 2nd event). Each such sub-state has exponentially distributed dwell times, but the rates for these dwell time distributions differ (unlike the cases in Sect. 3.5.4 where all sub-states had the same rate): the Poisson process rate for sub-states X\(_{11}\), X\(_{21}\), X\(_{12}\), and X\(_{22}\) is \(r=r_1+r_2\) (see Fig. 9 and compare to Theorem 6 and Fig. 6), but the rate for sub-states X\(_{1*}\) and X\(_{2*}\) (striped circles in Fig. 9) is \(r_1\), and for X\(_{*1}\) and X\(_{*2}\) (shaded circles in Fig. 9) it is \(r_2\).
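In the context of the GLCT, let \({\mathbf {x}}(t) = \)[\(x_{11}(t)\), \(x_{21}(t)\), \(x_{12}(t)\), \(x_{*1}(t)\), \(x_{22}(t)\), \(x_{1*}(t)\), \(x_{*2}(t)\), \(x_{2*}(t)]^\text {T}\) then by the assumptions above \({\mathbf {R}}(t)=\)[r, r, r, \(r_2\), r, \(r_1\), \(r_2\), \(r_1]^\text {T}\), \(\varvec{\rho }(t)=[1,0,\ldots ,0]^\text {T}\) and hence \({\mathcal {I}}_{\mathbf {X}}(t)=[{\mathcal {I}}_X(t),0,\ldots ,0]^\text {T}\). Denote \(p_1\equiv r_1/r\) and \(p_2\equiv r_2/r\) (à la Theorem 7 in Sect. 3.5.5). Then the first eight rows of \(9\times 9\) matrix \({\mathbf {P}}\) (with columns ordered as in \({\mathbf {x}}(t)\), followed by Y) are given by
$$\begin{aligned} \begin{bmatrix} 0&\quad p_1&\quad p_2&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0 \\ 0&\quad 0&\quad 0&\quad p_1&\quad p_2&\quad 0&\quad 0&\quad 0&\quad 0 \\ 0&\quad 0&\quad 0&\quad 0&\quad p_1&\quad p_2&\quad 0&\quad 0&\quad 0 \\ 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 1&\quad 0&\quad 0 \\ 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad p_1&\quad p_2&\quad 0 \\ 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 1&\quad 0 \\ 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 1 \\ 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 0&\quad 1 \end{bmatrix}. \end{aligned}$$
(70)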
Thus, by the GLCT (Theorem 10), the corresponding mean field ODEs are
$$\begin{aligned} \frac{d}{dt}x_{11}(t)&=\; {\mathcal {I}}_X(t) - r\,x_{11}(t) \end{aligned}$$
(71a)
$$\begin{aligned} \frac{d}{dt}x_{21}(t)&=\; r_1\,x_{11}(t) - r\,x_{21}(t) \end{aligned}$$
(71b)
$$\begin{aligned} \frac{d}{dt}x_{12}(t)&=\; r_2\,x_{11}(t) - r\,x_{12}(t) \end{aligned}$$
(71c)
$$\begin{aligned} \frac{d}{dt}x_{*1}(t)&=\; r_1\,x_{21}(t) - r_2\,x_{*1}(t) \end{aligned}$$
(71d)
$$\begin{aligned} \frac{d}{dt}x_{22}(t)&=\; r_2\,x_{21}(t) + r_1\,x_{12}(t) - r\,x_{22}(t) \end{aligned}$$
(71e)
$$\begin{aligned} \frac{d}{dt}x_{1*}(t)&=\; r_2\,x_{12}(t) - r_1\,x_{1*}(t) \end{aligned}$$
(71f)
$$\begin{aligned} \frac{d}{dt}x_{*2}(t)&=\; r_2\,x_{*1}(t) + r_1\,x_{22}(t) - r_2\,x_{*2}(t) \end{aligned}$$
(71g)
$$\begin{aligned} \frac{d}{dt}x_{2*}(t)&=\; r_1\,x_{1*}(t) + r_2\,x_{22}(t) - r_1\,x_{2*}(t) \end{aligned}$$
(71h)
$$\begin{aligned} \frac{d}{dt}y(t)&=\; r_1\,x_{2*}(t) + r_2\,x_{*2}(t) - \mu \,y(t). \end{aligned}$$
(71i)
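One way to sanity-check Example 8 is to build \({\mathbf {R}}\) and \({\mathbf {P}}_X\) from the transitions implied by Eqs. (71a)–(71h) and compute the mean X dwell time from the equilibrium of Eq. (68a) with unit inflow. The sketch below assumes constant rates; the helper names `max_erlang2_chain` and `mean_dwell_time` are hypothetical, and the forward-substitution shortcut relies on \({\mathbf {P}}_X\) being strictly upper triangular in this ordering.

```python
def max_erlang2_chain(r1, r2):
    """R and the 8x8 matrix P_X for T = max(T1, T2), T_i ~ Erlang(r_i, 2).

    Sub-state order: X11, X21, X12, X*1, X22, X1*, X*2, X2*.
    """
    r = r1 + r2
    p1, p2 = r1 / r, r2 / r
    R = [r, r, r, r2, r, r1, r2, r1]
    P_X = [[0.0] * 8 for _ in range(8)]
    P_X[0][1], P_X[0][2] = p1, p2   # X11 -> X21 or X12
    P_X[1][3], P_X[1][4] = p1, p2   # X21 -> X*1 or X22
    P_X[2][4], P_X[2][5] = p1, p2   # X12 -> X22 or X1*
    P_X[3][6] = 1.0                 # X*1 -> X*2
    P_X[4][6], P_X[4][7] = p1, p2   # X22 -> X*2 or X2*
    P_X[5][7] = 1.0                 # X1* -> X2*
    return R, P_X                   # rows for X*2 and X2* stay zero: both exit to Y


def mean_dwell_time(R, P_X, entry=0):
    """Mean time in X for a feed-forward chain entered at sub-state `entry`.

    With unit inflow, the equilibrium value of each x_i in Eq. (68a) is the
    expected time spent in X_i, so the sum is the mean dwell time in X.
    """
    n = len(R)
    x = [0.0] * n
    for i in range(n):
        arrivals = (1.0 if i == entry else 0.0) + sum(
            P_X[j][i] * R[j] * x[j] for j in range(i))
        x[i] = arrivals / R[i]
    return sum(x)


R, P_X = max_erlang2_chain(1.0, 1.0)
print(mean_dwell_time(R, P_X))
```

For \(r_1=r_2=1\), the identity \(E[\max (T_1,T_2)]=E[T_1]+E[T_2]-E[\min (T_1,T_2)]\) gives \(2+2-1.25=2.75\), which the chain-based value should reproduce.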
GLCT for phase-type distributions
The GLCT above extends the LCT to a very flexible family of dwell time distributions known as (continuous) phase-type distributions (Asmussen et al. 1996; Pérez and Riaño 2006; Osogami and Harchol-Balter 2006; Thummler et al. 2006; Reinecke et al. 2012a; Horváth et al. 2012; Komárková 2012; Horváth et al. 2016; Okamura and Dohi 2015; Horváth and Telek 2017), i.e., the hitting time distributions for Continuous Time Markov Chains (CTMC). These CTMC hitting time distributions include the hypoexponential distributions, hyper-exponential and hyper-Erlang distributions, and generalized Coxian distributions (Reinecke et al. 2012a; Horváth et al. 2016). Importantly, these distributions can be fit to data or can be used to approximate other named distributions (e.g., see Horváth and Telek 2017; Osogami and Harchol-Balter 2006; Altiok 1985; Pérez and Riaño 2006; Komárková 2012; Reinecke et al. 2012b, and related publications). As detailed below, this enables modelers to incorporate a much broader set of dwell time distributions into ODEs than is afforded by the standard LCT.
Consider the assumptions of Theorem 10 above. Assume vectors \(\varvec{\rho }(t)=\varvec{\rho }\) and \({\mathbf {R}}(t)={\mathbf {R}}\), and matrices \({\mathbf {P}}_X(t)={\mathbf {P}}_X\), and \({\mathbf {P}}_Y(t)={\mathbf {P}}_Y\) are all constant. As above, assume the probability of entering states in Y is zero, thus our initial distribution vector for this CTMC (with \(n+m\) states) is fully determined by the (length n) vector \(\varvec{\rho }\). Also assume—just to define the CTMC that describes transitions among transient states X up to (but not after) entering states Y—that each state in Y is absorbing (i.e., \(p_{ii}=1\), \(i>n\)). Then the X dwell time distribution follows the hitting time distribution for a CTMC with initial distribution vector \(\varvec{\rho }\) and (\(n+m\))\(\times \)(\(n+m\)) transition probability matrix
$$\begin{aligned} {\mathbf {P}}=\begin{bmatrix} {\mathbf {P}}_X&\quad {\mathbf {P}}_Y\\ 0&\quad {\mathbf {I}} \\ \end{bmatrix}. \end{aligned}$$
(72)
To clearly state the GLCT for phase-type distributions, we must reparameterize the above CTMC. First, there is an equivalent parameterization of this CTMC which corresponds to thinning the underlying Poisson processes so that we only track transitions between distinct states, and ignore when an individual leaves and instantly returns to its current state (this thinned process is often called the embedded jump process).
The rate for the thinned process that determines a particle’s dwell time in transient state i goes from \(r_i\) to
$$\begin{aligned} \lambda _i \equiv r_i\,(1-p_{ii}) \end{aligned}$$
(73)
If \(0 \le p_{ii} < 1\), the transition probabilities out of state i then get normalized to
$$\begin{aligned} {\widetilde{p}}_{ij} \equiv \frac{p_{ij}}{(1-p_{ii})}, \quad \text {for} \; j\ne i,\text { and } {\widetilde{p}}_{ii}=0. \end{aligned}$$
(74)
The rows for absorbing states (Y) remain unchanged, i.e., \({\widetilde{p}}_{ij}=p_{ij}\) for \(i>n\). The resulting transition probability matrix \(\widetilde{{\mathbf {P}}}\) and rate vector \(\varvec{\lambda }\) define the embedded jump process description of the CTMC with transition probability matrix \({\mathbf {P}}\) and rate vector \({\mathbf {R}}\) (initial probability vector \(\varvec{\rho }\) is the same for both representations of this CTMC).
Lastly, this CTMC can again be reparameterized by combining the jump process transition probability matrix \(\widetilde{{\mathbf {P}}}\) and rate vector \(\varvec{\lambda }\) to yield this CTMC’s transition rate matrix (also sometimes called the infinitesimal generator matrix or simply the generator matrix) which is defined as follows. Let matrix \({\mathbf {G}}\) be the same dimension as \({\mathbf {P}}\) (and thus, \(\widetilde{{\mathbf {P}}}\)) and let the first n terms in the diagonal of \({\mathbf {G}}\) be the negative of the jump process rates, \(-\varvec{\lambda }\) (i.e., \(G_{ii}=-\lambda _i\), \(i\le n\)). Let the off diagonal entries of the first n rows of \({\mathbf {G}}\) be the jump process transition probabilities \({\widetilde{p}}_{ij}\) multiplied by the ith rate \(\lambda _i\) (i.e., \(G_{ij}=\lambda _i\,{\widetilde{p}}_{ij}\), \(j\ne i\)). Thus, the first row of \({\mathbf {G}}\) is
$$\begin{aligned} {[}\; -\lambda _1 \quad {\widetilde{p}}_{12}\lambda _1 \quad \cdots \quad {\widetilde{p}}_{1n}\lambda _1 \quad \cdots \quad {\widetilde{p}}_{1,n+m}\lambda _1 \; ] \end{aligned}$$
and so on. Since the transition rates out of absorbing states (i.e., the last m rows of \({\mathbf {G}}\)) are 0, \({\mathbf {G}}\) has the form
$$\begin{aligned} {\mathbf {G}}= \begin{bmatrix} {\mathbf {G}}_X&\quad {\mathbf {G}}_Y\\ 0&\quad 0 \\ \end{bmatrix}. \end{aligned}$$
(75)
Note that \(\widetilde{{\mathbf {P}}}\) and \(\varvec{\lambda }\) can be recovered from \({\mathbf {G}}\) using the definitions above.
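The reparameterization in Eqs. (73)–(75) is mechanical, and a short sketch may make the bookkeeping explicit. The function name `generator_from_P_and_R` and the nested-list representation are assumptions made for illustration; the code assumes constant \({\mathbf {P}}\) and \({\mathbf {R}}\) and \(p_{ii}<1\) for every transient state.

```python
def generator_from_P_and_R(P, R):
    """Build lambda (Eq. 73), P-tilde (Eq. 74) and G (Eq. 75) from P and R.

    P is the (n+m)-by-(n+m) matrix of Eq. (72) as nested lists, and R is the
    length-n rate vector for the transient states. Assumes p_ii < 1 for i < n.
    """
    n, size = len(R), len(P)
    lam = [R[i] * (1.0 - P[i][i]) for i in range(n)]
    P_tilde = [row[:] for row in P]            # absorbing rows are left unchanged
    G = [[0.0] * size for _ in range(size)]    # absorbing rows of G remain zero
    for i in range(n):
        for j in range(size):
            if j == i:
                P_tilde[i][j] = 0.0
                G[i][j] = -lam[i]
            else:
                P_tilde[i][j] = P[i][j] / (1.0 - P[i][i])
                G[i][j] = lam[i] * P_tilde[i][j]
    return lam, P_tilde, G


# Usage: two transient states (the first with a self-loop) and one absorbing state.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0]]
lam, P_tilde, G = generator_from_P_and_R(P, R=[2.0, 3.0])
print(lam, G)
```

Each row of the resulting \({\mathbf {G}}\) sums to zero, and the upper-left \(n\times n\) block is \({\mathbf {G}}_X\).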
This third parameterization of the given CTMC, determined solely by initial distribution \(\varvec{\rho }\) and transition rate matrix \({\mathbf {G}}\), can now be used to formally describe the phase-type distribution associated with this CTMC, i.e., the distribution of time spent in the transient states X before hitting absorbing states Y. Specifically, the phase-type distribution density function and CDF are
$$\begin{aligned} f(t)&=\; \varvec{\rho }^\text {T} \, e^{t\,{\mathbf {G}}_X}(-{\mathbf {G}}_X{\mathbf {1}}) \end{aligned}$$
(76a)
$$\begin{aligned} F(t)&=\; 1 - \varvec{\rho }^\text {T} \, e^{t\,{\mathbf {G}}_X}\,{\mathbf {1}} \end{aligned}$$
(76b)
where \({\mathbf {1}}\) is an \(n\times 1\) vector of ones. Importantly, this distribution depends only on the \(n \times n\) matrix \({\mathbf {G}}_X\) and length n initial distribution vector \(\varvec{\rho }\).
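Eqs. (76a)–(76b) can be evaluated directly once a matrix exponential is available. The sketch below uses only the standard library and a simple scaling-and-squaring Taylor approximation (`mat_exp`, a hypothetical helper written for this illustration); in practice one would more likely rely on an established routine such as `scipy.linalg.expm`.

```python
import math


def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_exp(A, terms=20):
    """exp(A) via scaling and squaring with a truncated Taylor series."""
    n = len(A)
    norm = max(sum(abs(v) for v in row) for row in A)
    s = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2 ** s for v in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(s):
        result = mat_mul(result, result)   # undo the scaling by repeated squaring
    return result


def phase_type_pdf_cdf(t, rho, G_X):
    """Eqs. (76a)-(76b): density f(t) and CDF F(t) of a phase-type distribution."""
    n = len(rho)
    E = mat_exp([[t * v for v in row] for row in G_X])
    row = [sum(rho[i] * E[i][j] for i in range(n)) for j in range(n)]  # rho^T e^{t G_X}
    exit_rates = [-sum(G_X[i]) for i in range(n)]                      # -G_X 1
    f = sum(row[j] * exit_rates[j] for j in range(n))
    F = 1.0 - sum(row)
    return f, F


# Usage: an Erlang distribution with rate 1.5 and shape 2 written in phase-type form.
f, F = phase_type_pdf_cdf(2.0, rho=[1.0, 0.0], G_X=[[-1.5, 1.5], [0.0, -1.5]])
print(f, F)
```

For this Erlang case the values can be compared with the closed forms \(f(t)=r^2 t\,e^{-rt}\) and \(F(t)=1-(1+rt)\,e^{-rt}\).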
The GLCT for phase-type distributions can now be stated as follows.
Corollary 2
(GLCT for phase-type distributions) Assume particles enter state X at rate \({\varLambda }(t)\) and that the dwell time distribution for a state X follows a continuous phase-type distribution given by the \(n \times 1\) initial probability vector \(\varvec{\rho }\) and \(n\times n\) matrix \({\mathbf {G}}_X\). Let \({\mathcal {I}}_{X_i}(t) = \rho _i{\varLambda }(t)\) and \(\mathbf {{\mathcal {I}}_X}(t) = [{\mathcal {I}}_{X_1}, \ldots , {\mathcal {I}}_{X_n}]^\text {T}\). Then Eq. (69) in Theorem 10 becomes
$$\begin{aligned} \frac{d}{dt}{\mathbf {x}}(t) =\; {\mathcal {I}}_{\mathbf {X}}(t) + {\mathbf {G}}_X^\text {T}\,{\mathbf {x}}(t). \end{aligned}$$
(77)
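To see why Eq. (69) takes this form when \({\mathbf {P}}_X\) and \({\mathbf {R}}\) are constant, note that Eqs. (73)–(74) give \(G_{ij}=r_i\,p_{ij}\) for \(j\ne i\) and \(G_{ii}=-r_i(1-p_{ii})\), so that \({\mathbf {G}}_X=\text {diag}({\mathbf {R}})\,({\mathbf {P}}_X-{\mathbf {I}})\). Here \(\text {diag}({\mathbf {R}})\) denotes the diagonal matrix with entries \(r_i\), a notation introduced only for this remark. It follows that
$$\begin{aligned} {\mathbf {P}}_X^\text {T}\,({\mathbf {R}}\circ {\mathbf {x}}) - {\mathbf {R}}\circ {\mathbf {x}} =\; ({\mathbf {P}}_X^\text {T}-{\mathbf {I}})\,\text {diag}({\mathbf {R}})\,{\mathbf {x}} =\; \big (\text {diag}({\mathbf {R}})\,({\mathbf {P}}_X-{\mathbf {I}})\big )^\text {T}\,{\mathbf {x}} =\; {\mathbf {G}}_X^\text {T}\,{\mathbf {x}}. \end{aligned}$$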
Example 9
(Serial LCT and hypoexponential distributions) Assume the dwell time in state X is given by the sum of independent (not identically distributed) Erlang distributions or, more generally, Poisson process \(k_i\)th event time distributions with rates \(r_i(t)\), i.e., \(T=\sum _i T_i\), \(i=1,\ldots ,N\) (note the special case where all \(k_i=1\) and \(r_i(t)=r_i\) are constant, which yields that T follows a hypoexponential distribution). Let \(n=\sum _i k_i\) and further assume particles go to \(Y_\ell \) with probability \(p_{\ell }\) upon leaving X, \(\ell =1,\ldots ,m\). Using the GLCT framework above, this corresponds to partitioning X into sub-states X\(_j\), where \(j=1,\ldots ,n\), and
$$\begin{aligned} {\mathbf {R}}(t)=[r_1(t),r_1(t),\ldots ,r_2(t),\ldots ,r_N(t)]^\text {T} \end{aligned}$$
(78)
where the first \(k_1\) elements of \({\mathbf {R}}(t)\) are \(r_1(t)\), the next \(k_2\) are \(r_2(t)\), etc., and
$$\begin{aligned} {\mathbf {P}}_X=\begin{bmatrix} 0&\quad 1&\quad 0&\quad \cdots&\quad 0&\quad 0 \\ 0&\quad 0&\quad 1&\quad \cdots&\quad 0&\quad 0 \\ \vdots&\quad \vdots&\quad \ddots&\quad \ddots&\quad \vdots&\quad \vdots \\ 0&\quad 0&\quad 0&\quad \ddots&\quad 1&\quad 0 \\ 0&\quad 0&\quad 0&\quad \cdots&\quad 0&\quad 1 \\ 0&\quad 0&\quad 0&\quad \cdots&\quad 0&\quad 0 \\ \end{bmatrix}_{n\times n}, \qquad {\mathbf {P}}_Y =\begin{bmatrix} 0&\quad 0&\quad \cdots&\quad 0 \\ \vdots&\quad \vdots&\quad \ddots&\quad \vdots \\ 0&\quad 0&\quad \cdots&\quad 0 \\ p_1&\quad p_2&\quad \cdots&\quad p_m \end{bmatrix}_{n\times m}. \end{aligned}$$
(79)
By the GLCT (Theorem 10), using \(r_{(j)}(t)\) to denote the jth element of \({\mathbf {R}}(t)\) in Eq. (78), the corresponding mean field equations are
$$\begin{aligned} \frac{d}{dt}x_1(t)&=\; {\mathcal {I}}_X(t) - r_1(t)\,x_1(t) \end{aligned}$$
(80a)
$$\begin{aligned} \frac{d}{dt}x_j(t)&=\; r_{(j-1)}(t)\,x_{j-1}(t) - r_{(j)}(t)\,x_j(t), \text { for } j=2,\ldots ,n, \end{aligned}$$
(80b)
$$\begin{aligned} y_\ell (t)&=\; y_\ell (0)S_{Y_\ell }(t,0) + \int _{0}^{t} \bigg ( {\mathcal {I}}_{Y_\ell }(\tau ) \nonumber \\&\quad + r_{(n)}(\tau )\,x_n(\tau )\,p_{\ell } \bigg ) S_{Y_\ell }(t,\tau )\,d\tau . \end{aligned}$$
(80c)
Note, the phase-type distribution form of the hypoexponential distribution could be used with Corollary 2 to derive the \(x_j\) equations in Eq. (80).
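The construction in Example 9 can be sketched in a few lines for the constant-rate case. The helper names `serial_chain` and `serial_chain_rhs` are hypothetical, and the recipient-state equation (80c) is not included.

```python
def serial_chain(ks, rates):
    """R (Eq. 78) and P_X (Eq. 79) for T = T_1 + ... + T_N with constant rates.

    ks[i] is the number of events k_i and rates[i] the rate r_i of stage i.
    """
    R = [r for k, r in zip(ks, rates) for _ in range(k)]   # r_i repeated k_i times
    n = len(R)
    P_X = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n)]
    return R, P_X


def serial_chain_rhs(x, inflow, R):
    """Eqs. (80a)-(80b): inflow enters x_1, then each sub-state feeds the next."""
    dx = [inflow - R[0] * x[0]]
    dx += [R[j - 1] * x[j - 1] - R[j] * x[j] for j in range(1, len(R))]
    return dx


# Usage: an Erlang(3.0, 2) stage followed by an exponential stage with rate 5.0.
R, P_X = serial_chain(ks=[2, 1], rates=[3.0, 5.0])
print(R, serial_chain_rhs([1.0, 1.0, 1.0], inflow=4.0, R=R))
```

With unit inflow, the equilibrium of these equations is \(x_j=1/r_{(j)}\), so the equilibrium total equals \(\sum _i k_i/r_i\), the mean of the summed dwell time.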
Example 10
(SIR model with Phase-Type Duration of Infectiousness.) Consider the simple SIR model given by Eq. (2) with mass action transmission \(\lambda (t)=\beta \,I(t)\). Suppose the assumed exponentially distributed dwell times in state I were to be replaced by a phase-type distribution with initial distribution vector \(\varvec{\alpha }\) and matrix \({\mathbf {A}}\) (note that, in some cases, it is possible to match the first three or more moments using only a \(2\times 2\) or \(3\times 3\) matrix \({\mathbf {A}}\) and note that Matlab and Python routines for making such estimates are freely available in Horváth and Telek 2017). Then, letting \({\mathbf {x}}\) be the vector of substates of I and \(I(t)=x_1(t)+\cdots +x_n(t)\), by the GLCT for phase-type distributions (Corollary 2) the corresponding mean field ODE model, with \(R(t)=N_0-S(t)-I(t)\), is
$$\begin{aligned} \frac{d}{dt}S(t)&=\; -\beta \,I(t)\,S(t) \end{aligned}$$
(81a)
$$\begin{aligned} \frac{d}{dt}{\mathbf {x}}(t)&=\; \varvec{\alpha }\,\beta \,I(t)\,S(t) + {\mathbf {A}}^\text {T}\,{\mathbf {x}}(t). \end{aligned}$$
(81b)
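To illustrate Eq. (81), the sketch below integrates the model with a fixed-step classical Runge-Kutta scheme using only the standard library. All parameter values, the function names, and the choice of an Erlang(2, 2) infectious period (mean 1) are assumptions made for the example. As a consistency check it evaluates the final size relation \(\ln (S_0/S_\infty )=\beta \,E[T]\,(N_0-S_\infty )\), which follows from integrating Eq. (81a) when \(R(0)=0\) and the initial infecteds start in the phases given by \(\varvec{\alpha }\).

```python
import math


def sir_phase_type_rhs(S, x, beta, alpha, A):
    """Eq. (81): dS/dt and dx/dt for infectious sub-states x, with I = sum(x)."""
    n = len(x)
    incidence = beta * sum(x) * S
    dS = -incidence
    dx = [alpha[i] * incidence + sum(A[j][i] * x[j] for j in range(n))  # A^T x
          for i in range(n)]
    return dS, dx


def simulate(S0, x0, beta, alpha, A, t_end, dt=0.01):
    """Integrate Eq. (81) with fixed-step RK4; returns S and x at t_end."""
    def f(state):
        dS, dx = sir_phase_type_rhs(state[0], state[1:], beta, alpha, A)
        return [dS] + dx

    state = [S0] + list(x0)
    for _ in range(round(t_end / dt)):
        k1 = f(state)
        k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
        k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
        k4 = f([s + dt * k for s, k in zip(state, k3)])
        state = [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state[0], state[1:]


# Assumed example: N0 = 1, beta = 2, Erlang(2, 2) infectious period (mean 1).
alpha = [1.0, 0.0]
A = [[-2.0, 2.0], [0.0, -2.0]]
S_end, x_end = simulate(S0=0.99, x0=[0.01, 0.0], beta=2.0,
                        alpha=alpha, A=A, t_end=60.0)
residual = math.log(0.99 / S_end) - 2.0 * 1.0 * (1.0 - S_end)
print(S_end, sum(x_end), residual)
```

Replacing `alpha` and `A` with another phase-type representation changes the dwell time distribution in I without altering the structure of the code.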