1 Introduction

1.1 Background

The majority of epidemic models fall either into the category of stochastic models (Bailey 1975; Bartlett 1956) or into the category of deterministic differential equation-based models (Anderson and May 1991; Kermack and McKendrick 1927). These two strands developed largely independently for much of the twentieth century. Thus, an interesting question arises as to the precise mathematical connection between stochastic and deterministic models. Frequently, deterministic descriptions apply to large populations where the stochastic effects can be treated as negligible. For small populations we shall assume that it is the average or expected behaviour of the epidemic that we are hoping to replicate with “deterministic” descriptions. This average behaviour is a system characteristic that is fully specified by the system and its initial conditions.

The first epidemic models were based on the assumption that populations are evenly mixed, with each individual equally likely to interact with any other individual at any time (Heathcote 2000). A classic example of this type of model is the Susceptible-Infectious-Removed (SIR) compartmental model whereby individuals are classified according to being in one of these three states. It has been shown that for this type of mean-field model, the average of many stochastic simulations (the expected outcome of the stochastic model) converges to the solution of the “equivalent” mean-field deterministic model in the limit of an infinite population size and subject to strict conditions regarding the initialisation of the epidemic (Kurtz 1970, 1971; Simon and Kiss 2011).

More recently, a higher degree of realism has been introduced by considering stochastic models on contact networks where individuals are only able to contact a limited subset of the population. This enables significant heterogeneity to be incorporated, treating individuals as distinct entities with fixed connectivity to pre-allocated neighbours. While stochastic models are readily extended to incorporate such systems, deterministic descriptions have been more problematic. Several methodologies have been developed including pair-approximation models (Keeling 1999; Keeling and Eames 2005; Rand 1999), degree-based models (Pastor-Satorras and Vespignani 2001), and models based on the probability generating function (PGF) formalism which are applicable to configuration networks (Volz 2008) as well as the related edge-based compartmental modelling (Miller et al. 2012; Miller and Volz 2012). It has been observed (House and Keeling 2011) that these models are, at some level, equivalent and are all derived from similar principles of independence. Although comparison with simulation of stochastic models can sometimes be good, the basic link remains obscure.

Typically there are two idealised scenarios in which exact correspondence between stochastic models and solvable deterministic descriptions has been shown. Firstly, correspondence has been shown to sometimes occur in the limit of infinite populations for particular idealised graphs (Ball and Neal 2008; Decreusefond et al. 2012) which cannot be exactly realised in practice. It can also occur with some very simplified systems whose symmetry properties can be exploited to achieve reductions in the stochastic description (Keeling and Ross 2008; Simon et al. 2011).

Here we consider a recently introduced class of model, related to the pair-approximation models, which give an exact correspondence between a deterministic description and the stochastic model for SIR epidemics on finite, time-invariant networks. Pair-approximation models were introduced into network-based epidemic and ecological theory in the 1990s to describe large populations of interacting individuals (Matsuda et al. 1992; Sato et al. 1994; Harada and Iwasa 1994; Rand 1999; Keeling 1999). They are an example of a hierarchy of equations which are truncated at the second order by an approximation (truncation at the first order corresponds to mean-field). This type of hierarchy was first considered in statistical physics and is sometimes known as the Bogoliubov–Born–Green–Kirkwood–Yvon (BBGKY) hierarchy (Kirkwood 1946, 1947; Born and Green 1946). Recently, related models have been considered at the level of individuals, variously called subsystem equations, moment dynamics equations, pair-based equations (Sharkey 2008, 2011; Baker and Simpson 2010; Markham et al. 2013). This method generates a solvable class of models which can encompass a significant amount of heterogeneity and enables a fundamental link with finite stochastic models (Sharkey 2008, 2011).

We consider a pair-based representation of Markovian SIR dynamics. We show that by considering subsystems at the level of pairs, a closure can be found that determines the expected infectious time series exactly for arbitrary network structures where the underlying graph is a tree and, in some special circumstances, for particular networks with cycles. We note that the recent, related message passing formulation of epidemics on contact networks developed by Karrer and Newman (2010) also enables an exact description of epidemic dynamics on finite tree graphs.

1.2 Statement of the Main Result

We consider an SIR compartmental model composed of P individuals whose states are described at any given point in time by vectors I and S with respective components I i and S i , i∈{1,2,…,P}, such that I i =1 if individual i is infectious (I i =0 otherwise) and S i =1 if individual i is susceptible (S i =0 otherwise). Transmission and recovery occur by Poisson processes with rate parameters \(\lambda_{i}=\sum_{j=1}^{P}T_{ij}I_{j}S_{i}\) and μ i =γ i I i , respectively, where T is a “transmission” matrix with (time-independent) elements T ij denoting the rate parameter for an infectious node j infecting a susceptible node i (T ii =0 for all i) and where γ i denotes the rate parameter for an infectious individual i to recover, enabling individual-specific removal rates.

As shown by Sharkey (2011), for any transmission matrix T and any nodes i,j, the following differential equations are provably exact (consistent with the stochastic model):

$$\begin{aligned} \begin{aligned} \dot{\langle S_i\rangle}&= -\sum _{j=1}^P T_{ij}\langle S_iI_j\rangle , \\ \dot{\langle I_i\rangle}&= \sum _{j=1}^P T_{ij}\langle S_iI_j\rangle-\gamma_i\langle I_i\rangle, \\ \dot{\langle S_iI_j\rangle}&= \sum _{k=1,k\neq i}^P T_{jk}\langle S_iS_jI_k\rangle-\sum _{k=1, k\neq j}^P T_{ik}\langle I_kS_iI_j\rangle -T_{ij}\langle S_iI_j\rangle- \gamma_j\langle S_iI_j\rangle, \\ \dot{\langle S_iS_j\rangle}&=-\sum _{k=1, k\neq j}^PT_{ik}\langle I_kS_iS_j\rangle-\sum _{k=1, k\neq i}^PT_{jk}\langle S_iS_jI_k\rangle, \end{aligned} \end{aligned}$$
(1)

where 〈S i 〉 and 〈I i 〉 denote the time-dependent probabilities (or equivalently the expected values of the indicator functions) for individual i to be susceptible and infectious, respectively, and expressions of the form 〈A i B j 〉 denote the time-dependent probability that individual i is in state A and individual j is in state B with a similar interpretation of terms of the form 〈A i B j C k 〉. Here and throughout, we adopt the dot notation to denote time derivatives. It follows that the expected population-level susceptible and infectious time series are given by \(\sum_{i=1}^{P}\langle S_{i}\rangle\) and \(\sum_{i=1}^{P}\langle I_{i}\rangle\), respectively.

Note that it is a short step (see Sharkey 2008) from (1) to the familiar population-level pair equations (Keeling 1999; Keeling and Eames 2005; Rand 1999), also proved independently by Taylor et al. (2012) for the susceptible-infectious-susceptible variant.

This system can be completed by formulating differential equations for the triples, quadruples, and so forth until we reach the full system size. This yields a self-contained system of differential equations that exactly determines the probabilities of each quantity given initial conditions. However, cascading these equations up to the full system size will usually result in a system that is impractical to solve due to its sheer size. This is why this system is typically closed at some level by introducing a functional relation approximating higher-order probabilities in terms of lower-order ones. One of the most frequently used closure relations can be written as

$$\begin{aligned} \langle A_iB_jC_k\rangle\approx \frac{\langle A_iB_j\rangle\langle B_jC_k\rangle}{\langle B_j\rangle} \end{aligned}$$
(2)

for the current context. Applying this closure relation to our system at the level of pairs we arrive at the following system:

$$\begin{aligned} \begin{aligned} \dot{\langle X_i\rangle}&= -\sum _{j=1}^P T_{ij}\langle X_iY_j\rangle , \\ \dot{\langle Y_i\rangle}&= \sum _{j=1}^P T_{ij}\langle X_iY_j\rangle-\gamma_i\langle Y_i\rangle, \\ \dot{\langle X_iY_j\rangle}&= \sum _{k=1,k\neq i}^P T_{jk}\frac {\langle X_iX_j\rangle\langle X_jY_k\rangle}{\langle X_j\rangle }-\sum _{k=1,k\neq j}^P T_{ik} \frac{\langle X_iY_k\rangle\langle X_iY_j\rangle}{\langle X_i\rangle}\\ &\quad\ {} -T_{ij}\langle X_iY_j\rangle- \gamma_j\langle X_iY_j\rangle, \\ \dot{\langle X_iX_j\rangle}&= -\sum _{k=1,k\neq j}^PT_{ik}\frac {\langle Y_kX_i\rangle\langle X_iX_j\rangle}{\langle X_i\rangle}-\sum _{k=1,k\neq i}^PT_{jk} \frac{\langle X_iX_j\rangle \langle X_jY_k\rangle}{\langle X_j\rangle}, \end{aligned} \end{aligned}$$
(3)

where we use X for susceptible and Y for infectious to emphasise that these are approximating differential equations based on the closure. When 〈X i 〉 in the denominator is zero, we assume that the approximation takes the value zero.

In general, we consider networks (graphs) with directed and undirected edges. In what follows, we use the terminology “tree graph” to include graphs with directed edges where the underlying (equivalent undirected) graph is a tree. Our main aim is to show that when matrix T represents a tree and the system is initiated in a pure system state (that is, one of the 3P possible configurations has probability 1 at time t=0), then the system can be closed at the level of pairs such that the closure holds exactly. Specifically, solving the closed system above, we obtain the same values for all marginal and pairwise-joint probabilities present in the unclosed system: 〈X i 〉=〈S i 〉, 〈Y i 〉=〈I i 〉, with similar equalities holding for the pairs.

In fact, we will prove the following theorem.

Theorem 1.1

Let us assume the following:

  • The graph (transmission network) is a tree (the underlying graph has no cycles).

  • The initial condition is a pure state, i.e. the system is initially in one of its 3P possible configurations with probability 1.

Then the following relations hold:

$$\begin{aligned} \langle S_j\rangle \langle S_iS_jI_k \rangle=\langle S_iS_j\rangle \langle S_jI_k\rangle \end{aligned}$$

for all i∈{1,2,…,P} and for all j with links towards i and all k with links towards j : ik;

$$\begin{aligned} \langle S_i\rangle \langle I_kS_iI_j \rangle=\langle I_kS_i\rangle \langle S_i I_j\rangle \end{aligned}$$

for all i∈{1,2,…,P} and for all j and k with links towards i: jk.

Remark

This theorem also holds for mixed (probabilistic) initial system states provided that the initial probabilities of the states of individuals in the system are uncorrelated. However, in general, mixed initial states cannot be represented exactly.

The theorem will be formulated in a more general context stating that even higher-order closure relations are also exact.

Figure 1 shows the numerical solution of (3) for a small network of 9 nodes where it is clear that it is accurate to within the precision visible on the graph. Matlab code for solving the system of Eqs. (3) is provided in Sharkey (2011). This code also works on networks which are not trees but is no longer exact in these cases. Cycles in the underlying graph of order three utilise the alternative closure 〈A i B j C k 〉=〈A i B j 〉〈B j C k 〉〈A i C k 〉/〈A i 〉〈B j 〉〈C k 〉 which is believed to gain increased accuracy in most circumstances, but these do not occur in the tree graphs considered in the present work.

Fig. 1
figure 1

(a) An undirected tree indicating two nodes which we infect to initiate epidemics, with all other nodes initially susceptible. (b) The mean (dots) of 100,000 stochastic simulations on the network with transmission rate τ=0.1 across each link and removal rate γ=0.05 for each node, with error bars denoting the 5th and 95th percentiles plotted together with the solution of (3) (solid line) using the Matlab code published with Sharkey (2011)

The structure of the paper is as follows. Section 2 introduces some notation which is needed to prove the result. This also contains an important theorem (Theorem 2.1) which specifies equations describing the probabilities of the states of arbitrary subsystems (the proof of this result is given in the Appendix). The relevant state space for our domain of a tree graph is then developed. Section 3 proves the main result, initially focusing on some special cases to help motivate and facilitate understanding of the main ideas and steps of the general proof in Sect. 3.5. The main ingredient for the general proof is Lemma 3.1 which is proved via Theorem 2.1. Theorem 3.1 then follows easily by induction from Lemma 3.1. The theorem as stated above is a simple corollary of Theorem 3.1. In Sect. 4 we discuss an application of the pair-based model to some special cases of graphs with cycles where it is also exact.

2 Formulating the Full System

In this section we introduce a new notation which will assist in formulating the set of differential equations for the full system. In (1) we formulated the differential equations up to the level of pairs and we said that this could be continued up to the full system level. This will be done formally here. In order to make the method clearer, using our existing notation let us first evaluate the full set of equations for the undirected line graph with three nodes which we refer to as the open triple, depicted in Fig. 2. Here we shall assume that the transmission rate parameter is τ across both links and that the removal rate is γ for all three nodes. Firstly we write all of the single node equations for this network. From (1):

$$\begin{aligned} \begin{aligned} \dot{\langle I_1\rangle}&= \tau\langle S_1I_2\rangle-\gamma\langle I_1\rangle, \\ \dot{\langle I_2\rangle}&= \tau\langle I_1S_2\rangle+\tau\langle S_2I_3 \rangle-\gamma\langle I_2\rangle, \\ \dot{\langle I_3\rangle}&= \tau\langle I_2S_3 \rangle-\gamma\langle I_3\rangle, \end{aligned} \end{aligned}$$
(4)

and

$$\begin{aligned} \begin{aligned} \dot{\langle S_1\rangle}&= -\tau\langle S_1I_2\rangle, \\ \dot{\langle S_2\rangle}&= -\tau\langle I_1S_2\rangle-\tau\langle S_2I_3 \rangle, \\ \dot{\langle S_3\rangle}&= -\tau\langle I_2S_3 \rangle. \end{aligned} \end{aligned}$$
(5)

We also need to specify the following equations for pairs:

$$\begin{aligned} \begin{aligned} \dot{\langle S_1I_2\rangle}&= \tau\langle S_1S_2I_3\rangle- \tau \langle S_1I_2\rangle-\gamma\langle S_1I_2 \rangle, \\ \dot{\langle I_1S_2\rangle}&= -\tau\langle I_1S_2I_3\rangle-\tau \langle I_1S_2\rangle-\gamma\langle I_1S_2 \rangle, \\ \dot{\langle S_2I_3\rangle}&= -\tau\langle I_1S_2I_3\rangle-\tau \langle S_2I_3\rangle-\gamma\langle S_2I_3 \rangle, \\ \dot{\langle I_2S_3\rangle}&= \tau\langle I_1S_2S_3\rangle-\tau \langle I_2S_3\rangle-\gamma\langle I_2S_3 \rangle. \end{aligned} \end{aligned}$$
(6)

Finally, at the triple level we have from the master equation (since the system has only three nodes):

$$\begin{aligned} \begin{aligned} \dot{\langle S_1S_2I_3 \rangle}&= -\tau\langle S_1S_2I_3\rangle - \gamma\langle S_1S_2I_3\rangle, \\ \dot{\langle I_1S_2I_3 \rangle}&= -2\tau\langle I_1S_2I_3\rangle -2\gamma\langle I_1S_2I_3\rangle, \\ \dot{\langle I_1S_2S_3\rangle}&= -\tau \langle I_1S_2S_3\rangle- \gamma\langle I_1S_2S_3\rangle. \end{aligned} \end{aligned}$$
(7)
Fig. 2
figure 2

Open triple graph

In order to formulate the full system for an arbitrary graph, we introduce notation for the subsystem states.

2.1 Notation for System and Subsystem States

In general, our stochastic system (which we denote by Γ) comprises of P individuals, each of which may be in any of the S, I or R states at any given time. In total, this corresponds to 3P possible states. Denoting these system states by Γ α, α∈{1,2,…,3P}, the probabilities for each state are given by the master equation (or Kolmogorov equations):

$$\begin{aligned} \dot{\bigl\langle \varGamma^\alpha\bigr\rangle }=\sum _{\beta=1}^{3^P}\sigma ^{\alpha\beta}\bigl\langle \varGamma^\beta\bigr\rangle -\sum_{\beta =1}^{3^P} \sigma^{\beta\alpha}\bigl\langle \varGamma^\alpha\bigr\rangle , \end{aligned}$$
(8)

where σ denotes a constant matrix of Poisson rate parameters. The master equation completely describes our stochastic system using a set of 3P ordinary differential equations. Our overall objective is to show that (3) is implied by the master equation when T represents a tree graph.

It is useful for us to define a general subsystem ψ W comprising of r nodes in Γ indexed by vector W of length r: W=(W 1,W 2,…,W i ,…,W r ), W i ∈{1,2,…,P}, where we can assume W 1<W 2<⋯<W r . We assume that the network connections of the nodes of ψ W are a subset of the connections of Γ.

Let \(\psi_{W}^{A}\) denote the state of subsystem ψ W where A=(A 1,A 2,…,A r ) and A i ∈{S,I,R} ∀i∈{1,2,…,r} is a sequence of S,I and R symbols of length r such that the state of node W i is A i . In terms of the notation of the previous section for subsystems of single nodes and pairs of nodes, we have \(S_{i}=\psi_{i}^{S}\), \(I_{i}=\psi_{i}^{I}\), \(S_{i}I_{j}=\psi_{i,j}^{SI}\), \(S_{i}S_{j}I_{k}=\psi_{i,j,k}^{SSI}\) etc. We shall use these two notations interchangeably. We shall also sometimes treat indexing vectors such as W as sets such that nW means that the node n is in the subsystem ψ W .

In general, although we can specify the states of each node with this type of notation, an important ambiguity remains because information about the network structure is not included. To remove this ambiguity, the notation should normally be used in the context of a sketch of the relevant network structure or where the network structure is clear from the context of its use (as in (1)).

Let us now show how the differential equations of the different subsystem states can be formulated in general.

2.2 Differential Equations for Subsystems

Here we obtain differential equations describing the rate of change of the state of any subsystem. First we make some definitions.

Definition 2.1

A neighbour of node i is a node with a network link directed towards i.

Definition 2.2

N i denotes the set of neighbours of node i. That is: T ij ≠0 ∀jN i .

Definition 2.3

For the subsystem \(\psi_{W}^{A}\), if node W k is infectious then:

$$\begin{aligned} h_{W_k}\bigl(\psi_W^A\bigr)= \psi_W^{A_1\cdots A_{k-1}SA_{k+1}\cdots A_r}. \end{aligned}$$

Otherwise, \(h_{W_{k}}(\psi_{W}^{A})=\psi_{W}^{A}\)

Remark

This operator changes the state of node W k in subsystem ψ W to S if it is infectious. If node W k is susceptible or removed then it leaves the state unchanged.

Definition 2.4

For the subsystem state \(\psi_{W}^{A}\) of r nodes, a subsystem of r+1 nodes can be generated as follows: Take k∈{1,2,…,r} and take a neighbour n of W k outside of the subsystem with a network link towards W k , i.e. let \(n\in N_{W_{k}}\), nW. If A k =S, then the generated subsystem state of (r+1) nodes is given by the generating rule:

$$\begin{aligned} g^n_{W_k}\bigl(\psi_W^A\bigr)= \psi^{A_1\cdots A_rI}_{W_1,\ldots,W_r,n}, \end{aligned}$$

i.e. the subsystem is extended by an infected at node n which is connected towards W k . If A k =I, then the generated subsystem state is given by:

$$\begin{aligned} g^n_{W_k}\bigl(\psi_W^A\bigr)= \psi ^{A_1\cdots A_{k-1}SA_{k+1}\cdots A_rI}_{W_1,\ldots,W_r,n}, \end{aligned}$$

i.e. the subsystem is extended by an infected at node n which is connected towards W k and the state of node W k is changed from I to S.

To complete the definition, if A k =R then the operator \(g_{W_{k}}^{n}\) leaves the subsystem unchanged. We also assume that for any state A k where there is no link from node n to node W k in the transmission matrix T, then the subsystem is also left unchanged.

Remark

The generated order r+1 subsystem is obtained by replacing a susceptible or infectious node W k in the original subsystem by an SI arc such that the S node of the arc is put in the place of the node W k and where the I node of the arc is external to the subsystem.

Definition 2.5

For the subsystem \(\psi_{W}^{A}\), if node W k is removed then:

$$\begin{aligned} f_{W_k}\bigl(\psi_W^A\bigr)= \psi_W^{A_1\cdots A_{k-1}IA_{k+1}\cdots A_r}. \end{aligned}$$

Otherwise, \(f_{W_{k}}(\psi_{W}^{A})=\psi_{W}^{A}\).

Definition 2.6

For any subsystem ψ W of r nodes in state \(\psi_{W}^{A}\) we define \(D_{k}^{Aa}\) where k∈{1,2,…,r} and a∈{S,I,R} to have value 1 if A k =a and to have value zero otherwise:

$$\begin{aligned} D_k^{Aa}=\left \{ \begin{array}{l@{\quad}l} 1 & \textrm{if $A_{k}=a$}, \\ 0 & \textrm{otherwise}. \end{array} \right . \end{aligned}$$

Theorem 2.1

The rate of change of the probability of a subsystem state \(\psi^{A}_{W}\) is:

(9)

The proof of this theorem is a rather long diversion and can be found in the Appendix.

As an example of applying the theorem, we can use it to obtain the set of subsystem equations (1) by considering each equation in turn:

  • If the subsystem is a single susceptible individual \(\psi_{i}^{S}\), then r=1 so k can only take the value k=1 where W 1=i and A 1=S, reducing (9) to:

    $$\begin{aligned} \dot{\bigl\langle \psi_i^S\bigr\rangle }=-\sum _{n=1,n\neq i}^PT_{in}\bigl\langle \psi_{i,n}^{SI}\bigr\rangle . \end{aligned}$$

    The first term on the second line of (9) is zero because T ii =0, and the other terms are zero because \(D_{1}^{SI}=0\) and \(D_{1}^{SR}=0\).

  • For an infectious individual \(\psi_{i}^{I}\) we obtain:

    $$\begin{aligned} \dot{\bigl\langle \psi_i^I\bigr\rangle }=\sum _{n=1,n\neq i}^PT_{in}\bigl\langle \psi_{i,n}^{SI}\bigr\rangle -\gamma_i\bigl\langle \psi _i^I\bigr\rangle , \end{aligned}$$

    where the first term on the second line of (9) is zero because T ii =0 and the last term is zero because \(D_{1}^{IR}=0\).

  • If the subsystem is the pair \(\psi_{i,j}^{SI}\) then the sum over k is over k=1 and k=2 and W 1=i, W 2=j, A 1=S, A 2=I so:

    $$\begin{aligned} \dot{\bigl\langle \psi_{i,j}^{SI}\bigr\rangle } =&- \sum_{n=1,n\notin\{i,j\} }^PT_{in}\bigl\langle \psi_{n,i,j}^{ISI}\bigr\rangle -T_{ij}\bigl\langle \psi _{i,j}^{SI}\bigr\rangle \\ &{}+\sum_{n=1,n\notin\{i,j\}}^PT_{jn} \bigl\langle \psi _{i,j,n}^{SSI}\bigr\rangle -\gamma_j \bigl\langle \psi_{i,j}^{SI}\bigr\rangle , \end{aligned}$$

    where the first line corresponds to k=1 and the second to k=2.

  • If the subsystem is the pair \(\psi_{i,j}^{SS}\) then the sum is over k=1 and k=2 where W 1=i, W 2=j, A 1=S and A 2=S so:

    $$\begin{aligned} \dot{\bigl\langle \psi_{i,j}^{SS}\bigr\rangle }=-\sum _{n=1,n\notin\{i,j\}}^PT_{in}\bigl\langle \psi_{n,i,j}^{ISS}\bigr\rangle -\sum _{n=1,n\notin\{i,j\}}^PT_{jn}\bigl\langle \psi_{i,j,n}^{SSI}\bigr\rangle , \end{aligned}$$

    where both terms come from the first line of (9).

We have therefore obtained (1) in a slightly different notation (recall that T ii =0 ∀i∈{1,2,…,P}).

2.3 The State Space for a Tree Graph

Here we build up a state space which is sufficient to describe a tree graph. We first make some definitions.

Definition 2.7

An r-motif is a subsystem of Γ comprising of r nodes and of network links such that it forms a weakly connected network.

Definition 2.8

An r-state is the state of an r-motif.

The state space that we need to consider is built up inductively from the states of single nodes by considering the infection process. Starting with the infected states of the single nodes \(\psi_{i}^{I}\), i∈{1,2,…,P}, (9) shows that they depend on the 2-states \(\psi_{i,j}^{SI}\), jN i as described by the generating rule (Definition 2.4).

The differential equations for \(\langle\psi_{i,j}^{SI}\rangle\) in turn contain the 3-states \(\psi_{i,j,k}^{SSI}\), kN j and \(\psi_{k,i,j}^{ISI}\), kN i . The differential equations for the 3-states contain 4-states and typically, the differential equations for r-states contain (r+1)-states for r∈{1,2,…,(P−1)}. This state generation process can continue until we reach P-states which can only depend on other P-states.

Note that this process always forms subsystems which are motifs and that the motif states can never include removed nodes.

Definition 2.9

An out-neighbour of node i is a node with a network link from i towards it.

Proposition 2.1

For a tree graph, if the out-neighbours of the I nodes are all S in an r-motif, then this is true for all (r+1)-motifs generated from this r-motif.

Proof

This follows easily from the definition of the generating rule (Definition 2.4). □

Definition 2.10

Consider a tree graph and take the 1-motifs with I nodes: \(\psi_{i}^{I}, i\in\{ 1,2,\ldots,P\}\). The “basic state space” M is formed by these 1-states together with the set of motif states that can be iteratively generated from them using the generating rule (Definition 2.4).

Remark

Due to the method of its construction, the state space M gives a self-contained system of differential equations, i.e. the time derivatives of the probabilities of each motif state can be expressed in terms of the probabilities of other motif states in the state space. An example in the case of the open triple is given by the motifs in (4), (6) and (7).

Definition 2.11

Consider a tree graph and the 1-states: \(\{\psi_{i}^{I},\psi_{i}^{S} : i\in\{ 1,\ldots,P\}\}\) and the 2-states with SS, i.e. \(\{\psi_{i,j}^{SS} : i\in\{1,\ldots,P\}, j\in N_{i}\}\). The “extended state space” \(\bar{M}\) comprises of these motifs states together with the set of motif states that can be generated from them by repeated iteration of the generating rule.

Remark

The extended state space is required to form the relevant closure relations. Due to the method of its construction, it is also self-contained.

Lemma 2.1

Let \(\psi_{W}^{A}\in\bar{M}\). Then the out-neighbour of an I node is an S node in \(\psi_{W}^{A}\).

Proof

Follows from Proposition 2.1. □

Lemma 2.2

For a tree graph, the equation for the time derivative of the probability of an r-state \(\psi_{W}^{A}\in\bar{M}\) is given by:

(10)

Proof

For these states we have \(D_{k}^{AR}=0\ \forall k\in\{1,2,\ldots,r\}\). Additionally, when \(D_{k}^{AI}=1\), the first term on the second line of (9) never arises because \(D_{l}^{AI}=1\) implies that an I is connected to an I node in r-state \(\psi_{W}^{A}\) which contradicts Lemma 2.1. Therefore (9) reduces to (10). □

Let us now formulate the exact closure relations and prove our main result.

3 Closure Relation and Proof of the Main Result

The exactness of (3) is straightforward to see provided that outbreaks of epidemics are always initiated with a single infected individual. We prove this first before considering the general case.

3.1 Proof for Single Initial Infected

When infection is initiated on a tree graph at a single individual, infection must always proceed in linear chains. Consequently there is no possibility of the state I k S j I i illustrated in Fig. 3 arising because an infection initiated at either k or i must pass through j to get to the other node. Furthermore,

$$\begin{aligned} \langle S_jI_k\rangle=\langle S_iS_jI_k \rangle +\langle I_iS_jI_k\rangle+\langle R_iS_jI_k\rangle, \end{aligned}$$

but since 〈I i S j I k 〉=0 and consequently 〈R i S j I k 〉=0, we have:

$$\begin{aligned} \langle S_jI_k\rangle=\langle S_iS_jI_k \rangle \end{aligned}$$

reducing (1) to the following closed system:

$$\begin{aligned} \dot{\langle S_i\rangle} =&-\sum _{j=1}^P T_{ij}\langle S_iI_j\rangle , \\ \dot{\langle I_i\rangle} =&\sum _{j=1}^P T_{ij}\langle S_iI_j\rangle-\gamma_i\langle I_i\rangle, \\ \dot{\langle S_iI_j\rangle} =&\sum _{k=1, k\neq i}^P T_{jk}\langle S_jI_k\rangle-T_{ij}\langle S_iI_j\rangle-\gamma_j\langle S_iI_j\rangle, \\ \dot{\langle S_iS_j\rangle} =&-\sum _{k=1, k\neq j}^PT_{ik}\langle I_kS_i\rangle-\sum_{k=1,k\neq i}^PT_{jk} \langle S_jI_k\rangle. \end{aligned}$$

Similar arguments show that this can be written in the form of (3).

Fig. 3
figure 3

Shown is a state which cannot arise on a tree graph where there is only one initially infectious node

More generally, this argument also applies to any tree graph where there is at most one network path by which any susceptible individual in the network can become infectious from the initial configuration of infected individuals.

Before discussing the general proof for any tree graph with multiple initially infected individuals, we consider two very simple example networks which will serve to motivate and illustrate the method of proof.

3.2 Proof for an Open Triple

Here we consider the case for the open triple depicted in Fig. 2. The equations for the probabilities of the basic state space M are given in (4), (6) and (7). To form the relevant closure relations, we require the equations for the extended state space \(\bar{M}\) formed by the equations for M together with (5),

$$\begin{aligned} \dot{\langle S_1S_2\rangle}=-\tau\langle S_1S_2I_3\rangle\quad \textrm{and}\quad\dot{\langle S_2S_3\rangle}=-\tau\langle I_1S_2S_3\rangle. \end{aligned}$$
(11)

Our objective is to close the system at the level of pairs using the closure relation (2), eliminating the need for differential equations describing triples (7), and show that the system remains exact. We note that the exactness of (3) can be proved in this case along the lines of the previous argument by considering each possible initial condition separately; however, the approach discussed here will be more useful for understanding the general case.

We need to consider closures for the triples 〈I 1 S 2 I 3〉, 〈I 1 S 2 S 3〉 and 〈S 1 S 2 I 3〉. Let us consider the closure:

$$\begin{aligned} \langle I_1S_2I_3\rangle\approx \frac{\langle I_1S_2\rangle\langle S_2I_3\rangle}{\langle S_2\rangle}. \end{aligned}$$

This is exact if α(t)=0 where

$$\begin{aligned} \alpha(t)=\langle S_2\rangle \langle I_1S_2I_3 \rangle-\langle I_1S_2\rangle \langle S_2I_3\rangle \end{aligned}$$

and 〈S 2〉≠0. Taking the derivative of α with respect to time gives

$$\begin{aligned} \dot{\alpha}(t)=\dot{\langle S_2\rangle}\langle I_1S_2I_3 \rangle+\langle S_2\rangle\dot{\langle I_1S_2I_3 \rangle}-\dot{\langle I_1S_2\rangle}\langle S_2I_3\rangle -\langle I_1S_2 \rangle\dot{\langle S_2I_3\rangle}. \end{aligned}$$

Substituting the relevant derivatives in from (5)–(7) and cancelling terms reduces this to

$$\begin{aligned} \dot{\alpha}(t)=-2(\tau+\gamma)\alpha(t) , \end{aligned}$$

so:

$$\begin{aligned} \alpha(t)=\alpha(0)e^{-2(\tau+\gamma)t}. \end{aligned}$$

Now it is easily verified that provided the system is initiated in a specific system state then α(0)=0. Consequently α(t)=0 for all t≥0 and the closure is exact.

By symmetry, it will suffice to consider one of the remaining two triples in (6). We wish to show that α(t)=0 where

$$\begin{aligned} \alpha(t)=\langle S_2\rangle \langle S_1S_2I_3 \rangle -\langle S_1S_2\rangle \langle S_2I_3\rangle. \end{aligned}$$

Here it is necessary to also use (11) for pairs of type SS in the extended state space. This closure is not established immediately, but there is a two-step process to establishing that α(t)=0 which the reader can verify by analogy with the example of the star graph in the next section.

3.3 Proof for a Star Graph

We now consider the case of the undirected star graph with P=4 shown in Fig. 4, where again we assume that the strength is the same across each network link and is denoted by τ and the removal rate for each node is γ. Writing down the equations of the extended state space, there are two types of closure which need to be proved: one for the SSI triples and one for the ISI triples (see (1)). The graph has three triples ((1,4,3),(2,4,3),(1,4,2)), but it is sufficient to prove exactness for one of them. Hence we want to prove the following two relations:

$$\begin{aligned} \begin{aligned} \langle S_4\rangle \langle S_1I_3S_4 \rangle&= \langle S_1S_4\rangle \langle I_3S_4\rangle, \\ \langle S_4\rangle \langle I_1I_3S_4 \rangle&= \langle I_1S_4\rangle \langle I_3S_4\rangle. \end{aligned} \end{aligned}$$
(12)

For brevity, we adopt the alternative notation:

$$\begin{aligned} \bigl\langle \psi_4^S\bigr\rangle \bigl\langle \psi_{1,3,4}^{SIS}\bigr\rangle -\bigl\langle \psi_{1,4}^{SS} \bigr\rangle \bigl\langle \psi_{3,4}^{IS}\bigr\rangle =&0, \\ \bigl\langle \psi_4^S\bigr\rangle \bigl\langle _{1,3,4}^{IIS}\bigr\rangle -\bigl\langle \psi _{1,4}^{IS}\bigr\rangle \bigl\langle \psi_{3,4}^{IS} \bigr\rangle =&0. \end{aligned}$$

We introduce

$$\begin{aligned} \alpha_1=\bigl\langle \psi_4^S\bigr\rangle \bigl\langle \psi_{1,3,4}^{SIS}\bigr\rangle -\bigl\langle \psi_{1,4}^{SS}\bigr\rangle \bigl\langle \psi_{3,4}^{IS} \bigr\rangle . \end{aligned}$$

By differentiating this, substituting in from the process equations and grouping terms, we obtain

$$\begin{aligned} \dot{\alpha}_1=-(\tau+\gamma)\alpha_1-\tau \alpha_2-\tau\alpha _3-\tau\alpha_4, \end{aligned}$$
(13)

where:

$$\begin{aligned} \alpha_2 =&\bigl\langle \psi_{1,4}^{IS} \bigr\rangle \bigl\langle \psi _{1,3,4}^{SIS}\bigr\rangle -\bigl\langle \psi_{1,4}^{SS}\bigr\rangle \bigl\langle \psi _{1,3,4}^{IIS}\bigr\rangle , \\ \alpha_3 =&\bigl\langle \psi_{2,4}^{IS} \bigr\rangle \bigl\langle \psi _{1,3,4}^{SIS}\bigr\rangle -\bigl\langle \psi_{1,4}^{SS}\bigr\rangle \bigl\langle \psi _{2,3,4}^{IIS}\bigr\rangle , \\ \alpha_4 =&\bigl\langle \psi_4^S \bigr\rangle \bigl\langle \psi _{1,2,3,4}^{SIIS}\bigr\rangle -\bigl\langle \psi_{1,2,4}^{SIS}\bigr\rangle \bigl\langle \psi_{3,4}^{IS}\bigr\rangle . \end{aligned}$$

Differentiating α 2 we get

$$\begin{aligned} \dot{\alpha}_2=-2(\tau+\gamma)\alpha_2-\tau \alpha_5-\tau\alpha_6, \end{aligned}$$
(14)

where:

$$\begin{aligned} \alpha_5 =&\bigl\langle \psi_{1,2,4}^{IIS} \bigr\rangle \bigl\langle \psi _{1,3,4}^{SIS}\bigr\rangle -\bigl\langle \psi_{1,2,4}^{SIS}\bigr\rangle \bigl\langle \psi _{1,3,4}^{IIS}\bigr\rangle , \\ \alpha_6 =&\bigl\langle \psi_{1,4}^{IS} \bigr\rangle \bigl\langle \psi _{1,2,3,4}^{SIIS}\bigr\rangle -\bigl\langle \psi_{1,4}^{SS}\bigr\rangle \bigl\langle \psi _{1,2,3,4}^{IIIS}\bigr\rangle . \end{aligned}$$

The derivatives of α 3 and α 4 can be obtained similarly.

Fig. 4
figure 4

Star graph with P=4 nodes

Differentiating α 5 we get

$$\begin{aligned} \dot{\alpha}_5=-3(\tau+\gamma)\alpha_5-\tau \alpha_7-\tau\alpha_8, \end{aligned}$$
(15)

where:

$$\begin{aligned} \alpha_7 =&\bigl\langle \psi_{1,2,3,4}^{IIIS} \bigr\rangle \bigl\langle \psi _{1,3,4}^{SIS}\bigr\rangle -\bigl\langle \psi_{1,2,3,4}^{SIIS}\bigr\rangle \bigl\langle \psi_{1,3,4}^{IIS}\bigr\rangle , \\ \alpha_8 =&\bigl\langle \psi_{1,2,4}^{IIS} \bigr\rangle \bigl\langle \psi _{1,2,3,4}^{SIIS}\bigr\rangle -\bigl\langle \psi_{1,2,4}^{SIS}\bigr\rangle \bigl\langle \psi_{1,2,3,4}^{IIIS}\bigr\rangle . \end{aligned}$$

The derivative for α 6 can also be obtained. Finally, differentiating α 7 and α 8 we obtain:

$$\begin{aligned} \dot{\alpha}_7=-4(\tau+\gamma)\alpha_7 \quad\textrm{and} \quad \dot{\alpha}_8=-4(\tau+\gamma) \alpha_8. \end{aligned}$$
(16)

To conclude the proof of the exactness of the closure, we first assume that the initial state is not mixed; that is, one of the 34=81 possible configurations has probability 1 at t=0. Then it is easy to see that α j (0)=0 for all j∈{1,2,…,8} (see Lemma 3.2 in Sect. 3.5 for a proof in a more general context). Hence the differential equations for α 7 and α 8 show that α 7(t)=0 and α 8(t)=0 for all t≥0. The differential equation for α 5 then implies that α 5=0 ∀t≥0 (and similarly for α 6). This implies that α 2=0 (and similarly α 3=α 4=0). The differential equation for α 1 shows that α 1=0 which is what we wanted to show. The other triple closure in (12) can be proved similarly.

Remark

In fact, we have proved several closure relations α j =0 ∀j∈{1,2,…,8}.

The closure relations each consist of two pairs which are visualised in Fig. 5. For reference, we refer to these as the left pair and the right pair referring to their position in this figure. Looking at these closure relations, we can form two observations:

  1. 1.

    For a given node i, the number of times it appears as S i is the same in the left and the right pair, and similarly with the number of times it appears as I i . For example, with α 5, node 1 (the left node) has one I and one S for both pairs and node 4 (central node) has two S’s in both pairs. For α 6, the number of I’s at nodes 1, 2, 3 and 4 is (1,1,1,0) in both pairs and the number of S’s is given by (1,0,0,2) in both pairs.

  2. 2.

    Any SI pairing on the left appears exactly the same number of times on the right. For example in α 7, I 1 S 4 appears once on the left and once on the right and I 3 S 4 appears twice on the left and twice on the right. Observing that only SS pairs and IS pairs appear in the closure relations, a consequence is that SS pairs also have this property.

These observations will be of key importance for developing the general proof in the following two sections.

Fig. 5
figure 5

Each box illustrates the relevant node states for the four parts of the closure relation in the equation above it. The node states on the left and the right correspond to the two terms in the closure relation. The node numbers correspond to the same positions as in Fig. 4

3.4 General Closure Relations

In general, to show that the closure relationship (2) is exact for the tree graph, we need to show that α=0 where

$$\begin{aligned} \alpha=\langle B_j\rangle \langle A_iB_jC_k \rangle-\langle A_iB_j\rangle \langle B_jC_k\rangle, \end{aligned}$$

B=S and A,C∈{S,I}. Our proof of this is via induction using a sequence of closures analogous to the proof in the case of the star graph in Sect. 3.3.

We shall consider many closure relations. In general we specify that they are composed of two pairs of motif states \((\psi_{W}^{A},\psi_{X}^{B})\) and \((\psi_{Y}^{C},\psi_{Z}^{D})\) and that the closure is exact if α=0 where

$$\begin{aligned} \alpha=\bigl\langle \psi_W^A\bigr\rangle \bigl\langle \psi_X^B\bigr\rangle -\bigl\langle \psi _Y^C\bigr\rangle \bigl\langle \psi_Z^D \bigr\rangle . \end{aligned}$$

We formalise the observations we made about the closure relations for the star graph at the end of Sect. 3.3 by defining what we term “compatible pairs”.

Definition 3.1

For all \(a\in\{S,I,R\} \textrm{ and } i\in\{1,2,\ldots,P\}, \psi_{i}^{a}\subset\psi_{W}^{A}\Leftrightarrow\exists j\) s.t. W j =i and A j =a.

For all \(a_{1}, a_{2}\in\{S,I,R\} \textrm{ and } i_{1}, i_{2}\in\{ 1,2,\ldots,P\} : i_{1}\neq i_{2}, \psi^{a_{1}a_{2}}_{i_{1},i_{2}}\subset\psi _{W}^{A}\Leftrightarrow\exists j_{1},j_{2} \textrm{ s.t. } W_{j_{1}}=i_{1},W_{j_{2}}=i_{2}\textrm{ and } A_{j_{1}}=a_{1},A_{j_{2}}=a_{2}\).

Remark

In general, the notation \(\psi_{X}^{B}\subset\psi_{W}^{A}\) denotes that the state of subsystem \(\psi_{X}^{B}\) is implied by the state of subsystem \(\psi_{W}^{A}\) because it is contained within it.

Definition 3.2

Two pairs of motif states \((\psi_{W}^{A},\psi_{X}^{B})\) and \((\psi _{Y}^{C},\psi_{Z}^{D})\) are called compatible pairs if the following conditions are met:

  • CP(i) \((\psi_{i}^{a}\subset\psi_{W}^{A}\textrm{ or }\psi _{i}^{a}\subset\psi_{X}^{B} )\Leftrightarrow (\psi_{i}^{a}\subset \psi_{Y}^{C}\textrm{ or }\psi_{i}^{a}\subset\psi_{Z}^{D} )\)

  • CP(ii) \((\psi_{i}^{a}\subset\psi_{W}^{A}\textrm{ and }\psi _{i}^{a}\subset\psi_{X}^{B} )\Leftrightarrow (\psi_{i}^{a}\subset \psi_{Y}^{C}\textrm{ and }\psi_{i}^{a}\subset\psi_{Z}^{D} )\)

  • CP(iii) \((\psi^{IS}_{i_{1},i_{2}}\subset\psi_{W}^{A}\textrm{ or }\psi^{IS}_{i_{1},i_{2}}\subset\psi_{X}^{B} )\Leftrightarrow (\psi^{IS}_{i_{1},i_{2}}\subset\psi_{Y}^{C}\textrm{ or } \psi ^{IS}_{i_{1},i_{2}}\subset\psi_{Z}^{D} )\)

  • CP(iv) \((\psi^{IS}_{i_{1},i_{2}}\subset\psi_{W}^{A}\textrm{ and }\psi^{IS}_{i_{1},i_{2}}\subset\psi_{X}^{B} )\Leftrightarrow (\psi^{IS}_{i_{1},i_{2}}\subset\psi_{Y}^{C}\textrm{ and } \psi ^{IS}_{i_{1},i_{2}}\subset\psi_{Z}^{D} )\)

  • CP(v) Same as CP(iii) and CP(iv) but with SS pairs

where a∈{S,I,R}.

Definition 3.3

Let \(\psi_{W}^{A}\) be an r-state and \(\psi_{X}^{B}\) be a q-state. Then the order of the pair \((\psi_{W}^{A},\psi_{X}^{B})\) is defined as r+q.

Proposition 3.1

If \((\psi_{W}^{A},\psi_{X}^{B})\) and \((\psi_{Y}^{C},\psi_{Z}^{D})\) are compatible pairs, then their order is equal.

Proof

Follows from CP(i) and CP(ii). □

Proposition 3.2

For a tree graph, applying the transformation h i to each of the four motif states in compatible pairs that contain node i generates compatible pairs.

Proof

The transformation satisfies CP(i) and CP(ii) because it replaces \(\psi _{i}^{a}=\psi_{i}^{I}\) with \(\psi_{i}^{a}=\psi_{i}^{S}\) which does not alter the form of the conditions. The transformation satisfies CP(iii) and CP(iv) because all IS pairs where i is the infected individual are removed by this transformation. New IS pairs cannot be created by the transformation since this would require II pairs which are prohibited for tree graphs by Lemma 2.1. CP(v) is satisfied because the transformation leaves existing SS pairs unchanged and created SS pairs result from existing IS pairs so are balanced on each side. □

3.5 Proof of the Main Result

Lemma 3.1

Let \((\psi_{W}^{A},\psi_{X}^{B})\) and \((\psi_{Y}^{C},\psi_{Z}^{D})\) be compatible pairs or order R and \(\psi_{W}^{A},\psi_{X}^{B},\psi_{Y}^{C},\psi_{Z}^{D}\in\bar {M}\). Let

$$\begin{aligned} \alpha_0=\bigl\langle \psi_W^A\bigr\rangle \bigl\langle \psi_X^B\bigr\rangle -\bigl\langle \psi_Y^C\bigr\rangle \bigl\langle \psi_Z^D \bigr\rangle . \end{aligned}$$

Then

$$\begin{aligned} \dot{\alpha_0}=\sum_{p=1}^mc_p \alpha_p+c_0\alpha_0, \end{aligned}$$
(17)

where each α p can be expressed as

$$\begin{aligned} \alpha_p=\bigl\langle \psi^{\bar{A}}_{\bar{W}}\bigr\rangle \bigl\langle \psi ^{\bar{B}}_{\bar{X}}\bigr\rangle -\bigl\langle \psi^{\bar{C}}_{\bar {Y}}\bigr\rangle \bigl\langle \psi^{\bar{D}}_{\bar{Z}} \bigr\rangle \end{aligned}$$

with \(\psi^{\bar{A}}_{\bar{W}},\psi^{\bar{B}}_{\bar{X}}\) and \(\psi^{\bar{C}}_{\bar{Y}},\psi^{\bar{D}}_{\bar{Z}}\) being compatible pairs of order R+1, and c 0, c p being constants and m being an integer denoting the number of terms in the summation.

Remark

This is a general statement of the forms of (13)–(16) in the star graph example.

Proof

Take the derivative of α 0:

$$\begin{aligned} \dot{\alpha_0}=\bigl\langle \dot{\psi}^A_W \bigr\rangle \bigl\langle \psi _X^B\bigr\rangle + \bigl\langle \psi_W^A\bigr\rangle \bigl\langle \dot{ \psi}^B_X\bigr\rangle -\bigl\langle \dot{ \psi}_Y^C\bigr\rangle \bigl\langle \psi_Z^D \bigr\rangle -\bigl\langle \psi _Y^C\bigr\rangle \bigl\langle \dot{\psi}^D_Z\bigr\rangle . \end{aligned}$$
(18)

We consider the terms associated with removal, transmission terms of order R and transmission terms of order R+1 separately. Firstly, from (10), this derivative contains the following terms associated with the removal process:

where the sums over k 1,k 2,k 3,k 4 are over all nodes in the motifs ψ W ,ψ X ,ψ Y ,ψ Z respectively and where

is easily seen to follow from CP(i) and CP(ii).

The right-hand side of (18) also contains the following transmission terms with motifs of order R:

$$\begin{aligned} &-\sum_{k_1}D_{k_1}^{AS} \sum_{l_1}T_{W_{k_1}W_{l_1}}D_{l_1}^{AI} \bigl\langle \psi_W^A\bigr\rangle \bigl\langle \psi_X^B\bigr\rangle -\sum_{k_2}D_{k_2}^{BS} \sum_{l_2}T_{X_{k_2}X_{l_2}}D_{l_2}^{BI} \bigl\langle \psi_W^A\bigr\rangle \bigl\langle \psi_X^B\bigr\rangle \\ &\quad\quad{}+\sum_{k_3}D_{k_3}^{CS} \sum_{l_3}T_{Y_{k_3}Y_{l_3}}D_{l_3}^{CI} \bigl\langle \psi_Y^C\bigr\rangle \bigl\langle \psi_Z^D\bigr\rangle +\sum_{k_4}D_{k_4}^{DS} \sum_{l_4}T_{Z_{k_4}Z_{l_4}}D_{l_4}^{DI} \bigl\langle \psi_Y^C\bigr\rangle \bigl\langle \psi_Z^D\bigr\rangle \\ &\quad=-wa_0, \end{aligned}$$

where

$$\begin{aligned} w =&\sum_{k_1}D_{k_1}^{AS} \sum_{l_1}T_{W_{k_1}W_{l_1}}D_{l_1}^{AI}+ \sum_{k_2}D_{k_2}^{BS}\sum _{l_2}T_{X_{k_2}X_{l_2}}D_{l_2}^{BI} \\ =&\sum_{k_3}D_{k_3}^{CS} \sum_{l_3}T_{Y_{k_3}Y_{l_3}}D_{l_3}^{CI}+ \sum_{k_4}D_{k_4}^{DS}\sum _{l_4}T_{Z_{k_4}Z_{l_4}}D_{l_4}^{DI} \end{aligned}$$

and where the sums over k 1,k 2,k 3,k 4,l 1,l 2,l 3,l 4 are over all nodes in each of the relevant motifs. This follows from CP(iii) and CP(iv). Hence the removal terms and transmission terms of order R contribute c 0 a 0 to the derivative of a 0 where c 0=−vw.

For transmission terms with motifs of order R+1, consider the term \(\langle\dot{\psi}^{A}_{W}\rangle\langle\psi_{X}^{B}\rangle\) in (18). This gives rise to the following terms in the derivative of α 0:

$$\begin{aligned} \sum_{k_1}(-1)^{D_{k_1}^{AS}}\sum _{n=1,n\notin W}^PT_{W_{k_1}n}\bigl\langle g_{W_{k_1}}^n\bigl(\psi_W^A\bigr) \bigr\rangle \bigl\langle \psi _X^B\bigr\rangle . \end{aligned}$$

To prove the lemma, it is sufficient to show that each term in this sum can be paired uniquely with a term in \(\langle\dot{\psi}_{Y}^{C}\rangle \langle\psi_{Z}^{D}\rangle\) or in \(\langle\psi_{Y}^{C}\rangle\langle\dot {\psi}^{D}_{Z}\rangle\), such that the difference of these terms forms α p . By symmetry this is a one-to-one pairing establishing that each term of order R+1 on the right-hand side of (18) is accounted for exactly once in the sum of α p .

Let us take an element from the sum by choosing a node W w , w∈{1,2,…,r}, and an outside neighbour node \(n\in N_{W_{w}}\). This neighbour can either be in ψ X or outside.

Case 1: A w =S.

Consider first the case where A w is a susceptible node, resulting in the following term in the sum:

$$\begin{aligned} -T_{{W_w}n}\bigl\langle g_{W_w}^n\bigl( \psi^A_W\bigr)\bigr\rangle \bigl\langle \psi_X^B\bigr\rangle , \end{aligned}$$

where we can identify \(c_{p}=-T_{{W_{w}}n}\).

We have A w =S, \(n\in N_{W_{w}}\) and nW. Let us now identify a term in \(\langle\dot{\psi}_{Y}^{C}\rangle\langle\psi_{Z}^{D}\rangle +\langle\psi_{Y}^{C}\rangle\langle\dot{\psi}^{D}_{Z}\rangle\) to form compatible pairs. According to CP(i) and CP(ii) we can assume without loss of generality that W w Y, i.e. ∃y:Y y =W w and C y =S (see Fig. 6). There are two subcases:

Fig. 6
figure 6

Each circle refers to one of the motif states \(\psi_{W}^{A}\), \(\psi_{X}^{B}\), \(\psi_{Y}^{C}\), \(\psi_{Z}^{D}\) specified to the top left. The position of the relevant node states with respect to the motif states are then illustrated. (a) Subcase 1.1 (nY). (b) Subcase 1.2 (nY)

Subcase 1.1: nY.

When nY, either nZ or nZ and these are shown by solid and dashed lines respectively in Fig. 6(a). By CP(i), nZnX since nW so the solid lines match on the left and right pairs as do the dashed lines. The corresponding term must therefore be \(-T_{Y_{y}n}\langle g_{Y_{y}}^{n}(\psi_{Y}^{C})\rangle\langle\psi_{Z}^{D}\rangle=c_{p}\langle g_{Y_{y}}^{n}(\psi_{Y}^{C})\rangle\langle\psi_{Z}^{D}\rangle\) irrespective of whether nZ or not. Hence:

$$\begin{aligned} \alpha_p=\bigl\langle g_{W_w}^n\bigl( \psi_W^A\bigr)\bigr\rangle \bigl\langle \psi^B_X\bigr\rangle -\bigl\langle g_{Y_y}^n \bigl(\psi^C_Y\bigr)\bigr\rangle \bigl\langle \psi^D_Z\bigr\rangle , \end{aligned}$$

where \((g_{W_{w}}^{n}(\psi^{A}_{W}),\psi^{B}_{X})\) and \((g_{Y_{y}}^{n}(\psi ^{C}_{Y}),\psi^{D}_{Z})\) are easily seen to satisfy the definition of compatible pairs since the extra node is n which is I in both pairs.

Subcase 1.2: nY.

If nY, then the edge nY y is an SS or IS edge in C. By CP(iii), CP(iv) and CP(v), it is also the same edge in B because nW. Hence ∃x:X x =W w and B x =S. By CP(ii), W w Z is also true where ∃z:Z z =W w andD z =S. This is illustrated in Fig. 6(b). We therefore have the corresponding term \(-T_{Z_{z}n}\langle\psi ^{C}_{Y}\rangle\langle g_{Z_{z}}^{n}(\psi^{D}_{Z})\rangle=c_{p}\langle\psi ^{C}_{Y}\rangle\langle g_{Z_{z}}^{n}(\psi^{D}_{Z})\rangle\) and:

$$\begin{aligned} \alpha_p=\bigl\langle g_{W_w}^n\bigl( \psi^A_W\bigr)\bigr\rangle \bigl\langle \psi^B_X\bigr\rangle -\bigl\langle \psi^C_Y \bigr\rangle \bigl\langle g_{Z_z}^n\bigl( \psi^D_Z\bigr)\bigr\rangle , \end{aligned}$$

where again, the relevant pairs are seen to satisfy compatibility.

Case 2: A w =I.

So far we have proved the existence of α p when A w =S, \(n\in N_{W_{w}}\), nW. Now we have to show α p can be defined when A w =I, \(n\in N_{W_{w}}\), nW. In this case, the motif generating rule firstly changes A w =I to A w =S and then applies the same generating rule as if A w =S initially. From Proposition 3.2, applying the transformation to compatible pairs of order R produces compatible pairs of order R in the case of the tree graph. After this transformation, the argument runs identically to case 1. This completes the proof of Lemma 3.1. □

Lemma 3.2

Assume that the initial condition is not mixed, i.e. ∃A∈{I,S}P such that \(\langle\psi_{1,2,\ldots,P}^{A}\rangle=1\). If \((\psi _{W}^{A},\psi_{X}^{B})\) and \((\psi_{Y}^{C},\psi_{Z}^{D})\) are compatible pairs, then for any graph, \(\langle\psi_{W}^{A}\rangle\langle\psi_{X}^{B}\rangle -\langle\psi_{Y}^{C}\rangle\langle\psi_{Z}^{D}\rangle=0\) at t=0.

Proof

Assume that \(\langle\psi_{W}^{A}\rangle\langle\psi_{X}^{B}\rangle=1\). Then by CP(i) and CP(ii), for all \(\psi_{i}^{a}\subset\psi_{W}^{A}\), we must have \(\psi_{i}^{a}\subset\psi_{Y}^{C}\) and/or \(\psi_{i}^{a}\subset\psi_{Z}^{D}\). This is also true for all \(\psi_{i}^{a}\subset\psi_{X}^{B}\) and, by the symmetry between the compatible pairs, it follows that \(\langle\psi _{Y}^{C}\rangle\langle\psi_{Z}^{D}\rangle=1\). Similarly, it follows that \(\langle\psi_{W}^{A}\rangle\langle\psi_{X}^{B}\rangle=0\) implies that \(\langle\psi_{Y}^{C}\rangle\langle\psi_{Z}^{D}\rangle=0\). □

Theorem 3.1

Let us assume the following:

  • The graph is a tree.

  • The initial condition is not mixed.

  • \((\psi_{W}^{A},\psi_{X}^{B})\) and \((\psi_{Y}^{C},\psi_{Z}^{D})\) are compatible pairs.

Then \(\langle\psi_{W}^{A}\rangle\langle\psi_{X}^{B}\rangle-\langle\psi _{Y}^{C}\rangle\langle\psi_{Z}^{D}\rangle=0\) for all time t≥0.

Proof

We prove the theorem by induction according to the order of the closure. This is analogous to the proof for the star graph in Sect. 3.3.

Step 1.

If the closure is of order 2P, then it is exact. More precisely, if \(\psi_{W}^{A}\), \(\psi_{X}^{B}\), \(\psi_{Y}^{C}\) and \(\psi_{Z}^{D}\) are P-states, then (17) does not contain the summation terms and becomes:

$$\begin{aligned} \dot{\alpha_0}=c_0\alpha_0. \end{aligned}$$

Since we start from an initial condition that is not mixed, we have (by Lemma 3.2) α 0(0)=0⇒α 0(t)=0 ∀t≥0.

Step 2.

Assume that the theorem is proved for compatible pairs of order R+1. We prove that it is true for compatible pairs of order R. Applying Lemma 3.1, we have:

$$\begin{aligned} \dot{\alpha_0}=\sum_{p=1}^mc_p \alpha_p+c_0\alpha_0. \end{aligned}$$

According to the induction condition, α p =0 ∀p because these are compatible pairs of order R+1. Therefore \(\dot{\alpha _{0}}=c_{0}\alpha_{0}\). From Lemma 3.2, α 0(0)=0 so α 0(t)=0∀t≥0. Since we have proved the result for compatible pairs of order 2P then we have completed the proof of the theorem. □

The lowest-order compatible pairs are of order four. The closure relation corresponding to these pairs is formulated in the following important corollary.

Corollary 3.1

Under the assumptions on the graph and the initial conditions in Theorem 3.1, we have the special cases:

$$\begin{aligned} \bigl\langle \psi_j^S\bigr\rangle \bigl\langle \psi_{i,j, k}^{SSI}\bigr\rangle =\bigl\langle \psi_{i,j}^{SS} \bigr\rangle \bigl\langle _{j, k}^{SI}\bigr\rangle \end{aligned}$$

for all i∈{1,2,…,P} and for all jN i , kN j : ik;

$$\begin{aligned} \bigl\langle \psi_i^S\bigr\rangle \bigl\langle \psi_{k , i, j}^{ISI}\bigr\rangle =\bigl\langle \psi_{k , i}^{IS} \bigr\rangle \bigl\langle \psi_{i,j}^{SI}\bigr\rangle \end{aligned}$$

for all i∈{1,2,…,P} and for all j,kN i : jk.

This corollary is Theorem 1.1 expressed in a different notation.

Remark

From CP(i) and CP(ii), it is clear that Lemma 3.2 can be extended to the mixed initial condition where the probabilities of the initial states of each individual in the system are statistically independent, leading to \(\langle\psi_{W}^{A}\rangle\langle\psi _{X}^{B}\rangle-\langle\psi_{Y}^{C}\rangle\langle\psi_{Z}^{D}\rangle=0\) at t=0. However, for general mixed initial conditions where correlations between individuals can occur, Lemma 3.2 does not hold and the pair-based model is not exact.

4 Application to Some Graphs which Are not Trees

To complete this work, we make a final observation which shows that the pair-based model can sometimes provide an exact representation of infectious dynamics on graphs which are not strictly trees. We first make two definitions which can be understood with reference to the examples in Fig. 7.

Fig. 7
figure 7

The graphs on the left are the initial transmission networks where the initially infected nodes are indicated by the symbol I. The graphs on the right are the reduced representation graphs where the cuts for independent segments which occur for cases (b) and (d) are indicated with dashed lines. The tree structure of the graphs on the right shows that applying the pair-based model to these graphs generates an exact representation of the infection dynamics on the original system

Definition 4.1

A reduced representation is a graph which is constructed from the initial transmission network and the given initial conditions by removing transmission routes which cannot carry infection dynamics.

Definition 4.2

An independent segment is a region of a graph that is only connected to other regions via nodes in the segment which are initially infectious.

Theorem 4.1

Given SIR dynamics on a transmission network with infection and removal governed by Poisson processes and given an unmixed initial state of the system, if every independent segment of the reduced representation is a tree, then applying (3) to this representation exactly generates the expected infection dynamics on the original transmission network.

Proof

By definition, the infection dynamics of the system remain unchanged after the removal of edges which cannot support infection dynamics. Additionally, the infection dynamics of any independent segment are independent of the dynamics on the rest of the graph because there is no process that allows influence across the initially infectious nodes. If the resulting representation graph is a set of trees, then since (3) is an exact representation of the dynamics on each independent segment, solving (3) on the reduced representation graph is equivalent to the infection dynamics on the original transmission network. □

Figure 7 shows some graphs and the associated representation graphs where the dashed lines indicate the boundaries that separate independent segments. For each of these examples, the solution of (3) on the representation graph exactly reproduces the expected infection dynamics of the original system.

This suggests that the accuracy of the pair-based model could be increased by first generating the representation graph for the particular network and initial conditions prior to numerically solving the pair-based model.

5 Discussion

We considered the pair-based variant of the subsystem approach to constructing epidemic models on networks (Sharkey 2008, 2011). We proved that for SIR dynamics on fixed tree graphs with exponentially distributed transmission and removal processes, the pair-based model provides an exact determination of the infection probability time course for each individual in the network. We also showed that the dynamics of some networks with cycles can also be represented exactly by the pair-based model under specific initial conditions.

This represents the first provably exact deterministic model of epidemic dynamics on finite heterogeneous systems which has been numerically evaluated. Here we use the qualifying term “heterogeneous” to exclude systems with significant symmetry which may be employed to obtain exact representations in very specialised circumstances (Keeling and Ross 2008; Simon et al. 2011). In principle, the message-passing approach of Karrer and Newman (2010) will also yield an exact description of finite heterogeneous systems in a way that is numerically feasible, but to our knowledge this has not yet been implemented in this context. Interestingly, the message-passing method also applies more generally beyond the usual assumptions of Markovian dynamics to arbitrary distributions for transmission and removal processes, although there may be implementation issues for more general distributions.

We note that effective degree models can generate very good agreement with stochastic simulation (Ball and Neal 2008; Lindquist et al. 2011) as do the PGF or edge-based compartmental modelling methods (Miller et al. 2012; Miller and Volz 2012; Volz 2008), although exact correspondence has not been proven here. For some idealised networks, including fully connected networks and some configuration networks (Volz 2008), convergence to the expected value can be shown in the infinite population limit (Ball and Neal 2008; Decreusefond et al. 2012; Karrer and Newman 2010). However, these models have a large measure of homogeneity, and convergence only occurs for infinite populations.

It is intuitively understood that clustering is at the root of problems with models based around closures at the level of pairs (Keeling and Eames 2005). Previous analysis (Sharkey 2011) attributed the failure to anomalous terms which emerge in subsystem equations when differentiating closure approximations based around the statistical independence of individuals. Here, repeating similar analysis for a closure at the order of pairs in the context of tree graphs, these anomalies do not arise and we are able to prove that the closure is exact via induction.

In principle, models based around subsystems at the order of three nodes or higher could be constructed. The next higher-order model would require obtaining a closure which is able to preserve correlations between triples, and similarly for higher orders. This leads to an interesting theoretical question for future analysis: does the hierarchy of exact order-by-order models suggested in Sharkey (2011) exist, and if so, what form should the closure approximations take at each level? We conjecture that exact closures of a similar nature to those considered here are possible for networks with more structure, given that the order at which the closure is performed is guided by the network structure; future work will focus on this question.