1 Introduction

The role that concurrent partnerships might play in the spread of HIV in sub-Saharan Africa is the subject of an ongoing debate. While simulation studies have shown the large impact that concurrency potentially has on the epidemic growth rate and the endemic prevalence of HIV (Kretzschmar and Morris 1996; Morris and Kretzschmar 1997, 2000; Eaton et al. 2011; Goodreau 2011), the empirical evidence for such a relationship is inconclusive (Lurie and Rosenthal 2010; Reniers and Watkins 2010; Tanser et al. 2011; Kenyon and Colebunders 2012).

Mathematical modelling results have played a key role in fuelling the debate (Watts and May 1992; Kretzschmar and Morris 1996; Morris and Kretzschmar 1997, 2000; Eaton et al. 2011; Goodreau 2011). However, a mathematical framework suitable to derive analytical results is still lacking. At present, simulation studies prevail, and general theory is mainly focused on static networks (Diekmann et al. 1998; Ball and Neal 2008; House and Keeling 2011; Lindquist et al. 2011; Miller et al. 2012; Miller and Volz 2013). This motivated us to develop and analyse a mathematical model for the spread of an \({ SI}\) (Susceptible–Infectious) infection along a dynamic network.

In a previous paper (Leung et al. 2012) a model for a dynamic sexual network of a homosexual population is presented that incorporates demographic turnover and allows for individuals to have multiple partners at the same time, with the number of partners varying over time. This network model can be seen as a generalization of the pair formation models (that describe sequentially monogamous populations) to situations where individuals are allowed more than one partner at a time. Pair formation models were first introduced into epidemiology by Dietz and Hadeler (1988) and extended in various ways (Kretzschmar et al. 1994; Inaba 1997; Kretzschmar and Dietz 1998; Xiridou et al. 2003; Heijne et al. 2011; Powers et al. 2011). In the present generalization, individuals have at most \(n\) partners at a time. We call \(n\) the partnership capacity. In the partnership network individuals are, essentially, collections of \(n\) ‘binding sites’ where binding sites can be either ‘free’ or ‘occupied’ (by a partner). In the case that \(n=1\) we recover the pair formation model of a monogamous population.

Consider an individual in the sexual network. Since individuals may have several partners simultaneously, the risk of acquiring infection depends on that individual’s partners, but also on their partners, and so on. We would need to keep track of the entire network to fully characterize the risk of infection to an individual. Here we introduce an approximation rather than taking full network information into account: we assume that properties concerning partners of partners can be obtained by averaging over the population. This approximation is termed the ‘mean field at distance one’ assumption (‘mean field at distance one’ should be read as one term; from here on we write this without quotation marks). This assumption relates to what is called ‘effective degree’ in Lindquist et al. (2011), where transmission of infection along a static network is studied (we are, apart from Britton and Lindholm 2010; Britton et al. 2011), not aware of any analytical work so far, on disease transmission across dynamic networks with demography (see e.g. Altmann 1995, 1998; Ferguson and Garnett 2000; Bansal et al. 2010; Kiss et al. 2012; Miller and Volz 2013) and references therein for models incorporating dynamic partnerships in a demographically closed population).

The mean field at distance one assumption is a moment closure approximation obtained by ignoring certain correlations between the states of two individuals that are in a partnership and, as a consequence, this assumption is inconsistent with the assumptions that underlie the partnership network (see e.g. Ferguson and Garnett 2000; Kamp 2010; House and Keeling 2011; Taylor et al. 2012) and references therein for different moment closure approximations on networks). However, this assumption allows us to write down a closed system of ODEs to describe an approximation of the \({ SI}\) infection on the partnership network. If a partnership capacity \(n\) is given, then we have an \((n+1)(n+2)\) dimensional system of ODEs.

A large part of the paper is devoted to characterizing the basic reproduction number \(R_0\) and proving its threshold character for the nonlinear system of ODEs. This system is quite large already for small \(n\). However, by considering only states-at-infection and using the next-generation matrix approach, \(R_0\) can be characterized as the dominant eigenvalue of an \(n\times n\) matrix. Using the interpretation we can further reduce this and \(R_0\) can ultimately be characterized as the dominant eigenvalue of a \(2\times 2\) matrix where the entries of this matrix are explicit, and therefore also \(R_0\) has an explicit expression. In fact, we are able to interpret \(R_0\) in terms of individuals (which are considered in the model specification) and in terms of binding sites.

The structure of the paper is as follows. First, in Sect. 2, we consider the partnership network of Leung et al. (2012) and summarize the main results needed for this paper. Next, in Sect. 3 we superimpose an \({ SI}\)-infection on the network and specify the model assumptions. Particular attention is given to the mean field at distance one assumption. The rest of the paper is devoted to characterizing the basic reproduction number \(R_0\). For this, in Sect. 4, we first consider the linearisation of the system.

In Sect. 5, which constitutes the core of the paper, we characterize \(R_0\) in terms of newly infected binding sites that produce newly infected binding sites. We introduce a transition matrix \(\varSigma \) and a transmission matrix \(T\) and define \(R_0\) as the dominant eigenvalue of the next generation matrix \(-T\varSigma ^{-1}\) (Diekmann et al. 2013, Section 7.2). The building blocks for an explicit expression for \(R_0\) are presented in Appendix C. We also show that \(R_0\) thus defined can be interpreted as the basic reproduction ratio for individuals, since individuals can be considered to be collections of \(n\) binding sites. Section 5 can be read independently of the rest of the paper.

The characterization of \(R_0\) in Sect. 5 does not, by itself, provide a mathematical proof that the disease-free steady state is stable for \(R_0<1\) and unstable for \(R_0>1\). We provide such a proof in Sect. 6. The proof is based on the Perron–Frobenius theory of spectral properties of positive and positive-off-diagonal irreducible matrices. In particular we use that

  • \({{\mathrm{sign}}}(R_0- 1) = {{\mathrm{sign}}}(r)\) where \(r\) is the Malthusian parameter (i.e. the dominant eigenvalue) of the matrix \(T+ \varSigma \)

  • the linearised system derived in Sect. 4 can be mapped in a natural way to the binding-site system defined by the matrices \(\varSigma \) and \(T\), while preserving positivity.

The final Sect. 7 provides conclusions and plans for future work. Some more technical calculations are left for the six appendices. In particular, in Appendix B we show with explicit calculations for the case \(n=2\) (suggested to us by Pieter Trapman (personal communication, 26 August, 2013)) that states of partners are not independent of one another, implying that the mean field at distance one assumption yields only an approximate and not an exact description.

2 The partnership network

In this section we will give a summary of the specification of the partnership network and of the main results presented in Leung et al. (2012).

Consider a population of homosexual individuals—all with partnership capacity \(n\). The partnership capacity is the maximum number of simultaneous partners an individual may have. One may think of an individual as having \(n\) binding sites. Binding sites are either ‘occupied’ (by a partner) or ‘free’. We assume that binding sites of an individual behave independently from one another as far as forming and dissolving partnerships are concerned. Furthermore, individuals enter (‘birth’) and leave (‘death’) the sexually active population.

The model specification begins at the individual level. The state of an individual is given by \(k\), the number of occupied binding sites, \(k=0,\ldots ,n\). Consider one individual born at time \(t_b\) and suppose it does not die in the time interval under consideration. An occupied binding site becomes free at rate \(\sigma +\mu \), where \(\sigma \) corresponds to ‘separation’ and \(\mu \) to ‘death of partner’. A free binding site becomes occupied at rate \(\rho F\), where \(F\) denotes the fraction of free binding sites in the pool of all binding sites in the population. The possible state transitions and the rates at which they occur are:

$$\begin{aligned} k\rightarrow k+1&\qquad \text { with rate } \rho (n-k)F,\\ k\rightarrow k-1&\qquad \text { with rate } (\sigma +\mu )k. \end{aligned}$$

The probability that an individual is in state \(k\) at age \(a\) is denoted by \(p_k(t_b,a)\), where \(t_b\) denotes the time of birth. A newborn individual has \(n\) free binding sites, i.e.

$$\begin{aligned} p(t_b,0)=\begin{pmatrix}1\\ 0\\ \vdots \\ 0\end{pmatrix}\!. \end{aligned}$$

Let \(A=A(F)\) denote the matrix corresponding to the state transitions described above. So, as an example, for \(n=2\), the matrix \(A\) is as follows:

$$\begin{aligned} A=\begin{pmatrix}-2\rho F&{}\quad \sigma +\mu &{}\quad 0\\ 2\rho F&{}\quad -(\rho F+\sigma +\mu )&{}\quad 2(\sigma +\mu )\\ 0&{}\quad \rho F&{}\quad -2(\sigma +\mu )\end{pmatrix}\!. \end{aligned}$$

Note that, throughout this paper, we will use the convention that, for a transition matrix \(M=(m_{ij})\), \(m_{ij}\) denotes the probability per unit of time at which a transition from \(j\) to \(i\) is made (instead of the transition from \(i\) to \(j\), as it is common in the stochastic community.)

Then, as long as the individual does not die, we have

$$\begin{aligned} \frac{\partial p}{\partial a}(t_b,a)&=A(F(t_b+a))p(t_b,a). \end{aligned}$$

We assume a stationary age distribution which is exponential with parameter \(\mu \), so it has probability density function

$$\begin{aligned} a\mapsto \mu e^{-\mu a}. \end{aligned}$$
(1)

Then, in a deterministic description of a large population, the fraction of the population in state \(k\) at time \(t\) is

$$\begin{aligned} P_k(t)&=\int _0^\infty \mu e^{-\mu a}p_k(t-a,a)da=\int _{-\infty }^t\mu e^{-\mu (t-\alpha )}p_k(\alpha ,t-\alpha )d\alpha . \end{aligned}$$
(2)

The fraction of free binding sites \(F\) is defined as

$$\begin{aligned} F(t)=\frac{1}{n}\sum _{k=0}^n(n-k)P_k(t). \end{aligned}$$
(3)

Due to the assumption of independence of binding sites with respect to partnership dynamics, the dynamics of \(F\) decouple as stated in Lemma 1 below (the proof is presented in Leung et al. 2012).

Lemma 1

The fraction of free binding sites \(F\) satisfies the differential equation

$$\begin{aligned} \frac{dF}{dt}=\mu +(\sigma +\mu )(1-F)-\rho F^2-\mu F. \end{aligned}$$
(4)

Consequently,

$$\begin{aligned} F(t)\rightarrow \bar{F}, \end{aligned}$$

for \(t\rightarrow \infty \), where

$$\begin{aligned} \bar{F}=\frac{\sqrt{(\sigma +2\mu )(4\rho +\sigma +2\mu )}-(\sigma +2\mu )}{2\rho }. \end{aligned}$$
(5)

This convergence to \(\bar{F}\) motivates us to take \(F\) constant and equal to \(\bar{F}\) (note, incidentally, that \(\bar{F}\) does not depend on the partnership capacity \(n\)). As a consequence the argument \(t_b\) in \(p_k(t_b,a)\) no longer matters and \(P_k(t)=P_k\) is independent of time. In fact, one finds that

$$\begin{aligned} P_k=\left( \begin{array}{c}n\\ k\end{array}\right) \int _0^\infty \mu e^{-\mu a}\epsilon (a)^k(1-\epsilon (a))^{n-k}da, \end{aligned}$$

where

$$\begin{aligned} \epsilon (a)=\frac{\rho \bar{F}}{\rho \bar{F}+\sigma +\mu }(1-e^{-(\rho \bar{F}+\sigma +\mu )a}) \end{aligned}$$

is the probability that a binding site is occupied at age \(a\), given that the ‘owner’ of the binding site is alive. . We can get rid of the integral by using the binomium of Newton to expand \(\epsilon (a)^k(1-\epsilon (a))^{n-k}\) and compute the integral of an exponential function:

$$\begin{aligned} P_k=&\left( {\begin{array}{c}n\\ k\end{array}}\right) \mu \left( \frac{\rho \bar{F}}{\rho \bar{F}+\sigma +\mu }\right) ^{n}\nonumber \\&\sum _{j=0}^{n-k}\sum _{i=0}^k\left( {\begin{array}{c}n-k\\ j\end{array}}\right) \left( {\begin{array}{c}k\\ i\end{array}}\right) (-1)^i\left( \frac{\sigma +\mu }{\rho \bar{F}}\right) ^j\frac{1}{\mu +(\rho \bar{F}+\sigma +\mu )(n-k-j+i)}. \end{aligned}$$
(6)

So we have explicit expressions for the degree distribution \(P=(P_k)_{k=0}^n\).

There are two probability distributions that play a more important role in the characterization of \(R_0\). First, consider an individual that acquires a new partner. We assume, in accordance with (3), that this newly acquired partner will have state \(k\) with probability

$$\begin{aligned} q_{k}=\frac{(n-k+1)P_{k-1}}{\sum _{l=0}^n(n-l)P_l}=\frac{(n-k+1)P_{k-1}}{n\bar{F}}. \end{aligned}$$
(7)

(A potential partner with state \(k-1\) has \((n-k+1)\) free binding sites. Immediately after a match is made it will have state \(k\). The denominator serves to renormalise into a probability distribution.) This assumption gives us information on the state of an individual in a randomly chosen partnership, as expressed in the next lemma.

Lemma 2

Choose an individual by first sampling a partnership from the pool of all partnerships and next choosing one of the two partners at random. The probability that this individual has \(k\) partners equals

$$\begin{aligned} Q_k=\frac{kP_k}{\sum _{l=1}^nlP_l}=\frac{kP_k}{n(1-\bar{F})}. \end{aligned}$$
(8)

Note that Lemma 2 does not imply that the states of the two individuals in this partnership are independent of one another. Indeed, they are not. Information about the number of partners of one of the individuals provides some information about the duration of the partnership and thus influences the probability that the other individual has \(k\) partners (or, in other words, there exists degree correlation in this network); see Appendix B for explicit calculations for \(n=2\). (We have, so far, not calculated degree correlations for general \(n\).)

Note that the model specification is deterministic in the sense that it concerns expected values for a population of infinite size. Partnership formation is at random between two free binding sites. As a consequence of mass action and infinite population size, partnership formation with oneself or multiple partnerships with the same individual occur with probability zero. For the same reason clustering does not occur in the network. It should be possible to formulate a stochastic version for a population of size \(N\) and derive the present description by considering the limit \(N\rightarrow \infty \). We conjecture that all the previous statements hold in the limit. In particular clustering disappears in the limit, i.e. the probability that a path of a fixed finite length contains a loop goes to zero in the limit.

Finally, to summarize, we have three degree distributions, i.e. probability distributions for the number of partners of an individual, that we will use throughout this paper:

  • \(P=(P_k)\) for a random individual,

  • \(q=(q_k)\) for an individual who just acquired a partner (but is otherwise randomly chosen),

  • \(Q=(Q_k)\) for an individual in a randomly chosen partnership.

3 Superimposing transmission of an infectious disease

We consider an \({ SI}\) infection spreading on the dynamic sexual network described in Sect. 2. We assume that individuals become infectious at the very instant that they become infected and stay infectious (with the same infectiousness) for the rest of their life.

3.1 i-states and i-dynamics

The model specification begins at the i-level (i for individual). We classify individuals as either susceptible (indicated by the symbol \(-\)) or infectious (indicated by \(+\)). We assume that the \(\pm \) classification has no influence whatsoever on partnership formation and separation nor on the probability per unit of time of dying.

The state of an individual is now a triple \((x, k_-,k_+)\), where \(x\) is either \(+\) or \(-\) and \(k_-\) and \(k_+\) are nonnegative integers with \(0\le k_- +k_+\le n\). The \(x\) specifies whether the individual itself is susceptible or infectious, \(k_{-}\) specifies the number of its susceptible partners, and \(k_+\) specifies the number of its infectious partners.

3.1.1 Demographic change of i-states

Consider an individual and suppose it does not die in the period under consideration. There are two types of state transitions: those that contribute to demography and those that involve transmission of infection.

We let \(F_-\) denote the fraction of the total pool of binding sites that is free and belongs to a susceptible individual and let \(F_+\) denote the fraction that is free and belongs to an infectious individual so \(F_-+F_+=\bar{F}\). We shall say that a binding site is susceptible or infectious if the ‘owner’ is so.

The possible state transitions and corresponding rates that involve partnership formation, separation, and death of a partner are as follows:

3.1.2 Transmission (mean field at distance one)

Next, consider the transmission events. A susceptible having a binding site that is occupied by an infectious partner, gets infected by this partner at rate \(\beta \). There is more than one way in which transmission events show up as i-level state transitions. First of all, we have the possibility that a susceptible individual \(u\) gets infected by one of its infectious partners. This occurs at rate \(\beta \) times the number of infectious partners \(u\) has, i.e.,

$$\begin{aligned} (-,k_-,k_+)\rightarrow (+,k_-,k_+) \quad \qquad \qquad \text { with rate }\beta k_+. \end{aligned}$$

Here we have assumed that the frequency of sex acts within one partnership does not depend on concurrent other partnerships.

It is also possible that a partner \(v\) of \(u\) (with \(u\) either susceptible or infectious) becomes infected by one of \(v\)’s infectious partners (which includes \(u\) if \(u\) is infectious). Of course the probability that this happens depends on the actual configuration in terms of number of partners of \(v\) and their infection status. That information is, however, not incorporated in our description.

Therefore, we assume that we can average over all possibilities (we call this ‘mean field at distance one’). This assumption is an approximation that we make in order to close the infectious disease model within our limited bookkeeping framework; we will come back to this in more detail in Sect. 3.2.2. More concretely we assume that rates \(\varLambda _{\pm }(t)\) exist such that

$$\begin{aligned} (-,k_-,k_+)&\rightarrow (-,k_{-}-1,k_++1)\qquad \text { with rate }\beta \varLambda _-(t)k_-,\nonumber \\ (+,k_-,k_+)&\rightarrow (+,k_{-}-1,k_++1)\qquad \text { with rate }\beta \varLambda _+(t)k_-, \end{aligned}$$
(9)

and that we can specify \(\varLambda _{\pm }(t)\) as appropriate population averages. But before we can provide this specification in Sect. 3.2.2, we have to define the relevant population-level quantities. For this we need to first consider the i-level dynamics.

3.1.3 i-level dynamics

We have now described all i-states and the possible changes in i-states. The i-level dynamics are as follows. Newborn individuals are in state \((-,0,0)\) (we call this the i-state-at-birth), i.e. at birth an individual is susceptible and has no partner at all. Let \(p_\ell (t_b,a)\) denote the probability that an individual born at time \(t_b\) is in state \(\ell \) at age \(a\) given that the individual does not die in the period under consideration, where \(\ell \) is any allowed triple \((\pm ,k_-,k_+)\). By choosing a way to order the \(\ell \)’s, we can think of \(p\) as a vector. This ordering then also allows us to construct a matrix

$$\begin{aligned} B=B(F_\pm , \varLambda _\pm ) \end{aligned}$$

on the basis of the transition rates that are described in Sects. 3.1.1 and 3.1.2.

Then the matrix \(B\) allows us to describe the dynamics of \(p\). As long as the individual does not die,

$$\begin{aligned} \frac{\partial p}{\partial a}(t_b,a)=B\left( F_\pm (t_b+a), \varLambda _\pm (t_b+a)\right) \ p(t_b,a), \end{aligned}$$
(10)

with

$$\begin{aligned} p(t_b,0)=\left( \begin{array}{c}1\\ 0\\ \vdots \\ 0\end{array}\right) \end{aligned}$$
(11)

if \((-,0,0)\) is chosen as the first triple in our list.

Finally, as an example, we write out the matrix \(B\) for \(n=2\). If we order the twelve states \((\pm ,k_-,k_+)\) as \((-,0,0)\), \((-,1,0)\), \((-,2,0)\), \((-,0,1)\), \((-,1,1)\), \((-,0,2)\), \((+,0,0)\), \((+,1,0)\), \((+,2,0)\), \((+,0,1)\), \((+,1,1)\), \((+,0,2)\), then \(B\) is of the form

$$\begin{aligned} B=\begin{pmatrix} B_1 &{}\quad 0\\ B_2 &{}\quad B_3 \end{pmatrix}, \end{aligned}$$

with the \(B_i\) being \(6\times 6\) matrices. \(B_1\) describes the transitions between \(-\) states:

$$\begin{aligned} B_1=\begin{pmatrix} (B_1)_{11} &{}\quad \sigma +\mu &{}\quad 0 &{}\quad \sigma +\mu &{}\quad 0 &{}\quad 0 \\ 2\rho F_- &{}\quad (B_1)_{22} &{}\quad 2(\sigma +\mu ) &{}\quad 0 &{}\quad \sigma +\mu &{}\quad 0 \\ 0 &{}\quad \rho F_- &{}\quad (B_1)_{33} &{}\quad 0 &{}\quad 0 &{}\quad 0\\ 2\rho F_+ &{}\quad \beta \varLambda _- &{}\quad 0 &{}\quad (B_1)_{44} &{} \quad \sigma +\mu &{}\quad 2(\sigma +\mu )\\ 0&{}\quad \rho F_+ &{}\quad 2\beta \varLambda _-&{}\quad \rho F_-&{}\quad (B_1)_{55}&{}\quad 0\\ 0&{}\quad 0&{}\quad 0&{}\quad \rho F_+&{}\quad \beta \varLambda _-&{}\quad (B_1)_{66} \end{pmatrix}\!, \end{aligned}$$

with \((B_1)_{jj}=-\sum _{i=1}^6((B_1)_{ij}+(B_2)_{ij})\), and where \(B_2\) describes the transitions from \(-\) to \(+\) states:

$$\begin{aligned} B_2=\begin{pmatrix} 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{} \quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{} \quad 0 &{} \quad 0 &{} \quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{} \quad 0 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{}\quad \beta &{}\quad 0 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{} \quad 0 &{}\quad \beta &{}\quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 2\beta \\ \end{pmatrix}\!, \end{aligned}$$

and \(B_3\) describes the transitions between \(+\) states:

$$\begin{aligned} B_3=\begin{pmatrix} (B_3)_{11} &{}\quad \sigma +\mu &{}\quad 0 &{} \quad \sigma +\mu &{}\quad 0 &{} \quad 0\\ 2\rho F_- &{}\quad (B_3)_{22} &{}\quad 2(\sigma +\mu ) &{}\quad 0 &{}\quad \sigma +\mu &{}\quad 0\\ 0 &{} \rho F_- &{}\quad (B_3)_{33} &{}\quad 0 &{}\quad 0 &{}\quad 0\\ 2\rho F_+ &{} \quad \beta \varLambda _+ &{}\quad 0 &{}\quad (B_3)_{44} &{}\quad \sigma +\mu &{}\quad 2(\sigma +\mu )\\ 0 &{}\quad \rho F_+ &{}\quad 2\beta \varLambda _+ &{}\quad \rho F_- &{}\quad (B_3)_{55} &{}\quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{}\quad \rho F_+ &{}\quad \beta \varLambda _+ &{}\quad (B_3)_{66}\\ \end{pmatrix}\!, \end{aligned}$$

with \((B_3)_{jj}=-\sum _{i=1}^6(B_3)_{ij}\). So in this way, one can construct the matrix \(B\) explicitly.

3.2 Bookkeeping on the p-level and feedback

We have now specified the i-level dynamics. In this section we consider the p-level (p for population) and the feedback to the i-level via the variables \(F_\pm \) and \(\varLambda _\pm \).

3.2.1 Bookkeeping

In a deterministic description of a large population

$$\begin{aligned} P_\ell (t)=\int _0^\infty \mu e^{-\mu a}p_\ell (t-a,a)da=\int _{-\infty }^t\mu e^{-\mu (t-\alpha )}p_\ell (\alpha ,t-\alpha )d\alpha , \end{aligned}$$
(12)

is the fraction of the population that is in state \(\ell \) at time \(t\). In Sect. 3.3 we rewrite these identities as differential equations.

3.2.2 Feedback

It remains to provide the feedback relations that express the individual level input variables \(F_\pm (t)\) and \(\varLambda _\pm (t)\) in terms of output variables at the population level. Directly from the interpretation it follows that we should take

$$\begin{aligned} F_\pm (t)=\frac{1}{n}\sum _{k_+=0}^{n}\sum _{k_-=0}^{n-k_+}(n-k_{-}-k_+)P_{(\pm ,k_-,k_+)}(t). \end{aligned}$$
(13)

The only unknown terms left are the mean field at distance one rates \(\varLambda _\pm (t)\). In the remainder of this section we define these rates and explain why our description is not exact.

Consider a transition of an individual \(u\), with \(u\) in state \((\pm ,k_-,k_+)\rightarrow (\pm ,k_{-}-1,k_++1)\). This transition occurs when a susceptible partner \(v\) of the focus individual \(u\) in state \((\pm ,k_-,k_+)\) gets infected. The rate at which \(v\) gets infected depends on the number of infectious partners \(v\) has. However, we only know that \(v\) is a susceptible partner of \(u\).

Note that we can not distinguish between two susceptible partners \(v_1\) and \(v_2\) of an individual \(u\) and that the states of \(v_1\) and \(v_2\) are correlated in the same way with the state of \(u\). In particular, the probability that \(v_1\) is in state \((-,m_-,m_+)\) is equal to the probability that \(v_2\) is in that state. Therefore, we are interested in probabilities \(\lambda (m_+|(\pm ,k_-,k_+))\), where \(\lambda (m_+|(\pm ,k_-,k_+))\) denotes the conditional probability that a susceptible partner of an individual in state \((\pm ,k_-,k_+)\) has itself \(m_+\) infectious partners. The force of infection on a susceptible individual with \(m_+\) partners is \(\beta m_+\). Therefore, by averaging over all possibilities, we obtain the following rates for the corresponding transitions:

$$\begin{aligned} (-,k_-,k_+)&\rightarrow (-,k_{-}-1,k_++1)\quad \text { with rate }k_-\sum _{m_+=0}^{n-1}\beta m_+\lambda (m_+|(-,k_-,k_+)),\\ (+,k_-,k_+)&\rightarrow (+,k_{-}-1,k_++1)\quad \text { with rate }k_-\sum _{m_+=1}^{n}\beta m_+\lambda (m_+|(+,k_-,k_+)). \end{aligned}$$

We now make the simplifying assumption that the probability that a susceptible partner of \(u\) has \(m_+\) infectious partners does not depend on the exact state of \(u\) but only on \(u\) being susceptible or infectious. More precisely, we assume that we can approximate \(\lambda (m_+|(\pm ,k_-,k_+))\) by

$$\begin{aligned} \lambda _\pm (m_+), \end{aligned}$$

where \(\lambda _-(m_+)\) is the conditional probability that \(v\) has \(m_+\) infectious partners, given that \(v\) is susceptible and \(v\) is a partner of susceptible individual \(u\) and \(\lambda _+(m_+)\) is that same conditional probability when \(v\) is a partner of infectious individual \(u\). In fact, as we explain in Appendix B, the probabilities \(\lambda _\pm (m_+)\) are really an approximation of \(\lambda (m_+|(\pm ,k_-,k_+))\) as these probabilities ignore correlations of \(u\) and \(v\), i.e. between the states of two individuals that are in a partnership. Note that for certain static networks one can actually justify the mean field at distance one assumption for \({ SI}\) and \({ SIR}\) infection (but presumably not for \({ SIS}\)), see (Decreusefond et al. 2012; Barbour and Reinert 2013).

Assuming a two-type version of (8) we define

$$\begin{aligned} \lambda _-(m_+)=\frac{\sum _{m_-=1}^{n-m_+}m_-P_{(-,m_-,m_+)}(t)}{\sum _{l_+=0}^{n-1}\sum _{l_-=1}^{n-l_+}l_-P_{(-,l_-,l_+)}(t)}, \end{aligned}$$
(14)

with the convention that \(\lambda _-(m_+)=0\) if the denominator equals zero, and

$$\begin{aligned} \lambda _+(m_+)=\frac{\sum _{m_-=0}^{n-m_+}m_+P_{(-,m_-,m_+)}(t)}{\sum _{l_+=1}^n\sum _{l_-=0}^{n-l_+}l_+P_{(-,l_-,l_+)}(t)}, \end{aligned}$$
(15)

with the convention that \(\lambda _+(1)=1\) and \(\lambda _+(m_+)=0\) for \(m_+>1\), if the denominator equals zero. The explanation of (14) and (15) is as follows. In both cases, we consider the probability that the state of \(v\) is \((-,m_-,m_+)\), given that \(v\) is susceptible and \(v\) has a partner \(u\). In the case of (14), \(u\) is susceptible, so, if we also take into account that \(u\) is one of the \(m_-\) susceptible partners of \(v\), the probability that \(v\) is in state \((-,m_-,m_+)\) is

$$\begin{aligned} \frac{m_-P_{(-,m_-,m_+)}(t)}{\sum _{l_+=0}^{n-1}\sum _{l_-=1}^{n-l_+}l_-P_{(-,l_-,l_+)}(t)} \end{aligned}$$

cf. Lemma 2. Similarly, in the case of (15), we ‘arrive’ at \(v\) via its link to the infectious \(u\), so then the probability that \(v\) is in state \((-,m_-,m_+)\) is

$$\begin{aligned} \frac{m_+P_{(-,m_-,m_+)}(t)}{\sum _{l_+=0}^{n-1}\sum _{l_-=1}^{n-l_+}l_+P_{(-,l_-,l_+)}(t)}. \end{aligned}$$

In both cases, the denominator serves to normalize.

The mean field at distance one terms \(\varLambda _\pm \) in (9) are now specified by

$$\begin{aligned} \varLambda _-(t)=\sum _{m_+=1}^{n-1}m_+\lambda _-(m_+) \end{aligned}$$
(16)

with \(\lambda _-(m_+)\) given by (14), and

$$\begin{aligned} \varLambda _+(t)=\sum _{m_+=1}^n m_+\lambda _+(m_+)=1+\sum _{m_+=2}^n(m_+-1)\lambda _+(m_+), \end{aligned}$$
(17)

with \(\lambda _+(m_+)\) given by (15). (For mean field at distance one terms also see Lindquist et al. 2011.)

Note that, from an individual-based perspective, (16) and (17) are the only formulas consistent with our assumption that \(u\)’s susceptible partners are subject to a force of infection \(\beta \varLambda _\pm \) depending only on \(t\) and \(u\)’s infection status \(\pm \) (and not on the number of susceptible and infectious partners of \(u\) cf. Appendix B). Hence our choice of the term ‘mean field at distance one’ for the latter assumption.

3.3 The p-level differential equations

In a deterministic description of a large population, \(P_{(\pm ,k_-,k_+)}\) denotes the fraction of the population in state \((\pm ,k_-,k_+)\) We take as the convention that the \(P_{(\pm ,k_-,k_+)}\) should be interpreted as zero when \(k_-+k_+>n\), \(k_-<0\), or \(k_+<0\). By differentiation of (12) and using (10)–(11) for \(p\), we obtain the following set of \((n+1)(n+2)\) differential equations:

$$\begin{aligned} \frac{dP_{(-,0,0)}}{dt}&=\mu -(\rho \bar{F} n+\mu )P_{(-,0,0)}+(\sigma +\mu )(P_{(-,1,0)}+P_{(-,0,1)})\\ \frac{dP_{(-,k_-,k_+)}}{dt}&=-\left( \rho \bar{F}(n-k_{-}-k_+)+(\sigma +\mu )(k_-+k_+)\right. \\&\quad \left. +\beta k_++\beta \varLambda _-k_-+\mu \right) P_{(-,k_-,k_+)}\\&+\rho F_-(n-k_{-}-k_++1)P_{(-,k_{-}-1,k_+)}\\&+\rho F_+(n-k_{-}-k_++1)P_{(-,k_-,k_+-1)}\\&+(\sigma +\mu )\left( (k_-+1)P_{(-,k_-+1,k_+)}+(k_++1)P_{(-,k_-,k_++1)}\right) \\&+\beta \varLambda _-(k_-+1)P_{(-,k_-+1,k_+-1)}\\ \frac{dP_{(+,k_-,k_+)}}{dt}&{=}{-}\left( \rho \bar{F} (n-k_{-}-k_+)+(\sigma +\mu )(k_-+k_+){+}\beta \varLambda _+k_-+\mu \right) P_{(+,k_-,k_+)}\\&+\rho F_-(n-k_{-}-k_++1)P_{(+,k_{-}-1,k_+)}\\&\quad +\rho F_+(n-k_{-}-k_++1)P_{(+,k_-,k_+-1)}\\&+(\sigma +\mu )\left( (k_-+1)P_{(+,k_-+1,k_+)}+(k_++1)P_{(+,k_-,k_++1)}\right) \\&+\beta \varLambda _+(k_-+1)P_{(+,k_-+1,k_+-1)}+\beta k_+P_{(-,k_-,k_+)}. \end{aligned}$$

Choose the same ordering of the \(\ell \)’s as before with the i-states in Sect. 3.2 and let \(P\) denote the corresponding vector of the variables \(P_\ell \). In matrix notation, we have

$$\begin{aligned} \frac{dP}{dt}=\mu \mathbf {1}_{{(-,0,0)}}+B\left( F_\pm ,\varLambda _\pm \right) P-\mu P, \end{aligned}$$
(18)

where \(\mathbf {1}_{{(-,0,0)}}\) is the indicator function of \({(-,0,0)}\), and \(B\) is the matrix corresponding to the rates of the state transitions described in Sects. 3.1.1 and 3.1.2.

3.3.1 Consistency relations

The \(P_\ell \) are related to each other by:

$$\begin{aligned} \sum _{k_+=0}^n\sum _{k_-=0}^{n-k_+}k_+P_{(-,k_-,k_+)}(t)=\sum _{k_+=0}^n\sum _{k_-=0}^{n-k_+}k_-P_{(+,k_-,k_+)}(t), \end{aligned}$$
(19)

This is evident from the interpretation, since both terms denote the number of \({ SI}\) partnerships, i.e. the number of partnerships involving an infectious and a susceptible individual. The proof of (19) starts by differentiating both left- and right hand side with respect to \(t\) and continues by inserting components of (18); this is worked out for a similar situation in (Lindquist et al. 2011, Appendix B).

We have assumed that the infectious disease has no influence on the partnership formation and separation or on the probability per unit of time of dying. Therefore, the disease-free partnership network is embedded in (18) and the fraction of individuals in the population in state \(k\) at time \(t\) is equal to

$$\begin{aligned} P_k(t)=\sum _{k_-+k_+=k}\left( P_{(-,k_-,k_+)}(t)+P_{(+,k_-,k_+)}(t)\right) . \end{aligned}$$
(20)

Furthermore, the dynamics of partnerships in the population are governed by the sum of the fraction of free susceptible and the fraction of free infectious binding sites, which is equal to the total fraction of free binding sites, i.e. \(F_-(t)+F_+(t)=F(t)\). As a consequence, the set characterized by

$$\begin{aligned} F_-(t)+F_+(t)=\bar{F} \end{aligned}$$
(21)

is invariant and attracting. Therefore, also in the network with infection superimposed, we consider \(F(t)\) constant and equal to \(\bar{F}\) (see Lemma 1). Likewise, we can consider the left hand side of (20) as constant in time and given by (6).

4 Linearisation and the map \(L\)

In this section we linearise system (18) around the disease-free equilibrium. Next we show that we can reduce the dimension of the linearised system and consider only the variables \(P_{(-,k_-,1)}\) and \(P_{(+,k_-,k_+)}\). In Sect. 6 we will use this reduced linearised system to prove that the basic reproduction number \(R_0\), that we characterize in Sect. 5, indeed provides a threshold value of 1 for the disease free steady state of system (18) to become unstable. To this end we define a map \(L\) in Sect. 4.2, which allows us to relate, in the linearisation, population-level fractions of individuals (that we consider in the present section) to fractions of binding sites (that we consider in Sect. 5).

4.1 Linearisation

Note that the disease-free equilibrium is given by

$$\begin{aligned} P_{(-,k,0)}(t)=P_k, \end{aligned}$$

\(0\le k\le n\), and \(P_\ell (t)=0\) for all triplets \(\ell \) not of the form \((-,k,0)\).

Next, note that we can use relationship (21) in order to replace \(F_-\) by \(\bar{F}-F_+\) (note that this last expression does not involve any variable of the form \(P_{(-,k,0)}\)). Next, we can reduce the dimension of the system by \(n+1\) by eliminating the \(P_{(-,k,0)}\), \(k=0,\ldots ,n\), from the system using relation (20).

Consider the differential equations for \(P_{(-,k_-,1)}\), \(0\le k_-\le n-1\), explicitly given by

$$\begin{aligned} \frac{dP_{(-,k_-,1)}}{dt}&={-}\left( \rho \bar{F}(n-k_{-}{-}1)+(\sigma +\mu )(k_-+1)+\beta +\beta \varLambda _-k_-+\mu \right) P_{(-,k_{-},1)}\\&+\rho (\bar{F}-F_+)(n-k_-)P_{(-,k_{-}-1,1)}+\rho F_+(n-k_-)P_{(-,k_-,0)}\\&+(\sigma +\mu )\left( (k_-+1)P_{(-,k_-+1,1)}+2P_{(-,k_-,2)}\right) \\&+\beta \varLambda _-(k_-+1)P_{(-,k_-+1,0)} \end{aligned}$$

(as one can verify by writing out the relevant part of (18)).

Then the only nonlinear terms are those that involve \(F_+\) or \(\varLambda _\pm \) as a factor. In these differential equations we find, among the nonlinear terms,

$$\begin{aligned} \rho F_+(n-k_-)P_{(-,k_-,0)} \end{aligned}$$
(22)

and

$$\begin{aligned} \beta \varLambda _-(k_-+1)P_{(-,k_-+1,0)}. \end{aligned}$$
(23)

Trusting that it does not lead to confusion we will denote the variables in the linearisation of (18) by the same symbols as the variables in the nonlinear system.

Linearisation of (22) yields

$$\begin{aligned} \rho F_+(n-k_-)P_{k_-} \end{aligned}$$

where \(P_{k_-}\) is the fraction of the population in state \(k_-\) in the disease-free network and \(F_+\) is defined as in (13), only now for the variables of the linearised system. For (23), similarly replace \(P_{(-,k_-+1,0)}\) by \(P_{k_-+1}\) but next use the identity

$$\begin{aligned} (k_-+1)P_{k_-+1}=Q_{k_-+1}\sum _{m}mP_m \end{aligned}$$

(cf. (8)). In the definition (16) of \(\varLambda _-\) we take linearisation into account by adapting the denominator of the expression for \(\lambda _-\) in (14). More precisely, we replace that denominator by

$$\begin{aligned} \sum _mmP_m. \end{aligned}$$

Note that this cancels the identical factor in the numerator. The upshot is that this sum leads to the linearisation of (23) being equal to

$$\begin{aligned} \beta Q_{k_-+1}\sum _{j_+=0}^{n-1}\sum _{j_-=1}^{n-j_+}j_+j_-P_{(-,j_-,j_+)}. \end{aligned}$$
(24)

In all other nonlinear terms, whenever \(F_+\) or \(\varLambda _\pm \) multiplies \(P_\ell \) and \(P_\ell \) is zero in the disease free steady state, simply put \(F_+\) respectively \(\varLambda _\pm \) equal to their values in the disease-free equilibrium, i.e.

$$\begin{aligned} F_+&=0\\ \varLambda _-&=0\\ \varLambda _+&=1, \end{aligned}$$

to obtain the corresponding term for the linearised system.

Thus we deduce that the linearised system is given by

(25)

Remark 1

In Lemma 3 below we will show that we can simplify expression (24) to

$$\begin{aligned} \beta Q_{k_-+1}\sum _{j_-=0}^{n-1}j_-P_{(-,j_-,1)}. \end{aligned}$$

Intuitively, one would expect that, in the linearisation, for \(k_+\ge 2\), \(P_{(-,k_-k_+)}(t)=0\) for all \(t\) if \(P_{(-,k_-k_+)}(0)=0\). Indeed, in the beginning of an epidemic very few individuals in the population are infectious. It is already very unlikely for a susceptible individual to have an infectious partner, so the probability that a susceptible individual has more than one infectious partner should be negligible. That this is indeed the case, is established in the following lemma.

Lemma 3

In the linearised system (25), if \(P_{(-,k_-k_+)}(0)=0\), then

$$\begin{aligned} P_{(-,k_-,k_+)}(t)\equiv 0, \end{aligned}$$

for \(k_+\ge 2\).

Proof

We prove the lemma in four steps

Step 1.:

Observe first that the differential equations for \(P_{(-,k_-,k_+)}\), \(k_+\ge 2\), form a closed system, i.e. they do not depend on the remaining variables (see (25)).

Step 2.:

Observe that this closed system has a certain hierarchical structure, viz. the subsystem for the variables

$$\begin{aligned} P_{(-,j,n-k)}, \end{aligned}$$

\(0\le j\le k\), depends on the variables of the subsystems with a lower value of \(k\), but not on the variables of any subsystem with a higher value of \(k\) (the reason is that both \(F_+\) and \(\varLambda _-\) were put equal to zero to derive the equations that we consider; recall that we focus on \(n-k\ge 2\)).

Step 3.:

For \(k=0\) we have

$$\begin{aligned} \frac{dP_{(-,0,n)}}{dt}=-\left( (\sigma +\mu )n+\mu +\beta n\right) P_{(-,0,n)} \end{aligned}$$

so, if \(P_{(-,0,n)}(0)=0\), then \(P_{(-,0,n)}\equiv 0\).

Step 4.:

Consider \(k=1\). The diagram in Fig. 1 shows at once that the zero state is globally stable, i.e. if \(P_{(-,j,n-1)}(0)=0\), then \(P_{(-,j,n-1)}\equiv 0\), \(j=0,1\). For \(k=2\), we have the diagram in Fig. 2, which shows that if \(P_{(-,j,n-2)}(0)=0\), then \(P_{(-,j,n-2)}\equiv 0\), \(j=0,1,2\).

By continuing in this way we establish that for all \(k\) with \(0\le k\le n-2\), if \(P_{(-,j,n-k)}(0)=0\), then \(P_{(-,j,n-k)}\equiv 0\), \(j=0,1,\ldots ,k\). \(\square \)

Fig. 1
figure 1

Diagram that shows that, if \(P_{(-,j,n-1)}(0)=0\), then \(P_{(-,j,n-1)}\equiv 0\), \(j=0,1\)

Fig. 2
figure 2

Diagram that shows that, if \(P_{(-,j,n-2)}(0)=0\), then \(P_{(-,j,n-2)}\equiv 0\), \(j=0,1,2\)

It follows that we are left to deal with the stability of the following linear system:

(26)

Recall definition (13) of \(F_+\). In the reduced linearised system (26) we are left with variables \(P_{(-,k_-,1)}\) and \(P_{(+,k_-,k_+)}\), \(k_-,k_+\ge 0\), \(0\le k_-+k_+\le n\). Therefore, (26) is a closed system. Furthermore, note that the dimension of the system is \(n+\frac{1}{2}(n+1)(n+2)=\frac{1}{2}(n^2+5n+2)\) (where the contribution \(n\) comes from the \(P_{(-,k_-,1)}\) and the \(\frac{1}{2}(n+1)(n+2)\) from the \(P_{(+,k_-,k_+)}\)).

4.2 The map \(L\)

Order the \(P_\ell \) in some appropriate way, and denote the corresponding vector by \(P\). We define a linear map \(L\) from \({{\mathrm{\mathbb {R}}}}^{\frac{1}{2}(n^2+5n+2)}\) to \({{\mathrm{\mathbb {R}}}}^{n+2}\) as follows:

$$\begin{aligned} L(P)=\begin{pmatrix}\sum _{k_+=0}^n\sum _{k_-=0}^{n-k_+}(n-k_--k_+)P_{(+,k_-,k_+)}\\ \left( P_{(-,j-1,1)}\right) _{j=1}^n\\ \sum _{k_+=0}^n\sum _{k_-=0}^{n-k_+}k_+P_{(+,k_-,k_+)} \end{pmatrix}. \end{aligned}$$
(27)

Note that \(L\) maps the positive \(P\)-cone to the positive cone in \({{\mathrm{\mathbb {R}}}}^{n+2}\). In fact, if \(P\) is in the interior of the positive cone, i.e. all vector elements are strictly positive, then

$$\begin{aligned} \sum _{k_+=0}^n\sum _{k_-=0}^{n-k_+}(n-k_--k_+)P_{(+,k_-,k_+)}>0, \end{aligned}$$

since \(n-k_--k_+\ge 0\) for all \(k_-+k_+< n\), and \(P_{(+,k_-,k_+)}>0\) for all \(k_-\) and \(k_+\),

$$\begin{aligned} P_{(-,j,1)}>0, \end{aligned}$$

and

$$\begin{aligned} \sum _{k_+=0}^n\sum _{k_-=0}^{n-k_+}k_+P_{(+,k_-,k_+)}>0, \end{aligned}$$

since \(P_{(+,k_-,k_+)}>0\) for all \(k_-\) and we sum over \(k_+=0,1,2,\ldots ,n\). In particular it follows that if \(L(P)=0\), then \(P=0\). We shall use this linear operator \(L\) in Sect. 6.

5 Dynamics of the binding sites of an infectious individual: characterization of \(R_0\)

By exploiting that an individual can be considered as a collection of \(n\) binding sites that behave independently from one another as far as separation or acquiring a new partner is concerned and by using our mean field at distance one assumption, we are able to characterize \(R_0\) in terms of binding sites. In this section we only use the interpretation of the model and we do not use the system (18) or its reduced linearisation (26). We characterize \(R_0\) as the dominant eigenvalue of a next-generation matrix (NGM) that we construct using the interpretation of the model.

The entries in the NGM can be viewed as expected offspring values for a multi-type branching process (Jagers 1975; Haccou et al. 2005), with the two matrix-indices specifying the type at birth of, respectively, offspring and parent. Several slightly different branching processes may yield the same NGM and for the deterministic theory (which is what we deal with here) there is no need to choose one of these as ‘the’ underlying process. A branching process corresponding to the NGM is subcritical when \(R_0<1\) and supercritical when \(R_0>1\). But does such a branching process indeed correspond to the linearisation of (18) in the disease free steady state? Especially for \(n>1\) this is a nontrivial question. In Sect. 6 we will therefore prove that \(R_0\), as computed from the NGM, is indeed a threshold parameter with threshold value one for (18).

First, in Sect. 5.1, we consider the case \(n=1\). In Sect. 5.2 we generalize the transition and transmission scheme to \(n>1\), and in Sect. 5.3 we characterize \(R_0\) on the level of binding sites. We conclude this section by showing in Sect. 5.4 that \(R_0\) also has an interpretation in terms of individuals. The explicit expression for \(R_0\) and the remainder of its derivation is left for Appendix C.

Consider the usual setting for determining \(R_0\), i.e. suppose that we have a population in which only a few individuals are infectious and all others are susceptible. We are interested in the expected number of secondary cases caused by one ‘typical’ infectious case.

5.1 The case \(n=1\)

First, consider a population of individuals with partnership capacity \(n=1\). Then each individual has exactly one binding site. If we now consider an infectious individual, then its binding site can be in one of three states:

  • \(A\)—free

  • \(B\)—occupied by a susceptible partner

  • \(C\)—occupied by an infectious partner

Please note that we recycle symbols: the \(A\) here has nothing to do with the matrix \(A\) of Sect. 2 and the \(B\) here has nothing to do with the matrix \(B\) in (10) of Sect. 3. In Fig. 3 the possible state transitions and corresponding rates for an infectious individual are given. Note that it is highly unlikely that an infectious individual acquires an infectious partner in the beginning of an epidemic, and therefore there is no transition from \(A\) to \(C\).

Fig. 3
figure 3

Flow chart describing the possible transitions between states \(A\), \(B\) and \(C\) and their corresponding rates. Note that, in the beginning of the epidemic, only a few individuals in the population are infectious. Therefore the probability that an infectious individual acquires an infectious partner is zero. This is represented in the flowchart where there is no direct arrow from \(A\) to \(C\)

We can characterize \(R_0\) by constructing an NGM \(K_1\) that involves a transmission part \(T_1\) and a transition part \(\varSigma _1\).

Recall that we use the convention that, for a transition matrix \(M=(m_{ij})\), \(m_{ij}\) denotes the probability per unit of time at which a transition from \(j\) to i occurs (instead of the transition from i to \(j\), as it is common in the stochastic community).

The matrices \(T_1\) and \(\varSigma _1\) are obtained as follows. Consider an infectious individual, and order the states as \(A\), \(B\), \(C\). Then the transitions of the individual’s binding site are described by the following matrix \(\varSigma _1\) (see Fig. 3 for its graphical representation):

$$\begin{aligned} \varSigma _1=\begin{pmatrix}-(\rho \bar{F}+\mu ) &{} \sigma +\mu &{}\sigma +\mu \\ \rho \bar{F} &{} -(\beta +\sigma +2\mu )&{} 0\\ 0&{}\beta &{}-(\sigma +2\mu )\end{pmatrix}. \end{aligned}$$
(28)

Here \((\varSigma _1)_{xy}\) is the rate at which a transition from a binding site in state \(y\) to state \(x\) occurs, \(x,y\in \{A,B,C\}\), \(x\ne y\), and for the diagonal elements we have \((\varSigma _1)_{xx}=-(\mu +\sum _{y\ne x}(\varSigma _1)_{yx})\).

Consider an infectious individual \(u\) with its binding site in state \(B\). If \(u\) infects its susceptible partner \(v\), then the binding site of \(u\) transits from \(B\) to \(C\). This transition is represented by \((\varSigma _1)_{CB}=\beta >0\). In addition to this transition, an additional \(C\) binding site is created. Indeed, \(v\) is now also an infectious individual who has a binding site occupied by an infectious partner (namely \(u\)). This shows that one transition from \(B\) to \(C\) always creates one additional \(C\) binding site in the population. Accordingly we define the transmission matrix \(T_1\):

$$\begin{aligned} T_1=\beta \begin{pmatrix}0&{}\quad 0&{}\quad 0\\ 0&{}\quad 0&{}\quad 0\\ 0&{}\quad 1&{}\quad 0\end{pmatrix}. \end{aligned}$$
(29)

Using \(T_1\) and \(\varSigma _1\) we can construct the NGM \(K_1\):

$$\begin{aligned} K_1=T_1(-\varSigma _1)^{-1}. \end{aligned}$$

The basic reproduction number \(R_0\) is defined as the dominant eigenvalue of \(K_1\) (Diekmann et al. 2013, Section 7.2).

In the present case we can, quite easily, give an explicit expression for \(R_0\). Note that \(T_1\) has one-dimensional range spanned by the vector \((0,0,1)'\). Therefore \((0,0,1)'\) is the eigenvector corresponding to the dominant eigenvalue \(R_0\). We find \(K_1\) applied to this vector by first constructing \((-\varSigma _1)^{-1}\) applied to this vector. This can be done by either treating it as a linear algebra problem or we can use the interpretation for it: \((-(\varSigma _1)^{-1}(0,0,1)')_x\) is the mean time spent in state \(x\) when starting in state \(C\), \(x=A\), \(B\), \(C\) (in fact we only use \(x=B\)). We find that

$$\begin{aligned} (-\varSigma _1)^{-1}\begin{pmatrix}0\\ 0\\ 1\end{pmatrix}=\begin{pmatrix}\frac{\sigma +\mu }{\mu (\rho \bar{F}+\sigma +2\mu )}\\ \frac{\rho \bar{F}(\sigma +\mu )}{\mu (\beta +\sigma +2\mu )(\rho \bar{F}+\sigma +2\mu )}\\ \frac{\beta (\rho \bar{F}+\mu )+\mu (\rho \bar{F}+\sigma +2\mu )}{\mu (\beta +\sigma +2\mu )(\rho \bar{F}+\sigma +2\mu )}\end{pmatrix}, \end{aligned}$$

and subsequently,

$$\begin{aligned} K_1\begin{pmatrix}0\\ 0\\ 1\end{pmatrix}=\frac{\beta \rho \bar{F}(\sigma +\mu )}{\mu (\beta +\sigma +2\mu )(\rho \bar{F}+\sigma +2\mu )}\begin{pmatrix}0\\ 0\\ 1\end{pmatrix}, \end{aligned}$$

from which we conclude that

$$\begin{aligned} R_0=\frac{\beta \rho \bar{F}(\sigma +\mu )}{\mu (\beta +\sigma +2\mu )(\rho \bar{F}+\sigma +2\mu )}. \end{aligned}$$
(30)

Alternatively, we can characterize \(R_0\) by first step analysis; see Appendix A for the details or Diekmann et al. (2013, Section 7.8) or Miller et al. (2012, formula (3.1.9)) or Inaba (1997, Section 4.1). However, this does not have such a nice generalization to \(n>1\) as the \(ABC\) scheme does.

5.2 Generalization of the transition and transmission matrix: \(n>1\)

Now consider the case \(n>1\). In this case, an individual is a collection of \(n\) binding sites. These binding sites may be free, occupied by a susceptible or occupied by an infectious individual, i.e. in states \(A\), \(B\), or \(C\), respectively. An infectious individual can infect a susceptible individual in the population if it has a binding site that is occupied by a susceptible individual. In that case, that binding site becomes occupied by an infectious individual. Similar to the \(n=1\) situation we observe that if a binding site makes a transition from ‘occupied by a susceptible individual’ to \(C\), it creates a new infectious individual in the population. However, we need to know in which states the \(n\) binding sites of this new infectious individual are. Obviously, one new infectious binding site is in state \(C\), viz. the binding site still occupied by its epidemiological parent. In order to know the states of the other \(n-1\) binding sites, we need to know the number of (susceptible) partners of this individual at epidemiological birth.

Naively, motivated by Lemma 2, one would think (as we did at first) that the number of partners of a newly infected individual is \(k\) (i.e. 1 binding site in state \(C\),   \(k-1\) binding sites in state \(B\) and \(n-k\) binding sites in state \(A\)) with probability \(Q_k\). The computation of the corresponding \(R_0\) is rather straightforward (using the method explained in Appendix A for \(n=1\)). However, one can check numerically that the stability switch of the disease free steady state of (18) does not coincide with \(R_0=1\) when \(R_0\) is defined in this manner. We conclude that the premise is wrong. In retrospect this makes sense. First of all, we know that \(q\) differs from \(Q\), where \(q\) and \(Q\) are defined by (7) and (8), respectively. In our model description we keep track of the number of partners of an individual. We use mean field at distance one for the partners of partners of this individual (and this shows up in the \(\varLambda _\pm \) in the transmission events). So we need to do the same when characterizing \(R_0\) and also take into account the partners of susceptible partners. Therefore, we need to extend the information that is tracked in the scheme.

We generalize the \(ABC\) scheme of Sect. 5.1 as follows. Consider an infectious binding site. Then this binding site can be in one of \(n+2\) states:

  • \(A\)—free

  • \(B_j\)—occupied by a susceptible partner that has \(j\) partners in total, \(j=1,\ldots , n\)

  • \(C\)—occupied by an infectious partner.

Let \({{\varvec{B}}}\) denote the collection of all states \(B_j\). We denote the transition matrix of the states \(A\), \(B_j\), \(j=1,\ldots ,n\), and \(C\) by \(\varSigma \) (see Fig. 4 for the corresponding flowchart), where

$$\begin{aligned} \varSigma =\begin{pmatrix}-(\rho \bar{F}+\mu )&{}\quad \varvec{\sigma } +\varvec{\mu }&{}\quad \sigma +\mu \\ \rho \bar{F} {\varvec{q}}&{}\quad \varSigma _{{\varvec{B}}} &{}\quad \varvec{0}\\ 0&{}\quad \varvec{\beta }&{}\quad -(\sigma +2\mu )\end{pmatrix}, \end{aligned}$$
(31)

where \(\varvec{0}\) denotes the \(n\) dimensional zero vector, \(\varvec{\sigma } +\varvec{\mu }\) and \(\varvec{\beta }\) both denote an \(n\)-dimensional row vector, namely

$$\begin{aligned} \varvec{\sigma } +\varvec{\mu }&=(\sigma +\mu )\begin{pmatrix}1&1&\cdots&1\end{pmatrix},\\ \varvec{\beta }&=\beta \begin{pmatrix}1&1&\cdots&1\end{pmatrix}. \end{aligned}$$

The vector \({\varvec{q}}\) is the probability vector with elements \(q_k\) given by (7), and \(\varSigma _{{\varvec{B}}}\) is an \(n\times n\) matrix describing the transitions between the states \(B_1\), \(\ldots \), \(B_n\) and out of \({\varvec{B}}\); see Fig. 4 for the corresponding flowchart.

Fig. 4
figure 4

Flow chart describing the possible transitions between states \(A\), \(B_j\), \(j=1,\ldots ,n\), and \(C\) and their corresponding rates

Let’s describe \(\varSigma _{{\varvec{B}}}\) more carefully. The matrix \(\varSigma _{{\varvec{B}}}\) describes the transitions between the \(B_j\) and out of \({\varvec{B}}\). Thus \(\varSigma _{{\varvec{B}}}\) is an \(n\times n\) tridiagonal matrix with negative diagonal entries and positive off-diagonal entries. More specifically,

$$\begin{aligned} (\varSigma _{{\varvec{B}}})_{j-1,j}&= \ (\sigma +\mu )(j-1),\\ (\varSigma _{{\varvec{B}}})_{j,j}&= \ -(\beta +\rho \bar{F}(n-j)+(\sigma +\mu )j+\mu ),\\ (\varSigma _{{\varvec{B}}})_{j+1,j}&= \ \rho \bar{F}(n-j). \end{aligned}$$

Indeed, a susceptible individual with 1 infectious and \(j-1\) susceptible partners loses one of these susceptible partners at rate \((\sigma +\mu )(j-1)\), acquires a new susceptible partner at rate \(\rho \bar{F}(n-j)\), and, since it can also become infectious, lose its infectious partner, or die (these last three mark transitions out of \({\varvec{B}}\)), the rate out of \(B_j\) is \((\varSigma _{{\varvec{B}}})_{j,j}=\ -(\beta +\rho \bar{F}(n-j)+(\sigma +\mu )j+\mu )\).

The other elements of \(\varSigma \) have the following interpretation. Note that, in the beginning of an epidemic, a binding site in state \(A\) acquires a susceptible partner at rate \(\rho \bar{F}\). The probability that, just after the moment of acquisition, this susceptible partner has in total \(j\) partners is \(q_j\) in accordance with (7). Therefore, the rate at which a binding site in state \(A\) transits to state \(B_j\) is \((\varSigma )_{B_j,A}=\rho \bar{F} q_j\). In a similar way one can use the interpretation (and the flowchart in Fig. 4) to find the other entries for the matrix \(\varSigma \).

Finally, we need to construct the transmission matrix \(T\). A transmission corresponds to a transition \(B_j\rightarrow C\), i.e. if an infectious individual \(u\) with a binding site in \(B_j\) infects its partner \(v\). This is included in the matrix \(\varSigma \) since \((\varSigma )_{C,B_j}=\beta >0\). The transmission matrix \(T\) includes the binding sites of the newly infected partner \(v\). Concerning the binding sites of \(v\), since it is now infectious, we observe that it has one binding site in \(C\), \(n-j\) binding sites in \(A\) and \(j-1\) binding sites will be occupied by susceptible individuals, i.e. \(j-1\) binding sites will be in the set \({\varvec{B}}\) (see Fig. 5 for an illustration where \(u\) has a binding site in \(B_2\) that changes state to \(C\) and \(v\) is the newly infected individual with one binding site in \(C\), one binding site in \(A\) and one binding site occupied by a susceptible individual). All that is left to specify are the states of the \(j-1\) binding sites in \({\varvec{B}}\), i.e. we need to know how many partners these susceptible partners of \(v\) have (in Fig. 5: how many partners does \(w\) have).

Fig. 5
figure 5

Illustration of the construction of \(T\) for \(n=3\). Suppose we start with an individual with two binding sites in \(A\) and one binding site in \(B_2\). Then \(u\) has one susceptible partner \(v\). If \(u\) infects \(v\), then \(v\) will have one binding site in \(C\), one binding site in \(A\), and one binding site will be occupied by a susceptible partner \(w\). In the example, \(w\) has three partners in total and therefore the binding site of \(v\) would be in state \(B_3\). However, information about the partners of \(w\) is not incorporated in our model description and therefore we assume that \(w\) has three partners with probability \(Q_3\)

The probability that a partner \(w\) of \(v\) has \(k\) partners depends on the state of \(v\), where \(v\) is in state \((+,j-1,1)\) immediately after infection by \(u\). However, as another manifestation of the mean field at distance one assumption, we approximate this probability by only taking into account that the susceptible individual \(w\) has at least one partner \(v\). Therefore, we assume that \(w\) has \(k\) partners with probability \(Q_k\) (cf. Lemma 2). In other words, we assume that a binding site of \(v\) occupied by a susceptible partner, i.e. a binding site in the set \({\varvec{B}}\), is in state \(B_k\) with probability \(Q_k\).

Accordingly, we define the transmission matrix \(T\) as follows:

$$\begin{aligned} T=\beta \left( \begin{array}{ccccc}0&\phi _1&\cdots&\phi _n&0\end{array}\right) , \end{aligned}$$
(32)

where \(\phi _j\) is the \(n+2\) vector

$$\begin{aligned} \phi _j=\left( \begin{array}{c}n-j \\ (j-1)\varvec{Q} \\ 1\end{array}\right) =(n-j)\psi _A+(j-1)\psi _{{\varvec{B}}}+\psi _C, \end{aligned}$$
(33)

\(j=1,\ldots ,n\), where

$$\begin{aligned} \psi _A=\left( \begin{array}{c}1 \\ \varvec{0} \\ 0\end{array}\right) , \quad \psi _{{\varvec{B}}}=\left( \begin{array}{c}0\\ \varvec{Q}\\ 0\end{array}\right) , \quad \psi _C=\left( \begin{array}{c}0\\ \varvec{0}\\ 1\end{array}\right) , \end{aligned}$$
(34)

and \({\varvec{Q}}\) is the probability vector with components \(Q_k\) given by (8). Note that the \(\phi _j\) are a linear combination of the \(\psi _x\), \(x\in \{A,{{\varvec{B}}},C\}\). We conclude that the range of \(T\) is spanned by \(\psi _A\), \(\psi _{{\varvec{B}}}\), \(\psi _C\).

In Sect. 5.4 we shall show that we can identify the \(\phi _j\) with an individual in state \((+,j-1,1)\), which allows us to interpret \(R_0\) in terms of individuals. But first, in Sect. 5.3, we focus on the interpretation in terms of binding sites.

5.3 \(R_0\) in terms of binding sites

Now that we have defined the transition matrix \(\varSigma \) and the transmission matrix \(T\), we are ready to define the basic reproduction ratio \(R_0\) for \(n>1\) as the dominant eigenvalue of the matrix

$$\begin{aligned} T(-\varSigma )^{-1}. \end{aligned}$$

In order to underpin this, consider variables \(X_A\), \(X_{B_j}\), and \(X_C\), where \(X_A\), \(X_{B_j}\), and \(X_C\) are the fractions of the total binding-site population in states \(A\), \(B_j\), and \(C\), respectively. Then, based on the interpretation, \(X_A\), \(X_{B_j}\), and \(X_C\) should satisfy the following system of differential equations:

$$\begin{aligned} \frac{d}{dt}\begin{pmatrix}X_A\\ X_{B_1}\\ \vdots \\ X_{B_n}\\ X_C\end{pmatrix}= (T+\varSigma )\begin{pmatrix}X_A\\ X_{B_1}\\ \vdots \\ X_{B_n}\\ X_C\end{pmatrix}\!. \end{aligned}$$
(35)

It follows that the zero state \((X_A,X_{B_1},\ldots ,X_{B_n},X_C)'=0\) switches stability at \(R_0=1\). We formulate this as

Theorem 1

\(R_0\), defined as the dominant eigenvalue of \(T(-\varSigma )^{-1}\), is a threshold parameter with threshold value one for the zero state of (35).

Note that \(T(-\varSigma )^{-1}\) is an \((n+2)\times (n+2)\) matrix. Also, elements \((T(-\varSigma )^{-1})_{xy}\) can be interpreted as the expected number of binding sites in \(x\) created by one binding site in \(y\), where \(x,y\in \{A,B_1,\ldots ,B_n,C\}\). This gives us an interpretation of \(R_0\) in terms of binding sites \(A,B_1,\ldots ,B_n,C\). However, we can reduce the characterization of \(R_0\) to a problem involving a \(3\times 3\) matrix by averaging the \(B_j\) in the right way (and this allows us to consider binding sites in \(A\), \({\varvec{B}}\), \(C\) only). We show this in the remainder of this subsection.

Consider the \(3\times 3\) matrix \(K=(k_{x,y})\) where the \(k_{x,y}\), \(x,y=A,{{\varvec{B}}},C\), are defined by

$$\begin{aligned} T(-\varSigma )^{-1}\psi _y=\sum _{x=A,{{\varvec{B}}},C}k_{x,y}\psi _x. \end{aligned}$$
(36)

Then \(R_0\) is also the dominant eigenvalue of \(K\). We formulate this in a theorem.

Theorem 2

\(R_0\), defined as the dominant eigenvalue of \(K\), where \(K\) is defined by (36), is a threshold parameter with threshold value one for the zero state of (35).

Proof

We have defined \(R_0\) as the dominant eigenvalue of \(T(-\varSigma ^{-1})\) and this \(R_0\) is a threshold parameter of the linear system corresponding to the matrix \(T+\varSigma \) according to Theorem 1. We will show that \(T(-\varSigma ^{-1})\) and \(K\) have the same dominant eigenvalue.

The range of \(T\) is spanned by three linearly independent vectors \(\psi _A\), \(\psi _{{\varvec{B}}}\), \(\psi _C\). If \(T(-\varSigma )^{-1}v=\lambda v\), with \(\lambda \ne 0\), \(v\ne 0\), then \(v\) lies in the range of \(T\), i.e. \(v=\sum _{y}w_y\psi _y\), with at least one of the \(w_y\ne 0\). Therefore,

$$\begin{aligned} T(-\varSigma )^{-1}v&=T(-\varSigma )^{-1}\left( \sum _{y}w_y\psi _y\right) \\&=\sum _{x}\sum _{y}k_{x,y}w_y\psi _x, \end{aligned}$$

where the summation is over \(x\) or \(y\in \{A,{\varvec{B}}, C\}\). On the other hand, this is equal to

$$\begin{aligned} \lambda v=\lambda \sum _x w_x\psi _x. \end{aligned}$$

Since the \(\psi _x\) are linearly independent, it follows that

$$\begin{aligned} \sum _yk_{x,y}w_y=\lambda w_x, \end{aligned}$$

for all \(x=A,{\varvec{B}},C\). In matrix notation:

$$\begin{aligned} Kw=\lambda w, \end{aligned}$$

where \(w=(w_x)\) is a three-dimensional vector, not equal to the zero vector. We conclude that if \(\lambda \) is a nonzero eigenvalue of \(T(-\varSigma )^{-1}\), then \(\lambda \) is also a nonzero eigenvalue of \(K\). To find the dominant eigenvalue \(R_0\) of \(T(-\varSigma )^{-1}\), we can focus on the \(3\times 3\) matrix \(K=(k_{x,y})\). \(\square \)

Consider the definition of \(K\) given by (36). This definition allows for an interpretation of the elements \(k_{x,y}\). Indeed, \(k_{x,y}\) can be interpreted as the expected number of binding sites in \(x\) created by one binding site in \(y\), with \(x,y \in \{ A,{{\varvec{B}}},C\}\). Therefore, we call \(K\) the NGM on the level of binding sites, and \(R_0\) can be interpreted as the expected number of secondary cases caused by a typical newly infected binding site in the beginning of an epidemic. Note that when \(x\) or \(y\) equals \({\varvec{B}}\) we specify a probability distribution rather than a specific state.

The relation (36) completely characterizes the matrix \(K\). However, using the interpretation, we can give explicit expressions for the entries of \(K\); see Appendix C. In this appendix it is also shown that, in order to find \(R_0\), we can reduce \(K\) to a \(2\times 2\) matrix and calculate the dominant eigenvalue of this smaller matrix. By combining (55)–(57), (59), and (61)–(63) we then find \(R_0\) given as an explicit function of the model parameters.

We have characterized \(R_0\) in terms of binding sites, both by considering all possible states \(\{A,B_1,\ldots , B_n, C\}\) and by considering \(\{A,{\varvec{B}}, C\}\). This allows for an interpretation of \(R_0\) in terms of binding sites. As we next show, we may also interpret \(R_0\) in terms of individuals.

5.4 \(R_0\) in terms of individuals

The model description is on the level of individuals, so it is only sensible that, in this section, we concern ourselves with the interpretation of \(R_0\) in terms of individuals, i.e. the interpretation of \(R_0\) as the expected number of secondary cases caused by a typical newly infected individual (rather than binding site) in the beginning of an epidemic.

Individuals can be considered as collections of \(n\) binding sites. We find the relation between the binding site level and the individual level as follows. Recall (33), where we see in the second equality that the \(\phi _j\) are a linear combination of the \(\psi _A\), \(\psi _{{\varvec{B}}}\), and \(\psi _C\). Note that \(\phi _j\) is a collection of \(n\) infectious binding sites, \(n-j\) in state \(A\), 1 in state \(C\), and \(j-1\) in states \(B_l\), \(l=0,\ldots ,n\) (and where the infectious binding site is in state \(B_l\) with probability \(Q_l\)). We can identify \(\phi _j\) with an individual in state \((+,j-1,1)\). Note that the \((+,j-1,1)\) are the possible states of an individual at epidemiological birth. For the case \(n=1\), we have \(\phi _1=\psi _C\) only (which corresponds to the only state-at-epi-birth \((+,0,1)\) since an infectious individual at epi-birth is in a partnership with its epidemiological partner).

This observation allows us to also give an interpretation to \(R_0\) for individuals. Indeed, consider \(K^\mathrm{ind}=((k^\mathrm{ind})_{ij})\), where the \((k^\mathrm{ind})_{ij}\) are characterized by

$$\begin{aligned} T(-\varSigma ^{-1})\phi _j=\sum _{i=1}^n(k^\mathrm{ind})_{ij}\phi _i. \end{aligned}$$
(37)

Element \((k^\mathrm{ind})_{ij}\) is then the expected number of secondary cases in state i caused by one infectious individual in state \(j\). Here i and \(j\) are of the form \((+,m,1)\), \(m=0,\ldots ,n-1\). To arrive at the interpretation of \(R_0\) on the individual-level, we can prove that the dominant eigenvalue of \(K^\mathrm{ind}\) (which is the NGM on individual level) equals the dominant eigenvalue \(R_0\) of \(K\); see Appendix E for the details.

The matrix \(K^\mathrm{ind}\) is completely characterized by the identity (37). But, as in the case of \(K\), we can use the interpretation to give a more explicit expression for the entries of \(K^\mathrm{ind}\); see Appendix F.

5.5 \(R_0\): equivalence of different interpretations

In Sect. 5.3 \(R_0\) is defined as the dominant eigenvalue of \(T(-\varSigma )^{-1}\). Theorem 2 states that \(R_0\) is also the dominant eigenvalue of the \(ABC\)-NGM \(K\), where \(K\) is defined by (36), and in Appendix C we show that, in order to find the dominant eigenvalue of \(K\), we can reduce \(K\) to a \(2\times 2\) matrix \(\tilde{K}\). Finally, in Appendix E, we show that \(R_0\) is also the dominant eigenvalue of the NGM \(K^\mathrm{ind}\) on individual level. We summarize this in (38), where \(\Leftrightarrow \) refers to ‘has the same dominant eigenvalue’.

$$\begin{aligned} T(-\varSigma ^{-1})\quad \Longleftrightarrow&\quad K \quad \Longleftrightarrow \quad K^{\text {ind}}\\&\quad \Updownarrow \nonumber \\&\quad \tilde{K}\nonumber \end{aligned}$$
(38)

In the next section we prove that \(R_0\) defined in this way is indeed a threshold for the stability of the disease-free steady state of the nonlinear system (18), by using \(L\) defined in (27) to relate the linearisation of (18) to (35).

6 Proof that \(R_0\) is a threshold parameter

Recall that, using the mean field at distance one assumption, we have written down a system of differential equations to describe the transmission of the infectious disease across the dynamic network. We will refer to the system (18) of differential equations for the fractions of the population of individuals in states \(\ell \), \(\ell =(\pm ,k_-,k_+)\) as the \(P\)-system. In Sect. 4 we have linearised this system around the disease-free steady state and we were able to restrict this linearised system to the fractions \(P_{(-,k,1)}\) and \(P_{(+,k_-,k_+)}\). In Sect. 5 we considered binding sites of an infectious individual (in the linearisation!) and these binding sites could be in \(A\), \({\varvec{B}}\), and \(C\). This led to the \(ABC\)-system (35). \(R_0\), defined as the dominant eigenvalue of \(K\), is a threshold for the stability of the zero state of (35); this was formulated in Theorem 2. In this section we will prove that \(R_0\) is also a threshold for the stability of the disease-free steady state of system (18). We do so by relating the reduced linearisation (26) of the \(P\)-system to the \(ABC\)-system (35).

6.1 The case \(n=1\)

For \(n=1\) the proof is relatively easy, since there is no distinction between ‘individual’ and ‘binding site’. As the proof provides guiding lines for the general case, we present it first.

If we write out (26) for \(n=1\) we obtain the system of four ODE:

$$\begin{aligned} \frac{dP_{(-,0,1)}}{dt}&=-(\sigma +2\mu +\beta )P_{(-,0,1)}+\rho F_+ P_0 \\ \frac{dP_{(+,0,0)}}{dt}&=-(\rho \bar{F}+\mu )P_{(+,0,0)}+(\sigma +\mu )P_{(+,1,0)}+(\sigma +\mu )P_{(+,0,1)}\\ \frac{dP_{(+,1,0)}}{dt}&=-(\sigma +2\mu +\beta )P_{(+,1,0)}+\rho \bar{F}P_{(+,0,0)}\\ \frac{dP_{(+,0,1)}}{dt}&=-(\sigma +2\mu +\beta )P_{(+,0,1)}+\beta P_{(+,1,0)}+\beta P_{(-,0,1)}. \end{aligned}$$

The consistency relation (19), which for \(n=1\) reduces to

$$\begin{aligned} P(-,0,1) = P(+,1,0), \end{aligned}$$
(39)

is reflected in the fact that the first and third equation of the system of ODEs are identical (recall that, for \(n=1\), \(\bar{F}\) equals \(P_0\) and \(F_+\) equals \(P_{(+,0,0)}\)). Using (39) we reduce to the three-dimensional system

$$\begin{aligned} \frac{dP_{(+,0,0)}}{dt}&=-(\rho \bar{F}+\mu )P_{(+,0,0)}+(\sigma +\mu )P_{(+,1,0)}+(\sigma +\mu )P_{(+,0,1)}\\ \frac{dP_{(+,1,0)}}{dt}&=-(\sigma +2\mu +\beta )P_{(+,1,0)}+\rho \bar{F}P_{(+,0,0)}\\ \frac{dP_{(+,0,1)}}{dt}&=-(\sigma +2\mu +\beta )P_{(+,0,1)}+2\beta P_{(+,1,0)}+\beta P_{(-,0,1)}. \end{aligned}$$

To finish the proof, we only need to observe that the corresponding matrix is exactly \(\varSigma _1 + T_1\), with \(\varSigma _1\) defined in (28) and \(T_1\) in (29).

Indeed, recall the three states \(A\), \(B\), and \(C\) that we defined for the binding site of an infectious individual in Sect. 5.1 and the population level fractions \(X_A\), \(X_B\), \(X_C\) in states \(A\), \(B\), and \(C\). Since individuals have exactly one binding site we identify the fractions of binding sites with fractions of individuals:

$$\begin{aligned} \begin{aligned} X_A&=P_{(+,0,0)}\\ X_B&=P_{(+,1,0)}\\ X_C&=P_{(+,0,1)}. \end{aligned} \end{aligned}$$

With this identification, the linearisation of the \(P\)-system equals the (linear) \(ABC\)-system. Therefore, not only is there a stability switch of the disease-free state of the \(ABC\)-system at \(R_0=1\) (see also Theorem 2), but in fact there is also a stability switch for the disease-free state of the system at \(R_0=1\).

To enhance the understanding, we present the main ingredients of the proof once more, but now by way of pictures. Figure 3 depicts the possible states and state transitions for an infectious individual. The corresponding part of the transition matrix is \(\varSigma _1\). The corresponding p-level variables are \(P_\ell \) with indices \(\ell =(+,0,0), (+,1,0), (+,0,1)\). This \(+\) part of the \(P\)-vector does not form a closed system. Indeed, an individual in state \((-,0,1)\) has probability per unit of time \(\beta \) to jump to \((+,0,1)\), as indicated in Fig. 6.

Fig. 6
figure 6

Flow chart representing part of the linearised system of ODEs (26) for the p-level fractions

When this jump occurs, the responsible partner (the ‘epidemiological parent’) jumps from \((+,1,0)\) to \((+,0,1)\). The interpretation underlying this last statement is mathematically reflected in the consistency relation (39). Using (39) we reduce the flow chart of Fig. 6 to the one depicted in Fig. 7. The corresponding matrix is \(\varSigma _1+T_1\).

Fig. 7
figure 7

Flow chart of Fig. 6 with \(P_{(-,0,1)}\) eliminated. Note that this figure does not allow for an individual-level interpretation; the rate at which an individual in state \((+,1,0)\) infects its susceptible partner is \(\beta \) (compare with Fig. 3). But the flow from population-level fraction \(P_{(+,1,0)}\) to \(P_{(+,0,1)}\) is with rate \(2\beta \) since it implicitly captures the inflow from \(P_{(-,0,1)}\)

6.2 Generalization: \(n>1\)

In general, for \(n>1\), we can express \(X_A\), \(X_{B_j}\), and \(X_C\) in terms of the linearised \(P\)-system by:

$$\begin{aligned} X_A&= \sum _{k_+=0}^n\sum _{k_-=0}^{n-k_+}(n-k_--k_+)P_{(+,k_-,k_+)}\\ X_{B_j}&= P_{(-,j-1,1)}\\ X_C&= \sum _{k_+=0}^n\sum _{k_-=0}^{n-k_+}k_+P_{(+,k_-,k_+)}. \end{aligned}$$

The explanation is as follows. An individual in state \((+,k_-,k_+)\) is infectious and has \(n-k_{-}-k_{+}\) free binding sites and \(k_+\) binding sites occupied by infectious partners. Summing over all possible states \((+,k_-,k_+)\) we obtain the number of binding sites in, respectively, states \(A\) and \(C\). For the number of binding sites in state \(B_j\) we observe that an individual in \((-,j-1,1)\) has \(j\) partners in total and one infectious partner. This infectious individual therefore has a binding site occupied by a susceptible partner who has \(j\) partners in total, i.e. a binding site in state \(B_j\). The total number of binding sites in state \(B_j\) is therefore \(P_{(-,j-1,1)}\).

So the map \(L\) defined in (27) maps the \(P\)-variables to the \(X_{ABC}\)-variables, i.e. we have the linear transformation

$$\begin{aligned} \begin{pmatrix}X_A\\ X_{B_1}\\ \vdots \\ X_{B_n}\\ X_C\end{pmatrix}=LP. \end{aligned}$$
(40)

By differentiating \(LP\) and using (26), we obtain the linear system of differential equations (35) for \(X_A\), \(X_{B_j}\), and \(X_C\).

It remains to prove that the stability switch of the zero state of the \(ABC\)-system occurs if and only if the disease-free state of the \(P\)-system (18) switches stability. This will be shown in the remainder of this section.

We know that \(R_0\) is a threshold parameter for the zero state of the \(ABC\)-system (see Theorem 2), i.e.

$$\begin{aligned} {{\mathrm{sign}}}(R_0-1)={{\mathrm{sign}}}(r_{ABC}), \end{aligned}$$
(41)

where \(r_{ABC}\) is the spectral bound of \(T+\varSigma \), i.e. \(r_{ABC}=\sup \{\text {Re}(\lambda ) :\lambda \in \sigma (T+\varSigma )\}\), and \(\sigma (T+\varSigma )\) is the spectrum of \(T+\varSigma \).

So in order to show that \(R_0\) is a threshold for the disease free state of the \(P\)-system, it suffices to show that

$$\begin{aligned} {{\mathrm{sign}}}(r_{ABC})={{\mathrm{sign}}}(r_P). \end{aligned}$$
(42)

Here \(r_P\) is the spectral bound of \(M_P\) where \(M_P\) is the matrix corresponding to the right-hand side of (26). In fact we will show that \(r_{ABC}=r_P\).

We will proceed as follows. First we shall prove that \(r_{ABC}\) and \(r_P\) are dominant eigenvalues of the matrices \(T+\varSigma \) and \(M_P\), respectively, in the sense that these eigenvalues are uniquely characterized by the positivity of the eigenvector (up to a multiplicative positive constant).

We show in Lemmas 4 and 5 that \(T+\varSigma \) and \(M_P\) are irreducible matrices. This then allows us to conclude that the dominant eigenvalues of \(M_P\) and \(T+\varSigma \) are real and uniquely characterized by a positive eigenvector (see e.g. Theorem 2.5 of Seneta 1973). In other words, there exists a real eigenvalue \(r_P\) for \(M_P\) for which it holds that \(r_P>\text {Re } \lambda \) for any eigenvalue \(\lambda \ne r_P\) of \(M_P\) and \(r_P\) is uniquely defined by the positivity of the corresponding eigenvector (and similarly with \(r_{ABC}\) replacing \(r_P\) and \(T+\varSigma \) replacing \(M_P\)).

In Lemmas 4 and 5 below we use that a matrix \(M=(m_{xy})\) is irreducible if and only if variable \(x\) communicates with variable \(y\) (\(x\leftrightarrow y\)) for all variables \(x\) and \(y\), i.e. there is a path from \(x\) to \(y\) (\(x\rightarrow y\)), i.e. there are variables \(y_1\), \(y_2\), \(\ldots \), \(y_n\) such that \(m_{y,y_n}\cdots m_{y_2,y_1}m_{y_1,x}>0\), and a path from \(y\) to \(x\) (\(y\rightarrow x\)), i.e. there are variables \(x_1\), \(x_2\), \(\ldots \), \(x_k\) such that \(m_{y,x_k}\cdots m_{x_2,x_1}m_{x_1,y}>0\). Note that the somewhat unusual notation is due to our convention that \(m_{xy}\) denotes the transition from \(y\) to \(x\) (instead of the transition from \(x\) to \(y\), as it is common in the stochastic community).

Lemma 4

\(T+\varSigma \) is an irreducible matrix.

Proof

The flowchart describing the matrix \(\varSigma \) is presented in Fig. 4. We immediately see from this figure that from any state \(x\) there is a path to any other state \(y\), with \(x,y\in \{A,B_1,B_2,\ldots ,B_n,C\}\). It follows that \(\varSigma \) is irreducible. Since \(T\) is nonnegative, also \(T+\varSigma \) is irreducible. \(\square \)

Lemma 5

\(M_{P}\) is an irreducible matrix.

Proof

With respect to a splitting of \(P\) into \(-\) components and \(+\) components, \(M_P\) is a block matrix that consists of four matrices \(M_1\), \(M_2\), \(M_3\), \(M_4\):

The matrices \(M_2\) and \(M_3\) are non-negative matrices, not equal to the zero matrix, while \(M_1\) and \(M_4\) are positive off-diagonal. We show that \(M_1\) and \(M_4\) are irreducible, and that this implies that \(M_P\) is irreducible.

Consider \(M_1\). This matrix consists of the rates corresponding to the possible flows of the \(-\) variables, i.e. population-level fractions of the form \(P_{(-,k,1)}\) (and rates \(\beta +\sigma +2\mu \) out of the \(-\) states, that we do not need to consider here). In Fig. 8 part of the possible flows and corresponding rates are represented graphically. From Fig. 8 we immediately see that, from any variable \(P_{(-,k,1)}\), one can find a path to any other variable \(P_{(-,l,1)}\), or in other words, \(P_{(-,k,1)}\rightarrow P_{(-,l,1)}\) for all \(k,l=0,\ldots ,n-1\). Therefore \(M_1\) is an irreducible matrix.

Fig. 8
figure 8

Graphical representation of part of matrix \(M_1\) (point \(k\) represents fraction \(P_{(-,k,1)}\)) showing that \(x\rightarrow y\) for all \(x,y=P_{(-,l,1)}\), i.e. \(M_1\) is irreducible. Part of \(M_1\) that is being ignored is e.g. the rates \(\beta +\sigma +2\mu \) out of each variable \(P_{(-,k_-,1)}\) leaving the \(-\) system

Consider the matrix \(M_4\). This matrix consists of the rates corresponding to the possible flows of the \(+\) variables. i.e. population-level fractions of the form \(P_{(+,k_-,k_+)}\). In Fig. 9 a graphical representation of part of the possible flows are given and in Fig. 10 the rates corresponding to these flows are given. These figures show (literally) that \(M_4\) is irreducible.

Fig. 9
figure 9

Graphical representation of the possible flows incorporated in the matrix \(M_4\) (coordinate \((k_-,k_+)\) represents fraction \(P_{(+,k_-,k_+)}\)), ignoring the death rate \(\mu \) out of each fraction

Fig. 10
figure 10

Rates corresponding to the flows of Fig. 9, ignoring the death rate \(\mu \) out of each variable

Finally, consider two variables \(x^-=P_{(-,k,1)}\), \(x^+=P_{(-,l_-,l_+)}\) of the matrix \(M_P\). We show that \(x^-\leftrightarrow x^+\).

Since \(M_2\) and \(M_3\) are non-negative and non-zero, there are variables \(y^-\), \(y^+\), \(z^-\), \(z^+\) such that \(y^-\rightarrow y^+\) and \(z^+\rightarrow z^-\). Note that, in terms of interpretation, the nonzero elements of \(M_2\) correspond to infection of \(-\) individuals by one of their \(+\) partner, i.e. transitions with rate \(\beta \) from fractions \(P_{(-,j_-,1)}\) to \(P_{(+,j,1)}\). The nonzero elements of \(M_3\) correspond to the feed into the \(P_{(-,j_-,1)}\) category via the \(F_+\) terms from fractions \(P_{(+,k_-,k_+)}\).

We find a path from \(x^-\rightarrow x^+\) through \(y^-\) and \(y^+\), i.e.

$$\begin{aligned} x^-\rightarrow y^-\rightarrow y^+\rightarrow x^+, \end{aligned}$$

and a path from \(x^+\rightarrow x^-\) through \(z^+\) and \(z^-\), i.e.

$$\begin{aligned} x^+\rightarrow z^+\rightarrow z^-\rightarrow x^-. \end{aligned}$$

Note that the paths \(x^-\rightarrow y^-\), \(y^+\rightarrow x^+\), \(x^+\rightarrow z^+\), and \(z^-\rightarrow x^-\) exist since \(M_1\) and \(M_4\) are irreducible.

Since any two variables \(x^-\) and \(x^+\) of \(M_P\) communicate, i.e. \(x^-{{\mathrm{\leftrightarrow }}}x^+\), \(M_P\) is irreducible. \(\square \)

We now have all the ingredients to prove that \(R_0\) is a threshold parameter for the disease free state of (18).

Since \(M_P\) is an irreducible positive off-diagonal matrix, we know that \(M_P\) has a real dominant eigenvalue \(r_P\) with corresponding positive eigenvector \(v\), i.e.

$$\begin{aligned} M_P\, v=r_P\, v. \end{aligned}$$

Then how does this relate to \(T+\varSigma \)? On the one hand we find that

$$\begin{aligned} \frac{d}{dt}LP=L\frac{dP}{dt}=LM_PP, \end{aligned}$$

on the other hand \(LP=x\) and (35) holds. Therefore

$$\begin{aligned} LM_PP=(T+\varSigma )LP, \end{aligned}$$
(43)

and it follows that

$$\begin{aligned} r_PLv=LM_P\, v=(T+\varSigma )Lv. \end{aligned}$$

We have seen in Sect. 4 that if \(v\) is strictly positive then so is \(Lv\). Furthermore, since \(T+\varSigma \) is an irreducible positive off-diagonal matrix (see Lemma 4), the Malthusian parameter of \(T+\varSigma \) is uniquely characterized by a positive eigenvector. Therefore \(r_P\) is also the Malthusian parameter of \(T+\varSigma \) with corresponding eigenvector \(Lv\), i.e.

$$\begin{aligned} r_P=r_{ABC}. \end{aligned}$$
(44)

Finally, (42) together with Theorem 2 shows that \(R_0\) is a threshold parameter of the \(P\)-system.

6.3 Characterization of the Malthusian parameter \(r\) (\(=r_{ABC}=r_P\))

In this section we characterize the initial exponential growth rate \(r=r_{ABC}=r_P\) (recall (44)). The Malthusian parameter \(r\) satisfies

$$\begin{aligned} (T+\varSigma )v=rv\quad \Leftrightarrow \quad Tv=(rI-\varSigma )v. \end{aligned}$$

So \((rI-\varSigma )v\) lies in the range of \(T\), i.e.

$$\begin{aligned} (rI-\varSigma )v=w \end{aligned}$$

with

$$\begin{aligned} w=d_A\psi _A+d_B\psi _{{\varvec{B}}}+d_C\psi _C, \end{aligned}$$
(45)

where the \(d_x\) are some constants, not all equal to zero, and the \(\psi _x\) are defined in (34), \(x=A,{\varvec{B}},C\). This is equivalent to

$$\begin{aligned} v=(rI-\varSigma )^{-1}w. \end{aligned}$$

Therefore

$$\begin{aligned} T(rI-\varSigma )^{-1}w=w, \end{aligned}$$

where \(w\) is defined by (45), so

$$\begin{aligned} \sum _xd_xT(rI-\varSigma )^{-1}\psi _x=\sum _xd_x\psi _x. \end{aligned}$$
(46)

Since the range of \(T\) is spanned by \(\psi _x\), we also have that, for certain constants \((m_r)_{yx}\),

$$\begin{aligned} T(rI-\varSigma )^{-1}\psi _x=\sum _{y}(m_r)_{yx}\psi _y. \end{aligned}$$
(47)

The Malthusian parameter \(r\) then needs to satisfy

$$\begin{aligned} M_rd=d, \end{aligned}$$

where \(M_r=((m_r)_{xy})\) is a \(3\times 3\) matrix characterized by (47), with matrix elements \((m_r)_{xy}\) depending on the unknown \(r\). Identity (47) fully characterizes elements \((m_r)_{xy}\), but, as in the case of \(K\) and \(K^{\text {ind}}\), we can use the interpretation to give explicit expressions for the entries of \(M_r\), in the last paragraph of Appendix C we outline how this can be done.

Finally, consider the case \(n=1\), then \(r\) satisfies

$$\begin{aligned} T_1(rI-\varSigma _1)^{-1}w=w, \end{aligned}$$
(48)

where \(\varSigma _1\) and \(T_1\) are defined in (28) and (29), respectively. Since the range of \(T_1\) is spanned by \(\psi _C\), we see that \(w=\psi _C\). We find that

$$\begin{aligned} T_1(rI-\varSigma _1)^{-1}\begin{pmatrix}0\\ 0\\ 1\end{pmatrix}=\begin{pmatrix}0\\ 0\\ \frac{\beta \rho \bar{F}(\sigma +\mu )}{(r+\mu )(r+\beta +\sigma +2\mu )(r+\rho \bar{F}+\sigma +2\mu )}\end{pmatrix}. \end{aligned}$$

It then follows from (48) that we find \(r\) by solving the following third-order polynomial in \(r\):

$$\begin{aligned} \beta \rho \bar{F}(\sigma +\mu )=(r+\mu )(r+\beta +\sigma +2\mu )(r+\rho \bar{F}+\sigma +2\mu ). \end{aligned}$$

7 Looking back and ahead

The overall aim of our research is to formulate and analyse models for the spread of an infectious disease across a network that is dynamic in the double sense that individuals come (by birth) and go (by death) and that links/partnerships are formed and broken. In particular our aim is to investigate the role of concurrency in the spread of sexually transmitted infections.

In Leung et al. (2012) we introduced a class of doubly dynamic network models that are relatively simple to describe, that involve just a few parameters, and for which one can calculate many statistics exactly in explicit detail. The next step, taken here, is superimposing the spread of an infection. In order to retain the simplicity, we again characterize individuals by their dynamic degree (i.e. the current number of their partners), but now include the disease status (\(S\) versus \(I\)) of the individual itself and of its partners. In this bookkeeping scheme we need to account for the infection of a partner by one of its other partners, but the scheme itself does not provide information about partners of partners. Thus we faced a closing problem. The mean field at distance one assumption provided a natural solution.

Originally we thought that this was an assumption because we had not yet found a way to prove it. In a late stage Pieter Trapman pointed the way to the current Appendix B, showing that the assumption is inconsistent with the model itself. We then realised that, in essence, our bookkeeping scheme constitutes a first order description that we close by making the (inconsistent) mean field at distance one assumption. So the deterministic system studied here provides at best an approximation to the large system size limit of a stochastic model.

The great advantage of the deterministic system of dimension \((n+1)(n+2)\) is that it is amenable to analysis. The fact that binding sites operate to some extent independently from each other enables a reduction of the dimension from \((n+1)(n+2)\) to \(2\) in the characterization of \(R_0\). Indeed, we characterized the basic reproduction number \(R_0\) as the dominant eigenvalue of a \(3\times 3\) matrix with elements describing the expected numbers of newly infected binding sites of three different types generated by one infected binding site of either type during its life time. We could then further reduce the \(3\times 3\) matrix to a \(2\times 2\) matrix which lead to an explicit expression for the dominant eigenvalue \(R_0\). We also verified that the basic reproduction number \(R_0\) defined in this way is indeed a threshold parameter for the stability of the disease free steady state of the nonlinear system of model equations. This is done by establishing a relationship between the exponential growth rate \(r\) of the epidemic in the linearised system and the quantity \(R_0\) on the level of binding sites.

The characterization of \(r\) and \(R_0\) opens up the route for investigating the impact of concurrency on the transmission of the \({ SI}\) infection in the dynamic network. We can now study how \(r\) and \(R_0\) depend on the capacity \(n\) when fixing all other parameters at constant values. Furthermore, the relationship between concurrency measures on the one hand and \(R_0\), \(r\), and the endemic steady state on the other, can be analysed. This will be explored in a follow-up paper. (Concerning the endemic steady state, we will need to derive the equations that characterize it, to investigate the uniqueness and to prove that existence requires \(R_0>1\).)

There are a number of generalisations of the network model that are both useful and feasible. The extension to a heterosexual population requires only the distinction between males and females and some assumptions on the symmetry or asymmetry in rates and partnership capacity between the two sexes. We expect that all results presented here carry, mutatis mutandis, over to that situation. No doubt the model can also be extended to the situation that \(n\) is a random variable with a prescribed distribution.

Other generalisations pertain to the description of infectiousness. An obvious example is a model with two consecutive stages \(I_1\) and \(I_2\), where infectiousness is characterised by \(\beta _i\) in stage \(I_i\). Other compartmental epidemic models could be considered as well, such as \({ SIR}\) and \({ SIS}\). Inclusion of the impact of the disease on mortality is very relevant in the context of HIV. Unfortunately it might turn out to be very hard.

The most stringent limitation of our framework is the assumption that having a partner does not influence an individual’s propensity to enter into a new partnership or its contact rate in other ongoing partnerships. This is clearly at odds with reality (although equally clearly it is an impossible task to disentangle the manifold ways in which dependence ‘works’ in reality). Dependence destroys the basis on which our analytic approach rests.

Be that as it may, we view the work presented here as a first step towards a framework for studying the impact of dynamic network structure on the transmission of an infectious disease.