1 Introduction

Superinfections are a major cause of global mortality and morbidity. For example, the WHO estimates 15 million cases worldwide of Hepatitis D, which spreads only amongst carriers of Hepatitis B and greatly worsens their prognosis [25]. There is a need, therefore, to develop a robust understanding of the conditions under which outbreaks of secondary infections are possible. Coevolving infections have been studied previously in the case of symbiotic/antagonistic relationships where infections mutually affect fitness [7, 8, 11]; however, relatively little is known about the case in which one infection has a strict obligate relationship with another.

In 2013, Court, Blythe and Allen [6] introduced a model of hierarchical infection referred to as the stacked contact process. Their model concerns the fate of a population of coevolving hosts, spreading as a contact process on a lattice, and parasites, spreading as a contact process restricted to sites currently occupied by hosts. In epidemiological language, the contact processes of [6] correspond to coupled Susceptible-Infective-Susceptible (SIS) epidemics; empty lattice sites are interpreted as susceptible individuals, who may be infected by the primary (host) and then secondary (parasite) infections. Simulations of this model system revealed a surprising feature: the success of the parasites depends non-monotonically on the turnover rate of the host population. Specifically, for the parasite to succeed, it is necessary for the dynamics of the host population to be neither too fast, nor too slow. Later in [19], Lanchier and Zhang rigorously established the main features of the phase diagram for the stacked contact process.

Fig. 1 Possible events and their rates in the network superinfection model. Circles represent nodes in the network, with the state of the primary (resp. secondary) infection shown by the colour of the lower-left (resp. top-right) sector; light denotes susceptible, midtone denotes infective, dark denotes recovered

At around the same time, Newman and Ferrario [21] independently proposed a related model in the context of epidemic dynamics in social contact networks. They considered a pair of Susceptible-Infective-Recovered (SIR) epidemics with a strictly obligate relationship such that the secondary infection is only transmitted amongst those who have recovered from the primary. In this formulation, the dynamics of the two diseases are completely separated in time, allowing for analytical treatment of the model using “cavity method” techniques which have been quite successful in the study of epidemics on networks (see, e.g. [20, 23]). The introduction of network structure to the population in [21] has the advantage of improving the relevance of the model for human epidemic dynamics; however, by separating the dynamics of the two diseases, this model cannot display the curious interaction between infection timescales observed in [6, 19].

In this paper we study the dynamics of coevolving SIR superinfections in sparse contact networks. We consider a population of individuals occupying the vertices of an Erdős–Rényi (ER) random graph with mean degree c. A primary infection spreads through the population with infective individuals passing the disease on to their neighbours with rate \(\beta _1\), and recovering from the disease with rate \(\rho _1\). Individuals who are carrying a live primary infection may also play host to a secondary infection, which spreads and recovers with rates \(\beta _2,\rho _2\) respectively. See Fig. 1 for an illustration of the possible state transitions. As in [21], our secondary infection is restricted to spread on the subgraph of hosts infected with the primary; however, unlike that paper, we consider the more complex case in which this subgraph evolves in time due to the recovery of primary infections.

As well as arguably improving the realism of the model, moving from lattice to network topologies allows us access to a rigorous branching process approximation—an approach that has previously enjoyed success in approximating SIR-type models in large populations, as seen for instance in [4, 24]. By coupling the dynamics of the secondary infection to those of a multi-type branching process, we will be able to characterise the phase diagram of the system. Note that, in the context of large finite networks, when we discuss survival of the infection we mean an asymptotically positive proportion of vertices become infected at some point in time.

Fig. 2 Phase diagram of the superinfection network model for fixed \(\beta _1/\rho _1=\beta _2/\rho _2\), shown as a function of the relative timescale \(\varphi =\beta _1/\beta _2\) of the infections and the connectivity c of the network. The secondary infection survives with positive probability only in a convex region whose boundary is characterised in our Theorem 1. In this log–log plot, the asymptotic slope of the boundary is \(-\,1\) for small \(\varphi \), and 1 for large \(\varphi \), as implied by the scaling laws in (2)

The success of the primary infection is controlled by the connectivity (mean node degree) c of the network, and the ratio of the infection and recovery rates \(\alpha :=\beta _1/\rho _1\). This parameter is well understood as the basis of the classical single infection process; for fixed \(\alpha \) there exists a critical value of c above which the infection survives with positive probability and at or below which we have certain extinction, see e.g. [13]. Note that simultaneously adjusting \(\beta _1\) and \(\rho _1\) by a multiplicative factor will change the timescale of the disease dynamics, but will not alter the probability of survival since \(\alpha \) is unchanged.

Inspired by the results of [6, 19], we are interested here in the behaviour possible when the primary and secondary diseases are similarly virulent, but may differ in their timescales. To this end, we will mainly concentrate on the case that \(\beta _2/\rho _2=\alpha \) also. We have made this choice only for simplicity of presentation; the more general case is in fact covered by Lemma 2, and the results are not qualitatively different in other cases. Three parameters then describe the success of the secondary infection: the connectivity of the underlying graph, c; the ratio between rates of spread and recovery, \( \alpha \); and, crucially, the relative timescales of the two infections, \( \varphi :=\beta _1/\beta _2\). If \(\varphi \gg 1\) then the dynamics of the primary infection are very much faster than those of the secondary; if \(\varphi \ll 1\) then they are much slower.
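As a small illustration of this parametrisation, the following sketch recovers the four rates from \(\alpha \) and \(\varphi \) in the equal-virulence case. The function name and the choice to fix the overall timescale through `beta1` (defaulting to 1) are assumptions made here for illustration only.

```python
def rates_from_parameters(alpha, phi, beta1=1.0):
    """Return (beta1, rho1, beta2, rho2) for the equal-virulence case.

    Hypothetical helper: beta1 sets the overall timescale and is an
    assumption of this sketch, not a quantity fixed by the text.
    """
    if alpha <= 0 or phi <= 0 or beta1 <= 0:
        raise ValueError("alpha, phi and beta1 must be positive")
    rho1 = beta1 / alpha   # alpha = beta1 / rho1
    beta2 = beta1 / phi    # phi = beta1 / beta2
    rho2 = beta2 / alpha   # equal virulence: beta2 / rho2 = alpha
    return beta1, rho1, beta2, rho2
```

Large `phi` gives a slow secondary infection relative to the primary, and small `phi` a fast one.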

In order for the secondary infection to survive, it is perhaps intuitive that it must progress fast enough compared to the primary infection; otherwise the primary infection will have recovered, thereby ending the secondary infection, before the latter has a chance to spread. Perhaps more surprisingly, however, we shall also show that the secondary infection should not act too quickly, as this too compromises its survival potential. Our characterisation of the survival of the secondary infection is illustrated in Fig. 2 and summarised by our main result:

Theorem 1

For all \(\alpha ,\varphi >0\) there exists a critical connectivity \(c^\star \) such that, in the limit of large network size, for \(c<c^\star \) the secondary infection dies out with probability one, and for \(c>c^\star \) it survives with positive probability.

Furthermore, the critical connectivity \(c^\star \) is found to be the smallest positive solution of the implicit equation

$$\begin{aligned} c^\star = \frac{(1+\alpha +\alpha \varphi )(1+\varphi +2\alpha +\alpha \varphi )}{\varphi } \frac{_0F_1 (2+\varphi +(1+\varphi )/\alpha ;-c^\star \varphi /\alpha ^2)}{_0F_1 (3+\varphi +(1+\varphi )/\alpha ;-c^\star \varphi /\alpha ^2)}, \end{aligned}$$
(1)

where \(_0F_1(a;z)=\sum _k{\frac{z^k}{k!(a)_k}}\) is a hypergeometric function. In particular, for large and small \(\varphi \) we have the scaling behaviour

$$\begin{aligned} c^\star =\mathrm {\Theta }(\varphi )\quad \text {for }\,\,\varphi \rightarrow \infty ,\quad c^\star =\mathrm {\Theta }(1/\varphi )\quad \text {for }\,\,\varphi \rightarrow 0. \end{aligned}$$
(2)

Here we have made use of “big theta” notation, defined as follows: \(f(x)=\mathrm {\Theta }(g(x))\) as \(x\rightarrow \infty \) (resp. \(x\rightarrow 0\)) if there exist positive constants L and U and X such that \(\forall x>X\) (resp. \(x<X\)) we have

$$\begin{aligned} L g(x)< f(x) <U g(x). \end{aligned}$$

The remainder of the article is organised as follows: in the next section we map the early dynamics of our network superinfection model to a certain multitype branching process; in Sect. 3 we compute the long-time behaviour of this process and thus give the proof of Theorem 1; Sect. 4 is for discussion, including illustrative numerical results.

2 Branching Process Description

2.1 Primary Infection

We begin by recapping the standard branching process approximation to the dynamics of an infection spreading on an Erdős–Rényi random graph [4, 17]. Heuristically, the method relies on the fact that for fixed connectivity, short cycles become asymptotically rare in the limit of large graphs, meaning that during the crucial early dynamics of the infection, each susceptible node may have at most one infective neighbour.

Let us consider the infection spread as generational, the nth generation being the individuals at graph distance n from the seed vertex that gain the infection at any point in time. In this way the primary infection is modelled as a simple Galton–Watson process described by the quantity \( Z_{n} \), giving the number of individuals in the nth generation. The offspring distribution describes the probability \(p_i\) for an individual to pass the infection on to i others in the next generation. If the offspring distribution has mean \(\mu \), the expected number of infected individuals at distance n from the seed is then given by \(\mathbb {E}Z_{n}=\mu ^n\). If \(\mu \le 1\) then the branching process will almost surely go extinct after finitely many generations; if \(\mu >1\) then it may survive, where survival of the branching model is characterised by the size of the nth generation being non-zero for all n.

In the network SIR model, the number of offspring is equated with the number of neighbours (other than the single infected ‘parent’) that an infective node succeeds in transmitting the disease to before it recovers. There are several sources of randomness: the number of neighbours to potentially infect, the recovery time, and times of infection. We note that whilst the fates of the neighbours of an infected node are not independent (they are jointly exposed to the random time to recovery of the parent) the mean of the offspring distribution can be found simply by multiplying the probability \(\alpha /(1+\alpha )\) to infect any given neighbour before recovery, with the expected number of neighbours to infect, c. From standard branching process theory, we thus deduce that in the limit of large networks the primary infection will have a non-zero chance of survival if and only if \(\mu >1\), that is, if \(c>1+1/\alpha \).
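The threshold above can be illustrated with a short sketch. It evaluates the offspring mean \(\mu =c\alpha /(1+\alpha )\) and checks the single-edge transmission probability \(\alpha /(1+\alpha )\) by Monte Carlo, drawing the transmission and recovery times as independent exponential variables. All function names, the trial count and the seed are assumptions of this illustration.

```python
import random

def transmission_probability(alpha):
    """Chance of infecting a given neighbour before recovery."""
    return alpha / (1.0 + alpha)

def offspring_mean(c, alpha):
    """Mean of the offspring distribution, mu = c * alpha / (1 + alpha)."""
    return c * transmission_probability(alpha)

def critical_connectivity_primary(alpha):
    """Connectivity at which mu = 1, i.e. c = 1 + 1/alpha."""
    return 1.0 + 1.0 / alpha

def simulate_transmission_probability(beta, rho, trials=200_000, seed=0):
    """Monte Carlo estimate of P(transmission time < recovery time)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Exponential clocks with rates beta (transmission) and rho (recovery)
        if rng.expovariate(beta) < rng.expovariate(rho):
            hits += 1
    return hits / trials
```

For example, `offspring_mean(c, alpha)` exceeds one exactly when `c` exceeds `critical_connectivity_primary(alpha)`.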

For finite graphs, the coupling between the random graph and branching process model is of course only local. Suppose that in a population of size N we have m infected individuals in generation n, so that \( Z_{n}= m \) in the branching model. We then have errors coming from the fact that each infective may be connected to at most \( N-m \) susceptibles (a number that is not constant across generations), as well as the fact that these may not be unique (and so children in the subsequent generation of the branching process may not be distinct). However, when \( m = o(\sqrt{N})\) the random graph may be coupled to the branching model with high probability; for a proof of this see [10].

2.2 Secondary Infection

For the primary infection, to determine if the probability of survival is positive only requires knowledge of two quantities: the expected number of susceptible neighbours an individual has, and the chance any one of those will gain the infection. The difficulty with modelling the secondary infection is that the first of these is dynamic, since the subgraph composed of individuals currently carrying the primary infection changes with time. We account for this additional complexity by introducing a type parameter t, which specifies the time elapsed between the primary and secondary infections. Specifically, if an individual acquires the primary infection at time \(t_1\) and the secondary at time \(t_2\), then they are said to have type \(t=t_2-t_1\).

It is clear that at least this much information is required to predict the potential of an individual to transmit the secondary infection to new hosts; for example, the larger t, the more likely an individual is to pass on the primary infection long before it passes on the secondary, by which time the primary infection in the new host may have recovered. We will see in Sect. 3.1 that in fact knowledge of t is enough to completely characterise the distribution of the number and timing of new secondary infections arising from an individual. The progress of the secondary infection is then mapped to that of a multi-type branching process with type space \(\mathbb {T} =[0,\infty )\).

Where previously survival was predicted by just the mean number of offspring, now the picture is more complicated, and we are required to compute the intensity of production of all types of offspring resulting from all types of parents. This information is captured in the kernel \(\mu (t'|t)\), which is defined by the property that the expected number of offspring with types in the interval [a, b] coming from a parent of type t is given by the integral of \(\mu (t'|t)\) over \(t'\in [a,b]\). This kernel defines a linear operator with the action

$$\begin{aligned} M[\psi ](t')=\int \mu (t'|t)\psi (t)\text {d}t. \end{aligned}$$
(3)

In words, \(M[\psi ]\) describes the expected size and composition of the population of offspring arising from a population of parents with types given by \(\psi \).

We say that a kernel \(\mu \) defined over an interval I is: strictly positive if \(\forall t,t'\in I\) we have \(\mu (t'|t)>0\); uniformly positive if \(\exists \, \varepsilon >0\) such that \(\forall t,t'\in I, \mu (t'|t)>\varepsilon \); integrable if \(\iint \mu (t'|t)\text {d} t\,\text {d} t'<\infty \). We assume that M can be defined as a linear operator \(M:\mathcal {C}_{\text {b}}(\bar{\mathbb {T}}) \rightarrow \mathcal {C}_{\text {b}}(\bar{\mathbb {T}})\) over the space of continuous bounded functions on the compact interval \(\bar{\mathbb T}=[0,\infty ]\) equipped with the supremum norm, and in particular that \(\mu \) has vanishing mass as t goes to infinity. One then has the following general result:

Lemma 1

Let \(\{Z_n\}\) be a multi-type branching process on \(\mathbb {T} =[0,\infty )\) with production operator M arising as above from a kernel \(\mu \) that is strictly positive on \(\mathbb {T}\), integrable, and continuous in both arguments, then

  1. There exists an eigenvalue \(\lambda >0\) equal to the spectral radius of M; moreover, this is the only eigenvalue corresponding to a non-negative eigenfunction.

  2. If \( \lambda < 1 \) then the process goes extinct in finite time with probability one.

  3. If \( \lambda > 1 \) then the process survives with positive probability.

Proof

  1. For the first part, we observe that the properties of \(\mu \) imply the compactness of M on \( \mathcal {C}_{\text {b}}(\bar{\mathbb {T}})\) by virtue of the Arzelà–Ascoli theorem [9, IV.6.7]. The Krein–Rutman theorem [22, Th 1.3, Sect. 3.2] then gives that the spectral radius is a positive eigenvalue which, by [3, Theorem 7.3], is the only non-zero eigenvalue with a non-negative eigenfunction.

  2. We simply observe that if \( \lambda < 1 \) then \(\Vert M^n[\psi ]\Vert \rightarrow 0\) for all \(\psi \), hence we have convergence of the expected generation size to zero (i.e. \(\mathbb {E}Z_n\rightarrow 0\)), which implies extinction in finite time with probability one.

  3. We make use of results of Harris [12, Sect. 3] who proved positive survival probability for multi-type branching processes with a uniformly positive kernel. Our kernel \(\mu \) is not uniformly positive, but we are able to couple to such a process by restricting to a bounded type space [0, T]. Choosing T large enough forces close agreement in the maximum eigenvalues of the corresponding production operators.

    Let us start by considering the process \(\{Z_n^{_{(T)}}\}\) obtained from \(\{Z_n\}\) by removing all individuals of type greater than T along with their descendants. The law of \(\{Z_n^{_{(T)}}\}\) is that of a multitype branching process on [0, T] with operator \( M^{(T)} :\mathcal {C}_{\text {b}}[0,T] \rightarrow \mathcal {C}_{\text {b}}[0,T] \) defined by

    $$\begin{aligned} M^{(T)}[\psi ](t')=\int _{[0,T]}\mu (t'|t)\psi (t)\text {d}t. \end{aligned}$$
    (4)

    Note that \(\inf _{t,t'\in [0,T]}\mu (t'|t)>0\), so the kernel is uniformly positive on [0, T], and we refer to [12, Sect. 3] for both the existence of a positive top eigenvalue \(\lambda ^{{(T)}}\) of \( M^{(T)} \) strictly greater in magnitude than all others and the survival of the process \(\{Z_n^{_{(T)}}\}\) with positive probability if \(\lambda ^{(T)}>1\).

    To show closeness of the eigenvalues \(\lambda ^{(T)}\) and \( \lambda \) we extend the operator \( M^{(T)} \) to \( \tilde{M}^{{(T)}} :\mathcal {C}_{\text {b}}(\bar{\mathbb {T}}) \rightarrow \mathcal {C}_{\text {b}}(\bar{\mathbb {T}})\) defined by

    $$\begin{aligned} \tilde{M}^{{(T)}}[\psi ](t') = \int _{[0,T]}\mu (t'\wedge T|t)\,\psi (t)\,\text {d}t \end{aligned}$$
    (5)

    Note that operators \( M^{(T)} \) and \( \tilde{M}^{(T)} \) share eigenvalues so we may equivalently consider the top eigenvalue \( \tilde{\lambda }^{(T)}\) of \( \tilde{M}^{(T)} \). Since \( \mu \) is continuous and integrable, for all \( \varepsilon >0 \) there exists T such that

    $$\begin{aligned} \big \Vert M^{}-\tilde{M}^{(T)} \big \Vert < \varepsilon , \end{aligned}$$
    (6)

    where \(\Vert \cdots \Vert \) is the operator norm induced by the infinity norm on \(\mathcal {C}_{\text {b}}(\bar{\mathbb {T}})\).

    We have already observed that the principal eigenvalue \(\lambda \) of M can be separated from the rest of the spectrum by a closed curve. Hence, by Kato [16, IV, Sect. 3.5 ], we have that \(|\lambda - \lambda ^{{(T)}}|\) goes to zero with \(\Vert M^{}-\tilde{M}^{(T)} \Vert \). In particular, if \(\lambda >1\) it follows from (6) that we can choose T such that

    $$\begin{aligned} \big | \lambda ^{(T)}-\lambda \big | < \lambda -1 , \end{aligned}$$
    (7)

    and hence \(\lambda ^{(T)}>1\) and \(\{Z_n^{_{(T)}}\}\) survives with positive probability. The untrimmed process satisfies \(Z_n\ge Z_n^{_{(T)}}\) and hence also survives with positive probability.

\(\square \)
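The last implication in part 2 of the proof can be made explicit with Markov's inequality. The following sketch assumes that \(Z_n\) is read as the total number of individuals in generation n, so that it takes non-negative integer values:

$$\begin{aligned} \mathbb {P}(Z_n\ge 1)\le \mathbb {E}Z_n\rightarrow 0, \qquad \text {hence}\qquad \mathbb {P}(Z_n\ge 1 \text { for all } n)=\lim _{n\rightarrow \infty }\mathbb {P}(Z_n\ge 1)=0. \end{aligned}$$

The equality uses the fact that the events \(\{Z_n\ge 1\}\) are decreasing in n, since an empty generation has no descendants.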

To prove our main result about the survival of the secondary infection, we must explicitly identify the operator M, analyse its spectrum, and compute the scaling behaviour when the timescales of the infections are well separated.

3 Survival of the Secondary Infection

3.1 Production Kernel

The form of the kernel \( \mu (t'|t) \) may be found by considering when a type t parent will have a type \(t'\) offspring. For this to happen, the parent must pass on the primary infection at some time s (measured from the moment they first acquired it), and then pass on the secondary infection at time \(s+t'\). The primary and secondary infections in the parent, and the primary infection in the child, must all survive long enough for this process to complete. We find it useful to break the calculation into two cases, depending on whether the primary infection is transmitted before or after the parent acquires the secondary; that is, depending on the order of s and t.

The case \(s<t\) is illustrated in Fig. 3(i). To achieve a type \(t'\) offspring in this case: the transmission time \(s>0\) of the primary must occur before t but after \(t-t'\) (which may be negative); the secondary must be transmitted \(s+t'-t\) time units after it was acquired in the parent at time t; the primary infection in the child must not recover in the time between s and t; and none of the three active infections may recover in the window of time between t and \(s+t'\). Putting these contributions together, we reach

$$\begin{aligned} \mu (t'|t, s <t)=c\int _{(t-t')_+}^{t}\big [\beta _{1}e^{-\beta _{1}s}\big ] \big [\beta _{2}e^{-\beta _{2}(t'-t+s)}\big ]\big [e^{-\rho _{1}(t-s)}\big ]\big [e^{-(2\rho _{1}+\rho _{2})(t'-t+s)}\big ]\text {d}s, \end{aligned}$$

where \((\cdots )_+\) denotes the positive part, and the prefactor of c comes from the expected number of neighbours to which the infection may be transmitted.

Similarly, the case \(s\ge t\) is illustrated in Fig. 3(ii). Here transmission of the primary may occur any time after t, with the secondary being transmitted \(t'\) time units later. Both infections in the parent must survive until time s, after which all three infections must survive for at least \(t'\) time units. The resulting expression is

$$\begin{aligned} \mu (t'|t, s\ge t)=c\int _{t}^{\infty }\big [\beta _{1}e^{-\beta _{1}s}\big ]\big [\beta _{2}e^{-\beta _{2}t'}\big ]\big [e^{-(\rho _{1}+\rho _{2})(s-t)}\big ]\big [e^{-(2\rho _{1}+\rho _{2})t'}\big ]\text {d}s. \end{aligned}$$

Fig. 3 Illustration of the timing of the necessary events for the secondary infection to successfully create a type \(t'\) offspring from a type t parent; in each case the top line represents the life of the parent and the bottom line that of the offspring. Pale lines denote the transmission of the primary and dark lines denote the transmission of the secondary; similarly, pale/dark regions denote the corresponding status of the nodes. We split into two cases depending on whether the time s of transmission of the primary infection (measured from when it is acquired by the parent) is (i) before, or (ii) after, the time t at which the parent acquires the secondary infection

Combining the two cases and evaluating the integral gives

$$\begin{aligned} \mu (t'|t)=\left\{ \begin{array}{ll} {\frac{c\beta _{1}\beta _{2}(\beta _{2}e^{-\beta _{1}t-(\beta _{2}+2\rho _{1}+\rho _{2})t'}+(\beta _{1}+\rho _{1}+\rho _{2})e^{-\beta _{1}t-(\rho _{1}-\beta _{1})t'})}{(\beta _{1}+\rho _{1}+\rho _{2})(\beta _{1}+\beta _{2}+\rho _{1}+\rho _{2})} }&{} \quad \mathrm {if}\ t'\le t\ \\ &{} \\ {\frac{c\beta _{1}\beta _{2}(\beta _{2}e^{-\beta _{1}t-(\beta _{2}+2\rho _{1}+\rho _{2})t'}+(\beta _{1}+\rho _{1}+\rho _{2})e^{(\beta _{2}+\rho _{1}+\rho _{2})t-(\beta _{2}+2\rho _{1}+\rho _{2})t'})}{(\beta _{1}+\rho _{1}+\rho _{2})(\beta _{1}+\beta _{2}+\rho _{1}+\rho _{2})}} &{} \quad \mathrm {if}\ t'>t. \end{array}\right. \end{aligned}$$
(8)
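The closed form (8) can be checked numerically against the two integral expressions that precede it. The sketch below does so with a hand-written composite Simpson rule. The function names, the number of subintervals and the truncation of the infinite integration range (controlled by the hypothetical parameter `tail`) are assumptions of this illustration.

```python
import math

def kernel_closed_form(tp, t, c, b1, b2, r1, r2):
    """Closed-form kernel mu(t'|t) of Eq. (8); tp stands for t'."""
    norm = c * b1 * b2 / ((b1 + r1 + r2) * (b1 + b2 + r1 + r2))
    common = b2 * math.exp(-b1 * t - (b2 + 2 * r1 + r2) * tp)
    if tp <= t:
        other = (b1 + r1 + r2) * math.exp(-b1 * t - (r1 - b1) * tp)
    else:
        other = (b1 + r1 + r2) * math.exp(
            (b2 + r1 + r2) * t - (b2 + 2 * r1 + r2) * tp)
    return norm * (common + other)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def kernel_by_quadrature(tp, t, c, b1, b2, r1, r2, tail=60.0):
    """Sum of the two case integrals (s < t and s >= t)."""
    def case_i(s):
        d = tp - t + s  # delay between acquiring and passing on the secondary
        return (b1 * math.exp(-b1 * s) * b2 * math.exp(-b2 * d)
                * math.exp(-r1 * (t - s)) * math.exp(-(2 * r1 + r2) * d))

    def case_ii(s):
        return (b1 * math.exp(-b1 * s) * b2 * math.exp(-b2 * tp)
                * math.exp(-(r1 + r2) * (s - t))
                * math.exp(-(2 * r1 + r2) * tp))

    # The integrand of case (ii) decays like exp(-(b1 + r1 + r2) s),
    # so the infinite range is truncated after `tail` decay lengths.
    upper = t + tail / (b1 + r1 + r2)
    return c * (simpson(case_i, max(t - tp, 0.0), t)
                + simpson(case_ii, t, upper))
```

Comparing the two functions at a few pairs \((t',t)\) on either side of the diagonal \(t'=t\) exercises both branches of (8).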

We are now ready to state our result about the spectrum of the production operator resulting from this kernel.

Lemma 2

For the integral operator M defined in (3) with kernel \(\mu \) given in (8), the top eigenvalue \( \lambda \) solves the implicit equation

$$\begin{aligned} \frac{c\beta _{1}\beta _{2}\,{}_{0}F_{1}\left( \frac{\beta _{1}+\beta _{2}+3\rho _{1}+\rho _{2}}{\rho _{1}};-\frac{\beta _{1}\beta _{2}}{(\lambda /c)\rho _{1}^{2}}\right) }{\lambda (\beta _{1}+\rho _{1}+\rho _{2})(\beta _{1}+\beta _{2}+2\rho _{1}+\rho _{2})}={}_{0}F_{1}\left( \frac{\beta _{1}+\beta _{2}+2\rho _{1}+\rho _{2}}{\rho _{1}};-\frac{\beta _{1}\beta _{2}}{(\lambda /c)\rho _{1}^{2}}\right) \end{aligned}$$
(9)

where \({}_{0}F_{1}(a;z)=\displaystyle \sum _{k}\frac{z^{k}}{k!(a)_{k}}\) is a hypergeometric function.

Proof

From part 1 of Lemma 1, to determine that \(\lambda \) is the top eigenvalue of M, it is sufficient to exhibit a non-negative function \(\psi \) such that \(\lambda \psi =M\psi \). We begin a search for such a function by considering the successive action of M starting from the initial state \(\psi _0(t)=\delta _0(t)\), corresponding to a single seed infected individual who acquires the primary and secondary infections at the same instant. Defining the sequence

$$\begin{aligned} \psi _{n+1}=M[\psi _n], \end{aligned}$$
(10)

we observe that each iterate \(\psi _n\) is a member of a family, \(\varPsi \), of functions that can be written as a certain positive sum of exponentials:

$$\begin{aligned} \varPsi =\left\{ \psi (t)=e^{-(\beta _{2}+\rho _{1}+\rho _{2})t}\sum _{k\ge 1}a_{k}e^{-k\rho _{1}t}\,:\, a_k\ge 0 \right\} . \end{aligned}$$
(11)

We look for an eigenfunction of M that lies in \(\varPsi \). The eigenvalue equation \(\lambda \psi =M[\psi ]\) is thus reduced to a statement about the coefficients \(\{a_k\}\). Specifically, we find

$$\begin{aligned} \lambda \psi (t)&=\int _\mathbb {T}\mu (t|t')\psi (t')\text {d}t' \nonumber \\&\Downarrow \nonumber \\ \lambda e^{-\left( \beta _{2}+\rho _{1}+\rho _{2}\right) t}\sum _{k\ge 1}a_{k}e^{-k\rho _{1}t}&=c e^{-\left( \beta _{2}+\rho _{1}+\rho _{2}\right) t}\sum _{k\ge 1}a_{k}\left( b_{k}e^{-\rho _{1}t}-d_{k}e^{-(k+1)\rho _{1}t}\right) , \end{aligned}$$
(12)

where

$$\begin{aligned} b_{k}&=\frac{\beta _{1}\beta _{2}(\beta _{1}+(k+1)\rho _{1}+\rho _{2})}{k\rho _{1}(\beta _{1}+\rho _{1} +\rho _{2})(\beta _{1}+\beta _{2} +(k+1)\rho _{1}+\rho _{2})}\\ d_{k}&=\displaystyle \frac{\beta _{1}\beta _{2}}{k\rho _{1}(\beta _{1}+\beta _{2} +(k+1)\rho _{1}+\rho _{2})}. \end{aligned}$$

Equating coefficients in (12) determines

$$\begin{aligned} \lambda a_{1}=c\sum _{k\ge 1}a_{k}b_{k} \end{aligned}$$
(13)

where the \(\{a_k\}\) are found to satisfy

$$\begin{aligned} a_{k+1}=-\frac{ca_{k}d_{k}}{\lambda }. \end{aligned}$$
(14)

This recursive equation specifies a solution up to a multiplicative constant:

$$\begin{aligned} a_{k}=\left( -\frac{\beta _{1}\beta _{2}}{(\lambda /c)\rho _{1}^{2}}\right) ^{(k-1)}\frac{a_{1}}{(k-1)!((\beta _{1}+\beta _{2}+2\rho _{1}+\rho _{2})/\rho _{1})_{k-1}}, \end{aligned}$$
(15)

where \((\cdots )_k\) denotes the Pochhammer symbol. Combining this result with (13) yields the implicit equation (9) for \(\lambda \) given in the statement. \(\square \)
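Independently of the series solution, the top eigenvalue can be estimated by discretising the kernel (8) on a truncated type space [0, T] and applying power iteration. The following is a rough sketch of that alternative numerical route, not the method of the proof. The truncation `T`, the grid size `n`, the midpoint rule and the iteration count are all assumptions chosen for illustration.

```python
import math

def kernel(tp, t, c, b1, b2, r1, r2):
    """Kernel mu(t'|t) of Eq. (8); tp stands for t'."""
    norm = c * b1 * b2 / ((b1 + r1 + r2) * (b1 + b2 + r1 + r2))
    common = b2 * math.exp(-b1 * t - (b2 + 2 * r1 + r2) * tp)
    if tp <= t:
        other = (b1 + r1 + r2) * math.exp(-b1 * t - (r1 - b1) * tp)
    else:
        other = (b1 + r1 + r2) * math.exp(
            (b2 + r1 + r2) * t - (b2 + 2 * r1 + r2) * tp)
    return norm * (common + other)

def top_eigenvalue(c, b1, b2, r1, r2, T=20.0, n=120, iterations=60):
    """Power-iteration estimate of the top eigenvalue on [0, T]."""
    h = T / n
    grid = [(i + 0.5) * h for i in range(n)]  # midpoint-rule nodes
    # matrix[i][j] approximates mu(t'_i | t_j) dt
    matrix = [[kernel(tp, t, c, b1, b2, r1, r2) * h for t in grid]
              for tp in grid]
    psi = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        new = [sum(row[j] * psi[j] for j in range(n)) for row in matrix]
        lam = max(new)                 # sup norm of M[psi], with sup |psi| = 1
        psi = [x / lam for x in new]   # renormalise to unit sup norm
    return lam
```

Because the kernel is proportional to c, the estimate returned for connectivity 2c is twice that for c, which gives a quick consistency check on any implementation.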

3.2 Bounds on the Ratio of Hypergeometric Functions

Recall that the survival of the primary infection is dependent only on its birth-death ratio \(\alpha \) and the connectivity of the underlying graph c, while the secondary infection additionally depends on its speed relative to the primary, \(\varphi :=\beta _1/\beta _2\). As per the discussion in Sect. 1, we specialise to the case that \(\beta _1/\rho _1=\beta _2/\rho _2=\alpha \). Then the implicit eigenvalue equation (9) can be rewritten in terms of the parameters \(\alpha \) and \(\varphi \) to give

$$\begin{aligned} \frac{c}{\lambda }=\frac{(1+\alpha +\alpha \varphi )(1+\varphi +2\alpha +\alpha \varphi )}{\varphi } \frac{1}{\mathrm {\Phi }_\gamma (c\varphi /\lambda \alpha ^2)}, \end{aligned}$$
(16)

where \(\gamma =(1+\varphi )(1+1/\alpha )\) and \(\mathrm {\Phi }\) denotes the hypergeometric ratio

$$\begin{aligned} \mathrm {\Phi }_a(z):=\frac{_0F_1(a+2;-z)}{_0F_1(a+1;-z)}. \end{aligned}$$
(17)

Our strategy to prove the scaling relations claimed in Theorem 1 will be to replace this function by suitably simple upper and lower bounds with the same asymptotic behaviour. Fortunately, there is a substantial literature on this topic that we may draw on.

Lemma 3

For \(a>0\) write \(j_{a}\) for the smallest positive root of \(J_a\), the Bessel function of the first kind. Then

$$\begin{aligned} a(a+2)<j_{a}^2<4(a+1)(a+2), \end{aligned}$$
(18)

and for all \(z\in (0,j_{a}^2/4)\) we have

$$\begin{aligned} 1<\mathrm {\Phi }_a(z)<1+\frac{4z}{j_{a}^2-4z}. \end{aligned}$$
(19)

Proof

Ismail and Muldoon [15] list many different bounds on \(j_{a}\), including those in (18) coming from formulas (6.7) and (6.22) in that article. For the second part, it is well-known [1] that the Bessel functions of the first kind may be expressed as

$$\begin{aligned} J_a(x)=\frac{(x/2)^{a}}{\varGamma (a+1)}\,{}_0F_1(a+1;-x^2/4), \end{aligned}$$

hence, introducing \(x=2\sqrt{z}\), we obtain

$$\begin{aligned} \mathrm {\Phi }_a(z)=\frac{2(a+1)}{x}\frac{J_{a+1}(x)}{J_{a}(x)}. \end{aligned}$$
(20)

This function has previously been studied by Ifantis and Siafarikas [14], who proved various inequalities including their formulas (1.2) and (2.17) which imply the lower and upper bounds of (19). \(\square \)
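The ratio \(\mathrm {\Phi }_a(z)\) is straightforward to evaluate from the defining series. The sketch below sums the series for \(_0F_1\) directly and is only intended for moderate arguments, where cancellation between alternating terms is mild. The function names and tolerances are assumptions of this illustration.

```python
def hyp0f1(a, z, tol=1e-16, max_terms=500):
    """Partial sum of sum_k z^k / (k! (a)_k) for 0F1(a; z)."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        # ratio of consecutive terms: z / ((k + 1) (a + k))
        term *= z / ((k + 1) * (a + k))
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def phi_ratio(a, z):
    """Hypergeometric ratio Phi_a(z) = 0F1(a+2; -z) / 0F1(a+1; -z)."""
    return hyp0f1(a + 2.0, -z) / hyp0f1(a + 1.0, -z)
```

Since \(j_a^2>a(a+2)\) by (18), the upper bound in (19) implies the weaker but fully explicit bound \(\mathrm {\Phi }_a(z)<1+4z/(a(a+2)-4z)\) for \(0<z<a(a+2)/4\), which can be tested without computing any Bessel zeros.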

3.3 Proof of Theorem 1

Proof

As argued previously, in the limit of large Erdős–Rényi random graphs with mean degree c, the survival probability of the secondary infection coincides with that of a multi-type branching process \(\{Z_n\}\) with production kernel given by Eq. (8). From Lemmas 1 and 2 we establish that \(Z_n\) has non-zero probability to survive indefinitely if and only if \(\lambda ^\star >1\), where \(\lambda ^\star \) is the largest real number satisfying

$$\begin{aligned} \frac{\beta _{1}\beta _{2}{}_{0}F_{1}\left( \frac{\beta _{1}+\beta _{2}+3\rho _{1}+\rho _{2}}{\rho _{1}};-\frac{\beta _{1}\beta _{2}}{(\lambda ^\star /c)\rho _{1}^{2}}\right) }{(\lambda ^\star /c)(\beta _{1}+\rho _{1}+\rho _{2})(\beta _{1}+\beta _{2}+2\rho _{1}+\rho _{2})}={}_{0}F_{1}\left( \frac{\beta _{1}+\beta _{2}+2\rho _{1}+\rho _{2}}{\rho _{1}};-\frac{\beta _{1}\beta _{2}}{(\lambda ^\star /c)\rho _{1}^{2}}\right) . \end{aligned}$$
(21)

Noticing that \(\lambda ^\star \) appears only in ratio with c, it follows that the condition for the possibility of survival may be rewritten in terms of the critical connectivity \(c^\star \) such that for \(c>c^\star \) we have \(\lambda ^\star >1\). Rearranging Eq. (21) we straightforwardly find that \(c^\star \) is the smallest positive solution to

$$\begin{aligned} c^\star =\frac{(1+\alpha +\alpha \varphi )(1+\varphi +2\alpha +\alpha \varphi )}{\varphi } \frac{1}{\mathrm {\Phi }_\gamma (c^\star \varphi /\alpha ^2)}, \end{aligned}$$
(22)

which is precisely Eq. (1), as required.
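Equation (22) can be solved numerically for given \(\alpha \) and \(\varphi \). The sketch below rewrites it in the pole-free form \(c\,{}_0F_1(\gamma +2;-z)-u\,{}_0F_1(\gamma +1;-z)=0\), with \(z=c\varphi /\alpha ^2\) and u denoting the prefactor in (22). It scans upward from zero for the first sign change and then bisects. The grid resolution, the scan range (0, u] and the direct series summation are assumptions of this illustration, and the series is only reliable for moderate parameter values.

```python
def hyp0f1(a, z, tol=1e-16, max_terms=500):
    """Partial sum of sum_k z^k / (k! (a)_k) for 0F1(a; z)."""
    term = 1.0
    total = 1.0
    for k in range(max_terms):
        term *= z / ((k + 1) * (a + k))
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def critical_connectivity(alpha, phi, grid=4000, bisections=80):
    """Smallest positive root of Eq. (22), found by scan and bisection."""
    gamma = (1.0 + phi) * (1.0 + 1.0 / alpha)
    u = ((1.0 + alpha + alpha * phi)
         * (1.0 + phi + 2.0 * alpha + alpha * phi) / phi)

    def residual(c):
        z = c * phi / alpha ** 2
        return c * hyp0f1(gamma + 2.0, -z) - u * hyp0f1(gamma + 1.0, -z)

    lo = 0.0  # residual(0) = -u < 0
    for i in range(1, grid + 1):
        hi = u * i / grid
        if residual(hi) >= 0.0:
            break
        lo = hi
    else:
        raise RuntimeError("no sign change found in (0, u]")
    for _ in range(bisections):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `critical_connectivity(1.0, 1.0)` returns the critical connectivity for \(\alpha =\varphi =1\), for which \(\gamma =4\) and \(u=15\).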

To quantify the scaling behaviour of \(c^\star \) for large and small \(\varphi \), we recall the definition of “big theta” notation: \(f(x)=\mathrm {\Theta }(g(x))\) as \(x\rightarrow \infty \) (resp. \(x\rightarrow 0\)) if there exist positive constants L and U and X such that \(\forall x>X\) (resp. \(x<X\)) we have

$$\begin{aligned} L g(x)< f(x) <U g(x). \end{aligned}$$

Two sufficient conditions are easy to check: \(f(x)=\mathrm {\Theta }(g(x))\) if either

  (i) \(f(x)/g(x)\) has a positive finite limit, or

  (ii) there exist functions \(l(x),u(x)=\mathrm {\Theta }(g(x))\) such that \(l(x)<f(x)<u(x)\).

We will use the bounds in Lemma 3 to exhibit functions with appropriate finite limits that sandwich \(c^\star \). Specifically, recalling \(\gamma =(1+\varphi )(1+1/\alpha )\), let

$$\begin{aligned} u(\varphi )&= \frac{1}{\varphi }(1+\alpha +\alpha \varphi )(1+\varphi +2\alpha +\alpha \varphi ), \end{aligned}$$
(23)
$$\begin{aligned} l(\varphi )&= u(\varphi )\left( 1-\frac{4\varphi u(\varphi )}{\gamma (\gamma +2)\alpha ^2+4\varphi u(\varphi )}\right) . \end{aligned}$$
(24)

First we check the upper bound. From (22) and the lower bound of unity in Eq. (19) of Lemma 3, we have that

$$\begin{aligned} c^\star =\frac{u(\varphi )}{\mathrm {\Phi }_\gamma (c^\star \varphi /\alpha ^2)}<u(\varphi ). \end{aligned}$$
(25)

For the lower bound, we note that the upper bound on \(\mathrm {\Phi }\) given in Lemma 3 implies a lower bound on \(c^\star \) as the smallest positive \(l^\star \) satisfying the equation

$$\begin{aligned} l^\star =u(\varphi )\left( 1+\frac{4 l^\star \varphi /\alpha ^2 }{j_{\gamma }^2-4l^\star \varphi /\alpha ^2}\right) ^{-1}. \end{aligned}$$
(26)

In fact there is only one solution:

$$\begin{aligned} l^\star =u(\varphi )\left( 1-\frac{4\varphi u(\varphi )}{j^2_\gamma \alpha ^2+4\varphi u(\varphi )}\right) . \end{aligned}$$
(27)

The lower bound \(l(\varphi )<c^\star \), with \(l(\varphi )\) as defined in (24), follows immediately from this and the lower bound on \(j_\gamma ^2\) given in Eq. (18) of Lemma 3.
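As a numerical sanity check on this step (not part of the proof), the following Python sketch substitutes the closed form (27) back into the right-hand side of (26). The function names and the parameter values, including the value used in place of \(j_\gamma ^2\), are hypothetical and chosen only for illustration.

```python
def rhs_eq26(l, u, phi, alpha, j2):
    """Right-hand side of Eq. (26) evaluated at a trial value l."""
    x = 4.0 * l * phi / alpha**2
    return u / (1.0 + x / (j2 - x))


def l_star(u, phi, alpha, j2):
    """Closed-form expression stated in Eq. (27)."""
    return u * (1.0 - 4.0 * phi * u / (j2 * alpha**2 + 4.0 * phi * u))


# Hypothetical values: u stands for u(phi), j2 stands for j_gamma^2.
u_val, phi, alpha, j2 = 7.3, 0.8, 0.6, 15.0

l = l_star(u_val, phi, alpha, j2)
# If (27) solves (26), feeding l into the right-hand side returns l again.
residual = abs(rhs_eq26(l, u_val, phi, alpha, j2) - l)
print(f"l* = {l:.6f}, residual = {residual:.2e}")
```

The residual should vanish up to floating-point rounding, and the value of \(l^\star \) should lie strictly between 0 and \(u(\varphi )\).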

It remains to check that the upper and lower bounds both have the desired scaling for large and small \(\varphi \). We begin with \(u(\varphi )\), which has easily determined limits

$$\begin{aligned} \lim _{\varphi \rightarrow 0}\varphi \, u(\varphi ) = (1+\alpha )(1+2\alpha ),\quad \lim _{\varphi \rightarrow \infty }\frac{u(\varphi )}{\varphi } = \alpha (1+\alpha ), \end{aligned}$$
(28)

both of which are finite and positive, implying \(u(\varphi )=\mathrm {\Theta }(\varphi )\) for large \(\varphi \) and \(u(\varphi )=\mathrm {\Theta }(1/\varphi )\) for small \(\varphi \). For the lower bound we use these results to obtain
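The limits in (28) can be checked numerically. The sketch below evaluates \(\varphi \,u(\varphi )\) at a very small argument and \(u(\varphi )/\varphi \) at a very large one; the value \(\alpha =0.5\) and the evaluation points are arbitrary illustrative choices, not values taken from the text.

```python
def u(phi, alpha):
    """Upper bound u(phi) from Eq. (23)."""
    return (1 + alpha + alpha * phi) * (1 + phi + 2 * alpha + alpha * phi) / phi


alpha = 0.5                # hypothetical value of alpha
small, large = 1e-8, 1e8   # stand-ins for phi -> 0 and phi -> infinity

small_limit = small * u(small, alpha)   # compare with (1 + alpha)(1 + 2 alpha)
large_limit = u(large, alpha) / large   # compare with alpha (1 + alpha)

print(small_limit, (1 + alpha) * (1 + 2 * alpha))
print(large_limit, alpha * (1 + alpha))
```

In each printed pair, the numerical value and the closed form from (28) should agree to several decimal places.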

$$\begin{aligned} 1-\frac{4\varphi u(\varphi )}{\gamma (\gamma +2)\alpha ^2+4\varphi u(\varphi )}\rightarrow \frac{1+3\alpha }{5+11\alpha }\in (0,\infty ) \quad \text {as} \quad \varphi \rightarrow 0, \end{aligned}$$
(29)

and

$$\begin{aligned} 1-\frac{4\varphi u(\varphi )}{\gamma (\gamma +2)\alpha ^2+4\varphi u(\varphi )}\rightarrow \frac{1+\alpha }{1+5\alpha }\in (0,\infty ) \quad \text {as} \quad \varphi \rightarrow \infty . \end{aligned}$$
(30)

It follows from the definition of \(l(\varphi )\) and finiteness of these limits that \(l(\varphi )\) has the same scaling form as \(u(\varphi )\) for both large and small arguments. Since u and l sandwich \(c^\star \), the desired scaling is confirmed. \(\square \)
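The sandwich argument can also be illustrated numerically. The sketch below evaluates the ratio \(l(\varphi )/u(\varphi )\), with \(u\) and \(l\) as defined in (23) and (24), on a logarithmic grid and confirms that it stays strictly between 0 and 1 and does not drift towards zero at either end. The value \(\alpha =0.5\), the grid and the function names are hypothetical choices made only for this illustration.

```python
def gamma(phi, alpha):
    """gamma = (1 + phi)(1 + 1/alpha), as recalled in the proof."""
    return (1 + phi) * (1 + 1 / alpha)


def u(phi, alpha):
    """Upper bound u(phi) from Eq. (23)."""
    return (1 + alpha + alpha * phi) * (1 + phi + 2 * alpha + alpha * phi) / phi


def l(phi, alpha):
    """Lower bound l(phi) from Eq. (24)."""
    g, uu = gamma(phi, alpha), u(phi, alpha)
    return uu * (1 - 4 * phi * uu / (g * (g + 2) * alpha**2 + 4 * phi * uu))


alpha = 0.5                                      # hypothetical value
grid = [10 ** (k / 4) for k in range(-24, 25)]   # phi from 1e-6 to 1e6
ratios = [l(phi, alpha) / u(phi, alpha) for phi in grid]

print(f"l/u at phi = {grid[0]:.0e}: {ratios[0]:.4f}")
print(f"l/u at phi = {grid[-1]:.0e}: {ratios[-1]:.4f}")
print(f"min = {min(ratios):.4f}, max = {max(ratios):.4f}")
```

A ratio bounded away from 0 and below 1 across the whole grid is what the proof requires for \(l\) and \(u\) to share the same scaling form.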

4 Discussion

Theorem 1 provides an exact but implicit formula for the region in which survival of the secondary infection is possible (in the limit of infinitely large graphs), and establishes the scaling behaviour of the boundary of this region for large and small values of the parameter \(\varphi =\beta _1/\beta _2\) which controls the relative timescales of the two infections. Knowledge of this scaling behaviour is enough to prove that, for fixed \(\alpha \) and c, the survival of the secondary infection is confined to a bounded region of \(\varphi \) values—this is the reentrant phase transition of our title. Figures 4 and 5 show the results of numerical simulations of both the branching process and the network model to illustrate this phenomenon.
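To make the bounded window concrete, the following sketch uses the bounds \(l(\varphi )<c^\star <u(\varphi )\) from the proof of Theorem 1. It assumes, as our reading of the theorem, that survival is possible exactly when \(c>c^\star \); under that assumption the set of \(\varphi \) with \(u(\varphi )<c\) lies inside the survival window and the set with \(l(\varphi )<c\) contains it. The value \(\alpha =0.5\), the scanning grid and all function names are hypothetical.

```python
def gamma(phi, alpha):
    return (1 + phi) * (1 + 1 / alpha)


def u(phi, alpha):
    """Upper bound on c* from Eq. (23)."""
    return (1 + alpha + alpha * phi) * (1 + phi + 2 * alpha + alpha * phi) / phi


def l(phi, alpha):
    """Lower bound on c* from Eq. (24)."""
    g, uu = gamma(phi, alpha), u(phi, alpha)
    return uu * (1 - 4 * phi * uu / (g * (g + 2) * alpha**2 + 4 * phi * uu))


def window(bound, c, alpha, grid):
    """Smallest and largest grid values of phi at which bound(phi) < c."""
    inside = [phi for phi in grid if bound(phi, alpha) < c]
    return (min(inside), max(inside)) if inside else None


c, alpha = 10.0, 0.5                               # alpha is a hypothetical choice
grid = [10 ** (k / 100) for k in range(-400, 401)] # phi from 1e-4 to 1e4

inner = window(u, c, alpha, grid)   # c > u(phi) implies c > c*(phi)
outer = window(l, c, alpha, grid)   # c < l(phi) implies c < c*(phi)
print("inner window:", inner)
print("outer window:", outer)
```

Both windows are bounded away from the ends of the grid, so under the stated assumption the survival region for these parameter values is confined to a bounded range of \(\varphi \).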

Fig. 4

The top panel shows the fractional size of outbreaks (f, stars) and the probability of an outbreak of size > 100 (p, blue curve) of the secondary infection, measured from 1000 simulations of ER networks with mean degree \(\text {c}=10\) and \(\text {N}=10{,}000\) nodes. The bottom panel shows on the same scale the theoretical survival region of the branching process (pale green box) and the probability that the branching process reaches size > 100 (p, green curve), measured from 1000 simulations of the branching process (Color figure online)

Fig. 5

The density plot shows the probability (estimated as a fraction of 25 simulations per pixel) of an outbreak of size > 100, starting from a single infected node, in an ER network of 10,000 nodes. The red line is the boundary of the region where \(\lambda>1\) (Color figure online)

It is interesting to note that the simulations of the network process and the limiting branching process are not in perfect agreement. Viewing the mean outbreak size over 1000 runs of the model, we see in Fig. 4 that, while the network simulations agree with the branching process for small values of \( \varphi \), large outbreaks still seem to be possible beyond the point predicted by the branching process. Moreover, the individual simulation results suggest that this unexpected tail consists of a few very large outbreaks: while outbreaks of any size are rare for large values of \( \varphi \), when they do happen they reach most of the graph. Because the infection spreads in a closed connected community, we start to encounter finite size effects. Recall that the branching approximation is only valid when the number of infected nodes is small compared to the size of the graph. As the outbreak becomes large the approximation breaks down, a problem exacerbated by the two levels of infection we study. Furthermore, in a more highly connected environment there may exist transmission routes for the secondary infection to primary-infected cousins as well as to direct descendants, giving the secondary infection an opportunity to progress before direct primary progression. Similar finite size scaling effects have been observed in other coevolving infection models; see [7] for example.

Comparing the average outbreak size with individual realisations reveals an interesting trade-off between risk and reward in the strategy of a secondary infection, which arises from the different locations of the maxima of the curves shown in the top panel of Fig. 4. The values of \(\varphi \) for which outbreaks are most likely to occur (blue curve) lie at the lower end of the survival window, corresponding to smaller total outbreak sizes (black stars). Conversely, larger values of \( \varphi \) have potential for much larger outbreaks, but come with a higher risk of rapid extinction. Looking at this another way, in nature we should expect survival probability to be a strongly selected characteristic, and hence to find that the majority of secondary infections reach only a minority of primary hosts.

The work presented here could easily be extended to a host of other random graph models, for example by building on techniques of [2, 5, 18]. It may also be interesting to explore the application of the model (or variants) to other areas, including: the successive invasions of different species necessary to rebuild a diverse ecosystem in a damaged habitat; the evolution of hyperparasitism (that is, parasites that live on other parasites); radicalisation, and the incremental spread of increasingly extreme political views through social media.