Introduction

Infectious diseases are among the most visible and costly threats to individual and public health. Antibiotics, vaccines, and the molecular revolution in biology have not erased their mark. Millions of persons die every year from treatable ancient diseases, such as malaria (Gething et al. 2011; WHO 2012), tuberculosis (Dye et al. 2008), and measles (Orenstein and Hinman 2012; Simons et al. 2012). Sometimes, elimination of these diseases through vaccination, prophylaxis, and/or vector control is possible, but sustaining elimination campaigns is difficult as pathogens approach the point of elimination (Cohen et al. 2012). Conversely, emerging pathogens such as SARS and swine-origin influenza A (H1N1) cause excess mortality, disrupt business, terrorize vulnerable populations, and lead to new, sometimes ineradicable endemic diseases (e.g., HIV/AIDS). The benefits of accurately forecasting disease emergence would be tremendous: in the case of a low-incidence SARS-like pathogen, the savings could be tens of billions of $US (Rossi and Walker 2005; Smith et al. 2009); an illness resembling the 1918 influenza virus might take millions of lives and impose costs of the same order as a year’s gross domestic product (Osterholm 2005).

Elimination and emergence of infectious disease both involve a transmission system that is pushed over a critical point. In most cases, criticality occurs at the point where the basic reproduction number, \(R_{0}\), the number of secondary infected cases arising from a single infected case in an entirely susceptible population, is equal to one (Heffernan et al. 2005). Similar critical points occur in other complex systems (Strogatz 1994; Sole 2011). Particularly, in noisy (stochastic) systems, these critical points manifest as transitions between alternative modes of fluctuation (Scheffer 2009; Scheffer et al. 2009; Lenton 2011; Scheffer et al. 2012). We call such a stochastic transition a critical transition if there exists a bifurcation in a suitably constructed limit case of the mean field model. A central problem in the study of critical transitions is the identification of phenomena indicating the proximity to a critical transition in the absence of a detailed understanding on the system’s dynamical equations and/or the forcing variables causing the change (Scheffer et al. 2009; Scheffer et al. 2012; Boettiger and Hastings 2012). Recent studies have established that some noise-induced phenomena may signal the approach to a critical transition in a slowly forced dynamical system (Ives and Dakos 2012; Dakos et al. 2012a, b; Dakos et al. 2008; Carpenter and Brock 2006; Donangelo et al. 2012; Seekell et al. 2011; Carpenter and Brock 2011; Brock and Carpenter 2010; Dakos et al. 2010; Carpenter et al. 2008; Guttal and Jayaprakash 2008, 2009). If this property would apply to infectious diseases, it would suggest a model-independent route to forecasting infectious disease emergence in subcritical systems with \(R_{0}<1\) (i.e., crossing the critical point “from below”) and documenting the approach to elimination in endemic (supercritical) disease systems where \(R_{0}>1\) (crossing the critical point “from above”).

Many of these characteristic noise-induced phenomena involve critical slowing down, a decline in the resilience of the system to perturbations, which generally gives rise to an increase in the variance and autocorrelation of fluctuations as the system approaches the transition (van Nes and Scheffer 2007; Dakos et al. 2012a). But, these properties require that the transition be suitably regular (Hastings and Wysham 2010). Specifically, to observe critical slowing down requires that the potential function of the system, if it exists, be smooth. Also, for systems with multiple attracting sets (e.g., bistable systems), a sufficient level of noise gives rise to “flickering” which exhibits the characteristic increase in variance (but for a different reason) and not the increase in autocorrelation (Scheffer et al. 2009; Dakos et al. 2012b). Other obstacles to anticipating critical transitions in infectious diseases include that epidemiological systems are complicated by amplification of transients and oscillatory dynamics (Rohani et al. 2002; Bauch and Earn 2003), that infectious diseases are often seasonally forced (Altizer et al. 2006; Fraser and Grassly 2006) and propagate in demographically open systems subject to imported cases (Keeling and Rohani 2002), and that diseases that are close to elimination or close to emerging are, by definition, characterized by low prevalence and, therefore, both subject to demographic stochasticity and difficult to observe (Lloyd-Smith et al. 2009). Therefore, to determine whether diagnostic noise-induced phenomena accompany the critical transitions that occur in infectious disease dynamics, namely the transition to endemicity of emerging infectious diseases and the transition to extinction in disease elimination, it will be useful to have a quantitative theory, both to guide the selection of statistical quantities to be investigated and to predict the form that observable quantities should take as the transition is approached.

This paper is a contribution to this theory. We consider the compartmental susceptible–infectious–susceptible (SIS) and susceptible–infectious–recovered (SIR) models, which are widely used to represent the dynamics of a range of nonimmunizing and immunizing pathogens and may be viewed as approximations to an even broader class of infectious diseases. In their deterministic formulations, these models are characterized by a transcritical bifurcation at \(R_{0}=1\), where the endemic and disease-free steady states meet and exchange stability. The epidemic models we investigate are generalizations of the closed population SIS and SIR models that allow for immigration from external sources, of which the more familiar models without immigration that are characterized by a transcritical bifurcation are special cases. To appropriately model the slowly forced dynamical system, we first develop mean field theory for the nonstationary systems gradually approaching a bifurcation. We develop the mean field theory here because slowly forced dynamical systems approaching a bifurcation have rarely been investigated (but see Kuehn (2011)). We then develop the full stochastic description in terms of a master equation. Our strategy is to use the van Kampen system size expansion to separate the mean field dynamics from the fluctuations induced by demographic stochasticity. These fluctuations are described by a Fokker–Planck equation, from which the power spectrum, variance, autocorrelation, and coefficient of variation may be analytically obtained. Our main result is contained in Table 4, which provides formulas expressing these theoretical predictions in terms of the dominant eigenvalue of the mean field solutions, which goes to zero as the system approaches the critical transition. Hence, these formulas express how epidemiologically measurable quantities are expected to change as the system approaches the critical transition. A number of approximations are introduced in the derivation of these formulas, including separation of time scales, second-order description of the fluctuations, and continuum representation of the state space. We, therefore, performed simulations to study the possible detrimental effects of these approximations. Specifically, we examined the agreement between the theoretical predictions and “measured” statistics obtained in a moving window, as one would analyze real-world data, in both discrete state-space and continuous state-space simulations. We show that the analytical expressions are robust, provided the conditions of fast–slow systems are met. These results demonstrate that the critical transitions associated with disease emergence and elimination may be anticipated even in the absence of a detailed understanding of their underlying causes.

Theory and methods for emergence and elimination

SIS model

Mean field theory of SIS model

We first consider a general SIS model that allows for immigration. Denoting the proportions of susceptible and infectious populations by \(x(t)\) and \(y(t)\), respectively, the SIS model in a population of size N is given by

$$ \begin{array}{rll} \dot{x} &=& -\beta x y - \eta x +\gamma y \\ \dot{y} &=& \beta x y + \eta x -\gamma y, \end{array} $$
(1)

where \(\beta \) is the the transmission rate, \(\gamma \) is the rate of transfer from the infectious class to the susceptible class, and a dot denotes the time derivative. The model assumes that infection may also occur through contact with a trickle of infectious imports at a rate \(\eta \), either by susceptibles briefly leaving the population and making contact with infectious individuals located elsewhere or through infectious visitors briefly entering the population and making contact with susceptibles, so that the total force of infection is \(\Lambda = \beta y + \eta \). Since the population size N is constant, i.e., \(x+y=1\), the SIS model may be described by a single equation,

$$ \dot{y} = \beta (1-y) y + \eta (1 -y) -\gamma y. $$
(2)

We note that Eq. (2) can encompass a variety of infectious disease systems, including closed population SI models. Thus, the system can be used to model diseases that confer no long-lasting immunity, e.g., sexually transmitted diseases or acute infections such as influenza or diseases that are fatal. If \(\eta = 0\), Eq. (2) has two equilibria: the disease-free equilibrium \(y_{0}=0\) and the endemic equilibrium \(y^{\ast } = (1-1/R_{0})\), where \(R_{0} = \beta / \gamma \) denotes the basic reproduction number. When \(R_{0}>1\), the infection can persist, but if \(R_{0}<1\), then the infection dies out. The equilibria \(y_{0}\) and \(y^{\ast }\) meet and exchange stability when the basic reproduction number \(R_{0}=1\), i.e., a transcritical bifurcation occurs. We will refer to the SIS model with no immigration as the limiting case. However, it is usually more realistic to assume that a low level of immigration occurs, i.e., \(\eta \) is positive. If \(\eta >0\), then Eq. (2) has a single positive equilibrium that is always locally stable, provided \(\beta , \gamma >0\), since the slope at the equilibrium \(\lambda = \beta - 2 \beta y^{\ast } - \gamma -\eta \) is negative. Moreover, whether \(\eta = 0\) or \(\eta >0\), the return time \(-1/\lambda \) decreases in the limit \(\beta \rightarrow \gamma \). At the critical point in the limiting case SIS model, \(\lambda =0\). Therefore, the transcritical bifurcation occurs in a suitably constructed limiting case (\(\eta =0\)), which we suggest gives an indication for the behavior of the fluctuations for the stochastic version of the model with immigration, provided \(\eta \) is small.

Slow changes in the transmission rate \(\beta \), through demographic or evolutionary means, may induce a transcritical bifurcation of Eq. (2). For example, transmission may decrease as a result of slow increases in vaccination uptake among the population, leading to extinction of the pathogen, or it may increase due to slow decreases in vaccination uptake, potentially causing a transition to endemicity. To formally incorporate slow changes in transmission into the model, we rewrite Eq. (2) as a fast–slow system:

$$ \begin{array}{rll} \dot{I} &=& \beta (1-y) y + \eta (1-y) -\gamma y,\\ \dot{\beta} &=& \epsilon f(y, \beta) , \end{array} $$
(3)

where \(0< \epsilon \ll 1\) and the function f describes the change in transmission rate. In this paper, we assume that transmission is a slowly changing linear function of time, i.e., \(\beta = \beta (t) =\beta (1-p(t))\). The proportion of the population that are vaccinated is modeled by the function \(p(t)=p_{s}+p_{0} t\), with \(p_{0}\) an incremental change in \(\beta \). Hence, \(\dot {\beta } = p_{0}\). If \(p_{0}>0\), then the rate of transmission is slowly declining over time, and if \(p_{0}<0\), it is slowly increasing. The limit case \(\epsilon \rightarrow 0\) of Eq. (3) is Eq. (2), and thus, the endemic equilibrium of the limit case assumes \(\beta = \beta (1-p_{s})\), which arises from the level of vaccination uptake \(p(t)\) (the bifurcation parameter). Figure 1 shows the transcritical bifurcation diagram for the SIS model. For small \(\eta \), the plot of the stable endemic equilibrium \(y^{\ast }\) as a function of vaccination uptake is similar to the bifurcation diagram for \(\eta =0\), except in the vicinity of the critical point \(p^{\ast }\) (inset figure of Fig. 1). For \(p > p^{\ast }\), the disease-free equilibrium is stable in the limiting case, but for the model with immigration, the infectious population is sustained at a low level due to importation of the disease from external sources (Fig. 1).

Fig. 1
figure 1

Bifurcation diagram of the SIS model (\(\epsilon \rightarrow 0\) mean field theory). The red dashed line indicates the stable endemic equilibria of the SIS model without immigration as a function of vaccination uptake p. The inset plot shows a close-up of the diagram near the transcritical bifurcation point in the limiting case where \(\eta =0\). The dashed line indicates the location of the critical point, \(p^{\ast }=1-1/R_{0}=0.9\). The bifurcation diagram was plotted using the parameters given in Table 7

Stochastic description of SIS model

To investigate the effect of noise on this transition, we assume that fluctuations in the infectious population are caused by demographic stochasticity (intrinsic noise). To understand these fluctuations, we require an individual-level description rather than a population-level description. We assume that all individuals have identical attributes, and individuals may move from the susceptible state \(S(t)\) to the infectious state \(I(t)\), or from the infectious state to the susceptible state. We assume that the population size is constant, i.e., \(S(t)+I(t)=N\). Since the population size is constant, we need only model the transitions into and out of the infectious state. Furthermore, since the SIS system moving towards a bifurcation is a fast–slow system, the transmission rate \(\beta \) is not constant in time but is a slowly changing function of time. Because the probability of transmission between an infectious and susceptible individual is slowly changing over time, we treat the transmission rate as constant for each small time increment dt, \(\overline {\beta }\). Table 1 shows the variables and parameters of the SIS model.

Table 1 Variables and parameters of the SIS model

We assume that in a sufficiently small time increment dt, the number of infectious individuals I can either (a) increase by 1, (b) decrease by 1, or (c) not change in number. Infection and removal are the events in an SIS process that lead to these changes in state. We denote the number of individuals I at time t or the state of the system by \(\alpha = I\) at time t and the alternative states by \(\tilde {\alpha }\), where either \(\tilde {\alpha } = I+1\) or \(\tilde {\alpha }=I-1\) in the SIS model. The system of individuals goes from a state with \(\alpha =I\) individuals at time t to a state with \(\tilde {\alpha }\) individuals at time \(t+dt\) with transition probability per unit time \(T_{i}(\tilde {\alpha }|\alpha )\). The system can also go from a state with \(\tilde {\alpha }\) individuals at time t to I individuals at time \(t+dt\) with a probability flux \(\tilde {T_{i}}(\alpha |\tilde {\alpha })\). Table 2 outlines the events, changes in state, and the transition probability fluxes into and out of state I that can occur in a stochastic SIS process with a slowly changing transmission rate.

Table 2 Transition probability fluxes for SIS model into and out of state \(\alpha =I\)

Since the transitions in Table 2 describe a Markov process, we can write down a master equation describing how the probability of there being I individuals at time t evolves with time. The master equation represents the continuous time version of the dynamics of the Markov process. Derivations of master equations can be found in Van Kampen (1981) and Renshaw (1991). Letting \(P(I,t)=\text {Prob}\) \((I(t)=I)\) be the probability that the infectious state variable \(I(t)\) is equal to some nonnegative integer I, the master equation for the evolution of the probability of the infectious population being in state \(\alpha = I\) at time t is

$$ \frac{dP(\alpha,t)}{dt} = \sum\limits_{\alpha \neq \tilde{\alpha}} \tilde{T_{i}}(\alpha|\tilde{\alpha})P(\tilde{\alpha},t)- \sum\limits_{\alpha \neq \tilde{\alpha}} T_{i}(\tilde{\alpha}|\alpha)P(\alpha,t), $$
(4)

where \(\tilde {\alpha }\) describes all other states (Table 3), and the probability fluxes into and out of state I are as described in Table 2. This is a system of \(N+1\) ordinary differential equations, \(\alpha =0, 1, \dots N\), which can be solved with an initial condition of \(I_{0}\) individuals at time \(t=0\), i.e., \(P(\alpha , 0) = 1\) for \(\alpha = I_{0}\) and \(P(\alpha , 0) = 0\) for all \(\alpha \neq I_{0}\). Thus, the probability distribution at time \(t=0\) is a delta function. Table 3 summarizes the quantities in Eq. (4).

Table 3 Master equation terms. The transition probability of infection per unit time is given as an example of a transition into and out of state \(\alpha \) for the SIS and SIR stochastic processes. The other transition probability fluxes can be found in Tables 2 and 6

We will apply the van Kampen system size expansion to the master equation. The system size expansion takes advantage of the fact that, for large population size N, the demographic fluctuations in the population are small, and so N is the expansion parameter. Thus, the key assumption for the expansion to be valid is that the system size N is sufficiently large (see Appendix A for details). Here, we pause to note that that the SIS model is equivalent to a simple logistic process, and hence, there are analytical expressions for the mean and variance (Allen 2003). The \(I=0\) state is absorbing, and the expected time to extinction is finite (Allen 2003). Nevertheless, if the population size is large, the probability distribution is approximately stationary for a very long time. It is the statistics of this quasi-stationary distribution that we are interested in.

Following expansion of the master equation, the leading order and next-to-leading order terms are collected, giving rise to a deterministic system that describes the evolution of the trend and its stochastic correction, respectively, see Appendix A for details. The deterministic system turns out to be the fast–slow system (3). The distribution of the fluctuations about the solution of Eq. (3) is given by a linear Fokker–Planck equation. Thus the fluctuations are Gaussian distributed about the mean, and thus, \(I(t) \sim \text{Normal}(N \varphi(t), N \sigma^2)\), see Appendix A for details and definitions. The Fokker–Planck equation is equivalent to a stochastic differential equation (Gardiner 2004). This is the key advantage of applying the system size expansion. Using the stochastic differential equation, we can obtain analytical expressions for statistical signatures of leading indicators and early warning signals, including the power spectrum and autocorrelation function (see Appendix A for details). Table 4 summarizes the indicator statistics calculated using the methods described in Appendix A for the stable limiting case \(\epsilon \rightarrow 0\) for the SIS model, in terms of the eigenvalue \(\lambda \). We will examine how the statistics change over a range of vaccination uptake values in section “Results”.

Table 4 Analytical expressions for quasi-stationary statistics about the endemic infectious quasi-steady state expressed in terms of the eigenvalues

Robustness of the van Kampen approximation

Since the system size expansion involves approximating the discrete random variable I with a normal random variable, its assumptions may break down when I is small. For small numbers of infectious individuals, e.g., if the system is subcritical and, thus, incidence of a disease is low, the assumption that the fluctuations are normally distributed about the mean is likely to be inappropriate. The van Kampen expansion is valid when the system is far from the absorbing boundary at \(I=0\). However, the approximation upon which the approach is built (Eq. 10 in Appendix A) cannot describe chance extinctions, which can occur if I is small. If \(\eta = 0\) and \(\overline {p} > 1- 1/R_{0}\), the disease-free equilibrium is stable, and so the approximation given by Eq. (10) is not valid. Due to the absorbing boundary, the probability distribution about the disease-free state for a given time t will be one-sided. On the the other hand, if \(\eta >0\), then, close to the system boundary, the normal distribution approximation about the deterministic mean may be poor because the distribution is bimodal due to extinction events. However, the models with immigration do not have a disease-free state; just a small infectious population when \(\overline {p}\) is approximately greater than \(1/R_{0}\). Therefore, predictions for the quasi-stationary statistics about this state can be obtained.

SIR model

Mean field theory of SIR model

Denoting the proportions of susceptible, infectious, and recovered populations by \(x(t)\), \(y(t)\), and \(z(t)\) respectively, the SIR model with immigration in a closed population of size N is

$$\begin{array}{rll}\label{eqn:SIRmodelimmigration}\dot{x} &=& \mu(1-p) -\beta x y - \eta x -\mu x \\\dot{y} &= &\beta x y + \eta x -(\gamma+\mu) y, \\\dot{z} &=& \mu p+ \gamma y - \mu z,\end{array}$$
(5)

where \(\beta \) is the transmission rate, \(\gamma \) is the recovery rate, \(\eta \) is the immigration rate, and \(\mu \) is the per capita birth rate. To maintain a constant population size, the per capita death rate is set equal to the birth rate. A proportion p of the population are chosen at random for vaccination at birth and are recruited into the recovered class. The remaining unvaccinated proportion of the population enter the susceptible class at birth. Since the population size N is constant, \(x+y+z=1\), system (5) is equivalent to the following system of ordinary differential equations

$$\begin{array}{rll}\label{eqn:SIRmodel}\dot{x} &=& \mu(1-p) -\beta x y - \eta x -\mu x \\\dot{y} &=& \beta x y + \eta x -(\gamma+\mu) y.\end{array} $$
(6)

The SIR model is particularly appropriate for acute immunizing infections such as measles and pertussis.

In the absence of vaccination and immigration, the basic reproduction number is given by \(R_{0} =\beta /(\gamma +\mu )\). If \(\eta = 0\), the SIR model (6) has two equilibria: the disease free state \((x_{0}, y_{0})=(1-p, 0)\) and the endemic equilibrium \((x^{\ast }, y^{\ast })=(1/R_{0}, \mu (1-p-1/R_0) /(\gamma+\mu)\), respectively. The SIR model undergoes a transcritical bifurcation at \(R_{0} =1\). The endemic equilibrium is locally stable if \(R_{0}>1\) and is not biologically feasible if \(R_{0}<1\), when the disease-free equilibrium is locally stable. As the vaccination uptake p increases, the basic reproduction number reduces by a factor \((1-p)\), i.e., the effective reproduction number is \(R_{0}(1-p)\). The vaccination uptake p for the effective reproduction number is 1 at the critical vaccination threshold, \(p^{\ast } = 1 - 1/R_{0}\) (Anderson and May 1991), and it is at this critical threshold that the transcritical bifurcation occurs. Again, the SIR model with no immigration is the relevant limiting case.

Temporary importation of pathogen often occurs in infectious disease systems (Keeling and Rohani 2008). Assuming that a low level of immigration occurs, i.e., \(\eta >0\), only a single positive equilibrium is biologically feasible. This equilibrium is a stable spiral when the square of the trace of the Jacobian matrix evaluated about the equilibrium is less than four times its determinant and is a stable node if it is greater than or equal to this quantity. Complex eigenvalues of the Jacobian matrix of Eq. (6) characterize a stable spiral, and the eigenvalues are real and negative if the equilibrium is a stable node. The stable node equilibrium can be thought of as a “disease-free” equilibrium if \(\eta \) is small, in the sense that the infectious population is sustained at a low level due to importation of the disease from external sources. Therefore, while the transcritical bifurcation occurs in a suitably constructed limiting case (\(\eta =0\)), the limit case may give an indication for the behavior of the fluctuations for the stochastic version of the model with immigration, provided that \(\eta \) is small.

Gradual changes in the vaccination uptake rate p may induce a transition from endemicity or to extinction. Recruitment of susceptibles may slowly vary over long time scales as a result of demographic or evolutionary changes. Vaccination uptake rates are often not constant over time and may exhibit trends, e.g., percentage uptake of pertussis vaccine in the USA (Rohani and Drake 2011). Mathematically, we may express the SIR model approaching a transcritical bifurcation as a fast–slow system:

$$\begin{array}{rll}\label{eqn:fastslowSIRmodel}\dot{x} &=& \mu(1-p) -\beta x y - \eta x -\mu x \\\dot{y} &= &\beta x y + \eta x -(\gamma+\mu) y, \\\dot{p} &= &\epsilon f(x, y, p), \end{array}$$
(7)

where \(0<\epsilon \ll 1\) and the function f describes the change in vaccination uptake p. Again, we model vaccination uptake as a linear function of time \(p(t)=p_{s}+p_{0} t\), with \(p_{0}\) an incremental change in p. Consequently, \(\dot {p} = p_{0}\). If \(p_{0} >0\), recruitment into the susceptible class is slowly declining over time and if \(p_{0}<0\), then recruitment is slowly increasing over time. In the limit \(\epsilon \rightarrow 0\), the vaccination uptake is fixed at a constant rate \(p_{s}\), and the system is stable. Figure 2 shows the transcritical bifurcation diagram for the SIR model. The infectious equilibrium \(y^{\ast }\) is plotted as a function of vaccination uptake p. The bifurcation diagram indicates that the endemic infectious equilibrium declines linearly with p in the models with and without immigration. However, the inset plot indicates that in the vicinity of the \(\eta = 0\) critical point, the infectious equilibrium of the immigration model is elevated relative to the disease-free equilibrium, since the infectious population is sustained at a low level due to immigration.

Fig. 2
figure 2

Bifurcation diagram of SIR model (\(\epsilon \rightarrow 0\) mean field theory). The red dashed line indicates the stable endemic equilibria of the SIR model without immigration as a function of vaccination uptake p. The inset plot shows a close-up of the diagram near the critical point. The dashed line indicates the location of the critical point, \(p^{\ast }=1-1/R_{0}=0.94\)

Stochastic description of SIR model

Model (7) is the deterministic description of the SIR system approaching a transition, but there will be stochastic fluctuations in the state of the system as the transition is approached. As before, we assume that these fluctuations result from demographic stochasticity. To quantify these fluctuations, we assume individuals are identical. Individuals may be recruited into the susceptible state and can transition out of this state through infection or death. Infectious individuals may recover or die. Assuming that the population size \(S(t) + I(t) +R(t) = N\) is constant, we need only consider the transitions into and out of the susceptible and infectious states. Furthermore, since the SIR system moving towards a bifurcation is a fast–slow system, the vaccination uptake p is not constant in time but is a slowly changing function of time. Therefore, the recruitment rate into the susceptible class is slowly changing, but, for each small time increment dt, we can treat the vaccination uptake as constant \(\overline {p}\). Table 5 presents the variables and parameters of the SIR model.

Table 5 Variables and parameters of the SIR model

Events that occur in an SIR process in a small time increment dt moving slowly towards a transition include infection, recovery, recruitment to the susceptible class, and death due to natural causes. Table 6 outlines the events, changes in state, and the transition probabilities per unit time for each change in state. We can construct a master equation in the same manner as for the SIS model. Letting \(P(S, I,t)=\text {Prob}((S(t),I(t)) = (S, I))\) be the probability that the state vector \((S(t),I(t))\) is equal to some nonnegative integer \((S,I)\), the master equation for the evolution of the probability of the population being in state \(\alpha = (S,I)\) at time t is

$$ \frac{dP(\alpha,t)}{dt} = \sum\limits_{\alpha \neq \tilde{\alpha}} \tilde{T_{i}}(\alpha|\tilde{\alpha})P(\tilde{\alpha},t)- \sum\limits_{\alpha \neq \tilde{\alpha}} T_{i}(\tilde{\alpha}|\alpha)P(\alpha,t), $$
(8)

where \(\tilde {\alpha }\) describes all other states (Table 3).

Table 6 Transition probability fluxes for SIR model. Transitions can occur into and out of state \(\alpha = (S,I)\)

The master Eq. (8) is nonlinear and, therefore, cannot be solved analytically. To make analytical progress with Eq. (8), again, we can use the van Kampen system size expansion (see Appendix B for details). The approach gives rise to the deterministic SIR fast–slow system (7) and a linear Fokker–Planck equation that describes the evolution of the fluctuations, which may be written as a system of stochastic differential equations. These equations can be analyzed using Fourier transformation. The solution of the Fokker–Planck equation is a bivariate normal distribution. Table 4 summarizes the indicator statistics in the stable system limiting case \(\epsilon \rightarrow 0\) for the SIR model, in terms of its eigenvalues.

Simulations

The preceding sections present an analytical theory of early warning signals for emergence and leading indicators of elimination. To investigate the results of this theory for a particular parameter set (Table 7), we calculated leading indicators of elimination and emergence, assuming alternatively that (a) the mean proportion of infectious individuals is given by the deterministic endemic equilibrium (\(\epsilon \rightarrow 0\) theory) or (b) assuming it is given by the current state of the fast–slow system approaching a transition. We selected parameters consistent with sexually transmitted SIS diseases with long infectious period and large \(R_{0}\) and parameters typical for childhood infectious diseases with SIR dynamics. To examine how different changes in vaccination uptake \(p_{0}\) affect the statistics, we varied \(p_{0}\) by 1/500 and \(1/30~\text {year}^{-1}\) in the model with immigration. We also compared the elimination indicators with those calculated assuming that the mean proportion of infectious individuals was given by the deterministic endemic equilibrium from the limiting case models with no immigration. The early warning signals of emergence were not calculated for the limiting case because the disease-free equilibrium is stable.

Table 7 Parameter values for the simulations

To test the robustness of this theory to the range of approximations that were introduced (fast–slow approximation, continuum description, van Kampen expansion), we simulated the approach to elimination and emergence in a variety of cases. To simulate the approach to elimination, we followed the “bottom-up” approach of Allen (2003) to derive stochastic differential equations that incorporated demographic stochasticity. This approach uses the rates in Tables 2 and 6 to build a system of stochastic differential equations. Stochastic differential equations formulated in this way are appropriate, provided that the population size is sufficiently large, because then changes in the state variables are assumed to be normally distributed. Simulations were compared with output from Gillespie’s direct method (Gillespie 1977). The simulations were qualitatively similar for population sizes greater than 50,000. To simulate the approach to emergence, we used Gillespie’s direct method as this is most appropriate for small population sizes.

We simulated the SIS and SIR stochastic models with immigration approaching elimination and emergence 500 times. To compare the statistics to the theoretical predictions in Table 4, the infectious time series approaching elimination were sampled at yearly intervals. The transcritical bifurcation in these scenarios was approached over a long time frame (e.g., 470 years when \(p_{0} = 1/500\) in the SIR model). However, the transcritical bifurcation to emergence was approached over a relatively short time frame for the SIR model (10 years). To obtain a better sampled time series, the data from the Gillespie simulations were aggregated over monthly intervals for the SIR model. The infectious time series were aggregated over yearly intervals for the SIS model approaching emergence because events did not always occur at a monthly frequency. Thus, time series obtained from the SIS system approaching emergence allowed us to examine the issues that arise from poorly sampled time series.

Analysis over a moving window

To investigate the robustness of the early warning predictions over a moving window, i.e., as they would be used in online analysis of surveillance data, the influence of the slowly varying trend must be removed. The van Kampen approach (Appendices A and B) leads to a natural expression for the fluctuations from the quasi-stationary state \(N \varphi (t)\),

$$ \zeta(t) = N^{-1/2}(I -N \varphi(t) ), $$
(9)

where ϕ(t) is determined by the mean field equations (11)(SIS) and (22) (SIR) in Appendices A and B respectively. To obtain the fluctuations, we subtracted the current mean, which we assumed to be determined by the current state of the fast–slow system \(N \varphi (t)\) , from the state of the system at the start of each year and divided this quantity by the square root of the population size. We refer to this as Van Kampen detrending.

Gaussian filtering is another, more common, method used to remove the influence of a slowly varying mean of a data series. To compare the performance of Gaussian smoothing to van Kampen detrending, for each time series, we fit a Gaussian kernel smoothing function across the entire infectious case record up to the time that the transcritical bifurcation was predicted using a fixed bandwidth. Lenton et al. (2012) have shown that the results obtained from applying the Gaussian filter across the entire time series do not differ significantly from detrending within windows. To obtain the residuals, we subtracted the fit from each time series and divided by the square root of the population size to be consistent with van Kampen detrending. The choice of bandwidth was informed by the resemblance of the Gaussian residuals to the fluctuations obtained from the van Kampen approach. To study the changes in the statistics up to the critical transition, we calculated the lag-1 autocorrelation and the variance of the fluctuations obtained using the two detrending methods over a moving window half the length of the time series. We calculated the lag-1 autocorrelation coefficient of each replicate using the acf function in R. The coefficient of variation (CV) was calculated by calculating, over a moving window, the mean and standard deviation of each infectious replicate. The median and 95 % prediction intervals for each of the statistics were calculated over the 500 replicates of each model. The prediction intervals were calculated using the quantile function in R. To quantify trends in each statistic for each replicate, we used Kendall’s correlation coefficient \(\tau \). To determine the distribution of Kendall’s \(\tau \), we calculated the coefficient for the trend in the test statistic for each realization.

To assess the performance of the leading indicators, we followed the approach of Boettiger and Hastings (2012) and calculated receiver operating characteristic (ROC) curves from the distribution of Kendall’s \(\tau \) calculated from realizations of the models with and without transitions. The model without a transition is quasi-stationary, and we refer to it as the baseline or null model. The null models for elimination assume vaccination uptake \(p=0\) and are simulated beginning from the deterministic equilibrium at \(p=0\). The null models for emergence assume vaccination uptake \(p=0.96\) and are simulated beginning from the deterministic equilibrium at \(p=0.96\). The baseline models were simulated for the same length of time as it takes for the transition to be approached in the test models. The model with a transition is the test model. An ROC curve enables investigation of the sensitivity of leading indicators to detect differences between quasi-stationary systems and those approaching a critical transition. We simulated the baseline models 500 times and obtained fluctuations using the van Kampen approach and from Gaussian filtering. We then quantified the trend in each indicator using Kendall’s \(\tau \) for each baseline simulation. In our results, indicator statistics typically exhibited increasing trends or no trend, but for those that decreased, we multiplied Kendall’s \(\tau \) for each realization by -1 to calculate the ROC curve. The area under the ROC curve (AUC) was also calculated. An AUC close to one indicates near-perfect detection.

Results

Transitions to elimination and emergence in SIS systems

The mean field fast–slow dynamics drive the transitions to elimination and emergence in stochastic SIS models with gradual changes in transmission. Figure 3a shows a solution \(y(t)\) of the fast–slow system (3) with \(\eta \neq 0\). The solution is the current “pool” of infectious individuals (the proportion infectious \(y(t)\)), not the number of new cases per time interval that comprise the traditional epidemic curve or recorded case reports of infectious individuals. At time \(t=450\), the limiting case model predicts elimination of the pathogen. The limit case plotted in the figure is that arising from the \(\epsilon \rightarrow 0\) mean field theory, not the \(\eta =0\) critical case. Here, we note that the limit case \(\epsilon \rightarrow 0\) and the solution of the fast–slow system diverge from one another close to the proximity of the transcritical bifurcation of the \(\eta = 0\) limit case. Figure 3b shows a typical realization of the stochastic counterpart of the fast–slow SIS system with \(\eta \neq 0\), assuming a change in vaccination uptake of \(p_{0} =1/500\) per year. The description of the stochastic process is outlined in Table 2. The infectious population slowly declines and closely follows the limiting case \(\epsilon \rightarrow 0\) up until time \(t=400\) before eventually diverging. The inset figure indicates that the infectious population remains elevated 50 years after the transcritical bifurcation has occurred in the model with \(\eta = 0\). Therefore, the transition to elimination in this system is characterized by a slow change in the mean, followed by rapid fall off in incidence that precedes a very slow fadeout to extinction.

Fig. 3
figure 3

a SIS mean field predictions, assuming that immigration is nonzero. The black line indicates a solution of the fast–slow system (3) with \(\eta > 0\), and the blue dashed line indicates the stable equilibrium for the vaccination uptake at time t indicated on the top axis (limit case \(\epsilon \rightarrow 0\)). The gray dashed line indicates the location of the critical point when \(\eta =0\). b Realization of the stochastic fast–slow SIS model with immigration. The dashed line marks the time of the transcritical bifurcation. The proportion of the population vaccinated at time t is indicated by the top axis in all figures. Parameter values for the figures are given in Table 7

On the other hand, if a disease is emerging due to small increases in transmission, or the level of vaccination uptake in the population is declining, solutions of the fast–slow system may exhibit a “delay” in emergence relative to the occurrence of the transcritical bifurcation. Figure 4a shows the delay in emergence in a solution of Eq. (3) with \(p_{0}=1/500\). At time \(t=30\), the limiting case model predicts emergence. Again, the limit case indicated in the figure is the limit \(\epsilon \rightarrow 0\) of Eq. (3), not the \(\eta =0\) critical case. Figure 4b shows a typical realization of the stochastic counterpart of the solution shown in Fig. 4a. The delay in the take-off of the epidemic is apparent. The rapid increase in the number of cases, is preceded by a low level of incidence long after the transition to endemicity has occurred. Of course, the question is, does the low level of infection in the time series give any clue, through early warning signals, of the eventual rapid increase of infectious cohorts?

Fig. 4
figure 4

a SIS mean field predictions, assuming that immigration is nonzero. The black line indicates a solution of the fast–slow system (3) with \(\eta > 0\), and the blue dashed line indicates the limiting case \(\epsilon \rightarrow 0\) of the fast–slow system. The limit case here refers to the endemic equilibrium of Eq. (2) corresponding to the transmission rate at time t \(\beta (1-p(t))\) indicated on the top axis (\(\beta =1\)). The gray dashed line indicates the location of the critical point when \(\eta =0\). b Realization of the stochastic fast–slow SIS model with immigration. The black dashed line indicates the timing of the transcritical bifurcation when \(\eta =0\). The unvaccinated proportion of the population at time t is indicated by the top axis in all figures. Parameter values for the figures are given in Table 7

Transitions to elimination and emergence in SIR systems

The transitions to disease elimination and endemicity in the stochastic SIR models with gradually changing vaccination uptake are also driven by the mean field fast–slow dynamics. To eradicate a disease, the effective reproduction number must be reduced by increasing the vaccination uptake p. Figure 5a shows a solution of the fast–slow system (7) with \(\eta \neq 0\). The mean field \(\epsilon \rightarrow 0\) theory and the solution of Eq. (7) agree closely. The timing of the transcritical bifurcation in the \(\eta =0\) model occurs at \(t=470\), and the number of infectious individuals has reached a low level by this time. Figure 5b shows a stochastic realization of the SIR model with immigration approaching elimination (the details of the stochastic process are described in Table 6). The infectious population exhibits amplified oscillations, unlike the deterministic trajectory in Fig. 5a. Demographic stochasticity is expected to excite the transient oscillations of the SIR model (Bauch and Earn 2003), and we will see this in detail later when we examine the power spectrum of the fluctuations.

Fig. 5
figure 5

a SIR mean field predictions, assuming that immigration is nonzero. The black line indicates a solution of the fast–slow system (7) with \(\eta > 0\), and the blue dashed line indicates the equilibrium arising from the limiting case \(\epsilon \rightarrow 0\) of the fast–slow system, corresponding to the level of vaccination uptake indicated on the top axis. The gray dashed line indicates the location of the critical point when \(\eta =0\). b Realization of the stochastic fast–slow SIR model with immigration. The dashed line marks the time of the transcritical bifurcation. The proportion of the population vaccinated at time t is indicated by the top axis in all figures. Parameter values for the figures are given in Table 7 (elimination parameters)

On the other hand, if a disease is emerging through an increase in recruitment into the susceptible class, the effective reproduction number is increasing. Figure 6a shows a solution of the fast–slow system (7) with \(\eta > 0\) on the verge of emergence. After the transcritical bifurcation, the solution of the system does not agree with the stable endemic equilibrium at time t until approximately \(t=55\) years. The solution grows more slowly than the stable equilibrium. The delay in the dynamics is also exhibited by stochastic realizations, e.g., Fig. 6b. Outbreaks remain small and sporadic for at least 10 years after the bifurcation has occurred. The delay before emergence arises because the mean predicted by system (7) remains small following the bifurcation. However, the bifurcation delay is not as marked as that in the SIS model (compare Fig. 4 with Fig. 6).

Fig. 6
figure 6

a SIR mean field predictions, assuming that immigration is nonzero. The black line indicates a solution of the fast–slow system (7) with \(\eta > 0\), and the blue dashed line indicates the equilibrium arising from the limiting case \(\epsilon \rightarrow 0\) of the fast–slow system, corresponding to the recruitment rate at time t \(\mu (1-p(t))\) indicated on the top axis (\(\mu =1/50~\text {year}^{-1}\)). The gray dashed line indicates the location of the critical point when \(\eta =0\). b Realization of the stochastic fast–slow SIR model with immigration. The black dashed line indicates the timing of the transcritical bifurcation. The per capita recruitment rate of the population at time t is indicated by the top axis in all figures. Parameter values for the figures are given in Table 7 (emergence parameters)

Leading indicators of elimination

Power spectrum predictions

Figure 7 compares the power spectra of the fluctuations in the stable models with and without immigration. The patterns in the power spectra with immigration (Fig. 7c, d) and without (Fig. 7a, b) are similar. For SIS systems with low transmission rates, and SIR models with low recruitment rate into the susceptible class, the power in the peak of the spectrum is low. The power spectrum for SIS and SIR systems is expected to shift to lower frequencies as the pathogen approaches extinction. In the SIS systems, the power at smallest frequencies increases as the critical point is approached, but this effect is subtle. In contrast, in the SIR case, there is a dramatic reduction in the frequency at which the highest power is observed. The approach to endemicity is expected to be indicated by a shift to higher frequencies in the power spectrum. In the SIS model, the spectrum encompasses a broader range of frequencies as emergence is approached. On the other hand, in subcritical SIR systems approaching endemicity, fluctuations should become less noisy as the threshold is approached. The power spectrum transforms from a flat “white-noise” spectrum in the subcritical scenario to a spectrum exhibiting resonant peaks for sufficiently decreased vaccination uptake. Oscillations are expected to be amplified at these resonant frequencies. It has been shown that complex eigenvalues of the Jacobian matrix of system (6) guarantee the existence of the power spectrum peak (Alonso et al. 2007).

Fig. 7
figure 7

Limiting case \(\epsilon \rightarrow 0\) prediction for the power spectrum for the models with and without immigration. The predictions for \(\eta >0\) agree closely with those generated from the \(\eta = 0\) models. In the SIS case, the power at smallest frequencies increases as the critical point is approached, but this effect is subtle, whereas in the SIR case, there is a dramatic reduction in the frequency at which the highest power is observed . Each power spectrum is evaluated about the mean field infectious equilibrium \(\varphi ^{\ast }\), given in Tables 8 and 9. The parameters used to calculate the power spectrum are given in Table 7. Importation, vaccination uptake, and population sizes were chosen according to the elimination scenario. The power spectrum has been log-transformed for clarity

SIS leading indicator predictions

Systems approaching a critical transition are expected to exhibit rising variance and rising autocorrelation (Scheffer 2009). In general, the predictions for the leading indicators of SIS disease elimination follow these expectations for one-dimensional stochastic systems. Figure 8a–c shows the leading indicators of elimination for the SIS model. As the threshold to pathogen extinction is approached by gradually increasing the vaccination uptake, all of the statistics evaluated about the endemic equilibrium of the stable system increase as the bifurcation parameter is increased, as expected from the analytical formulas for the limiting case (Table 4). The lag-1 autocorrelation rises, indicating an increase in system memory, and the fluctuation variance increases, as suggested by the plot of the power spectrum for increasing vaccination uptake values. The coefficient of variation also increases, indicating that infectious time series should become noisier as the transition is approached. The signal predictions for the SIS systems with and without immigration in the limit case \(\epsilon \rightarrow 0\) (in green) are in close agreement, which makes sense from the analytical formulae, provided \(\eta \) is small. To examine how different changes in vaccination uptake \(p_{0}\) in the fast–slow model with immigration affect the statistics, we varied \(p_{0}\) by \(1/500\) and \(1/30~\text {year}^{-1}\). Predictions are in close agreement far from the \(\eta =0\) critical point but they differ close to it.

Fig. 8
figure 8

Leading indicators of elimination. The statistics evaluated about the endemic equilibrium for the limiting case \(\epsilon \rightarrow 0\) with \(\eta = 0\) are shown in dashed green line. The indicators evaluated about the endemic equilibrium assuming \(\eta >0\) are indicated in red line. The statistics evaluated about the current state of the mean field fast–slow system are shown in blue (dotted) and black lines. The gray dashed line indicates the location of the critical point when \(\eta =0\). The theoretical predictions in the \(\eta > 0\) SIS case are not as dramatic as one expects from the limit case but are qualitatively consistent with standard expectations, unlike the SIR limit case predictions, which do not conform with standard expectations. The variance in the SIR case is predicted to decline, which is not unexpected here, but does not conform to standard expectations. The coefficient of variation increases as the critical point is approached, but the increase is not as dramatic as one may expect from the limiting case. In contrast to expectations, the lag-1 autocorrelation is predicted to decline close to the critical point if \(\eta >0\). The predictions for the statistics evaluated about the current state of the fast–slow system differ from those evaluated about the endemic equilibrium. Even if \(p_{0} = 1/500~\text {year}^{-1}\), there are discrepancies in the predictions near the transcritical bifurcation point. The autocorrelation of the fluctuations with a lag of 1 year is shown, and the variance shown is the fluctuation variance. All calculations used the parameter values in Table 7

SIR leading indicator predictions

Leading indicators for SIR systems approaching elimination (Fig. 8d–f) are not always consistent with the standard expectations of increasing variance, increasing autocorrelation, and increasing coefficient of variation. In addition, the predictions for the leading indicators of SIR disease elimination evaluated about the endemic equilibrium of the stable system behave differently to SIS leading indicators, due to the presence of resonant frequencies. As the threshold to elimination is approached, the lag-1 autocorrelation increases with vaccination uptake, indicating an increase in system memory, but in contrast to the SIS model, the fluctuation variance declines. We observe in Fig. 7b, d that the area under the power spectrum declines as vaccination uptake increases, particularly in the model with \(\eta > 0\), where the height of the peak of the power spectrum, which reflects amplification of fluctuations, declines close to the transition. The coefficient of variation also increases, indicating that infectious time series should become noisier as the transition is approached but does so more dramatically in the model without immigration, because the power spectrum peak height increases as the transcritical bifurcation is approached. To examine how different changes in vaccination uptake \(p_{0}\) affect the statistics, we varied \(p_{0}\) by \(1/500\) and \(1/30~\text {year}^{-1}\) in the fast–slow model with immigration. Predictions generated from the \(p_{0}=1/500\) model are in close agreement with predictions arising from the \(\eta > 0\) stable model. However, when vaccination uptake is increasing an order of magnitude faster (i.e., \(p_{0}=1/30\)), there can be temporary fluctuations, including increases, in the variance (Fig. 8e). Such a pattern is not predicted when we evaluate the statistics about the endemic equilibrium.

SIS predictions over a moving window

Generally, the analytical predictions for leading indicators for SIS disease systems approaching elimination are robust over a moving window, and they perform well in comparison to statistics generated from the null stable models, as indicated by the ROC curves in Fig. 10. The ROC curves lie above the black line, showing that the indicators behave better than chance in distinguishing between realizations that have been generated by the null and test models and the reported AUC values are close to 1. Furthermore, the trends in the statistics agree with the theoretically predicted trends (compare Fig. 10a, c, e with Fig. 8a–c). However, the Gaussian smoother may not be appropriate when the trend declines rapidly, as occurs for the SIS system approaching elimination. Various bandwidths were used, but while smaller bandwidths removed the slowly varying trend successfully, they did not capture the fluctuations relevant for critical slowing down. Larger bandwidths captured these fluctuations but did not successfully remove the slowly varying trend from the last 10 years of the time series, as can be seen in the plots. This problem with Gaussian filtering has been noted before by Dakos et al. (2008).

SIR predictions over a moving window

The theoretical predictions for leading indicators for SIR disease systems approaching elimination are robust over a moving window. The trends in the statistics shown in Figs. 11a, c, e, and 12a, c, e agree with the theoretical predictions (Fig. 8d–f). Notably, the median fluctuation variance for the system with \(p_{0} = 1/30\) also exhibits more variability, as predicted theoretically. The ROC analysis for the SIR models indicates that the statistics from the test models perform well in comparison to statistics generated from the null stable models, but the fluctuation variance for the system with \(p_{0} = 1/30\) is an exception. The performance is poor because the trend in the variance exhibits wider variation (Fig. 12c). Finally, the Gaussian filtering performs well in comparison to the van Kampen detrending, because the mean of the SIR model declines linearly and does not exhibit any rapid changes.

Early warning signals for emergence

SIS early warning signal predictions

As noted, the standard expectations for early warning signals of systems approaching a critical transition include increasing autocorrelation, increasing variance, and increasing coefficient of variation. Prior to emergence, the theoretical predictions for the SIS model generally agree with these standard expectations (Figs. 9a–c). For emerging SIS diseases with immigration, the lag-1 autocorrelation is predicted to increase, indicating an increase in memory in the system. The variance is also predicted to increase, as suggested by the increasing area under the power spectrum in Fig. 7c. However, the coefficient of variation is predicted to decline as \(p(t)\) decreases, indicating that the mean of the stable SIS system rises more rapidly than the variance. The rapid rise in the mean can be seen in Fig. 4a. However, the coefficient of variation is predicted to increase as the transition is approached if the statistic is evaluated about the solution of the fast–slow system. This is because the mean remains small long after the predicted critical transition (Fig. 4a). However, the limiting case theoretical predictions for the trends in the statistics for systems on the threshold of emergence do not always agree with the predictions for the corresponding fast–slow systems beyond the critical threshold for emergence. This finding results from the predicted mean given by fast–slow system being lower than that of the stable system in the \(\epsilon \rightarrow 0\) theory (c.f., Fig. 4a). Due to the rapid rise in the mean following the bifurcation delay, each statistic exhibits another increase following the transcritical bifurcation.

Fig. 9
figure 9

Early warning signals of emergence. Increasing transmission in the SIS model and increasing recruitment in the SIR model result from decreasing vaccination uptake \(p(t)\). Here, the limit in the legend refers to the \(\epsilon \rightarrow 0\) limit, and only predictions for the models with immigration (\(\eta >0\)) are shown. The statistics evaluated about \(\varphi ^{\ast }\) (limiting case \(\epsilon \rightarrow 0\)) are shown in red line (\(\eta > 0\)). The statistics evaluated about the current state of the mean field fast–slow system are shown in blue (dashed) and black lines. The gray dashed line indicates the location of the critical point when \(\eta =0\). In agreement with standard theory, the variance and autocorrelation increase in the SIR and SIS systems as the bifurcation is approached (the peaks in the variance and CV predictions have been cut off for clarity). To aid comparison with Fig. 14c, the inset plot of (e) shows a close-up of increasing variance prior to the transition in the SIR case. However, the \(\epsilon \rightarrow 0\) theory predicts the CV declines on the approach to emergence. The effect of the bifurcation delay is also seen in the statistics predicted by the fast–slow theory. An order of magnitude difference in the change in vaccination uptake \(p_{0}\) affects the predictions; e.g., the CV in the SIS model is predicted to rise before the transcritical bifurcation by the fast–slow theory. Increases in autocorrelation (a) and in variance (b) can occur in this model after the transcritical bifurcation. The autocorrelation of the fluctuations with a lag of 1 year (SIS) and 1 month (SIR) is shown, and the variance shown is the fluctuation variance. All calculations used the parameter values in Table 7; the SIR model calculations used the parameters in Table 7 scaled to rates per month

SIR early warning signal predictions

In subcritical SIR systems approaching endemicity, the theoretical predictions for the SIR model agree with standard expectations, except for the coefficient of variation, which is predicted to decline (Figs. 9d–f). The limiting case theoretical predictions for the trends in the statistics for SIR systems on the threshold of emergence agree more closely with the predictions for the fluctuations about the solution of the corresponding SIR fast–slow system, in contrast to the SIS model, because the bifurcation delay, although present, is not as marked as that in the SIS model. Figure 9d, e shows that the monthly lag-1 autocorrelation and the variance increase as the bifurcation is approached. It is not surprising that the variance increases because the power spectrum increases in power as the critical threshold for emergence is approached (Fig. 7d). However, the coefficient of variation is predicted to decline, i.e., the signal should become less noisy with decreasing vaccination uptake. In the context of the power spectrum result, this again makes sense because when the system transitions from subcritical to endemic, the power spectrum changes from a relatively flat shape to one with a peak about a resonant frequency. To examine how differences in vaccination uptake \(p_{0}\) affect the statistics, we varied \(p_{0}\) by \(1/500\) and \(1/30~\text {year}^{-1}\) in the fast–slow model. The statistics predicted by the fast–slow theory do not change as rapidly as predicted in the \(\epsilon \rightarrow 0\) theory. Finally, we did not calculate the early warning signals for the disease systems without immigration because the disease-free state is stable.

Predictions over a moving window

In contrast to disease systems approaching elimination, the early warning signals for diseases on the verge of emergence do not perform well. No marked trends are present in the median statistics calculated from either of the van Kampen or Gaussian detrending approaches, with the exception that the variance of the SIS and SIR systems approaching emergence (Figs. 13c, d and 14c, d) increases slightly. The 95 % prediction intervals for each statistic further show that all the statistics are highly variable. Moreover, the ROC curves indicate that it may, in general, be difficult to distinguish between subcritical systems with \(R_{0}<1\) and endemic disease systems with \(R_{0}>1\) (Figs. 13b, d, f and 14b, d, f). The ROC curves show that the distributions of Kendall’s correlation coefficient generated from stable systems and systems approaching a critical transition overlap significantly. However, we are conservative in our approach because we take only the years up to the critical transition in the absence of immigration, and not the years after, when a bifurcation delay may occur, during which time prevalence remains low and so the mean prevalence may not increase for a long time after the transition. Therefore, it is not surprising that it is difficult to distinguish between the subcritical and supercritical systems.

Discussion

Summary

The transmission of infectious diseases is a paradigm example of a low-dimensional, noisy, nonlinear dynamical process. Critical transitions in infectious disease systems, which correspond to such socially important events as the emergence of novel pathogens and the elimination of disease, are of special interest. The development of early warning signals of emerging infectious diseases and leading indicators for disease elimination, particularly, would be of tremendous value to the advance of public health. The goal of our study was to develop the theory of such early warning signals and leading indicators for infectious disease transmission systems that meet the assumptions of the familiar SIS and SIR models and which are forced through a critical transition by changes in transmission. Our main result—analytical expressions for the change in observable statistics as the transmission system approaches the critical transition—are reported in Table 4.

Because this theory depends on a sequence of approximations, we investigated the robustness of our results in a sequence of simulations. For the SIS model, we found that the approach to elimination (Fig. 10) was indicated by an increase in the autocorrelation, variance, and the coefficient of variation, as predicted by the theory. For the SIR model, the approach to elimination (Figs. 11 and 12) was indicated by an increase in the autocorrelation, a decrease in the variance, and an increase in the coefficient of variation, as predicted by the theory. For the SIS and SIR models, the approach to emergence was indicated by an increase in variance, but the effects on the autocorrelation and the coefficient of variation were imperceptible (Figs. 13 and 14). Further, since the theoretical patterns may be difficult to distinguish from random noise or fail to provide a sufficiently reliable signal under realistic conditions for data collection, we also examined the suitability of these statistics as an online algorithm for early warning. Following Boettiger and Hastings (2012), we summarized the results of online analysis using the receiver operator characteristic. These results showed that the approach to elimination in both the SIS model (Fig. 10) and the SIR model (Figs. 11 and 12) could be detected with a high level of reliability (large AUC), although variance was a much poorer indicator in the SIR model than autocorrelation and the coefficient of variation. However, the approach to emergence was very difficult to detect in both the SIS model and the SIR model, with estimated AUC values hovering just above the null value of 0.5 (Figs. 13 and 14). In summary, our simulation studies show the approximations required to obtain the theoretical predictions to be acceptable but indicate that reliable prediction is likely only in the case of leading indicators for elimination, not early warning signals of emergence.

Fig. 10
figure 10

Performance of the statistics over a moving window for the SIS system approaching elimination, assuming that immigration occurs (\(\eta >0\)). Panels a, c, and e show the median statistics (thick lines) and 95 % prediction intervals (shaded regions). The lag-1 autocorrelation and variance have been calculated from the fluctuations obtained from van Kampen and Gaussian detrending. The coefficient of variation (CV) is marked in dashed green line to indicate that it was calculated from the raw time series, not the deviations from the mean. The dashed vertical line marks the time of the transcritical bifurcation in the \(\epsilon \rightarrow 0\) limiting case with \(\eta =0\). Panels b, d, and f show the corresponding ROC curves. The AUC value indicates the area under the corresponding ROC curve. All curves are above the black line, showing that the indicators behave better than chance in distinguishing between realizations that have been generated by the null and test models. However, the Gaussian filtering may not be an appropriate method to obtain the fluctuations, because it does not remove the slowly varying trend close to the transition, as exhibited by the rapid rise in the indicators. All calculations used the parameter values in Table 7. A bandwidth of 20 years was chosen for the Gaussian filtering

Fig. 11
figure 11

Performance of the statistics over a moving window for the SIR system approaching elimination, assuming that immigration occurs (\(\eta >0\)). Panels a, c, and e show the median statistics (thick lines) and 95 % prediction intervals (shaded regions). The lag-1 autocorrelation and variance have been calculated from the fluctuations obtained from van Kampen and Gaussian detrending. The coefficient of variation is marked in green line to indicate that it was calculated from the raw time series, not deviations from the mean. The dashed vertical line marks the time of the transcritical bifurcation in the \(\epsilon \rightarrow 0\) limiting case with \(\eta =0\). Panels b, d, and f show the ROC curves. The AUC value indicates the area under the corresponding ROC curve. All curves are above the black line, showing that the indicators behave better than chance in distinguishing between realizations that have been generated by the null and test models. A bandwidth of 5 years was chosen for the Gaussian filtering. All calculations used the parameter values in Table 7

Fig. 12
figure 12

Performance of the statistics over a moving window for the SIR system rapidly approaching elimination, assuming that immigration occurs (\(\eta >0\)). Panels a, c, and e show the median statistics (thick lines) and 95 % prediction intervals (shaded regions). The lag-1 autocorrelation and variance have been calculated from the fluctuations obtained from the van Kampen and Gaussian detrending. The coefficient of variation (CV) is marked in dashed green line to indicate that it was calculated from the raw time series, not deviations from the mean. The dashed vertical line marks the time of the transcritical bifurcation in the \(\epsilon \rightarrow 0\) limiting case with \(\eta =0\). Panels b, d, and f show the ROC curves. The AUC value indicates the area under the corresponding ROC curve. All curves are above the black line, showing that the indicators behave better than chance in distinguishing between realizations that have been generated by the null and test models. However, the variance performs poorly compared with the lag-1 autocorrelation and CV. A bandwidth of 5 years was chosen for the Gaussian filtering. All calculations used the parameter values in Table 7 except for \(p_{0} = 1/30~\text {year}^{-1}\)

Fig. 13
figure 13

Performance of the statistics over a moving window for the SIS system approaching endemicity, assuming that immigration occurs (\(\eta >0\)). Panels a, c, and e show the median statistics (thick lines) and 95 % prediction intervals (shaded regions). The lag-1 autocorrelation and variance have been calculated from the fluctuations obtained from the van Kampen detrending and Gaussian filtering. The coefficient of variation is marked in green (dashed) line to indicate that it was calculated from the raw time series, not deviations from the mean. The dashed vertical line marks the time of the transcritical bifurcation. Panels b, d, and f shows the ROC curves. The AUC value indicates the area under the corresponding ROC curve. There are no marked trends in the median statistics, although the variance exhibits a slight increasing trend. The ROC curves show that the distributions of Kendall’s \(\tau \) overlap greatly, suggesting that it is difficult to distinguish between a quasi-stationary system and one approaching an emergence threshold. Due to the bifurcation delay, it is not surprising that it is difficult to distinguish between null and test replicates. A bandwidth of 10 years was selected for the Gaussian filtering. All calculations used the parameter values in Table 7

Fig. 14
figure 14

Performance of the statistics over a moving window for the SIR system approaching emergence, assuming that immigration occurs (\(\eta >0\)). Panels a, c, and e show the median statistics (thick lines) and 95 % prediction intervals (shaded regions). The lag-1 autocorrelation and variance have been calculated from the fluctuations obtained from van Kampen and Gaussian detrending. The coefficient of variation is marked in green (dashed) line to indicate that it was calculated from the raw time series, not deviations from the mean. The dashed vertical line marks the time of the transcritical bifurcation. Panels b, d, and f shows the ROC curves. The AUC value indicates the area under the corresponding ROC curve. None of median statistics exhibit a marked trend. In addition, the ROC curves indicate that the distributions of Kendall’s \(\tau \) overlap greatly, suggesting that it is difficult to reliably distinguish between a stable system and one undergoing a critical transition using the lag-1 autocorrelation coefficient. The coefficient of variation also performs poorly, as predicted theoretically. Due to the bifurcation delay, it is not surprising that it is difficult to distinguish between null and test replicates. A bandwidth of 20 months was selected for the Gaussian filtering. All calculations used the parameter values in Table 7

Bifurcation delays

Besides the practical goal of indicating patterns that function as early warning signals and leading indicators, our study also provides some basic insights into the dynamics of infectious diseases. Particularly, our simulations illustrated the ubiquity of bifurcation delays—changes in system state that lag behind the bifurcation of the limiting case—during disease emergence and elimination. Delays are of fundamental interest because they provide a mechanism for the frequent realization of far-from-equilibrium situations and because they are a dramatic example of tipping point phenomena in infectious diseases. Indeed, delays can cause a system to appear to undergo a catastrophic shift (e.g., the dramatic upturn of infectious cases associated with emergence) when, in fact, only a noncatastrophic transcritical bifurcation has occurred. Our results show that two kinds of delays occur in disease emergence and elimination. The first delay is a deterministic phenomenon referred to as a canard (Diener 1984). Canards occur in fast–slow systems where the solution follows an attracting slow manifold of the fast–slow system, passing close to a bifurcation point, and then follows a repelling slow manifold for a considerable period of time. Thus, for instance, in Fig. 3a, which depicts the emergence of an SIS infection, the \(\epsilon \rightarrow 0\) limit case branch rapidly increases close to the \(\eta = 0\) critical point, but the solution of the fast–slow system (\(\epsilon >0\)) tracks the disease-free equilibrium for a noticeable amount of time even after it has become unstable. The influence that the canard has on the stochastic dynamics can be observed in Fig. 3b, where the number of infectious individuals remains in the vicinity of the precritical level for a noticeable period of time. A related phenomenon, but not so dramatic, is observed in the elimination case (Fig. 2a). Following the bifurcation, the infectious population remains elevated, but unlike the emergence case, no dramatic shift to extinction occurs. Analogous phenomena are observed in the SIR model (Figs. 5 and 6).

The second class of delays is a stochastic phenomenon, caused by the nonzero time interval that occurs between the time at which the critical point is reached and the time of the event that initiates the transition (i.e., the index case of a major outbreak). This phenomenon is most obvious in the emergence of an immunizing pathogen (the emergence scenario for the SIR model). In this case, the number of infectious individuals in the population prior to the critical transition is most commonly zero. Assuming that there are no infectious individuals in the population at the time of the bifurcation event, then the number of infections in the population will remain zero until the time that an imported case arises. In view of the assumption of demographic stochasticity, importation is a Poisson process with nonzero rate parameter, which similarly ensures that a nonzero period of time must elapse before the first case appears. Indeed, even after the first case appears, there is no guarantee that the system will undergo transition, since even in a supercritical population, there is some chance of rapid extinction (Allen 2003). When the system is only slightly supercritical, this chance may not be small. For measurement purposes, we might define this stochastic bifurcation delay as the time elapsed between the occurrence of the bifurcation and the final occurrence of zero infectious individuals in the population. Clearly, this quantity is a random variable. The distributional properties of this stochastic bifurcation delay are an important problem for further research.

Delays are also important for the immediate practical reason that their existence implies that the state of the system, i.e., the presence or absence of disease, cannot be taken as reliable evidence concerning whether the system is subcritical or supercritical. In the case of disease emergence, a system may have developed to the supercritical state—a significant concern for public health—although no cases have occurred. In such a scenario, any introduction of infectious individuals may spark a major outbreak. In the case of disease elimination, a system may already exist in the subcritical state, with only lingering infections and short transmission chains enabling the pathogen to transiently persist. In such cases, the battle has already been won, but it is crucial that vaccination campaigns and other elimination activities be continued until the remaining lines of transmission are snuffed out. Otherwise, gains that may have come at great cost will fail to be maintained, and the pathogen may resurge in the population, sparked by its own embers still smoldering in the face of relaxed intervention. The development of statistical methods to identify such situations is an urgent problem for further research.

Limiting case predictions can sometimes be misleading

Predictions for the statistics derived from the limiting case (\(\eta = 0\)) version of the SIR model can sometimes be misleading. As the extinction threshold is approached, the predicted SIR (\(\eta >0\)) lag-1 autocorrelation increases as expected, but this phenomenon gives way to a decrease close to the \(\eta = 0\) critical point (Fig. 8d). This finding is not so surprising when one recalls that the single biologically feasible equilibrium in the \(\eta >0\) model changes from a stable spiral to a stable node near the critical point as vaccination is increased. The magnitude of the real part of the eigenvalue, which approximates the rate of decay of the amplitude of the oscillations to the equilibrium, decreases as elimination is approached, but, near the threshold, the magnitude of the real part begins to increase once more, indicating a shorter return time to equilibrium and increased resistance of the system to perturbations, as the spiral begins to resemble a node. This subsequent decline in system memory is reflected in the decrease in the autocorrelation function near the critical point. The more rapid decay of the oscillations is also reflected in the power spectrum, in that the the spectrum peak declines as vaccination uptake increases, meaning that the resonant frequencies become less amplified. On the other hand, the recovery rate of the stable spiral in the limiting case decreases much more dramatically on the transition to elimination, and oscillations at the resonant frequencies become more amplified. Thus, while the limiting case is an important guide in understanding more general disease models, it is also necessary to thoroughly investigate these models to understand the behavior of systems approaching a critical transition.

Limitations

This study has followed Kuehn (2011) and Boettiger and Hastings (2012) in the use of stochastic differential equations to understand noise-induced phenomena that occur during a critical transition. It is understood, of course, that for discrete systems (such as infectious disease transmission), this continuum representation is only an approximation, and, moreover, that the approximation breaks down as the system size—or even if the size of just one important compartment—becomes small. The case of disease emergence, where the number of infectious individuals in a population prior to the critical transition is typically zero and only occasionally one or more, is precisely this situation. It would not be surprising, therefore, to discover the approach taken in this paper to perform poorly in this situation. But, this is not the case. As Figs. 13 and 14 show, in both the SIS and SIR systems, the “measured” statistics obtained in simulation are qualitatively similar to the analytic expressions reported in Table 4, with the exception of the coefficient of variation in the SIR model. We note, further, that although emergence was difficult to predict (AUC values were not much greater than 0.5), the van Kampen detrending, which we think of as the theory-dependent approach, typically performed better than generic Gaussian detrending, which may be computed without any theoretical assumptions. It is, nevertheless, a concern that the approach taken here will be misleading under some scenarios: for instance, a very slow approach to the critical transition, when the time between immigration events may be large and the van Kampen approximation is likely to break down. Moreover, the stochastic features of bifurcation delay, particularly important in the emergence scenarios, depend heavily on (a) the sequence of immigration events by infectious individuals, and (b) the probability of nonextinction for a transmission chain initiated by an infectious individual after the critical point has been passed. These are both intrinsically discrete aspects of the emergence process. For this reason, we suggest that the study of (nonstationary) discrete transmission models may be a fruitful direction for further study. There is, of course, a large body of work on (stationary) discrete contagion models on which such a theory could build (Bailey 1964; Renshaw 1991; Daley and Gani 1999; Allen 2003).

Conclusion

In conclusion, early warning systems for emerging infectious diseases and leading indicators of elimination, which would be of tremendous benefit to society, now appear to be potentially achievable. Their realization depends on two key developments: (1) better understanding of noise-induced phenomena exhibited by nonlinear contagion processes in the vicinity of their critical points, and (2) surveillance methods that acquire data of sufficiently high accuracy, resolution, and timeliness. Toward the first of these goals, our studies have indicated that anticipation of critical transitions is theoretically possible. It is hoped that these results, which include expressions for the key observable phenomena (Table 4), may help to guide achievement of the second, which is now the most important outstanding problem, i.e., empirical demonstration in experimental or surveillance data.