Mathematical Models in Infectious Disease Epidemiology
The idea that transmission and spread of infectious diseases follow laws that can be formulated in mathematical language is old. In 1766 Daniel Bernoulli published an article in which he described the effects of smallpox variolation (a precursor of vaccination) on life expectancy using mathematical life table analysis (Dietz and Heesterbeek 2000). However, it was only in the twentieth century that the nonlinear dynamics of infectious disease transmission were really understood. At the beginning of that century there was much discussion about why an epidemic ended before all susceptibles were infected, with hypotheses about changing virulence of the pathogen during the epidemic. Hamer (1906) was one of the first to recognize that the diminishing density of susceptible persons alone could bring the epidemic to a halt. Sir Ronald Ross, who received the Nobel prize in 1902 for elucidating the life cycle of the malaria parasite, used mathematical modeling to investigate the effectiveness of various intervention strategies for malaria.
Beginning in 1927, Kermack and McKendrick published a series of papers in which they described the dynamics of disease transmission in terms of a system of differential equations (Kermack and McKendrick 1991a; Kermack and McKendrick 1991b; Kermack and McKendrick 1991c). They pioneered the concept of a threshold quantity that separates different dynamic regimes. Only if the so-called basic reproduction number is above a threshold value can an infectious disease spread in a susceptible population. In the context of vaccination this leads to the concept of herd immunity, stating that it is not necessary to vaccinate the entire population to eliminate an infectious disease. This theory proved its value during the eradication of smallpox in the 1970s. Vaccination coverage of around 80% worldwide in combination with ring vaccination was sufficient for eradication of this virus.
Only towards the end of the twentieth century did mathematical modeling come into more widespread use for public health policy making. Modeling approaches were increasingly used during the first two decades of the AIDS pandemic for predicting the further course of the epidemic and for trying to identify the most effective prevention strategies. But the real impact of mathematical modeling on public health came with the need for evaluating intervention strategies for newly emerging and re-emerging pathogens. In the first instance it was the fear of a bioterrorist attack with smallpox virus that sparked off the use of mathematical modeling to combine historical data from smallpox outbreaks with questions about vaccination in modern societies (Ferguson et al. 2003). Later the outbreak of the SARS virus as a newly emerging pathogen initiated the use of mathematical modeling for analyzing infectious disease outbreak data in real time to assess the effectiveness of intervention measures (Wallinga and Teunis 2004).
Analysis of historical data about pandemic outbreaks of influenza A has led to the important insight that the basic reproduction number of influenza has been low in historical outbreaks, but the serial interval is short (Mills et al. 2004). This implies that in principle an outbreak of influenza can be stopped with moderate levels of intervention, but measures have to be taken very rapidly in order to be effective. In contrast, for an infection such as measles with a high basic reproduction number, very high levels of vaccination coverage are needed for elimination. Such insights gained from mathematical analysis are extremely helpful for designing appropriate intervention policy and for the evaluation of existing interventions.
12.2 Basic Concepts in Mathematical Modeling
The central idea about transmission models, as opposed to statistical models, is a mechanistic description of the transmission of infection between two individuals. This mechanistic description makes it possible to describe the time evolution of an epidemic in mathematical terms and in this way connect the individual level process of transmission with a population level description of incidence and prevalence of an infectious disease. The rigorous mathematical way of formulating these dependencies leads to the necessity of analyzing all dynamic processes that contribute to disease transmission in much detail. Therefore, developing a mathematical model helps to focus thoughts on the essential processes involved in shaping the epidemiology of an infectious disease and to reveal the parameters that are most influential and amenable for control. Mathematical modeling is then also integrative in combining knowledge from very different disciplines like microbiology, social sciences, and clinical sciences.
In the illustration of Fig. 12.2, the number of new infections increases in the first generation by a factor equal to the reproduction number R. The number of available susceptible individuals is depleted in the course of the epidemic. When the last infected person fails to contact any susceptible person, the epidemic dies out.
This provides us with a simple and robust relation that indicates what would happen if a new infection were to hit a completely susceptible population: if the new infection is like influenza, with a reproduction number of about R = 1.5, we expect that more than half of the population will be infected; and if the new infection is like smallpox, with a reproduction number of about R = 5, we expect that almost the entire population will be infected during an epidemic without interventions.
The timelines determine another epidemiological key quantity, the generation time T. This generation time is defined as the typical duration between the time of infection of a source and the time of infection of its secondary case(s). For influenza, the generation time is in the order of T = 3 days. For smallpox, the generation time is in the order of T = 20 days.
The chain reaction nature of the epidemic process leads to exponential growth in real (calendar) time during the initial phase of the epidemic, once the number of infected individuals has become large enough to avoid chance events that lead to an early extinction of the epidemic. The exponential growth rate r is determined by the precise timelines of infection. There is a lower limit to the growth rate r that is set by both the reproduction number R and the generation time T (specifically, r > ln(R)/T).
To illustrate the strength of this basic approach to epidemic modeling, we use it to assess the impact of border closure on epidemic spread. The number of infected persons that will try to cross the border from an infected country into a country that is not yet infected will increase exponentially with a growth rate r. Closing the borders will stop most infected persons, but a proportion p might slip through. Therefore, closing the borders will reduce the exponentially growing number of imported cases by a factor p. Because the number of imported cases keeps growing at rate r, this reduction corresponds to a delay of (–ln p)/r in reaching any given number of imported cases; with r > ln(R)/T, the delay is at most (–ln p / ln R) T. Therefore, border closure will only postpone the import of cases for a few generations of infection. For example, if closure were to reduce the number of infected travelers crossing the border to 1% of those who would ordinarily have crossed, the introduction of an influenza epidemic may be delayed by about a month, and the introduction of a smallpox epidemic may be delayed by about 2 months.
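As a rough numerical check of the border closure argument, the following Python sketch evaluates the bound ln(R)/T on the growth rate and the corresponding maximum delay (–ln p / ln R) T for the influenza-like and smallpox-like values quoted above (R = 1.5, T = 3 days and R = 5, T = 20 days). The function names are chosen here purely for illustration.

```python
import math

def min_growth_rate(R, T):
    """Lower bound ln(R)/T on the exponential growth rate r (per day)."""
    return math.log(R) / T

def max_delay(p, R, T):
    """Upper bound (-ln p / ln R) * T on the delay (in days) gained when only
    a proportion p of infected travelers still crosses the border."""
    return -math.log(p) / math.log(R) * T

for name, R, T in [("influenza-like", 1.5, 3.0), ("smallpox-like", 5.0, 20.0)]:
    print(f"{name}: r > {min_growth_rate(R, T):.3f} per day, "
          f"delay at most {max_delay(0.01, R, T):.0f} days")
```

With p = 0.01 the bound comes out at roughly 34 days for the influenza-like values and roughly 57 days for the smallpox-like values, in line with the "about a month" and "about 2 months" stated in the text.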
The key epidemiological variables that characterize spread of infection are the generation time T and the reproduction number R. If a novel infection starts spreading, such as SARS in 2003, these key variables are unknown. But even if an outbreak of a more familiar infection occurs, such as norovirus, we might be groping in the dark about the precise values of these key variables. Yet, if modeling is to be helpful in infectious disease control, it is crucial to have the best possible estimates for the generation time and the reproduction number, along with other quantities such as the incubation time and hospitalization rate. Estimation would be easy if we had perfect information about the outbreak. If we knew exactly who had infected whom, and precisely who was infected when, we could simply measure the duration of each time interval from infection of a case back to time of infection of its source, and the distribution of the length of these time intervals would inform us about the generation interval. Similarly, we could simply count for each infected individual how many others were infected by this individual, and the distribution of such counts would inform us about the reproduction number. Of course, in the real world such information is not available and we have to deal with incomplete observations, proxy measures, and reporting delays. But real-time estimating procedures have been proposed that attempt to reconstruct the likely patterns of who infected whom, and who was infected when, from the incomplete data and proxy measures, using standard statistical techniques for dealing with missing data and censoring (Wallinga and Teunis 2004; Cauchemez et al. 2006). The main message is that during an outbreak it is important to collect data on cases (time of symptom onset) and about the relation between cases (existence of an epidemiological link). The more accurate these data are, the more useful they are for estimating the key model ingredients, the generation time T and the reproduction number R, and the more helpful they can be in predicting the likely future course of the epidemic without intervention and the required control effort to curb the epidemic.
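The idealized "perfect information" case described above can be sketched in a few lines of Python. The line list below is entirely hypothetical toy data, and the function names are illustrative; real outbreak data would require the missing-data and censoring methods cited in the text.

```python
from statistics import mean

# Hypothetical line list: case id -> (time of infection in days, id of infector or None).
cases = {
    "A": (0.0, None),
    "B": (2.5, "A"),
    "C": (3.0, "A"),
    "D": (5.5, "B"),
    "E": (6.5, "C"),
    "F": (7.0, "C"),
}

def generation_intervals(cases):
    """Time from infection of each source to infection of its secondary case."""
    return [t - cases[src][0] for t, src in cases.values() if src is not None]

def secondary_case_counts(cases):
    """Number of cases directly infected by each case (zero if none)."""
    counts = {cid: 0 for cid in cases}
    for _, src in cases.values():
        if src is not None:
            counts[src] += 1
    return counts

intervals = generation_intervals(cases)
counts = secondary_case_counts(cases)
print("mean generation interval (days):", mean(intervals))
print("mean secondary cases per case:", mean(list(counts.values())))
```

Even in this toy setting the naive average number of secondary cases is biased downward, because the most recently infected cases have not yet had time to infect others. This is one reason why methods that account for censoring are needed in practice.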
We will now derive some basic principles using this model as an example.
12.3 Basic Concepts: Reproduction Number, Final Size, Endemic Steady State, and Critical Vaccination Coverage
The most important concepts of epidemic models can be demonstrated using the SIR model. Let us first consider an infectious disease which spreads on a much faster time scale than the demographic process. Then, on the scale of disease transmission the birth rate ν and the death rate μ can be considered to be close to zero. When can the prevalence in the population increase? An increase in prevalence is equivalent to dI/dt > 0, which means that βSI/N > γI. This leads to βS/N > γ or equivalently to βS/(γN) > 1. In the situation that all individuals of the population are susceptible we have S = N; this means that an infectious disease can spread in a completely susceptible population if β/γ > 1. The quantity R0 = β/γ is also known as the basic reproduction number and can in principle be determined for every infectious disease model and can be estimated for every infectious disease. In biological terms the basic reproduction number describes the number of secondary infections produced by one index case in a completely susceptible population during its entire infectious period (Diekmann et al. 1990; Diekmann and Heesterbeek 2000). The effective reproduction number R, as mentioned in Section 12.2, describes the number of secondary cases per index case in a situation where intervention measures are applied or where a part of the population has already been infected and is now immune.
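The threshold behavior can be illustrated with a minimal numerical sketch. It assumes the standard SIR equations without births or deaths, dS/dt = –βSI/N and dI/dt = βSI/N – γI, which are consistent with the inequality used above. The parameter values, the forward-Euler scheme, and the function name are assumptions made here for illustration.

```python
def simulate_sir(beta, gamma, N=1000.0, I0=1.0, days=200.0, dt=0.01):
    """Forward-Euler integration of the SIR model without births or deaths.
    Returns the list of prevalences I(t) and the final number susceptible."""
    S, I = N - I0, I0
    prevalence = [I]
    for _ in range(int(days / dt)):
        new_infections = beta * S * I / N * dt
        recoveries = gamma * I * dt
        S -= new_infections
        I += new_infections - recoveries
        prevalence.append(I)
    return prevalence, S

gamma = 0.2  # assumed recovery rate per day (mean infectious period of 5 days)
for R0 in (0.8, 1.5):
    prevalence, S_end = simulate_sir(beta=R0 * gamma, gamma=gamma)
    grows = max(prevalence) > prevalence[0]
    print(f"R0 = {R0}: prevalence rises above its initial value: {grows}; "
          f"fraction still susceptible at the end: {S_end / 1000.0:.2f}")
```

In this sketch the prevalence only rises above its initial value when β/γ exceeds 1, and even then a sizeable fraction of the population remains susceptible once the outbreak has run its course.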
If R0 > 1 the infection can spread in the population, because on average every infected individual is replaced by more than one new infected person. However, this process can only continue as long as there are sufficiently many susceptible individuals available. Once a larger fraction of the population has gone through the infection and has become immune, the probability of an infected person to meet a susceptible person decreases and with it the average number of secondary cases produced. If, as we assumed above, there is no birth into the population, no new susceptible individuals are coming in and the epidemic outbreak will invariably end. Analysis of the model shows, however, that the final size of the outbreak will never encompass the entire population, but there will always be a fraction of susceptible individuals left over after the outbreak has subsided. It can be shown that the final size A (attack rate in epidemiological terms) is related to the basic reproduction number by the implicit formula A = 1 – exp(–R0 A). In other words, if the basic reproduction number of an infectious disease is known, the attack rate in a completely susceptible population can be derived.
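Because the final size formula is implicit, the attack rate has to be found numerically. The sketch below uses simple fixed-point iteration, which is one of several possible choices (bisection or Newton's method would also work); the function name and tolerances are illustrative.

```python
import math

def attack_rate(R0, tol=1e-12, max_iter=10_000):
    """Solve A = 1 - exp(-R0 * A) for the nonzero root by fixed-point iteration.
    For R0 <= 1 the only root in [0, 1] is A = 0."""
    if R0 <= 1.0:
        return 0.0
    A = 1.0  # start from the upper end so the iteration approaches the positive root
    for _ in range(max_iter):
        A_next = 1.0 - math.exp(-R0 * A)
        if abs(A_next - A) < tol:
            return A_next
        A = A_next
    return A

for R0 in (1.5, 5.0):
    print(f"R0 = {R0}: attack rate A = {attack_rate(R0):.3f}")
```

For R0 = 1.5 this gives an attack rate of about 0.58, and for R0 = 5 about 0.99, matching the statements that more than half of the population would be infected in the influenza-like case and almost the entire population in the smallpox-like case.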
Note that the fraction of susceptible individuals S*/N* in the endemic steady state is independent of the vaccination coverage p. On the other hand, the prevalence of infection I*/N* depends on p: the prevalence decreases linearly with increasing vaccination coverage until the point of elimination is reached. This means we can compute the critical vaccination coverage pc, i.e., the threshold coverage needed for elimination, from 0 = 1 – 1/R0 – pc as pc = 1 – 1/R0. As we would expect intuitively, the larger the basic reproduction number, the higher the fraction of the population that has to be vaccinated in order to eliminate an infection from the population. However, it also follows that elimination can be reached without vaccinating everybody in the population. The reason is that with an increasing density of immune persons in the population, the risk for those who are not yet vaccinated to be exposed decreases. This effect, the indirect protection of susceptible individuals by increasing levels of immunity in the population, is known as herd immunity. Besides the positive effect of decreasing the risk of infection for non-vaccinated persons, herd immunity has the sometimes adverse effect of increasing the mean age at first infection in the population. This can lead to an increased incidence of adverse events following infection, if the coverage of vaccination is not sufficiently high.
For an infection such as smallpox with an estimated basic reproduction number of around 5, a coverage of 80% is needed for elimination, while for measles with a reproduction number of around 20 the coverage has to be at least 95%. This provides one explanation for the fact that it was possible to eradicate smallpox in the 1970s whereas we are still a long way from measles eradication. There are some countries, however, that have been successful in eliminating measles based on a consistently high vaccination coverage (Peltola et al. 1997).
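The relation pc = 1 – 1/R0 is easy to evaluate directly. The short sketch below assumes, as the simple formula does, that vaccination confers complete immunity; the function name is illustrative.

```python
def critical_coverage(R0):
    """Critical vaccination coverage pc = 1 - 1/R0 (zero if R0 <= 1)."""
    return max(0.0, 1.0 - 1.0 / R0)

for R0 in (1.5, 5.0, 20.0):
    print(f"R0 = {R0}: pc = {critical_coverage(R0):.2f}")
```

For R0 = 5 the formula gives a coverage of 0.80 and for R0 = 20 a coverage of 0.95, which shows how quickly the required coverage approaches 100% as the basic reproduction number grows.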
12.3.1 Advanced Models
Building on the basic ideas of the SIR framework, numerous types of mathematical models have since been developed, all incorporating more structure and details of the transmission process and infectious disease dynamics.
12.3.1.1 More Complex Compartmental Models
A first obvious extension is the inclusion of more disease-specific details into a model. Compartments describing a latent period, the vaccinated population, chronic and acute stages of infection, and many more have been described in the literature (Anderson and May 1991). Another important refinement of compartmental models is to incorporate heterogeneity of the population into the model, for example, by distinguishing between population subgroups with different behaviors, population subgroups with differences in susceptibility, or geographically distinct populations. Heterogeneity in behavior was first introduced into models describing the spread of sexually transmitted infections by Hethcote and Yorke (1994). Later, during the first decade of the HIV/AIDS pandemic, models were proposed that were able to describe population heterogeneity in sexual activity and mixing patterns between population subgroups of various sexual activity levels (Koopman et al. 1988). Models of this type are used frequently for assessing the effects of intervention on the spread of sexually transmitted infections. Age structure has also been modeled as a series of compartments with individuals passing from one compartment to the next according to an aging rate, but this requires a large number of additional compartments to be added to the model structure. This also shows the limitation of compartmental models: with increasing structure of the population the number of compartments increases rapidly and with it the necessity to define and parameterize the mixing between all the population subgroups in the model. The theory of how to define and compute the basic reproduction number in heterogeneous populations was developed by Diekmann et al. (1990). Geographically distinct population groups with interaction among each other have been investigated using the framework of meta-populations for analyzing the dynamics of childhood infections (Rohani et al. 1999).
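In the next-generation approach associated with Diekmann et al. (1990), the basic reproduction number of a model with several population subgroups is the dominant eigenvalue of a next-generation matrix. The sketch below computes it by power iteration for a two-group example. The matrix entries are hypothetical values chosen for illustration, not estimates for any real infection.

```python
def dominant_eigenvalue(K, iterations=1000):
    """Power iteration for the largest eigenvalue of a positive square matrix K."""
    n = len(K)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)              # growth factor when v is scaled to maximum entry 1
        v = [x / lam for x in w]
    return lam

# Hypothetical next-generation matrix: K[i][j] is the expected number of new cases
# in group i caused by one case in group j (group 0 = high activity group).
K = [[3.0, 0.5],
     [0.5, 0.5]]
print(f"R0 = {dominant_eigenvalue(K):.3f}")
```

In this example the resulting R0 is slightly larger than the within-group value of 3 for the high activity group, because cases in the low activity group also feed infections back into it.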
12.3.1.2 Models with Continuous Age Structure
Age structure can best be described as a continuous variable, where age progresses with time. Mathematically this leads to models in the form of partial differential equations, where all variables of the model depend on time and age (Diekmann and Heesterbeek 2000). Analytically, partial differential equations are more difficult to handle than ordinary differential equations, but numerically solving an age-structured system of model equations is straightforward.
12.3.1.3 Stochastic Transmission Models
In a deterministic model based on a system of differential equations it is implicitly assumed that the numbers in the various compartments are sufficiently large such that stochastic effects can be neglected. In reality this is not always the case. For example, when analyzing epidemic outbreaks in small populations such as schools or small villages, typical stochastic events can occur such as extinction of the infection from the population or large stochastic fluctuations in the final size of the epidemic. In contrast to deterministic models, stochastic models are formulated in terms of integers with probabilities describing the transitions between states. This means that outcomes are given in terms of probability distributions such as the final size distribution. Questions of stochastic influences on infectious disease dynamics have been studied in various ways, starting with the Reed–Frost model for a discrete time transmission of infection up to a stochastic version of the SIR model introduced above (Bailey 1975; Becker 1989). Finally, stochastic models have been investigated using simulation techniques also known as Monte Carlo simulations. An important theoretical result from the analysis of stochastic models is the distinction between minor and major outbreaks for infectious diseases with R0 > 1. While in a deterministic model an R0 larger than unity always leads to an outbreak if the infection is introduced into an entirely susceptible population, in a stochastic model a certain fraction of introductions remain minor outbreaks with only a few secondary infections. This leads to a bimodal probability distribution of the final epidemic size following the introduction of one infectious index case. The peak for small outbreak sizes describes the situation that the infection dies out after only a few secondary infections; the peak for large outbreak sizes describes those outbreaks that take off and affect a large part of the population. The larger the basic reproduction number, the larger the fraction of major outbreaks in the susceptible population (Andersson and Britton 2000).
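The distinction between minor and major outbreaks can be made visible with a small Monte Carlo sketch of a stochastic SIR model. It assumes homogeneous mixing and an exponentially distributed infectious period. The population size, R0, number of runs, random seed, and the cut-off of 20 cases for a "minor" outbreak are all arbitrary choices made here for illustration.

```python
import random

def final_size(R0, N, rng):
    """Simulate one stochastic SIR outbreak started by a single index case.
    Only the order of events matters for the final size, so each step picks
    'infection' or 'recovery' with probability proportional to its rate."""
    S, I = N - 1, 1
    while I > 0:
        infection_rate = R0 * S / N   # per infected person, in units of the recovery rate
        if rng.random() < infection_rate / (infection_rate + 1.0):
            S -= 1
            I += 1
        else:
            I -= 1
    return N - S  # number ever infected, including the index case

rng = random.Random(1)
N, R0, runs = 200, 2.0, 1000
sizes = [final_size(R0, N, rng) for _ in range(runs)]
minor = sum(size < 20 for size in sizes) / runs
print(f"fraction of minor outbreaks (< 20 cases): {minor:.2f}")
```

A histogram of `sizes` would show the two peaks described above. For this particular model the fraction of minor outbreaks is expected to be close to 1/R0, so it shrinks as the basic reproduction number increases.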
12.3.1.4 Network Models
Some aspects of contact between individuals cannot easily be modeled in compartmental models. In the context of the spread of sexually transmitted diseases, models were developed that take the duration of partnerships into account, the so-called pair formation models (Hadeler et al. 1988). Extending those models to also include simultaneous long-term partnerships leads to the class of network models, where the network of contacts is described by a graph with nodes representing individuals and links representing their contacts (Keeling and Eames 2005). Different network structural properties have been related to the speed of spread of an epidemic through the population. In the so-called small world networks, most contacts between individuals are local, but some long-distance contacts ensure a rapid global spread of an epidemic (Watts and Strogatz 1998). Long-distance spread of infections is becoming increasingly important in a globalizing world with increasing mobility, as the example of the SARS epidemic in 2003 demonstrated. Recently the concept of scale-free networks, where the number of links per node follows a power law distribution (i.e., the probability for a node to have k links is proportional to k^(–γ) with a positive constant γ), was discussed in relation to the spread of epidemics. With respect to the spread of sexually transmitted diseases, a network structure where some individuals have very many partners while the majority of people have only few might lead to great difficulties in controlling the disease by intervention (Liljeros et al. 2001). Network concepts have also been applied to study the spread of respiratory diseases (Meyers et al. 2003).
12.4 Use of Modeling for Public Health Policy
Mathematical models have been widely used to assess the effectiveness of vaccination strategies, to determine the best vaccination ages and target groups, and to estimate the effort needed to eliminate an infection from the population. More recently, mathematical modeling has supported contingency planning in preparation for a possible attack with smallpox virus (Ferguson et al. 2003) and in planning the public health response to an outbreak with a pandemic strain of influenza A (Ferguson et al. 2006). Other types of intervention measures have also been evaluated such as screening for asymptomatic infection with Chlamydia trachomatis (Kretzschmar et al. 2001), contact tracing (Eames and Keeling 2003), and antiviral treatment in the case of HIV. In the field of nosocomial infections and transmission of antibiotic-resistant pathogens modeling has been used to compare hospital-specific interventions such as cohorting of health workers, increased hygiene, and isolation of colonized patients (Grundmann and Hellriegel 2006). In health economic evaluations it has been recognized that dynamic transmission models are a necessary requisite for conducting good cost-effectiveness analyses for infectious disease control (Edmunds et al. 1999).
It is a large step from developing mathematical theory for the dynamics of infectious diseases to application in a concrete public health-relevant situation. The latter requires close attention to relevant data sources and to clinical and microbiological knowledge in order to decide how to design an appropriate model. Appropriate here means that the model uses the knowledge available, is able to answer the questions that are asked by policy makers, and is sufficiently simple so that its dynamics can be understood and interpreted. In the future it will be important to strengthen the link between advanced statistical methodology and mathematical modeling in order to further improve the performance of modeling as a public health tool.
12.5 Further Reading
One of the first comprehensive texts on epidemic modeling is Bailey (1975). Bailey treats both deterministic and stochastic models and links them to data. A more recent, but also classic text for infectious disease modeling is Anderson and May (1991); however, it deals mainly with deterministic unstructured models. Its strength is a good link with data and discussion of public health relevant questions. In Diekmann and Heesterbeek (2000) the mathematical theory of deterministic modeling is laid out with many exercises for the reader. A focus of the book is the incorporation of population heterogeneity into epidemic modeling and a generalization of the basic reproduction number to heterogeneous populations. In Andersson and Britton (2000) an introduction to stochastic epidemic modeling is given. Becker (1989) describes advanced statistical methods for the analysis of infectious disease data taking the specific characteristics of these data into account. A recent text incorporating case studies from applications of epidemic modeling was published by Keeling and Rohani (2007).
- Anderson RM, May RM (1991) Infectious diseases of humans: dynamics and control. Oxford: Oxford University Press
- Becker NG (1989) Analysis of infectious disease data. London: Chapman and Hall
- Diekmann O, Heesterbeek JAP (2000) Mathematical epidemiology of infectious diseases. Chichester: Wiley
- Hamer WH (1906) Epidemic disease in England – the evidence of variability and persistency of type. Lancet; 1:733–39
- Keeling MJ, Rohani P (2007) Modeling infectious diseases in humans and animals. Princeton: Princeton University Press
- Kermack WO, McKendrick AG (1991a) Contributions to the mathematical theory of epidemics – II. The problem of endemicity. 1932. Bull Math Biol; 53(1–2):57–87
- Kermack WO, McKendrick AG (1991b) Contributions to the mathematical theory of epidemics – I. 1927. Bull Math Biol; 53(1–2):33–55
- Kermack WO, McKendrick AG (1991c) Contributions to the mathematical theory of epidemics – III. Further studies of the problem of endemicity. 1933. Bull Math Biol; 53(1–2):89–118
- Koopman J, Simon C, Jacquez J, Joseph J, Sattenspiel L, Park T (1988) Sexual partner selectiveness effects on homosexual HIV transmission dynamics. J Acquir Immune Defic Syndr; 1(5):486–504
- Meyers LA, Newman ME, Martin M, Schrag S (2003) Applying network theory to epidemics: control measures for Mycoplasma pneumoniae outbreaks. Emerg Infect Dis; 9(2):204–10