Introduction
Abstract
Infectious diseases, from respiratory (influenza, the common cold, tuberculosis, respiratory syncytial virus) and vector-borne (plague, malaria, dengue, chikungunya, and Zika) to sexually transmitted (human immunodeficiency virus, syphilis), have historically affected the human population in profound ways. For example, the Great Plague, better known as the Black Death, was caused by the bacterium Yersinia pestis and killed up to 200 million people in Eurasia and about 30–60% of Europe’s population during a 5-year span in the fourteenth century. At the time, plague was thought to be caused by “bad air”; it was not until the late 1890s that bites of infected fleas were discovered to be behind the pandemic. Had the transmission mechanisms behind plague infections been known, the epidemic’s impact on morbidity and mortality could have been mitigated through basic public health interventions. Knowledge of the transmission processes and the natural history of infectious diseases in different environments therefore represents invaluable, actionable information for thwarting their spread.
A remarkable and definitive shift to the germ theory occurred during the “golden bacteriology” era in the second half of the nineteenth century. In fact, the 1889–1890 influenza pandemic is arguably the first influenza pandemic to occur in this new and progressive state of knowledge about infectious disease transmission. It is better known as the “Russian Flu” because the rapid global spread of the pandemic virus can be traced back to Saint Petersburg, Russia, in October 1889 (Valleron et al. 2010). It was also the first pandemic to unfold in a world connected by rail and maritime transportation: it spread across Europe in approximately 6 weeks, at an estimated mean speed of 394 km/week, and circulated around the world in just 4 months (Valleron et al. 2010).
1.1 The Motivation
Mathematical modeling plays an important role in ordering our thoughts and sharpening vague intuitive notions. Initial models are verbal descriptions that tend to become insufficient as soon as the scenarios become complicated. Mathematics provides a powerful language that forces us to be logically consistent and explicit about assumptions.
1. While most disease transmission models predict exponential growth at the beginning of an epidemic, empirical data often exhibit subexponential growth patterns (Viboud et al. 2016). How do we best characterize these non-unique subexponential growth functions in the context of infectious disease modeling?
2. Are there many, even infinitely many, mechanisms that lead to the same or very similar subexponential growth functions?
3. Does slower-than-expected growth at the beginning of an epidemic imply a smaller value of the basic reproduction number R_{0}, a key quantity in the field of infectious disease epidemiology (Anderson and May 1991; Diekmann and Heesterbeek 2000; Brauer 2006), as suggested by many transmission models?
4. What exactly does it mean when we say “deterministic models approximate their stochastic counterparts by the law of large numbers”? Are we referring to a population that is infinitely large or to something else?
5. Which features of population-based models, in which the exponential distribution is assumed at the individual level, can be generalized to non-exponential distributions?
6. Regarding the effectiveness of control measures against the spread of diseases, even when imperfect implementation in terms of coverage or compliance has been explicitly taken into account in the models, empirical observations often leave the impression that control measures that “look good” in theory “do not work at all” in practice. Are there further theories that could capture this phenomenon?
7. How do we reconcile the quantities predicted by disease transmission models with observed data from outbreak investigations and public health surveillance?
8. The need for precise definitions of verbal descriptions in quantitative analyses. For instance:
   • What do we mean by “a case” when data from outbreak investigations and surveillance are presented as time series of “number of cases”?
   • Are “generation intervals” consistently defined across the literature in epidemiology and infectious disease modeling?
   • How do we characterize and compare “variability” among random variables, such as the infectious periods or the numbers of secondary infections transmitted by a primary infector?
9. What do we mean by “nonidentifiability” when fitting models to data?
Models formulated in mathematical language come in different types, designed for different purposes. Broadly speaking, there are mathematical models aimed at facilitating our understanding of the medical, biological, ecological, and social interactions that manifest as outbreaks and epidemics of infectious diseases, in order to gain insight into specific questions or to generate theories about what must or might happen; and there are statistical models aimed at capturing the data generation process, in order to detect general patterns, predict epidemic trajectories, manage control strategies, or simply describe epidemic trends. Within both mathematical and statistical models, there are models designed at the population level in a phenomenological way and models that are individual-based, with which researchers aim to capture relevant mechanistic processes.
Individual-based models start from descriptions or assumptions about the evolution of the infectiousness and the natural history of the disease progression within an infected host. These include models for the latent periods, the infectious periods, the incubation periods, recovery, mortality, and so on. Some of the individual-based models also combine social contacts with the evolution of the infectiousness in terms of infectious contacts (Dietz 1995).
Phenomenological models can be deterministic or stochastic and include transmission dynamics models formulated using differential equations or stochastic processes as well as empirical growth functions, such as the generalized logistic growth models. Transmission dynamics models depend on tacit assumptions at the individual level.
The development of many new statistical models and methods in the study of infectious diseases was driven by the HIV epidemic (Brookmeyer and Gail 1994). Data arising from infectious disease investigations pose unique challenges to classical statistical theory and practice because disease outbreak data do not arise from designed experiments. An outbreak cannot be repeated naturally under identical conditions, yet large amounts of data from multiple sources (clinical data, outbreak investigation data from non-conventional surveys, public health surveillance, and observational data from prevalence and incidence cohorts) are collected on the same outbreak event. Before statistical methods are used to understand and control the epidemic, statistical models are needed to address the data generation processes, which include not only the epidemiologic and biologic processes that give rise to the disease outbreaks, but also the processes that dictate how data are observed and how a “case” is documented and reported.
When talking about “fitting the model to data,” we tend to think of one type of model designed for a specific purpose. However, fitting a dynamic mathematical model to observed outbreak data (e.g., for the purpose of estimating important transmission parameters) involves all three levels of models: the population-level phenomenological model, which depends on tacit assumptions of the individual-based model nested within it, and the statistical model that links the disease transmission process to the data generation process. Very often in practice, these different types of models are considered simultaneously without the investigators even being aware of it.
Driven by the HIV epidemic that started in the late 1970s, the outbreaks of the severe acute respiratory syndrome (SARS) in 2003, pandemic influenza preparedness, and preparedness for other emerging and re-emerging epidemics, the literature on infectious disease modeling has flourished during the past 40 years. However, most articles are confined within subdisciplines according to model characteristics and research focus. While the field of mathematical epidemiology has a long history (e.g., Ross 1911, 1928; Anderson and May 1991; Diekmann and Heesterbeek 2000; Keeling and Rohani 2008; Sattenspiel 2009; Allen 2010; Vynnycky and White 2010; Becker 2015; Andersson and Britton 2012; Manfredi and D’Onofrio 2013; Kermack and McKendrick 1927; Brauer 2006; Brauer and Castillo-Chávez 2001), formal efforts at connecting mathematical models with epidemiological data with the goal of calibrating models for predictive or forecasting purposes have only started to take hold during the last decade (Chretien et al. 2015; Biggerstaff et al. 2016; Chowell 2017; Viboud et al. 2018).
1.2 Structure of the Book with Brief Summary
Chapter 2 reviews basic concepts of probability and statistical models for the distributions of continuous lifetime data, which are closely related to the individual-based models that describe the evolution of infectiousness and the natural history of disease progression. Rather than repeating material commonly found in a typical textbook on survival analysis, we retell the story from a different angle, with emphasis on the shapes of hazard functions and the tail properties of lifetime distributions. These characteristics have profound impacts on the outcomes of transmission dynamic models at the population level. We also discuss how to compare two lifetime random variables, in terms of both magnitude and variability, and introduce the Laplace transform of lifetime distributions. These concepts provide the foundations for most of the remaining chapters.
Chapter 3 addresses the distributions of random counts and counting processes, which are closely related to population-based phenomenological models. Section 3.2 provides a framework that links the continuous lifetime distributions at the individual level to the distributions of random counts at the population level. It also provides a historical account. Contemporary discussions of “superspreading events”, as seen in outbreak investigation data for SARS-like diseases, are typically associated with transmission along highly heterogeneous networks characterized by long-tailed degree distributions (Lloyd-Smith et al. 2005). Similarly, in the context of accident occurrence, publications in actuarial science journals can be traced back to debates on proneness, contagion, or spells in the first half of the twentieth century, which gave rise to important models such as the mixed-Poisson process and the Yule process. Section 3.3 lays the foundation for measuring the evolution of random counts over time, which are key measurements in all population-based models.
1. R_{0} only depends on the average value of the infectious periods regardless of the variance or the exact distribution. In models without natural births and deaths in the population, the value of R_{0} is not affected by the presence or absence of latent periods.
2. The probability of extinction δ depends on the specific distribution of the infectious periods but is not affected by the presence or absence of latent periods.
3. If the infectious disease does not become extinct during the first few generations, the initial (exponential) growth rate r depends on specific distributions for both the latent periods and the infectious periods.
4. Each of the mathematical relationships between R_{0} and δ, and between R_{0} and r, as found in the literature, is under a set of strict assumptions on the social contact process and the progression of infectiousness within infected individuals.
1. Given the fixed value R_{0} > 1 and the infectious periods distribution, the model with latent periods has a smaller initial growth rate r than the one without.
2. Given the fixed value R_{0} > 1 and the latent periods distribution, the more variable the infectious periods, the smaller the value of r.
3. Without specifying the distributions of the latent periods and the infectious periods, there is no order between the values of r and of R_{0}.
4. If R_{0} > 1, without specifying the distribution of the number of secondary infections generated by the primary infectious individual (through the distribution of the infectious periods), there is no order between the values of δ and of R_{0}.
5. There is a direct relationship between r and δ, rarely mentioned in the literature, that r = β(1 − δ), provided that there is no latent period and that the number of infections produced by a typical infectious individual during a time interval of length x is Poisson distributed with mean value βx. This relationship does not depend on the distribution of the infectious period.
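The relationship r = β(1 − δ) in the last item can be checked numerically. The following is a minimal sketch, not the book's derivation: it assumes a gamma-distributed infectious period with Laplace transform L(s) (the gamma family, the parameter values, and all function names are illustrative choices). Under the stated conditions (no latent period, Poisson contacts at rate β), the extinction probability is taken as the smallest root of δ = L(β(1 − δ)) and the growth rate as the positive root of the Euler-Lotka equation r = β(1 − L(r)); substituting δ = L(r) shows that the two roots are linked by r = β(1 − δ).

```python
import math

def laplace_gamma(s, mean, shape):
    """Laplace transform E[exp(-s*T)] of a gamma-distributed infectious period T."""
    return (1.0 + s * mean / shape) ** (-shape)

def extinction_probability(beta, mean, shape, tol=1e-13, max_iter=100_000):
    """Smallest root of delta = L(beta * (1 - delta)), by fixed-point iteration from 0."""
    delta = 0.0
    for _ in range(max_iter):
        new = laplace_gamma(beta * (1.0 - delta), mean, shape)
        if abs(new - delta) < tol:
            return new
        delta = new
    return delta

def growth_rate(beta, mean, shape, tol=1e-13):
    """Positive root of r = beta * (1 - L(r)), by bisection (requires R0 = beta*mean > 1)."""
    def f(r):
        # 1 - L(r), written with expm1/log1p to avoid cancellation for small r
        one_minus_l = -math.expm1(-shape * math.log1p(r * mean / shape))
        return beta * one_minus_l - r
    lo, hi = 1e-9, beta  # f > 0 just above 0 and f < 0 at r = beta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta, mean = 0.5, 4.0          # hypothetical values, so R0 = beta * mean = 2
for shape in (1.0, 2.0, 5.0):  # larger shape = less variable infectious period
    delta = extinction_probability(beta, mean, shape)
    r = growth_rate(beta, mean, shape)
    print(f"shape={shape:g}  delta={delta:.6f}  r={r:.6f}  "
          f"beta*(1-delta)={beta * (1 - delta):.6f}")
```

With shape = 1 the infectious period is exponential, and the closed forms δ = 1/R_{0} = 0.5 and r = β − 1/mean = 0.25 are recovered. Holding R_{0} fixed while increasing the shape (reducing variability) raises r, in line with item 2 of the list.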
The second part of Chap. 4 emphasizes that the three parameters R_{0}, δ, and r are intrinsic in the sense that they represent the state of the system at the (disease-free) equilibrium when the initially infected individuals are seeded. Section 4.5 presents growth patterns that are most likely to arise when the system moves away from the equilibrium condition. Much of the discussion concerns empirically observed slower growth patterns that deviate substantially from the exponential growth assumption (Chowell et al. 2015; Chowell 2017). We attempt to define subexponential growth functions precisely in the context of infectious disease transmission and list several assumptions about the transmission dynamics that all lead to such early growth patterns: depletion of the susceptible population; scaling of epidemic growth shaped by various factors and their combinations, including the level of contact clustering and reactive behavior changes (Chowell et al. 2016); and unobservable individual-level heterogeneity. A special subexponential growth function of the form (1 + rvt)^{1∕v}, with r, t > 0 and 0 < v ≤ 1, is introduced in Chap. 4 and appears frequently in later chapters (6, 8, and 9) in examples and discussions.
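To get a feel for the growth function (1 + rvt)^{1∕v}, the sketch below evaluates it for several values of v and compares it with exponential growth. The function name and parameter values are illustrative assumptions, not taken from the book. Two elementary facts are visible in the output: v = 1 gives linear growth 1 + rt, and as v approaches 0 the function approaches exp(rt). Differentiating also shows that the function satisfies dC/dt = r C^{1 − v} with C(0) = 1, which has the same form as the generalized-growth model of Viboud et al. (2016) with exponent 1 − v.

```python
import math

def subexp_growth(t, r, v):
    """Evaluate (1 + r*v*t)**(1/v) for r, t >= 0 and 0 < v <= 1."""
    if not 0.0 < v <= 1.0:
        raise ValueError("v must lie in (0, 1]")
    # exp/log1p keeps the evaluation numerically stable when v is very small
    return math.exp(math.log1p(r * v * t) / v)

r = 0.2                      # hypothetical growth rate per unit time
times = [0, 5, 10, 20, 40]   # hypothetical time points
for v in (1.0, 0.5, 0.1, 1e-6):
    row = ", ".join(f"{subexp_growth(t, r, v):10.2f}" for t in times)
    print(f"v={v:<8g} {row}")
# Exponential growth, the limiting case v -> 0, for comparison
print("exp(r*t)   " + ", ".join(f"{math.exp(r * t):10.2f}" for t in times))
```

For any fixed v > 0 the curve eventually falls far below exp(rt), which is why a single early growth rate r can be compatible with very different epidemic trajectories.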
Chapters 5 and 6 discuss compartment models when the outbreak moves beyond the initial phase. Much of Chap. 5 is a synthesis of previously published literature on both stochastic and deterministic transmission dynamic models, with our added perspectives. Our interest is to generalize some of the features of these models beyond the assumption of exponentially distributed durations of the various stages, and beyond simple generalizations such as the Erlang distribution (a special case of the gamma distribution, with smaller variance than the exponential distribution of equal mean). These discussions start in Sect. 5.5.2 and continue in Sect. 6.2.1. In these discussions, Laplace transforms of probability distributions are used extensively as tools to calculate transition probabilities among compartments and average durations within compartments. The results are valid for arbitrary distributions, without specific assumptions about them. When these distributions are exponential, the general results in Sects. 5.5.2 and 6.2.1 reduce to those published in the literature, such as the expression of the reproduction number as the dominant eigenvalue of the next generation matrix (van den Driessche and Watmough 2008), as illustrated by examples in these sections.
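As a small illustration of how a Laplace transform yields both a transition probability and an average duration, consider a hypothetical compartment in which the stage duration T follows a gamma distribution while individuals are also removed (for example, by death) at a constant rate μ. This scenario and the names below are assumptions made for the sketch; they are not the book's equations. The probability of completing the stage before removal is E[exp(−μT)] = L(μ), and the mean time spent in the compartment is (1 − L(μ))/μ.

```python
def laplace_gamma(s, mean, shape):
    """Laplace transform E[exp(-s*T)] of a gamma-distributed stage duration T."""
    return (1.0 + s * mean / shape) ** (-shape)

def exit_summary(mu, mean, shape):
    """Stage duration T ~ gamma(mean, shape) competing with removal at rate mu > 0.

    Returns (probability that the stage ends before removal,
             expected time actually spent in the compartment).
    """
    l_mu = laplace_gamma(mu, mean, shape)
    return l_mu, (1.0 - l_mu) / mu

mu, mean = 0.1, 5.0  # hypothetical removal rate and mean stage duration
for shape in (1.0, 3.0, 50.0):  # shape = 1 is the exponential special case
    p_complete, mean_stay = exit_summary(mu, mean, shape)
    print(f"shape={shape:g}  P(complete stage)={p_complete:.4f}  mean stay={mean_stay:.4f}")
```

With shape = 1 the familiar exponential-model expressions γ/(γ + μ) and 1/(γ + μ), with γ = 1/mean, are recovered; other shapes change both quantities even though the mean stage duration is held fixed.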
We also point out a transcendental relationship among (4.43), (5.66), and (6.24). In these expressions, the Laplace transforms serve as tools for comparing distributions ranked by variability, which leads to Propositions 27 and 28 and to the discussions in subsequent paragraphs.
Chapter 5 also covers a distinct topic: empirical models that describe population-based phenomena without “mechanically” modeling the transmission dynamics at the level of individuals and the interactions among them. These models are useful for curve fitting, as in the examples in Chap. 8.
Models in Chap. 6 are more complex and involve intervention measures during the epidemic. Section 6.3 demonstrates a potential application of these models in the context of preparedness for an influenza-like acute respiratory infectious disease, with numerical illustrations in hypothetical race-to-treat scenarios and with limited treatment supply. Section 6.5 discusses the impact of unobservable heterogeneity in treatment rates on effectiveness; this section addresses Question 6 in Sect. 1.1. We also draw attention to the expression \(\left(1+\phi xv\right)^{1/v}\) in (6.31), which echoes the subexponential growth function (1 + rvt)^{1∕v} in Chap. 4. This is because in both cases a frailty model from survival analysis is used to model the unobservable heterogeneity among individuals.
Chapter 7 addresses Questions 7, 8, and 9 in Sect. 1.1 and serves as a transition between the theoretical topics in previous chapters and Chaps. 8 and 9. The focus is on the data generating processes and the statistical issues of fitting models to data. As repeatedly emphasized in Chaps. 4–6, population-based models involve tacit assumptions at the level of individuals, such as exponential, gamma, or other distributions of the infectious periods. These are conceptual models that address general issues and general patterns, such as the prediction of “incidence” according to time at infection (which is usually unobservable). Statistical models, on the other hand, address the data generating processes, which include not only the epidemiologic aspects but also the observational schemes, including “case definition,” surveillance and reporting, and adjustments for observational biases. In each model, choices are made with respect to which aspects of “the real world” should be included in the description of the model and which should be ignored. These choices depend not only on the perceived importance of various factors, but also on the purpose of each model.

Frequently, fitting a mathematical model, such as a transmission model, to data collected from surveillance and outbreak investigations involves three types of models (assumptions) at the same time. This requires “nesting” one type of model within another. For example, the statistical model that describes the data may involve assumptions about the mean and variance and, in some instances, assumptions of specific distributions such as the Poisson or negative binomial. In addition, the model also handles observational biases, such as adjustment for reporting delays (Sect. 7.3). The mean of the statistical model may be a function of time with unknown parameters. This function may involve convolution structures, such as back-calculation (Sect. 7.4), to connect predictions from a conceptual model to expected values of observable outcomes. The conceptual model is thus embedded inside a statistical model. However, this inevitably results in statistical issues such as nonidentifiability (Sect. 7.2). The chapter mainly discusses concepts, with a few examples as well as some simple methods where applicable. This is an important field that needs more research and development.
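The convolution structure mentioned above can be sketched in a few lines. The example below is a hedged illustration rather than the method of Sect. 7.4: it assumes discrete daily time steps, a hypothetical infection curve produced by some conceptual model, and a hypothetical distribution of the delay from infection to reporting. It computes only the forward direction, the expected number of reports per day; back-calculation is the corresponding inverse problem of recovering the infection curve from observed reports.

```python
def expected_reports(infections, delay_pmf):
    """Discrete convolution: mu[t] = sum over s of infections[s] * delay_pmf[t - s].

    infections[s] is the (unobserved) number of new infections on day s;
    delay_pmf[d] is the probability that an infection is reported d days later.
    Reports falling beyond the last day of the infection series are dropped.
    """
    n_days = len(infections)
    mu = [0.0] * n_days
    for s, new_infections in enumerate(infections):
        for d, prob in enumerate(delay_pmf):
            if s + d < n_days:
                mu[s + d] += new_infections * prob
    return mu

infections = [1, 2, 4, 8, 16, 32]   # hypothetical output of a conceptual model
delay_pmf = [0.0, 0.2, 0.5, 0.3]    # hypothetical P(delay = 0, 1, 2, 3 days)
mu = expected_reports(infections, delay_pmf)
print([round(m, 2) for m in mu])
```

In a full statistical model, the observed daily counts would then be treated as, for example, Poisson or negative binomial random variables with means mu[t], and the unknown parameters of the conceptual model would be estimated through this mean function.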
Chapters 8 and 9 focus more heavily on applications, although some models not covered in Chaps. 5 and 6, such as metapopulation spatial models and individual-based network models (Chap. 9), are also presented. The examples are based on case studies of the 2016 Zika epidemic in Antioquia, Colombia (Sect. 8.3), the 2016 epidemic of yellow fever in two areas of Angola, Luanda (the capital) and Huambo (Sect. 8.4), and the 2014 Ebola outbreak in Mali (Sect. 9.4).
References
Allen, L. J. (2010). An introduction to stochastic processes with applications to biology. Boca Raton, FL: CRC Press.
Alonso, W. J., Nascimento, F. C., Chowell, G., & Schuck-Paim, C. (2018). We could learn much more from 1918 pandemic - the (mis)fortune of research relying on original death certificates. Annals of Epidemiology, 28(5), 289–292.
Anderson, R. M., & May, R. M. (1991). Infectious diseases of humans, dynamics and control. Oxford: Oxford University Press.
Andersson, H., & Britton, T. (2012). Stochastic epidemic models and their statistical analysis (Vol. 151). New York, NY: Springer.
Becker, N. G. (2015). Modeling to inform infectious disease control. London: Chapman and Hall/CRC.
Biggerstaff, M., Alper, D., Dredze, M., Fox, S., Fung, I. C., Hickmann, K. S., et al. (2016). Results from the Centers for Disease Control and Prevention’s Predict the 2013–2014 Influenza Season Challenge. BMC Infectious Diseases, 16, 357.
Brauer, F. (2006). Some simple epidemic models. Mathematical Biosciences and Engineering, 3, 1–15.
Brauer, F., & Castillo-Chávez, C. (2001). Mathematical models in population biology and epidemiology. New York, NY: Springer.
Brookmeyer, R., & Gail, M. H. (1994). AIDS epidemiology: A quantitative approach. New York, NY: Oxford University Press.
Chowell, G. (2017). Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer for parameter uncertainty, identifiability, and forecasts. Infectious Disease Modelling, 2, 379–398.
Chowell, G., Viboud, C., Hyman, J. M., & Simonsen, L. (2015). The Western Africa Ebola virus disease epidemic exhibits both global exponential and local polynomial growth rates. PLoS Currents, 7. https://doi.org/10.1371/currents.outbreaks.8b55f4bad99ac5c5db3663e916803261
Chowell, G., Viboud, C., Simonsen, L., & Moghadas, S. (2016). Characterizing the reproduction number of epidemics with early subexponential growth dynamics. Journal of the Royal Society Interface, 13(123). https://doi.org/10.1098/rsif.2016.0659
Chretien, J. P., Swedlow, D., Eckstrand, I., Johansson, M., Huffman, R., & Hebbeler, A. (2015). Advancing epidemic prediction and forecasting: A new US government initiative. Online Journal of Public Health Informatics, 7(1), e13.
Dahal, S., Jenner, M., Dinh, L., Mizumoto, K., Viboud, C., & Chowell, G. (2017). Excess mortality patterns during 1918–1921 influenza pandemic in the state of Arizona, USA. Annals of Epidemiology, 28(5), 273–280.
Diekmann, O., & Heesterbeek, J. A. P. (2000). Mathematical epidemiology of infectious diseases: Model building, analysis and interpretation. Mathematical and computational biology (Vol. 5). Chichester: Wiley.
Dietz, K. (1995). Some problems in the theory of infectious diseases transmission and control. In D. Mollison (Ed.), Epidemic models: Their structure and relation to data (pp. 3–16). Cambridge: Cambridge University Press.
Johnson, N. P., & Mueller, J. (2002). Updating the accounts: Global mortality of the 1918–1920 “Spanish” influenza pandemic. Bulletin of the History of Medicine, 76(1), 105–115.
Keeling, M. J., & Rohani, P. (2008). Modeling infectious diseases in humans and animals. Princeton, NJ: Princeton University Press.
Kermack, W. O., & McKendrick, A. G. (1927). Contributions to the mathematical theory of epidemics, part I. Proceedings of the Royal Society London A, 115, 700–721.
Lloyd-Smith, J. O., Schreiber, S. J., Kopp, P. E., & Getz, W. M. (2005). Superspreading and the effect of individual variation on disease emergence. Nature, 438(7066), 355.
Manfredi, P., & D’Onofrio, A. (Eds.). (2013). Modeling the interplay between human behavior and the spread of infectious diseases. New York, NY: Springer.
Mills, C. E., Robins, J. M., & Lipsitch, M. (2004). Transmissibility of 1918 pandemic influenza. Nature, 432(7019), 904.
Ross, R. (1911). The prevention of malaria. London: John Murray.
Ross, R. (1928). Studies on malaria. London: John Murray.
Sattenspiel, L. (2009). The geographic spread of infectious diseases: Models and applications. Princeton, NJ: Princeton University Press.
Simonsen, L., Chowell, G., Andreasen, V., Gaffey, R., Barry, J., Olson, D., et al. (2018). A review of the 1918 herald pandemic wave: Importance for contemporary pandemic response strategies. Annals of Epidemiology, 28(5), 281–288.
Valleron, A. J., Cori, A., Valtat, S., Meurisse, S., Carrat, F., & Boelle, P. Y. (2010). Transmissibility and geographic spread of the 1889 influenza pandemic. Proceedings of the National Academy of Sciences, 107(19), 8778–8781.
van den Driessche, P., & Watmough, J. (2008). Further notes on the basic reproduction number. In F. Brauer, P. van den Driessche, & J. Wu (Eds.), Mathematical epidemiology. Lecture notes in mathematics (Vol. 1945). Berlin: Springer.
Viboud, C., Simonsen, L., & Chowell, G. (2016). A generalized-growth model to characterize the early ascending phase of infectious disease outbreaks. Epidemics, 15, 27–37.
Viboud, C., Sun, K., Gaffey, R., Ajelli, M., Fumanelli, L., Merler, S., et al. (2018). The RAPIDD Ebola forecasting challenge: Synthesis and lessons learnt. Epidemics, 22, 13–21.
Vynnycky, E., & White, R. (2010). An introduction to infectious disease modelling. Oxford: Oxford University Press.