Epidemics spread in heterogeneous populations

Individuals building populations are subject to variability. This variability affects progress of epidemic outbreaks, because individuals tend to be more or less resistant. Individuals also differ with respect to their recovery rate. Here, properties of the SIR model in inhomogeneous populations are studied. It is shown that a small change in model’s parameters, e.g. recovery or infection rate, can substantially change properties of final states which is especially well-visible in distributions of the epidemic size. In addition to the epidemic size and radii distributions, the paper explores first passage time properties of epidemic outbreaks.


Introduction
The SIR (Susceptible-Infectious-Recovered) model [1][2][3] is a well-known, classical example of an epidemiological model. Originally, it has been suggested and studied in the continuous limit [4]. The SIR model is considered as a starting point for epidemiological modeling of many diseases. Therefore, various extensions [5] to the model have been suggested and explored. In particular, they include modification of possible states and system topology. These alternations are usually introduced to the discrete version of the SIR model, which is suitable for extensions and modifications.
Observations of epidemics progress and their spatial patterns show that real populations are not homogeneous but intrinsically heterogeneous. Every population consists of individuals which share common features but are not exactly the same. Heterogeneity of a population can be introduced in various ways. The most common approach is to divide populations into smaller groups, which differ with respect to their characteristics/traits [6,7]. Another possibility is to assume that traits. e.g. susceptibility or infectivity, follow continuous distributions [8][9][10][11][12][13]. Some of such heterogeneous models, by means of nonlinear transformations [9], can be mapped into homogeneous models. Population heterogeneity at the level of infection probability can be studied using percolation theory [14,15] or agent-based modeling [16,17].
Inhomogeneity in epidemiological models not only affects individuals but also connections among them [18,19]. Variable connectivity changes progress of epidemics. Heterogeneity of connections decreases the epidemic a e-mail: karol.capala@uj.edu.pl b e-mail: bartek@th.if.uj.edu.pl threshold. This fact is especially well-visible in case of epidemics on scale-free networks. In the thermodynamic limit, due to the presence of highly connected hubs, there is no infection threshold on scale-free networks [20,21]. Nevertheless, it can be reintroduced by an appropriate vaccination strategy [22].
Epidemiological processes are inherently stochastic due to the random nature of infection and recovery processes. The contact between susceptible and infectious individuals is necessary for infection but does not guarantee that an infection indeed takes place. In the case of the SIR model in the inhomogeneous population the stochasticity has two sources. The stochasticity of epidemiological systems is extended by the already mentioned variability of individuals. Therefore, a general population consists of individuals which are more or less vulnerable to infections. Similarly, some of individuals are capable of fast recovery, while others recover slowly. The presence of such individuals is a key element for creation of infection paths and islands of isolated, uninfected individuals.
Individuals' heterogeneity affects the average epidemics size and the invasion probability. If only the susceptibility is varied, less homogeneous populations are less likely to be invaded and consequently epidemic size is reduced [10,15,23]. Increasing heterogeneity reduces the chances of large outbreaks due to the possibility of creating impenetrable firewalls [14]. For negatively correlated infectivity and susceptibility the final size of epidemic is larger than in homogeneous case [24]. Finally, using percolation theory, it has been also shown that the increase of the average transmissibility increases the invasion probability. Contrary to the average transmissibility, an increase in the standard deviation of transmissibility reduces the invasion probability and decreases the epidemic size [11,12] showing again that heterogeneity can decrease epidemic size.
Using Monte Carlo methods [25], we study the modification of the Kermack-McKendrick model. We extend the discrete lattice SIR model to take into account simultaneous variability in susceptibility and recovery rates. At the same time the connectivity among individuals is fixed and limited to the von Neuman neighborhood only. The model studied here is similar to ones considered in [11,12] and [14,15]. In contrast to [11,12], within the studied model, the time is discrete, resulting in the replacement of infection and recovery rates by appropriate probabilities per time step. Moreover, within presented studies more quantifiers characterizing epidemics are examined. Here, along with the examination of the epidemic size distribution and the invasion probability we study temporal properties like the epidemic duration and the first time to reach a given radius or size. Contrary to [14,15], variability is also included in the recovery rate.
Exploration of epidemic properties in heterogeneous populations is very important because an epidemic constitutes one of the processes that can significantly affect proper functioning of societies [26,27]. It can result in significant economic losses. Identification of the role of inhomogeneity is crucial for understanding the dynamics of plant [28,29], animal [30] and human epidemics [31].
The manuscript is organized in the following way: Section 2 (Model) presents the lattice SIR model in inhomogeneous populations, Section 3 (Results) discusses obtained results. Finally, the paper ends with Summary and Conclusions (Sect. 4).

Model
We are studying the lattice SIR (Susceptible-Infectious-Recovered) model [1] using computer simulations. The population of N individuals is divided into three classes of individuals: S -susceptible, I -infected and R -recovered. In every time step, any susceptible individual, let say i, can be infected with the probability P I (i) due to a contact with an infectious neighbor. After becoming infected, an infectious individual can spontaneously recover with the probability P R (i). Recovered individuals are immune, i.e. they cannot be re-infected. The population is inhomogeneous, therefore every individual is characterized by probabilities per time step to become infected P I (i) and to recover P R (i). The (total) probability of infecting a susceptible individual i during a single iteration (MC step), see below, is where N (i) represents the nearest neighbors of the ith individual and Θ(j) is the indicator function showing whether a neighbor j is infectious or not Probabilities (per time step) of becoming infected P I and spontaneous recovery P R are independent and distributed according to the truncated normal density on [0, 1] for different values of μ I , σ I and μ R , σ R respectively, where C is the normalization constant C = [erf(μ/ √ 2σ)− erf((μ − 1)/ √ 2σ)]/2. If 0 μ I 1 and 0 μ R 1, μ I and μ R are modal values of infection and recovery probabilities (per time step) respectively. If μ < 0 (μ > 1), the modal value is located at 0 (1). Finally, σ I and σ R control distribution width, but they do not have a simple interpretation, as appropriate standard deviations are nonlinear functions of σ I (σ R ) sensitive to the choice of μ I (μ R ). Moreover, an increase of σ increases (decreases) the mean value of the appropriate probabilities if μ is smaller (larger) than 0.5, see below and Table 2. The infection and recovery probabilities per time step P I (i) and P R (i) have been generated at the system initiation.
A special type of the probability density, see equation (3), originates in the commonness of the normal distributions. The shape of the density follows a general intuition that there are some typical (modal) values of infection and recovery rates. Moreover, within the population there are individuals whose traits take larger or smaller values than typical (most likely) value. Finally, from the simulation point of view, alternations of parameters in equation (3) allow for an easy modification of the most probable (modal) values and transition from full homogeneity (σ → 0) to full heterogeneity (σ → ∞). Thus, the special type of the distribution (3) allows easy control of population properties.
Epidemics spread on the square lattice of size 101×101. Every outbreak is characterized by its size s, duration T and radius r, which is the Manhattan distance between the first infected (patient zero) individual and the most distant infected agent. Initially, there is only one infected individual which is placed in the center of the lattice. Each agent interacts locally only with its four nearest neighbors in its von Neuman neighborhood. The model is studied by means of Monte Carlo methods [25]. Every MC step consists of: (i) infecting all susceptible individuals (having infected neighbors) and (ii) spontaneous recovery of all infected individuals according to the mechanisms described above. The MC steps are repeated as long as there are infectious individuals in the system. The number of performed steps defines the epidemic duration T . Simulations were repeated 10 4 times for each set of μ I , σ I , μ R and σ R parameters. Furthermore, two different boundary conditions were considered: free and periodic. The main scope is to find how the population heterogeneity affects dynamics of epidemics (severity, duration) and properties of final states.

Results
The epidemic dynamics depends on the four model's parameters μ I , σ I , μ R , σ R , which control the shape of distributions from which epidemics' parameters are generated. The mean value of the infection probability per time step P I (recovery probability per time step P R ) grows with the increase of μ I (μ R ). Moreover, for μ I < 0.5 (μ R < 0.5) the mean value of P I ( P R ) is an increasing function of σ I (σ R ), while for μ I > 0.5 (μ R > 0.5) it is a decreasing function of σ I (σ R ). In the limiting case of μ I = 0.5 (μ R = 0.5), the modal value is equal to the mean value P I ( P R ). Standard deviations of infection and recovery probabilities are weakly sensitive to the exact choice of μ I and μ R , but they are increasing functions of σ I and σ R . Values of these parameters determine the average epidemic size, see Figure 1, and the mean duration, see  Figure 1. Moreover, the small epidemic size is associated with the short epidemic duration, see Figure 2.
The top panel of Figure 1 demonstrates results for a homogeneous population (σ I = σ R = 0 resulting in P I = μ I and P R = μ R ), while the middle and bottom panels for inhomogeneous (σ I > 0 and σ R > 0) cases. The inhomogeneity of infection and recovery rates changes the domain in which epidemic outbreaks are observed. Heterogeneity seriously affects the duration of epidemics, which is significantly longer than for homogeneous populations, compare the top, middle and bottom panels of Figure 2. The differences in epidemic progress are due to variability of individuals. The variability of the infection rate determines which individuals can be easily infected and which individuals are resistant. Resistant individuals are capable of building firewalls [14], while irresistant individuals determine paths along which epidemics can easily spread. Figure 1 indicates that increasing σ I , in line with earlier investigations [10,23], might decrease the epidemic size, see the top, bottom and middle panels of Figure 1.
Subsequently, Figures 3-6 display epidemic characteristics (the distribution of epidemic size s and radius r, the average first time T (s) to infect a chosen size s, the average time T (r) to infect a given radius r for the first time, and the distribution of epidemic duration T ) for periodic (full symbols) and free (empty symbols) boundary conditions. Different figures correspond to various model parameters. These figures demonstrate in detail how heterogeneity changes epidemic dynamics. Consequently, they are used to explain the observed phenomena.
First of all, heterogeneity changes the mean value of probabilities P I and P R except the case when μ I = 0.5 and μ R = 0.5. It causes significant differences in dynamics of epidemics between sets of parameters with very similar values of σ. For example, it can be seen for μ I = 0.2, μ R = 0.1, σ I = 0.1 with σ R = 0.2 (Fig. 3) and σ R = 0.3 ( Fig. 4). For the lower value of σ R , if a disease does not end on the first individual, it spreads through almost 90% of a population. In this case an epidemic either reaches a radius smaller than 4 or equal to the maximum possible radius (r = 100), which is the Manhattan distance from the initially infected individual to the agent located in the grid corner. For σ R = 0.3, see Figure 4, the description is drastically changed. The probability that an epidemic reaches a given size decreases when the epidemic size is growing. Most epidemics still end on the lowest or the largest radius, but it is more likely to observe the final radius somewhere between these two limiting values. Moreover, epidemics spread faster, as measured by T (s) and T (r) , when heterogeneity in P R is lower, i.e. when σ R is reduced and consequently the variance of P R is smaller. The epidemic duration displays a different pattern because it is not only determined by the infection but also by the recovery, see below.
Different effects are observed for μ I = 0.1, μ R = 0.5, σ I = 0.5 with σ R = 0.2 (Fig. 5) and σ R = 0.5 (Fig. 6). For these parameters the mean value of the recovery probability is equal to μ R ( P R = μ R ). At the same time P I is different than μ I but it is the same in both figures. The increase of σ R increases the probability that epidemics reach at least one of the lattice corners (maximum radius). Even a greater σ R increases the chances that epidemics spread beyond the initially infected individuals and reach large final sizes. For σ R 0.3, more outbreaks reach a modal value of epidemic size and the modal value increases, see Figure 6. The rate of epidemic spread is also slightly increased, which is manifested by the decreasing time needed to reach the same size or radius. This behavior can be explained by the heterogeneity of a population, which makes individuals with extreme recovery rates appear more often in the population. If one of the initially infected individuals has a high recovery probability (per time step) P R , an epidemic ends early reaching a small size. In the opposite case, every individual with a low recovery probability per time step has more opportunities to infect neighbors, thus chances of spreading a disease through neighborhood are higher, even if neighbors are characterized by a low infection rate. This dependence can be found for every set of parameters when the mean value of the recovery probability per time step is equal to the modal value P R = μ R = 0.5 and the mean value of the infection probability per time step is constant. Nevertheless, for many sets of parameters for which this effect appears, it can be hard to recognize due to influence of P I .
The situation when the mean value of the infection probability per time step is equal to the modal value P I = μ I = 0.5 and the mean value of the recovery probability per time step is constant can be also easily analyzed. In this case the modal value of epidemic size is decreasing with the increasing heterogeneity of the infection rate. Moreover, the rate of epidemics' spread is lower.
The top panels of Figures 3 and 4 show spatial properties of final states. The top left panels demonstrate epidemic size distributions P (s), while the top right panels epidemic radii distributions P (r) i.e. the Manhattan distance between the initially infected individual and the most distant infected agent. The maximal possible distance is obtained when the outbreak reaches corners of the grid, i.e. r = 100 for the lattice 101 × 101. Depending on parameters' values, epidemic outbreaks can be characterized by a typical size and radius. In the epidemic size distribution and epidemic radii distribution there are peaks corresponding to small sizes and small radii. These peaks are due to epidemics which end at a very small size due to the immediate recovery. Moreover, if the recovery rate is large enough and the infection rate is small enough, epidemics die out rapidly giving rise to a fast decay of the epidemic size distribution. Nevertheless, there are situations when despite a significant probability that epidemics end immediately there are many outbreaks (realizations) infecting the majority of the population, see Figure 3. Surprisingly, this behavior can be significantly changed by a small change in σ R , see Figure 4. In Figure 4, all parame- ters except σ R , are the same as in Figure 3, which is set to σ R = 0.3. Therefore, it is slightly larger than in Figure 3 where σ R = 0.2. The increase in σ R results in the increase of P R which determines the recovery rate. Consequently, the recovery rate is large enough to completely diminish the pronounced peak visible in Figure 3 around s ≈ 8000. This also implies that epidemics reach mainly small radii. In Figures 5 and 6, the average infection and recovery probabilities per time step are larger than in Figures 3  and 4, although, on average, epidemics are less severe because larger chances of infection are compensated by a faster recovery. Only in Figure 6 a local maximum of P (s) is well-visible especially for periodic boundary conditions. Otherwise, P (s) is a decreasing function of the epidemic size s.
In general, the distance distribution is not only a function of the model's parameters but also of a system size. In order to very the sensitivity of obtained results to the system size some of simulations have been performed for the two times larger system. When epidemics infect large fraction of individuals, like in Figures 3 and 6, there are only minor quantitative differences. Otherwise, in situations when epidemic outbreaks are limited, like in Figures 4 and 5, despite quantitative differences qualitative behavior is the same.
The middle panels of Figures 3-6 demonstrate temporal properties of epidemic spread. The middle left panels present the average first time T (s) to infect a given Fig. 6. The same as in Figure 5 for σR = 0.5. size s, while the middle right panels the average first time, T (r) , to reach a given radius r. The mean first time T (r) needed to infect a cluster of the radius r scales almost linearly with r. This can be intuitively explained by the fact that in every time step the radius can increase by 1 at most giving rise to the linear growth of r. In practical situations the slope is larger than 1 because the (average) infection probability is smaller than 1. Moreover, the recovery process further slows down the growth rate of the infected region. An increase in the average recovery probability per time step P R leads to a slower growth of the T (r) , compare Figures 3 and 4. Analogously, the rate of the epidemic spread is larger in Figures 5 and 6 Figures 3 and 4 due to the increase in the average infection probability per time step P I which facilitates the epidemic spread, see the middle panels of Figures 3-6 and Table 1.

than in
The growth of the mean first time to reach a given size s is limited by topological properties of the system. The type of the neighborhood restricts the maximal possible growth rate of the epidemic size which is further reduced by the recovery process and finite (smaller than 1) infection probability. Initially, T (s) grows like s 1/2 , because the epidemic size grows with the distance r to the initially infected individual as r 2 and the radius grows (approximately) linearly in time, i.e. r ∝ t. Here, analogously, like for T (r) , the increase in the mean infection probability per time step P I increases the growth rate, see for example Figures 4 and 5. The increase in P R reduces the growth rate, see Figures 3 and 4.
Finally, the bottom panels of Figures 3-6 show distribution of the epidemic duration P (T ). The exact shape of P (T )s depends on the model parameters, which also determine the size and the rate of epidemic spread and Table 1. Average values of parameters and characteristic quantifiers for Figures 3-6: the average infection probability per time step PI , the average recovery probability per time step PR , the average transmissibility Ψ , the average epidemic size s , the average epidemic duration T and the (linear) slope α in the average time to reach a given radius r, i.e. T (r) = αr + β.   in turn the epidemic duration T . The general shape of P (T ) curves follow predictions of [32]. If epidemics do not finish immediately at small sizes, they are likely to last long. The duration of epidemics is determined by two main factors. The first factor is the moment when the last infection takes place, which is the epidemic first time to reach its final size. The last infection is followed by the recovery of infected individuals only. After reaching the maximal size (maximal value of R + I), the remaining duration time depends on the recovery of infected individuals only. For non-homogeneous populations, the rate of the recovery process is determined by individuals with extremely low recovery probabilities per time step, which significantly contribute to the epidemic duration because the average time that an individual needs to recover is ∝1/P R . Such individuals always exist in inhomogeneous populations. Moreover, the growth in the distribution of P R width increases the chances of observing individuals with the very low recovery rates. This relation is visible in Figures 3-6 and Table 1, namely in situations when an outbreak is large. Then its duration is likely to be extended. The probabilities (per time step) of becoming infected P I and recovery P R have been adjusted at the beginning of simulations. Consequently, the same set of parameters has been used for all realizations. Nevertheless, we have checked whether the model properties change when P I and P R vary among realizations. We have not observed any differences between these two ways of assigning probabilities (results not shown).
In order to verify the role of boundary conditions, simulations with free and periodic boundary conditions have been performed. Boundary conditions have an observable influence when epidemics reach edges of the lattice. In this case, the epidemic size is characterized by a larger modal value for periodic boundary conditions than for free boundary conditions. Also for the periodic boundary conditions the modal value is reached by more outbreaks but fewer epidemics end at an intermediate size between 0 and the mode. This difference is related to the number of trajectories connecting individuals on the edge of the lattice with any other individual. Otherwise, quantitative differences between results for free and periodic boundary conditions are not significant, compare filled and empty points in Figures 3-6.
The condition when the epidemic outbreak is observed can be deducted from transmissibility [11,12]. The transmissibility Ψ (P I , P R ) is a probability that an epidemic is transmitted from an infected individual to a susceptible one, i.e. it is a probability of transmission along the link connecting these two agents. It plays a similar role to the bond or site opening probability in the percolation theory [33,34]. The transmissibility takes into account the infection and the recovery processes, as well as inhomogeneity of the population. If P I is the infection probability (per time step) and P R is the recovery probability (per time step), the transmissibility Ψ (P I , P R ) is In equation (4), (1 − P R ) k−1 P R is a probability that an infected individual stays infectious for k iterations and 1 − (1 − P I ) k is the probability of infecting a susceptible individual during these k consecutive iterations. An inhomogeneous system can be characterized by the average transmissibility where f (x, μ, σ) is the P I and P R distribution given by equation (3). The average transmissibility is calculated numerically. Figure 7 presents a percolation-like analysis corresponding to the results depicted in Figures 1 and 2. The figure shows two characteristics used in order to detect the percolation: the average epidemic size s and the probability that epidemics reaches a given radius, i.e. r = 50. Due to the fact that epidemics start from a single infected site, the average epidemic size is equivalent to the probability that a given site belongs do the largest (dominating) cluster, which is used as a standard percolation quantifier [34]. Figure 7 suggests that epidemic start to be observed for the average transmissibility located somewhere between the bond (0.5) and the site percolation (0.592) thresholds, which is in accordance with earlier theoretical and experimental considerations [11,12]. Mean values of the model parameters along with selected properties of final states are included in Table 1.
Complementary to the previous analysis based on infection and recovery rates, the average transmissibility can be used to explain phenomena depicted in Figures 3-6. The average transmissibility Ψ in the subsequent figures is equal to 0.58, 0.52, 0.52, 0.54, see Table 1 decreases the Ψ from 0.58 (a value close to the site percolation) to 0.52 (a value close to the bond percolation) explaining changes in epidemic severity as measured by the epidemic size distribution P (s). In Figures 5 and 6, the increase in σ R resulted in the increase of the average transmissibility. This in turn is responsible for higher chances of observing outbreaks characterized by larger radii and sizes.
The studied model displays non-trivial sensitivity to its parameters. Sometimes general conclusions can be difficult to draw because changes in heterogeneity usually also affect the mean values of parameters. Therefore, as indicated Changes in the average transmissibility responsible for differences among Figures 3-6 are due to changes in the average values of the infection P I and recovery P R probabilities per time step. These changes can be introduced by modifications of parameters of the probability density (3). Table 2 indicates how the average transmissibility Ψ changes with changes in σ I and σ R depending on the modal values μ I and μ R . The dependence of the average transmissibility on μ I and μ R is simpler: it is an increasing function of μ I and a decreasing function of μ R .
Within the model it is assumed that the probability per time step of infecting a susceptible individual is assigned to this particular susceptible individual. Therefore, we have considered the modified version of the model in which probabilities (per time step) to infect a susceptible agent are assigned to its infectious neighbors. In such a case the probability of infecting a susceptible individual i during a single iteration is not given by equation (1) but by the following formula where the interpretation of the parameters is the same as in equation (1). As it can be seen in Figure 8, the modified model leads to qualitatively the same results as the basic model, see equation (1) and Figure 5. On the one hand, the epidemic radius distribution P (r) and the mean first passage time to reach a given radius T (r) are practically the same. On the other hand, the size distribution P (s) and the first passage time T (s) to infect the domain of the size s are enlarged due to the increase in the epidemic size. This can be explained by the fact that the modified model results in a larger average infection probability per time step P S→I . If a susceptible individual i has n (n > 1) infected neighbors, P S→I = 1 − (1 − P I ) n for the modified model is larger than for the basic model P S→I = 1 − (1 − P I ) n . At the same time the average transmissibilities, see equation (5), stay the same.

Summary and conclusions
The SIR model [2] is an archetypal epidemiological model, which is almost a hundred years old. Despite its simplicity Fig. 8. The same as in Figure 5 for the modified model, see equation (6). Full symbols correspond to periodic while empty symbols to free boundary conditions. Simulation parameters μI = 0.1, μR = 0.5, σI = 0.5 and σR = 0.2.
the Kermack-McKendrick model can be used to describe complex and complicated situations even in realistic realms. Within the current manuscript, excessive computer simulations have been used to investigate how the heterogeneity of a population affects temporal and spatial properties of epidemics on lattices.
Variable infection and recovery rates result in nontrivial system properties. On the one hand, individuals which can be easily infected create paths along which epidemics can propagate. On the other hand, resistant individuals make impenetrable firewalls responsible for the creation of islands (patches) of uninfected individuals. Therefore, the system is sensitive to exact values of the mean infection probabilities per time step and (mean) transmissibilities. A minimal change in one of the system parameters can significantly modify mean values of parameters. Consequently, as it was demonstrated in Section 3, epidemic spread can be facilitated or reduced.
The distribution of epidemic radii has two maxima corresponding to epidemics which die out fast (small r) and those that propagate to peripheral parts of the system (large r). The mean first time that an epidemic needs to reach a given radius grows almost linearly with r. The epidemic size distribution always has a maximum corresponding to small outbreaks, but it can also have another local maximum corresponding to large epidemics.
Within the model epidemics end spontaneously, because there are no mechanisms of disease eradication. Further extensions of the model could introduce such mechanisms in order to provide answers to questions such as what the optimal control strategy is and how eradication strategies are affected by heterogeneity of individuals.
Computer simulations have been performed at the Academic Computer Center Cyfronet, Akademia Górniczo-Hutnicza (Kraków, Poland). Suggestions from Adam Kleczkowski are greatly acknowledged.