# Capturing the dynamics of pathogens with many strains

DOI: 10.1007/s00285-015-0873-4

- Cite this article as:
- Kucharski, A.J., Andreasen, V. & Gog, J.R. J. Math. Biol. (2016) 72: 1. doi:10.1007/s00285-015-0873-4


## Abstract

Pathogens that consist of multiple antigenic variants are a serious public health concern. These infections, which include dengue virus, influenza and malaria, generate substantial morbidity and mortality. However, there are considerable theoretical challenges involved in modelling such infections. As well as describing the interaction between strains that occurs as a result of cross-immunity and evolution, models must balance biological realism with mathematical and computational tractability. Here we review different modelling approaches, and suggest a number of biological problems that are potential candidates for study with these methods. We provide a comprehensive outline of the benefits and disadvantages of available frameworks, and describe what biological information is preserved and lost under different modelling assumptions. We also consider the emergence of new disease strains, and discuss how models of pathogens with multiple strains could be developed further in future. This includes extending the flexibility and biological realism of current approaches, as well as improving their interface with data.

### Keywords

Transmission model, Evolution, Cross-immunity, Multi-strain pathogens, Influenza

### Mathematics Subject Classification

37N25 Dynamical systems in biology, 92B Mathematical Biology

## 1 Introduction

Many human pathogens can be categorized into distinct strains, each defined by its antigenic properties (Balmer and Tanner 2011; Grenfell et al. 2004). These infections, which include influenza (Webster et al. 1992; Wilson and Cox 1990), dengue virus (Rothman 2011) and malaria (McKenzie et al. 2008), are responsible for substantial morbidity and mortality each year. Further, prior infection with one strain of a disease may not always protect against another. For instance, as the influenza virus evolves, antibodies generated against a specific past strain become progressively less effective against the current one (Davenport et al. 1953; Potter 1979). This results in a highly complex system, with pathogens interacting through the partial cross-immunity they generate in the host population. Examining the effect of this interaction on disease outbreaks has therefore posed a major challenge, both theoretically and biologically.

Population dynamic models can generate insights into the mechanisms that drive the transmission of an infection, as well as test new hypotheses about evolution and immunity. In this paper, we review current research into diseases with many strains, providing a detailed comparison of available methods. We also aim to identify key topics for future research that will unify recent developments in the field, providing powerful tools with which to understand the evolutionary, epidemiological and immunological dynamics of these diseases.

## 2 Multiple-strain models

### 2.1 Influenza

Mathematical models have long been used to study disease transmission (Kermack and McKendrick 1927) and control (Ross 1911), but the interaction of disease strains through cross-immunity is a more recent area of research. Early models looked at competition between two strains; infection with one strain conferred immunity to the other for the duration of infection. Such models have been implemented in both discrete time (Elveback et al. 1964) and continuous time (Dietz 1979). Following this work, Castillo-Chavez et al. (1989) introduced a model in which one strain could give imperfect cross-immunity to another. The work was motivated by the dynamics of influenza, and considered two interacting strains.

Although modelling studies have looked at multi-strain pathogens such as *Plasmodium falciparum* (Gupta and Day 1994; Gupta et al. 1994) or *Neisseria meningitidis* (Gupta et al. 1996; Buckee et al. 2008), influenza remains a central focus for theoretical work. As well as its impact on public health, with seasonal epidemics causing substantial morbidity and mortality, the virus undergoes frequent mutations, resulting in rapid turnover of seasonal strains, as well as between-subtype reassortment, which can lead to the occasional emergence of pandemic variants (Webster et al. 1992; Wilson and Cox 1990). Further, multiple influenza infections are possible during an individual’s lifetime, with a host’s history of infection and immunity determining the result of future exposures (Francis 1960). In turn, this collection of varying individual infection histories shapes the dynamics of the disease at the population level. Capturing the behaviour of diseases such as influenza can therefore require the use of a model that accounts for multiple strains.

One of the most detailed, and most computationally intensive, of these frameworks is the individual-based model (Bedford et al. 2012; Ferguson et al. 2003; Tria et al. 2005), which tracks the infection history of every host, updating individuals’ immune status as the disease spreads and evolves during a simulation. Alternatively, population models, in which individuals are grouped into compartments, provide a way of exploring disease dynamics that is analytically tractable as well as easier to implement numerically. One of the earliest of these was the susceptible–infective–recovered (SIR) model (Kermack and McKendrick 1927), which considers the proportion of the population susceptible to, infectious with, and recovered from (and hence assumed immune to) a particular infection. However, the SIR model focuses only on the dynamics of a single pathogen: it does not account for the evolving nature of the influenza virus. The susceptible–infective–recovered–susceptible (SIRS) model (Pease 1987; Girvan et al. 2002) can incorporate changes in immunity as a result of disease evolution by assuming that recovered individuals gradually become susceptible again to the currently circulating infection. This can be expanded further by including an additional set of ‘cross-immune’ individuals, resulting in the susceptible–infective–recovered–cross-immune (SIRC) model (Casagrandi et al. 2006). Although they can be explored analytically, the SIRS and SIRC models collect all information about population immunity into one or two variables, which means they do not record information about the combination of past infections that generated this immunity.
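To make the compartmental structure concrete, the following is a minimal sketch of an SIRS model in which recovered hosts return to the susceptible class at a constant rate. The waning rate `omega`, the forward Euler integrator, and the function names are illustrative assumptions, not details taken from the cited studies.

```python
def sirs_rhs(state, beta, gamma, omega, mu):
    """Right-hand side of a simple SIRS model in population proportions.

    beta: transmission rate; gamma: recovery rate;
    omega: assumed rate at which immunity is lost; mu: birth/death rate.
    """
    s, i, r = state
    infection = beta * s * i
    ds = mu - infection - mu * s + omega * r
    di = infection - (gamma + mu) * i
    dr = gamma * i - (omega + mu) * r
    return (ds, di, dr)


def euler(rhs, state, dt, steps, **params):
    """Integrate with fixed-step forward Euler (adequate for a sketch)."""
    for _ in range(steps):
        deriv = rhs(state, **params)
        state = tuple(x + dt * dx for x, dx in zip(state, deriv))
    return state
```

Because all immunity is summarised by the single variable `r`, this sketch cannot distinguish hosts by which past infections produced their immunity, which is the limitation described above.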

As we move from an individual-based model, which keeps track of both infection and immune history, to a simpler system, we inevitably sacrifice information for tractability. Population models of multiple strains can therefore be classified by the information that they retain, and the information they do not.

### 2.2 History-based models

#### 2.2.1 Two strain model

*infectiousness* of hosts who have previously been infected with the other strain. We define \(\beta _i\) to be the rate of transmission for a primary infection with strain \(i\). Therefore \(\Lambda _2=\beta _2(I_2+ \sigma J_2)\). Finally, we assume that each strain confers total immunity to itself and the population birth/death rate is \(\mu \). With these assumptions, the model is as follows,

As additional strains are added, the complexity of this model increases substantially. For \(n\) strains, the model has \((n+2) 2^{n-1}\) variables (Andreasen et al. 1997). To simplify the model, we can assume that individuals obtain an updated infection history immediately upon infection (Gupta et al. 1996; Ferguson and Andreasen 2002). This is equivalent either to assuming that hosts are immediately available for further infection—and hence superinfection is implicitly allowed—or having a mathematical approximation in which hosts spend a negligibly small part of their lives infected (i.e. \(\mu /\gamma \) is small).

We define \(\hat{I}_i=I_i+\sigma J_i\) to be the weighted proportion of the population who contribute to the force of infection for strain \(i\). Because infection history is updated immediately, individuals who are in the \(\hat{I}_i\) compartment are always in one of the \(S\) compartments too. Hence \(S_\emptyset +S_1+S_2+S_{12}=1\).
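As an illustration of this simplified system, the sketch below integrates one possible two-strain model with reduced transmission and immediate updating of infection history. The specific right-hand side, the recovery rate `gamma`, and the integrator are assumptions chosen to be consistent with the definitions of \(\hat{I}_i\), \(\sigma \), \(\beta _i\) and \(\mu \) above; it is not a transcription of the original equations.

```python
def two_strain_rhs(state, beta1, beta2, sigma, gamma, mu):
    """Two-strain history-based model with reduced transmission.

    state = (S_empty, S_1, S_2, S_12, Ihat_1, Ihat_2), where S_Y is the
    proportion with infection history Y and Ihat_i is the weighted
    proportion contributing to the force of infection for strain i.
    """
    s0, s1, s2, s12, i1, i2 = state
    lam1 = beta1 * i1  # force of infection, strain 1
    lam2 = beta2 * i2  # force of infection, strain 2
    return (
        mu - (lam1 + lam2 + mu) * s0,
        lam1 * s0 - (lam2 + mu) * s1,
        lam2 * s0 - (lam1 + mu) * s2,
        lam2 * s1 + lam1 * s2 - mu * s12,
        # Hosts who have seen the other strain are infected at the full
        # rate but transmit with relative weight sigma.
        lam1 * (s0 + sigma * s2) - (gamma + mu) * i1,
        lam2 * (s0 + sigma * s1) - (gamma + mu) * i2,
    )


def euler(rhs, state, dt, steps, **params):
    """Integrate with fixed-step forward Euler (adequate for a sketch)."""
    for _ in range(steps):
        deriv = rhs(state, **params)
        state = tuple(x + dt * dx for x, dx in zip(state, deriv))
    return state
```

In this form the four history derivatives sum to \(\mu (1 - S_\emptyset - S_1 - S_2 - S_{12})\), so the constraint \(S_\emptyset +S_1+S_2+S_{12}=1\) is preserved whenever it holds initially.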

#### 2.2.2 Extension to multiple strains

To include the reduced transmission assumption, we define \(\sigma (Y,i)\) to be the relative contribution to the force of infection for strain \(i\) from individuals who have infection history \(Y\). We require \(\sigma (\emptyset ,i)=1\) for any \(i\), so total lack of acquired immunity results in transmission at rate \(\beta \), and \(\sigma (Y,i)=0\) if \(i \in Y\), to prevent transmission of an already seen strain.
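The two requirements on \(\sigma (Y,i)\) can be illustrated with a small sketch. Here strains are labelled by integers on a line and cross-immunity depends on the distance to the nearest previously seen strain; this geometry, the exponential form and the `scale` parameter are hypothetical choices made only for illustration.

```python
import math


def sigma(history, strain, scale=2.0):
    """Relative transmission of `strain` by a host with infection `history`.

    history: frozenset of integer strain labels already seen.
    Satisfies sigma(empty, i) = 1 and sigma(Y, i) = 0 whenever i is in Y.
    """
    if strain in history:
        return 0.0  # an already seen strain is never transmitted
    if not history:
        return 1.0  # no acquired immunity: full transmission
    # Hypothetical rule: protection decays with distance to the
    # antigenically closest strain in the host's history.
    nearest = min(abs(strain - seen) for seen in history)
    return 1.0 - math.exp(-nearest / scale)
```

Any other function of the pair \((Y,i)\) could be substituted, provided it respects the same two boundary conditions.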

### 2.3 Model dimension

The drawback with history-based models is the sheer number of possible variables that the system generates: given \(n\) strains, there are \(2^n\) combinations of infections an individual could have seen (Andreasen et al. 1997). Although the dynamics of up to ten strains have been examined using a full history-based framework (Gomes et al. 2002), it has been technically challenging to explore more strains than this.
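The growth in model size can be checked directly. The sketch below enumerates the possible infection histories for \(n\) strains and evaluates the \((n+2) 2^{n-1}\) variable count quoted earlier for the full model; the function names are illustrative.

```python
from itertools import combinations


def infection_histories(n):
    """Return every subset of strains 1..n as a frozenset (2**n in total)."""
    strains = range(1, n + 1)
    return [frozenset(c)
            for k in range(n + 1)
            for c in combinations(strains, k)]


def full_model_variables(n):
    """Number of variables in the full n-strain model, (n + 2) * 2**(n - 1)."""
    return (n + 2) * 2 ** (n - 1)


if __name__ == "__main__":
    for n in (2, 4, 10):
        print(n, len(infection_histories(n)), full_model_variables(n))
```

For ten strains there are already 1024 possible histories, which gives a sense of why larger strain sets are hard to handle in the full framework.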

One alternative is to focus on the equilibrium dynamics of a completely symmetric system, where all strains have the same epidemiological properties (Abu-Raddad and Ferguson 2004). Models with this very specific structure are not intended to be fitted to disease data; rather, they allow some analytic progress on the problem of strain complexity by exploring symmetric extreme cases. Alternatively, reduced versions of the history-based model, described in the following sections, can be used to explore certain aspects of the system in a tractable way.

### 2.4 Model reduction via symmetry

It is possible to use the symmetry of the strain space to reduce the number of variables in Eqs. 15–16. For instance, suppose there are three strains, with strains 1 and 3 giving the same degree of cross-immunity to strain 2, but no cross-immunity to each other. If strains 1 and 3 have the same epidemiological properties, they can therefore be recorded as one variable in the system (Lin et al. 1999).

Next, suppose each strain is defined by two loci and two alleles, which is analogous in terms of symmetry to a circle of four strains (Gog and Swinton 2002). If cross-immunity only acts to reduce transmission (i.e. \(\tau \equiv 1\)), then Gupta et al. (1998) showed that symmetry of this strain space can be exploited to define the model using only 8 immunity variables, rather than \(2^4=16\). This approach was an extension of an earlier, approximate method, which used four immunity variables (Gupta et al. 1996). The approach can also be extended for strains with multiple alleles (Gupta et al. 1998).

The same type of reduction may be exploited more generally if we assume cross-immunity between strains—as measured by antigenic similarity—is consistent with a space in which a given set of strains can be organized into antigenic ‘neighbourhoods’ (Ferguson and Andreasen 2002). For each strain \(i\), define \(\{i\} = N_0(i) \subseteq N_1(i) \subseteq \dots \subseteq N_m(i)=\mathcal {N}\) to be a nested sequence denoting collections of strains that are within a particular antigenic distance of strain \(i\). At one end, we have the strain \(i\) itself, \(\{i\}\); at the other, the complete set of all strains \(\mathcal {N}\).

Reduction in model dimension via symmetry requires three main sacrifices to be made. First, cross-immunity must take the form of reduced transmission; a similar general reduction is yet to be achieved for a model with reduced susceptibility. Second, the cross-immunity function is constrained by the set of antigenic neighbourhoods, with the reduction dependent on the use of a minimum function, rather than a generic \(\sigma (Y,i)\) cross-immunity term. Finally, without information on individuals’ full infection histories, it is not possible to ascertain the level of immunity against new strains. For instance, suppose a novel strain \(z\), not previously included in Eqs. 19–20, were introduced to the population. Using the information stored in the \(2^n\) variables, it would be possible, via the \(\sigma (Y,z)\) function, to calculate susceptibility to the new strain in the full history-based model. However, in this reduced framework there is no way of constructing the new \(\hat{S}_z^k\) compartments from the existing set of variables, \(\{ \hat{S}_i^k ~|~ i \in \mathcal {N}, k=0,\dots ,m\}\).

### 2.5 Model reduction via age structure

*at least* strain \(i\). Formally,

Although Eqs. 23–24 only depend on \(i\) explicitly, \(Q_i\) is a function of all \(2^n\) sets, and unless each \(S_Y\) can be expressed in terms of \(\hat{S}_i\) variables, the system is intractable. One solution is to introduce age structure (Kucharski and Gog 2012a).

If immunity acts only to reduce transmission, one might naively expect the probability of having been infected with any two particular strains to be independent: infection with the first strain will not change the rate at which hosts become infected with the second, just the rate at which they transmit. Hence in a two-strain model, we might expect \(S_{12}=S_1 S_2\). However, if we know a randomly chosen host has previously been infected with the first strain at some point in their life, it means they are more likely to be old than young. Hence they are more likely to have also experienced another specific event in the past, such as infection with the second strain. This means \(S_{12} \ge S_1 S_2\).
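The age effect can be checked with a simple calculation. The sketch below assumes constant forces of infection `lam1` and `lam2`, an exponential age distribution with rate `mu`, and independence of the two infections conditional on age; it reads \(S_i\) as the proportion infected with at least strain \(i\). These are simplifying assumptions made for illustration only.

```python
def seen_one(lam, mu):
    """Proportion ever infected with one strain.

    Integral over age a of mu*exp(-mu*a) * (1 - exp(-lam*a)).
    """
    return lam / (lam + mu)


def seen_both(lam1, lam2, mu):
    """Proportion ever infected with both strains.

    Integral over age a of
    mu*exp(-mu*a) * (1 - exp(-lam1*a)) * (1 - exp(-lam2*a)).
    """
    return (1.0
            - mu / (mu + lam1)
            - mu / (mu + lam2)
            + mu / (mu + lam1 + lam2))


def excess_over_independence(lam1, lam2, mu):
    """S_12 - S_1 * S_2, which is non-negative under these assumptions."""
    return seen_both(lam1, lam2, mu) - seen_one(lam1, mu) * seen_one(lam2, mu)
```

Within a single age group the two infections are independent in this sketch; the positive excess appears only after averaging over ages, which is the motivation for stratifying by age.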

As well as the necessary reduced transmission assumption, which was also required in the previous section, there are two additional drawbacks to the age-structured approach. The introduction of age dependency increases model complexity, requiring a system of PDEs rather than ODEs, making it challenging to obtain analytic results, and the requirement that infection with each strain is independent for a specific age group also limits the type of population structure that can be imposed. Although Eqs. 28–29 are still valid if more realistic transmission between age groups is introduced (Kucharski and Gog 2012c), a metapopulation framework, for example, with commuting between patches would not be possible because individuals arriving from different subpopulations may have previously been exposed to different strains (Wikramaratna et al. 2014).

### 2.6 Status-based models

The full individual-based model records both the infection history and current immune status of each host. However, there may not be a straightforward relationship between the two: for influenza, infections may not always produce an immune response, and immunity to a certain strain could potentially be generated by one of several past infections (Potter 1979). In principle, it should be possible to develop a compartmental model that accounts for both infection history and immune status. However, in practice the number of possible combinations of infection history and immune status (and hence compartments required) would likely result in a model more complex than even a full individual-based framework. To ensure analytical and computational tractability, history-based models therefore capture the individual infection histories in a population, but not the immune statuses; status-based models (Gog and Swinton 2002) do the opposite, recording the current immune status of individuals in the population, but not the combination of past infections that generated that immunity.

In a history-based framework, partial cross-immunity must take the form of every individual being equally partially immune to a particular strain. In other words, it is assumed that individuals with the same infection history will respond to subsequent infection in an identical way: if the set of strains \(Y\) have been previously seen, they will transmit strain \(i\) with probability \(\sigma (Y,i)\).

An alternative assumption is that upon infection some individuals become completely immune, while the rest remain susceptible. This is known as ‘polarized immunity’ (Gog and Swinton 2002). It is possible to include this assumption if the model is ‘status-based’, with compartments that represent which strains an individual is totally immune to. The assumption of polarized immunity is not essential in a status-based model (see below), although it does serve to give perhaps the simplest verbal interpretation of the system and also illustrates a type of model that cannot be captured in a history-based framework.

In an equivalent interpretation of Eqs. 33–34, the variable \(\theta _i\) can be thought of as tracking the ‘effective susceptibility’ of the population to strain \(i\). Under the reduced transmission assumption, this means the contribution to the force of infection if the full population were infected.

Such an interpretation includes the possibility that individuals are partially immune, and hence upon infection would transmit at a lower rate. These individuals are represented in the model by being partially in the \(\theta _i\) compartment (Gog 2008). In the reduced transmission status-based model, \(\theta _i\) can therefore be thought of as a sum of the population proportions weighted by their relative transmission potential for strain \(i\).
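A status-based system of this kind needs only two variables per strain. The sketch below shows one commonly used reduced-transmission form, in which infection with strain \(j\) removes hosts from the effective susceptible pool for strain \(i\) with weight `cross[i][j]`. The right-hand side, the `cross` matrix and the parameter names are assumptions for illustration, and the block should not be read as a transcription of Eqs. 33–34.

```python
def status_based_rhs(theta, infected, beta, cross, gamma, mu):
    """Right-hand side of a reduced-transmission status-based sketch.

    theta[i]: effective susceptibility to strain i.
    infected[i]: proportion infectious with strain i.
    cross[i][j]: assumed probability that infection with strain j
                 confers immunity to strain i (cross[i][i] = 1).
    Returns (d_theta, d_infected) as two lists of length n.
    """
    n = len(theta)
    d_theta, d_infected = [], []
    for i in range(n):
        # Total rate at which immunity to strain i is acquired,
        # summed over infections with every circulating strain j.
        pressure = sum(beta[j] * cross[i][j] * infected[j] for j in range(n))
        d_theta.append(mu - pressure * theta[i] - mu * theta[i])
        d_infected.append(beta[i] * theta[i] * infected[i]
                          - (gamma + mu) * infected[i])
    return d_theta, d_infected
```

With a single strain and `cross = [[1.0]]` the sketch reduces to the standard SIR equations, and for \(n\) strains it requires \(2n\) variables rather than \(\mathcal {O}(2^n)\).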

### 2.7 Comparison of models

#### 2.7.1 Model structure

In mathematical modelling of biological systems, there is often a need to balance complexity, particularly the number of variables in a model, with the ability to include biologically realistic assumptions. When evaluating strain models here, we also consider whether a model can incorporate the addition of new strains mid-way through a simulation without recording additional variables in advance.

Table 1 Comparison of different models for \(n\) strains (HB: history-based; SB: status-based)

Model | Type | Variables | New strains | Immunity reduces |
---|---|---|---|---|
Individual-based model | – | Many | Yes | Susceptibility/transmission |
Andreasen et al. (1997) | HB | \(\mathcal {O}(2^n)\) | Yes | Susceptibility/transmission |
Gupta et al. (1998)\(^*\) | HB | \(\mathcal {O}(n)\) | No | Transmission |
Kucharski and Gog (2012a) | HB | \(\mathcal {O}(n)\) | Yes | Transmission |
Gog and Swinton (2002) | SB | \(\mathcal {O}(2^n)\) | No | Susceptibility |
Gog and Grenfell (2002) | SB | \(\mathcal {O}(n)\) | No | Transmission |
Kryazhimskiy et al. (2007) | SB | \(\mathcal {O}(n^2)\) | No | Susceptibility/transmission |

From a biological point of view the assumption of reduced transmission can be awkward (Ballesteros et al. 2009; Kryazhimskiy et al. 2007). This is because upon infection we expect two events: the host becoming ill and transmitting the disease, and the production of antibodies by the host’s immune system. If the host already has immunity to that strain, their current antibodies might block infection without transmission or production of new antibodies occurring. Under the reduced infectivity assumption, immunity prevents an infected host from transmitting the virus, but does not prevent additional gain of immunity. This could lead to an overestimate of population immunity (Ballesteros et al. 2009). Despite this potential caveat, however, the dynamics of the history-based model appear to be relatively insensitive to whether immunity is assumed to reduce transmission or susceptibility (Ferguson and Andreasen 2002).

There is also the issue of whether cross-immunity is more plausible as it appears in a history-based model, or a status-based model with polarised immunity. Comparing model output with the observed evolutionary dynamics of influenza can provide some insights (Ballesteros et al. 2009). Although it has been suggested that antigenic cluster replacement cannot occur in a simple SIR model (Gökaydin et al. 2007), the results of Ballesteros et al. (2009) imply that it is possible in a status-based model with reduced transmission, but not in reduced susceptibility models, or a reduced transmission history-based model: in these, punctuated antigenic evolution results in too high a depletion of susceptibles. In addition, there appears to be a fundamental difference in the dynamics of the status-based and history-based models, with oscillations absent in the status-based framework (Dawes and Gog 2002). The precise assumptions that lead to oscillations in different strain models are yet to be established, however, and the determining factors in a mathematical framework may not have comprehensible analogues in our interpretations of the ‘biology’ of the model (Dawes and Gog 2002).

These discrepancies illustrate the importance of understanding how different assumptions about cross-immunity affect model outputs. Biologically plausible assumptions do not necessarily generate biologically plausible dynamics, and vice versa. Moreover, choosing between two biologically distinct assumptions—such as reduced susceptibility and transmission—can sometimes have a negligible effect on model dynamics. Strain models inevitably have to balance realism with tractability; it is therefore important to know how different simplifications and assumptions influence model predictions.

In some cases, it is possible to incorporate additional realism without substantial additional complexity. History-based models require that all individuals with a particular infection history respond to a new strain in the same way. In contrast, status-based models using polarised immunity can include heterogeneity in immune response between individuals, as can individual-based models. For history-based models, one way to incorporate heterogeneity is to explicitly group hosts by a characteristic such as genotype (Gupta and Galvani 1999). Alternatively, if the characteristic of interest is age, differences could be explored in an age-structured reduced model: individuals are already grouped by age, and cross-immunity could therefore be defined as a function of age as well as infection history. Such a framework could be used to explore the phenomenon of ‘immunosenescence’, whereby the elderly exhibit a weaker immune response than younger groups (Caruso et al. 2009).

Further, in status-based models probability distributions have been used to vary the transition from one immune status to another (Cobey and Pascual 2011). This idea could also be used in history-based models, by treating the cross-immunity parameter as a random variable. There is also potential for this approach to be combined with within-host approximations (Pepin et al. 2010; Volkov et al. 2010). This would allow for more detailed exploration of how assumptions about the immune system affect population level dynamics.

#### 2.7.2 Biological applications

As well as depending on the assumptions that can be included, choice of modelling formulation will be influenced by the biological question being addressed. If the aim is to understand how cross-immunity affects the dynamics of the infection, models need to be sufficiently simple to simulate the number of hosts infectious with each strain at each point in time. Knowledge of the precise population history of infection and immunity is not required, as long as disease incidence is recorded. Some studies have used such models to examine the extent of antigenic variation over time, and the antigenic relationship (‘strain structure’) between co-circulating strains (Gog and Grenfell 2002; Gomes et al. 2002; Gupta et al. 1996, 1998). Other studies have looked at factors that can generate oscillations in strain incidence (Castillo-Chavez et al. 1989; Dawes and Gog 2002), or the frequency at which epidemics occur (Andreasen 2003).

Previous comparisons of strain models have generally focused on the dynamics of a small number of strains (Dawes and Gog 2002; Ferguson and Andreasen 2002; Ballesteros et al. 2009). However, one of the strengths of reduced frameworks is that they allow a much larger strain space to be explored. The price of this simplicity is usually information: it is often not possible to introduce a novel strain and use existing variables to calculate immunity against it (Table 1).

There are several biological questions which require effective tracking of population immunity. To examine whether a new strain can replace endemic strains, it is necessary to record the changing immune structure of the population in a tractable model. This can either be achieved by focusing on a small number of strains (Ballesteros et al. 2009), or by making simplifying assumptions about the cross-immunity function (Andreasen and Sasaki 2006; Boni et al. 2004).

History-based models, which record the possible infection histories in a population, can also be compared with observed serological data to understand how individual- and population-level factors shape antibody responses over time (Kucharski and Gog 2012c). Using models that can calculate cross-immunity against unseen strains, it should also be possible to examine the introduction of novel strains similar to those that have previously circulated, as happened with the 2009 influenza pandemic (Miller et al. 2010; Xu et al. 2010).

## 3 Incorporating pathogen evolution

### 3.1 Stochastic emergence

New influenza strains emerge frequently through mutations with antigenic effects (Both et al. 1983). When modelling influenza evolution it is therefore necessary to consider the random nature of the mutation and emergence process, and integrate this with a model of the epidemic dynamics. In an individual-based model (Bedford et al. 2012; Ferguson et al. 2003; Tria et al. 2005), both processes can be modelled explicitly: within each infected individual, a strain may mutate with a certain probability, with the new infection either taking off or failing to emerge as a result of stochastic transmission in the population. However, emergence of new strains can also be included in the reduced frameworks described in the preceding sections.

As with choice of strain model, selecting an appropriate evolution framework depends on the biological dynamics of interest, and on the corresponding processes that are likely to be influenced by stochasticity. One study used a stochastic status-based model with evolution represented by an explicit genotype-to-phenotype map to investigate observed patterns of influenza diversity and strain replacement (Koelle et al. 2006). Alternatively, the evolutionary process can be implemented in a stochastic two-tiered model, with one tier representing population dynamics, and the other molecular evolution (Koelle et al. 2010). Such an approach makes it possible to model entire genetic sequences in a computationally viable way, and hence generate phylogenies that can be compared quantitatively with observed data.

If the information of interest is the speed of antigenic drift rather than the specific evolutionary trajectory of the virus, a simpler approach is to use a stochastic proxy for the evolution process, based on a probability distribution, along with a deterministic status-based (Koelle et al. 2009) or history-based (Minayev and Ferguson 2009) model for the epidemic. Such models assume mutation is stochastic, but do not include the possibility of extinction at the start of an epidemic as the result of a stochastic transmission process. Branching processes can be used to approximate the stochastic transmission, or extinction, that can occur as a result of the small number of people initially infected with a new strain (Gog 2008; Kucharski and Gog 2012b). Such models assume that virus mutation is deterministic, but that transmission is initially stochastic when a new strain emerges.
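The branching process idea can be illustrated with a short calculation of the probability that a newly introduced strain dies out. The sketch assumes a single initial case and a Poisson-distributed number of secondary cases with mean `r0`; both the offspring distribution and the function name are illustrative assumptions rather than the specification used in the cited studies.

```python
import math


def extinction_probability(r0, tol=1e-12, max_iter=100000):
    """Extinction probability of a branching process with Poisson offspring.

    Iterates q -> exp(-r0 * (1 - q)) from q = 0, which converges to the
    smallest non-negative fixed point of the offspring generating function.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(-r0 * (1.0 - q))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def emergence_probability(r0):
    """Probability that a single introduction leads to a large outbreak."""
    return 1.0 - extinction_probability(r0)
```

When `r0` is at most one the iteration converges to one, so the new strain is certain to go extinct; above one there is a positive chance of emergence, which a deterministic model alone would not capture.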

### 3.2 Separating epidemics from evolution

As there is evidence that temperate regions are annually ‘seeded’ with influenza after low levels of prevalence over the summer (Nelson et al. 2006), it is reasonable to consider the epidemic process during an influenza season separately from the evolutionary process between seasons.

Further, under the single season approach evolution need not be independent of the epidemic process: Boni et al. (2004) explored a framework in which larger epidemics generated more antigenic drift, showing that a positive feedback can occur between the number of cases and number of new variants. Such a model can also be extended to examine the relationship between viral fitness and drift (Boni et al. 2006).

### 3.3 Mutation-free approaches

It is also possible to model aspects of influenza dynamics without considering an explicit mutation process. Competition between strains resulting from cross-immunity has been seen to generate oscillations in disease incidence (Gupta et al. 1998; Lin et al. 1999) as well as sequential outbreaks of antigenically diverse pathogens (Recker et al. 2007). Further, the additional model tractability in the absence of a mutation process makes it possible to derive expressions for the conditions needed for invasion of new strains (Adams and Sasaki 2007, 2009) and transitions in epidemic dynamics (Blyuss and Gupta 2009; Blyuss 2012).

## 4 Interface with data

### 4.1 Infection and immunity

Tractable disease models have the advantage of being quick to simulate, which means they can be incorporated into inference frameworks. With the increasing availability of detailed serological and social contact data (Conlan et al. 2010; Lessler et al. 2011; Mossong et al. 2008), multi-strain models are a promising tool with which to understand the processes behind infection and immunity for diseases such as influenza.

Exploring the age pattern of immunity to seasonal influenza with such models, it has been shown (Kucharski and Gog 2012c) that observed data are best explained with a model that uses physical contacts and incorporates the phenomenon of ‘original antigenic sin’, whereby the first infection of a lifetime inhibits subsequent acquisition of immunity (Francis 1960). To model original antigenic sin, it is necessary to know the order of the strains in a host’s infection history; in particular, the antigenic properties of the first infection of a lifetime. Such information is available for a single season model, as the framework is designed so that one specific strain is introduced each year (Andreasen 2003). However, it would be less straightforward to keep track of strain order if multiple strains were co-circulating.

Recent empirical studies have suggested a pattern of ‘antigenic seniority’ for influenza, with antibody titres higher to ‘senior’ strains seen earlier in life (Lessler et al. 2012; Miller et al. 2013). It has also been noted that elderly individuals have fewer naive B cells (Weinberger et al. 2008). There may well be a unifying mechanism behind these disparate observations; identifying it would greatly improve our understanding of how populations build immunity to diseases like influenza over the course of a lifetime.

The interaction of different strains through cross-immunity can also be examined using disease case data. Inference methods based on Sequential Monte Carlo algorithms have recently been developed for two-strain models (Shrestha et al. 2011); the natural next step would be to scale up this approach to explore a larger number of strains.

### 4.2 Other pathogens

Many multi-strain modelling studies have focused on influenza (Andreasen et al. 1997; Gog and Grenfell 2002; Ferguson et al. 2003), *Plasmodium falciparum* (Gupta and Day 1994; Gupta et al. 1994) or *Neisseria meningitidis* (Gupta et al. 1996; Buckee et al. 2008), but high-dimensional frameworks are likely to be relevant to other pathogens as well. Phylogenetic analysis has shown that dengue viruses (DENV) can be subdivided into four distinct groups (Grenfell et al. 2004), known as serotypes. When primary infection with one DENV serotype is followed by a secondary infection with a different DENV serotype, it can result in severe disease. It has been suggested this is caused by ‘antibody-dependent enhancement’, with prior immunity promoting rather than suppressing replication of the second virus (Dejnirattisai et al. 2010).

Models with up to four strains have previously been used to examine the effects of antibody-dependent enhancement on epidemic patterns (Nagao and Koelle 2008; Recker et al. 2009; Wikramaratna et al. 2010) and serotype diversity (Ferguson et al. 1999; Kawaguchi et al. 2003; Cummings et al. 2005). However, DENV shows evidence of genetic variation within serotype, and prior infection with different variants does not always result in the same response to a subsequent infection (Watts et al. 1999). The severity of disease is therefore likely determined by both genetic variation and serotype-specific immunity (Ohainle et al. 2011). High-dimensional strain models have been used to address a number of questions about influenza; interfaced with appropriate data, they would also be a natural tool with which to explore DENV.

### 4.3 Evolutionary dynamics

Multi-strain models can also be a useful tool for exploring disease ‘phylodynamics’: the interaction between pathogen evolution and population-level transmission (Grenfell et al. 2004). It can be challenging to perform statistical inference with such frameworks, however, as model complexity often means it is not possible to derive a likelihood function that incorporates both genetic and population-level data.

One solution is to focus on simple transmission models. For example, statistical inference for seasonally-forced SIR models can be performed using disease case data and sequence data (Rasmussen et al. 2011). Using techniques such as particle Markov chain Monte Carlo (Andrieu et al. 2010), such frameworks can be used to estimate key epidemiological parameters. Alternatively, approximate Bayesian computation (ABC) can be used to compare multi-strain models with case reports and sequence data when the likelihood function is intractable (Ratmann et al. 2012).

Combining such data in a modelling framework presents a number of theoretical challenges. First, to translate model variables into quantities that can be measured empirically, multi-strain models need to be combined with an observation process. For example, given the increasing availability of serological data, combining a statistical model of antibody titres with a transmission model would make it possible to investigate population level dynamics using data on individual-level titres. Second, we need to examine the factors that influence evolution at different temporal and spatial scales. In particular, models could be used to explore how selection pressure acts on a virus both in terms of bottlenecks during transmission between hosts and the background of prior population immunity. Models of multi-strain pathogens could also be used to understand how within-host immune dynamics influence the acquisition of immunity over a lifetime, and hence the evolutionary trajectory of a disease. By developing such models in a way that makes them easily compatible with data, there is potential to substantially improve our knowledge of how population structure, evolution and immunity contribute to observed disease dynamics.

## Acknowledgments

The authors would like to thank two anonymous referees for their helpful comments. AJK is supported by the Medical Research Council (fellowship MR/K021524/1). AJK and JRG are supported by the RAPIDD program of the Science & Technology Directorate, Department of Homeland Security, and the Fogarty International Center, National Institutes of Health.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.