Co-colonization interactions drive explicit frequency-dependent dynamics among N types

A fundamental question in ecology is the generation and maintenance of biodiversity. Classical approaches consider multi-species communities with Lotka-Volterra ODE models where inter- vs. intra-species interactions are key. Typically in high-dimensional systems, analysis is hard and model reduction is needed. Here, we describe and study a new system of multi-type interactions that arise in co-colonization, and develop a useful model reduction framework based on slow-fast dynamics under quasi-neutrality. We show that in a multi-type, Susceptible-Infected-Susceptible (SIS) system with co-colonization, neutral coexistence dynamics between N closely related strains occurs on a fast timescale, based on mean trait values, whereas the non-neutral selective forces act on a slow timescale, driven by the variance and asymmetries of the co-colonization interaction matrix. The explicit N equations for relative type frequencies on the slow manifold enable efficient computation of complex multi-strain dynamics, and provide analytical insight into high-dimensional community ecology, with potential applications to other multi-body systems.


Introduction
A fundamental question in ecology and evolutionary biology relates to how biodiversity is generated and maintained [1,2,3,4]. Many factors have been shown to be important, including inter-vs. intra-species interactions [5], population size [6], number and functional links with resources [7], and movement in space [8]. Most research in this field has considered multi-species interactions with classical Lotka-Volterra models or with evolutionary game theoretic approaches.
In this Letter, we describe and study a new system of multitype interactions that arise in the epidemiological dynamics of co-colonizing strains of the same microbial species. An example of such a system could be polymorphic Streptococcus pneumonia bacteria. We generalize a previously-introduced Susceptible-Infected-Susceptible (SIS) framework for 2 circulating strains with co-colonization [16,9], to the context of N strains. We show that an explicit reduced system emerges from a time scale separation, expressing the total dynamics as a fast plus a slow component, related to broken symmetries in cocolonization interactions.
In our system, microbial strains can infect a host simultaneously, and here we concentrate only on the case of up to 2 strains co-colonizing a host, whereby single colonization by a resident strain alters the susceptibility to incoming strains, (increasing or decreasing it) by a factor K i j , relative to uninfected hosts, without acquired immunity. The transmission and clearance rates of all strains are equal, except for the interactions in co-colonization given by an N × N matrix. In earlier theoretical studies such interaction has been studied in a slightly different context in models considering arbitrary infection mul-tiplicity [10]. Later evolutionary frameworks have also considered vulnerability to co-infection [11]. In the experimental literature, for example in microbiota studies, the phenomenon of colonization resistance has been reported [12]. In pneumococcal bacteria, more relevant to our model, carriage of one serotype has been shown to alter the acquisition rate of a second serotype [13]. Some studies have shown how this trait at the host-pathogen interface impacts disease persistence [14], coexistence and vaccination effects [15,16], and contributes to diversity in other traits, e.g. virulence [17] and antibiotic resistance [18].
Yet, robust mathematical frameworks to analyze the full spectrum of eco-epidemiological patterns emerging exclusively from co-colonization interactions among a high number of strains remain undeveloped. Here, we fill this gap. We model an arbitrary number of closely-related strains, and show that in the N-strain system with altered susceptibilities in co-colonization, neutral coexistence dynamics between types occurs on a fast timescale, whereas the non-neutral selective forces act on a slow timescale, driven explicitly by trait variation in the cocolonization interaction space. These dynamics can be very complex, and are nonlinearly modulated by global parameters such as overall transmission intensity R 0 , and mean interaction coefficient between strains. We derive a closed analytic solution for relative type frequencies over long time-scales in a changing fitness landscape.
(SIS) epidemiological dynamics, with the possibility of cocolonization. The number of potentially co-circulating strains is N. With a set of ordinary differential equations, we describe the proportion of hosts in several compartments: susceptibles, S , hosts colonized by one type I i , and co-colonized hosts I i j , with two types of each combination, independent of the order of their acquisition. We have: where gives the force of infection of strain i. Remark that here we assume that hosts colonized with a mixture of two subtypes i and j transmit either with equal probability. For more information and interpretation of parameters see Table 1 (Supplementary materials 1). Notice that S = 1 − (I i + I i j ), thus the dimension of the system is indeed N + N 2 . In practice, I i j and I ji cannot be distinguished so the dimension is effectively N + N(N − 1)/2. For the special case of N = 2, this model has been described and analyzed elsewhere [16,9]. We study pathogen diversity only in relation to how strains interact with each other upon co-colonization (K i j ), assuming equivalence in transmission β and clearance rate γ (although extension to further asymmetries in these traits is ongoing). The coefficients K i j can be above or below 1, indicating facilitation or competition between any two types in co-colonization. In the above notation m = γ + r, encapsulating both clearance rate γ of colonization episodes and recruitment rate of susceptible hosts r, equal to the mortality rate from all compartments (r = d). For a model summary diagram see  The force of infection for each strain i is F i , and governs the transition from susceptible to singly-colonized state, and from singly-colonized to co-colonized state. Clearance happens at equal rate for single and co-colonization, back to the susceptbile state. Co-colonization rate by strain j of singly-colonized hosts with i is altered by a factor K i j relative to uncolonized hosts. b) Considering K i j = k + εα i j , the global epidemiological dynamics can be decomposed in a fast and slow component. On the fast time-scale (o(1/ε)), strains follow neutral dynamics, driven by mean parameters, where S , I and D are conserved. On a slow time-scale, εt, complex non-neutral dynamics between strains takes place, depicted here by the constituent variations within the blue and green.

Re-writing the system in aggregated form
In order to derive a reduced model for such a general N strain system, displaying complicated dynamics (Fig. 1b), we use the following notation: for the prevalence of strain i in the population, and the total prevalence of all strains, respectively. Total prevalence satisfies T = i J i and the forces of infection are: F i = βJ i . With these notations, the system (1) reads : This system of 2 + N + N 2 equations admit a triangular structure: i) First we describe block of 2 equations (S , T ). ii) Next, we study the block of 2N equations (I i , J i ), which is the most complicated. iii) Lastly, we deal with the block of the N 2 equations of I i j , which is simple once the dynamics of I i and J i are known. In the following we describe briefly these three blocks.
Clearly, if the basic reproduction number R 0 = β m > 1, then Hence, without loss of generality, we consider that (S , we obtain the following sub-system of 2N differential equations: The dynamics of (4) are complex and we use a slow-fast approach to analyze it.
The co-colonization block I i j . This is the simple systeṁ Once it is known that Thus, once the dynamics (4)-(5) are explicit, so are the dynamics of (1).

Slow-fast representation of (4)
Since we model N closely-related strains, we can write each co-colonization coefficient as: Replacing these in (4), and re-arranging, we obtain: where 1 ≤ i ≤ N, and I = I j .

Neutral system
If ε = 0, then from above, we obtain the Neutral model The first equation gives Co-colonization prevalence is simply derived as: D = T − I. Fixing I = I, yields the (degenerate) linear system: Matrix A 0 has the two eigenvalues 0 and −ξ = tr(A 0 ) < 0, and We define H i and z i as: We have Thus remains constant in a neutrally stable manner. H i measures the difference between the part of strain i on the single infected versus the part of the strain i on the total number of infected. Thus, H i → 0 means that the proportion of strain i in single colonization, tends to be the same as the proportion of strain i in the total mass of all colonization. This implies it also tends to be equal to the proportion of strain i in co-colonization. Because on the slow manifold, we have H i = 0, we can infer that during fast dynamics z i tends to: whereby on the slow manifold z i describes exactly the frequency of strain i among all hosts that are colonized, with z i = 1.
The slow-fast dynamics In (6) I plays a role in the order 0 term of the last two equations. Thus, we use the ansatz: I(t) = I + εX(t) + O(ε), and rewrite (6) (for details see Supplementary Materials 2). After some algebraic manipulations, using newly defined variables we obtain the explicit slow-fast system: . However, on the slow time scale τ = εt in (11) , z i is not constant, and letting ε → 0, one obtains the slow dynamics on z i , given by the N-dimensional system: for 1 ≤ i ≤ N, where the constants Θ, µ > 0 are: , and the term q(z) is given by: The constant Θ is the rate that sets the tempo of multi-strain "motion" on the slow manifold towards an equilibrium. Notice that Θ depends specifically on the absolute transmission rate of the pathogen β, but also on mean traits R 0 and k, via the conserved aggregated quantities T, I, D. The quantity µ on the other hand represents the ratio between single colonization and co-colonization prevalence from the neutral system, a factor that amplifies the importance of type asymmetry. The function q(z) represents the time-changing impact of all the strains on their environment, which in turn modifies the fitness landscape for each in an equal manner. A more explicit way to see q(z) is in terms of relative change in 'effective' mean interaction coefficient between all extant types in the system, which if negative, indicates a global trend toward more competition in co-colonization, and if positive, indicates a global trend toward more facilitation. At this stage, we can apply quasistationarity methods (see [19,20,21,22] ) to show that the solution of (11) tends to the solution of (12) as ε → 0 locally uniformly in time on [t 0 , +∞[ for some t 0 > 0.
Link with pairwise invasion fitnesses λ j i Next, we provide another equivalent very useful representation of the model by using the notion of pairwise invasion fitness [23]. Let λ j i be the exponential growth rate of the strain i evaluated when introduced at the trivial endemic equilibrium of the strain j alone. If the fitness λ j i > 0, strain i will invade, and viceversa. By considering the rate of growth of strain i in an equilibrium set by j (dz i /dt (12) in the special case where all z k = 0 for k j), we find the exact formulation of pairwise invasion fitness is given by: We can thus recast the system (12) using only these fitness notations (for details see Supplementary Material 3) : The above can be made more compact by denoting Λ = (λ j i ) i, j the pairwise invasion fitness matrix, and using vector notation: It is this matrix Λ that defines all 'edges' of the rescaled interaction network between N strains. For N = 2, similar to the classical Lotka-Volterra model, in our model, as already shown [9], there are only four possible outcomes between 2 strains (edge linking 1 and 2): i) λ 2 1 > 0 , λ 1 2 > 0 : stable coexistence of 1 and 2; ii) λ 2 1 < 0 , λ 1 2 < 0: bistability of 1-only and 2-only; iii) λ 2 1 > 0 , λ 1 2 < 0 : 1-only competitive exclusion; iv) λ 2 1 < 0 , λ 1 2 > 0: 2-only competitive exclusion. Knowing all pairwise invasion fitnesses between each couple of strains, via (15) we can reconstitute the ultimate dynamics of the full system with N types and co-colonization.
Going back to the original N + N(N − 1)/2 variables Recall the original system with N strains is given by the SIS model with co-colonization interactions (1). While the decomposition K i j = k + εα i j is mathematically non-unique, and can be applied with respect to any reference k, provided that the resulting ε is small, a possible convenient choice is to define k as the average of the original interaction matrix entries K i j : and to define deviation from neutrality, ε, as the root mean square distance of the K i j from the mean interaction coefficient: thus representing the standard deviation of the K i j traits. The direction of deviation from neutrality (bias) for the interaction between strain i and j is then obtained as: Thus, the matrix A = α i j 1≤i, j≤N is the standardized interaction matrix, with A 2 = 1≤i, j≤N α 2 i j = N. This matrix A, and the ratio µ, determine the pairwise invasion fitness matrix (13), which contains nearly all the qualitative information about the non-neutral dynamics (15). So far, we have shown that provided the deviation from neutrality, ε, is small, the behavior of (12) describes very well the long term dynamics of (1). To recover the original variables, after solving for strain frequencies z i on the slow manifold, as the dynamics of (1) are well approached by those of (12), we can use the relations: to obtain the total prevalence of susceptibles S , total prevalence of colonized individuals T , and prevalence of single colonization I. The prevalence of co-colonization is simply D = T − I. Further, to recover strain-specific single colonization, and cocolonization prevalences, we apply: where the slow time scale is τ = εt, and the slow variables z(τ) = (z i (εt)) i verify i z i (εt) = 1 and follow dynamics (15). Figure 3: Accuracy of our approximation over time a) Error between the neutral model and the full system, starting from random initial conditions. We summarize simulations for different ε, where for each ε, 20 co-colonization matrices of different size (N ∈ [2,15]) were randomly generated, together with initial conditions. On a fast time-scale (o(1/ε)) the neutral model is a very good approximation to the real system. b) Error of the neutral and the slow dynamics approximation, starting from the slow manifold (10). The error (mean over all model variables) between the original system trajectories and the neutral vs. slow dynamics. We generated random co-colonization matrices (N ∈ [2,15]) for 3 values of ε (10 −1 ,10 −2 ,10 −3 ) while keeping mean k = 1. The K i j were drawn from the normal distribution N(k, ε 2 ). On the slow manifold, the slowdynamics approximate well the original system, unlike the neutral model.
In Figure 2 and in Supplementary Movie 1, we provide illustrations of the modeling framework and some dynamics that are possible for an arbitrary number of strains. Some further illustrations for other values of parameters are given in Supplementary Figure 1 and Supplementary Movie 2.

Accuracy of the approximation and simulation speed
The slow-fast model being a good approximation of the original system, more precisely, means that there exists t 0 > 0 such that for any T > t 0 and any ε > 0 small enough, there is M > 0 such that for any τ ∈ (t 0 , T ), the error between the two is small: We confirm this with numerical simulations in Figure 3a)-b), where the neutral model is shown to be a good approximation of the original system in a fast time-scale, and the slow-dynamics reduction a good approximation on the longer time-scale. Furthermore, we verify that the slow-fast approximation enables exploration of dynamics in a much more computationally efficient manner over a very large number of strains N (Fig.4), which is difficult with the original system.

Discussion
In this study, we investigated how co-colonization interaction networks drive coexistence in a system with N closely related types. We derived a slow-fast decomposition for the dynamics, which links explicitly variation in the interaction matrix with the slow time scale under which multi-type selective dynamics unfold. Model reduction for small deviations from neutrality allowed us to express strain frequencies on the slow manifold via only N equations (instead of N + N(N − 1)/2 equations in the full system). These can be easily re-mapped to the more complex epidemiological variables of the original system. The slow variables z i (1 ≤ i ≤ N), describing relative strain frequencies in the host population, necessarily equal in single and co-colonization, could be used as a practical test for quasineutrality, when strain prevalence data are available.
Our focus here was on presenting the timescale decomposition method for this system. The wider and more complete ecological picture, as well as the analysis of diversity-stability relationships in coexistence will be the focus of another paper [24]. We believe this work makes a crucial step towards the full characterization of conservative multi-type SIS dynamics.
The slow-fast framework allows to reduce the complexity of multi-strain systems, and understand transient and long-term coexistence on conserved manifolds near neutrality. We expect our results to stimulate further progress for the investigation of coexistence and evolution in multi-strain epidemiological systems such as the one of pneuomococcal bacteria [25], where explicit mathematical results linking neutral and niche mechanisms, for N large, are thusfar missing. Although motivated by such a pathogen with altered susceptibilities in co-colonization, the global transmission dynamics captured here have broader implications. As an example, information on the resilience of a group of strains may be derived from the sign of q(z) in our system (12): if q(z) > 0 then each strain is less competitive within the group but the overall community is more resistant to invasion by a new strain. Our model's generality and closed analytical form for relative type dynamics provide a new bridge between epidemiology and community ecology, invite comparison with the classical Lotka-Volterra model, and offer extension to other multi-body problems in physics and genetics.
We acknowledge support by the University of Tours and EU-RAXESS for an invited researcher stay of Erida Gjini at IDP, in Tours during 2019.  R 0 Basic reproduction number of the pathogen, R 0 = β/m, here assumed equal among strains , [16] 2. Derivation details for the slow-fast dynamics Using the ansatz: I(t) = I + εX(t) + O(ε), system (6) reads wherein we have denoted: j I j α ji J i − I i α i j J j The final slow fast form is given by using the new variables H i and z i defined in (9). Hence, define Finally, using these new unknows, one obtains the expression of the slow fast system (11): where δ = β 2(T) 2 −ID is a positive constant. If ε → 0 then z i remains constant and (X(t), H(t)) → (φ(z), 0) where: On the slow time scale τ = εt in (11) , and letting ε → 0, one obtains the slow dynamics Taking (X, H) = (φ(z), 0) yields the slow dynamics reducted to the slow manifolḋ φ(z) is given by (18) and verifies Denoting µ = I/D and Θ = βTID 2(T) 2 − ID , the explicit slow dynamics is then given by system (12) which may be rewritten as Remark that we have: which implies that the simplexe P = z ∈ R N + , z j = 1 is invariant under (12).

Characterization of the model via invasion fitnesses
We focus on the trivial solutions of (12), which are the extreme competitive exclusion equilibria E i = (δ ik ) 1≤k≤n where δ ik is the In particular, we note λ j i the invasion fitness of the strain i with respect to the trivial solution E j . More precisely, λ j i is the percapita growth rate of the strain i evaluated at E j . Since the i-th line of the system reads d dτ z i = z i h i (z 1 , · · · , z n ), we have the explicit formula: We have that the line i of (21) isż With the notations introduced above, this can be rewritten as: From N j=1 z j = 1, we infer Plugging this into (23), we obtain a new version of system (21), enterely parametrized by the fitnesses λ j i : λ k j z j z k         , i = 1, · · · , N z 1 + · · · + z N = 1. (24) or more shortly by noting λ i i = 0 and the pairwise invasion fitness matrix Λ = (λ j i ) 1≤i, j≤N :

Supplementary Figures and Movies
Supplementary Movie 1. Here we illustrate the dynamic trajectory of the system ( Figure 2 of the paper). A. Network of pairwise invasion fitnesses between strains, where edges can be of 4 types: coexistence (red), bistability (blue), exclusion of i, exclusion of j (directed gray arrows toward the winner). The system trajectory z i is shown as a movement in a 6-dimensional space. B. Trajectory of the strain frequencies over time. Figure S1: Illustration of more complex dynamics. Here we simulate the model in its reduced form for another set of parameters, leading to limit cycle coexistence between 5 strains in an N = 6 system. See associated movie S2.

Supplementary Movie 2.
Here we illustrate the dynamic trajectory of the system ( Figure 1 of the Supplements). A. Network of pairwise invasion fitnesses between strains, where edges can be of 4 types: coexistence (red), bistability (blue), exclusion of i, exclusion of j (directed gray arrows toward the winner). The system trajectory z i is shown as a movement in a 6-dimensional space. B. Trajectory of the strain frequencies over time.