Abstract
We investigate two stochastic models of a growing population with discrete and non-overlapping generations, subject to selection and mutation. In our models each individual carries a fitness which determines its mean offspring number. Many of these offspring inherit their parent’s fitness, but some are mutants and obtain a fitness randomly sampled, as in Kingman’s house-of-cards model, from a distribution in the domain of attraction of the Fréchet distribution. We give a rigorous proof for the precise rate of superexponential growth of these stochastic processes and support the argument by a heuristic and numerical study of the mechanism underlying this growth. This study yields in particular that the empirical fitness distribution of one model in the long time limit displays periodic behaviour.
1 Introduction
While the theory of branching processes is undoubtedly one of the best developed areas of probability theory, stochastic branching models that incorporate effects of selection and mutation have only recently become the subject of rigorous mathematical analysis. This is despite the unquestionable relevance of these effects to the evolution of populations in nature and in the laboratory [1,2,3].
By contrast, deterministic high density models of a population undergoing selection and mutation have been studied for quite some time [4]. The model most closely associated with our stochastic process is Kingman’s model [5]. This is a dynamical system on the space of probability measures describing the fitness distribution of a population. The fitness distribution \(p_t\) at generation t of the population is replaced in the next generation by
\[ p_{t+1}(dx) = (1-\beta )\,\frac{x\,p_t(dx)}{\int y\,p_t(dy)} + \beta \,\mu (dx). \tag{1} \]
Here a proportion \(1-\beta \) of the new generation has been selected from the current generation proportionally to their fitness and a proportion \(\beta \) are mutants that get a new fitness, sampled independently of their past using a fixed mutant fitness distribution (MFD) \(\mu \). Note that this model is only well-defined if the mean fitness of the population remains finite and therefore requires moment bounds for the MFD. For certain MFDs with bounded support Kingman’s model undergoes a condensation phase transition, which implies that a nonzero fraction of the total population attains the maximally possible fitness value when the mutation rate is low enough [6]. A rigorous analysis of the condensation transition can be found in [7], and variants of the model have been considered in [8,9,10].
Kingman’s model is based on two main assumptions about the evolutionary process. First, the fitness of a mutant is assumed to be random and independent of the parental fitness, a setting often referred to as the “house-of-cards” model [4, 5, 11]. Second, each mutation gives rise to a new genetic type or allele, an assumption known as the infinite alleles model [12]. Park and Krug [13] studied a stochastic version of Kingman’s model which incorporates these two features. The population is updated in discrete generations following asexual Wright–Fisher dynamics, which can be viewed as a branching process conditioned on a constant population size N, see [1]. For unbounded MFDs the mean population fitness grows without bound. Since the MFD is time-independent, the fraction of beneficial mutations declines indefinitely over time. As a consequence, for long times beneficial mutants emerge and evolve independently of each other, and the dynamics can be analyzed rather straightforwardly in terms of a record-like point process [13]. In particular, for an MFD with an exponential tail the mean population fitness increases logarithmically, which is consistent with the behaviour observed in Lenski’s long-term evolution experiment with bacteria [2, 14]. A generalization of this model that includes the response of the immune system to a population of pathogens was considered in [15].
The decline of the supply of beneficial mutations distinguishes the fixed population version of Kingman’s model from a related class of stochastic models where the selection coefficients of novel mutants, rather than their fitness values, are drawn from a given, time-invariant distribution [1]. In these models the fitness distribution of the population converges to a fitness wave traveling at a constant speed, which is determined by the interference between competing mutant clones [16,17,18,19,20,21].
The branching process version of Kingman’s model considered in this paper is, in a sense, intermediate between the deterministic model (1) and the stochastic finite population model of [13]. The dynamics of the branching process are stochastic, but because of the unbounded growth of the population, competing clones can coexist at arbitrarily long times and the population retains a nontrivial type structure. While our motivation here is primarily conceptual and mathematical, we note that the clonal composition of growing populations is a problem of considerable interest for the modeling of proliferating tumours [22,23,24]. In this context, Durrett et al. [22] studied a branching process with selection and mutation where, similar to the fitness wave models described in the previous paragraph, the selection coefficients of beneficial mutations are drawn from a fixed, continuous probability distribution with bounded or unbounded support.
The first papers studying branching process models of Kingman type that express the selective advantage of a fit individual in terms of its offspring distribution are [25], which deals with Weibull type MFDs and puts the focus on the condensation phenomenon in that model, and [26] which looks at the growth of the fittest family in the case of Gumbel type MFDs. Both papers are limited to bounded MFDs and implicitly rely on the analogy to Kingman’s original model, though of course the methods of study are entirely different in a stochastic setting. The present paper initiates the study of Kingman type branching processes with selection and mutation for unbounded MFDs. We focus on the case of Fréchet type MFDs, where the mathematical challenge is linked to the fact that the analogous Kingman model is ill-defined [13].
The structure of the paper is as follows. In Sect. 2, we introduce the models and state the main theorem. Section 3 explains the heuristics behind the formal results and in Sect. 4 we present a rigorous proof of the theorem. Section 5 contains refined results for the empirical fitness distribution for one of our models. These results are not yet accessible by a complete rigorous mathematical analysis, so that we resort to a numerical and heuristic study and a rigorous analysis of an approximating deterministic system. In Sect. 6 we provide a short discussion that places our results into the context of previous work and points to directions for future research. [3]
2 Models and Main Result
We study two models of a population evolving in discrete and non-overlapping generations. In both models all individuals are assigned a fitness value, which is a positive real number. As model parameters we fix a probability distribution \(\mu \) on \((0,\infty )\) from which the random fitness values F are sampled, referred to in the following as the mutant fitness distribution (MFD), and a mutation probability \(\beta \in (0,1)\). As was explained in the Introduction, we assume an infinite alleles model with a house-of-cards fitness landscape.
In both models we start from generation \(t=0\) with a single individual (see Footnote 1) with fitness f. Each individual in the population in generation \(t\ge 0\) produces a Poisson random number of offspring with mean given by its fitness. With probability \(1-\beta \) an offspring individual inherits its parent’s fitness and is added to the population at generation \(t+1\). Otherwise, with probability \(\beta \), it is a mutant. The two models differ in the fate of the mutants.
- Fittest mutant model (FMM): Every mutant is assigned a fitness sampled independently from \(\mu \). Only the fittest mutant (if there is one) is added to the population at generation \(t+1\). All other mutants die instantly.
- Multiple mutant model (MMM): Every mutant is assigned a fitness sampled independently from \(\mu \) and is added to the population at generation \(t+1\).
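The update rule shared by the two models can be sketched in code. The following Python fragment is a minimal illustration under assumptions made here and not taken from the paper: the function names, the representation of the population as a dictionary of clones, the Pareto choice for \(\mu \), and the Gaussian stand-in used for large Poisson means are all hypothetical conveniences.

```python
import math
import random

def poisson(mean, rng):
    """Draw a Poisson variate (exact for small means, approximate for large ones)."""
    if mean <= 0:
        return 0
    if mean < 30:
        # Knuth's multiplication method: exact, but slow for large means.
        limit, k, prod = math.exp(-mean), 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        return k
    # Assumption: a rounded Gaussian is an acceptable stand-in for large means.
    return max(0, round(rng.gauss(mean, math.sqrt(mean))))

def pareto(alpha, rng):
    """Sample a fitness with tail x**(-alpha) for x >= 1 (illustrative choice of mu)."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def next_generation(clones, beta, alpha, model, rng):
    """Advance one generation; `clones` maps fitness -> number of individuals."""
    new_clones = {}
    mutants = 0
    for fitness, count in clones.items():
        # Poisson thinning: non-mutant and mutant offspring are independent Poissons.
        kept = poisson((1 - beta) * count * fitness, rng)
        if kept:
            new_clones[fitness] = new_clones.get(fitness, 0) + kept
        mutants += poisson(beta * count * fitness, rng)
    draws = [pareto(alpha, rng) for _ in range(mutants)]
    if model == "FMM":
        # Only the fittest mutant survives; all other mutants die instantly.
        draws = [max(draws)] if draws else []
    for w in draws:
        new_clones[w] = new_clones.get(w, 0) + 1
    return new_clones

rng = random.Random(1)
population = {5.0: 1}  # a single founder with fitness f = 5
for _ in range(3):
    population = next_generation(population, 0.2, 2.0, "MMM", rng)
```

Because the population grows superexponentially on survival, a direct simulation of this kind is only practical for a handful of generations.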
We write X(t) for the number of individuals in generation t and study the growth of the population conditioned on the event of survival, i.e. when \(X(t) \ne 0\) for all times t. It is easy to see that the population size of the MMM dominates the population size of the FMM at all times. Because the growth is determined by the fittest mutants, we expect both models to grow at the same rate. To show this, it suffices to find an upper bound for the MMM and a matching lower bound for the FMM.
Naturally, the rate of growth depends on the MFD \(\mu \). If \(\mu \) is an unbounded distribution, then in both models individuals of ever increasing fitness occur and hence the population will grow superexponentially fast. By contrast, if \(\mu \) is bounded we can only have exponential growth. Indeed, if \(\mu \) is continuous with essential supremum one, then for a closely related continuous time model of immortal individuals, it is shown in [25, Remark 1] that
where \(\lambda ^*\in [1-\beta ,1)\) is the unique solution of the equation
if \(\beta \int \frac{1}{1-x} \, \mu (dx)\ge 1\), and otherwise \(\lambda ^*:=1-\beta \). Further details on the long term growth of the process in [25] depend on the classification of \(\mu \) according to its membership in the max domain of attraction of an extremal distribution. This also applies to other model variants and unbounded MFDs. By the celebrated Fisher-Tippett theorem there are three such universality classes, see for example [27, Proposition 0.3]. These are
- the Weibull class, which roughly occurs if \(\mu \) is bounded with mass decaying slowly near the essential supremum,
- the Gumbel class, which roughly occurs if the mass of \(\mu \) is decaying quickly near the essential supremum, which may be finite or infinite,
- the Fréchet class, which roughly occurs if \(\mu \) is unbounded with mass decaying slowly near infinity.
Extreme value theory plays an important role in the interpretation of experimentally determined effect size distributions of beneficial mutations, and representatives of all three universality classes have been identified empirically [28,29,30,31].
In the present paper, we are mainly interested in the asymptotic behaviour of the population size X(t) in the case that \(\mu \) belongs to the Fréchet class (or, in short, is of Fréchet type). Precisely, this means that the tail function
\[ G(x) := \mu ((x,\infty )) \]
is regularly varying with index \(-\alpha \) for some \(\alpha >0\). In other words, there exists a function \(\ell :(0,\infty ) \rightarrow {\mathbb {R}}\) which is slowly varying at infinity such that \(G(x)=x^{-\alpha } \ell (x)\). MFDs of Fréchet type have been found in several experimental studies [32,33,34,35], and appear to be typical for populations subjected to strong selection pressures, such as bacteria or viruses exposed to drugs.
As in this case \(\mu \) is an unbounded distribution, the process \((X(t) :t\ge 0)\) will grow superexponentially fast on survival and therefore our discussion will focus on the limiting quantity
\[ \lim _{t\rightarrow \infty } \frac{\log \log X(t)}{t}. \]
Our main result is stated in the following theorem.
Theorem 1
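Given \(\alpha >0\), let \(T\in {\mathbb {N}}\) be the unique number such that
\[ \frac{(T-1)^T}{T^{T-1}} < \alpha \le \frac{T^{T+1}}{(T+1)^T} \]
and define (see Lemma 4 for another equivalent definition)
\[ \nu (\alpha ) := \frac{1}{T} \log \frac{T}{\alpha }. \tag{2} \]
Let \((X(t))_{t\ge 0}\) be the size of the population in either the FMM or the MMM. Then, almost surely on survival,
\[ \lim _{t\rightarrow \infty } \frac{\log \log X(t)}{t} = \nu (\alpha ). \]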
We would like to emphasize that although the survival probability depends not only on the initial condition but also on the model (FMM or MMM), the almost sure convergence on survival in Theorem 1 holds irrespective of the actual value of the survival probability as long as it is nonzero. Before presenting the proof of the theorem in Sect. 4, in the next section we motivate the expression (2).
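The constants in Theorem 1 are easy to evaluate numerically. The sketch below assumes the threshold sequence \(T^{T+1}/(T+1)^T\) recalled in the proof of Lemma 4 and reads \(\nu (\alpha )\) as \(\frac{1}{T}\log (T/\alpha )\), which is what the identity \(T=\alpha e^{\nu (\alpha )T}\) of Remark 5 gives. The function names are ours, and the comparison with a maximum over m anticipates the characterisation in Lemma 4.

```python
import math

def period_T(alpha):
    """Smallest T with alpha <= T**(T+1) / (T+1)**T, written as T * (T/(T+1))**T."""
    T = 1
    while alpha > T * (T / (T + 1)) ** T:
        T += 1
    return T

def nu(alpha):
    """Double-exponential growth rate (1/T) * log(T / alpha)."""
    T = period_T(alpha)
    return math.log(T / alpha) / T

def nu_as_max(alpha, m_max=1000):
    """Same quantity computed as max over integers m of (1/m) * log(m / alpha)."""
    return max(math.log(m / alpha) / m for m in range(1, m_max + 1))

for alpha in (0.3, 0.5, 1.0, 2.0, 5.0):
    print(alpha, period_T(alpha), nu(alpha), nu_as_max(alpha))
```

For example, \(\alpha =1\) gives \(T=3\) and \(\nu (1)=\frac{1}{3}\log 3\), while any \(\alpha \le 1/2\) gives \(T=1\) and \(\nu (\alpha )=\log (1/\alpha )\).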
3 Motivation of the Main Result
Here we explain the statement of Theorem 1 by a heuristic analysis of the FMM. For convenience we take the MFD \(\mu \) to be of Pareto form, \(G(x) = x^{-\alpha }\) for \(x \ge 1\) and \(G(x) = 1\) for \(x< 1\). Moreover, throughout this section we assume that the initial fitness f is so large that the fluctuations induced by Poisson sampling are negligible at all times, which implies that both the total population size and the sizes of subpopulations of mutants are well approximated by their expectations. Denoting the fitness of the mutant that is added to the population at generation t by \(W_t\), we can then write
where the factors \(1-\beta \) account for the fact that (apart from the added mutant) only the unmutated fraction of the population survives to the next generation. For the same reason the total number \(N_t\) of mutants produced in generation t (including the ones that die immediately) is approximately (see Footnote 2)
Since the probability that the largest fitness \(W_t\) among \(N_t\) independent and identically distributed random variables with common distribution G is smaller than x is \((1-x^{-\alpha })^{N_t}\), the random variable \(W_t\) can be sampled as
where \(Z_t\) is uniformly distributed in the interval (0, 1) and we have approximated \(Z_t^{1/N_t} \approx 1 + (1/N_t)\log Z_t\). Note that \(Y_t\) does not depend on X(t).
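The sampling step can be made concrete by inverse transform. Solving \(Z = (1-W^{-\alpha })^{N}\) for W gives \(W = (1-Z^{1/N})^{-1/\alpha }\), and the stated approximation \(Z^{1/N} \approx 1 + (1/N)\log Z\) turns this into \(W \approx (N/(-\log Z))^{1/\alpha }\). The Python sketch below compares the two expressions; the function names are ours and the split into a factor depending only on Z and a factor \(N^{1/\alpha }\) is our reading of the text.

```python
import math
import random

def fittest_exact(n_mutants, alpha, z):
    """Solve z = (1 - w**(-alpha))**n_mutants for w (inverse transform)."""
    # 1 - z**(1/n) is computed as -expm1(log(z)/n) to avoid cancellation for large n.
    return (-math.expm1(math.log(z) / n_mutants)) ** (-1.0 / alpha)

def fittest_approx(n_mutants, alpha, z):
    """Large-n approximation w ~ (n / (-log z))**(1/alpha)."""
    return (n_mutants / -math.log(z)) ** (1.0 / alpha)

rng = random.Random(7)
z = rng.uniform(0.01, 0.99)  # kept away from 0 and 1 for the illustration
for n in (10, 1_000, 1_000_000):
    print(n, fittest_exact(n, 2.0, z), fittest_approx(n, 2.0, z))
```

The two values agree to high relative accuracy once the number of mutants is large, which is the regime considered in this section.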
To proceed, we define \(\omega _t\) as
which implies that \(X(t) = f^{\omega _t}\) and \(W_t \approx Y_t f^{\omega _t/\alpha }\). Inserting these relations into (3) we obtain
In the limit \(f \rightarrow \infty \) with t fixed, the sum on the right hand side is dominated by the term with the largest exponent of f. Correspondingly, the \(\omega _t\) for large but finite f can be well approximated by the solution \(\chi _t\) of the recursion relation
with \(\chi _1=1\). We now argue that the \(\chi _t\) grow at least exponentially. Since for any \(t_0 \ge 1\) and any positive integer m
we have, for any \(n \ge 1\)
Correspondingly
where we have assumed that the limit is well-defined. Since (6) is valid for any integer \(m \ge 1\), an optimal lower bound can be found by maximizing the right hand side. As shown by Lemma 4 in Sect. 4, the maximum over the positive integers is precisely the function \(\nu (\alpha )\) in Theorem 1. As the population size depends exponentially on \(\omega _t\) or \(\chi _t\), the heuristic argument makes it plausible that \(\nu (\alpha )\) is a lower bound on the double-exponential growth rate of X(t). Remarkably, Theorem 1 states that the bound is tight, and moreover applies also to the MMM. Informally this implies that the population at time t is dominated by the fittest mutant that was generated at time \(t-T\). As a consequence the empirical fitness distribution changes periodically with period T (see Sect. 5 for further discussion).
In Fig. 1, we depict \(\nu (\alpha )\) together with the numerical solution (see Footnote 3) of the recursion relation (5). The fact that \(\nu (\alpha )\) is the exact exponential growth rate of \(\chi _t\) is proven rigorously in Lemma 6 in Sect. 4. In the inset of Fig. 1, we compare (2) to an approximation obtained by treating m in (6) as a continuous variable. This yields
Although (7) is not exact, the relative error is less than 7% in all cases.
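The size of this error can be checked numerically. The sketch below rests on an assumption about the form of (7): maximizing \(\frac{1}{m}\log (m/\alpha )\) over real \(m\ge 1\) gives the maximizer \(m=\alpha e\) and the value \(1/(\alpha e)\) when \(\alpha e\ge 1\), and the boundary value \(\log (1/\alpha )\) at \(m=1\) otherwise. The function names and the grid of \(\alpha \) values are ours.

```python
import math

def nu_exact(alpha, m_max=2000):
    """Maximum over integers m of (1/m) * log(m / alpha)."""
    return max(math.log(m / alpha) / m for m in range(1, m_max + 1))

def nu_continuous(alpha):
    """Assumed continuous-m approximation: maximize over real m >= 1."""
    if alpha * math.e < 1:
        return math.log(1 / alpha)   # maximizer sits at the boundary m = 1
    return 1 / (alpha * math.e)      # interior maximizer m = alpha * e

alphas = [0.05 * k for k in range(1, 401)]  # grid from 0.05 to 20
worst = max(abs(nu_continuous(a) / nu_exact(a) - 1) for a in alphas)
print(worst)
```

On this grid the largest relative deviation occurs at \(\alpha =1/2\), where the integer maximizer switches from \(m=1\) to \(m=2\).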
For \(\alpha e < 1\) the expressions (2) and (7) actually coincide. In this regime of extremely heavy-tailed MFDs (more precisely, in the case of \(\alpha \le 0.5\) with \(T=1\); see Theorem 1) selection becomes irrelevant, in the sense that the double-exponential growth rate \(\nu (\alpha ) = \log (1/\alpha )\) persists in the extreme case \(\beta \rightarrow 1\) of the MMM, where all individuals are replaced by mutants in each generation and the process becomes a classical Galton–Watson process albeit with infinite mean, cf. [36]. In the case of the FMM, the extreme case would stop the population from growing but the fitness \(W_t\) of the single individual present approximately satisfies \(W_{t+1} \approx W_t^{1/\alpha }\), which gives
4 Proof of Theorem 1
4.1 Preparation for the Proof
In this subsection we collect some tools that will be used in the proofs of the lower and upper bounds in the estimate leading to Theorem 1. The lower bound will be verified in Sect. 4.2 and the upper bound in Sect. 4.3.
For \(t\in \mathbb {N}_0\) let \(W_t\) be the fitness of the fittest of the mutants in generation t and \(W_t=0\) if there are no mutants in generation t. Our first observation is that under the weak assumption \(G(x) >0\) for all large x (which always holds if \(\mu \) is of Fréchet type) either the sequence \((W_t)\) is unbounded or the branching process dies out in finite time. Heuristically speaking, on survival the accumulated number of mutants is unbounded almost surely, which naturally entails unbounded largest fitness.
Lemma 2
Almost surely on survival the sequence \((W_t)\) is unbounded.
Proof
We first show that the branching process can be coupled to a sequence \((\xi _1, \ldots , \xi _t)\) of independent Bernoulli variables with success parameter \(\beta \) and an independent sequence \((F_1,\ldots , F_t)\) of independent fitnesses with distribution \(\mu \) such that on survival up to generation t we have, for all \(1\le i \le t\),
- \(\xi _i=1\) if there is at least one mutant in generation i, and
- \(W_i \ge F_i \xi _i\).
Indeed, once the random variables \((F_1,\ldots , F_t)\) and \((\xi _1, \ldots , \xi _t)\) are generated with the given law we generate the branching process as follows: Produce the number of offspring in the nth generation as a Poisson random variable with the parameter determined by the previous generation (possibly zero). If there is at least one offspring, use \(\xi _n\) to decide whether one designated offspring is a mutant and if so give it fitness \(F_n\). Then use other newly sampled Bernoulli variables with parameter \(\beta \) and fitnesses to decide whether the other offspring are mutants and, if they are, to decide their fitness. Then survival implies \(W_n\ge \xi _n F_n\) as required.
Now \(N:=\sum _{i=1}^t \xi _i\) is binomially distributed with parameters \(\beta >0\) and \(t\in \mathbb {N}\). We infer that, for any fixed \(x>1\),
Since \(\mathbb {P}(F \le x) < 1\) and \(\beta >0\), we get
hence \(\mathbb {P}((W_t)\) is unbounded\() = \mathbb {P}( \text {survival})\) as claimed. \(\square \)
We next describe the distribution of \(W_t\) given the process at time \(t-1\).
Lemma 3
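Suppose that at generation \(t-1\) there are n individuals with fitness \(F_1\), \(F_2\), ..., \(F_n\) and set \(\mathcal {X}:= \sum _{i=1}^n F_i\). Then, for all \(x \ge 0\),
\[ \mathbb {P}(W_t \le x) = e^{-\beta \mathcal {X} G(x)}. \]
Proof
First fix a positive integer n and suppose \(W_t^{_{(n)}}\) is the largest of n independently sampled fitnesses and \(W^{_{(0)}}_t=0\). Let \({\bar{G}}(x)=1-G(x)\) and note that
\[ \mathbb {P}\big (W_t^{_{(n)}} \le x\big ) = {\bar{G}}(x)^n. \]
Now let N be the number of mutants in generation t, which is Poisson distributed with mean \(\beta \mathcal {X}\). Hence, for \(x \ge 0\),
\[ \mathbb {P}(W_t \le x) = \sum _{n=0}^{\infty } e^{-\beta \mathcal {X}} \frac{(\beta \mathcal {X})^n}{n!}\, {\bar{G}}(x)^n = e^{-\beta \mathcal {X} (1-{\bar{G}}(x))}. \]
As \(1-{\bar{G}}(x)=G(x)\) the proof is complete. \(\square \)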
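The mechanism of this proof, a Poisson number of mutants with mean \(\beta \mathcal {X}\), each independently below x with probability \(1-G(x)\), yields \(\mathbb {P}(W_t\le x)=\exp (-\beta \mathcal {X}G(x))\) by the Poisson generating function. A small Monte Carlo experiment can serve as a sanity check. The sketch assumes a Pareto tail \(G(x)=x^{-\alpha }\) for \(x\ge 1\); all names and parameter values are illustrative.

```python
import math
import random

def poisson_small(mean, rng):
    """Knuth's method; adequate for the small means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def fittest_mutant(total_fitness, beta, alpha, rng):
    """Largest of a Poisson(beta * total_fitness) number of Pareto fitnesses."""
    n = poisson_small(beta * total_fitness, rng)
    # By convention the fittest mutant has fitness 0 when there are no mutants.
    return max(((1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)), default=0.0)

def empirical_cdf(x, total_fitness, beta, alpha, samples, rng):
    hits = sum(fittest_mutant(total_fitness, beta, alpha, rng) <= x
               for _ in range(samples))
    return hits / samples

rng = random.Random(0)
beta, total_fitness, alpha, x = 0.3, 10.0, 2.0, 2.0
predicted = math.exp(-beta * total_fitness * x ** (-alpha))  # exp(-beta * X * G(x))
estimate = empirical_cdf(x, total_fitness, beta, alpha, 20000, rng)
print(predicted, estimate)
```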
The next two results concern the potential limit \(\nu (\alpha )\). We first characterise \(\nu (\alpha )\) as a maximum and then as the growth rate in a recursion relation. Note that the first result easily implies that \(\nu (\alpha )\) is decreasing, as well as continuous and positive.
Lemma 4
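We have
\[ \nu (\alpha ) = \max _{m \in \mathbb {N}} \frac{1}{m} \log \frac{m}{\alpha }. \]
In particular, for all \(m \in \mathbb {N}\), we have
\[ m \le \alpha e^{\nu (\alpha ) m}. \tag{8} \]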
Proof
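First recall \((T-1)^T/T^{T-1} < \alpha \le T^{T+1}/(T+1)^T\) and observe that \(\alpha > m^{m+1}/(m+1)^m\) for \(m<T\) and \(\alpha \le m^{m+1}/(m+1)^m\) for \(m \ge T\), where \(m \in \mathbb {N}\). Since
\[ \frac{1}{m+1} \log \frac{m+1}{\alpha } \le \frac{1}{m} \log \frac{m}{\alpha } \quad \Longleftrightarrow \quad \alpha \le \frac{m^{m+1}}{(m+1)^m}, \]
we have the desired result. \(\square \)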
Remark 5
Let \(\alpha _T:= T^{T+1}/(T+1)^T\). The equality \(m = \alpha e^{\nu (\alpha )m}\) holds iff (\(m=T\)) or (\(m=T+1\) and \(\alpha = \alpha _T\)).
For the remainder of this subsection, we abbreviate \(\nu :=\nu (\alpha )\).
Lemma 6
For some positive sequence \((a_n)\) we define inductively
Then, if \(\displaystyle \lim _{n\rightarrow \infty } a_n e^{-\nu n}=0\), there are positive constants c and \(c'\) such that
and therefore we have
Proof
Abbreviate \(\hat{\chi }_t:= \chi _t /c'\) with \(c'=e^{-\nu T}\min \{\chi _1,\chi _2,\ldots ,\chi _T\}\). Obviously, \(\hat{\chi }_t \ge e^{\nu t}\) for \(t \le T\). Now assume \( n \ge T\) and \(\hat{\chi }_t \ge e^{\nu t}\) for all \(t \le n\). By the assumption and (8), we have
Induction gives \(\hat{\chi }_t \ge e^{\nu t}\) and hence \(\chi _t \ge c' e^{\nu t}\) for all \(t \ge 1\).
Now, choose a positive integer \(n_0\) such that \(a_n \le e^{\nu n}\) for all \(n \ge n_0\). Let \(\bar{\chi }_t = \chi _t/c\) with \(c = \max \{1,\chi _1,\chi _2,\ldots ,\chi _{n_0}\}\). Obviously, \(\bar{\chi }_t \le 1 \le e^{\nu t}\) for all \(t \le n_0\). Now let \(n \ge n_0\) and assume that \(\bar{\chi }_t \le e^{\nu t}\) for all \(t \le n\). Then,
where we have used (8). By induction, we have \(\chi _t \le c e^{\nu t}\) for all \(t\ge 1\). \(\square \)
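The statement of Lemma 6 can be explored numerically. The sketch below assumes that the recursion has the form \(\chi _t = \max \{a_t, \max _{1\le m\le t-1} \frac{m}{\alpha }\chi _{t-m}\}\) with \(\chi _1=a_1\); this form is our reading, suggested by the terms \(\frac{T}{\alpha }\chi _{t-T}\) appearing in the proof of Lemma 7, and the choice \(a_t=t\) is likewise an assumption. The computation is done with logarithms to avoid overflow.

```python
import math

def log_chi(alpha, t_max, a=lambda t: float(t)):
    """Return [None, log chi_1, ..., log chi_{t_max}] for the assumed recursion."""
    logs = [None, math.log(a(1))]
    for t in range(2, t_max + 1):
        best = math.log(a(t))
        for m in range(1, t):
            # Candidate (m / alpha) * chi_{t-m}, evaluated in log form.
            best = max(best, math.log(m / alpha) + logs[t - m])
        logs.append(best)
    return logs

alpha, t_max = 1.0, 300
logs = log_chi(alpha, t_max)
rate = logs[t_max] / t_max
nu = max(math.log(m / alpha) / m for m in range(1, 200))
print(rate, nu)
```

For \(\alpha =1\) the empirical rate \(\log (\chi _t)/t\) is close to \(\nu (1)=\frac{1}{3}\log 3\), in line with the two-sided exponential bound of the lemma.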
For later reference we define
Lemma 7
Define
Then \(t-I_t\) is bounded.
Proof
By Lemma 6 we have \(c'e^{\nu t} \le \chi _t \le c e^{\nu t}\) for all t. Since there is \(t_0\) such that \(c'e^{\nu t} > a_t\) for all \(t \ge t_0\), we can write, for \(t > t_0\),
Now it is enough to show that \(t - I_t\) is bounded for \(t> t_0\).
Note that, for \(1 \le m \le t-1\),
where \(A(m) = me^{-\nu m} /\alpha \) with \(A(T)=1\). Since \(\lim _{m\rightarrow \infty } A(m)=0\), there is \(m_0\) such that \(c'> c A(m)\) and hence
As the right hand side is a lower bound of \(\frac{T}{\alpha }\chi _{t-T}\) we get that \(t - I_t\) cannot be larger than \(\max \{m_0,t_0\}\), as desired. \(\square \)
Remark 8
If we choose \(T' > \sup \{t-I_t :t\in \mathbb {N}\}\), then, for all t,
In words, \(\chi _t\) is completely determined by \(\tilde{\chi }_i(t)\) for i within the window \(t-T' \le i \le t-1\). This fact will play an important role in the proof of Theorem 1.
We conclude the subsection with two estimates for classical Galton–Watson processes.
Lemma 9
Consider a supercritical Galton–Watson process \((\mathcal {X}_t)_{t\ge 0}\) with Poisson offspring distribution with mean \(\theta >1\), starting in generation 0 with a single individual. Fix \(0<x<1\) and an integer \(n \ge 1\). Then,
Proof
First note that the mean and the variance of \(\mathcal {X}_t\) are (see, e.g., [37])
respectively and that
where we have used the sub-additivity of the probability measure. Using Chebyshev’s inequality, we get
which, along with (13), gives the claimed inequality. \(\square \)
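For Poisson offspring with mean \(\theta \ne 1\), the standard moment formulas for a Galton–Watson process started from one individual are \(\mathbb {E}[\mathcal {X}_t]=\theta ^t\) and \(\mathrm {Var}(\mathcal {X}_t)=\theta ^t(\theta ^t-1)/(\theta -1)\). The following sketch checks these closed forms against the one-step recursion obtained from the law of total variance; the function names are ours and the snippet is only meant as a sanity check of the moments entering the Chebyshev estimate.

```python
def gw_moments(theta, t_max):
    """Mean and variance of generation sizes via the law of total variance."""
    mean, var = 1.0, 0.0
    out = [(mean, var)]
    for _ in range(t_max):
        # Var(X_{t+1}) = Var(offspring) * E[X_t] + mean(offspring)**2 * Var(X_t),
        # and a Poisson(theta) offspring law has variance theta.
        var = theta * mean + theta ** 2 * var
        mean = theta * mean
        out.append((mean, var))
    return out

def closed_form(theta, t):
    """Closed-form mean and variance for Poisson(theta) offspring, theta != 1."""
    return theta ** t, theta ** t * (theta ** t - 1) / (theta - 1)

theta = 1.5
for t, (mean, var) in enumerate(gw_moments(theta, 10)):
    print(t, mean, var, closed_form(theta, t))
```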
Lemma 10
For a Galton–Watson process \((\mathcal {X}_t)\) with \(\mathcal {X}_0 = K_0\) and generation dependent offspring distribution \(N_t\) with \(\mathbb {E}[N_t] \le N\) for all t,
for all \(B>1\) and \(K>0\).
Proof
By Markov’s inequality, we have
Since \(\mathbb {E}[\mathcal {X}_{t+1}\vert \mathcal {X}_t] = \mathcal {X}_t \mathbb {E}[N_t]\), we have \(\mathbb {E}[\mathcal {X}_t] = K_0 \prod _{i=0}^{t-1} \mathbb {E}[N_i] \le K_0 N^t\), which gives
Since
a geometric sum gives the claimed inequality. \(\square \)
4.2 Proof of the Lower Bound
In this subsection we show that, for given \(\alpha >0\) and all \(\alpha '>\alpha \), we have
In both models at each generation s, we can regard a lineage originating from the mutant with fitness \(W_s\) as a version \((\hat{X}_t(f))_{t\ge s}\) of the same model starting in generation s with a single individual of fitness \(f=W_s\). Since \(X(t) \ge \hat{X}_t(f)\) for \(t \ge s\), (14) is proved, if there is at least one s such that
As \((W_t)\) is unbounded almost surely on survival it therefore suffices to show that
For convenience, we use the convention \((\log \log \hat{X}_t(f))/t = -\infty \) if \(\hat{X}_t(f) = 0\). As \((\hat{X}_t(x))\) can be coupled to an FMM \((S_t(x))\) with the same initial condition such that \(\hat{X}_t(x)\ge S_t(x)\) for all \(t\ge 0\), the result follows by combining Lemma 6 with the following statement.
Lemma 11
Fix \(0<\epsilon <1/2\) and let
where \(\chi _t':= \chi _t(\alpha ', (\frac{n}{2}))\) with \(\alpha ':=\alpha /(1-2 \epsilon )\). Then
Proof
We define \(m_0:=f\), \(n_0:=f^{1/2}\), and (\(i \ge 1\))
For later reference, we also define \(U_i:= n_i/m_i\) for all \(i\ge 0\).
Set \(\epsilon _0=\epsilon /(2-2\epsilon )\). By our assumption on \(\mu \), there is \(f_0\) such that
Since we are only interested in the limit as \(f\rightarrow \infty \), we may assume that f is so large that \((1-\beta )m_0 \ge 2\), \((1-\beta ) m_1 \ge 2\), \(U_0 < 1/2\), \(U_1 < 1/2\), and \(m_1 > f_0\). Notice that by assumption,
For \(\chi '_t\), we choose \(T'\) as in Remark 8. By \(N_{i,t}\) we denote the number of individuals with fitness \(W_i\) at generation t. Define events
Let \(D_{-1}\) be the certain event and, for \(i\in \mathbb {N}_0\),
Now observe that
By Lemma 9 we have
where we have used (12).
To proceed, we find the \(\mathcal {X}\) in Lemma 3 on the event \(D_{n-1}\) as
where we have used \(W_i \ge m_i \ge n_i\) and \(\tilde{\chi }_i'(t)\) as in (10) for parameters \(\alpha '\) and \(a_s=s/2\). Using Lemma 3 with \(G(m_i) \ge f^{-(1-\epsilon /2)\chi _i'}\), we have
Now we define
where \(\delta _{ij}\) is the Kronecker delta symbol. Trivially, we have \(\lim _{f\rightarrow \infty } b_i = 0\) for all \(i\ge 0\). Since, for sufficiently large f, \(b_s\) for fixed s is a bounded and decreasing function of f and since Lemma 6 gives
there is \(s_0\) such that \(\vert b_s\vert < 2^{-s}\) for all \(s >s_0\) and for all admissible values of f. Therefore, the series defining \(\phi (f)\) converges uniformly for sufficiently large f and \(\lim _{f \rightarrow \infty } \phi (f)=0\). Hence, for sufficiently large f, we get
where we have used \((1-x)(1-y) \ge 1- x - y\) for \(x,y\ge 0\). As, on the event A,
where we have assumed \(N_{i,t} =0\) for \(i<0\), we see that \(A\subset E(f)\) and the proof is completed. \(\square \)
In fact, Lemma 11 and its proof are applicable to the MMM verbatim, except that \(S_t\) is replaced by \(\hat{X}_t\). If we are interested in the proof only for the MMM, we actually do not need to introduce \(S_t\).
4.3 Proof of the Upper Bound
In this subsection we show that, for given \(\alpha >0\) and all \(0<\alpha '<\alpha \), we have for the MMM denoted by \((M_t)\), or \((M_t(x))\) if in the initial generation there is a single individual with fixed fitness x, that
In case of extinction the upper bound holds by convention. One can construct two processes with initial fitness \(x\le y\) on the same probability space such that \(M_t(x) \le M_t(y)\) for all t. Indeed, this can be done as follows. First construct \((M_t(y))\) and look at its genealogical tree truncated after the first mutant in every line of descent from the root. Removing any individual in that tree together with all its offspring from \((M_t(y))\) independently with probability \(1-x/y\), so that each individual is retained with probability x/y, we obtain \((M_t(x))\).
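The coupling rests on Poisson thinning: if each of a Poisson number of children with mean y is retained independently with probability x/y (equivalently, removed with probability \(1-x/y\)), the retained children are Poisson with mean x and never outnumber the original ones. The sketch below illustrates this for a single reproduction step only; the names and parameter values are ours.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's method; adequate for the small means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def coupled_offspring(x, y, rng):
    """Return (children under fitness x, children under fitness y) for x <= y."""
    big = poisson(y, rng)
    # Retain each child independently with probability x / y.
    small = sum(rng.random() < x / y for _ in range(big))
    return small, big

rng = random.Random(42)
pairs = [coupled_offspring(1.5, 4.0, rng) for _ in range(20000)]
mean_small = sum(s for s, _ in pairs) / len(pairs)
mean_big = sum(b for _, b in pairs) / len(pairs)
print(mean_small, mean_big)
```

Applying the same retention rule along the truncated genealogical tree gives the monotone coupling described above.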
We now construct an MMM with special initial conditions. Fix \(\epsilon >0\). For given \(\alpha \), let \(\delta = (1+2\epsilon )/(1+3\epsilon )\), \(\alpha '=\alpha /(1+3\epsilon )\), \(\nu '=\nu (\alpha ')\), \(T = T(\alpha ')\), and
which is equivalent to \(\nu ' \Delta _0 +\delta = 1\). We choose \(\Delta \) such that \(0 < \Delta \le \Delta _0\) and
is an integer. We define, for a given \(f>0\),
Note that \(\chi _t'\) above is different from that in Lemma 11.
We briefly explain the motivation of introducing \(h_{n,t}\) and other quantities to find an upper bound. Unlike the proof for the lower bound in Lemma 11, where we have only to investigate a single lineage \(\hat{X}_t\), we have to consider all mutants to find an appropriate upper bound of the MMM. Since we only need an inequality for the proof, we divide the fitness space of mutants by \(h_{n,t}\) and regard mutants appearing at generation t with fitness in the region \((h_{n,t},h_{n+1,t}]\) as a mutant class with growth rate bounded by \(h_{n+1,t}\).
We consider the MMM \((\tilde{M}_t(f))_{t\ge T-1}\) starting in generation \(T-1\) with an initial condition such that there are T different mutant classes with fitness \(g_{\hat{n},m}\) for \(0\le m\le T-1\) and the number of individuals with fitness \(g_{\hat{n},m}\) is \(\lfloor (g_{\hat{n},m})^{T-m-1} \rfloor \). We only consider f sufficiently large so that \((1-\beta ) f>2\) and \((1-\beta ) f^{\epsilon /\alpha }>2\).
Now assume that we have proved, for all \(\alpha '< \alpha \),
Given an arbitrary \(f>0\) and \(\varepsilon >0\) pick \(f_\varepsilon \) such that the probability above exceeds \(1-\varepsilon \) and the smallest fitness in the initial condition of \(\tilde{M}_t(f_\varepsilon )\) is larger than f. Then (17) guarantees that
which proves (16). So it is enough to prove (17). Once (17) is proved, we use the natural coupling such that \(S_t\le M_t\) for all t. Then, almost surely on survival,
which completes the proof of Theorem 1.
Lemma 12
In an MMM, let \(Z_t\) be the number of non-mutated descendants at generation t of \(\mathcal {X}\) individuals at generation m whose fitness values are within a bounded interval I with right endpoint b. Assume \(\mathcal {X}\le K\). Then, for all \(B>1\),
Proof
As the mean number of non-mutated offspring of an individual is bounded by \((1-\beta )b\) we get the result by applying Lemma 10. \(\square \)
Lemma 13
Suppose at generation \(t-1\) of an MMM the population consists of n individuals with fitness \(F_1,\ldots , F_n\). Let
and let Z be the number of mutants in generation t with fitness in the interval (a, b]. Then, with \(p:= \mu ((a,b])\), we have
Proof
Observe that Z is Poisson distributed with mean \(Y_tp\). \(\square \)
Remark 14
Using Markov’s inequality, we get
which is useful when \(K \gg Y_tp\). By Chebyshev’s inequality, for \(K > Y_t\),
which is useful when \((K-Y_t)^2 \gg Y_t\). For \(K=0\), we will use
We denote the number of non-mutated descendants at generation \(t\ge T-1\) of initial individuals with fitness \(g_{\hat{n},m}\) by \(N_{m,T-1,t}\) and define
The number of mutants that appear at generation \(t\ge T\) with fitness in the interval \((h_{n-1,t}, h_{n,t}]\) is denoted by \(N_{n,t,t}\) for \(0 \le n \le \hat{n}+1\), where we have assumed \(h_{-1,t}:=0\) and \(h_{\hat{n}+1,t}:=\infty \). Typically, \(N_{\hat{n}+1,t,t}\) will be zero. The number of non-mutated descendants of \(N_{n,m,m}\) at generation \(t>m\) is denoted by \(N_{n,m,t}\). For \(t \ge m \ge T\) define
which gives
Let \((\theta _t)_{t\ge T-1}\) be a sequence satisfying \(\theta _{T-1} = T\) and, for \(t\ge T\),
Since \(\theta _{t+1} - \theta _t = \theta _t + \hat{n}\) for \(t \ge T\), we have
Lemma 15
For \(T\le x\le m<t\) (t, m are integers and x is real), we have
Proof
Using (8), we have
which proves (23). If \(\delta (t-m) - \alpha ' e^{-\nu ' \Delta }\) is negative, then (24) is trivially valid. If \(\delta (t-m) - \alpha ' e^{-\nu ' \Delta }\) is positive, then the left hand side of (24) has maximum at \(x=m\). Therefore, it is enough to prove (24) only for \(x =m\). Plugging \(x=m\), we have
where we have used \(1-e^{-x} \le x\), \(e^{\nu 'm} \le e^{\nu 't}\), and \(t-m \le \alpha ' e^{\nu '(t-m)}\). \(\square \)
Lemma 16
Let \(E(f):=\{\tilde{M}_t(f) \le \theta _{t+1} f^{\chi '_t}\) for all \(t \ge T \}\). Then
which implies (17).
Proof
Set \(\epsilon _0=\epsilon /(2+2\epsilon )\). By our assumption on \(\mu \) there is \(f_0\) such that \(G(f) \le f^{-\alpha (1-\epsilon _0)}\) for all \(f > f_0\). Now we assume \(h_{0,T}=f^{(1+\epsilon )/\alpha } > f_0\), which gives
Let
and define \(A_{T-1}:= \bigcap _{n=0}^{T-1} A_{n,T-1}\). By Lemma 12 with \(K= \lfloor (g_{\hat{n},n})^{T-n-1} \rfloor \), \(b = g_{\hat{n},n}=f^{\chi _n'/\alpha }\), and \(b B = f^{\chi _n'/\alpha '} = f^{(1+3\epsilon )\chi _n'/\alpha }\), we have
Since \(m e^{-\nu ' m} \le \alpha '\) and \(g_{\hat{n},n} \le f^{\chi '_n/\alpha '}\), we have
Therefore,
on the event \(A_{T-1}\). Let
Note that \(A_{n,t,t}\) has information on the empirical distribution of mutants’ fitness that appear at generation t. We define
By Lemma 15, we have on the event \(A_m\), that for all \(t\ge m \ge T\),
and, in turn,
on the event A(f). Therefore, \(A(f) \subset E(f)\) and the proof is complete if we show
Now we investigate \(\mathbb {P}(A_m \vert D_{m-1})\). First note that
and, on the event \(D_{m-1}\),
where \(Y_m\) is defined in (18).
We begin with \(\mathbb {P}(A_{\hat{n}+1,m}\vert D_{m-1})\), which clearly equals \(\mathbb {P}(A_{\hat{n}+1,m,m}\vert D_{m-1})\). Using (21) with \(a = h_{\hat{n},m}\) and \(Y_m G(a) \le \theta _m f^{-\epsilon \chi '_m/2}\), we obtain
Now we consider \(\mathbb {P}(A_{0,m}\vert D_{m-1})\). Using (20) with \(K = \theta _m f^{\chi _m'}\) and (27), we have
Using Lemma 12 with \(B = f^{\kappa _{0,m}/\alpha '}/h_{0,m} = f^{2\epsilon \kappa _{0,m}/\alpha }\), we have
Therefore, defining
we have \(\mathbb {P}(A_{0,m}\vert D_{m-1}) \ge 1-b_{0,m}\).
Finally, we move on to \(\mathbb {P}(A_{n,m}\vert D_{m-1})\) for \(1 \le n \le \hat{n}\). Using (19) with \(K = f^{\chi _m' - \kappa _{n-1,m}}\), \(a = h_{n-1,m}\), and \(Y_m G(a) \le \theta _m f^{\chi _m' - (1+\epsilon /2) \kappa _{n-1,m}}\), we have
Using Lemma 12 with \(B = f^{\delta \kappa _{n,m}/\alpha '}/h_{n,m} = f^{\epsilon \kappa _{n,m}/\alpha }\), we have
Therefore, defining
we have \( \mathbb {P}(A_{n,m}\vert D_{m-1}) \ge 1-b_{n,m}. \) We define
Recall that we have assumed \((1-\beta ) f>2\) and \((1-\beta ) f^{\epsilon /\alpha }>2\). Since \(b_m\) for given m is a bounded function of f which is decreasing to zero and
there is \(m_0\) such that \(\vert b_m\vert < 2^{-m}\) for all \(m >m_0\). Hence the series defining \(\phi (f)\) converges uniformly for sufficiently large f and, accordingly, \(\lim _{f \rightarrow \infty } \phi (f) = 0\). Therefore, for sufficiently large f,
and \(\displaystyle \lim _{f\rightarrow \infty } \mathbb {P}(A(f)) = 1\), which completes the proof. \(\square \)
5 Empirical Fitness Distributions of the FMM
Apart from the fact that the population is dominated by a single mutant class at all times, the proof of the double-exponential growth rate \(\nu \) presented in Sect. 4 does not give any insight into the structure of the population. However, since the solution \(\chi _t\) of the recursion relation (9) correctly describes the asymptotic growth of X(t), it provides a natural starting point for addressing this question at least on a heuristic level. In this section, we analyze the recursion relation in more depth to understand the demographic structure of the FMM in the long time limit, which turns out to display a rather rich behaviour.
5.1 Numerical Solution of the Recursion Relation
To characterize the empirical fitness distribution we introduce the following quantities:
where the second approximate relations in the definitions of \(J_i(t)\) and P(t) become equalities in the formal deterministic limit \(f \rightarrow \infty \) (see Sect. 3). The ratio \(J_i(t) \in [0,1]\) compares the log-fitness of the mutant class born at time i to the log-fitness of the currently fittest mutant. Since \(X(t+1) \approx (1-\beta ) X(t) {\bar{F}}_t\) with \({\bar{F}}_t\) denoting the mean fitness of the population at generation t, P(t) quantifies the mean fitness at generation t on the same scale. The decomposition in Eq. (3) shows that the fraction of the population in mutant class i at time t is proportional to \(W_i^{t-i}\), and therefore \(R_i(t) \in [-1,0]\) serves as a proxy for the (logarithmic) empirical fitness distribution at generation t over mutant classes i.
In Fig. 2a, we plot \(R_i(t)\) against \(J_i(t)\) for nine consecutive generations, obtained by numerically solving the recursion relation (5) for \(\alpha = 1\) with \(a_n=n\). The salient feature is the periodic behaviour of the fitness distribution with period 3; note that \(T=3\) for \(\alpha =1\). To illustrate the accuracy of the periodic behaviour, we present data-collapse plots in Fig. 2b. In most regions of \(J_i\), the collapse looks perfect (since the number of mutant classes increases with the number of generations, the empirical fitness distributions at different times cannot be identical). For another illustration of the periodicity, we depict P(t) vs. t for various values of \(\alpha \) (with \(a_n = n\)) in Fig. 3. After an early transient, P(t) clearly shows periodic behaviour. A rigorous proof of the periodicity will be given in Sect. 5.2.
The periodicity was taken into account in the numerical estimates of \(\nu \) reported in Fig. 1. Rather than monitoring \(\log \chi _t/t\), which converges very slowly, we computed the quantity
which approaches a constant in a relatively short time.
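The period-3 pattern for \(\alpha = 1\) and \(a_n = n\) can be reproduced with a few lines of Python. The sketch below is not the authors' code. It assumes that the recursion has the max-type form \(\chi _t = \max \{a_t, \max _{0 \le i < t} (t-i)\chi _i/\alpha \}\) with \(\chi _0 = 1\), a form inferred from the terms \(s e^{-\nu s} C_{k-s}/\alpha \) that appear in Sect. 5.2; the exact statement of the recursion is given in Sect. 3 and may differ in details. The function name `solve_recursion` is hypothetical.

```python
import math


def solve_recursion(alpha, a, t_max):
    """Iterate the ASSUMED recursion chi_t = max(a(t), max_{i<t} (t - i) * chi_i / alpha)."""
    chi = [1.0]  # chi_0 = 1 is an assumed normalisation
    for t in range(1, t_max + 1):
        # best contribution from a mutant class born at an earlier generation i
        inherited = max((t - i) * chi[i] / alpha for i in range(t))
        chi.append(max(a(t), inherited))
    return chi


alpha, T = 1.0, 3                      # T = 3 for alpha = 1, as stated in the text
chi = solve_recursion(alpha, a=lambda n: float(n), t_max=30)
nu = math.log(T / alpha) / T           # from T * exp(-nu * T) / alpha = 1
c = [x * math.exp(-nu * t) for t, x in enumerate(chi)]

# Compare c_t with c_{t+T}: equal values indicate period T.
for t in range(2, 8):
    print(t, chi[t], round(c[t], 6), round(c[t + T], 6))
```

Under these assumptions the sequence starts 1, 1, 2, 3, 4, 6, 9, 12, 18 and satisfies \(\chi _{t+3} = 3\chi _t\) from \(t=2\) onwards, so that \(\chi _t e^{-\nu t}\) is exactly 3-periodic after a short transient.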
5.2 Periodicity of \(\chi _t e^{-\nu t}\)
By Lemma 6, we know that
\[ c_t := \chi _t e^{-\nu t} \]
is bounded away from zero and infinity. Now we show that \(c_t\) is not only bounded, but eventually becomes periodic.
Proposition 17
For any sequence \((a_n)\) in the recursion relation (9), there is a \(t_1\) such that \(c_t = c_{t+T}\) for all \(t \ge t_1\).
Proof
In this proof, k and \(k'\) are exclusively used as integers in the range \(1 \le k,k' \le T\). Since \(\chi _{t+T} \ge e^{\nu T} \chi _t\) (see Sect. 3), the sequence \((c_{k+nT})_n\) is nondecreasing and bounded. Consequently,
\[ C_k := \lim _{n \rightarrow \infty } c_{k+nT} \]
is well defined. Note that \(\max \{C_k: 1 \le k \le T\}\) becomes the optimal upper bound in Lemma 6. If n satisfies \( n T> T'\) with \(T' > \max \{t-I_t\}\) (see Remark 8), then we have
Taking n to infinity, we get
and, by definition, \(C_{k+mT} = C_k\) for any integer m. Since \(T e^{-\nu T}/\alpha = 1\) and \(C_{k-T} = C_k\), we can rewrite (30) as
Comparing terms with \(s=T-1, T, T+1\) for any k, we have
which gives
for all k. Let \(\varphi _k = C_{k} e^\nu / C_{k-1}\) with \(\varphi _1 =\varphi _{T+1}= C_1 e^\nu / C_T\). Then,
\[ C_k = C_1 e^{-\nu (k-1)} \prod _{j=2}^{k} \varphi _j \]
for \(k > 1\). Setting \(k=T+1\) and considering \(C_{T+1} = C_1\), we have
\[ \prod _{j=1}^{T} \varphi _j = e^{\nu T}. \]
To sum up, \(C_k\) takes the form
where \(C_0\) is a positive constant (note that \(\chi _t(\alpha ,(C_0 a_n)) = C_0 \chi _t(\alpha ,(a_n))\)) and \(\varphi _k\) satisfies
If \(\alpha =\alpha _T\) (see Remark 5), then \(e^\nu = (T+1)/T\) and the only possible value of \(\varphi _k\) is \(\varphi _k = e^\nu \) for all k because of (31).
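The threshold value \(\alpha _T\) can be made explicit from two relations quoted in this section. The following short calculation is a sketch that assumes only \(e^\nu = (T+1)/T\) at \(\alpha = \alpha _T\) together with \(T e^{-\nu T}/\alpha = 1\); the definition in Remark 5 is not reproduced here and may be phrased differently.

```latex
\alpha_T \;=\; T e^{-\nu T} \;=\; T\left(\frac{T}{T+1}\right)^{T} \;=\; \frac{T^{T+1}}{(T+1)^{T}},
\qquad
\alpha_1 = \tfrac{1}{2},\quad \alpha_2 = \tfrac{8}{9},\quad \alpha_3 = \tfrac{81}{64}.
```

In particular \(\alpha _2< 1 < \alpha _3\), which is consistent with the value \(T=3\) quoted for \(\alpha = 1\) in Sect. 5.1.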
To simplify (29) for large n, we use the following observation. For \(p \in \mathbb {N}\) with \(X:= 1/T\) and \(C_{k - (T \pm p)} = C_{k \mp p}\), we have
and
Since \(\sup _{p\ge 2} (1+ pX)/(1+X)^p<1\) for all nonzero X not smaller than \(-1\), relating s and p by \(p=\vert s-T \vert \) we can choose \( \epsilon > 0\) such that \(C_k - \epsilon > s e^{-\nu s} C_{k-s}/\alpha \) for all s with \(\vert s-T\vert > 1\) and for all k. By (28), for this \(\epsilon \), there is an integer \(m_0\) such that \(C_k - \epsilon < c_{k+nT} \le C_k\) for all \(n \ge m_0\) and for all k. If \(n> m_0\), then
for all s with \(\vert s - T \vert > 1\), which reduces (29) to
In the following, n is assumed so large that (32) is valid for all k. Defining \(\delta _{k,n}:=\)\(1-c_{k+nT}/C_k\) with the convention \(\delta _{k + m T, n}:= \delta _{k,n+m}\) for integer m and using the definition of \(\varphi _k\), we can write
As \(c_{k+nT} \rightarrow C_k\), we have \(\delta _{k,n} \rightarrow 0\) as \(n \rightarrow \infty \). If \((T+1)/(T\varphi _k)<1\), then \(1 - (T+1)/(T\varphi _k) (1-\delta _{k-1,n} )> 1 - (T+1)/(T\varphi _k)> 0\) for sufficiently large n and, therefore, the term with \((T+1)/(T\varphi _k)\) cannot be a minimum for large n. The same argument is applicable to the term with \((T-1)\varphi _{k+1}/T\).
If \(\alpha = \alpha _T\), then \(\varphi _k = (T+1)/T\) for all k and, accordingly, we have \(\delta _{k,n+1} = \min \{\delta _{k,n},\delta _{k-1,n}\}\) for all k and for all sufficiently large n. If there are m and \(k'\) such that \(\delta _{k',m}=0\), then \(\delta _{k',n}=0\) for all \(n \ge m\) and \(\delta _{k'+1,m+1} = 0\), which again gives \(\delta _{k'+2,m+2}=0\) and so on. Therefore, we have \(\delta _{k,n} =0\) for all \(n > m+T\) and all k. Hence, to complete the proof for this case, we need to derive a contradiction from the assumption that \(\delta _{k,n}\) is strictly positive for all n and for all k. Since \(\delta _{k,n}\) is nonincreasing in n, we have
for all \(s \in \mathbb {N}\). Since \(\delta _{k-1,n+s-1}\) should approach zero monotonically as \(s \rightarrow \infty \), there should be \(s_0\) such that \(\delta _{k-1,n+s-1} < \delta _{k,n}\) for all k and for all \(s > s_0\). Therefore, we get \(\delta _{k,n + s+T} = \delta _{k-1,n+s+T-1} = \delta _{k-2,n+s+T-2} = \delta _{k,n+s}\) for all \(s>s_0\). Since \(\delta \) cannot increase, we conclude that \(\delta _{k,n}\) is a constant for all sufficiently large n. If \(\delta _{k,n}\) is strictly positive for all n as assumed, \(C_k\) cannot be a limit and we arrive at a contradiction. Therefore, there is \(t_1\) such that \(c_{t+T} = c_t\) for all \(t \ge t_1\) in this case.
If \(\alpha \ne \alpha _T\), then there is at least one \(\varphi _k\) such that \((T+1)/T < \varphi _k\). If \(\epsilon \) also satisfies \(\epsilon /C_k < 1 - (T+1)/ (T \varphi _k)\), then we can write
for all \(n > m_0\). If \(\varphi _{k+1} < T/(T-1)\), then \(\delta _{k,n}\) will eventually be smaller than \(1-(T-1)\varphi _{k+1}/T\) and we have \(\delta _{k,n+1} = \delta _{k,n}=0\) for all large n. On the other hand, if \(\varphi _{k+1} = T/(T-1) > (T+1)/T\), we have
Since it is impossible for all \(\varphi _k\) to be \(T/(T-1)\), there exists \(k'\) such that \(\varphi _{k+i} = T/(T-1)\) for \(1 \le i \le k'\) and \(\varphi _{k+k'+1} < T/(T-1)\). Therefore,
for all sufficiently large n. Once \(\delta _{k+k',m} = 0\), then \(\delta _{k+i,n} = 0\) for all \(0 \le i \le k'\) and for all \(n > m+T\) by (33). If \(\varphi _{k+k'+1} > (T+1)/T\), we can repeat the above procedure. If \(\varphi _{k+k'+1} = (T+1)/T\), we have
for all sufficiently large n. Hence, the proof is complete. \(\square \)
5.3 Non-uniqueness of Periodic Solutions
Proposition 17 and its proof have shown the general periodic solutions of recursion relation (9) for \(t > t_1\) to be of the form
where the \(\varphi _k\) satisfy \(\varphi _{T+k} = \varphi _k\) and (31). Since \((T+1)/T \le e^\nu <T/(T-1)\), setting \(\varphi _i = e^\nu \) for all i satisfies (31), which gives the constant sequence \(c_{t} = c_{t_1}\). We refer to this solution as the homogeneous state. Recall that the homogeneous state is the unique possibility for \(\alpha = \alpha _T\), as shown right after (31). By constructing an appropriate sequence \((a_n)\), we now show that any set \(\{\varphi _k\}\) that satisfies the conditions (31) can give rise to a periodic solution \(c_t\). Therefore, the periodic solution \(c_t\) is not unique and can vary substantially with \((a_n)\) unless \(\alpha = \alpha _T\) or \(T=1\).
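The two relations used above, \(T e^{-\nu T}/\alpha = 1\) and \((T+1)/T \le e^\nu < T/(T-1)\), are enough to compute T and \(\nu \) numerically for a given \(\alpha \). The following Python sketch assumes that these relations single out T uniquely, and that for \(T=1\) the upper bound is vacuous. The function name `period_and_rate` and the search cap `t_max` are illustrative and do not come from the paper.

```python
import math


def period_and_rate(alpha, t_max=100_000):
    """Return (T, nu) with T*exp(-nu*T)/alpha = 1 and (T+1)/T <= exp(nu) < T/(T-1)."""
    for T in range(1, t_max + 1):
        nu = math.log(T / alpha) / T          # solves T * exp(-nu * T) / alpha = 1
        growth = math.exp(nu)
        lower = (T + 1) / T
        # assumption: no upper constraint when T = 1
        upper = math.inf if T == 1 else T / (T - 1)
        if lower <= growth < upper:
            return T, nu
    raise ValueError("no admissible T found below t_max")


for alpha in (0.3, 1.0, 2.0, 5.0):
    T, nu = period_and_rate(alpha)
    print(f"alpha={alpha}: T={T}, nu={nu:.6f}, nu*alpha={nu * alpha:.6f}")
```

For \(\alpha = 1\) the search returns \(T=3\), in agreement with the value quoted in Sect. 5.1. Values of \(\alpha \) exactly at a threshold \(\alpha _T\) are sensitive to floating-point rounding and are best handled separately.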
Proposition 18
Let
where the \(\varphi _j\) are as in (31) with periodicity \(\varphi _{T+j} = \varphi _j\) and we have used the convention \(\prod _{j=1}^0 = 1\). Then
Proof
To find \(a_1\), we observe that for \(0 \le i < T-1\)
where we have used \(\varphi _i \le T/(T-1)\). Therefore, we get
which is (35) for \(t=1\). Note that this \(\chi _1\) is trivially valid for \(T=1\).
Now assume (35) is valid up to \(t = n\). Then,
For \(i \le n\), we have
and for \( n+T-1>i \ge n\) and \(T>1\), we have
Therefore, we have
Induction completes the proof. \(\square \)
Now we illustrate that any allowed set of \(\varphi _j\)’s can appear in the actual branching process by choosing an appropriate initial condition. For a realization of (34) in the branching process, consider an initial condition such that there are T different mutant classes with fitness \(f_i:= f^{\psi _i/\alpha }\) (\(0\le i < T\)) and the number \(N_i\) of individuals with fitness \(f_i\) is
Notice that this initial condition with \(\varphi _j = e^{\nu '}\) together with a shift in time was used in the proof of Lemma 16. In the limit \(f\rightarrow \infty \) as in Sect. 3, X(t) is well approximated by \(f^{\chi _t}\) with \(a_t\) in (34).
In the above discussion, we have illustrated that any permissible set of \(\varphi _j\)’s can be realized by choosing an appropriate initial condition. Now we argue that a surviving outcome with an arbitrary initial condition should approach such a permissible set, but which values of the \(\varphi _j\)’s are realized may depend on the stochasticity in the early time regime. In the original branching process, the sequence \((a_n)\) depends on both the initial condition and the stochastic evolution in the early time regime before the deterministic approximation through the recursion relation (9) becomes valid. To see this, we recall from Sect. 3 how the recursion relation arises from the stochastic process. Since on survival the total population size and the largest fitness both increase indefinitely, for any preassigned K there is a generation \(t_0\) such that \(X(t_0) > K\). Let \(W_0\) be the largest fitness at generation \(t_0\), define \(Y = X(t_0)\) and introduce a shifted time variable \(t' = t - t_0\) with \(\tilde{X}(t'):= X(t'+t_0)\). If K is extremely large, \(\tilde{X}(t')\) can be well approximated as \(\tilde{X}(0) = Y\),
where \(Y^{a_n}\) is the population size of all mutant classes that appeared prior to generation \(t_0\). Since \(Y^{a_n} \le Y W_0^n\), we naturally have \(\lim _{n\rightarrow \infty } a_n e^{-\nu n} = 0\), and \((a_n)\) is a permissible sequence that can be entered into the recursion relation (9).
5.4 Empirical Fitness Distribution for Large \(\alpha \)
Whereas the preceding subsection has shown that the empirical fitness distribution at long times is generally non-universal, we will now argue that it nevertheless has a well-defined limit for \(\alpha \rightarrow \infty \). Let us begin with the homogeneous state. In this case,
Since \(\nu \alpha \rightarrow 1/e\) as \(\alpha \rightarrow \infty \), the homogeneous state for all sufficiently large \(\alpha \) is well described by
and the mean log fitness converges to \(P = \frac{1}{e}\). Moreover, since
and \(T/\alpha \rightarrow e\) as \(\alpha \rightarrow \infty \), in this limit all periodic solutions that satisfy the constraints (31) become close to homogeneous, \(\varphi _i = e^{\nu } +O(\alpha ^{-2})\). Therefore, we conjecture that the empirical fitness distribution on survival has (36) as a limit distribution for \(\alpha \rightarrow \infty \). As an illustration, in Fig. 4 we compare (36) to numerical solutions of the recursion relation for \(\alpha =3\), 4, 5, 6. The numerical data are hardly distinguishable from (36) already for \(\alpha = 5\).
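The limits quoted in this subsection follow from a short calculation. The sketch below assumes, as the relations \(T e^{-\nu T}/\alpha = 1\) and \((T+1)/T \le e^\nu < T/(T-1)\) from Sects. 5.2 and 5.3 suggest, that T is the integer maximiser of \(g(x) = \log (x/\alpha )/x\). The continuous relaxation and the symbol g are introduced here for illustration only.

```latex
\begin{aligned}
g(x) &:= \frac{1}{x}\log\frac{x}{\alpha}, \qquad
g'(x) = \frac{1-\log(x/\alpha)}{x^{2}} = 0 \iff x = e\alpha, \qquad
g(e\alpha) = \frac{1}{e\alpha},\\
\nu &= \max_{T\in\mathbb{N}} g(T)
\;\Longrightarrow\;
\frac{T}{\alpha}\to e, \quad \nu\alpha \to \frac{1}{e} \quad (\alpha\to\infty),\\
\frac{T}{T-1}-\frac{T+1}{T} &= \frac{1}{T(T-1)} = O(\alpha^{-2}).
\end{aligned}
```

If each \(\varphi _i\) is confined to the interval \([(T+1)/T, T/(T-1)]\), as the bounds used in the proofs of Propositions 17 and 18 indicate, the last line shows that \(\varphi _i\) and \(e^\nu \) can differ only by a term of order \(\alpha ^{-2}\).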
6 Summary and Discussion
In this article we have provided a detailed characterization of the superexponential population growth in two closely related stochastic models of evolution. To the best of our knowledge, this is the first rigorous analysis of a branching process with selection and mutations where the random fitness values (rather than the fitness differences [22]) are drawn from an unbounded probability distribution. A remarkable feature of the models considered here is the emergence of an integer-valued time scale T which depends (discontinuously) on the index \(\alpha \) of the underlying Fréchet distribution. As a consequence, the empirical fitness distribution displays oscillations with period T, a phenomenon that has been observed previously in certain models that include sexual reproduction [38, 39]. A partial understanding of the periodic behaviour of the population structure was achieved in a deterministic approximation. Further work on this problem is needed, addressing in particular how the stochastic initial phase of the process determines the non-universal aspects of the asymptotic population distribution.
It is instructive to compare our findings for the branching process to the earlier analysis of a stochastic fixed finite population version of Kingman’s model in [13]. In both cases the long-time behaviour is dominated by, and can be quantitatively understood in terms of, extremal mutation events in the past. However, in the fixed finite population model the likelihood of generating mutants that exceed the current population fitness declines with time, and the dynamics reduces to a modified record process, where the takeover of the population by a fit mutant is instantaneous compared to the waiting time for the next fitter mutant. As a consequence, the population at time t is dominated by a mutant that arose at a time of order t in the past. By contrast, in the branching process with Fréchet-type distributions, the declining probability of exceeding the current fitness is compensated by the rapid growth of the population in such a way that the time lag since the birth of the currently dominant mutant takes on a fixed value T. Moreover, the branching process never enters the regime of rare sequential fixation events associated with the decreasing supply of beneficial mutations in the finite population setting. Instead, the population attains a nontrivial stationary clonal structure which is approximately described in Sect. 5.
It is reasonable to expect that the growth of the population fitness in the branching process is intermediate between that of the fixed finite population model [13] and the deterministic infinite population model [5]. For Fréchet-type fitness distributions the deterministic model is ill-defined, but the analysis of the fixed finite population model predicts a polynomial increase of the fitness with exponent \(1/\alpha \) [13], which is indeed much slower than the superexponential growth in the branching process. For unbounded Gumbel-type distributions the growth law of the fitness is known for infinite as well as for finite populations [5, 13]. The corresponding behaviour of the branching process will be addressed in future work.
Data Availability
All material needed to reproduce the work is fully contained in the paper.
Notes
The generalization to multiple individuals is straightforward.
In the FMM, given the total population size \(X(t-1)\) and the population’s mean fitness \({\bar{F}}_{t-1}\) at generation \(t-1\), we can approximate \(N_t \approx \beta X(t-1) {\bar{F}}_{t-1}\) and \(X(t) \approx (1-\beta ) X(t-1) {\bar{F}}_{t-1}\). Therefore, to associate \(N_t\) with X(t) correctly, we need \(1-\beta \) in the denominator.
References
Park, S.-C., Simon, D., Krug, J.: The speed of evolution in large asexual populations. J. Stat. Phys. 138, 381–410 (2010)
Wiser, M.J., Ribeck, N., Lenski, R.E.: Long-term dynamics of adaptation in asexual populations. Science 342, 1364–1367 (2013)
Good, B.H., McDonald, M.J., Barrick, J.E., Lenski, R.E., Desai, M.M.: The dynamics of molecular evolution over 60,000 generations. Nature 551, 45–50 (2017)
Bürger, R.: The Mathematical Theory of Selection, Recombination, and Mutation. Wiley, Chichester (2000)
Kingman, J.F.C.: A simple model for the balance between selection and mutation. J. Appl. Probab. 15, 1–12 (1978)
Waxman, D., Peck, J.R.: Pleiotropy and the preservation of perfection. Science 279, 1210–1213 (1998)
Dereich, S., Mörters, P.: Emergence of condensation in Kingman’s model of selection and mutation. Acta Appl. Math. 127, 17–26 (2013)
Yuan, L.: A generalization of Kingman’s model of selection and mutation and the Lenski experiment. Math. Biosci. 285, 61–67 (2017)
Yuan, L.: Kingman’s model with random mutation probabilities: convergence and condensation II. J. Stat. Phys. 181, 870–896 (2020)
Yuan, L.: Kingman’s model with random mutation probabilities: convergence and condensation I. Adv. Appl. Probab. 54, 311–335 (2022)
Hwang, S., Schmiegelt, B., Ferretti, L., Krug, J.: Universality classes of interaction structures for NK fitness landscapes. J. Stat. Phys. 172, 226–278 (2018)
Kimura, M., Crow, J.F.: The number of alleles that can be maintained in a finite population. Genetics 49, 725–738 (1964)
Park, S.-C., Krug, J.: Evolution in random fitness landscapes: the infinite sites model. J. Stat. Mech. Exp. 2008(4), 04014 (2008)
Sibani, P., Brandt, M., Alstrøm, P.: Evolution and extinction dynamics in rugged fitness landscapes. Int. J. Mod. Phys. B 12, 361–391 (1998)
Bianconi, G., Fichera, D., Franz, S., Peliti, L.: Modeling microevolution in a changing environment: the evolving quasispecies and the diluted champion process. J. Stat. Mech. 2011, 08022 (2011)
Desai, M.M., Fisher, D.S.: Beneficial mutation-selection balance and the effect of linkage on positive selection. Genetics 176, 1759–1798 (2007)
Park, S.-C., Krug, J.: Clonal interference in large populations. Proc. Natl. Acad. Sci. USA 104, 18135–18140 (2007)
Yu, F., Etheridge, A., Cuthbertson, C.: Asymptotic behavior of the rate of adaptation. Ann. Appl. Probab. 20, 978–1004 (2010)
Fisher, D.S.: Asexual evolution waves: fluctuations and universality. J. Stat. Mech. 2013, 01011 (2013)
Kelly, M.: Upper bound on the rate of adaptation in an asexual population. Ann. Appl. Probab. 23, 1377–1408 (2013)
Schweinsberg, J.: Rigorous results for a population model with selection I: evolution of the fitness distribution. Electron. J. Probab. 22, 1–94 (2017)
Durrett, R., Foo, J., Leder, K., Mayberry, J., Michor, F.: Evolutionary dynamics of tumor progression with random fitness values. Theor. Popul. Biol. 78, 54–66 (2010)
Durrett, R.: Population genetics of neutral mutations in exponentially growing cancer cell populations. Ann. Appl. Probab. 23, 230–250 (2013)
Angaji, A., Velling, C., Berg, J.: Stochastic clonal dynamics and genetic turnover in exponentially growing populations. J. Stat. Mech. 2021, 103502 (2021)
Dereich, S., Mailler, C., Mörters, P.: Nonextensive condensation in reinforced branching processes. Ann. Appl. Probab. 27, 2539–2568 (2017)
Mailler, C., Mörters, P., Senkevich, A.: Competing growth processes with random growth rates and random birth times. Stoch. Proc. Appl. 135, 183–226 (2021)
Resnick, S.: Extreme Values, Regular Variation, and Point Processes. Springer, New York (1987)
Joyce, P., Rokyta, D.R., Beisel, C.J., Orr, H.A.: A general extreme value theory model for the adaptation of DNA sequences under strong selection and weak mutation. Genetics 180, 1627–1643 (2008)
Orr, H.A.: The population genetics of beneficial mutations. Philos. Trans. R. Soc. B 365, 1195–1201 (2010)
Bataillon, T., Bailey, S.F.: Effects of new mutations on fitness: insights from models and data. Ann. N. Y. Acad. Sci. 1320, 76–92 (2014)
Das, S.G., Krug, J.: Unpredictable repeatability in molecular evolution. Proc. Natl. Acad. Sci. USA 119, 2209373119 (2022)
Schenk, M.F., Szendro, I.G., Krug, J., de Visser, J.A.G.M.: Quantifying the adaptive potential of an antibiotic resistance enzyme. PLoS Genet. 8, 1002783 (2012)
Bank, C., Hietpas, R.T., Wong, A., Bolon, D.N., Jensen, J.D.: A Bayesian MCMC approach to assess the complete distribution of fitness effects of new mutations: uncovering the potential for adaptive walks in challenging environments. Genetics 175, 841–852 (2014)
Foll, M., Poh, Y.-P., Renzette, N., Ferrer-Admetlla, A., Bank, C., Shim, H., Malaspinas, A.-S., Ewing, G., Liu, P., Wegmann, D., Caffrey, D.R., Zeldovich, K.B., Bolon, D.N., Wang, J.P., Kowalik, T.F., Schiffer, C.A., Finberg, R.W., Jensen, J.D.: Influenza virus drug resistance: a time-sampled population genetics perspective. PLoS Genet. 10, 1004185 (2014)
Tokutomi, N., Nakai, K., Sugano, S.: Extreme value theory as a framework for understanding mutation frequency distribution in cancer genomes. PLoS ONE 16, 0243595 (2021)
Davies, P.L.: The simple branching process: a note on convergence when the mean is infinite. J. Appl. Probab. 15, 466–480 (1978)
Harris, T.E.: The Theory of Branching Processes. Springer, Berlin (1963)
Park, S.-C., Krug, J.: Rate of adaptation in sexuals and asexuals: a solvable model of the Fisher–Muller effect. Genetics 195, 941–955 (2013)
Pearce, M.T., Fisher, D.S.: Rapid adaptation in large populations with very rare sex: scalings and spontaneous oscillations. Theor. Popul. Biol. 129, 18–40 (2019)
Acknowledgements
We would like to thank Dan Balick and two anonymous referees for their very useful comments, and Jasmine Foo for pointing out relevant references.
Funding
Open Access funding enabled and organized by Projekt DEAL. S-CP acknowledges the support by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (MSIT) (Grant No. 2020R1F1A1077065) and by The Catholic University of Korea, research fund 2020. JK and PM were supported by the German Excellence Initiative through the UoC Forum “Classical and Quantum Dynamics of Interacting Particle Systems”.
Additional information
Communicated by Hal Tasaki.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Park, SC., Krug, J., Touzo, L. et al. Branching with Selection and Mutation I: Mutant Fitness of Fréchet Type. J Stat Phys 190, 115 (2023). https://doi.org/10.1007/s10955-023-03125-3
DOI: https://doi.org/10.1007/s10955-023-03125-3