Branching with Selection and Mutation I: Mutant Fitness of Fréchet Type

We investigate two stochastic models of a growing population with discrete and non-overlapping generations, subject to selection and mutation. In our models each individual carries a fitness which determines its mean offspring number. Many of these offspring inherit their parent’s fitness, but some are mutants and obtain a fitness randomly sampled, as in Kingman’s house-of-cards model, from a distribution in the domain of attraction of the Fréchet distribution. We give a rigorous proof for the precise rate of superexponential growth of these stochastic processes and support the argument by a heuristic and numerical study of the mechanism underlying this growth. This study yields in particular that the empirical fitness distribution of one model in the long time limit displays periodic behaviour.


Introduction
While the theory of branching processes is undoubtedly one of the best developed areas of probability theory, stochastic branching models that incorporate effects of selection and mutation have only recently caught the attention of mathematicians and physicists. This is despite the unquestionable relevance of these effects to the evolution of populations in nature and in the laboratory [1,2].
By contrast, deterministic high density models of a population undergoing selection and mutation have been studied for quite some time. The model most closely associated with our stochastic process is Kingman's model [3]. This is a dynamical system on the space of probability measures describing the fitness distribution of a population. After t generations the fitness distribution $p_t$ of the population is replaced by
$$p_{t+1}(dx) = (1-\beta)\,\frac{x\,p_t(dx)}{\int x\,p_t(dx)} + \beta\,\mu(dx).$$
Here a proportion 1 − β of the new generation has been selected from the current generation proportionally to their fitness and a proportion β are mutants that get a new fitness, sampled independently from their past using the mutant fitness distribution µ. Note that this model is only well-defined if the mean fitness remains finite and therefore requires moment bounds for the mutant fitness distribution. Kingman's model undergoes a condensation phase transition, which is further studied in [4]. Variants of the model have been considered in [5,6,7].
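The update rule can be made concrete on a finite fitness grid. The following minimal sketch assumes the standard form of Kingman's update, $p_{t+1}(dx) = (1-\beta)\,x\,p_t(dx)/\int x\,p_t(dx) + \beta\,\mu(dx)$; the function name `kingman_step`, the grid and the parameter values are illustrative choices of ours and do not come from [3].

```python
def kingman_step(fitness, p, mu, beta):
    """One generation of Kingman's update on a finite fitness grid.

    fitness: grid points x_i (positive numbers)
    p:       current fitness distribution p_t on the grid (sums to 1)
    mu:      mutant fitness distribution on the same grid (sums to 1)
    beta:    mutation probability in (0, 1)
    """
    # Mean fitness of the current generation (the normalising integral).
    mean_fitness = sum(x * q for x, q in zip(fitness, p))
    # Selected part is reweighted by fitness; mutants are drawn afresh from mu.
    return [
        (1 - beta) * x * q / mean_fitness + beta * m
        for x, q, m in zip(fitness, p, mu)
    ]


# Illustrative bounded example: uniform mutant distribution on four fitness values.
grid = [0.25, 0.5, 0.75, 1.0]
mu = [0.25, 0.25, 0.25, 0.25]
p = mu[:]
for _ in range(50):
    p = kingman_step(grid, p, mu, beta=0.1)
```

Because the grid is bounded, the mean fitness stays finite and the iteration is well-defined. This is exactly the property that fails for heavy-tailed mutant fitness distributions.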
The study of individual based, stochastic models is much more recent. Park and Krug [8] studied Kingman's model for an unbounded fitness distribution alongside a random model of fixed, finite population size. Despite its highly simplified nature, the finite population model with an exponential distribution of fitness effects is qualitatively consistent with the fitness increase observed in Lenski's long-term evolution experiment with bacteria [2,9]. A generalization of this model that includes the response of the immune system to a population of pathogens was considered in [10].
The first papers studying branching process models that express the selective advantage of a fit individual in terms of its offspring distribution are [11], which deals with Weibull type fitness distributions and puts the focus on the condensation phenomenon in that model, and [12], which looks at the growth of the fittest family in the case of Gumbel type distributions. Both papers are limited to bounded fitness distributions and implicitly rely on the analogy to Kingman's original model, though of course the methods of study are entirely different in a stochastic setting. The present paper initiates the study of branching processes with selection and mutation for unbounded fitness distributions. We focus on the case of Fréchet type fitness distributions where the mathematical challenge is linked to the fact that the analogous Kingman model is ill-defined [8].
The structure of the paper is as follows. In Sec. 2, we introduce the models and state the main theorem. Sec. 3 explains the heuristics behind the formal results and in Sec. 4 we present a rigorous proof of the theorem. Sec. 5 contains refined results for the empirical frequency distribution for one of our models. These results are not yet accessible by a complete rigorous mathematical analysis, so that we resort to a numerical and heuristic study and a rigorous analysis of an approximating deterministic system. In Sec. 6 we provide a short discussion that places our results into the context of previous work and points to directions for future research.

Models and main result
We study two models of a population evolving in discrete generations. In both models all individuals are assigned a fitness value, which is a positive real number. As model parameters we fix a probability distribution µ on (0, ∞) from which the random fitness values F are sampled, and a mutation probability β ∈ (0, 1).
In both models we start from generation t = 0 with a single individual with fitness f. Each individual in the population in generation t ≥ 0 produces a Poisson random number of offspring with mean given by its fitness. With probability 1 − β an offspring individual inherits its parent's fitness and is added to the population at generation t + 1. Otherwise, with probability β, it is a mutant. The two models differ in the fate of the mutants.
• Fittest mutant model (FMM): Every mutant is assigned a fitness sampled independently from µ. Only the fittest mutant (if there is one) is added to the population at generation t + 1. All other mutants die instantly.
• Multiple mutant model (MMM): Every mutant is assigned a fitness sampled independently from µ and is added to the population at generation t + 1.
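The two update rules can be sketched in a few lines of simulation code. The sketch below is ours and makes several simplifying assumptions: µ is taken to be a Pareto distribution with tail $x^{-\alpha}$ on [1, ∞), the population is stored as a dictionary from fitness value to head count, and the hypothetical helper `poisson` switches to a rounded normal approximation for large means, which is adequate only for a qualitative illustration. It uses the fact that splitting Poisson offspring into non-mutants and mutants yields two independent Poisson numbers with means (1 − β) and β times the parental fitness.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sampler: Knuth's method for small means, rounded normal
    approximation for large ones (a shortcut of this sketch)."""
    if mean <= 0:
        return 0
    if mean > 50:
        return max(0, round(rng.gauss(mean, math.sqrt(mean))))
    threshold, count, product = math.exp(-mean), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def next_generation(population, alpha, beta, model, rng):
    """Advance a population {fitness: head count} by one generation."""
    new_population = {}
    n_mutants = 0
    for fitness, count in population.items():
        total_mean = count * fitness  # sum of the Poisson means in this class
        kept = poisson((1 - beta) * total_mean, rng)   # offspring inheriting the fitness
        n_mutants += poisson(beta * total_mean, rng)   # offspring that mutate
        if kept > 0:
            new_population[fitness] = kept
    # Every mutant draws a fresh Pareto(alpha) fitness, independently of its parent.
    mutant_fitness = [rng.paretovariate(alpha) for _ in range(n_mutants)]
    if model == "FMM" and mutant_fitness:
        mutant_fitness = [max(mutant_fitness)]  # only the fittest mutant survives
    for w in mutant_fitness:
        new_population[w] = new_population.get(w, 0) + 1
    return new_population


rng = random.Random(2024)
population = {2.0: 3}  # three founders of fitness 2 (illustrative values)
for generation in range(3):
    population = next_generation(population, alpha=1.5, beta=0.2, model="FMM", rng=rng)
```

Passing `model="MMM"` keeps all mutants. Since the population grows superexponentially on survival, such a direct simulation is only practical for a handful of generations.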
We write X(t) for the number of individuals in generation t and study the growth of the populations conditioned on the event of survival, i.e. when X(t) ≠ 0 for all times t. It is easy to see that the population size of the MMM dominates the population size of the FMM at all times. Because the growth is determined by the fittest mutants we expect both models to grow at the same rate. To show this, it suffices to find an upper bound for the MMM and a matching lower bound for the FMM.
Naturally, the rate of growth depends on the mutant fitness distribution µ. If µ is an unbounded distribution, in both models individuals of ever increasing fitness occur and hence the population will grow superexponentially fast. By contrast, if µ is bounded we can only have exponential growth. Indeed, if µ is continuous with essential supremum one, then for a closely related continuous time model of immortal individuals, it is shown in [11, Remark 1] is the unique solution of the equation , and otherwise λ* := 1 − β. Further details on the long term growth of the process depend on the classification of µ according to its membership in the max domain of attraction of an extremal distribution. By the celebrated Fisher-Tippett theorem there are three such universality classes, see for example [13, Proposition 0.3]. These are
• the Weibull class, which roughly occurs if µ is bounded with mass decaying slowly near the essential supremum,
• the Gumbel class, which roughly occurs if the mass of µ is decaying quickly near the essential supremum, which may be finite or infinite,
• the Fréchet class, which roughly occurs if µ is unbounded with mass decaying slowly near infinity.
The assignment of mutant fitness distributions to extreme value universality classes plays an important role in the interpretation of evolution experiments [14], and representatives of all three classes have been identified empirically [15].
In the present paper, we are mainly interested in the asymptotic behaviour of the population size X(t) in the so far unexplored case that µ belongs to the Fréchet class (or, in short, is of Fréchet type). Precisely, this means that the tail function G(x) := µ((x, ∞)) = P(F > x) is regularly varying with index −α for some α > 0. In other words, there exists a function ℓ : (0, ∞) → R which is slowly varying at infinity such that $G(x) = x^{-\alpha}\,\ell(x)$. As in this case µ is an unbounded distribution, the process (X(t) : t ≥ 0) will grow superexponentially fast on survival and therefore our discussion will focus on the limiting quantity
$$\lim_{t\to\infty} \frac{\log\log X(t)}{t}.$$
Our main result is stated in the following theorem.
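Theorem 1 Given α > 0, let T ∈ N be the unique number such that
$$\frac{(T-1)^{T}}{T^{T-1}} < \alpha \le \frac{T^{T+1}}{(T+1)^{T}}.$$
Let $(X(t))_{t\ge 0}$ be the size of the population in either the FMM or the MMM. Then, almost surely on survival,
$$\lim_{t\to\infty} \frac{\log\log X(t)}{t} = \nu(\alpha) := \frac{1}{T}\log\frac{T}{\alpha}. \qquad (2)$$
Before presenting the proof of the theorem in Sec. 4, in the next section we motivate the expression (2).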
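For orientation, the time lag T and the rate ν(α) are easy to evaluate numerically. The sketch below assumes that ν(α) is the maximum of $(1/m)\log(m/\alpha)$ over positive integers m and that T is the smallest maximiser, equivalently the smallest T with $\alpha \le T^{T+1}/(T+1)^{T}$. This reading is consistent with the relations $T e^{-\nu T}/\alpha = 1$ and $(T+1)/T \le e^{\nu} < T/(T-1)$ used in Secs. 4 and 5. The function names are ours.

```python
import math


def dominant_lag(alpha):
    """Smallest positive integer T with alpha <= T**(T+1) / (T+1)**T.

    The comparison is done on the log scale to avoid overflow for large T.
    """
    T = 1
    while math.log(alpha) > (T + 1) * math.log(T) - T * math.log(T + 1):
        T += 1
    return T


def nu(alpha):
    """Double-exponential growth rate (1/T) * log(T / alpha)."""
    T = dominant_lag(alpha)
    return math.log(T / alpha) / T


for alpha in (0.3, 0.7, 1.0, 5.0):
    print(alpha, dominant_lag(alpha), nu(alpha))
```

The loop runs for roughly eα iterations, so the cost is negligible for any α of practical interest.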

Motivation of the main result
Here we explain the statement of Theorem 1 by a heuristic analysis of the FMM.For convenience we take the fitness distribution to be of Pareto form, G(x) = x −α for x ≥ 1 and G(x) = 1 for x < 1.Moreover, throughout this section we assume that the initial fitness f is so large that the fluctuations induced by Poisson sampling are negligible at all times, which implies that both the total population size and the sizes of subpopulations of mutants are well approximated by their expectations.Denoting the fitness of the mutant that is added to the population at generation t by W t , we can then write where the factors 1−β account for the fact that (apart from the added mutant) only the unmutated fraction of the population survives to the next generation.
For the same reason the total number $N_t$ of mutants produced in generation t (including the ones that die immediately) is approximately Since the probability that the largest fitness $W_t$ among $N_t$ independent and identically distributed random variables with common distribution G is smaller than x is $(1 - x^{-\alpha})^{N_t}$, the random variable $W_t$ can be sampled as , where $Z_t$ is uniformly distributed in the interval (0, 1) and we have approximated Z Note that $Y_t$ does not depend on X(t).
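The sampling step can be illustrated directly. For the Pareto case the distribution function of the largest of n fitness values is $(1 - x^{-\alpha})^{n}$, so setting it equal to a uniform variable z and solving for x gives an exact sample, and the first-order expansion $z^{1/n} \approx 1 + \log(z)/n$ gives a large-n approximation in which the dependence on n factors out as $n^{1/\alpha}$. The following sketch uses function names of our own choosing and covers only this inversion, not the full definition of $Y_t$.

```python
import math


def fittest_of_n_exact(z, n, alpha):
    """Solve (1 - x**(-alpha))**n = z for x, with z in (0, 1)."""
    # 1 - z**(1/n), computed with expm1 to stay accurate when n is large.
    tail = -math.expm1(math.log(z) / n)
    return tail ** (-1.0 / alpha)


def fittest_of_n_approx(z, n, alpha):
    """Large-n approximation based on z**(1/n) ~ 1 + log(z)/n."""
    return (n / (-math.log(z))) ** (1.0 / alpha)


# Compare both versions for a large number of mutants (illustrative values).
z, n, alpha = 0.5, 10**6, 2.0
print(fittest_of_n_exact(z, n, alpha), fittest_of_n_approx(z, n, alpha))
```

In the approximate version the random factor $(-\log z)^{-1/\alpha}$ is independent of n, which is the structural reason why the random prefactor in the heuristic does not depend on the population size.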
To proceed, we define $\omega_t$ as $\omega_t := \log X(t) / \log f$, which implies that $X(t) = f^{\omega_t}$ and $W_t \approx Y_t f^{\omega_t/\alpha}$. Inserting these relations into (3) we obtain In the limit f → ∞ the sum on the right hand side is dominated by the term with the largest exponent. Correspondingly, the $\omega_t$ can be well approximated by the solution $\chi_t$ of the recursion relation with $\chi_1 = 1$. We now argue that the $\chi_t$ grow at least exponentially. Since for any $t_0 \ge 1$ and any positive integer m we have, for any n ≥ 1 where we have assumed that the limit is well-defined. Since (6) is valid for any integer m ≥ 1, an optimal lower bound can be found by maximizing the right hand side. As shown by Lemma 4 in Sec. 4, the maximum over the positive integers is precisely the function ν(α) in Theorem 1. As the population size depends exponentially on $\omega_t$ or $\chi_t$, the heuristic argument makes it plausible that ν(α) is a lower bound on the double-exponential growth rate of X(t).
Remarkably, Theorem 1 states that the bound is tight, and moreover applies also to the MMM.Informally this implies that the population at time t is dominated by the fittest mutant that was generated at time t − T .As a consequence the mutant frequency distribution changes periodically with period T (see Sec. 5 for further discussion).
Fig. 1 Plots of ν vs α. Solid line depicts (2) and symbols are from numerical solution of the recursion relation (5). Inset: Plot of $(\nu e \alpha)^{-1} - 1$ vs. α with ν in (2). The error of the approximation (7) is small and vanishes when α ≤ 1/e.
In Fig. 1, we depict ν(α) together with the numerical solution² of the recursion relation (5). The fact that ν(α) is the exact exponential growth rate of $\chi_t$ is proven rigorously in Lemma 6 in Sec. 4. In the inset of Fig. 1, we compare (2) to an approximation obtained by treating m in (6) as a continuous variable. This yields Although (7) is not exact, the relative error is less than 7% in all cases.
² The direct numerical estimate of ν by extrapolating $\log \chi_t / t$ is hampered by the fact that $\lim_{t\to\infty} \chi_t e^{-\nu t}$ does not exist in general (see Sec. 5). The method used to obtain the data in Fig. 1 is explained at the end of Sec. 5.1.
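The size of the error made by the continuous approximation can be checked numerically. The sketch below rests on two assumptions of ours: the exact rate is the maximum of $(1/m)\log(m/\alpha)$ over positive integers m, and the continuous approximation is the maximum of the same expression over real m ≥ 1, which equals $1/(e\alpha)$ for αe ≥ 1 and $\log(1/\alpha)$ otherwise. The function names and the grid of α values are illustrative.

```python
import math


def nu_exact(alpha):
    """Maximum of log(m / alpha) / m over positive integers m."""
    # The real-valued maximiser is e * alpha, so integers beyond its ceiling
    # cannot improve the maximum.
    m_max = max(1, math.ceil(math.e * alpha)) + 1
    return max(math.log(m / alpha) / m for m in range(1, m_max + 1))


def nu_continuous(alpha):
    """Maximum of log(m / alpha) / m over real m >= 1."""
    m_star = max(1.0, math.e * alpha)
    return math.log(m_star / alpha) / m_star


# Relative error of the continuous approximation on a grid of alpha values.
alphas = [0.05 * k for k in range(1, 401)]
errors = [nu_continuous(a) / nu_exact(a) - 1 for a in alphas]
worst = max(errors)
print(worst)
```

On this grid the relative error stays below the 7% bound quoted above and vanishes for α ≤ 1/e, where both maxima are attained at m = 1.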
For αe < 1 the expressions (2) and (7) actually coincide. In this regime of extremely heavy-tailed fitness distributions (more precisely, in the case of α ≤ 0.5) selection becomes irrelevant, in the sense that the double-exponential growth rate ν(α) = log(1/α) persists in the extreme case β → 1 of the MMM, where all individuals are replaced by mutants in each generation and the process becomes a classical Galton-Watson process albeit with infinite mean, cf. [16]. In the case of the FMM, the extreme case would stop the population from growing but the fitness $W_t$ of the single individual present approximately satisfies

Proof of Theorem 1

Preparation for the proof
In this subsection we collect some tools that will be used in the proofs of the lower and upper bounds in the estimate leading to Theorem 1.The lower bound will be verified in Section 4.2 and the upper bound in Section 4.3.
For t ∈ N 0 let W t be the fitness of the fittest of the mutants in generation t and W t = 0 if there are no mutants in generation t.Our first observation is that under the weak assumption G(x) > 0 for all large x (which always holds if µ is of Fréchet type) either the sequence (W t ) is unbounded or the branching process dies out in finite time.Heuristically speaking, on survival the accumulated number of mutants is unbounded almost surely, which naturally entails unbounded largest fitness.
Lemma 2 Almost surely on survival the sequence (W t ) is unbounded.
Proof We first show that the branching process can be coupled to a sequence $(\xi_1, \ldots, \xi_t)$ of independent Bernoulli variables with success parameter β and an independent sequence $(F_1, \ldots, F_t)$ of independent fitnesses with distribution µ such that on survival up to generation t we have, for all 1 ≤ i ≤ t,
• $\xi_i = 1$ if there is at least one mutant in generation i, and
• $W_i \ge \xi_i F_i$.
Indeed, once the random variables $(F_1, \ldots, F_t)$ and $(\xi_1, \ldots, \xi_t)$ are generated with the given law we generate the branching process as follows: Produce the number of offspring in the nth generation as a Poisson variable with the parameter given by the previous generation (possibly zero). If there is at least one offspring, use $\xi_n$ to decide whether it is a mutant and if so give it fitness $F_n$. Then use other newly sampled Bernoulli variables with parameter β and fitnesses to decide whether the other offspring are mutants and, if they are, to decide their fitness. Then survival implies $W_n \ge \xi_n F_n$ as required.
Now $N := \sum_{i=1}^{t} \xi_i$ is binomially distributed with parameters β > 0 and t ∈ N. We infer that, for any fixed x > 1, Since P(F ≤ x) < 1 and β > 0, we get
We next describe the distribution of $W_t$ given the process at time t − 1.
Lemma 3 Suppose that at generation t − 1 there are n individuals with fitness $F_1, F_2, \ldots, F_n$ and set $X := \sum_{i=1}^{n} F_i$. Then, for all x ≥ 0,
$$P(W_t \le x) = \exp\big(-\beta X\, G(x)\big).$$
Proof First fix a positive integer n and suppose $W_t^{(n)}$ is the largest of n independently sampled fitnesses and $W_t^{(0)} = 0$. Let $\bar G(x) = 1 - G(x)$ and note that $P(W_t^{(n)} \le x) = \bar G(x)^{n}$. Now let N be the number of mutants in generation t, which is Poisson distributed with mean βX. Hence, for x ≥ 0,
$$P(W_t \le x) = \sum_{n=0}^{\infty} e^{-\beta X}\,\frac{(\beta X)^{n}}{n!}\,\bar G(x)^{n} = \exp\big(-\beta X\, G(x)\big).$$
The next two results concern the potential limit ν(α). We first characterise ν(α) as a maximum and then as the growth rate in a recursion relation. Note that the first result easily implies that ν(α) is decreasing, as well as continuous and positive.
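The Poisson argument in this proof gives the closed form $P(W_t \le x) = \exp(-\beta X\,G(x))$, which can be checked by a small Monte Carlo experiment. The sketch below is ours: it assumes the Pareto tail $G(x) = x^{-\alpha}$ for x ≥ 1, uses Knuth's Poisson sampler (suitable only for small means), and all names and parameter values are illustrative.

```python
import math
import random


def poisson_knuth(mean, rng):
    """Knuth's Poisson sampler; intended for small means only."""
    threshold, count, product = math.exp(-mean), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def fittest_mutant(beta, total_fitness, alpha, rng):
    """Largest fitness among a Poisson(beta * total_fitness) number of mutants,
    with the convention that the value is 0 if there is no mutant."""
    n_mutants = poisson_knuth(beta * total_fitness, rng)
    return max((rng.paretovariate(alpha) for _ in range(n_mutants)), default=0.0)


def cdf_closed_form(x, beta, total_fitness, alpha):
    """exp(-beta * X * G(x)) with the Pareto tail G(x) = x**(-alpha) for x >= 1."""
    tail = x ** (-alpha) if x >= 1 else 1.0
    return math.exp(-beta * total_fitness * tail)


rng = random.Random(12345)
beta, total_fitness, alpha, x = 0.1, 30.0, 1.5, 2.0
samples = [fittest_mutant(beta, total_fitness, alpha, rng) for _ in range(20000)]
empirical = sum(w <= x for w in samples) / len(samples)
exact = cdf_closed_form(x, beta, total_fitness, alpha)
print(empirical, exact)
```

With 20000 samples the two numbers should agree up to Monte Carlo error of order $10^{-2}$.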
For later reference we define Lemma 7 Define Then t − I t is bounded.
Proof By Lemma 6 we have $c' e^{\nu t} \le \chi_t \le c\, e^{\nu t}$ for all t. Since there is $t_0$ such that $c' e^{\nu t} > a_t$ for all $t \ge t_0$, we can write, for $t > t_0$, where $A(m) = m e^{-\nu m}/\alpha$ with A(T) = 1. Since $\lim_{m\to\infty} A(m) = 0$, there is $m_0$ such that $c' > c A(m)$ and hence $\frac{m}{\alpha}\chi_{t-m} < c' e^{\nu t}$, for all $m \ge m_0$. As the right hand side is a lower bound of $\frac{T}{\alpha}\chi_{t-T}$ we get that $t - I_t$ cannot be larger than $\max\{m_0, t_0\}$, as desired.
In words, χ t is completely determined by χi (t) for i within the window t − T ≤ i ≤ t − 1.This fact will play an important role in the proof of Theorem 1.
We conclude the subsection with two estimates for classical Galton-Watson processes.
Lemma 9 Consider a supercritical Galton-Watson process $(X_t)_{t\ge 0}$ with Poisson offspring distribution with mean θ > 1, starting in generation 0 with a single individual. Fix 0 < x < 1 and an integer n ≥ 1. Then, Proof First note that (see, e.g., [17]) and that where we have used the sub-additivity of probability measures. Using Chebyshev's inequality, we get which, along with (13), gives the claimed inequality.
Lemma 10 For a Galton-Watson process (X t ) with X 0 = K 0 and generation dependent offspring distribution N t with E[N t ] ≤ N for all t, , for all B > 1 and K > 0.
Proof By Markov's inequality, we have a geometric sum gives the claimed inequality.

Proof of the lower bound
In this subsection we show that, for given α > 0 and all α′ > α, we have In both models at each generation s the lineage originating from the mutant with fitness $W_s$ dominates a version ( Xt (f )) t≥s of the same model starting in generation s with a single individual of fitness $f = W_s$. If there is at least one s such that lim inf then (14) is proved. As $(W_t)$ is unbounded almost surely on survival it therefore suffices to show that lim As ( Xt (x)) can be coupled to an FMM $(S_t(x))$ with the same initial condition such that Xt (x) ≥ $S_t(x)$ for all t ≥ 0, the result follows by combining Lemma 6 with the following statement.
Since we are only interested in the limit as f → ∞, we may assume that f is so large that For χ t , we choose T as in Remark 8.By N i,t we denote the number of individuals with fitness W i at generation t.Define events Let D −1 be the certain event and, for i ∈ N 0 ,

Now observe that
.
By Lemma 9 we have where we have used (12).
To proceed, we find the X in Lemma 3 on the event D n−1 as where we have used W i ≥ m i ≥ n i and χ i (t) as in (10) for parameters α and as = s/2.Using Lemma 3 with G(m i ) ≥ f −(1−ε/2)χ i , we have

Now we define
where δ ij is the Kronecker delta symbol.Trivially, we have lim f →∞ b i = 0 for all i ≥ 0. Since, for sufficiently large f , bs for fixed s is a bounded and decreasing function of f and since Lemma 6 gives lim s→∞ bs2 s = 0, there is s 0 such that |bs| < 2 −s for all s > s 0 and for all assumed value of f .Therefore, the series defining φ(f ) converges uniformly for sufficiently large f and lim f →∞ φ(f ) = 0. Therefore, for sufficiently large f , we get where we have used (1 − x)(1 − y) ≥ 1 − x − y for x, y ≥ 0. As, on the event A, where we have assumed N i,t = 0 for i < 0, we see that A ⊂ E(f ) and the proof is completed.
In fact, Lemma 11 and its proof are applicable to the MMM verbatim, except that S t is replaced by Xt .If we are interested in the proof only for the MMM, we actually do not need to introduce S t .

Proof of the upper bound
In this subsection we show that, for given α > 0 and all α′ < α, we have for the MMM denoted by $(M_t)$, or $(M_t(x))$ if in the initial generation there is a single individual with fixed fitness x, that In case of extinction the upper bound holds by convention. One can construct two processes with initial fitness x ≤ y on the same probability space such that $M_t(x) \le M_t(y)$ for all t. Indeed, this can be done as follows. First construct $(M_t(y))$ and look at its genealogical tree truncated after the first mutant in every line of descent from the root. Removing any individual in that tree together with all its offspring from $(M_t(y))$ independently with probability 1 − x/y, so that each such individual is retained with probability x/y, we obtain $(M_t(x))$.
We consider the MMM ( Mt (f )) t≥T −1 starting in generation T − 1 with an initial condition such that there are T different mutant classes with fitness g n,m for 0 ≤ m ≤ T − 1 and the number of individuals with fitness g n,m is (g n,m ) T −m−1 .We only consider f sufficiently large so that (1 Now assume that we have proved, for all α < α, Given an arbitrary f > 0 and ε > 0 pick f ε such that the probability above exceeds 1 − ε and the smallest fitness in the initial condition of Mt (f ε ) is larger than f .Then (17) guarantees that which proves (16).So it is enough to prove (17).Once ( 17) is proved, we use the natural coupling such that S t ≤ M t for all t.Then, almost surely on survival, lim sup which completes the proof of Theorem 1.
Lemma 12 Let Z t be the number of non-mutated descendants at generation t of X individuals with fitness in a bounded interval I with right endpoint b at generation m of an MMM.Assume X ≤ K.Then, for all B > 1, Proof As the mean number of non-mutated offspring of an individual is bounded by (1 − β)b we get the result by applying Lemma 10.
Lemma 13 Suppose at generation t − 1 of an MMM the population consists of n individuals with fitness F 1 , . . ., Fn.Let and let Z be the number of mutants in generation t with fitness in the interval (a, b].Then, with p := µ((a, b]), we have Proof Observe that Z is Poisson distributed with mean Y t p.
Remark 14 Using Markov's inequality, we get which is useful when K Y t p.By Chebyshev's inequality, for K > Y t , which is useful when (K − Y t ) 2 Y t .For K = 0, we will use We denote the number of non-mutated descendants of initial individuals with fitness g n,m at generation t ≥ T − 1 by N m,T −1,t and define The number of mutants that appear at generation t ≥ T with fitness in the interval (h n−1,t , h n,t ] is denoted by N n,t,t for 0 ≤ n ≤ n + 1, where we have assumed h −1,t := 0 and h n+1,t := ∞.Typically, N n+1,t,t will be zero.The number of non-mutated descendants of N n,m,m at generation t > m is denoted by Let (θ t ) t≥T −1 be a sequence satisfying θ T −1 = θ T = T and, for t > T , Lemma 15 For T ≤ x ≤ m < t (t, m are integers and x is real), we have Proof Using (8), we have which proves (23).If δ(t − m) − α e −ν ∆ is negative, then (24) is trivially valid.If δ(t−m)−α e −ν ∆ is positive, then the left hand side of (24) has maximum at x = m.Therefore, it is enough to prove (24) only for x = m.Plugging x = m, we have where we have used 1 − e −x ≤ x, e ν m ≤ e ν t , and t − m ≤ α e ν (t−m) .
, and bB = f χ n /α = f (1+3ε)χ n /α , we have Note that A n,t,t has information on the empirical distribution of mutants' fitness that appear at generation t.We define Ãn,m = By Lemma 15, we have on the event Am, that for all t ≥ m, and, in turn, where Ym is defined in (18).

Empirical frequency distributions
Apart from the fact that the population is dominated by a single mutant class at all times, the proof of the double-exponential growth rate ν presented in Sec. 4 does not give any insight into the structure of the population. However, since the solution $\chi_t$ of the recursion relation (9) correctly describes the asymptotic growth of X(t), it provides a natural starting point for addressing this question at least on a heuristic level. In this section, we analyze the recursion relation in more depth to understand the demographic structure in the long time limit, which turns out to display a rather rich behaviour.

Numerical solution of the recursion relation
To characterize the empirical frequency distribution we introduce the following quantities: looks perfect (since the number of mutant classes increases with the number of generations, the frequency distributions at different times cannot be identical).
A rigorous proof of the periodicity will be given in Sec.5.2.
The periodicity was taken into account in the numerical estimates of ν reported in Fig. 1.Rather than monitoring log χ t /t, which converges very slowly, we computed the quantity which approaches a constant in a relatively short time.
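A period-aware estimate of this kind can be sketched as follows. The code assumes that the recursion takes the form $\chi_t = \max\{a_t, \max_{1\le m<t} (m/\alpha)\,\chi_{t-m}\}$ with $a_t = t$, which is our reading based on the terms $(m/\alpha)\chi_{t-m}$ appearing in Sec. 4, and it uses the estimator $(1/T)\log(\chi_{t+T}/\chi_t)$ as one natural quantity that becomes constant once $\chi_t e^{-\nu t}$ is periodic with period T. Both the form of the recursion and the estimator are assumptions of this sketch, and the function names are ours.

```python
import math


def solve_recursion(alpha, t_max, a=lambda t: float(t)):
    """Return [chi_0, chi_1, ..., chi_{t_max}] with chi_0 an unused placeholder."""
    chi = [0.0] * (t_max + 1)
    for t in range(1, t_max + 1):
        best = a(t)  # contribution of the classes present from the start
        for m in range(1, t):
            # contribution of the fittest mutant born m generations ago
            best = max(best, m / alpha * chi[t - m])
        chi[t] = best
    return chi


def growth_rate_estimate(chi, t, period):
    """Average logarithmic growth of chi over one full period."""
    return math.log(chi[t + period] / chi[t]) / period


alpha, T = 1.0, 3  # for alpha = 1 the dominant time lag is T = 3
chi = solve_recursion(alpha, t_max=40)
print(growth_rate_estimate(chi, 30, T), math.log(T / alpha) / T)
```

Averaging over exactly one period removes the oscillating prefactor, which is why this estimate settles much faster than $\log \chi_t / t$.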

Periodicity of χ_t e^{−νt}
By Lemma 6, we know that $c_t := \chi_t e^{-\nu t}$ is bounded away from zero and infinity. Now we show that $c_t$ is not only bounded, but eventually becomes periodic.
Proposition 17 For any sequence (an) in the recursion relation (9), there is a t 1 such that c t = c t+T for all t ≥ t 1 .
Proof In this proof, k and k are exclusively used as integers in the range 1 ≤ k, k ≤ T .Since χ t+T ≥ e νT χ t (see Sec. 3), the sequence (c k+nT )n is nondecreasing and bounded.Consequently, is well defined.Note that max{C k : 1 ≤ k ≤ T } becomes the optimal upper bound in Lemma 6.If n satisfies nT > T with T > max{t − I t } (see Remark 8), then we have Taking n to infinity, we get and, by definition, C k+mT = C k for any integer m.Since T e −νT /α = 1 and C k−T = C k , we can rewrite (30) as Comparing terms with s = T − 1, T, T + 1 for any k, we have To sum up, C k takes the form where C 0 is a positive constant (note that χ t (α, (C 0 an)) = C 0 χ t (α, (an))) and ϕ k satisfies If α = α T (see Remark 5), then e ν = (T + 1)/T and the only possible value of ϕ k is ϕ k = e ν for all k because of (31).
To simplify (29) for large n, we use the following observation.For p ∈ N with X := 1/T and C k−(T ±p) = C k∓p , we have Since sup p≥2 (1 + pX)/(1 + X) p < 1 for all nonzero X not smaller than −1, In the following, n is assumed so large that (32) is valid for all k.Defining δ k,n :=1 − c k+nT /C k with the convention δ k+mT,n := δ k,n+m for integer m and using the definition of ϕ k , we can write As c k+nT → C k , we have δ k,n → 0 as n → ∞.Accordingly, if (T + 1)/(T ϕ k ) < 1, then the term with (T + 1)/(T ϕ k ) cannot be a minimum for large n.The same argument is applicable to the term with (T − 1)ϕ k+1 /T .
If α = α T , then ϕ k = (T + 1)/T for all k and, accordingly, we have δ k,n+1 = min{δ k,n , δ k−1,n } for all k and for all sufficiently large n.If there is m and k such that δ k ,m = 0, then δ k ,n = 0 for all n ≥ m and δ k +1,m+1 = 0, which again gives δ k +2,m+2 = 0 and so on.Therefore, we have δ k,n = 0 for all n > m + T and all k.Hence, to complete the proof for this case, we need to elicit a contradiction if δ k,n is strictly positive for all n and for all k.Since δ k,n is a nonincreasing sequence of n, we have for all s ∈ N. Since δ k−1,n+s−1 should approach zero monotonically as s → ∞, there should be s 0 such that δ k−1,n+s−1 < δ k,n for all k and for all s > s 0 .Therefore, we get δ k,n+s+T = δ k−1,n+s+T −1 = δ k−2,n+s+T −2 = δ k,n+s for all s > s 0 .Since δ cannot increase, we conclude that δ k,n is a constant for all sufficiently large n.If δ k,n is strictly positive for all n as assumed, C k cannot be a limit and we arrive at a contradiction.Therefore, there is t 1 such that c t+T = c t for all t ≥ t 1 in this case.

Non-uniqueness of periodic solutions
Proposition 17 and its proof have shown the general periodic solutions for t > t 1 to be of the form where the ϕ k satisfy ϕ T +k = ϕ k and (31).Since (T + 1)/T ≤ e ν < T /(T − 1), setting ϕ i = e ν for all i satisfies (31), which gives the constant sequence c t = c t1 .We refer to this solution as the homogeneous state.Recall that the homogeneous state is the unique possibility for α = α T , as shown right after (31).By constructing an appropriate sequence (a n ), we now show that any set {ϕ k } that satisfies the conditions (31) can give rise to a periodic solution c t .Therefore, the periodic solution c t is not unique and can vary substantially with (a n ) unless α = α T or T = 1.
Proposition 18 Let with the convention 0 j=1 = 1.Then where the ϕ j are as in (31) with periodicity ϕ T +j = ϕ j .
Proof To find a 1 , we observe that for 0 where we have used ϕ i ≤ T /(T − 1).Therefore, we get which is (35) for t = 1.Note that this χ 1 is trivially valid for T = 1.Now assume (35) is valid up to t = n.Then, For i ≤ n, we have and for n + T − 1 > i ≥ n and T > 1, we have Therefore, we have Induction completes the proof.
For a realization of (34) in the branching process, consider an initial condition such that there are T different mutant classes with fitness f i := f ψi/α (0 ≤ i < T ) and the number N i of individuals with fitness f i is Notice that this initial condition with ϕ j = e ν together with a shift in time was used in the proof of Lemma 16.In the limit f → ∞ as in Sec. 3, X(t) is well approximated by f χt with a t in (34).
In the original branching process, the sequence $(a_n)$ depends both on the initial condition and the stochastic evolution in the early time regime before the deterministic approximation through the recursion relation (9) becomes valid. To see this, we recall from Sec. 3 how the recursion relation arises from the stochastic process. Since on survival the total population size as well as the largest fitness increases indefinitely, there should be a generation $t_0$ such that $X(t_0 + 1) > K$ for any preassigned K. Let $W_0$ be the largest fitness at generation $t_0$, define $Y = X(t_0 + 1)$ and introduce a shifted time variable $t' = t - t_0$. If K is extremely large, $X(t')$ can be well approximated as where $Y^{a_n}$ is the population size of all mutant classes that appeared prior to generation $t_0$. Since $Y^{a_n} \le Y W_0^{n}$, we naturally have $\lim_{n\to\infty} a_n e^{-\nu n} = 0$, and $(a_n)$ is a permissible sequence that can be entered into the recursion relation (9).

Empirical frequency distribution for large α
Whereas the preceding subsection has shown that the empirical frequency distribution at long times is generally non-universal, we will now argue that it nevertheless has a well-defined limit for α → ∞. Let us begin with the homogeneous state. In this case, $J_i(t) = e^{-\nu(t-i)}$, $P(t) \equiv \alpha(e^{\nu} - 1)$, Since να → 1/e as α → ∞, the homogeneous state for all sufficiently large α is well described by and the mean log fitness converges to P = 1/e. Moreover, since and T/α → e as α → ∞, in this limit all periodic solutions that satisfy the constraints (31) become close to homogeneous, $\varphi_i = e^{\nu} + O(\alpha^{-2})$. Therefore, we conjecture that the empirical distribution on survival has (36) as a limit distribution for α → ∞. As an illustration, in Fig. 3 we compare (36) to numerical solutions of the recursion relation for α = 3, 4, 5, 6. The numerical data are hardly distinguishable from (36) already for α = 5.

Summary and Discussion
In this article we have provided a detailed characterization of the superexponential population growth in two closely related stochastic models of evolution.
To the best of our knowledge, this is the first rigorous analysis of a branching process with selection and random mutations drawn from an unbounded fitness distribution.A remarkable feature of the models considered here is the emergence of an integer-valued time scale T which depends (discontinuously) on the index α of the underlying Fréchet distribution.A partial understanding of the resulting periodic behaviour of the population structure was achieved in a deterministic approximation.Further work on this problem is needed, addressing in particular how the stochastic initial phase of the process determines the non-universal aspects of the asymptotic population distribution.
It is instructive to compare our findings for the branching process to the earlier analysis of a stochastic fixed finite population version of Kingman's model in [8]. In both cases the long-time behavior is dominated by, and can be quantitatively understood in terms of, extremal mutation events in the past. However, in the fixed finite population model the likelihood of generating mutants that exceed the current population fitness declines with time, and the dynamics reduces to a modified record process, where the takeover of the population by a fit mutant is instantaneous compared to the waiting time for the next fitter mutant. As a consequence, the population at time t is dominated by a mutant that arose at a time of order t in the past. By contrast, in the branching process with Fréchet-type distributions, the declining probability of exceeding the current fitness is compensated by the rapid growth of the population in such a way that the time lag since the birth of the currently dominant mutant takes on a fixed value T.
It is reasonable to expect that the growth of the population fitness in the branching process is intermediate between that of the fixed finite population model and the deterministic infinite population model.For Fréchet type fitness distributions the deterministic model is ill-defined, but the analysis of the fixed finite population model predicts a polynomial increase of the fitness with exponent 1/α [8], which is indeed much slower than the superexponential growth in the branching process.For unbounded Gumbel type distributions
relating s and p by p = |s − T| we can choose ε > 0 such that $C_k - \varepsilon > s e^{-\nu s} C_{k-s}/\alpha$ for all s with |s − T| > 1 and for all k. By (28), for this ε, there is an integer $m_0$ such that $C_k - \varepsilon < c_{k+nT} \le C_k$ for all $n \ge m_0$ and for all k. If $n > m_0$, then
$$c_{k+nT} > C_k - \varepsilon > s e^{-\nu s} C_{k-s}/\alpha \ge s e^{-\nu s} c_{k-s+nT}/\alpha$$
for all s with |s − T| > 1, which reduces (29) to
$$c_{k+(n+1)T} = \max\Big\{ c_{k+nT},\ \frac{T+1}{T}\, e^{-\nu}\, c_{k-1+nT},\ \frac{T-1}{T}\, e^{\nu}\, c_{k+1+nT} \Big\}.$$
and m 1 > f 0 .Notice that by assumption,