Abstract
In this paper we study the waiting time until a number of coordinated mutations occur in a population that reproduces according to a continuous time Markov process of Moran type. It is assumed that any individual can have one of \(m+1\) different types, numbered as \(0,1,\ldots ,m\), where initially all individuals have the same type 0. The waiting time is the time until all individuals in the population have acquired type m, under different scenarios for the rates at which forward mutations \(i\rightarrow i+1\) and backward mutations \(i\rightarrow i-1\) occur, and the selective fitness of the mutations. Although this waiting time is the time until the Markov process reaches its absorbing state, the state space of this process is huge for all but very small population sizes. The problem can be simplified though if all mutation rates are smaller than the inverse population size. The population then switches abruptly between different fixed states, where one type at a time dominates. Based on this, we show that phase-type distributions can be used to find closed form approximations for the waiting time law. Our results generalize work by Schweinsberg [60] and Durrett et al. [20], and they have numerous applications. This includes onset and growth of cancer for a cell population within a tissue, with type representing the severity of the cancer. Another application is temporal changes of gene expression among the individuals in a species, with type representing different binding sites that appear in regulatory sequences of DNA.
References
Asmussen, S., Nerman, O., Olsson, M.: Fitting phase-type distributions via the EM algorithm. Scand. J. Stat. 23, 419–441 (1996)
Axe, D.D.: The limits of complex adaptation: an analysis based on a simple model of structured bacterial populations. BIO-Complex. 2010(4) (2010)
Barton, N.H.: The probability of fixation of a favoured allele in a subdivided population. Genet. Res. 62, 149–158 (1993)
Beerenwinkel, N., Antal, T., Dingli, D., Traulsen, A., Kinzler, K.W., Velculescu, V.W., Vogelstein, B., Nowak, M.A.: Genetic progression and the waiting time to cancer. PLoS Comput. Biol. 3(11), e225 (2007)
Behe, M., Snoke, D.W.: Simulating evolution by gene duplication of protein features that require multiple amino acid residues. Protein Sci. 13, 2651–2664 (2004)
Behe, M., Snoke, D.W.: A response to Michael Lynch. Protein Sci. 14, 2226–2227 (2005)
Behrens, S., Vingron, M.: Studying evolution of promoter sequences: a waiting time problem. J. Comput. Biol. 17(12), 1591–1606 (2010)
Behrens, S., Nicaud, C., Nicodème, P.: An automaton approach for waiting times in DNA evolution. J. Comput. Biol. 19(5), 550–562 (2012)
Bobbio, A., Horvath, Á., Scarpa, M., Telek, M.: Acyclic discrete phase type distributions: properties and a parameter estimation algorithm. Perform. Eval. 54, 1–32 (2003)
Bodmer, W.F.: The evolutionary significance of recombination in prokaryotes. Symp. Soc. General Microbiol. 20, 279–294 (1970)
Carter, A.J.R., Wagner, G.P.: Evolution of functionally conserved enhancers can be accelerated in large populations: a population-genetic model. Proc. R. Soc. Lond. 269, 953–960 (2002)
Cao, Y., et al.: Efficient step size selection for the tau-leaping simulation method. J. Chem. Phys. 124, 44109–44119 (2006)
Chatterjee, K., Pavlogiannis, A., Adlam, B., Nowak, M.A.: The time scale of evolutionary innovation. PLOS Comput. Biol. 10(9), e1003818 (2014)
Christiansen, F.B., Otto, S.P., Bergman, A., Feldman, M.W.: Waiting time with and without recombination: the time to production of a double mutant. Theor. Popul. Biol. 53, 199–215 (1998)
Crow, J.F., Kimura, M.: An Introduction to Population Genetics Theory. The Blackburn Press, Caldwell (1970)
Desai, M.M., Fisher, D.S.: Beneficial mutation-selection balance and the effect of linkage on positive selection. Genetics 176, 1759–1798 (2007)
Durrett, R.: Probability Models for DNA Sequence Evolution. Springer, New York (2008)
Durrett, R., Schmidt, D.: Waiting for regulatory sequences to appear. Ann. Appl. Probab. 17(1), 1–32 (2007)
Durrett, R., Schmidt, D.: Waiting for two mutations: with applications to regulatory sequence evolution and the limits of Darwinian evolution. Genetics 180, 1501–1509 (2008)
Durrett, R., Schmidt, D., Schweinsberg, J.: A waiting time problem arising from the study of multi-stage carcinogenesis. Ann. Appl. Probab. 19(2), 676–718 (2009)
Ewens, W.J.: Mathematical Population Genetics. I. Theoretical Introduction. Springer, New York (2004)
Fisher, R.A.: On the dominance ratio. Proc. R. Soc. Edinb. 42, 321–341 (1922)
Fisher, R.A.: The Genetical Theory of Natural Selection. Oxford University Press, Oxford (1930)
Gerstung, M., Beerenwinkel, N.: Waiting time models of cancer progression. Math. Popul. Stud. 20(3), 115–135 (2010)
Gillespie, D.T.: Approximate accelerated simulation of chemically reacting systems. J. Chem. Phys. 115, 1716–1733 (2001)
Gillespie, J.H.: Molecular evolution over the mutational landscape. Evolution 38(5), 1116–1129 (1984)
Gillespie, J.H.: The role of population size in molecular evolution. Theor. Popul. Biol. 55, 145–156 (1999)
Greven, A., Pfaffelhuber, P., Pokalyuk, A., Wakolbinger, A.: The fixation time of a strongly beneficial allele in a structured population. Electron. J. Probab. 21(61), 1–42 (2016)
Gut, A.: An Intermediate Course in Probability. Springer, New York (1995)
Haldane, J.B.S.: A mathematical theory of natural and artificial selection. Part V: selection and mutation. Math. Proc. Camb. Philos. Soc. 23, 838–844 (1927)
Hössjer, O., Tyvand, P.A., Miloh, T.: Exact Markov chain and approximate diffusion solution for haploid genetic drift with one-way mutation. Math. Biosci. 272, 100–112 (2016)
Iwasa, Y., Michor, F., Nowak, M.: Stochastic tunnels in evolutionary dynamics. Genetics 166, 1571–1579 (2004)
Iwasa, Y., Michor, F., Komarova, N.L., Nowak, M.: Population genetics of tumor suppressor genes. J. Theor. Biol. 233, 15–23 (2005)
Kimura, M.: Some problems of stochastic processes in genetics. Ann. Math. Stat. 28, 882–901 (1957)
Kimura, M.: On the probability of fixation of mutant genes in a population. Genetics 47, 713–719 (1962)
Kimura, M.: Average time until fixation of a mutant allele in a finite population under continued mutation pressure: studies by analytical, numerical and pseudo-sampling methods. Proc. Natl. Acad. Sci. USA 77, 522–526 (1980)
Kimura, M.: The role of compensatory neutral mutations in molecular evolution. J. Genet. 64(1), 7–19 (1985)
Kimura, M., Ohta, T.: The average number of generations until fixation of a mutant gene in a finite population. Genetics 61, 763–771 (1969)
Knudson, A.G.: Two genetic hits (more or less) to cancer. Nat. Rev. Cancer 1, 157–162 (2001)
Komarova, N.L., Sengupta, A., Nowak, M.: Mutation-selection networks of cancer initiation: tumor suppressor genes and chromosomal instability. J. Theor. Biol. 223, 433–450 (2003)
Lambert, A.: Probability of fixation under weak selection: a branching process unifying approach. Theor. Popul. Biol. 69(4), 419–441 (2006)
Li, T.: Analysis of explicit tau-leaping schemes for simulating chemically reacting systems. Multiscale Model. Simul. 6, 417–436 (2007)
Lynch, M.: Simple evolutionary pathways to complex proteins. Protein Sci. 14, 2217–2225 (2005)
Lynch, M., Abegg, A.: The rate of establishment of complex adaptations. Mol. Biol. Evol. 27(6), 1404–1414 (2010)
MacArthur, S., Brookfield, J.F.Y.: Expected rates and modes of evolution of enhancer sequences. Mol. Biol. Evol. 21(6), 1064–1073 (2004)
Maruyama, T.: On the fixation probability of mutant genes in a subdivided population. Genet. Res. 15, 221–225 (1970)
Maruyama, T., Kimura, M.: Some methods for treating continuous stochastic processes in population genetics. Jpn. J. Genet. 46(6), 407–410 (1971)
Maruyama, T., Kimura, M.: A note on the speed of gene frequency changes in reverse direction in a finite population. Evolution 28, 161–163 (1974)
Moran, P.A.P.: Random processes in genetics. Proc. Camb. Philos. Soc. 54, 60–71 (1958)
Neuts, M.F.: Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach. Johns Hopkins University Press, Baltimore (1981)
Nicodème, P.: Revisiting waiting times in DNA evolution (2012). arXiv:1205.6420v1
Nowak, M.A.: Evolutionary Dynamics: Exploring the Equations of Life. Belknap Press, Cambridge (2006)
Phillips, P.C.: Waiting for a compensatory mutation: phase zero of the shifting balance process. Genet. Res. 67, 271–283 (1996)
Radmacher, M.D., Kelsoe, G., Kepler, T.B.: Predicted and inferred waiting times for key mutations in the germinal centre reaction: evidence for stochasticity in selection. Immunol. Cell Biol. 76, 373–381 (1998)
Rupe, C.L., Sanford, J.C.: Using simulation to better understand fixation rates, and establishment of a new principle: Haldane’s Ratchet. In: Horstmeyer, M. (ed.) Proceedings of the Seventh International Conference of Creationism. Creation Science Fellowship, Pittsburgh, PA (2013)
Sanford, J., Baumgardner, J., Brewer, W., Gibson, P., Remine, W.: Mendel’s accountant: a biologically realistic forward-time population genetics program. Scalable Comput.: Pract. Exp. 8(2), 147–165 (2007)
Sanford, J., Brewer, W., Smith, F., Baumgardner, J.: The waiting time problem in a model hominin population. Theor. Biol. Med. Model. 12, 18 (2015)
Schinazi, R.B.: A stochastic model of cancer risk. Genetics 174, 545–547 (2006)
Schinazi, R.B.: The waiting time for a second mutation: an alternative to the Moran model. Phys. A. Stat. Mech. Appl. 401, 224–227 (2014)
Schweinsberg, J.: The waiting time for \(m\) mutations. Electron. J. Probab. 13(52), 1442–1478 (2008)
Slatkin, M.: Fixation probabilities and fixation times in a subdivided population. Evolution 35, 477–488 (1981)
Stephan, W.: The rate of compensatory evolution. Genetics 144, 419–426 (1996)
Stone, J.R., Wray, G.A.: Rapid evolution of cis-regulatory sequences via local point mutations. Mol. Biol. Evol. 18, 1764–1770 (2001)
Tuğrul, M., Paixão, T., Barton, N.H., Tkačik, G.: Dynamics of transcription factor binding site evolution. PLOS Genet. 11(11), e1005639 (2015)
Whitlock, M.C.: Fixation probability and time in subdivided populations. Genetics 164, 767–779 (2003)
Wodarz, D., Komarova, N.L.: Computational Biology of Cancer. Lecture Notes and Mathematical Modeling. World Scientific, New Jersey (2005)
Wright, S.: Evolution in Mendelian populations. Genetics 16, 97–159 (1931)
Wright, S.: The roles of mutation, inbreeding, crossbreeding and selection in evolution. In: Proceedings of the 6th International Congress on Genetics, vol. 1, pp. 356–366 (1932)
Wright, S.: Statistical genetics and evolution. Bull. Am. Math. Soc. 48, 223–246 (1942)
Yona, A.H., Alm, E.J., Gore, J.: Random sequences rapidly evolve into de novo promoters (2017). bioRxiv.org, https://doi.org/10.1101/111880
Zhu, T., Hu, Y., Ma, Z.-M., Zhang, D.-X., Li, T.: Efficient simulation under a population genetics model of carcinogenesis. Bioinformatics 6(27), 837–843 (2011)
Acknowledgements
The authors wish to thank an anonymous reviewer for several helpful suggestions that improved the clarity and presentation of the paper.
Appendices
Appendix A. A Simulation Algorithm
Recall from Sect. 12.2 that the allele frequency process \({\varvec{Z}}_t\) of the Moran model is a continuous time and piecewise constant Markov process with exponentially distributed holding times at each state \({\varvec{z}}=(z_0,\ldots ,z_m)\in \mathcal{Z}\). For all but very small population sizes, it is infeasible to simulate this process directly, since the distances between subsequent jumps are very small, of size \(O_p(N^{-1})\). The \(\tau \)-leaping algorithm was introduced (Gillespie [25], Li [42]) in order to speed up computations for a certain class of continuous time Markov processes. It is an approximate simulation algorithm with time increments of size \(\tau \). According to the leaping condition of Cao et al. [12], one chooses \(\tau =\tau (\varepsilon )\) in such a way that
for \(i=0,\ldots ,m\) and some fixed, small number \(\varepsilon >0\), typically in a range between 0.01 and 0.1.
Zhu et al. [71] pointed out that it is not appropriate to use \(\tau \)-leaping for the Moran model when very small allele frequencies are updated. For this reason they defined a hybrid algorithm that combines features of exact simulation and \(\tau \)-leaping. Although most time increments are of length \(\tau \), some critical ones are shorter. Then they showed that (12.118) will be satisfied by the hybrid algorithm for a neutral model with small mutation rates, when
We will extend the method of Zhu et al. [71] to our setting, where forward and backward mutations are possible. In order to describe the simulation algorithm, we first need to define the transition rates of the Moran model. From any state \({\varvec{z}}\in \mathcal{Z}\), there are at most \((m+1)m\) jumps \({\varvec{z}}\rightarrow {\varvec{z}}+ {\varvec{\delta }}_{ij}/N\) possible, where \({\varvec{\delta }}_{ij}={\varvec{e}}_j-{\varvec{e}}_i\), \(0\le i,j \le m\) and \(i\ne j\). Each such change corresponds to an event where a type i individual dies and gets replaced by another one of type j. Since the process remains unchanged when \(i=j\), we need not include these events in the simulation algorithm. It follows from Sect. 12.2 that the transition rate from \({\varvec{z}}\) to \({\varvec{z}}+ {\varvec{\delta }}_{ij}/N\) is
with \(u_{m+1}=v_{-1}=z_{-1}=z_{m+1}=0\). Let \(N_c\) be a threshold. For any given state \({\varvec{z}}\), define the non-critical set \(\varOmega \) of events as those pairs (i, j) with \(i\ne j\) such that both of \(z_i\) and \(z_j\) exceed \(N_c/N\). The remaining events (i, j) are referred to as critical, since at least one of \(z_i\) and \(z_j\) is \(N_c/N\) or smaller. The idea of the hybrid simulation method is to simulate updates of critical events exactly, whereas non-critical events are updated approximately. In more detail, the algorithm is defined as follows:
-
1.
Set \(t=0\) and \({\varvec{Z}}_t={\varvec{e}}_0={\varvec{z}}\).
-
2.
Compute the \(m(m+1)\) transition rates \(a_{ij}=a_{ij}({\varvec{z}})\) for \(0\le i,j \le m\) and \(i\ne j\).
-
3.
Compute the set \(\varOmega =\varOmega ({\varvec{z}})\) of non-critical events for the current state \({\varvec{z}}\).
-
4.
Determine the exponentially distributed waiting time \(e\, {\mathop {\in }\limits ^\mathcal{L}}\text{ Exp }(a)\) until the next critical event occurs, where \(a=\sum _{(i,j)\notin \varOmega } a_{ij}\) is the rate of the exponential distribution.
-
5.
If \(e<\tau \), simulate a critical event \((I,J)\notin \varOmega \) from the probability distribution \(\{a_{ij}/a;\, (i,j)\notin \varOmega \}\), and update the allele frequency vector as \({\varvec{z}}\leftarrow {\varvec{z}}+ {\varvec{\delta }}_{IJ}/N\). Otherwise, if \(e\ge \tau \), simulate no critical event and leave \({\varvec{z}}\) intact.
-
6.
Let \(h=\min (e,\tau )\). Then simulate non-critical events over a time interval of length h, and increment the allele frequency vector as
$$ {\varvec{z}}\leftarrow {\varvec{z}}+ \frac{1}{N}\sum _{(i,j)\in \varOmega } n_{ij}{\varvec{\delta }}_{ij}, $$where \(n_{ij}\sim \text{ Po }(a_{ij}h)\) are independent and Poisson distributed random variables.
-
7.
Update the time (\(t\leftarrow t+h\)) and the allele frequency process (\({\varvec{Z}}_t\leftarrow {\varvec{z}}\)).
-
8.
If \({\varvec{z}}={\varvec{e}}_m\), set \(T_m=t\) and stop. Otherwise go back to step 2.
We have implemented the hybrid algorithm, with \(N_c\) and \(\varepsilon \) as input parameters and \(\tau =\varepsilon /2\). When the selection coefficients \(s_i\) are highly variable, however, a smaller value of \(\tau \) is needed in order to guarantee that (12.118) holds.
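Steps 1–8 above can be sketched in code. The following is a minimal illustration, not the implementation used for the paper: the rate function stands in for Eq. (12.120) and assumes a neutral Moran model (all fitnesses equal, each individual dying at rate 1), with hypothetical parameter names `u_rate`, `v_rate`, `Nc` and `eps`.

```python
import numpy as np

def moran_rates(z, N, u, v):
    """Assumed stand-in for the rates a_ij(z) of Eq. (12.120): a neutral
    Moran model where each individual dies at rate 1 and is replaced by the
    offspring of a uniformly chosen parent, with forward mutation
    probabilities u[j] (j-1 -> j) and backward probabilities v[j]
    (j+1 -> j) at birth."""
    m1 = len(z)                         # m + 1 types
    birth = np.zeros(m1)                # probability that a newborn has type j
    for j in range(m1):
        p = z[j] * (1.0 - (u[j + 1] if j + 1 < m1 else 0.0)
                        - (v[j - 1] if j >= 1 else 0.0))
        if j >= 1:
            p += z[j - 1] * u[j]        # parent of type j-1 mutates forward
        if j + 1 < m1:
            p += z[j + 1] * v[j]        # parent of type j+1 mutates backward
        birth[j] = p
    a = N * np.outer(z, birth)          # a[i, j]: type i dies, type j is born
    np.fill_diagonal(a, 0.0)            # i = j events leave the state unchanged
    return a

def hybrid_step(z, N, a, Nc, tau, rng):
    """Steps 2-7: exact simulation of critical events, tau-leaping
    (Poisson) updates of the non-critical ones."""
    m1 = len(z)
    big = z > Nc / N
    noncrit = np.outer(big, big)        # Omega: both frequencies exceed Nc/N
    np.fill_diagonal(noncrit, False)
    crit_rate = a[~noncrit].sum()
    e = rng.exponential(1.0 / crit_rate) if crit_rate > 0 else np.inf
    h = min(e, tau)
    dz = np.zeros(m1)
    if e < tau:                         # step 5: one exact critical event
        probs = np.where(noncrit, 0.0, a).ravel() / crit_rate
        k = rng.choice(m1 * m1, p=probs)
        i, j = divmod(k, m1)
        dz[i] -= 1.0 / N
        dz[j] += 1.0 / N
    n = rng.poisson(a * h) * noncrit    # step 6: Poisson counts n_ij
    dz += (n.sum(axis=0) - n.sum(axis=1)) / N
    return np.clip(z + dz, 0.0, 1.0), h

def simulate_Tm(N, m, u_rate, v_rate, Nc, eps, rng):
    """Simulate the waiting time T_m until type m is fixed (steps 1-8)."""
    tau, t = eps / 2.0, 0.0
    u = np.full(m + 1, u_rate)          # u[j]: forward rate j-1 -> j
    v = np.full(m + 1, v_rate)          # v[j]: backward rate j+1 -> j
    z = np.zeros(m + 1)
    z[0] = 1.0                          # step 1: start in fixed state e_0
    while z[m] < 1.0:
        z = z / z.sum()                 # keep frequencies normalised
        a = moran_rates(z, N, u, v)
        z, h = hybrid_step(z, N, a, Nc, tau, rng)
        t += h
    return t
```

For the (unrealistically large) rates used here only for speed of illustration, e.g. `simulate_Tm(N=50, m=1, u_rate=0.02, v_rate=0.0, Nc=5, eps=0.05, rng=np.random.default_rng(3))`, the run terminates in a finite waiting time; for the regime of the paper one would take mutation rates below \(1/N\).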
Appendix B. The Expected Waiting Time for One Mutation
In this appendix we will motivate formula (12.34). It approximates the expected number of generations \(\alpha (s)\) until a single mutant with fitness s spreads and gets fixed in a population where the remaining \(N-1\) individuals have fitness 1, given that such a fixation will happen and that no further mutations occur. This corresponds to a Moran model of Sect. 12.2 with \(m=1\) mutant, zero mutation rates (\(u_1=v_0=0\)), and initial allele frequency distribution \({\varvec{Z}}_0 = (1-p,p)\), where \(p=1/N\). For simplicity of notation we write \(Z_t=Z_{t1}\) for the frequency of the mutant allele 1.
Kimura and Ohta [38] derived a diffusion approximation of \(\alpha (s)\), for a general class of models. It involves the infinitesimal mean and variance functions M(z) and V(z) of the allele frequency process, defined through
as \(h\rightarrow 0\). In order to apply their formula to a mutation-free Moran model, we first need to find M(z) and V(z). To this end, suppose \(Z_t=z\). Then use formula (12.120) with \(m=1\) to deduce that
whereas
From this it follows that
and
We will also need the function
with \(s^\prime = (s-1)/(s+1)\). The formula of Kimura and Ohta [38] takes the form
where
approximates the fixation probability of a mutant allele that starts at frequency \(Z_0=z\). In particular, \(\hat{\beta }(1/N)\) approximates the exact probability (12.32) that one single copy of an allele with fitness s takes over a population where all other individuals have fitness 1. This diffusion approximation is increasingly accurate in the limit of weak selection (\(s\rightarrow 1\)).
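As a numerical check of this last claim, the diffusion approximation can be compared with the exact Moran fixation probability. The explicit forms below are standard textbook expressions (see e.g. Nowak [52]) and are assumptions in so far as they may differ in detail from (12.32) and (12.126)–(12.128): the exact fixation probability of a single mutant with relative fitness s is taken as \((1-s^{-1})/(1-s^{-N})\), and \(\hat{\beta }(z)\) is taken in the form \((1-e^{-2Ns^\prime z})/(1-e^{-2Ns^\prime })\) with \(s^\prime =(s-1)/(s+1)\).

```python
import math

def beta_exact(s, N):
    """Exact fixation probability of a single mutant with relative fitness s
    in a Moran population of size N (standard result, assumed to match (12.32))."""
    if s == 1.0:
        return 1.0 / N                  # neutral case
    return (1.0 - 1.0 / s) / (1.0 - s ** (-N))

def beta_diffusion(z, s, N):
    """Diffusion approximation beta-hat(z) with s' = (s-1)/(s+1)
    (assumed form of (12.126)-(12.128))."""
    sp = (s - 1.0) / (s + 1.0)
    if sp == 0.0:
        return z                        # weak-selection (neutral) limit
    # expm1(-x) = e^{-x} - 1, so the ratio equals (1-e^{-2Ns'z})/(1-e^{-2Ns'})
    return math.expm1(-2.0 * N * sp * z) / math.expm1(-2.0 * N * sp)
```

For \(N=100\) and s between 0.99 and 1.05, \(\hat{\beta }(1/N)\) agrees with the exact probability to within a few percent, and the agreement improves as \(s\rightarrow 1\), consistent with the statement that the diffusion approximation is increasingly accurate under weak selection.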
The other of the two integrand functions in (12.125) is
In order to verify (12.34) we will approximate (12.125) separately for neutral (\(s=1\)), advantageous (\(s>1\)), and deleterious (\(s<1\)) alleles. In the neutral case \(s=1\) we let \(s^\prime \rightarrow 0\) and find that \(\hat{\beta }(z)=z\) and \(\psi (z)=N/[z(1-z)]\). Inserting these functions into (12.125), we obtain an expression
for the expected fixation time. This is essentially the middle part of (12.34) when \(p=1/N\).
When \(s>1\), we similarly insert (12.126)–(12.127) into (12.125). After some rather lengthy calculations, it can be shown that
as \(N\rightarrow \infty \). The first term of this expression dominates for large N, and it agrees with the lower part of (12.34).
When \(s<1\), a similar calculation yields
as \(N\rightarrow \infty \), with \(s^{\prime \prime }=(1-s)/(s+1)\). The first, leading term of this formula is consistent with the upper part of (12.34). The various approximations of \(\alpha (s)\) are shown in Table 12.5.
Appendix C. Sketch of Proofs of Main Results
Lemma 12.1
Let \(\{\tau _k\}_{k=0}^M\) be the fixation times of the process \({\varvec{Z}}_t\), defined in (12.13), and \(\tau _{k+1}^\prime \) the time points when a successful mutation first occurs between two successive fixation events (\(\tau _k<\tau _{k+1}^\prime < \tau _{k+1}\)). Let also \(\mu _i\) be the rate in (12.15) at which successful mutations appear in a homogeneous type i population. Then
as \(N\rightarrow \infty \) for all \(\zeta >0\) and \(i=0,1,\ldots ,m-1\).
Sketch of proof. Let \(f_i({\varvec{z}})=f_{i,N}({\varvec{z}})\) and \(b_i({\varvec{z}})=b_{i,N}({\varvec{z}})\) be the probabilities that the offspring of a type \(i\in \{0,\ldots ,m-1\}\) individual, when it mutates to \(i+1\) or \(i-1\), constitutes a successful forward or backward mutation, given that the allele frequency configuration is \({\varvec{z}}\) just before the replaced individual dies (when \(i=0\) we put \(b_0({\varvec{z}})=0\)). Notice in particular that \(f_i=f_i({\varvec{e}}_i)\) and \(b_i=b_i({\varvec{e}}_i)\), since these two quantities are defined as the probabilities of a successful forward or backward mutation in an environment where all individuals have type i just before the mutation, that is, when \({\varvec{z}}={\varvec{e}}_i\).
When an individual is born in a population with allele configuration \({\varvec{z}}\), with probability \(1-u_{i+1}f_i({\varvec{z}})-v_{i-1}b_i({\varvec{z}})\) it is not the first successful mutation between two fixation events \(\tau _k\) and \(\tau _{k+1}\), given that no other successful mutation has occurred between these two time points. Let \(0\le t_1< t_2 < \cdots \) be the time points when a type i individual gets an offspring, and if we choose \(\{{\varvec{Z}}_t\}\) to be left-continuous, the probability of no successful mutation \(i\rightarrow i\pm 1\) at time \(t_l\), where \(\tau _k< t_l < \tau _{k+1}\), is \(1-u_{i+1}f_i({\varvec{Z}}_{t_l})-v_{i-1}b_i({\varvec{Z}}_{t_l})\), given that no other successful mutation has occurred so far (\(\tau _{k+1}^\prime \ge t_l\)). Since the left hand side of (12.130) is the probability of no mutation \(i\rightarrow i\pm 1\) being successful among those that arrive at some time point in \({\mathbb {T}}_i(\zeta )=\{t_l;\, \tau _k < t_l \le \tau _k+\zeta /\mu _i\}\), we find that
where expectation is with respect to variations in the allele frequency process \({\varvec{Z}}_t\) for \(t\in {\mathbb {T}}_i(\zeta )\).
Because of (12.4)–(12.5), with a probability tending to 1 as \(N\rightarrow \infty \), \({\varvec{Z}}_{t}\) will stay close to \({\varvec{e}}_i\) most of the time in \((\tau _k,\tau _{k+1}^\prime )\), that is, all alleles \(l\ne i\) will most of the time be kept at low frequencies. In order to motivate this, we notice that by definition, all mutations that arrive in \((\tau _k,\tau _{k+1}^\prime )\) are unsuccessful. It is known that the expected lifetime of an unsuccessful mutation is bounded by \(C\log (N)\) for a fairly large class of Moran models with selection, where C is a constant that depends on the model parameters, but not on N (Crow and Kimura [15], Section 8.9). Since mutations arrive at rate \(N(v_{i-1}+u_{i+1})\), this suggests that all alleles \(l\ne i\) are expected to have low frequency before the first successful mutation arrives, if
as \(N\rightarrow \infty \), i.e. if the convergence rate towards zero in (12.4)–(12.5) is faster than logarithmic. This implies that it is possible to approximate the sums on the right hand sides of (12.131) by
where \(|{\mathbb {T}}_i(\zeta )|\) refers to the number of elements in \({\mathbb {T}}_i(\zeta )\). In the first step of (12.132), we used that \(f_i({\varvec{z}})\rightarrow f_i\) and \(b_i({\varvec{z}})\rightarrow b_i\) as \({\varvec{z}}\rightarrow {\varvec{e}}_i\) respectively, and therefore \(f_i({\varvec{Z}}_{t_l})\approx f_i\) and \(b_i({\varvec{Z}}_{t_l})\approx b_i\) for most of the terms in (12.132). In the second step of (12.132) we used that \(|{\mathbb {T}}_i(\zeta )|\) counts the number of births of type i individuals within a time interval of length \(\zeta /\mu _i\), and that each \(t_{l+1}-t_l\) is approximately exponentially distributed. By the definition of the Moran model in Sect. 12.2, the intensity of this exponential distribution is approximately
for the majority of time points \(t_l\) such that \({\varvec{Z}}_{t_l}\) stays close to \({\varvec{e}}_i\). Consequently, \(|{\mathbb {T}}_i(\zeta )|\) is approximately Poisson distributed with expected value \(N\zeta /\mu _i\). We know from (12.4)–(12.5) and (12.15) that \(\mu _i=o(1)\). Because this implies that \(N\zeta /\mu _i\) is large, and since the coefficient of variation of a Poisson distribution tends to zero when its expected value increases, \(|{\mathbb {T}}_i(\zeta )|/(N\zeta /\mu _i)\) converges to 1 in probability as \(N\rightarrow \infty \), and therefore we approximate \(|{\mathbb {T}}_i(\zeta )|\) by \(N\zeta /\mu _i\). To conclude, (12.130) follows from (12.15), (12.131), and (12.132). \(\square \)
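The limit behind (12.130) can be illustrated numerically: with per-birth success probability \(p=u_{i+1}f_i+v_{i-1}b_i\) (so that \(\mu _i=Np\)), the number of births in \({\mathbb {T}}_i(\zeta )\) is approximately \(N\zeta /\mu _i=\zeta /p\), and the no-success probability \((1-p)^{\zeta /p}\) tends to \(e^{-\zeta }\) as \(p\rightarrow 0\). A small sketch (the chosen values of p and \(\zeta \) are arbitrary illustrations):

```python
import math

def p_no_success(p, zeta):
    """Probability that none of the ~zeta/p births before time zeta/mu_i
    is a successful mutation, for per-birth success probability p."""
    return (1.0 - p) ** (zeta / p)
```

For `p = 1e-6` and `zeta = 1.5`, `p_no_success` agrees with `math.exp(-1.5)` to about six decimal places, while even the coarse value `p = 1e-2` is already within a couple of percent, which is the exponential limit law claimed in Lemma 12.1.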
Proof of Theorem 12.1. Let \({\varvec{X}}_{\zeta } = {\varvec{Z}}_{\zeta /\mu _{\text{ min }}}\) denote the allele frequency process after changing time scale by a factor \(\mu _{\text{ min }}\). Let \(S_k=\mu _{\text{ min }}\tau _k\) refer to time points of fixation when \(\{{\varvec{X}}_\zeta \}\) visits new fixed states in \(\mathcal{Z}_{\text{ hom }}\), defined in (12.6), \(S_{k+1}^\prime = \mu _{\text{ min }}\tau _{k+1}^\prime \) the time point when a successful mutation first appears after \(S_k\), and \(S=\mu _{\text{ min }}T_m = S_M\) the time when allele m gets fixed. We need to show that
To this end, write
where \(S_{\text{ appear }}\) is the total waiting time for new successful mutations to appear, and \(S_{\text{ tunfix }}\) is the total waiting time for tunneling and fixation, after successful mutations have appeared. We will first show that
It follows from (12.14) to (12.17) that \(\{{\varvec{X}}_{S_k}\}\) is a Markov chain that starts at \({\varvec{X}}_{S_0}={\varvec{e}}_0\), with transition probabilities
Because of (12.25) and Lemma 12.1, the waiting times for successful mutations \(i\rightarrow i\pm 1\) have exponential or degenerate limit distributions as \(N\rightarrow \infty \), since
where \(I_{\text{ long }}\) and \(I_{\text{ short }}\) refer to those asymptotic states in (12.22) and (12.23) that are visited for a long and short time, respectively. Since by definition, the non-asymptotic states \(i\in I_{\text{ nas }}\) in (12.20) will have no contribution to the limit distribution of \(S_{\text{ appear }}\) as \(N\rightarrow \infty \), it follows from (12.136) to (12.137) that asymptotically, \(S_{\text{ appear }}\) is the total waiting time for a continuous time Markov chain with intensity matrix \({\varvec{\varSigma }}\), that starts at \({\varvec{e}}_0\), before it reaches its absorbing state \({\varvec{e}}_m\). This proves (12.135).
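A practical consequence of this limit is that \(S_{\text{ appear }}\) is asymptotically phase-type distributed, so its moments follow from standard matrix formulas; for instance, the expected absorption time is \({\varvec{\alpha }}(-{\varvec{\varSigma }}_0)^{-1}{\varvec{1}}\), where \({\varvec{\varSigma }}_0\) is the sub-intensity matrix over the transient states and \({\varvec{\alpha }}\) the initial distribution (Neuts [50]). A minimal numerical sketch, with an assumed sub-intensity matrix not taken from the paper:

```python
import numpy as np

def ph_mean(alpha, Sigma0):
    """Expected absorption time alpha (-Sigma0)^{-1} 1 of a continuous
    phase-type distribution with sub-intensity matrix Sigma0 over the
    transient states and initial distribution alpha."""
    return float(alpha @ np.linalg.solve(-Sigma0, np.ones(Sigma0.shape[0])))

# assumed example: three transient states visited in order, with
# exponential holding rates 0.5, 1.0 and 2.0 before absorption
Sigma0 = np.array([[-0.5, 0.5, 0.0],
                   [0.0, -1.0, 1.0],
                   [0.0, 0.0, -2.0]])
alpha = np.array([1.0, 0.0, 0.0])
# ph_mean(alpha, Sigma0) -> 1/0.5 + 1/1.0 + 1/2.0 = 3.5
```

For this sequential chain the answer is simply the sum of the mean holding times, which makes the matrix formula easy to verify by hand.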
It remains to prove that \(S_{\text{ tunfix }}\) is asymptotically negligible. It follows from (12.26) that
as \(N\rightarrow \infty \) for any \(\varepsilon > 0\). Write \(M=\sum _{i=0}^{m-1} M_i\), where \(M_i\) is the number of visits to \({\varvec{e}}_i\) by the Markov chain \(\{{\varvec{X}}_{S_k};\, k=0,\ldots ,M\}\), before it is stopped at time M. Let K be a large positive integer. We find that
for all sufficiently large N. In the second step of (12.139) we used that
where \({\varvec{P}}_0\) is a square matrix of order m that contains the first m rows and m columns of the transition matrix \({\varvec{P}}\) of the Markov chain \({\varvec{X}}_{S_k}\), so that its elements are the transition probabilities among and from the non-absorbing states. We used in (12.140) that M is the number of jumps until this Markov chain reaches its absorbing state, and therefore it has a discrete phase-type distribution (Bobbio et al. [9]). And because of (12.17)–(12.18), the expected value of M must be finite. In the last step of (12.139) we used (12.138) and the definition of non-asymptotic states, which implies \(P(M_i>0)=o(1)\) for all \(i\in I_{\text{ nas }}\).
Since (12.139) holds for all \(K>0\) and \(\varepsilon >0\), we deduce \(S_{\text{ tunfix }}=o(1)\) by first letting \(K\rightarrow \infty \) and then \(\varepsilon \rightarrow 0\). Together with (12.134)–(12.135) and Slutsky’s Theorem (see for instance Gut [29]), this completes the proof of (12.133). \(\square \)
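The bound (12.140) relies on M having a discrete phase-type distribution with substochastic matrix \({\varvec{P}}_0\); in particular, its expectation is finite and equals the first entry of \(({\varvec{I}}-{\varvec{P}}_0)^{-1}{\varvec{1}}\), counting the final jump into the absorbing state. A small sketch with an assumed two-state matrix (illustrative only, not taken from the paper):

```python
import numpy as np

def expected_jumps(P0, start=0):
    """Expected number of jumps to absorption, ((I - P0)^{-1} 1)[start],
    for a discrete phase-type distribution with substochastic matrix P0."""
    n = P0.shape[0]
    return float(np.linalg.solve(np.eye(n) - P0, np.ones(n))[start])

# assumed example: from state 0 jump to state 1 w.p. 0.9 (absorb w.p. 0.1);
# from state 1 jump back to state 0 w.p. 0.3 (absorb w.p. 0.7)
P0 = np.array([[0.0, 0.9],
               [0.3, 0.0]])
# solving (I - P0) x = 1 gives x[0] = 1.9/0.73
```

Finiteness of the expectation is automatic here because the spectral radius of \({\varvec{P}}_0\) is below 1, which is the property used in the proof.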
In order to motivate Theorem 12.2, we first give four lemmas. It is assumed for all of them that the regularity conditions of Theorem 12.2 hold.
Lemma 12.2
Let \(r_{ilj}\) be the probabilities defined in (12.37)–(12.40). Then
and
as \(N\rightarrow \infty \). The corresponding formulas for \(r_{ij}=r_{iij}\) in (12.36) are obtained by putting \(l=i\) in (12.141)–(12.142).
Proof. In order to prove (12.141), assume \(i\le l \le j-2\). Since \(r_{i,j-1,j}=1\), repeated application of the recursive formula \(r_{i,k-1,j}=R(\rho _{ikj})\sqrt{r_{ikj}u_{k+1}}\) in (12.38), for \(k=j-1,\ldots ,l+1\), leads to
We know from (12.48) that all \(\rho _{ilj}=O(1)\) as \(N\rightarrow \infty \). From this and the definition of the function \(R(\rho )\) in (12.41), it follows that \(R(\rho _{ilj})=\varTheta (1)\) as \(N\rightarrow \infty \), so that
Then both parts of (12.141) follow by inserting the first equation of (12.46) into (12.144). The proof of (12.142) when \(j+2\le l\le i\) is analogous. Since \(r_{i,j+1,j}=1\), we use a recursion for \(k=j+1,\ldots ,l-1\) in order to arrive at the explicit formula
Then use (12.48) and the third equation of (12.46) to verify that \(r_{ilj}\) satisfies (12.142). \(\square \)
Lemma 12.3
Let \(q_{ij}\), \(q_{ilj}\), \(r_{ij}\), and \(r_{ilj}\) be the probabilities defined in connection with (12.35)–(12.40). Consider a fixed \(i\in \{0,1,\ldots ,m-1\}\), and let F(i) and B(i) be the indices defined in (12.44). Then,
as \(N\rightarrow \infty \). In particular,
Sketch of proof. Notice that (12.146) is a direct consequence of (12.145), since \(q_{iij}=q_{ij}\) and \(r_{iij}=r_{ij}\). We will only motivate the upper part of (12.145), since the lower part is treated similarly. Consider a fixed \(i\in \{0,\ldots ,m-1\}\), and for simplicity of notation we write \(j=F(i)\). We will argue that
for \(l=j-1,\ldots ,i\) by means of induction. Formula (12.147) clearly holds when \(l=j-1\), since, by definition, \(q_{i,j-1,j}=r_{i,j-1,j}=1\). As for the induction step, let \(i+1\le l \le j-1\), and suppose (12.147) has been proved for l. Then recall the recursive formula
from (12.38), with R defined in (12.41). If
holds as well, then (12.147) has been shown for \(l-1\), and the induction proof is completed. Without loss of generality we may assume that \(j\ge i+2\), since otherwise the induction proof of (12.147) stops after the first trivial step \(l=j-1\).
In order to motivate (12.149), we will look at what happens when the population is in fixed state i. Suppose \({\varvec{Z}}_{\tau _k}={\varvec{e}}_i\), and recall that \(\tau _{k+1}^\prime \) is the time point when the first successful mutation \(i\rightarrow i+1\) in \((\tau _k,\tau _{k+1})\) arrives. Therefore, if \({\varvec{Z}}_{\tau _{k+1}}={\varvec{e}}_j\), there is a non-empty set \(J=\{i+1,\ldots ,j-1\}\) of types that must be present among some of the descendants of the successful mutation, before a mutation \(j-1\rightarrow j\) arrives at some time point \(\tau _{k+1}^{\prime \prime }\in (\tau _{k+1}^\prime ,\tau _{k+1})\). Put \(Z_{tJ} = \max _{l\in J} Z_{tl}\). The regularity condition
for all \(\varepsilon >0\) as \(N\rightarrow \infty \), assures that with high probability, none of the alleles in J reaches a high frequency after the successful \(i\rightarrow i+1\) mutation occurred, and before allele j first appears. We will need this condition below, for verifying the induction step (12.149).
The rationale for (12.150) is that fixation events \(i\rightarrow j\) will happen much more frequently than other types of fixation events \(i\rightarrow l\) with \(l\in J\), because of (12.44). We will motivate that
for any \(a>0\) and \(\varepsilon >0\) as \(N\rightarrow \infty \), with \(\mu _i\) the rate of leaving fixation state i. In Lemma 12.1 we motivated that \(\tau _{k+1}^\prime -\tau _k = O_p(\mu _i^{-1})\), and in Lemma 12.5 we will argue that \(\tau _{k+1}^{\prime \prime }-\tau _{k+1}^\prime = o_p(\mu _i^{-1})\). Since this implies \(\tau _{k+1}^{\prime \prime }-\tau _k = O_p(\mu _i^{-1})\), formula (12.150) will follow from (12.151).
In order to motivate (12.151), assume for simplicity there are no backward mutations (the proof is analogous but more complicated if we include back mutations as well). If allele \(l\in J\) exceeds frequency \(\varepsilon \), we refer to this as a semi-fixation event. Let \(\lambda _{il}(\varepsilon )\) be the rate at which this happens after time \(\tau _k\), and before the next fixed state is reached. Then, the rate at which semi-fixation events happen among some \(l\in J\), is
In the second step of (12.152) we introduced \(\beta _{N\varepsilon }(s)\), the probability that a single mutant with fitness s reaches frequency \(\varepsilon \), if all other individuals have fitness 1 and there are no mutations. We made use of
This is motivated as in the proof of Lemma 12.4, in particular Eqs. (12.163), (12.164) and a variant of (12.167) for semi-fixation rather than fixation. In the third step of (12.152) we utilized that \(\beta _{N\varepsilon }(s)\) is larger than the corresponding fixation probability \(\beta (s)=\beta _N(s)\) for a population of size N. In order to quantify how much larger the fixation probability of the smaller population of size \(N\varepsilon \) is, we introduced \(C(\varepsilon )\), an upper bound of \(\beta _{N\varepsilon }(s_l/s_i)/\beta (s_l/s_i)\) that holds for all \(l\in J\). An expression for \(C(\varepsilon )\) can be derived from (12.32) if \(s_l/s_i\) is sufficiently close to 1. Indeed, we know from (12.48) that \(s_l/s_i\rightarrow 1\) as \(N\rightarrow \infty \). However, we need to sharpen this condition somewhat, to
for all \(l\in J\) and some fixed \(x<0\). Then it follows from (12.32) that
is a constant not depending on N. Finally, in the last step of (12.152) we assumed
This is motivated in the same way as Eq. (12.153), making use of (12.163)–(12.164) and (12.167).
Assuming that semi-fixation events arrive according to a Poisson process with intensity \(\lambda _{iJ}(\varepsilon )\), formula (12.151) follows from (12.44) and (12.152), since
as \(N\rightarrow \infty \). In the third step of (12.156) we used (12.16) to conclude that \(p_{il}=\lambda _{il}/\mu _i\), and in the fourth step we utilized (12.17). In the fifth step of (12.156) we claimed that \(\pi _{il}=\hat{\pi }_{il}\) for \(l\in J\). Although we have not given a strict proof of this, it seems reasonable in view of the definitions of \(\pi _{il}\) and \(\hat{\pi }_{il}\) in (12.17) and (12.43), together with (12.35), (12.155), and the fact that \(q_{il}\sim r_{il}\) for \(i<l<F(i)\) (which can be proved by induction with respect to l). Finally, in the last step of (12.156) we invoked (12.44), which implies \(\hat{\pi }_{il}=0\) for all \(l\in J = \{i+1,\ldots ,F(i)-1\}\).
Equation (12.150) enables us to approximate the allele frequency \(Z_{tl}\) by a branching process with mutations, in order to motivate (12.149). (A strict proof of this for a neutral model \(s_0=\cdots =s_{m-1}=1\) can be found in Theorem 2 of Durrett et al. [20].) We will look at the fate of the first \(l-1\rightarrow l\) mutation at time \(\tau \in (\tau _{k+1}^\prime ,\tau _{k+1}^{\prime \prime })\), which is a descendant of the first successful \(i\rightarrow i+1\) mutation at time \(\tau _{k+1}^\prime \), and arrives before the first \(j-1\rightarrow j\) mutation at time \(\tau _{k+1}^{\prime \prime }\). Recall that \(q=q_{i,l-1,j}\) is the probability that this l mutation gets an offspring that mutates into type j, and \(q^\prime =q_{ilj}\) is the corresponding probability that one of its descendants, an \(l\rightarrow l+1\) mutation, gets a type j offspring. Let also \(r^\prime =r_{ilj}\) be the approximation of \(q^\prime \), and write \(s=s_l/s_i\) for the ratio between the selection coefficients of alleles l and i. With this simplified notation, according to (12.149), we need to show that
as \(N\rightarrow \infty \), where \(u=u_{l+1}\), and \(\rho = \rho _{ilj}\) is defined in (12.37), i.e.
We make the simplifying assumption that at time \(\tau \), the population has one single type l individual, the one that mutated from type \(l-1\) at this time point, whereas all other \(N-1\) individuals have type i. (Recall that we argued in Lemma 12.1 that such an assumption is asymptotically accurate.) In order to compute the probability q for the event A that this individual gets a descendant of type j, we condition on the next time point when one individual dies and is replaced by the offspring of an individual that reproduces. Let D and R be independent indicator variables for the events that the type l individual dies and reproduces respectively. Using the definition of the Moran process in Sect. 12.2, this gives an approximate recursive relation
for q, where \(v=v_{l-1}\) is the probability of a back mutation \(l\rightarrow l-1\). In the last step of (12.159) we retained the exact transition probabilities of the Moran process, but we used a branching process approximation for the probability q that the type l mutation at time \(\tau \) gets a type j descendant. This approximation relies on (12.150), and it means that descendants of the type l mutation that are alive at the same time point, have independent lines of descent after this time point. For instance, in the second term on the right hand side of (12.159), a type i individual dies and the type l individual reproduces (\(D=0\), \(R=1\)). Then there are three possibilities: First, the offspring of the type l individual mutates to \(l+1\) with probability u. Since the type l individual and its type \(l+1\) offspring have independent lines of descent, the probability is \(1-(1-q^\prime )(1-q)=q^\prime +q-q^\prime q\) that at least one of them gets a type j descendant. Second, if the offspring mutates back to \(l-1\) (with probability v), its type l parent has a probability q of getting a type j descendant. Third, if the offspring does not mutate (with probability \(1-u-v\)), there are two type l individuals, with a probability \(1-(1-q)^2 = 2q-q^2\) that at least one of them gets a type j offspring.
Equation (12.159) is quadratic in q. Dividing both sides of it by \(s/(N-1+s)\), it can be seen, after some computations, that this equation simplifies to \(aq^2 + bq + c = 0\), with
as \(N\rightarrow \infty \). When simplifying the formula for b, we used (12.158) in the second step, the induction hypothesis (12.147) in the last step (since it implies \(q^\prime \sim r^\prime \)), and additionally we assumed in the last step that \((1+q^\prime )u+v = o(\sqrt{ur^\prime })\). In order to justify this, from the second equation of (12.46) we know that \(v=O(u)\), and since \(q^\prime \le 1\), it suffices to verify that \(u = o(\sqrt{ur^\prime })\), or equivalently that \(r^\prime =\varOmega (u)\). But this follows from (12.46), (12.141), and the fact that \(u=u_{l+1}\), since
where in the last step we used that \(l\le j-1\). This verifies the asymptotic approximation of b in (12.160).
To conclude, in order to prove (12.157), we notice that the only positive solution to the quadratic equation in q, with coefficients as in (12.160), is
where in the last step we invoked the definition of \(R(\rho )\) in (12.41). This finishes the proof of the induction step (12.149) or (12.157), and thereby the proof of (12.147).
We end this proof with a remark: Recall that \(r_{ij}\) in (12.36) is an approximation of \(q_{ij}\), obtained from recursion (12.38) or (12.148) when \(j>i\), and from (12.40) when \(j<i\). A more accurate (but less explicit) approximation of \(q_{ij}\) is obtained, when \(i<j\), by recursively solving the quadratic equation \(ax^2 + bx + c=0\), with respect to \(x=r_{i,l-1,j}\) for \(l=j-1,\ldots ,i+1\), and finally putting \(r_{ij}=r_{iij}\). The coefficients of this equation are defined as in (12.160), with \(r^\prime =r_{ilj}\) instead of \(q^\prime \). When \(j<i\), the improved approximation of \(q_{ij}\) is defined analogously. \(\square \)
Lemma 12.4
Let \(\mu _i\) be the rate (12.15) at which a successful forward or backward mutation occurs in a homogeneous type i population, and let \(\hat{\mu }_i\) in (12.42) be its approximation. Define the asymptotic transition probabilities \(\pi _{ij}\) between fixed population states as in (12.17), and their approximations \(\hat{\pi }_{ij}\) as in (12.43). Then
as \(N\rightarrow \infty \), and
Sketch of proof. Consider a time point \(\tau _k\) when the population becomes fixed with type i, so that \({\varvec{Z}}_{\tau _k}={\varvec{e}}_i\). Denote by \(f_{ij}\) the probability that a forward mutation \(i\rightarrow i+1\), appearing at some time point later than \(\tau _k\), is the first successful mutation after \(\tau _k\), that its descendants have taken over the population by time \(\tau _{k+1}\), and that all of them by that time have type j (so that \({\varvec{Z}}_{\tau _{k+1}}={\varvec{e}}_j\)). Likewise, when \(j<i\) and \(i\ge 1\), we let \(b_{ij}\) refer to the probability that if a backward mutation \(i\rightarrow i-1\) arrives, it is successful, its descendants have taken over the population by time \(\tau _{k+1}\), and all of them have type j. For definiteness we also put \(b_{0j}=0\). We argue that
since the event that the population at time \(\tau _{k+1}\) has descended from more than one \(i\rightarrow i\pm 1\) mutation that occurred in the time interval \((\tau _k,\tau _{k+1})\) is asymptotically negligible.
Let \(\beta _j({\varvec{z}})\) be the probability that the descendants of a type j individual, who lives in a population with type configuration \({\varvec{z}}\), take over the population so that it becomes homogeneous of type j. Although \(\beta _j({\varvec{z}})\) depends on the mutation rates \(u_1,\ldots ,u_m,v_0,\ldots ,v_{m-1}\) as well as the selection coefficients \(s_1,\ldots ,s_m\), this is not made explicit in the notation. The probabilities \(f_{ij}\) and \(b_{ij}\) in (12.163) can be written as a product
of two terms. Recall that the first term, \(q_{ij}\), is the probability that the first successful mutation \(i\rightarrow i\pm 1\) at time \(\tau _{k+1}^\prime > \tau _k\) has a descendant that mutates into type j at some time \(\tau _{k+1}^{\prime \prime }\in (\tau _{k+1}^{\prime },\tau _{k+1})\). The second term is the probability that this mutation has spread to the rest of the population by time \(\tau _{k+1}\). The conditional expectation of this second term is with respect to variations in \({\varvec{Z}}_{\tau _{k+1}^{\prime \prime }}\), and the conditioning is with respect to \(A_{j}\), the event that the mutation at time \(\tau _{k+1}^{\prime \prime }\) is into type j.
In order to compare the transition rates in (12.163) with the approximate ones in (12.35), we notice that the latter can be written as
where
\(r_{ij}\) is the approximation of \(q_{ij}\) defined in (12.36), whereas \(\beta (s_{j}/s_i)\) is the probability that a single type j individual gets fixed in a population without mutations, where all other individuals have type i.
We will argue that the probabilities in (12.166) are asymptotically accurate approximations of those in (12.164), for all pairs i, j of states that dominate asymptotically, that is, those pairs for which \(j\in \{B(i),F(i)\}\). In Lemma 12.3 we motivated that \(r_{ij}\) is an asymptotically accurate approximation of \(q_{ij}\) for all such pairs of states. Likewise, we argue that \(\beta (s_{j}/s_i)\) is a good approximation of the conditional expectation in (12.164). Indeed, following the reasoning of Lemma 12.3, since none of the intermediate alleles, between i and j, will reach a high frequency before the type j mutant appears at time \(\tau _{k+1}^{\prime \prime }\), it follows that most of the other \(N-1\) individuals will have type i at this time point. Consequently,
as \(N\rightarrow \infty \). In the last step of (12.167) we used that new mutations between time points \(\tau _{k+1}^{\prime \prime }\) and \(\tau _{k+1}\) can be ignored, because of the smallness (12.4)–(12.5) of the mutation rates. Since \(\beta _j\left( (N-1){\varvec{e}}_i/N + {\varvec{e}}_j/N\right) \) is the fixation probability of a single type j mutant that has selection coefficient \(s_j/s_i\) relative to the other \(N-1\) type i individuals, it is approximately equal to the corresponding fixation probability \(\beta (s_j/s_i)\) of a mutation free Moran model. It therefore follows from (12.164) and (12.166) that
as \(N\rightarrow \infty \).
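As a numerical aside (not part of the formal argument), recall that \(\beta (s)=\beta _N(s)\) is the fixation probability of a single mutant of relative fitness s among \(N-1\) residents in a mutation-free Moran model. Its standard closed form is \(\beta _N(s)=(1-s^{-1})/(1-s^{-N})\), with \(\beta _N(1)=1/N\) by continuity; assuming this is the \(\beta \) intended here, the sketch below checks the formula against a seeded Monte Carlo simulation of the two-type Moran model.

```python
import random

def beta(s, N):
    """Standard fixation probability of a single mutant of relative fitness s
    among N - 1 residents in a mutation-free Moran model."""
    if s == 1.0:
        return 1.0 / N
    return (1.0 - 1.0 / s) / (1.0 - s ** (-N))

def fixes(s, N, rng):
    """One run of the two-type Moran model started from one mutant;
    returns True if the mutant type reaches fixation."""
    k = 1  # current number of mutants
    while 0 < k < N:
        # reproduction is fitness-weighted, death is uniform and independent
        mutant_reproduces = rng.random() < k * s / (k * s + (N - k))
        mutant_dies = rng.random() < k / N
        if mutant_reproduces and not mutant_dies:
            k += 1
        elif mutant_dies and not mutant_reproduces:
            k -= 1
    return k == N

rng = random.Random(12345)
est = sum(fixes(2.0, 20, rng) for _ in range(2000)) / 2000
assert abs(est - beta(2.0, 20)) < 0.06   # loose Monte Carlo agreement
assert abs(beta(1.0, 100) - 1.0 / 100) < 1e-15
```

The neutral value \(\beta (1)=1/N\) is the one used later in the motivation of formula (12.114).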
Next we consider pairs of types i, j such that \(j\notin \{B(i),F(i)\}\). We know from (12.44), (12.165) and (12.166) that \(\hat{f}_{il}=o(\hat{f}_{iF(i)})\) for all \(l>i\) such that \(l\ne F(i)\). It is therefore reasonable to assume that \(f_{il}=o(f_{iF(i)})\) as well for all \(l>i\) with \(l\ne F(i)\), although \(\hat{f}_{il}\) need not necessarily be a good approximation of \(f_{il}\) for all these l. The same argument also applies to backward mutations when \(B(i)\ne \emptyset \) and \(\hat{\pi }_{iB(i)}>0\), that is, we should have \(f_{il}=o(f_{iB(i)})\) for all \(l<i\) such that \(l\ne B(i)\).
Putting things together, it follows from (12.44), (12.163), (12.165), (12.168), and the last paragraph that the approximate rate (12.42) at which a homogeneous type i population is transferred into a new fixed state, satisfies
as \(N\rightarrow \infty \), in agreement with (12.161). Formulas (12.16)–(12.17), (12.43)–(12.44), (12.163), (12.165), and (12.168)–(12.169) also motivate why \(\pi _{ij}\) should equal \(\hat{\pi }_{ij}\), in accordance with (12.162). \(\square \)
Lemma 12.5
The regularity condition (12.47) of Theorem 12.2 implies that (12.26) holds.
Sketch of proof. Suppose \({\varvec{Z}}_{\tau _k}={\varvec{e}}_i\) and \({\varvec{Z}}_{\tau _{k+1}}={\varvec{e}}_j\) for some \(i\in I_{\text{ as }}\) and \(j\ne i\). Write
If \(j>i\), then the successful mutation at time \(\tau _{k+1}^\prime \) is from i to \(i+1\). This type \(i+1\) mutation has a line of descent with individuals that mutate to types \(i+2,\ldots ,j\), before the descendants of the type j mutation take over the population. The first term \(\sigma _{\text{ tunnel }}=\tau _{k+1}^{\prime \prime }-\tau _{k+1}^\prime \) on the right hand side of (12.170) is the time it takes for the type \(i+1\) mutation to tunnel into type j. It is the sum of \(\sigma _l\), the time it takes for the type \(l+1\) mutation to appear after the type l mutation, for all \(l=i+1,\ldots ,j-1\). The second term \(\sigma _{\text{ fix }}=\tau _{k+1}-\tau _{k+1}^{\prime \prime }\) on the right hand side of (12.170) is the time it takes for j to get fixed after the j mutation first appears. When \(j<i\), we interpret the terms of (12.170) analogously. It follows from (12.170) that in order to prove (12.26), it suffices to show that
as \(N\rightarrow \infty \) for all asymptotic states \(i\in I_{\text{ as }}\). When \(j>i\), we know from (12.44) and (12.162) that with probability tending to 1, \(j=F(i)\). Following the argument from the proof of Theorem 2 of Durrett et al. [20], we have that
In the special case when \(l=i+1\) and \(j=i+2\), formula (12.172) can also be deduced from the proof of Theorem 12.3, by looking at \(G(x)/G(\infty )\) in (12.191). Using (12.172), we obtain the upper part of (12.171), since
In the second step of (12.173) we used that \(q_{iij}\le q_{ilj}\) for \(i<l\), which follows from the definition of these quantities, in the third step we invoked \(q_{ij}=q_{iij}\), and in the fourth step we applied the relation
The first step of (12.174) is motivated as in Lemma 12.4, since \(j=F(i)\) and hence \(\pi _{ij}>0\), whereas the second step follows from (12.4) and the fact that \(\beta (s_i/s_j)\) is bounded by 1. Finally, the fourth step of (12.173) follows from the definition of \(\mu _{\text{ min }}\) in (12.24), since (12.174) applies to any \(i\in I_{\text{ as }}\). When \(j<i\), the first part of (12.171) is shown analogously.
In order to verify the second part of (12.171), we know from the motivation of Lemma 12.4 that with high probability, \(\sigma _{\text{ fix }}\) is the time it takes for descendants of the type j mutation to take over the population, ignoring the probability that descendants of other individuals first mutated into j and then some of them survived up to time \(\tau _{k+1}\) as well. We further recall from Lemma 12.4 that because of the smallness (12.4)–(12.5) of the mutation rates, right after the j mutation has arrived at time \(\tau _{k+1}^{\prime \prime }\), we may assume that the remaining \(N-1\) individuals have type i, and after that no other mutation occurs until the j allele gets fixed at time \(\tau _{k+1}\). With these assumptions, \(\sigma _{\text{ fix }}\) is the time for one single individual with selection coefficient \(s_j/s_i\) to get fixed in a two-type Moran model without mutations, where all other individuals have selection coefficient 1. From Sect. 12.5 it follows that \(E(\sigma _{\text{ fix }})\sim \alpha (s_j/s_i)\), and therefore the second part of (12.171) will be proved if we can verify that
holds for all \(i\in I_{\text{ as }}\) and \(j\in \{B(i),F(i)\}\) as \(N\rightarrow \infty \). This is equivalent to showing that
as \(N\rightarrow \infty \), where the \(\alpha ^{-1}(s_{B(i)}/s_i)\)-term is included only when \(B(i)\ne \emptyset \) (or equivalently, when \(\pi _{iB(i)}>0\)). Using (12.44), (12.46), (12.141), (12.161), (12.168), and (12.169), we find that
Inserting (12.176) into the definition of \(\mu _{\text{ min }}\) in (12.24), we obtain
and formula (12.175) follows, because of (12.47). \(\square \)
Proof of Theorem 12.2. We need to establish that the limit result (12.49) of Theorem 12.2 follows from Theorem 12.1. To this end, we first need to show that all \(\hat{\lambda }_{ij}\) are good approximations of \(\lambda _{ij}\), in the sense specified by Theorem 12.2, i.e. \(\pi _{ij}=\hat{\pi }_{ij}\) and \(\hat{\mu }_i/\hat{\mu }_{\text{ min }}\rightarrow \kappa _i\) as \(N\rightarrow \infty \). But this follows from Lemma 12.4, and the definitions of \(\mu _{\text{ min }}\) and \(\hat{\mu }_{\text{ min }}\) in (12.24) and Theorem 12.2. Then it remains to check the two regularity conditions (12.18) and (12.26) of Theorem 12.1 that are not present in Theorem 12.2. But (12.18) follows from (12.44) and (12.162), since these two equations imply \(\pi _{iF(i)}>0\) for all \(i=0,\ldots ,m-1\), and (12.26) follows from Lemma 12.5. \(\square \)
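The phase-type representation underlying Theorem 12.2 is straightforward to evaluate numerically: if S denotes a sub-intensity matrix over the transient fixed states and \(\alpha \) the initial distribution, then \(P(T\le t)=1-\alpha e^{St}{\varvec{1}}\) and \(E(T)=-\alpha S^{-1}{\varvec{1}}\). The sketch below is an illustration only, using a small made-up sub-intensity matrix rather than the \(\varSigma \) of this paper.

```python
import numpy as np

def expm(A, terms=25):
    """Matrix exponential via scaling and squaring with a short Taylor series."""
    norm = np.linalg.norm(A, 1)
    n = max(int(np.ceil(np.log2(norm))) + 1, 0) if norm > 1e-12 else 0
    B = A / 2.0 ** n                      # scaled so the series converges fast
    E, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ B / k
        E = E + term
    for _ in range(n):                    # undo the scaling by squaring
        E = E @ E
    return E

def phase_type_cdf(alpha, S, t):
    """P(T <= t) = 1 - alpha exp(St) 1 for a phase-type waiting time T."""
    return 1.0 - alpha @ expm(S * t) @ np.ones(S.shape[0])

# made-up 3-state sub-intensity matrix (NOT the Sigma of this paper)
S = np.array([[-1.0, 1.0, 0.0],
              [0.2, -1.2, 1.0],
              [0.0, 0.3, -1.3]])
alpha = np.array([1.0, 0.0, 0.0])

ET = -alpha @ np.linalg.solve(S, np.ones(3))   # E(T) = -alpha S^{-1} 1
assert abs(phase_type_cdf(alpha, S, 0.0)) < 1e-12
assert phase_type_cdf(alpha, S, 1.0) < phase_type_cdf(alpha, S, 5.0) < 1.0
```

For the birth-death structure of the fixed-state process, S would be tridiagonal with forward, backward, and diagonal entries given by (12.107).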
Proof of (12.109). Let
be the standardized expected waiting time until all m mutations have appeared and spread in the population, given that it starts in fixed state i. Our goal is to find an explicit formula for \(\theta _0\), and then show that (12.109) is an asymptotically accurate approximation of this explicit formula as \(m\rightarrow \infty \).
Recall that \(\varSigma _{ij}\) in (12.107) are the elements of the intensity matrix, for the Markov process that switches between fixed population states, when time has been multiplied by \(\hat{\mu }_{\text{ min }}=u\). When the population is in fixed state i, the standardized expected waiting time until the next transition is \(1/(-\varSigma _{ii})\). By conditioning on what happens at this transition, it can be seen that the standardized expected waiting times in (12.177) satisfy the recursive relation
for \(i=0,1,\ldots ,m-1\), assuming \(\theta _{-1}=0\) on the right hand side of (12.178) when \(i=0\), and similarly \(\theta _m=0\) when \(i=m-1\). Inserting the values of \(\varSigma _{ij}\) from (12.107) into (12.178), we can rewrite the latter equation as
and
for \(i=1,\ldots ,m-1\), respectively. We obtain an explicit formula for \(\theta _0\) by first solving the linear recursion for \(\theta _i-\theta _{i+1}\) in (12.179)–(12.180), and then summing over i. This yields
where
Formulas (12.181)–(12.182) provide the desired explicit formula for \(\theta _0\). When \(C=0\), it is clear that
where \(\gamma \approx 0.5772\) is the Euler–Mascheroni constant. This proves the upper half of (12.109). For \(C>0\), we will show that when m gets large, the (standardized) expected waiting time until the last mutant gets fixed, \(\theta _{m-1}-\theta _m=\theta _{m-1}\), dominates the first sum in (12.181). To this end, we first look at \(\theta _{m-1}\), and rewrite this quantity as
where
are two binomially distributed random variables. For large m, we apply the Law of Large Numbers to \(Y_{m-1}\) and find that
in agreement with the lower half of (12.109). In view of (12.181), in order to finalize the proof of (12.109), we need to show that the sum of \(\theta _{m-j}-\theta _{m-j+1}\) for \(j=2,3,\ldots ,m\), is of smaller order than (12.184). An argument similar to that in (12.183) leads to
where
For large m we have, by the Law of Large Numbers, that
By summing (12.186) over j, it is easy to see that
as \(m\rightarrow \infty \). Together with (12.184), this completes the derivation of the lower part of (12.109). \(\square \)
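The conditioning recursion (12.178) and the linear-system characterization of the expected absorption times can be cross-checked numerically. The sketch below uses a small made-up birth-death intensity matrix (not the \(\varSigma \) of (12.107)), solves for \(\theta =(\theta _0,\ldots ,\theta _{m-1})\), and verifies (12.178); with the illustrative choice of pure forward rates \(m-i\), \(\theta _0\) reduces to the harmonic number \(H_m\approx \ln m+\gamma \), mirroring the \(C=0\) case of (12.109).

```python
import math
import numpy as np

def expected_times(Sigma):
    """Expected absorption times theta_i for an intensity matrix Sigma over
    states 0..m with state m absorbing: Sigma[:m, :m] @ theta = -1."""
    m = Sigma.shape[0] - 1
    return np.linalg.solve(Sigma[:m, :m], -np.ones(m))

# made-up birth-death intensity matrix (NOT the Sigma of (12.107))
m = 4
Sigma = np.zeros((m + 1, m + 1))
for i in range(m):
    up = 1.0
    down = 0.4 if i > 0 else 0.0
    Sigma[i, i + 1] = up
    if i > 0:
        Sigma[i, i - 1] = down
    Sigma[i, i] = -(up + down)

theta = expected_times(Sigma)
theta_ext = np.append(theta, 0.0)  # theta_m = 0

# conditioning identity (12.178):
# theta_i = 1/(-Sigma_ii) + sum_{j != i} Sigma_ij/(-Sigma_ii) * theta_j
for i in range(m):
    rhs = 1.0 / (-Sigma[i, i]) + sum(
        Sigma[i, j] / (-Sigma[i, i]) * theta_ext[j]
        for j in range(m + 1) if j != i)
    assert abs(theta[i] - rhs) < 1e-10

# pure forward chain with rates m - i gives theta_0 = H_m = 1 + 1/2 + ... + 1/m
m2 = 50
Sigma2 = np.zeros((m2 + 1, m2 + 1))
for i in range(m2):
    Sigma2[i, i + 1] = m2 - i
    Sigma2[i, i] = -(m2 - i)
H = sum(1.0 / k for k in range(1, m2 + 1))
assert abs(expected_times(Sigma2)[0] - H) < 1e-9
# H_m = ln m + gamma + O(1/m), with gamma the Euler-Mascheroni constant
assert abs(H - (math.log(m2) + 0.5772156649)) < 1.0 / (2 * m2) + 1e-4
```

The rates above are placeholders; they only illustrate the structure of the computation, not the actual entries of (12.107).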
Sketch of proof of Theorem 12.3. Our proof will parallel that of Theorem 1 in Durrett et al. [20], see also Wodarz and Komarova [66]. We first use formula (12.66) in order to deduce that the ratio between the two rates of fixation from a type 0 population satisfies \(\hat{\lambda }_{02}/\hat{\lambda }_{01} \rightarrow \infty \) as \(N\rightarrow \infty \). When \(\rho =0\) in (12.51), this is a consequence of \(\hat{\lambda }_{02}/\hat{\lambda }_{01}\sim N\sqrt{u_2}\) and the assumption \(N\sqrt{u_2}\rightarrow \infty \) on the second mutation rate \(u_2\). When \(\rho < 0\), \(\hat{\lambda }_{02}/\hat{\lambda }_{01}\) tends to infinity at an even faster rate, due to the \(\psi (\rho u_2^{1/2})\)-term of \(\hat{\lambda }_{01}\) in (12.66). In any case, it follows that condition (12.44) is satisfied, with \(F(0)=2\) and \(\hat{\pi }_{02}=1\). That is, tunneling from 0 to 2 will occur with probability tending to 1 as \(N\rightarrow \infty \) whether \(\rho =0\) or \(\rho <0\). As in the proof of Lemma 12.3 we conclude from this that the fraction \(Z_t=Z_{t1}\) of allele 1 will stay close to 0, and we may use a branching process approximation for \(Z_t\). A consequence of this approximation is that type 1 mutations arrive according to a Poisson process with intensity \(Nu_1\), and the descendants of different type 1 mutations evolve independently. Let \(0<\sigma \le \infty \) be the time it takes for the first type 2 descendant of a type 1 mutation to appear. In particular, if \(\sigma =\infty \), this type 1 mutation has no type 2 descendants. Letting \(G(x)=P(\sigma \le x)\) be the distribution function of \(\sigma \), it follows by a Poisson process thinning argument that
We use Kolmogorov’s backward equation in order to determine G. To this end, we will first compute \(G(x+h)\) for a small number \(h>0\), by conditioning on what happens during the time interval (0, h). As in formulas (12.121)–(12.122) of Appendix B, we let \(a_{ij}(z)\) refer to the rate at which a type i individual dies and gets replaced by the offspring of a type j individual, when the number of type 1 individuals before the replacement is Nz. Since we look at the descendants of one type 1 individual, we have that \(z=Z_0=1/N\). Using a similar argument as in Eq. (12.159), it follows from this that
for small \(h>0\). Notice that the two \(a_{00}(1/N)\) terms cancel out in (12.188), whereas \(a_{11}(1/N)(1-G(x))u_2\times h=O(N^{-2}u_2\times h)\) is too small to have an asymptotic impact. Using formulas (12.121)–(12.122) for \(a_{01}(1/N)\) and \(a_{10}(1/N)\), it follows that (12.188) simplifies to
when all asymptotically negligible terms are put into the remainder term. Letting \(h\rightarrow 0\), we find that G(x) satisfies the differential equation
where
are the two roots of the quadratic equation \(-sy^2 + (s-1)y+su_2=0\). Recall from (12.51) that \(s=1+\rho \sqrt{u_2}\). We may therefore express these two roots as
where in the second step we used that \(u_2\rightarrow 0\) and \(s\rightarrow 1\) as \(N\rightarrow \infty \), and in the last step we invoked (12.41), the definition of \(R(\rho )\). Since \(r_2< 0 < r_1\), and \(G^\prime (x)\rightarrow 0\) as \(x\rightarrow \infty \), it follows from (12.189) that we must have \(G(\infty )=r_1\). Together with the other boundary condition \(G(0)=0\), this gives the solution
to the differential equation (12.189), with
and
Putting things together, we find that
where formula (12.190) was used in the first step and (12.187) in the second step, whereas in the third step we changed variables to \(y=Nu_1r_1 s\times x\) and introduced the hazard function \(h(x) = G\left( x/(Nu_1r_1s)\right) /(sr_1)\). If \(Nu_1\rightarrow a >0\) as \(N\rightarrow \infty \), it follows from (12.191) and the fact that \(s\rightarrow 1\) that we can rewrite the hazard function as
We finally obtain the limit result (12.110)–(12.111) when \(a>0\) from (12.193) and (12.194), using (12.192) and the fact that
When \(Nu_1\rightarrow 0\), one similarly shows that (12.193) holds, with \(h(x)=1\). Finally, formula (12.112) follows by integrating (12.193) with respect to t. \(\square \)
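The root computation in (12.192) is easy to check numerically. The sketch below solves the quadratic \(-sy^2+(s-1)y+su_2=0\) under the parametrization \(s=1+\rho \sqrt{u_2}\) of (12.51), verifies that \(r_1\sim \sqrt{u_2}\,R(\rho )\) with \(R(\rho )=(\rho +\sqrt{\rho ^2+4})/2\) (the limiting form the exact roots approach), and integrates the differential equation for G, assumed here to have the Riccati form \(G^\prime =-s(G-r_1)(G-r_2)\) consistent with the boundary behaviour \(G(0)=0\), \(G^\prime (x)\rightarrow 0\), \(G(\infty )=r_1\) described above.

```python
import math

def roots(s, u2):
    """Exact roots r2 < 0 < r1 of -s*y**2 + (s - 1)*y + s*u2 = 0."""
    disc = math.sqrt((s - 1.0) ** 2 + 4.0 * s * s * u2)
    return ((s - 1.0) + disc) / (2.0 * s), ((s - 1.0) - disc) / (2.0 * s)

def R(rho):
    """Limiting factor (rho + sqrt(rho**2 + 4))/2, so that r1 ~ sqrt(u2)*R(rho)."""
    return (rho + math.sqrt(rho * rho + 4.0)) / 2.0

u2, rho = 1e-6, -0.5
s = 1.0 + rho * math.sqrt(u2)     # the parametrization (12.51)
r1, r2 = roots(s, u2)
assert r2 < 0.0 < r1
assert abs(r1 / (math.sqrt(u2) * R(rho)) - 1.0) < 1e-3

# Assumed Riccati form of the ODE: G'(x) = -s*(G - r1)*(G - r2), G(0) = 0.
# Explicit Euler integration; G increases monotonically towards G(inf) = r1.
G, dx = 0.0, 1.0
for _ in range(50_000):
    G += dx * (-s) * (G - r1) * (G - r2)
assert 0.0 < G <= r1 + 1e-15
assert abs(G / r1 - 1.0) < 1e-9
```

For \(\rho =0\) (the neutral case \(s=1\)) the positive root collapses to \(r_1=\sqrt{u_2}\), recovering the classical tunneling rate of Durrett et al. [20].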
Motivation of formula (12.114). We will motivate formula (12.114) in terms of the transition rates \(\hat{\lambda }_{ij}\) in (12.35), rather than those in (12.113) that are adjusted for tunneling and fixation of alleles.
Since we assume \(s_1=\cdots =s_{m-1}=1<s_m\) in (12.114), it follows from (12.35) that it is increasingly difficult to have backward and forward transitions over larger distances, except that it is possible for some models to have a direct forward transition to the target allele m. By this we mean that the backward and forward transition rates from any state i satisfy \(\hat{\lambda }_{i,i-1}\gg \cdots \gg \hat{\lambda }_{i0}\), and \(\hat{\lambda }_{i,i+1}\gg \cdots \gg \hat{\lambda }_{i,m-1}\) respectively, as \(N\rightarrow \infty \). For this reason, from any fixed state i, it is only possible to have competition between the two forward transitions \(i\rightarrow i+1\) and \(i\rightarrow m\) when \(0\le i\le m-2\). Since \(\gamma _i=(\hat{\lambda }_{im}/\hat{\lambda }_{i,i+1})^2\), and since the transition rates to the intermediate alleles \(i+1,\ldots ,m-1\) are of a smaller order than the transition rate to \(i+1\), it follows that (12.35) predicts a total forward rate of fixation from fixed state i of the order
where in the last step we used that \(s_i=s_{i+1}\) and \(\beta (1)=1/N\). We will extend the argument in the proof of Theorem 3 in Durrett et al. [20], and indicate that the total forward rate of fixation from i should rather be
where \(\chi (\cdot )\) is the function defined in (12.63). This will also motivate (12.114), since this formula serves the purpose of modifying the incorrect forward rate of fixation (12.195), so that it equals the adjusted one in (12.196), keeping the relative sizes of the different forward rates \(i\rightarrow j\) of fixation intact for \(j=i+1,\ldots ,m\).
The rationale for (12.196) is that type \(i+1\) mutations arrive according to a Poisson process at rate \(Nu_{i+1}\), and \(\chi /N\) is the probability that any such type \(i+1\) mutation has descendants of type \(i+1\) or m that spread to the whole population. We need to show that
To this end, let \(X_t\) be the fraction of descendants of an \(i\rightarrow i+1\) mutation, Nt time units after this mutation appeared. We stop this process at a time point \(\tau \) when \(X_t\) reaches either of the two boundary points 0 or 1 (\(X_\tau =0\) or 1), or when a successful mutation \(i+1\rightarrow i+2\) appears before that, which is a descendant of the type \(i+1\) mutation that itself will have type m descendants who spread to the whole population, before any other type gets fixed (\(0<X_\tau <1\)). We have that \(x=X_0=1/N\), but define
for any value of x. This is a non-fixation probability, i.e. the probability that the descendants of Nx individuals of type \(i+1\) at time \(t=0\) neither have a successful type \(i+2\) descendant, nor take over the population before that. Since the descendants of a single type \(i+1\) mutation take over the population with probability \(1-\bar{\beta }(1/N)\), it is clear that
Durrett et al. [20] prove that it is possible to neglect the impact of further \(i\rightarrow i+1\) mutations after time \(t=0\). It follows that \(X_t\) will be a version of the Moran process of Appendix B with \(s=s_{i+1}/s_i=1\), during the time interval \((0,\tau )\), when time is sped up by a factor of N. Using (12.123)–(12.124), we find that the infinitesimal mean and variance functions of \(X_t\) are
respectively. At time t, a successful type \(i+2\) mutation arrives at rate
where in the second step we used \(r_{im}^2=u_{i+2}r_{i+1,m}\), which follows from (12.36), since all \(R(\rho _{ilj})=1\) when \(s_1=\cdots =s_{m-1}=1\). Then in the third step we used \(\hat{\lambda }_{im}/\hat{\lambda }_{i,i+1}=Nr_{im}\beta (s_m)\), which follows from (12.35), and in the last step we introduced the short notation \(\gamma ^\prime = \gamma _i\beta (s_m)^{-1}\). (One instance of \(\gamma ^\prime \) is presented for the boundary scenarios of Sect. 12.7.2.1, below formula (12.105).)
We will use (12.199)–(12.200) and Kolmogorov’s backward equation in order to derive a differential equation for \(\bar{\beta }(x)\). Consider a fixed \(0<x<1\), and let \(h>0\) be a small number. We then condition on what happens during the time interval \((0,h)\). When h is small, it is unlikely that the process \(X_t\) stops because it hits either of the boundaries 0 or 1, i.e.
as \(h\rightarrow 0\). The non-fixation probability can therefore be expressed as
Letting \(h\rightarrow 0\), we find from (12.199) that \(\bar{\beta }(x)\) satisfies the differential equation
Durrett et al. [20] use a power series argument to prove that the solution of (12.201), with boundary conditions \(\bar{\beta }(0)=1\) and \(\bar{\beta }(1)=0\), is
Recalling (12.63) and that \(\gamma ^\prime =\gamma _i/\beta (s_m)\), we deduce formula (12.197) from (12.198) and differentiation of (12.202) with respect to x. \(\square \)
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Hössjer, O., Bechly, G., Gauger, A. (2018). Phase-Type Distribution Approximations of the Waiting Time Until Coordinated Mutations Get Fixed in a Population. In: Silvestrov, S., Malyarenko, A., Rančić, M. (eds) Stochastic Processes and Applications. SPAS 2017. Springer Proceedings in Mathematics & Statistics, vol 271. Springer, Cham. https://doi.org/10.1007/978-3-030-02825-1_12
Print ISBN: 978-3-030-02824-4
Online ISBN: 978-3-030-02825-1