On selection in finite populations

Journal of Mathematical Biology

Abstract

Two major forces shaping evolution are drift and selection. The standard models of neutral drift—the Wright–Fisher (WF) and Moran processes—can be extended to include selection. However, these standard models are not always applicable in practice, and—even without selection—many other drift models make very different predictions. For example, “generalised Wright–Fisher” models (so-called because their first two conditional moments agree with those of the WF process) can yield wildly different absorption times from WF. Additionally, evolutionary stability in finite populations depends only on fixation probabilities, which can be evaluated under less restrictive assumptions than those required to estimate fixation times or more complex population-genetic quantities. We therefore distill the notion of a selection process into a broad class of finite-population, mutationless models of drift and selection (including the WF and Moran processes). We characterize when selection favours fixation of one strategy over another, for any selection process, which allows us to derive finite-population conditions for evolutionary stability independent of the selection process. In applications, the precise details of the selection process are seldom known, yet by exploiting these new theoretical results it is now possible to make rigorously justifiable inferences about fixation of traits.

Notes

  1. Fitnesses need not be defined for \(i=0\) or N, as in these extremes the population is homogeneous and there is no variability in fitness.

  2. An invading strategy may appear in the population by immigration, mutation, or (in the case of cultural traits) innovation.

  3. In fact, not all GWF models with selection satisfy our definition of selection processes (Definition 2.1); see Appendix B.4.

  4. We assume here that the fitnesses \(\overline{W}_{A}\left( {j}\right) \) and \(\overline{W}_{B}\left( {j}\right) \) are positive for \(1\le j\le N-1\).

  5. In the original version of the EW process (Eldon and Wakeley 2006), the fitnesses of both types were equal, and the parent agent was guaranteed to survive to the next generation. Der et al. (2012) generalized the original model to types with different fitnesses, but in their version of the EW process, it is possible for the parent to be chosen for replacement. Here, we reformulate Der et al.’s extended model while retaining the original condition that the reproducing agent cannot be chosen for replacement. In contrast to our version of the EW process, setting \(U=2\) in Der et al.’s version yields the Moran process exactly.

  6. GWF processes also allow for mutation, which is not discussed here; see Der (2010); Der et al. (2011).

  7. Taking \(\{\nu _{k}\}_{k=1}^{N}\) to be exchangeable multinomial variables yields the neutral WF process (Cannings 1974).

  8. This occurs when the two types effectively make up two sub-populations that reproduce and evolve independently.

  9. Importantly, note that assuming equal mean numbers of offspring in Lessard and Ladret’s model, \(\mu _{A}\left( i\right) =\mu _{B}\left( i\right) =1\), does not recover Cannings’ original (neutral) model.

References

  • Acerbi A, Bentley RA (2014) Biases in cultural transmission shape the turnover of popular traits. Evol Hum Behav 35(3):228–236. doi:10.1016/j.evolhumbehav.2014.02.003

  • Alexander H, Wahl L (2008) Fixation probabilities depend on life history: fecundity, generation time and survival in a burst-death model. Evolution 62(7):1600–1609. doi:10.1111/j.1558-5646.2008.00396.x

  • Allen B, Nowak MA, Dieckmann U (2013) Adaptive dynamics with interaction structure. Am Nat 181(6):E139–E163

  • Aoki K, Lehmann L, Feldman MW (2011) Rates of cultural change and patterns of cultural accumulation in stochastic models of social transmission. Theor Popul Biol 79(4):192–202. doi:10.1016/j.tpb.2011.02.001

  • Bentley RA, Hahn MW, Shennan SJ (2004) Random drift and culture change. Proc R Soc Lond B: Biol Sci 271(1547):1443–1450

  • Cannings C (1974) The latent roots of certain markov chains arising in genetics: a new approach. I. Haploid models. Adv Appl Probab 6:260–290

  • Charlesworth B (2009) Effective population size and patterns of molecular evolution and variation. Nat Rev Genet 10(3):195–205

  • Chia A, Watterson G (1969) Demographic effects on the rate of genetic evolution: I. Constant size populations with two genotypes. J Appl Probab 6:231–248

  • Chung KL (1967) Markov chains with stationary transition probabilities. Springer, Berlin

  • Der R (2010) A theory of generalised population processes. ProQuest, Philadelphia

  • Der R, Epstein CL, Plotkin JB (2012) Dynamics of neutral and selected alleles when the offspring distribution is skewed. Genetics 191(4):1331–1344

  • Der R, Epstein CL, Plotkin JB (2011) Generalized population models and the nature of genetic drift. Theor Popul Biol 80(2):80–99

  • Durrett R (2008) Probability models for DNA sequence evolution. Springer, Berlin

  • Eldon B, Wakeley J (2006) Coalescent processes when the distribution of offspring number among individuals is highly skewed. Genetics 172(4):2621–2633

  • Eldon B, Wakeley J (2008) Linkage disequilibrium under skewed offspring distribution among individuals in a population. Genetics 178(3):1517–1532

  • Eldon B, Wakeley J (2009) Coalescence times and fst under a skewed offspring distribution among individuals in a population. Genetics 181(2):615–629

  • Ewens WJ (2012) Mathematical population genetics 1: theoretical introduction, volume 27 of interdisciplinary applied mathematics. Springer, Berlin

  • Feller W (1968) An introduction to probability theory and its applications, vol 1, 3rd edn. Wiley, New York

  • Fisher RA (1930) The genetical theory of natural selection: a complete, variorum edn. Oxford University Press, Oxford

  • Haldane JBS (1932) A mathematical theory of natural and artificial selection. Part ix. Rapid selection. In: Mathematical Proceedings of the Cambridge Philosophical Society, vol 28, pp 244–248. Cambridge University Press, Cambridge

  • Hartl DL, Clark AG (2007) Principles of population genetics, 4th edn. Sinauer associates, Sunderland. ISBN 9780878933082

  • Hedgecock D (1994) Does variance in reproductive success limit effective population sizes of marine organisms. In: Beaumont AR (ed) Genetics and evolution of aquatic organisms. Chapman and Hall, London, pp 122–134

  • Hofbauer J, Sigmund K (1998) Evolutionary games and population dynamics. Cambridge University Press, Cambridge

  • Huillet T, Möhle M (2011) Population genetics models with skewed fertilities: a forward and backward analysis. Stoch Model 27(3):521–554

  • Imhof LA, Nowak MA (2006) Evolutionary game dynamics in a Wright-Fisher process. J Math Biol 52(5):667–681

  • Karlin S, McGregor J (1964) Direct product branching processes and related markov chains. PNAS 51(4):598

  • Karlin S, Taylor HM (1975) A first course in stochastic processes, 2nd edn. Academic Press, New York

  • Kurokawa S, Ihara Y (2009) Emergence of cooperation in public goods games. Proc R Soc Lond B: Biol Sci 276(1660):1379–1384

  • Lessard S (2005) Long-term stability from fixation probabilities in finite populations: new perspectives for ESS theory. Theor Popul Biol 68(1):19–27

  • Lessard S, Ladret V (2007) The probability of fixation of a single mutant in an exchangeable selection model. J Math Biol 54(5):721–744

  • Maynard Smith J (1982) Evolution and the theory of games. Cambridge University Press, Cambridge

  • Metz JA, Geritz SA, Meszéna G, Jacobs FJ, Van Heerwaarden JS et al (1996) Adaptive dynamics, a geometrical study of the consequences of nearly faithful reproduction. Stoch Spat Struct Dyn Syst 45:183–231

  • Molina C, Earn DJD (in prep.) Evolutionarily stability in symmetric games in finite populations

  • Moran PAP (1962) The statistical processes of evolutionary theory. Clarendon Press, Oxford

  • Nowak MA (2006) Evolutionary dynamics: exploring the equations of life. Harvard University Press, Cambridge

  • Nowak MA, Sasaki A, Taylor C, Fudenberg D (2004) Emergence of cooperation and evolutionary stability in finite populations. Nature 428(6983):646–650

  • Ohtsuki H (2010) Stochastic evolutionary dynamics of bimatrix games. J Theor Biol 264(1):136–142

  • Ohtsuki H, Hauert C, Lieberman E, Nowak MA (2006) A simple rule for the evolution of cooperation on graphs and social networks. Nature 441(7092):502–505

  • Patwa Z, Wahl LM (2008) The fixation probability of beneficial mutations. J R Soc Interface 5(28):1279–1289

  • Pitman J (1999) Coalescents with multiple collisions. Ann Probab 27(4):1870–1902

  • Proulx S, Day T (2002) What can invasion analyses tell us about evolution under stochasticity in finite populations? Selection 2(1–2):2–15

  • Proulx SR (2000) The ESS under spatial variation with applications to sex allocation. Theor Popul Biol 58(1):33–47

  • Ridley M (2003) Evolution. Wiley, New York. ISBN 9781405103459

  • Sagitov S et al (1999) The general coalescent with asynchronous mergers of ancestral lines. J Appl Probab 36(4):1116–1125

  • Sargsyan O, Wakeley J (2008) A coalescent process with simultaneous multiple mergers for approximating the gene genealogies of many marine organisms. Theor Popul Biol 74(1):104–114

  • Schweinsberg J (2003) Coalescent processes obtained from supercritical Galton-Watson processes. Stoch Process Appl 106(1):107–139

  • Stewart AJ, Plotkin JB (2013) From extortion to generosity, evolution in the iterated prisoners dilemma. PNAS 110(38):15348–15353

  • Tarnita CE, Ohtsuki H, Antal T, Fu F, Nowak MA (2009) Strategy selection in structured populations. J Theor Biol 259(3):570–581

  • Tarnita CE, Wage N, Nowak MA (2011) Multiple strategies in structured populations. PNAS 108(6):2334–2337

  • Wild G, Taylor PD (2004) Fitness and evolutionary stability in game theoretic models of finite populations. Proc R Soc Lond B: Biol Sci 271(1555):2345–2349

  • Wright S (1931) Evolution in mendelian populations. Genetics 16(2):97

  • Wu B, Gokhale CS, Veelen M, Wang L, Traulsen A (2013) Interpretations arising from Wrightian and Malthusian fitness under strong frequency dependent selection. Ecol Evol 3(5):1276–1280

Acknowledgements

DE was supported by NSERC. CM was supported by an Ontario Trillium Scholarship and the McMaster University Department of Mathematics and Statistics. We are grateful to Erol Akçay, Ben Bolker, Sarah Otto and Joshua Plotkin for valuable discussions and comments.

Author information

Corresponding author

Correspondence to Chai Molina.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 176 KB)

Appendices

A Proofs

A.1 Proof of Proposition 2.7

If \(X(0) = 0\) or \(X(0) = N\), nothing remains to be shown.

Let \(C=\{1,2,\ldots ,N-1\}\) and consider \(i\in C\). Suppose, in order to derive a contradiction, that the absorption probability starting from state i is

$$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big ( \exists t\in {\mathbb {N}} \text { such that } X(t)\in \{0,N\}\,|\, X(0)=i \big )\right. <1. \end{aligned}$$
(24)

Then,

$$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big ( X(t)\in C \text { for all } t\in {\mathbb {N}} \,|\, X(0)=i \big )\right. >0. \end{aligned}$$
(25)

If X(t) takes values in C for all times \(t\ge 0\), then since C is finite, at least one index j, \(1\le j\le N-1\) is visited infinitely often, that is, for some j, \(1\le j\le N-1\),

$$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big ( \text {for any } T\ge 0, \text { there exists } t>T \text { such that } X(t)=j \,|\, X(0)=i \big )\right. >0. \end{aligned}$$
(26)

Now note that H1 implies that C is a set of inessential, and therefore nonrecurrent states (see Appendix D.2 in the ESM and Theorem I.4.4 of Chung 1967), which cannot be visited infinitely often (Theorem I.4.3 of Chung 1967), contradicting inequality (26). \(\square \)
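The conclusion of Proposition 2.7 can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes a Moran process (Appendix B.1) with constant, purely illustrative fitnesses `w_a` and `w_b`, and tracks the probability that the chain is still in a mixed state after each step. The function names are hypothetical.

```python
def moran_matrix(N, w_a, w_b):
    """Moran transition matrix for constant fitnesses w_a, w_b (assumed values)."""
    P = [[0.0] * (N + 1) for _ in range(N + 1)]
    P[0][0] = P[N][N] = 1.0  # homogeneous states are absorbing
    for i in range(1, N):
        total = i * w_a + (N - i) * w_b
        up = (i * w_a / total) * ((N - i) / N)    # A reproduces, B dies
        down = ((N - i) * w_b / total) * (i / N)  # B reproduces, A dies
        P[i][i + 1], P[i][i - 1], P[i][i] = up, down, 1.0 - up - down
    return P


def mixed_state_mass(P, start, steps):
    """Probability of still being in a mixed state after 1, 2, ..., steps steps."""
    N = len(P) - 1
    dist = [0.0] * (N + 1)
    dist[start] = 1.0
    history = []
    for _ in range(steps):
        # One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]
        dist = [sum(dist[i] * P[i][j] for i in range(N + 1)) for j in range(N + 1)]
        history.append(sum(dist[1:N]))
    return history


P = moran_matrix(N=8, w_a=1.2, w_b=1.0)
mass = mixed_state_mass(P, start=4, steps=3000)
print(mass[0], mass[99], mass[-1])
```

In this example the mass remaining on the mixed states never increases and becomes numerically negligible, which is what almost-sure absorption predicts.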

A.2 Proof of Proposition 4.1

Define the random variable \(T_A=\left. \min \{t\,|\, X(t) = N\}\right. \), that is, \(T_A\) is the fixation time of A (\(T_A=\infty \) if A never fixes). Similarly, let \(T_B=\left. \min \{t\,|\, X(t) = 0\}\right. \) be the fixation time of B. Both \(T_A\) and \(T_B\) are stopping times (see Definition S3), and hence the absorption time \(T_\mathrm{abs}= \min \{T_A,T_B\}\) is also a stopping time (Karlin and Taylor 1975, p. 256).

Since either A or B must fix (Proposition 2.7),

$$\begin{aligned} {{\mathrm{\mathrm Pr}}}(T_\mathrm{abs}<\infty )=1, \end{aligned}$$
(27)

so

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right) =\left. {{\mathrm{\mathrm Pr}}}\Big (\lim _{t\rightarrow \infty }X(t)=N\,|\, X(0)=i\Big )\right. =\left. {{\mathrm{\mathrm Pr}}}\big (X(T_\mathrm{abs})= N\,|\, X(0)=i\big ),\right. \end{aligned}$$
(28)

and

$$\begin{aligned} \left. {{\mathrm{\mathbb {E}}}}\big (X(T_\mathrm{abs})\,|\, X(0)=i\big )\right. =&\left. {{\mathrm{\mathrm Pr}}}\big (X(T_\mathrm{abs})= 0\,|\, X(0)=i\big )\right. \cdot 0\nonumber \\&+ \left. {{\mathrm{\mathrm Pr}}}\big (X(T_\mathrm{abs})= N\,|\, X(0)=i\big )\right. \cdot N =p_\mathrm{fix}\left( {i}\right) \cdot N. \end{aligned}$$
(29)

For any t, we have \(0\le X(t)\le N\), so it follows that for any stopping time T,

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}\Big (\sup _{t\ge 0}X\big (\min \{T,t\}\big )\Big )< \infty . \end{aligned}$$
(30)

Thus, since X(t) is a martingale and \(T_\mathrm{abs}\) is a stopping time satisfying Eqs. (27) and (30), the optional stopping theorem (Theorem S5) implies that

$$\begin{aligned} i=X(0) = \left. {{\mathrm{\mathbb {E}}}}\big (X(0)\,|\, X(0)=i\big )\right. = \left. {{\mathrm{\mathbb {E}}}}\big (X(T_\mathrm{abs})\,|\, X(0)=i\big ),\right. \end{aligned}$$
(31)

and hence

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right)&=\left. {{\mathrm{\mathrm Pr}}}\Big (\lim _{t\rightarrow \infty }X(t)= N\,|\, X(0)=i\Big )\right. \nonumber \\&= \left. {{\mathrm{\mathrm Pr}}}\big (X(T_\mathrm{abs})= N\,|\, X(0)=i\big )\right. = i/N. \end{aligned}$$
(32)

\(\square \)

Remark A.1

Feller (1968, p.399) gives an alternative proof of Proposition 4.1 that does not rely on the optional stopping theorem.
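As a complement to the proof of Proposition 4.1, the martingale identity can be checked exactly in a small case. The sketch below assumes the neutral Wright–Fisher process (Appendix B.2 with equal fitnesses) and an illustrative population size `N = 7`. It verifies that \(h(i)=i/N\) satisfies the first-step equations \(h(i)=\sum _j P_{i,j}h(j)\) with \(h(0)=0\) and \(h(N)=1\). Because absorption is certain, these equations determine the fixation probabilities. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def neutral_wf_matrix(N):
    """Exact transition matrix of the neutral Wright-Fisher process."""
    P = []
    for i in range(N + 1):
        p = Fraction(i, N)  # probability that a given offspring is of type A
        P.append([comb(N, j) * p**j * (1 - p)**(N - j) for j in range(N + 1)])
    return P


N = 7  # illustrative population size (an assumption)
P = neutral_wf_matrix(N)
h = [Fraction(i, N) for i in range(N + 1)]  # candidate fixation probabilities i/N

# First-step identity: h(i) = sum_j P[i][j] * h(j) for every state i.
harmonic = all(
    sum(P[i][j] * h[j] for j in range(N + 1)) == h[i] for i in range(N + 1)
)
print(harmonic, h[0], h[N])
```

Exact rational arithmetic is used so that the identity is tested as an equality rather than up to rounding error.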

A.3 Proof of Lemma 4.4

Observe that X(t) is a non-negative supermartingale (Definition S2). Thus, for any stopping time S with \({{\mathrm{\mathrm Pr}}}(S<\infty ) = 1\), a version of the optional stopping theorem for supermartingales (Theorem S6) states that

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}\big (X(S)\big )\le {{\mathrm{\mathbb {E}}}}\big (X(0)\big ). \end{aligned}$$
(33)

Using a constant stopping time \(S=\tau \ge 0\), inequality (33) gives

$$\begin{aligned} \left. {{\mathrm{\mathbb {E}}}}\big (X(\tau )\,|\, X(0) = i\big )\right. \le X(0)=i. \end{aligned}$$
(34)

Let \(T_\mathrm{abs}\) be the absorption time for the system. Since \(T_\mathrm{abs}\) is almost surely finite (Proposition 2.7), we can apply inequality (33) with \(S=T_\mathrm{abs}\), together with Eq. (29), to show that for any initial state \(X(0) = i\) (\(0\le i\le N\)), the fixation probability of A satisfies

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right) N= \left. {{\mathrm{\mathbb {E}}}}\big (X(T_\mathrm{abs})\,|\, X(0) = i\big )\right. \le X(0)=i, \end{aligned}$$
(35)

so \(p_\mathrm{fix}\left( {i}\right) \le i/N\), and the fixation probability of B is \(1-p_\mathrm{fix}\left( {i}\right) \ge (N-i)/N\).

Similarly, if we also use H1, then

$$\begin{aligned} p_\mathrm{fix}\left( {\hat{\imath }}\right) N=\left. {{\mathrm{\mathbb {E}}}}\big (X(T_\mathrm{abs})\,|\, X(0) = \hat{\imath }\big )\right. \le \left. {{\mathrm{\mathbb {E}}}}\big (X(1)\,|\, X(0) = \hat{\imath }\big )\right. <\hat{\imath }, \end{aligned}$$
(36)

so \(p_\mathrm{fix}\left( {\hat{\imath }}\right) < \hat{\imath }/N\).

Denoting the probability of reaching state j at time \(\tau \ge 0\) starting from state \(X(0) = i\) by

$$\begin{aligned} P_{{i},{j}}^{(\tau )}=\left. {{\mathrm{\mathrm Pr}}}\big (X(\tau ) = j \,|\, X(0) = i\big ),\right. \end{aligned}$$
(37)

we have \(P_{{i},{j}}^{(\tau )}= (P^\tau )_{i,j}\).

If i leads to \(\hat{\imath }\), then for some time \(\tau \ge 0\), the probability of reaching state \(\hat{\imath }\) from state i is nonzero, \(P_{{i},{\hat{\imath }}}^{(\tau )} >0\). Conditioning on the state arrived at in the \(\tau \)-th time-step, we have

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right)&= \sum _{j=0}^{N}P_{{i},{j}}^{(\tau )} p_\mathrm{fix}\left( {j}\right) = \sum _{\begin{array}{c} j=0\\ j\ne \hat{\imath } \end{array}}^{N}P_{{i},{j}}^{(\tau )}p_\mathrm{fix}\left( {j}\right) + P_{{i},{\hat{\imath }}}^{(\tau )} p_\mathrm{fix}\left( {\hat{\imath }}\right) <\frac{1}{N}\sum _{j=0}^{N}P_{{i},{j}}^{(\tau )}j\nonumber \\&=\frac{1}{N}\left. {{\mathrm{\mathbb {E}}}}\big (X(\tau )\,|\, X(0) = i\big ).\right. \end{aligned}$$
(38)

Using inequality (34), we obtain

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right)&< \frac{1}{N}X(0)=\frac{i}{N}, \end{aligned}$$
(39)

and the probability of B fixing is \(1-p_\mathrm{fix}\left( {i}\right) >\frac{N-i}{N}\). \(\square \)
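The bound of Lemma 4.4 can be observed in a concrete case. This is a rough numerical sketch under stated assumptions, not the authors' method: a Wright–Fisher process with constant, illustrative fitnesses in which A is less fit than B in every mixed state. Fixation probabilities are approximated by iterating \(h\leftarrow Ph\) from the indicator of state N, so that after t iterations \(h(i)={{\mathrm{\mathrm Pr}}}(X(t)=N\,|\,X(0)=i)\). The function names are hypothetical.

```python
from math import comb


def wf_matrix(N, w_a, w_b):
    """Wright-Fisher transition matrix for constant fitnesses (assumed values)."""
    P = []
    for i in range(N + 1):
        p = i * w_a / (i * w_a + (N - i) * w_b)  # chance an offspring is type A
        P.append([comb(N, j) * p**j * (1 - p)**(N - j) for j in range(N + 1)])
    return P


def fixation_probabilities(P, iterations=2000):
    """Approximate Pr(A fixes | X(0) = i) by iterating h <- P h."""
    N = len(P) - 1
    h = [0.0] * N + [1.0]  # indicator of the state in which A has fixed
    for _ in range(iterations):
        h = [sum(P[i][j] * h[j] for j in range(N + 1)) for i in range(N + 1)]
    return h


N = 10
p_fix = fixation_probabilities(wf_matrix(N, w_a=0.95, w_b=1.0))
for i in range(1, N):
    print(i, round(p_fix[i], 4), i / N)
```

For these parameter values, each printed fixation probability lies below the neutral value \(i/N\) in the last column.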

A.4 Proof of Lemma 4.6

Suppose, in order to derive a contradiction, that inequality (16) does not hold: for all states \(1\le j\le N-1\)

$$\begin{aligned} \overline{W}_{A}\left( {j}\right) \ge \overline{W}_{B}\left( {j}\right) . \end{aligned}$$
(40)

Then for any selection process, from Lemma 4.4 (with the roles of A and B reversed), \(p_\mathrm{fix}\left( {i}\right) \ge i/N\), contradicting inequality (14). Thus inequality (16) holds.

Now suppose, in order to derive a contradiction, that inequality (15) does not hold: there exists a state \(\hat{\jmath }\) for which

$$\begin{aligned} \overline{W}_{A}\left( {\hat{\jmath }}\right) >\overline{W}_{B}\left( {\hat{\jmath }}\right) . \end{aligned}$$
(41)

We will construct a transition matrix P for a selection process \({\mathscr {P}}\) (consistent with the fitnesses \(\overline{W}_{A}\left( {j}\right) \) and \(\overline{W}_{B}\left( {j}\right) \), \(1\le j\le N-1\)) such that \(p_\mathrm{fix}\left( {i}\right) \ge i/N\), which contradicts inequality (14) holding for all selection processes.

To find such a selection process \({\mathscr {P}}\), we can restrict attention to processes in which, from any mixed-type state, the number of individuals of type A changes by exactly 1 at each time step. Thus, for any mixed-type state j, \(P_{{j},{k}}\ne 0\) if and only if \(k=j\pm 1\). The matrix P then defines a “birth-death” process, for which the fixation probabilities starting from state \(X(0) =i\) satisfy (see Appendix C):

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right) = \frac{1 + \sum _{k=1}^{i-1}\prod _{j=1}^{k}\frac{P_{{j},{j-1}}}{P_{{j},{j+1}}}}{1+ \sum _{k=1}^{N-1}\prod _{j=1}^{k}\frac{P_{{j},{j-1}}}{P_{{j},{j+1}}}}. \end{aligned}$$
(42)

Let \({\mathcal A_{>}}\), \({\mathcal A_{<}}\) and \({\mathcal A_{=}}\) be the sets of states in which the expected fitness of individuals of type A is higher than, lower than or equal to that of B individuals (respectively). Note that \(\hat{\jmath }\in {\mathcal A_{>}}\) and \(\hat{\imath }\in {\mathcal A_{<}}\). We then specify the ratios of the non-vanishing transition probabilities by

$$\begin{aligned} \frac{P_{{j},{j-1}}}{P_{{j},{j+1}}}= {\left\{ \begin{array}{ll} r_+ &{} j\in {\mathcal A_{>}},\\ r_- &{} j\in {\mathcal A_{<}},\\ 1 &{} j\in {\mathcal A_{=}}, \end{array}\right. } \end{aligned}$$
(43)

where \(r_+\) and \(r_-\) are constants—independent of j—that satisfy \(0<r_+<1<r_-\).

Observe that P defines a mixed-irreducible selection process \({\mathscr {P}}\):

  • If \(X(t)=j\), then

    $$\begin{aligned} \left. {{\mathrm{\mathbb {E}}}}\big (X(t+1)\,|\,X(t) = j\big )\right. = (j+1)P_{{j},{j+1}} + (j-1)P_{{j},{j-1}} = j +P_{{j},{j+1}} - P_{{j},{j-1}},\nonumber \\ \end{aligned}$$
    (44)

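    so, by Eq. (43), the expected number of A individuals increases when \(j\in {\mathcal A_{>}}\), decreases when \(j\in {\mathcal A_{<}}\) and is unchanged when \(j\in {\mathcal A_{=}}\); hence H1 is satisfied.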

  • As for the Moran process [see Eq. (54a)], for any j and k such that \(1\le j \le N-1\), \(0\le k \le N\) and \(j\ne k\), there is a positive probability of transitioning from state j to state k in \(d=|j-k|\) steps: setting \(\sigma = {{\mathrm{\mathrm sign}}}(k-j)\), we have

    $$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big (X(t+d) = k\,|\, X(t) = j\big )\right.&= \prod _{m=1}^{d}P_{{(j+\sigma (m-1))},{(j+\sigma m)}}>0,\end{aligned}$$
    (45a)
    $$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big (X(t+2) = j\,|\, X(t)=j \big )\right.&= P_{{j},{j+1}}P_{{j+1},{j}}+ P_{{j},{j-1}}P_{{j-1},{j}}>0, \end{aligned}$$
    (45b)

    so all states can be reached from state \(X(t)=j\) in finite time. Thus, P is mixed-irreducible. Moreover, the probability of B fixing at a future time \(t+\tau \) (\(\tau \ge 0\)) is positive, so H2 is satisfied.

  • The states 0 and N are absorbing, so H3 is trivially satisfied.

For \(1\le j\le N-1\), we define the number of states k (\(1\le k\le j\)) in which the expected fitness of A individuals is higher than that of B individuals,

$$\begin{aligned} {\alpha _+}(j) = \Big |\left. \big \{k\,|\, 1\le k\le j \text { and } k\in {\mathcal A_{>}}\big \}\right. \Big |, \end{aligned}$$
(46)

and similarly,

$$\begin{aligned} {\alpha _-}(j) = \Big |\left. \big \{k\,|\, 1\le k\le j \text { and } k\in {\mathcal A_{<}}\big \}\right. \Big |. \end{aligned}$$
(47)

Lastly, let \(a_+\) be the smallest number of individuals of type A in the population for which type A’s expected fitness is higher than type B’s, that is,

$$\begin{aligned} a_+ =\min {\mathcal A_{>}}\ge 1. \end{aligned}$$
(48)

Note that \(a_+\le \hat{\jmath }<N\), and that \({\alpha _+}(j) =0\) for all \(j<a_+\).

From Eq. (42), the fixation probability \(p_\mathrm{fix}\left( {i}\right) \) is a rational function of \(r_+\) and \(r_-\),

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right)&= \frac{1 + \sum _{k=1}^{i-1}r_+^{{\alpha _+}(k)}r_-^{{\alpha _-}(k)}}{1 + \sum _{k=1}^{N-1}r_+^{{\alpha _+}(k)}r_-^{{\alpha _-}(k)}}, \end{aligned}$$
(49)

and is continuous because the denominator is positive for any \(r_-,r_+>0\).

If \(i\ge a_+\), then \(p_\mathrm{fix}\left( {i}\right) \rightarrow 1\) as \(r_+\rightarrow 0\). If \(i<a_+\), then

$$\begin{aligned} \lim _{r_+\rightarrow 0}p_\mathrm{fix}\left( {i}\right)&= \frac{1 + \sum _{k=1}^{i-1}r_-^{{\alpha _-}(k)}}{1 + \sum _{k=1}^{a_+-1}r_-^{{\alpha _-}(k)}}\quad \xrightarrow {r_-\rightarrow 1}{} \quad \frac{i}{a_+}> \frac{i}{N}. \end{aligned}$$
(50)

It is thus possible to choose \(r_-\) sufficiently close to 1 and \(r_+\) sufficiently close to 0 to ensure that \(p_\mathrm{fix}\left( {i}\right) >i/N\), which completes the proof. \(\square \)
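The construction used in this proof can be tried out directly. The sketch below evaluates Eq. (42) with the ratios of Eq. (43) in exact arithmetic. The population size, the sets `fitter` and `less_fit` (standing in for \({\mathcal A_{>}}\) and \({\mathcal A_{<}}\)) and the values of `r_plus` and `r_minus` are all hypothetical choices made for illustration.

```python
from fractions import Fraction


def birth_death_fixation(N, ratio):
    """Eq. (42), with ratio[j] = P[j][j-1] / P[j][j+1] for 1 <= j <= N-1."""
    products = []
    running = Fraction(1)
    for k in range(1, N):
        running *= ratio[k]
        products.append(running)  # products[k-1] = prod_{j=1}^{k} ratio[j]
    denom = 1 + sum(products)
    return [Fraction(0)] + [
        (1 + sum(products[: i - 1])) / denom for i in range(1, N + 1)
    ]


N = 6
fitter = {4, 5}    # hypothetical states in which A has the higher expected fitness
less_fit = {1, 2}  # hypothetical states in which A has the lower expected fitness
r_plus, r_minus = Fraction(1, 1000), Fraction(101, 100)  # 0 < r_plus < 1 < r_minus

ratio = {
    j: r_plus if j in fitter else r_minus if j in less_fit else Fraction(1)
    for j in range(1, N)
}
p_fix = birth_death_fixation(N, ratio)
for i in range(1, N):
    print(i, float(p_fix[i]), i / N)
```

With `r_plus` close to 0 and `r_minus` close to 1, every mixed initial state in this example has a fixation probability above \(i/N\), even though A is less fit than B in states 1 and 2.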

B Examples

B.1 The Moran process

If the population evolves according to the Moran process (Moran 1962; Hartl and Clark 2007; Ewens 2012), then exactly one agent is replaced at each time step. In detail, at each time step:

  • An agent is chosen for death, with equal probability for all agents;

  • An agent is chosen for reproduction, with probability proportional to its fitness (see Note 4);

  • The agent chosen for death is replaced with a clone of the agent chosen for reproduction.

Note that sampling of agents is done with replacement, so that an agent can be chosen for both death and reproduction (in which case the population remains unchanged).

When the population consists of i mutants (individuals of type A) and \(N-i\) residents (individuals of type B), the probabilities of choosing a mutant or a resident for death are \(i/N\) and \((N-i)/N\), respectively. The probabilities of choosing a mutant or a resident for reproduction are

$$\begin{aligned} \frac{i \overline{W}_{A}\left( {i}\right) }{i\overline{W}_{A}\left( {i}\right) +(N-i)\overline{W}_{B}\left( {i}\right) }, \end{aligned}$$
(51a)

and

$$\begin{aligned} \frac{(N-i)\overline{W}_{B}\left( {i}\right) }{i\overline{W}_{A}\left( {i}\right) + (N-i)\overline{W}_{B}\left( {i}\right) }. \end{aligned}$$
(51b)

Because the death and reproduction events are independent, the transition probabilities are simply

$$\begin{aligned} P_{{i},{i+1}}&=\frac{i \overline{W}_{A}\left( {i}\right) }{i\overline{W}_{A}\left( {i}\right) +(N-i) \overline{W}_{B}\left( {i}\right) }\times \frac{N-i}{N}>0,\end{aligned}$$
(52a)
$$\begin{aligned} P_{{i},{i-1}}&= \frac{(N-i)\overline{W}_{B}\left( {i}\right) }{i\overline{W}_{A}\left( {i}\right) + (N-i) \overline{W}_{B}\left( {i}\right) }\times \frac{i}{N}>0, \end{aligned}$$
(52b)

and (since at each time step at most one individual is replaced)

$$\begin{aligned} P_{{i},{i}}&=1-P_{{i},{i+1}} -P_{{i},{i-1}} = \frac{i^2\overline{W}_{A}\left( {i}\right) + (N-i)^2\overline{W}_{B}\left( {i}\right) }{N\big (i\overline{W}_{A}\left( {i}\right) + (N-i)\overline{W}_{B}\left( {i}\right) \big )} >0. \end{aligned}$$
(52c)

Lastly, \(P_{{0},{0}} = P_{{N},{N}} = 1\) and \(P_{{0},{i}} = P_{{N},{N-i}} =0\) for all \(1\le i\le N\) (the states where the resident or mutant have fixed are absorbing, so H3 is trivially satisfied).

For any \(1\le i \le N-1\), if \(X(t) = i\), we have

$$\begin{aligned} \left. {{\mathrm{\mathbb {E}}}}\big (X(t+1)-X(t) \,|\, X(t)=i \big )\right.&= -i + \sum _{j=0}^N jP_{{i},{j}} \nonumber \\&= -i + \big [(i-1)P_{{i},{i-1}}+ iP_{{i},{i}}+ (i+1) P_{{i},{i+1}}\big ]\nonumber \\&=-i + \big [ i + P_{{i},{i+1}} -P_{{i},{i-1}}\big ] \nonumber \\&= \frac{i (N-i)\big (\overline{W}_{A}\left( {i}\right) - \overline{W}_{B}\left( {i}\right) \big )}{N\big (i\overline{W}_{A}\left( {i}\right) + (N-i)\overline{W}_{B}\left( {i}\right) \big )} . \end{aligned}$$
(53)

The expected number of individuals of type A (respectively B) in the next time-step is larger than in the current time-step, if and only if \(\overline{W}_{A}\left( {i}\right) >\overline{W}_{B}\left( {i}\right) \) (respectively \(\overline{W}_{B}\left( {i}\right) >\overline{W}_{A}\left( {i}\right) \)), so H1 is satisfied.

To see that H2 is satisfied, and moreover, that P defines a mixed-irreducible selection process, consider i and j such that \(1\le i \le N-1\), \(0\le j \le N\) and \(j\ne i\), and observe that there is a positive probability of changing from state i to state j in \(d=|j-i|\) steps: setting \(\sigma = {{\mathrm{\mathrm sign}}}(j-i)\), we have

$$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big (X(t+d) = j\,|\, X(t) = i\big )\right.&= \prod _{k=1}^{d}P_{{(i+\sigma (k-1))},{(i+\sigma k)}}>0 \end{aligned}$$
(54a)
$$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big (X(t+1) = i\,|\, X(t)=i \big )\right.&= P_{{i},{i}}>0, \end{aligned}$$
(54b)

so all states can be reached from state \(X(t)=i\) in finite time, and in particular, the probability of B fixing at a future time \(t+\tau \) (\(\tau \ge 0\)) is positive.

If neither type has a selective advantage over the other, regardless of their frequencies in the population, then \(\overline{W}_{A}\left( {i}\right) = \overline{W}_{B}\left( {i}\right) \) for all \(1\le i \le N-1\), so from Eq. (53), \({{\mathrm{\mathbb {E}}}}\big (X(t+1)\,|\,X(t)=i\big ) = i\), and P defines a neutral drift process.
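The Moran transition probabilities and the drift formula can be cross-checked in exact arithmetic. The following sketch builds the matrix from Eqs. (52a)–(52c) and compares the expected one-step change with the closed form in Eq. (53). The fitness functions `W_A` and `W_B` are hypothetical frequency-dependent choices, positive in all mixed states, introduced only for this illustration.

```python
from fractions import Fraction


def moran_matrix(N, W_A, W_B):
    """Moran transition matrix built from Eqs. (52a)-(52c)."""
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    P[0][0] = P[N][N] = Fraction(1)  # absorbing homogeneous states
    for i in range(1, N):
        a, b = i * W_A(i), (N - i) * W_B(i)
        P[i][i + 1] = a / (a + b) * Fraction(N - i, N)  # Eq. (52a)
        P[i][i - 1] = b / (a + b) * Fraction(i, N)      # Eq. (52b)
        P[i][i] = 1 - P[i][i + 1] - P[i][i - 1]         # Eq. (52c)
    return P


N = 9


def W_A(i):
    return Fraction(3, 2) - Fraction(i, N)  # hypothetical: A is fitter when rare


def W_B(i):
    return Fraction(1)


P = moran_matrix(N, W_A, W_B)


def drift(i):
    """Expected one-step change in the number of A individuals from state i."""
    return sum(j * P[i][j] for j in range(N + 1)) - i


def eq53(i):
    """Closed form on the right-hand side of Eq. (53)."""
    return i * (N - i) * (W_A(i) - W_B(i)) / (N * (i * W_A(i) + (N - i) * W_B(i)))


for i in range(1, N):
    print(i, drift(i), drift(i) == eq53(i))
```

In this example the sign of the drift changes where the two fitness functions cross, which is the behaviour that H1 requires.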

B.2 The Wright–Fisher process

If the population evolves according to the Wright–Fisher process (Hartl and Clark 2007; Ewens 2012) then all individuals are replaced at each time step (generations do not overlap). At each time step, the entire population of N individuals is replaced by a new generation constructed using binomial sampling: in each of the N Bernoulli trials, the probability of drawing any type represented in the current generation is proportional to its present mean fitness and to the present number of individuals of that type. Thus, the probability that an individual in the next generation will be of type A is

$$\begin{aligned} \frac{i\overline{W}_{A}\left( {i}\right) }{i\overline{W}_{A}\left( {i}\right) +(N-i)\overline{W}_{B}\left( {i}\right) }, \end{aligned}$$
(55)

and

$$\begin{aligned} P_{{i},{j}}&= \left. {{\mathrm{\mathrm Pr}}}\big (X(t+1) = j\,|\, X(t) = i\big )\right. \nonumber \\&= {N \atopwithdelims ()j}\Bigg (\frac{i\overline{W}_{A}\left( {i}\right) }{i\overline{W}_{A}\left( {i}\right) +(N-i)\overline{W}_{B}\left( {i}\right) }\Bigg )^j \Bigg (\frac{(N-i) \overline{W}_{B}\left( {i}\right) }{i\overline{W}_{A}\left( {i}\right) +(N-i)\overline{W}_{B}\left( {i}\right) }\Bigg )^{N-j}, \end{aligned}$$
(56)

where \(P_{{0},{0}}=P_{{N},{N}}=1\) (so the states \(X=0\) and \(X=N\) are absorbing and H3 is satisfied). Note that if A is not present at some time \(\tau \), B has fixed and the population remains in state \(X(t)=0\) for all \(t\ge \tau \), and similarly if B is not present at some time \(\tau \), then \(X(t)=N\) for all \(t\ge \tau \).

The mean of a binomial random variable defined by n trials with success probability p is np, so for any \(0\le i \le N\), we have

$$\begin{aligned} \left. {{\mathrm{\mathbb {E}}}}\big (X(t+1)\,|\,X(t)=i\big )\right. -i&=N\frac{i\overline{W}_{A}\left( {i}\right) }{i\overline{W}_{A}\left( {i}\right) +(N-i)\overline{W}_{B}\left( {i}\right) }-i\nonumber \\&=i(N-i)\frac{\overline{W}_{A}\left( {i}\right) -\overline{W}_{B}\left( {i}\right) }{i \overline{W}_{A}\left( {i}\right) +(N-i)\overline{W}_{B}\left( {i}\right) }, \end{aligned}$$
(57)

so H1 is satisfied. H2 is trivially satisfied because for any \(1\le i \le N-1\), \(P_{{i},{0}}>0\). Thus, P defines a selection process, which is, moreover, mixed-irreducible, because for any \(1\le i \le N-1\), \(P_{{i},{j}}>0\) also for any \(1\le j\le N-1\).

If neither type has a selective advantage over the other, \(\overline{W}_{A}\left( {i}\right) = \overline{W}_{B}\left( {i}\right) \) for all \(1\le i \le N-1\), and Eq. (57) becomes \(\left. {{\mathrm{\mathbb {E}}}}\big (X(t+1)\,|\, X(t)=i\big )\right. = i=X(t)\), so X(t) is a neutral drift process.
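The same kind of check applies to the Wright–Fisher process. The sketch below builds the transition matrix of Eq. (56) exactly and compares the expected one-step change with Eq. (57). It also confirms that every entry in a mixed-state row is positive, as used in the argument for H2 and mixed-irreducibility. The fitness functions and population size are hypothetical choices for illustration.

```python
from fractions import Fraction
from math import comb


def wf_matrix(N, W_A, W_B):
    """Wright-Fisher transition matrix built from Eqs. (55) and (56)."""
    P = []
    for i in range(N + 1):
        if i in (0, N):
            p = Fraction(i, N)  # fitnesses are not needed in homogeneous states
        else:
            a, b = i * W_A(i), (N - i) * W_B(i)
            p = a / (a + b)  # Eq. (55)
        P.append([comb(N, j) * p**j * (1 - p)**(N - j) for j in range(N + 1)])
    return P


N = 6


def W_A(i):
    return 1 + Fraction(i, 2 * N)  # hypothetical: A's fitness grows with its frequency


def W_B(i):
    return Fraction(1)


P = wf_matrix(N, W_A, W_B)


def mean_change(i):
    """Expected one-step change in the number of A individuals from state i."""
    return sum(j * P[i][j] for j in range(N + 1)) - i


def eq57(i):
    """Closed form on the right-hand side of Eq. (57)."""
    return i * (N - i) * (W_A(i) - W_B(i)) / (i * W_A(i) + (N - i) * W_B(i))


for i in range(1, N):
    print(i, mean_change(i) == eq57(i), min(P[i]) > 0)
```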

B.3 The Eldon–Wakeley process with viability selection

The Eldon–Wakeley (EW) process (Eldon and Wakeley 2006; Der et al. 2012) is a variation on the neutral Moran process that allows for a skewed (rather than uniform) offspring distribution. It has been used to interpret genetic data from Pacific oysters (Eldon and Wakeley 2006; Der et al. 2012).

The EW process describes neutral drift in a population of constant size N, consisting of two types, A and B. At each time step, a single agent is randomly drawn from the population with uniform probability, and produces a random number of offspring \(U-1\). The parent agent survives to the next generation and its \(U-1\) offspring replace \(U-1\) randomly chosen members of the remainder of the population. In the special case that exactly one offspring is always produced, i.e. \({{\mathrm{\mathrm Pr}}}(U=2)=1\), the EW process is similar (but not identical) to the classical Moran process (Ewens 2012; Moran 1962): in both processes, the parent always produces one offspring, which increases the number of individuals of the parent’s type in the next generation iff the agent chosen to be replaced is not of the parent’s type. In the EW process, the parent is guaranteed to survive, and one additional offspring replaces another randomly chosen member of the population, so if there are i agents of type A, the probability that the population state remains the same is

$$\begin{aligned} \frac{i}{N}\frac{i-1}{N-1}+ \frac{N-i}{N}\frac{N-i-1}{N-1}= \frac{i^2+ (N-i)^2 - N}{N(N-1)}. \end{aligned}$$
(58)

By contrast, in the Moran process, this probability is given by \(\frac{i^2 +(N-i)^2}{N^2}\) [see Eq. (52c)]. Thus, whenever the population is in a mixed-type state (i.e. \(1\le i\le N-1\)), the probability that the population state remains unchanged is larger for the Moran model than for the EW model. However, for both models, the probability of increase in type A is the same as the probability of increase in type B (this probability does depend on the population composition). Thus, in effect, the neutral (i.e. selectionless) EW process with \(U=2\) is a slightly “sped up” version of the neutral Moran process, with fewer time-steps in which the population state is unchanged (Footnote 5).
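The comparison between the two probabilities of remaining in the same state can be verified with exact rational arithmetic. This is a small sketch under the formulas quoted above [Eq. (58) for EW with \(U=2\), and the Moran expression]; the function names are hypothetical. The difference works out to \(\frac{2i(N-i)}{N^2(N-1)}\), which is positive in every mixed-type state.

```python
from fractions import Fraction


def stay_prob_ew(N, i):
    """Left-hand side of Eq. (58): parent and replaced agent share a type."""
    return (Fraction(i, N) * Fraction(i - 1, N - 1)
            + Fraction(N - i, N) * Fraction(N - i - 1, N - 1))


def stay_prob_moran(N, i):
    """Probability of no change in the neutral Moran process."""
    return Fraction(i * i + (N - i) ** 2, N * N)


N = 10
for i in range(1, N):
    ew = stay_prob_ew(N, i)
    moran = stay_prob_moran(N, i)
    # Closed form on the right-hand side of Eq. (58).
    assert ew == Fraction(i * i + (N - i) ** 2 - N, N * (N - 1))
    # Moran stays put more often; the gap is 2i(N-i) / (N^2 (N-1)).
    assert moran - ew == Fraction(2 * i * (N - i), N * N * (N - 1))
```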

Letting \(X(t)=i\) be the number of individuals of type A at some time \(t\ge 0\), then the probabilities that an agent of type A and B are chosen for reproduction are i / N and \((N-i)/N\), respectively. If an agent of type A is chosen for reproduction and produces \(U-1=u-1\) offspring, then the number of B agents chosen for replacement is hypergeometrically distributed with sample size \(N-1\), initial configuration \(N-i\) and \(u-1\) draws (Der et al. 2012), so the probability of k agents of type B (\(0\le k\le u-1\)) being replaced by agents of type A is

$$\begin{aligned} \frac{{N-i \atopwithdelims ()k}{i-1 \atopwithdelims ()u-1-k}}{{N-1 \atopwithdelims ()u-1}}, \end{aligned}$$
(59)

which has mean \((u-1)\frac{N-i}{N-1}\). Similarly, the mean number of agents of type A to be replaced, given that a B agent is chosen for reproduction and produces \(u-1\) offspring is \((u-1)\frac{i}{N-1}\). Thus, by the law of total expectation (theorem S1, conditioning on the type of agent chosen for reproduction), the expected number of individuals of type A in the next generation, given their present number, is:

$$\begin{aligned} \left. {{\mathrm{\mathbb {E}}}}\big (X(t+1)\,|\, X(t) = i\big )\right. =&\frac{i}{N}{{\mathrm{\mathbb {E}}}}\left( i +(U-1)\frac{N-i}{N-1} \right) \nonumber \\&+ \frac{N-i}{N}{{\mathrm{\mathbb {E}}}}\left( i -(U-1)\frac{i}{N-1} \right) \nonumber \\ =&\, i + \frac{i}{N}\frac{N-i}{N-1} {{\mathrm{\mathbb {E}}}}\left( U-1\right) - \frac{N-i}{N}\frac{i}{N-1}{{\mathrm{\mathbb {E}}}}\left( U-1\right) \nonumber \\ =&\,i. \end{aligned}$$
(60)
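Equation (60) can also be confirmed by building the exact one-step distribution of the neutral EW process for a fixed number of offspring \(u-1\), using the hypergeometric probabilities of Eq. (59). The sketch below is illustrative only: the names `hypergeom_pmf` and `ew_neutral_row` are introduced here, and it assumes \(2\le u\le N\) so that \(u-1\) agents can be drawn from the remaining \(N-1\).

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(k successes) when drawing `draws` items without replacement."""
    return Fraction(comb(successes, k) * comb(population - successes, draws - k),
                    comb(population, draws))


def ew_neutral_row(N, i, u):
    """Distribution of X(t+1) given X(t) = i when U = u (u - 1 offspring)."""
    m = u - 1
    row = [Fraction(0)] * (N + 1)
    if i > 0:  # an A parent is chosen with probability i/N; k Bs are replaced
        for k in range(m + 1):
            pr = hypergeom_pmf(k, N - 1, N - i, m)
            if pr:
                row[i + k] += Fraction(i, N) * pr
    if i < N:  # a B parent is chosen with probability (N-i)/N; k As are replaced
        for k in range(m + 1):
            pr = hypergeom_pmf(k, N - 1, i, m)
            if pr:
                row[i - k] += Fraction(N - i, N) * pr
    return row


N = 8
for u in range(2, 6):
    for i in range(N + 1):
        row = ew_neutral_row(N, i, u)
        assert sum(row) == 1
        assert sum(j * pj for j, pj in enumerate(row)) == i  # Eq. (60)
```

Because the conditional mean equals i for every fixed u, averaging over any distribution of U leaves it unchanged, which is the content of the law of total expectation used above.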

Der et al. (2012) have generalized the neutral EW process (Eldon and Wakeley 2006) by adding a deterministic “viability selection” step: for \(s\in \mathbb {R}\), given the population state X(t) at time t, an intermediate, pre-selection offspring population state at time \(t+1\) is generated according to the EW model without selection (described above). The population state \(X(t+1)\) at time \(t+1\) is then obtained by transforming the pre-selection offspring state according to standard (deterministic) logistic growth:

$$\begin{aligned} i\mapsto v\left( {i}\right) = \left\lfloor {\frac{(1+s/N)i}{(1+s/N)i+ (N-i)}N}\right\rfloor =\left\lfloor {\frac{N+s}{N+s (i/N)}i}\right\rfloor , \end{aligned}$$
(61)

where \(\lfloor {x}\rfloor \) is the largest integer not exceeding x. This corresponds to selection acting on the offspring before reaching reproductive age (X(t) represents the state of the reproductively-mature population).

Now observe that for any \(1\le i\le N-1\), if \(s>0\) then

$$\begin{aligned} v\left( {i}\right) \ge i , \end{aligned}$$
(62)

if \(s<0\)

$$\begin{aligned} v\left( {i}\right) \le i , \end{aligned}$$
(63)

and if \(s=0\), \(v\left( {i}\right) = i\) (so the original EW process is recovered). Note also that because \(\frac{(1+s/N)i}{(1+s/N)i+ (N-i)}N<N\), fixation cannot occur in the selection step.
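The properties of the selection map \(v\) stated in Eqs. (61) to (63) are easy to check by direct computation. The following sketch uses exact rational arithmetic so that the floor is not affected by rounding; the function name `viability_selection` is hypothetical, and the code assumes \(s>-N\) so that the denominator in Eq. (61) is positive.

```python
from fractions import Fraction
from math import floor


def viability_selection(i, N, s):
    """Eq. (61): v(i) = floor((N + s) i / (N + s i / N))."""
    s = Fraction(s)
    return floor((N + s) * i / (N + s * Fraction(i, N)))


N = 10
for i in range(1, N):
    assert viability_selection(i, N, 0) == i              # s = 0 recovers neutral EW
    assert i <= viability_selection(i, N, 3) <= N - 1     # s > 0, Eq. (62); no fixation
    assert viability_selection(i, N, -3) <= i             # s < 0, Eq. (63)
```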

For any s, the selection step and neutral EW process above define a Markov process. Equations (60) and (62) imply that H1 is satisfied for this Markov process.

To verify H2 for any \(s\ge 0\), choose any i (\(1\le i\le N-1\)) and \(u\ge 2\) such that \({{\mathrm{\mathrm Pr}}}(U=u)=p_u>0\) (such u must exist because otherwise no offspring are ever created). The probability of an individual of type A reproducing is \(\frac{i}{N}\). Using Eq. (59), the probability of increasing the number of As in the population given that an individual of type A reproduces and that \(U=u\) is

$$\begin{aligned} p_+ (i) = 1- \frac{{N-i \atopwithdelims ()0}{i-1 \atopwithdelims ()u-1}}{{N-1 \atopwithdelims ()u-1}}, \end{aligned}$$
(64)

and \(p_+(i)>0\) because \(i<N\). Hence, the probability of increasing the number of agents of type A in the population in the next generation is no less than

$$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big (X(t+1)>i\,|\, X(t)=i\big )\right. \ge \frac{i}{N}p_u p_+ (i)>0, \end{aligned}$$
(65)

[recall that the selection step cannot decrease the number of As in the population; see inequality (62)]. Now, starting from state i, if the number of agents of type A in the population is increased at each step, fixation of A is attained in at most \(N-i\) steps. Since the probability of increasing the number of As in the population is positive for \(1\le i<N\), and the state \(X=N\) is absorbing, the probability that A has fixed after \(N-i\) steps is positive,

$$\begin{aligned} \left. {{\mathrm{\mathrm Pr}}}\big (X(t+N-i)=N\,|\, X(t)=i\big )\right. >0. \end{aligned}$$
(66)

Verifying H2 for \(s<0\) is similar.

As in Appendices B.1 and B.2, H3 is satisfied because there is no mutation, and consequently the EW process with viability selection defines a selection process.

Note that Eq. (60) implies that in the absence of selection, the EW process is a neutral drift process. Moreover, a similar method to that used in Appendix B.1 shows that the EW process without selection is mixed-irreducible.

1.4 B.4 Generalized Wright–Fisher models

Generalized Wright–Fisher (GWF) models are “a broad class of forward-time population models that share the same mean and variance of the Wright–Fisher model, but may otherwise differ” (Der et al. 2011). GWF models can allow for selection and mutation, but the general construction builds on pure-drift GWF models.

Mathematically, a pure-drift GWF model is a Markov process X(t) such that

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}\left( X(t+1)\,|\,X(t) = i\right)&= X(t), \end{aligned}$$
(67a)
$$\begin{aligned} {{\mathrm{\mathrm Var}}}\left( X(t+1)\,|\,X(t) = i\right)&= \frac{N\sigma ^2}{N-1} X(t)\left( 1- \frac{X(t)}{N}\right) . \end{aligned}$$
(67b)

If \(\sigma ^2= 0\) then \( {{\mathrm{\mathrm Var}}}\left( X(t+1)\,|\,X(t) = i\right) = 0\) and the transition matrix for the corresponding Markov process is the identity matrix; this case, in which the population state never changes, is biologically absurd, so \(\sigma ^2> 0\) is assumed hereafter.

Pure-drift GWF models are neutral drift processes (Definition 2.9):

  • Neutrality: Equation (8) is satisfied by assumption [Eq. (67a)], i.e. neither type is expected to increase in frequency from one time-step to the next.

H2:

This hypothesis stipulates that starting from a mixed-type state i (\(0<i<N\)), the fixation of at least one of the types (A or B) must be possible.

Equation (67b) implies that for any mixed-type state \(X(t)=i\notin \{0,N\}\),

$$\begin{aligned} {{\mathrm{\mathrm Var}}}\left( X(t+1)\,|\,X(t) = i\right) >0. \end{aligned}$$
(68)

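Since the conditional mean equals i by Eq. (67a), a positive variance implies that the population state falls below i with positive probability,

$$\begin{aligned} {{\mathrm{\mathrm Pr}}}\left( X(t+1)<i\,|\,X(t) = i\right) >0. \end{aligned}$$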
(69)

Thus, starting at any mixed-type state i (\(0<i<N\)), it is possible to reach the state 0 in i or fewer steps in which A decreases in frequency, each of which occurs with positive probability, so B can fix with positive probability, and H2 holds. A similar argument shows that A can also fix with positive probability.

H3:

This hypothesis stipulates that the states at which the population is composed only of one type (A or B) are absorbing. Pure-drift GWF processes satisfy H3, because if \(X(t) = 0\) then from Eq. (67b), \({{\mathrm{\mathrm Var}}}\left( X(t+1)\,|\,X(t) =0\right) = 0\) so from Eq. (67a), if \(X(t)=0\), then

$$\begin{aligned} X(t+1)= {{\mathrm{\mathbb {E}}}}\left( X(t+1)\,|\,X(t) = 0\right) =X(t)= 0, \end{aligned}$$
(70)

and similarly, if \(X(t)=N\), then \(X(t+1) = N\).
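A concrete instance may help. In the neutral Wright–Fisher process, \(X(t+1)\) given \(X(t)=i\) is binomial with N trials and success probability i/N, so its conditional variance is \(i(1-i/N)\); matching this with Eq. (67b) gives \(\sigma ^2=1-1/N\). The sketch below checks Eqs. (67a), (67b) and the absorbing states exactly for a small population. The helper names are introduced here for illustration and are not taken from Der et al. (2011).

```python
from fractions import Fraction
from math import comb


def neutral_wf_matrix(N):
    """Neutral Wright-Fisher transition matrix with exact rational entries."""
    P = []
    for i in range(N + 1):
        p = Fraction(i, N)
        P.append([comb(N, j) * p**j * (1 - p) ** (N - j) for j in range(N + 1)])
    return P


def conditional_moments(P, i):
    """Mean and variance of X(t+1) given X(t) = i."""
    mean = sum(j * pj for j, pj in enumerate(P[i]))
    var = sum((j - mean) ** 2 * pj for j, pj in enumerate(P[i]))
    return mean, var


def implied_sigma2(P, i):
    """Solve Eq. (67b) for sigma^2 at a mixed-type state i."""
    N = len(P) - 1
    _, var = conditional_moments(P, i)
    return var * (N - 1) / (N * i * (1 - Fraction(i, N)))


N = 6
P = neutral_wf_matrix(N)
for i in range(N + 1):
    assert conditional_moments(P, i)[0] == i               # Eq. (67a)
for i in range(1, N):
    assert implied_sigma2(P, i) == Fraction(N - 1, N)      # Eq. (67b)
assert P[0][0] == 1 and P[N][N] == 1                       # H3
```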

For a population of size N, pure selection (i.e. mutationless) GWF models are constructed by modifying a pure-drift GWF model (Footnote 6): starting with a pure-drift process with transition matrix \(Q^{(N)}\), selection is represented by a second \((N+1)\times (N+1)\) row-stochastic matrix \(S^{(N)}\), and the transition matrix for the pure selection process is defined by \(S^{(N)} Q^{(N)}\). When choosing how to construct selection matrices \(S^{(N)}\), the only requirement is that in the limit \(N\rightarrow \infty \), if \(Q^{(N)}\rightarrow I\) (which means that the offspring variance approaches 0 as \(N\rightarrow \infty \)), the dynamics converge to Haldane’s classical theory of deterministic evolution (Haldane 1932). This amounts to requiring that

$$\begin{aligned} \lim _{N\rightarrow \infty }N\big (S^{(N)} - I\big )u_N = \gamma x (1-x) \frac{\mathrm{d}u}{\mathrm{d} x}, \end{aligned}$$
(71)

where u is any smooth function and

$$\begin{aligned} u_N = \Bigg (u(0),u\left( \frac{1}{N}\right) , \dots , u\left( \frac{N-1}{N}\right) , u(1)\Bigg ), \end{aligned}$$
(72)

and \(\gamma \) is type B’s selective advantage (Der 2010; Der et al. 2011).

Because of the generality of the method in which Der et al. allow for selection, GWF models with selection are not necessarily selection processes according to Definition 2.1. To see that this is possible, observe that there is no restriction on the selection matrix \(S^{(N)}\) for any specific population size N (other than it being row-stochastic); only the infinite-population limit of a sequence of such selection matrices is restricted. Thus, let Q be the transition matrix for a pure-drift GWF model. Let the selection matrix S be any row-stochastic matrix with first and last columns composed of zeros other than the top and bottom (respectively) entries, which are taken to be 1. The transition matrix \(P=SQ\) defines a Markov process for which fixation from any mixed-type state is impossible (violating H2).

1.5 B.5 Neutral and non-neutral Cannings (exchangeable) models

1.5.1 B.5.1 Neutral Cannings models

An important class of models arising in population genetics are due to Cannings (1974). In the most basic formulation, a population of N individuals is considered, each of which can be of either type A or B. The reproduction of each of these individuals (regardless of its type) is assumed to be equivalent in the sense that the numbers of offspring left by each individual are exchangeable random variables.

Mathematically, for any time \(t\ge 0\) and population state, \(X(t) =i\) (that is, i is the number of individuals of type A), let \(\nu _{k}\) \((1\le k\le N)\) be a random variable describing the number of offspring of the \(k\mathrm{th}\) individual in the population (that is, its contribution to the next generation, at time \(t+1\)). Without loss of generality, we label the individuals of type A as \(1,\dots ,i\) and the individuals of type B as \(i+1,\dots ,N\) (where one of these sets of indices is empty if \(i=0\) or \(i=N\)). The population state at time \(t+1\) given that \(X(t) = i\) is then

$$\begin{aligned} X(t+1) = \sum _{k=1}^{i}\nu _{k}. \end{aligned}$$
(73)

The assumption of exchangeability of the offspring variables is that \(\{\nu _{k}\}_{k=1}^{N}\) is a set of exchangeable random variables whose joint distribution does not depend on the population state i. That is, the joint probability distribution of \(\{\nu _{k}\}_{k=1}^{N}\) is invariant to the order of these random variables: for any permutation \(\sigma \) of the indices \(1,\dots ,N\), and numbers of offspring \(\left( \xi _{1},\dots ,\xi _{N}\right) \in \{0,\dots ,N\}^N\),

$$\begin{aligned} {{\mathrm{\mathrm Pr}}}\left( \nu _{k}=\xi _{k};\, 1\le k\le N \right) = {{\mathrm{\mathrm Pr}}}\left( \nu _{\sigma (k)}=\xi _{k};\, 1\le k\le N \right) . \end{aligned}$$
(74)

Because the population size is constant, the offspring variables \(\{\nu _{k}\}_{k=1}^{N}\) must also satisfy

$$\begin{aligned} N = \sum _{k=1}^N\nu _{k}, \end{aligned}$$
(75)

so the variables \(\{\nu _{k}\}_{k=1}^{N}\) are in general not independently distributed (Footnote 7).

Any Cannings process is a pure-drift GWF process (Appendix B.4), and thus a neutral drift process (Definition 2.9). To prove this, we must show that the first and second conditional moments of a Cannings process conform to Eq. (67). The exchangeability of the offspring variables implies that the expected number of offspring of all individuals are equal (regardless of their type),

$$\begin{aligned} {{\mathrm{\mathbb {E}}}}(\nu _{k})={{\mathrm{\mathbb {E}}}}(\nu _{j}) \quad \text { for all } k,j \text { such that } 1\le k\le N,1\le j\le N. \end{aligned}$$
(76)

Using Eq. (75),

$$\begin{aligned} N = \sum _{k=1}^N{{\mathrm{\mathbb {E}}}}\left( \nu _{k}\right) = N{{\mathrm{\mathbb {E}}}}(\nu _{1}), \end{aligned}$$
(77)

so \({{\mathrm{\mathbb {E}}}}(\nu _{k})={{\mathrm{\mathbb {E}}}}(\nu _{1})=1\) for any k such that \(1\le k\le N\). It follows that,

$$\begin{aligned} \left. {{\mathrm{\mathbb {E}}}}(X(t+1)\,|\,X(t) = i)\right.&= {{\mathrm{\mathbb {E}}}}\left( \sum _{k=1}^{i}\nu _{k}\right) = i{{\mathrm{\mathbb {E}}}}(\nu _{k})= i = X(t), \end{aligned}$$
(78)

so X(t) is a martingale and Eq. (67a) is satisfied. The following Lemma shows that Eq. (67b) also holds.

Lemma B.1

The conditional variance of a Cannings process with offspring variance \(\sigma ^2\) is

$$\begin{aligned} {{\mathrm{\mathrm Var}}}\left( X(t+1)\,|\,X(t) = i\right) = \sigma ^2X(t)\frac{N- X(t)}{N-1}. \end{aligned}$$
(79)

Proof

Our derivation follows that of Ewens (2012). Using Eq. (75),

$$\begin{aligned} 0 = {{\mathrm{\mathrm Var}}}\left( N\right) = {{\mathrm{\mathrm Var}}}\left( \sum _{k=1}^N\nu _{k}\right) = \sum _{k=1}^N{{\mathrm{\mathrm Var}}}\left( \nu _{k}\right) + \sum _{\begin{array}{c} j,k=1 \\ j\ne k \end{array}}^N{{\mathrm{\mathrm Cov}}}\left( \nu _{j},\nu _{k}\right) , \end{aligned}$$
(80)

By symmetry, for any \(k\ne j\) such that \(1\le j\le N\) and \(1\le k\le N\), we have

$$\begin{aligned} 0 = N{{\mathrm{\mathrm Var}}}\left( \nu _{k}\right) + N(N-1){{\mathrm{\mathrm Cov}}}\left( \nu _{j},\nu _{k}\right) , \end{aligned}$$
(81)

and hence,

$$\begin{aligned} {{\mathrm{\mathrm Cov}}}\left( \nu _{j},\nu _{k}\right) = -\frac{\sigma ^2}{N-1}. \end{aligned}$$
(82)

It follows that

$$\begin{aligned} {{\mathrm{\mathrm Var}}}\left( X(t+1)\,|\,X(t) = i\right)&= {{\mathrm{\mathrm Var}}}\left( \sum _{k=1}^i\nu _{k}\right) \nonumber \\&=\sum _{k=1}^i{{\mathrm{\mathrm Var}}}\left( \nu _{k}\right) + \sum _{\begin{array}{c} j,k=1 \\ j\ne k \end{array}}^i{{\mathrm{\mathrm Cov}}}\left( \nu _{j},\nu _{k}\right) ,\nonumber \\&=i\sigma ^2- i(i-1)\frac{\sigma ^2}{N-1}\nonumber \\&= \sigma ^2i\frac{N- i}{N-1}= \sigma ^2X(t)\frac{N- X(t)}{N-1}. \end{aligned}$$
(83)

\(\square \)
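Lemma B.1 can be checked exactly on a small example. The sketch below assumes one particular exchangeable model, chosen here purely for illustration: the offspring vector is a uniformly random permutation of a fixed vector of counts summing to N. It enumerates all permutations and compares the resulting conditional variance with Eq. (79). The function name `cannings_moments` and the example vector are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def cannings_moments(offspring, i):
    """Exact mean and variance of X(t+1) = nu_1 + ... + nu_i.

    The offspring vector (nu_1, ..., nu_N) is taken to be a uniformly random
    permutation of `offspring`, which makes the nu_k exchangeable.
    """
    perms = list(permutations(offspring))
    totals = [sum(p[:i]) for p in perms]
    mean = Fraction(sum(totals), len(perms))
    var = Fraction(sum(t * t for t in totals), len(perms)) - mean**2
    return mean, var


offspring = (3, 0, 0, 1, 1)  # sums to N = 5, as required by Eq. (75)
N = len(offspring)
_, sigma2 = cannings_moments(offspring, 1)  # offspring variance of one individual
for i in range(N + 1):
    mean, var = cannings_moments(offspring, i)
    assert mean == i                                   # Eq. (78)
    assert var == sigma2 * i * (N - i) / (N - 1)       # Eq. (79)
```

For this vector the offspring variance is 6/5, and with \(i=2\) both sides of Eq. (79) equal 9/5.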

1.5.2 B.5.2 Cannings models with selection

Lessard and Ladret (2007) introduced an extension of Cannings models that includes selection. Here, we show that although not all of the models in the class defined by Lessard and Ladret (2007) are selection processes, this is due to some biologically absurd models belonging to this class; under minimal biologically reasonable assumptions, such models are selection processes.

Following Lessard and Ladret (2007, with slightly modified notation), we consider a population of N individuals, each of which can be of either type A or B. In contrast to neutral Cannings models, in which all individuals are exchangeable, we now suppose individuals can be exchanged only with others of their own type (so an A can be exchanged with any other A but not a B).

Mathematically, for any time \(t\ge 0\) and population state, \(X(t) =i\) (that is, i is the number of individuals of type A), let \(\nu _{k}\left( i\right) \) \((1\le k\le N)\) be a random variable describing the number of offspring of the \(k\mathrm{th}\) individual in the population. Without loss of generality, we label the individuals of type A as \(1,\dots ,i\) and the individuals of type B as \(i+1,\dots ,N\). The population state at time \(t+1\) given that \(X(t) = i\) is then

$$\begin{aligned} X(t+1) = \sum _{k=1}^{i}\nu _{k}\left( i\right) . \end{aligned}$$
(84)

We assume that the offspring variables for each type, \(\{\nu _{k}\left( i\right) \}_{k=1}^{i}\) and \(\{\nu _{k}\left( i\right) \}_{k=i+1}^{N}\) are both sets of exchangeable random variables, that is the joint probability distributions of \(\{\nu _{k}\left( i\right) \}_{k=1}^{i}\) and \(\{\nu _{k}\left( i\right) \}_{k=i+1}^{N}\) are invariant to the order of these random variables: for any permutations \(\sigma _A\) and \(\sigma _B\) of the indices \(1,\dots ,i\) and \(i+1,\dots ,N\), respectively, and numbers of offspring \(\left( \xi _{1},\dots ,\xi _{N}\right) \in \{0,\dots ,N\}^N\),

$$\begin{aligned} {{\mathrm{\mathrm Pr}}}\left( \nu _{k}\left( i\right) =\xi _{k};\, 1\le k\le i \right) = {{\mathrm{\mathrm Pr}}}\left( \nu _{\sigma _A(k)}\left( i\right) =\xi _{k};\, 1\le k\le i \right) , \end{aligned}$$
(85a)

and

$$\begin{aligned} {{\mathrm{\mathrm Pr}}}\left( \nu _{k}\left( i\right) =\xi _{k};\, i+1\le k\le N \right) = {{\mathrm{\mathrm Pr}}}\left( \nu _{\sigma _B(k)}\left( i\right) =\xi _{k};\, i+1\le k\le N \right) . \end{aligned}$$
(85b)

Because the population size is constant, the offspring variables \(\{\nu _{k}\left( i\right) \}_{k=1}^{N}\) must also satisfy

$$\begin{aligned} N = \sum _{k=1}^N\nu _{k}\left( i\right) . \end{aligned}$$
(86)

Let the expected number of offspring of individuals of type A be

$$\begin{aligned} \mu _{A}\left( i\right) = {{\mathrm{\mathbb {E}}}}(\nu _{k}\left( i\right) ) \quad \text { for } 1\le k\le i, \end{aligned}$$
(87)

and the expected number of offspring of individuals of type B be

$$\begin{aligned} \mu _{B}\left( i\right) = {{\mathrm{\mathbb {E}}}}(\nu _{k}\left( i\right) ) \quad \text { for } i+1\le k\le N. \end{aligned}$$
(88)

Equation (86) then implies

$$\begin{aligned} N = \sum _{k=1}^N{{\mathrm{\mathbb {E}}}}\left( \nu _{k}\left( i\right) \right) = i\mu _{A}\left( i\right) + (N-i)\mu _{B}\left( i\right) . \end{aligned}$$
(89)

Differential fitnesses for the two types can then be introduced by allowing \(\mu _{A}\left( i\right) \) and \(\mu _{B}\left( i\right) \) to differ, and defining the fitness of each type in a manner consistent with hypothesis H1 of Definition 2.1 (for example, Wrightian fitness can be used, i.e. define the expected fitness of each type \(t=A\) or B as \(\overline{W}_{t}\left( {i}\right) \triangleq \mu _{t}\left( i\right) \); see for example Wu et al. 2013). In particular, frequency dependent selection is obtained by allowing the expected numbers of offspring to depend on the population state i.

Lessard and Ladret’s extension of Cannings’ model (described above) defines a discrete time Markov chain with transition matrix

$$\begin{aligned} P_{{i},{j}} = {{\mathrm{\mathrm Pr}}}\left( {\sum _{k=1}^{i}\nu _{k}\left( i\right) }=j\right) . \end{aligned}$$
(90)

Not all models in the class defined by Lessard and Ladret are selection processes. For example, suppose that \(\{\nu _{k}\left( i\right) \}_{k=1}^{i}\) and \(\{\nu _{k}\left( i\right) \}_{k=i+1}^{N}\) are sets of exchangeable random variables with means \(\mu _{A}\left( i\right) \)=\(\mu _{B}\left( i\right) =1\). Then,

$$\begin{aligned} \sum _{k=1}^i\nu _{k}\left( i\right) =i, \end{aligned}$$
(91a)

and

$$\begin{aligned} \sum _{k=i+1}^N\nu _{k}\left( i\right) =N-i. \end{aligned}$$
(91b)

That is, each type evolves independently according to a (neutral) Cannings model (Footnote 8). The joint probability distribution of \(\{\nu _{k}\left( i\right) \}_{k=1}^{i}\) is then

$$\begin{aligned} {{\mathrm{\mathrm Pr}}}\left( \nu _{k}\left( i\right) =\xi _{k};\, 1\le k\le i \right) = {\left\{ \begin{array}{ll} 1/i &{}\text { if } \xi _{k} = i\delta _{k\hat{k}} \text { for some } \hat{k}\in \{1,\ldots ,i\},\\ 0 &{}\text { otherwise,} \end{array}\right. } \end{aligned}$$
(92)

with an analogous expression for \(\{\nu _{k}\left( i\right) \}_{k=i+1}^{N}\) (where we have used Kronecker’s delta notation: \(\delta _{mn}=1\) if \(m=n\) and \(\delta _{mn}=0\) otherwise). The resulting transition matrix \(P=I\) defines a neutral process in the sense that \(\left. {{\mathrm{\mathbb {E}}}}\left( X(t+1)\,|\, X(t)\right) \right. = X(t)\). However, every state is an absorbing state of this Markov process, and in particular, hypothesis H2 of Definition 2.1 is violated (Footnote 9).

It is thus natural to ask which of Lessard and Ladret’s models are selection processes. To answer this question, we consider the three hypotheses of Definition 2.1:

H1:

This hypothesis asserts that the type that has higher fitness at time t is expected to increase in frequency in the next time step. While fitness as such is not part of the definition of Lessard and Ladret’s models, one may define the expected fitness of each type \(t=A\) or B as the expected number of offspring of individuals of that type (as suggested above),

$$\begin{aligned} \overline{W}_{t}\left( {i}\right) \triangleq \mu _{t}\left( i\right) . \end{aligned}$$
(93)

Under this definition of fitness,

$$\begin{aligned} \left. {{\mathrm{\mathbb {E}}}}(X(t+1)\,|\,X(t) = i)\right.&= {{\mathrm{\mathbb {E}}}}\left( \sum _{k=1}^{i}\nu _{k}\left( i\right) \right) = i\mu _{A}\left( i\right) = i + \frac{i}{N}(N\mu _{A}\left( i\right) -N)\nonumber \\&= i + \frac{i}{N}\big [N\mu _{A}\left( i\right) -\big (i\mu _{A}\left( i\right) + (N-i)\mu _{B}\left( i\right) \big )\big ] \nonumber \\&=i + i\frac{N-i}{N}(\mu _{A}\left( i\right) -\mu _{B}\left( i\right) ), \end{aligned}$$
(94)

so H1 is satisfied.

H2:

This hypothesis stipulates that starting from a mixed-type state i (where both types are present in the population), the fixation of at least one of the types (A or B) must be possible. We are not aware of a simple sufficient condition on the exchangeable sets \(\{\nu _{k}\left( i\right) \}_{k=1}^{i}\) and \(\{\nu _{k}\left( i\right) \}_{k=i+1}^{N}\) ensuring that H2 holds. However, models violating this hypothesis seem to us biologically unreasonable.

H3:

This hypothesis stipulates that the states at which the population is composed only of one type (A or B) are absorbing. Any process in the class defined by Lessard and Ladret satisfies H3, because if \(X(t) = 0\), then from Eq. (84)

$$\begin{aligned} X(t+1) = \sum _{k=1}^{0}\nu _{k}\left( i\right) =0, \end{aligned}$$
(95)

and similarly, if \(X(t)=N\), then \(X(t+1) = N\).

C Fixation probabilities for birth–death processes

Suppose that individuals in a population of constant size N can possess one of two traits, A and B. Let the state of the population (i.e. the number of individuals of type A) evolve according to a discrete-time birth–death process in which a trait that has disappeared cannot re-emerge. That is, the population state may change by at most one at any given time-step (individuals change their type one at a time), and the states 0 and N are absorbing. In this Appendix, we find \(p_\mathrm{fix}\left( {i}\right) \), the fixation probability of the trait A, when there are initially i individuals of type A in the population. We do this following the method presented by Nowak (2006).

Mathematically, the time evolution of the population composition follows a Markov process with transition matrix P satisfying

$$\begin{aligned} P_{{k},{k}}=1-P_{{k},{k+1}}-P_{{k},{k-1}}, \end{aligned}$$
(96)

and \(P_{{k},{j}}=0\) for all \(0\le j<k-1\) and \(k+1<j\le N\), where \(P_{{k},{k+1}}\) and \(P_{{k},{k-1}}\) are the transition probabilities from the state in which there are k individuals of type A, to the ones in which the population contains \(k+1\) or \(k-1\) individuals of type A, respectively. Note also that \(P_{{0},{0}} = P_{{N},{N}} = 1\) and \(P_{{0},{k}} = P_{{N},{N-k}} =0\) for all \(1\le k\le N\) (the states corresponding to homogeneous populations are absorbing).

Let \(p_\mathrm{fix}\left( {i}\right) \) be the probability of reaching state N (fixation of A) when starting from state i. It follows that \(p_\mathrm{fix}\left( {0}\right) = 0\), \(p_\mathrm{fix}\left( {N}\right) =1\) and for \(1\le i \le N-1\),

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right) = P_{{i},{i-1}}p_\mathrm{fix}\left( {i-1}\right) + P_{{i},{i+1}}p_\mathrm{fix}\left( {i+1}\right) + P_{{i},{i}}p_\mathrm{fix}\left( {i}\right) \!. \end{aligned}$$
(97)

Consequently,

$$\begin{aligned} (P_{{i},{i+1}} + P_{{i},{i-1}})p_\mathrm{fix}\left( {i}\right) =(1-P_{{i},{i}})p_\mathrm{fix}\left( {i}\right) = P_{{i},{i-1}}p_\mathrm{fix}\left( {i-1}\right) + P_{{i},{i+1}}p_\mathrm{fix}\left( {i+1}\right) \!, \end{aligned}$$

so

$$\begin{aligned} P_{{i},{i-1}}(p_\mathrm{fix}\left( {i}\right) -p_\mathrm{fix}\left( {i-1}\right) ) = P_{{i},{i+1}}(p_\mathrm{fix}\left( {i+1}\right) -p_\mathrm{fix}\left( {i}\right) ), \end{aligned}$$

or, defining \(y_i = p_\mathrm{fix}\left( {i}\right) - p_\mathrm{fix}\left( {i-1}\right) \) for \(1\le i \le N\),

$$\begin{aligned} y_{i+1} =\frac{P_{{i},{i-1}}}{P_{{i},{i+1}}}y_i. \end{aligned}$$

Thus,

$$\begin{aligned} y_1&= p_\mathrm{fix}\left( {1}\right) - p_\mathrm{fix}\left( {0}\right) = p_\mathrm{fix}\left( {1}\right) ,\nonumber \\ y_2&=\frac{P_{{1},{0}}}{P_{{1},{2}}}y_1 =\frac{P_{{1},{0}}}{P_{{1},{2}}}p_\mathrm{fix}\left( {1}\right) ,\nonumber \\ y_3&= \frac{P_{{2},{1}}}{P_{{2},{3}}}y_2 = \frac{P_{{2},{1}}}{P_{{2},{3}}}\frac{P_{{1},{0}}}{P_{{1},{2}}}p_\mathrm{fix}\left( {1}\right) ,\nonumber \\&\,\;\vdots \nonumber \\ y_{i+1}&= \prod _{j=1}^{i}\frac{P_{{j},{j-1}}}{P_{{j},{j+1}}} p_\mathrm{fix}\left( {1}\right) \end{aligned}$$
(98)

for \(1\le i\le N-1\).

Summing \(y_k\) for \(1\le k\le i\le N\) gives

$$\begin{aligned} \sum _{k=1}^{i} y_k = \sum _{k=1}^{i} {\big (p_\mathrm{fix}\left( {k}\right) - p_\mathrm{fix}\left( {k-1}\right) \big )} = p_\mathrm{fix}\left( {i}\right) - p_\mathrm{fix}\left( {0}\right) = p_\mathrm{fix}\left( {i}\right) . \end{aligned}$$
(99)

From Eqs. (98) and (99),

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right) = y_1 +\sum _{k=1}^{i-1} y_{k+1} = p_\mathrm{fix}\left( {1}\right) \left( 1+\sum _{k=1}^{i-1} \prod _{j=1}^{k}\frac{P_{{j},{j-1}}}{P_{{j},{j+1}}}\right) . \end{aligned}$$
(100)

Since \(p_\mathrm{fix}\left( {N}\right) =1\), substituting \(i=N\) in Eq. (100) gives

$$\begin{aligned} p_\mathrm{fix}\left( {1}\right) = \frac{1}{1 + \sum _{k=1}^{N-1}\prod _{j=1}^{k}\frac{P_{{j},{j-1}}}{P_{{j},{j+1}}}}. \end{aligned}$$
(101)

Thus, from Eqs. (101) and (100), the fixation probability of A when there are initially i individuals of type A in the population is

$$\begin{aligned} p_\mathrm{fix}\left( {i}\right) = \frac{1 +\sum _{k=1}^{i-1}\prod _{j=1}^{k} \frac{P_{{j},{j-1}}}{P_{{j},{j+1}}}}{1+ \sum _{k=1}^{N-1}\prod _{j=1}^{k}\frac{P_{{j},{j-1}}}{P_{{j},{j+1}}}}. \end{aligned}$$
(102)
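Equation (102) translates directly into a short routine. The following Python sketch is one possible implementation, with hypothetical names; it takes the one-step probabilities \(P_{j,j+1}\) and \(P_{j,j-1}\) for \(1\le j\le N-1\) as lists indexed by j (index 0 is unused) and assumes \(P_{j,j+1}>0\) for every such j. The example uses a Moran-type process in which type A has a constant relative fitness r, an assumption made here only so that the ratio \(P_{j,j-1}/P_{j,j+1}\) equals 1/r and the result can be compared with the familiar closed form \((1-r^{-i})/(1-r^{-N})\).

```python
from fractions import Fraction


def fixation_probabilities(up, down):
    """Fixation probabilities p_fix(0), ..., p_fix(N) from Eq. (102).

    up[j] = P_{j,j+1} and down[j] = P_{j,j-1} for j = 1, ..., N-1;
    both lists have length N and their entry at index 0 is ignored.
    """
    N = len(up)
    # partial[i] = 1 + sum_{k=1}^{i-1} prod_{j=1}^{k} down[j]/up[j] for i >= 1,
    # and partial[0] = 0, matching p_fix(0) = 0.
    partial = [Fraction(0), Fraction(1)]
    prod = Fraction(1)
    for k in range(1, N):
        prod *= Fraction(down[k]) / Fraction(up[k])
        partial.append(partial[-1] + prod)
    total = partial[N]  # denominator of Eq. (102)
    return [p / total for p in partial]


# Example: N = 4 and hypothetical relative fitness r = 2 for type A.
N, r = 4, 2
up = [0] + [Fraction(r * j * (N - j), (r * j + N - j) * N) for j in range(1, N)]
down = [0] + [Fraction(j * (N - j), (r * j + N - j) * N) for j in range(1, N)]
p_fix = fixation_probabilities(up, down)
for i in range(N + 1):
    assert p_fix[i] == (1 - Fraction(1, r**i)) / (1 - Fraction(1, r**N))
```

When the two one-step probabilities are equal in every mixed-type state, every product in Eq. (102) equals 1 and the routine returns \(p_\mathrm{fix}(i)=i/N\).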


Cite this article

Molina, C., Earn, D.J.D. On selection in finite populations. J. Math. Biol. 76, 645–678 (2018). https://doi.org/10.1007/s00285-017-1151-4
