Abstract
In this paper, we develop a method for computing the variance effective size \(N_{eV}\), the fixation index \(F_{ST}\) and the coefficient of gene differentiation \(G_{ST}\) of a structured population under equilibrium conditions. The subpopulation sizes are constant in time, with migration and reproduction schemes that can be chosen with great flexibility. Our quasi equilibrium approach is conditional on non-fixation of alleles. This is of relevance when migration rates are of a larger order of magnitude than the mutation rates, so that new mutations can be ignored before equilibrium balance between genetic drift and migration is obtained. The vector valued time series of subpopulation allele frequencies is divided into two parts; one corresponding to genetic drift of the whole population and one corresponding to differences in allele frequencies among subpopulations. We give conditions under which the first two moments of the latter, after a simple standardization, are well approximated by quantities that can be explicitly calculated. This enables us to compute approximations of the quasi equilibrium values of \(N_{eV}\), \(F_{ST}\) and \(G_{ST}\). Our findings are illustrated for several reproduction and migration scenarios, including the island model, stepping stone models and a model where one subpopulation acts as a demographic reservoir. We also make detailed comparisons with a backward approach based on coalescence probabilities.
References
Allendorf F, Ryman N (2002) The role of genetics in population viability analysis. In: Beissinger SR, McCullough DR (eds) Population viability analysis. The University of Chicago Press, Chicago
Allendorf FW, Luikart G (2007) Conservation and the genetics of populations. Blackwell, Malden
Barton NH, Slatkin M (1986) A quasi-equilibrium theory of the distribution of rare alleles in a subdivided population. Heredity 56:409–415
Brockwell PJ, Davis RA (1987) Time series: theory and methods. Springer, New York
Caballero A (1994) Developments in the prediction of effective population size. Heredity 73:657–679
Cannings C (1974) The latent roots of certain Markov chains arising in genetics: a new approach. I. Haploid models. Adv Appl Prob 6:260–290
Caswell H (2001) Matrix population models, 2nd edn. Sinauer, Sunderland
Cattiaux P, Collet P, Lambert A, Martínez SM, Martín JS (2009) Quasi-stationary distributions and diffusion models in population dynamics. Ann Probab 37(5):1926–1969
Chakraborty R, Leimar O (1987) Genetic variation within a subdivided population. In: Ryman N, Utter F (eds) Population genetics and fishery management. Washington Sea Grant Program, Seattle, WA. Reprinted 2009 by The Blackburn Press, Caldwell
Collet P, Martinez S (2013) Quasi stationary distributions, Markov chains, diffusions and dynamical systems. Springer, Berlin
Cox DR, Miller HD (1965) The theory of stochastic processes. Methuen & Co Ltd, London
Crow JF (2004) Assessing population subdivision. In: Wasser SP (ed) Evolutionary theory and processes: modern horizons. Papers in Honour of Eviatar Nevo. Springer Science+Business Media Dordrecht, Berlin, pp 35–42
Crow JF, Aoki K (1982) Group selection for a polygenic behavioral trait: a differential proliferation model. Proc Natl Acad Sci 79:2628–2631
Crow JF, Aoki K (1984) Group selection for a polygenic behavioral trait: estimating the degree of population subdivision. Proc Natl Acad Sci 81:6073–6077
Crow JF, Kimura M (1970) An introduction to population genetics theory. The Blackburn Press, Caldwell
Durrett R (2008) Probability models for DNA sequence evolution, 2nd edn. Springer, New York
Engen S, Lande R, Saether B-E (2005a) Effective size of a fluctuating age-structured population. Genetics 170:941–954
Engen S, Lande R, Saether B-E, Weimerskirch H (2005b) Extinction in relation to demographic and environmental stochasticity in age-structured models. Math Biosci 195:210–227
Engle RF, Granger CWJ (1987) Co-integration and error correction: Representation, estimation and testing. Econometrica 55:251–276
Ethier SN, Nagylaki T (1980) Diffusion approximation of Markov chains with two time scales and applications to genetics. Adv Appl Prob 12:14–49
Ewens WJ (1982) On the concept of effective population size. Theoret Popul Biol 21:373–378
Ewens WJ (2004) Mathematical Population Genetics. I. Theoretical introduction, 2nd edn. Springer, New York
Felsenstein J (1971) Inbreeding and variance effective numbers in populations with overlapping generations. Genetics 68:581–597
Fisher RA (1958) The genetical theory of natural selection, 2nd edn. Dover, New York
Granger CWJ (1981) Some properties of time series data and their use in econometric model specification. J Econom 16:121–130
Hardy OJ, Vekemans X (1999) Isolation by distance in a continuous population: reconciliation between spatial autocorrelation analysis and population genetics models. Heredity 83:145–154
Hardy OJ, Vekemans X (2002) SPAGeDi: a versatile computer program to analyse spatial genetic structure at the individual or population levels. Mol Ecol Notes 2:618–620
Hare MP, Nunney L, Schwartz MK, Ruzzante DE, Burford M, Waples R, Ruegg K, Palstra F (2011) Understanding and estimating effective population size for practical applications in marine species management. Conserv Biol 25(3):438–449
Hössjer O (2011) Coalescence theory for a general class of structured populations with fast migration. Adv Appl Probab 43(4):1027–1047
Hössjer O (2013) Spatial autocorrelation for subdivided populations with invariant migration schemes. Methodol Comput Appl Probab. doi:10.1007/s11009-013-9321-3
Hössjer O, Jorde PE, Ryman N (2013) Quasi equilibrium approximations of the fixation index of the island model under neutrality. Theoret Popul Biol 84:9–24
Jamieson IG, Allendorf FW (2012) How does the 50/500 rule apply to MVPs? Trends Ecol Evol 27(10):578–584
Jorde P-E, Ryman N (2007) Unbiased estimator of genetic drift and effective population size. Genetics 177:927–935
Karlin S (1966) A first course in stochastic processes. Academic Press, New York
Kimura M (1953) ‘Stepping stone’ model of population. Ann Rep Natl Inst Genet Japan 3:62–63
Kimura M (1955) Solution of a process of random genetic drift with a continuous model. Proc Natl Acad Sci USA 41:141–150
Kimura M (1964) Diffusion models in population genetics. J Appl Prob 1:177–232
Kimura M (1971) Theoretical foundations of population genetics at the molecular level. Theor Popul Biol 2:174–208
Kimura M, Weiss GH (1964) The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics 61:763–771
Kingman JFC (1982) The coalescent. Stoch Proc Appl 13:235–248
Latter BDH, Sved JA (1981) Migration and mutation in stochastic models of gene frequency change. II. Stochastic migration with a finite number of islands. J Math Biol 13:95–104
Leviyang S (2011a) The distribution of \(F_{ST}\) for the island model in the large population, weak mutation limit. Stoch Anal Appl 28:577–601
Leviyang S (2011b) The distribution of \(F_{ST}\) and other genetic statistics for a class of population structure models. J Math Biol 62:203–289
Leviyang S, Hamilton MB (2011) Properties of Weir and Cockerham’s \(F_{ST}\) estimator and associated bootstrap confidence intervals. Theoret Popul Biol 79:39–52
Malécot G (1946) La consanguinité dans une population limitée. C R Acad Sci (Paris) 222:841–843
Maruyama T (1970a) On the rate of decrease of heterozygosity in circular stepping stone models of populations. Theor Popul Biol 1:101–119
Maruyama T (1970b) Effective number of alleles in subdivided populations. Theor Popul Biol 1:273–306
Möhle M (2010) Looking forwards and backwards in the multi-allelic neutral Cannings population model. J Appl Prob 47:713–731
Nagylaki T (1980) The strong migration limit in geographically structured populations. J Math Biol 9:101–114
Nagylaki T (1982) Geographical invariance in population genetics. J Theor Biol 99:159–172
Nagylaki T (1998) The expected number of heterozygous sites in a subdivided population. Genetics 149:1599–1604
Nagylaki T (2000) Geographical invariance and the strong-migration limit in subdivided populations. J Math Biol 41:123–142
Nei M (1973) Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci USA 70:3321–3323
Nei M (1975) Molecular evolution and population genetics. North-Holland, Amsterdam
Nei M (1977) \(F\)-statistics and analysis of gene diversity in subdivided populations. Ann Hum Genet 41:225–233
Nei M, Chakravarti A, Tateno Y (1977) Mean and variance of \(F_{ST}\) in a finite number of incompletely isolated populations. Theoret Popul Biol 11:291–306
Nei M, Kumar S (2000) Molecular evolution and phylogenetics. Oxford University Press, Oxford
Nei M, Tajima F (1981) Genetic drift and estimation of effective population size. Genetics 98:625–640
Nordborg M, Krone S (2002) Separation of time scales and convergence to the coalescent in structured populations. In: Slatkin M, Veuille M (eds) Modern development in theoretical population genetics. Oxford Univ Press, Oxford, pp 194–232
Nunney L (1999) The effective size of a hierarchically-structured population. Evolution 53:1–10
Olsson F, Hössjer O, Laikre L, Ryman N (2013) Variance effective population size of populations in which size and age composition fluctuate. Theoret Popul Biol (to appear)
Orive ME (1993) Effective population size in organisms with complex life-histories. Theoret Popul Biol 44:316–340
Palstra FP, Ruzzante DE (2008) Genetic estimates of contemporary effective population size: what can they tell us about the importance of genetic stochasticity for wild populations persistence? Mol Ecol 17:3428–3447
Rottenstreich S, Miller JR, Hamilton MB (2007) Steady state of homozygosity and \(G_{ST}\) for the island model. Theoret Popul Biol 72:231–244
Ryman N, Allendorf FW, Jorde PE, Laikre L, Hössjer O (2013) Samples from structured populations yield biased estimates of effective size that overestimate the rate of loss of genetic variation. Mol Ecol Resour (to appear)
Ryman N, Leimar O (2008) Effect of mutation on genetic differentiation among nonequilibrium populations. Evolution 62(9):2250–2259
Sagitov S, Jagers P (2005) The coalescent effective size of age-structured populations. Ann Appl Probab 15(3):1778–1797
Sampson KY (2006) Structured coalescent with nonconservative migration. J Appl Prob 43:351–362
Sjödin P, Kaj I, Krone S, Lascoux M, Nordborg M (2005) On the meaning and existence of an effective population size. Genetics 169:1061–1070
Slatkin M (1981) Estimating levels of gene flow in natural populations. Genetics 99:323–335
Slatkin M (1985) Rare alleles as indicators of gene flow. Evolution 39:53–65
Slatkin M (1991) Inbreeding coefficients and coalescence times. Genet Res 58:167–175
Slatkin M, Arter HE (1991) Spatial autocorrelation methods in population genetics. Am Nat 138(2):499–517
Sved JA, Latter BDH (1977) Migration and mutation in stochastic models of gene frequency change. J Math Biol 5:61–73
Sokal RR, Oden NL, Thomson BA (1997) A simulation study of microevolutionary inferences by spatial autocorrelation analysis. Biol J Linnean Soc 60:73–93
Takahata N (1983) Gene identity and genetic differentiation of populations in the finite island model. Genetics 104(3):497–512
Takahata N, Nei M (1984) \(F_{ST}\) and \(G_{ST}\) statistics in the finite island model. Genetics 107(3):501–504
Van der AA NP, Ter Morsche HG, Mattheij RRM (2007) Computation of eigenvalue and eigenvector derivatives for a general complex-valued eigensystem. Electron J Linear Algebra 16:300–314
Wakeley J (1999) Nonequilibrium migration in human history. Genetics 153:1863–1871
Wakeley J, Takahashi T (2004) The many-demes limit for selection and drift in a subdivided population. Theoret Popul Biol 66:83–91
Wang J, Caballero A (1999) Developments in predicting the effective size of subdivided populations. Heredity 82:212–226
Waples RS (1989) A generalized approach for estimating effective population size from temporal changes of allele frequency. Genetics 121:379–391
Waples RS (2002) Definition and estimation of effective population size in the conservation of endangered species. In: Beissinger SR, McCullough DR (eds) Population viability analysis. The University of Chicago Press, Chicago
Waples RS, Gaggiotti O (2006) What is a population? An empirical evaluation of some genetic methods for identifying the number of gene pools and their degree of connectivity. Mol Ecol 15:1419–1439
Waples RS, Yokota M (2007) Temporal estimates of effective population size in species with overlapping generations. Genetics 175:219–233
Ward RD, Woodward M, Skibinski DOF (1994) A comparison of genetic diversity levels in marine, freshwater and anadromous fishes. J Fish Biol 44:213–232
Weir BS, Cockerham CC (1984) Estimating \(F\)-statistics for the analysis of population structure. Evolution 38(6):1358–1370
Weiss GH, Kimura M (1965) A mathematical analysis of the stepping stone model of genetic correlation. J Appl Probab 2:129–149
Whitlock MC, Barton NH (1997) The effective size of a subdivided population. Genetics 145:427–441
Wilkinson-Herbots HM (1998) Genealogy and subpopulation differentiation under various models of population structure. J Math Biol 37:535–585
Wright S (1931) Evolution in Mendelian populations. Genetics 16:97–159
Wright S (1938) Size of population and breeding structure in relation to evolution. Science 87:430–431
Wright S (1943) Isolation by distance. Genetics 28:114–138
Wright S (1946) Isolation by distance under diverse systems of mating. Genetics 31:39–59
Wright S (1951) The general structure of populations. Ann Eugenics 15:323–354
Wright S (1978) Variability within and among genetic populations. Evolution and the genetics of populations, vol 4. University of Chicago Press, Chicago
Acknowledgments
Ola Hössjer’s research was financially supported by the Swedish Research Council, contract nr. 621-2008-4946, and the Gustafsson Foundation for Research in Natural Sciences and Medicine. Nils Ryman’s research was supported by grants from the Swedish Research Council, the BONUS Baltic Organisations’ Network for Funding Science EEIG (the BaltGene research project), and through a grant to his colleague Linda Laikre from the Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning (Formas). The authors want to thank an associate editor, two referees, Anders Martin-Löf, and Fredrik Olsson for valuable comments on the work.
Appendices
Appendix A: Orthogonal decomposition of allele frequency process
Jordan canonical form of \({\varvec{B}}\) and motivation of (22). Let \({\varvec{B}}= {\varvec{Q}}\varvec{\Lambda }{\varvec{Q}}^{-1}\) be the Jordan canonical form of \({\varvec{B}}\), with
\[\varvec{\Lambda } = \text{ diag }(\varvec{\Lambda }_1,\ldots ,\varvec{\Lambda }_r)\]
a block diagonal matrix containing the (possibly complex-valued) eigenvalues of \({\varvec{B}}\) along the diagonal. For each \(l=1,\ldots ,r\), the square matrix
\[\varvec{\Lambda }_l = \begin{pmatrix} \lambda _l & 1 & & \\ & \lambda _l & \ddots & \\ & & \ddots & 1 \\ & & & \lambda _l \end{pmatrix}\]
occupies rows and columns \(j_{l-1}+1,\ldots ,j_l\) of \(\varvec{\Lambda }\), with diagonal entries equal to \(\lambda _l\), all entries along the superdiagonal equal to 1 and all other entries of \(\varvec{\Lambda }_l\) equal to 0. Hence \(\lambda _l\) is an eigenvalue of \({\varvec{B}}\) which appears \(j_l-j_{l-1}\) times along the diagonal of \(\varvec{\Lambda }_l\), with \(0=j_0 < j_1 < \cdots < j_r=s\). In particular, \(\varvec{\Lambda }\) is diagonal when all eigenvalues of \({\varvec{B}}\) are distinct and \(r=s\). Then the rows of \({\varvec{Q}}^{-1}\) contain the left eigenvectors of \({\varvec{B}}\) and the columns \({\varvec{q}}_1,\ldots ,{\varvec{q}}_s\) of \({\varvec{Q}}\) the right eigenvectors. See for instance Cox and Miller (1965).
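As a small numerical illustration of the Jordan block structure described above, the following sketch builds a \(2\times 2\) block and raises it to a power. The values of `lam` and `tau` and all function names are hypothetical choices made for this example only; the point is that the off-diagonal entry of the power grows like \(\tau \lambda ^{\tau -1}\), a polynomial factor times a geometric one.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def jordan_block(lam, size):
    """Jordan block: lam on the diagonal, 1 on the superdiagonal, 0 elsewhere."""
    return [[lam if i == j else 1.0 if j == i + 1 else 0.0
             for j in range(size)] for i in range(size)]


def mat_power(a, tau):
    """Compute a**tau by repeated multiplication (tau >= 0)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(tau):
        result = mat_mul(result, a)
    return result


lam, tau = 0.5, 10            # hypothetical eigenvalue and exponent
block_power = mat_power(jordan_block(lam, 2), tau)

# For a 2x2 block, the power has lam**tau on the diagonal and
# tau * lam**(tau - 1) in the upper-right corner.
diagonal_entry = block_power[0][0]
corner_entry = block_power[0][1]
```

For blocks of order \(k\) the same pattern holds with polynomial factors of degree up to \(k-1\), which is the source of the polynomial-times-geometric growth of such powers.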
Regardless of whether \(\varvec{\Lambda }\) is diagonal or not, since \({\varvec{B}}\) is a transition matrix of a Markov chain, \({\varvec{q}}_1 = \varvec{1}\) is a right eigenvector with eigenvalue \(\lambda _1=1\). By the assumed irreducibility and aperiodicity of this Markov chain, it follows from the Perron–Frobenius theorem that \(|\lambda _l|<1\) for \(l=2,\ldots ,r\), and without loss of generality, we may assume \(|\lambda _2|\ge |\lambda _3|\ge \cdots \ge |\lambda _r|\ge 0\).
Introduce the inner product
\[({\varvec{x}},{\varvec{y}}) = \sum _{i=1}^s \gamma _i \bar{x}_i y_i \qquad (105)\]
for possibly complex-valued column vectors \({\varvec{x}}=(x_i)\) and \({\varvec{y}}=(y_i)\) of length \(s\), with \(\bar{x}_i\) the complex conjugate of \(x_i\). Then, we have the following result:
Proposition 5
The columns \({\varvec{q}}_2,\ldots ,{\varvec{q}}_s\) of \({\varvec{Q}}\) are all orthogonal to \({\varvec{q}}_1=\varvec{1}\) with respect to inner product (105), i.e. \((\varvec{1},{\varvec{q}}_j)=0\) for \(j=2,\ldots ,s\).
Proof
We have that
where \(\langle {\varvec{x}},{\varvec{y}}\rangle =\sum _{j=1}^s x_jy_j\) is the standard inner product. The result follows since \(\varvec{\gamma }\) is the first row of \({\varvec{Q}}^{-1}\) and \({\varvec{q}}_j\) is column number \(j\) (with \(j\ge 2\)) of \({\varvec{Q}}\), so that their product is an off-diagonal entry of \({\varvec{Q}}^{-1}{\varvec{Q}}={\varvec{I}}\). \(\square \)
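Proposition 5 can be checked numerically in the simplest case of two subpopulations. The sketch below assumes, as the proof suggests, that the inner product weights coordinate \(i\) by the entry \(\gamma _i\) of the stationary row vector \(\varvec{\gamma }\). The migration rates `a` and `b` are hypothetical values, and the second right eigenvector \((a,-b)^T\) is the standard closed form for a \(2\times 2\) transition matrix.

```python
a, b = 0.2, 0.1                       # hypothetical migration rates
B = [[1 - a, a],
     [b, 1 - b]]                      # 2x2 transition matrix, rows sum to 1

# Stationary row vector gamma (gamma B = gamma), found by power iteration.
gamma = [0.5, 0.5]
for _ in range(500):
    gamma = [sum(gamma[i] * B[i][j] for i in range(2)) for j in range(2)]

# Right eigenvector belonging to the second eigenvalue lambda_2 = 1 - a - b.
lam2 = 1 - a - b
q2 = [a, -b]
B_q2 = [sum(B[i][j] * q2[j] for j in range(2)) for i in range(2)]

# gamma-weighted inner product between the vector of ones and q2.
weighted_inner = sum(g * q for g, q in zip(gamma, q2))
```

Here `weighted_inner` vanishes up to rounding error, while the unweighted sum `a - b` does not, which shows why the weights \(\gamma _i\) are needed when the stationary distribution is not uniform.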
Define \(\varvec{\Lambda }^0=\text{ diag }(0,\varvec{\Lambda }_2,\ldots ,\varvec{\Lambda }_r)\) as the block diagonal matrix obtained by replacing \(\varvec{\Lambda }_1=\lambda _1=1\) in \(\varvec{\Lambda }\) by \(0\) (or by any other number \(\lambda _1^0\) with modulus less than or equal to \(|\lambda _2|\)), and put
\[{\varvec{B}}^0 = {\varvec{Q}}\varvec{\Lambda }^0{\varvec{Q}}^{-1}. \qquad (106)\]
It then follows that all eigenvalues of \({\varvec{B}}^0\) have modulus at most \(|\lambda _2|<1\), and \({\varvec{B}}^0\) enters into the time dynamics of the allele frequency process as follows:
Proposition 6
The recursive autoregressive equation (10) for \({\varvec{P}}_t\) can be decomposed into one genetic drift term for the overall allele frequency of the whole population, and one recursion part for the allele frequency fluctuations among subpopulations, as
with \(\varvec{\varepsilon }_{t+1}^0\) as defined in (22).
Proof
The upper part of (107) follows immediately from (10), since
Define, for any vector \({\varvec{x}}=(x_1,...,x_s)\), \({\varvec{x}}^0 = {\varvec{x}}- (\varvec{1},{\varvec{x}})\varvec{1}\). Then, since \(({\varvec{x}}+{\varvec{y}})^0 = {\varvec{x}}^0+{\varvec{y}}^0\), we have that
The third equality of (108) follows since
where in the second-to-last step we used that, since \({\varvec{P}}_t^0\) is a linear combination of \({\varvec{q}}_2,\ldots ,{\varvec{q}}_s\), so is \({\varvec{B}}{\varvec{P}}_t^0\), which is hence orthogonal to \(\varvec{1}\) by Proposition 5, so that \(({\varvec{B}}{\varvec{P}}_t^0)^0={\varvec{B}}{\varvec{P}}_t^0\).
The fourth equality of (108) follows since \({\varvec{Q}}^{-1}{\varvec{P}}_t^0\) is a linear combination of \({\varvec{e}}_2,\ldots ,{\varvec{e}}_s\), where \({\varvec{e}}_i=(0,\ldots ,0,1,0,\ldots ,0)^T\) has 1 in position \(i\) and zeros elsewhere. Hence \(\varvec{\Lambda }{\varvec{Q}}^{-1}{\varvec{P}}_t^0 = \varvec{\Lambda }^0{\varvec{Q}}^{-1}{\varvec{P}}_t^0\) and \({\varvec{B}}{\varvec{P}}_t^0 = {\varvec{B}}^0 {\varvec{P}}_t^{0}\). \(\square \)
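The deterministic part of the decomposition in Proposition 6 can be illustrated with a short numerical sketch. It assumes that the overall allele frequency is the \(\varvec{\gamma }\)-weighted average of the subpopulation frequencies and that centering means subtracting this average from each coordinate. The \(3\times 3\) matrix `B` and the frequency vector `P` are hypothetical values chosen for the example.

```python
B = [[0.80, 0.15, 0.05],
     [0.10, 0.70, 0.20],
     [0.25, 0.05, 0.70]]              # hypothetical migration matrix, rows sum to 1
s = len(B)


def mat_vec(M, x):
    """Matrix times column vector."""
    return [sum(M[i][j] * x[j] for j in range(s)) for i in range(s)]


# Stationary row vector gamma (gamma B = gamma), found by power iteration.
gamma = [1.0 / s] * s
for _ in range(2000):
    gamma = [sum(gamma[i] * B[i][j] for i in range(s)) for j in range(s)]


def overall(x):
    """gamma-weighted average of the coordinates of x."""
    return sum(g * xi for g, xi in zip(gamma, x))


def center(x):
    """Subtract the gamma-weighted average from every coordinate."""
    mean = overall(x)
    return [xi - mean for xi in x]


P = [0.2, 0.5, 0.9]                   # hypothetical subpopulation allele frequencies
centered_then_migrated = mat_vec(B, center(P))
migrated_then_centered = center(mat_vec(B, P))
```

Migration leaves the weighted average unchanged, so `overall(mat_vec(B, P))` equals `overall(P)` and the two centered vectors agree. In this sketch, only the noise term moves the overall allele frequency, while the centered part evolves through \({\varvec{B}}\) restricted to vectors orthogonal to \(\varvec{1}\).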
Appendix B: Proofs from Sect. 5
Proof of Proposition 1.
We notice that
from which it easily follows that the two recursions in (25) and (26) are equivalent, with \(A_{ij,kl}\) and \(U_{ij,kl}\) related as in (27).
Next we will show that (25) and (28) are equivalent. Clearly (25) implies (28), so it remains to establish the reverse implication. Hence we assume that (28) is satisfied and we want to show that (25) holds for a unique square matrix \({\varvec{U}}=(U_{ij,kl})\) of order \(s^2\) with \(U_{ij,kl}=U_{ij,lk}\). Indeed, since \(\varvec{\Omega }({\varvec{P}})\) is a quadratic function of \({\varvec{P}}\) with \(\varvec{\Omega }({\varvec{0}})={\varvec{0}}\), there is a unique such matrix \({\varvec{U}}\) and a unique set of coefficients \(c_{ij,k}\) satisfying
for all \(i,j\). On the other hand, according to the lower part of (28),
should agree with (109). The quadratic terms of (109) and (110) are clearly identical, but in order for the linear and constant terms to agree as well,
must hold for all \(k\) (recall that \(U_{ij,kl}=U_{ij,lk}\)). On the other hand, we can add and subtract linear terms in (109) according to
where
for all \(k\). But \(d_{ij,k}=c_{ij,k}\) according to (111), so that the second sum in (112) vanishes, and the proposition is proved. \(\square \)
Proof of Proposition 2
First of all, since \(\sum _{\tau =0}^\infty ({\varvec{G}}^0-\varvec{\Pi }{\varvec{U}})^\tau \) is assumed to converge, it can be seen by insertion that (36) provides a solution to (33).
In order to prove (37), we get from the Cauchy–Schwarz inequality
for all pairs \(i,j\). This implies
We then use the definitions of \(|\cdot |_\infty \) and \(\Vert \cdot \Vert \) in Table 2, the triangle inequality and the matrix norm inequality \(\Vert ({\varvec{G}}^0-\varvec{\Pi }{\varvec{U}})^\tau \varvec{\Pi }\Vert \le \Vert ({\varvec{G}}^0-\varvec{\Pi }{\varvec{U}})^\tau \Vert \Vert \varvec{\Pi }\Vert \) in order to prove (38), since
We also have that
Finally, (39)–(40) are proved in the same way as (37)–(38). \(\square \)
Proof of Proposition 3
In order to prove (41), we introduce for each pair of integers \(\tau ,\alpha \) with \(0\le \alpha \le \tau \) the set \({\mathcal {N}}_{\tau \alpha } = \{{\varvec{n}}= (n_0,n_1,\ldots ,n_{\alpha +1})\}\) of \(\binom{\tau }{\alpha }\) sequences \({\varvec{n}}\) such that \(0=n_0 < n_1 < \cdots < n_\alpha < n_{\alpha +1} = \tau +1\). Then
where the terms in \({\mathcal {N}}_{\tau \alpha }\) correspond to all possible ways of picking \(\alpha \) terms \(\varvec{\Pi }{\varvec{U}}\) and \(\tau -\alpha \) terms \({\varvec{G}}^0\). Taking the matrix norm of (113) and multiplying by \({\varvec{U}}\) from the left and \(\varvec{\Pi }\) from the right, it follows from matrix norm inequalities that
Summing (114) over \(\tau \), then changing the order of summation between \(\alpha \) and \(\tau \), and finally substituting \(m_i=n_i-n_{i-1}-1\), we find that
It can be seen that \(({\varvec{G}}^0)^\tau \text{ vec }({\varvec{V}}) = \text{ vec }(({\varvec{B}}^0)^\tau {\varvec{V}}(({\varvec{B}}^0)^T)^\tau )\), by induction with respect to \(\tau \). Writing \(({\varvec{G}}^0)^\tau =(G_{ij,kl}^{0(\tau )})\) and \(({\varvec{B}}^0)^\tau = (b_{ik}^{0(\tau )})\), this yields
and
Formula (41) then follows from (115) to (116). In order to verify (42), we use (106) and the Jordan decomposition (104) to deduce
where the middle matrix on the right hand side is block diagonal, with \(\Vert \varvec{\Lambda }_l^\tau \Vert = O(\tau ^{j_l-j_{l-1}-1}|\lambda _l|^\tau )\) as \(\tau \rightarrow \infty \), and \(j_l-j_{l-1}\) the order of the square matrix \(\varvec{\Lambda }_l\), see Cox and Miller (1965) for details. In particular, this implies that \(\Vert \varvec{\Lambda }_l^\tau \Vert \) converges to zero at a faster rate than \((|\lambda _2|+\epsilon )^\tau \) as \(\tau \rightarrow \infty \) for any \(0<\epsilon < 1-|\lambda _2|\). Then (42) follows, since
Finally, (43) is a simple consequence of (41) and (42), since
\(\square \)
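The vec identity \(({\varvec{G}}^0)^\tau \text{ vec }({\varvec{V}}) = \text{ vec }(({\varvec{B}}^0)^\tau {\varvec{V}}(({\varvec{B}}^0)^T)^\tau )\) used in the proof above can be verified numerically. The sketch below assumes that \({\varvec{G}}^0\) has entries \(G^0_{ij,kl}=b^0_{ik}b^0_{jl}\), which is what the defining relation \({\varvec{G}}^0\text{ vec }({\varvec{V}})=\text{ vec }({\varvec{B}}^0{\varvec{V}}({\varvec{B}}^0)^T)\) requires, and that index pairs are ordered row by row. This ordering is a choice made for the example, and the matrices `B0` and `V` are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def vec(m):
    """Stack the entries of m row by row, so pair (i, j) has index i*s + j."""
    return [x for row in m for x in row]


B0 = [[0.5, 0.1],
      [-0.2, 0.3]]                    # hypothetical stand-in for B^0
V = [[2.0, 1.0],
     [1.0, 3.0]]                      # hypothetical symmetric matrix
s, tau = 2, 3

# G0 has entry b_ik * b_jl in row (i, j) and column (k, l).
G0 = [[B0[i][k] * B0[j][l] for k in range(s) for l in range(s)]
      for i in range(s) for j in range(s)]

# Left-hand side: apply G0 to vec(V) tau times.
lhs = vec(V)
for _ in range(tau):
    lhs = [sum(G0[r][c] * lhs[c] for c in range(s * s)) for r in range(s * s)]

# Right-hand side: vec of (B0**tau) V (B0**tau)^T.
B0_tau = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(tau):
    B0_tau = mat_mul(B0_tau, B0)
rhs = vec(mat_mul(mat_mul(B0_tau, V), transpose(B0_tau)))
```

The two vectors `lhs` and `rhs` agree up to rounding error, which is the induction step of the proof carried out `tau` times.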
Appendix C: Proof of Theorem 1
We start by showing that \(\text{ vec }({\varvec{V}}_t)\) and \(\text{ vec }(\varvec{\Sigma }_t)\) satisfy a similar system of equations as (33). To this end, since \(\varvec{\varepsilon }_t^0 = ({\varvec{I}}-\varvec{1}\varvec{\gamma })\varvec{\varepsilon }_t\), the lower part of (22) implies a recursion
where \(\varvec{\xi }_{t+1}\) is a remainder term that is nonzero since we conditioned on \(P_t\) rather than \(P_{t+1}\) and divided by \(P_t(1-P_t)\) rather than \(P_{t+1}(1-P_{t+1})\) on the right hand side of (117). Any departure of \(E_c(\varvec{\varepsilon }_{t+1}^0|P_t)\) from \(E(\varvec{\varepsilon }_{t+1}^0|P_t)={\varvec{0}}\) implies, in addition, that a cross covariance term is added to \(\varvec{\xi }_{t+1}\).
In vec format we may rewrite (117) as
with \(\varvec{\Pi }\) and \({\varvec{G}}^0\) matrices defined by \(\varvec{\Pi }\text{ vec }(\varvec{\Sigma }_t)=\text{ vec }(({\varvec{I}}-\varvec{1}\varvec{\gamma })\varvec{\Sigma }_t({\varvec{I}}-\varvec{1}\varvec{\gamma })^T))\) and \({\varvec{G}}^0\text{ vec }({\varvec{V}}_t)=\text{ vec }({\varvec{B}}^0{\varvec{V}}_t({\varvec{B}}^0)^T)\) respectively. Hence their entries are as in (34) and (35).
For the standardized genetic drift covariance matrix, we first expand (31) as
where the remainder term \(\varvec{\zeta }_t\) occurs when replacing the inner expectation \(E\) by \(E_c\). Then we expand \(\varvec{\Omega }(P_t{\varvec{1}} + {\varvec{P}}_t^0)\) as in (30) and take expectation conditionally on \(P_t\), and switch index from \(t\) to \(t+1\), to deduce that
where \({\varvec{U}}_t = (U_{tij,k})\) is an \(s^2\times s\) matrix, whose elements are defined as \(U_{tij,k}=(1-2P_t)\sum _l U_{ij,kl}\), so that the last term on the right hand side of (30) can be written as \({\varvec{U}}_t{\varvec{P}}_t^0\). The last term on the right hand side of (120) is defined by
with \(\varvec{\mu }_t\) as in (44).
Now (118) and (120) define a system of equations which only differs from (33) in that the remainder terms \(\text{ vec }(\varvec{\xi }_{t+1})\) and \(\varvec{\eta }_{t+1}\) have been added. For simplicity of notation, we write \(\tilde{\varvec{\xi }}_t = \text{ vec }(\varvec{\xi }_t)=\text{ vec }(\xi _{t,ij};1\le i,j\le s)\), a column vector of length \(s^2\). Combining (118) and (120), we get
where
On the other hand, it follows from (33) that
Taking the difference of (121) and (122), we find that
satisfies
provided that the series converges. It can be shown by induction with respect to \(\tau \) that
for all \(\tau \ge 1\). Inserting this formula into (123), one obtains
Since \(\tilde{\varvec{\xi }}_t\) contains the same elements as \(\varvec{\xi }_t\), we have that \(|\tilde{\varvec{\xi }}_t|_\infty = |\varvec{\xi }_t|_\infty \), and moreover, \(|\varvec{\eta }_t|_\infty \le |\varvec{\zeta }_t|_\infty + \Vert {\varvec{U}}_t\Vert |\varvec{\mu }_t|_\infty \). Hence it follows, by taking the \(|\cdot |_\infty \)-norm of the upper and lower part of (124), that
and
Since
it follows that \(\Vert {\varvec{U}}_t\Vert \le \Vert {\varvec{U}}\Vert \). Hence we may replace \(\Vert {\varvec{U}}_t\Vert \) and \(\Vert {\varvec{U}}_{t-\tau -1}\Vert \) in (125)–(126) by their upper bound \(\Vert {\varvec{U}}\Vert \), take conditional expectation \(E_c\) on both sides of these two inequalities, and finally let \(t\rightarrow \infty \), thereby obtaining (48) and (49). \(\square \)
Appendix D: Verifying formulas for \(\Omega ({\varvec{P}}_t)\) and \(N_{eV}^\mathrm{{ appr}}\) for various reproduction and migration models.
We will start by verifying (30) (and hence also (120)) separately for reproduction scenarios 1, 2 and 3.
Reproduction scenario 1. For this reproduction scenario, we write
It follows from (10) and (53) that
We further have that
and
Combining the last three displayed expressions, we arrive at (54). \(\square \)
Reproduction scenario 2. Write
introduce \(C_{kij} = \text{ Cov }(\nu _{ki}^l,\nu _{kj}^l)\) and \(\tilde{C}_{kij}=\text{ Cov }(\nu _{ki}^l,\nu _{kj}^{l^\prime })\) when \(l\ne l^\prime \). Because of the assumed exchangeability of \(\{\varvec{\nu }_k^l\}_{l=1}^{2Nu_k}\), \(C_{kij}\) and \(\tilde{C}_{kij}\) do not depend on \(l\) and \((l,l^\prime )\) respectively. Since (2) holds exactly, with remainder term \(o(1)\) equal to zero, the variance of the left hand side must be zero, and this implies \(\tilde{C}_{kij}=-C_{kij}/(2Nu_k-1)\). Therefore, it follows from (55) that
Combining this with (128), we arrive at
which is equivalent to (56). \(\square \)
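The covariance relation \(\tilde{C}_{kij}=-C_{kij}/(2Nu_k-1)\) used for reproduction scenario 2 follows from exchangeability together with a fixed total. A minimal sketch of the scalar case \(i=j\), under the assumption that a fixed multiset of offspring numbers is assigned to \(n\) parents in uniformly random order, can confirm it by exact enumeration. The list `values` is hypothetical, and \(n\) plays the role of \(2Nu_k\).

```python
from fractions import Fraction
from itertools import permutations

values = [0, 1, 1, 4]                 # hypothetical offspring numbers, fixed total
n = len(values)

# Every ordering is equally likely, which makes the vector exchangeable.
orderings = list(permutations(values))


def mean(xs):
    """Exact average of a list of integers."""
    return Fraction(sum(xs), len(xs))


mean_first = mean([p[0] for p in orderings])
mean_second = mean([p[1] for p in orderings])

# Variance of one coordinate and covariance between two distinct coordinates.
var_same = mean([p[0] * p[0] for p in orderings]) - mean_first ** 2
cov_distinct = mean([p[0] * p[1] for p in orderings]) - mean_first * mean_second
```

Because the sum of all coordinates is constant, its variance \(n\cdot \texttt{var\_same} + n(n-1)\cdot \texttt{cov\_distinct}\) is zero, which forces `cov_distinct == -var_same / (n - 1)`.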
Reproduction scenario 3. In order to verify (120), we first notice from (10) and (57) that
with \(\text{ rem } = \sum _{k=1}^s (B_{ik}-b_{ik})(\tilde{P}_{tk}-P_{tk})\) a remainder term that vanishes when \(N_{ek}=Nu_k\) for all \(k\) and which is otherwise asymptotically negligible when \(\alpha _i\rightarrow \infty \) as \(N\rightarrow \infty \). It follows from (57) and (59) that
and
In conjunction with (127) and (129), this proves (60). \(\square \)
Verifying (65). The reproduction scenario 3 expression for \(\varvec{\Sigma }\) is obtained by combining the upper equation of (33) with the relevant entries for \(U_{ij,kl}\) in Table 3. When \(N_{ek}=Nu_k\) for \(k=1,\ldots ,s\), all non-diagonal (\(i\ne j\)) terms vanish and then the denominator of (62) can be written as
which yields (65). \(\square \)
Deriving explicit expressions of \(N_{eV}^\mathrm{{ appr}}\) and \(F_{ST}^\mathrm{{ appr}}\) for the island model. Since \(\varvec{\gamma }={\varvec{u}}\) for the island model, we can apply (62) and (63), with \({\varvec{u}}=\varvec{1}^T/s\), to deduce
and
We will start by giving a more explicit expression for \({\varvec{V}}\). It follows from (66) that \({\varvec{B}}{\varvec{q}}= (1-m){\varvec{q}}\) for any vector \({\varvec{q}}\) with \(({\varvec{q}},\varvec{1})=0\). Hence \(\lambda _2=\cdots =\lambda _s=1-m\). In this case it is particularly convenient to put \(\lambda _1^0=1-m\) in the definition of \({\varvec{B}}^0\), since then, according to (106), \({\varvec{B}}^0=(1-m){\varvec{I}}\). The lower part of (33) can be written as \({\varvec{V}}={\varvec{B}}^0{\varvec{V}}({\varvec{B}}^0)^T + \tilde{\varvec{\Sigma }}\), where \(\tilde{\varvec{\Sigma }} = ({\varvec{I}}-\varvec{1}\varvec{\gamma })\varvec{\Sigma }({\varvec{I}}-\varvec{1}\varvec{\gamma })^T\). We can repeatedly apply this equation to deduce that
and hence (131) can be rewritten as
Therefore, in view of (130) and (132), it remains to find \(\varvec{\Sigma }\).
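The eigenstructure of the island model and the resulting geometric series can be illustrated with a short sketch. It assumes the usual island model form \({\varvec{B}}=(1-m){\varvec{I}}+(m/s)\varvec{1}\varvec{1}^T\), which is an assumption about (66) made for this example; the values of `s`, `m` and `sigma` are hypothetical. With \({\varvec{B}}^0=(1-m){\varvec{I}}\), each entry of \({\varvec{V}}\) satisfies the scalar recursion \(v=(1-m)^2v+\tilde{\sigma }\), whose fixed point is \(\tilde{\sigma }/(1-(1-m)^2)\).

```python
s, m = 4, 0.1                         # hypothetical number of islands and migration rate

# Assumed island model migration matrix: stay with prob. 1 - m, else pick an island uniformly.
B = [[(1 - m) * (i == j) + m / s for j in range(s)] for i in range(s)]


def mat_vec(M, x):
    return [sum(M[i][j] * x[j] for j in range(s)) for i in range(s)]


ones = [1.0] * s
q = [1.0, -1.0, 0.0, 0.0]             # any vector whose entries sum to zero
B_ones = mat_vec(B, ones)             # should equal ones (eigenvalue 1)
B_q = mat_vec(B, q)                   # should equal (1 - m) * q

# Scalar version of V = B0 V B0^T + Sigma_tilde with B0 = (1 - m) I.
sigma = 0.3                           # hypothetical entry of Sigma_tilde
v = 0.0
for _ in range(500):
    v = (1 - m) ** 2 * v + sigma
v_closed_form = sigma / (1 - (1 - m) ** 2)
```

Iterating the recursion sums the geometric series \(\sum _{\tau \ge 0}(1-m)^{2\tau }\tilde{\sigma }\), so `v` converges to `v_closed_form`, which can also be written as \(\tilde{\sigma }/(m(2-m))\).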
For reproduction scenario 1, it can be deduced from (120) that (54) simplifies to
for the island model, so that
and
Combining (130) and (133) we arrive at (67), and inserting (134) into (132) and solving for \(F_{ST}^\mathrm{{ appr}}\) we arrive at (68).
For reproduction scenario 3, a similar simplification of (60) leads to
and
Inserting (135) into (130) we arrive at (69), and plugging (136) into (132) and solving for \(F_{ST}^\mathrm{{ appr}}\) we arrive at (70). \(\square \)
Appendix E: Proof of Theorem 2
In order to prove Theorem 2, we first need two lemmas, which we state for a single biallelic locus:
Lemma 1
In the one locus biallelic definitions (12) and (13) of \(N_{eV,t}^{{\varvec{w}}}=Y/X\) and \(F_{ST,t}^{{\varvec{w}}}=Z/Y\), the conditional expected values of the numerators and denominators equal
and
respectively, where \({\varvec{C}}_Y = ({\varvec{w}}-\varvec{\gamma })^T({\varvec{w}}-\varvec{\gamma })\), \({\varvec{c}}_Y = (1-2P_t)({\varvec{w}}-\varvec{\gamma })\), \({\varvec{C}}_Z = \text{ diag }({\varvec{w}}) - {\varvec{w}}^T{\varvec{w}}\), \({\varvec{C}}_X = 2({\varvec{B}}-{\varvec{I}})^T ({\varvec{w}}-\varvec{\gamma })^T({\varvec{w}}-\varvec{\gamma })({\varvec{B}}-{\varvec{I}})\), \({\varvec{C}}_X^\prime = 2{\varvec{w}}^T{\varvec{w}}\), \(\varvec{\mu }_t\) and \(\varvec{\zeta }_t\) are the remainder terms defined in (44) and (119), and \(\varvec{\varsigma }_t\) another remainder term defined below, in (140).
Proof
We only prove the first parts of (137)–(139), and leave the second part to the reader. Starting with (137), we find that
where in the second equality we used \(P_t^{{\varvec{w}}} - P_t = ({\varvec{w}}-\varvec{\gamma }){\varvec{P}}_t = ({\varvec{w}}-\varvec{\gamma }){\varvec{P}}_t^0\). For (139) we use (21) and \(({\varvec{B}}-{\varvec{I}})\varvec{1}= {\varvec{0}}\) to deduce
We introduce the ascertainment bias term
which quantifies the effect of replacing the inner expectation \(E\) of \({\varvec{P}}_t^0({\varvec{P}}_t^0)^T\) by \(E_c\). Then we can write
In order to verify (138), we first write
which leads to
and then (138) follows since
\(\square \)
Lemma 2
Let \({\varvec{c}}\) be a \(1\times s\) vector, \({\varvec{C}}\) an \(s\times s\) matrix, and define
Then
with
Proof
Put \({\varvec{c}}=(c_1,\ldots ,c_s)\) and \({\varvec{C}}=(C_{ij})_{i,j=1}^s\). For simplicity, we omit conditioning on \(P_t\) in the notation, writing \(E_c(\cdot ) = E_c(\cdot |P_t)\). Then
using the Cauchy–Schwarz inequality in the fourth step. The last term is identical to the right hand side of (141). \(\square \)
Proof of Theorem 2
When all loci are biallelic (\(n(x)\equiv 2\)), formulas (75) and (78) simplify to
respectively, where
are multilocus extensions of the corresponding numerators and denominators \(X\), \(Y\), \(Z\) of Lemma 1, where also \({\varvec{C}}_Z\) is defined. We let \(P_{ti}(x)\) denote the frequency \(P_{ti}(x,a)\) of an arbitrarily chosen one of the two alleles \(a=1,2\) at locus \(x\) in subpopulation \(i=1,\ldots ,s\), and put \(P_t(x)=\sum _{i=1}^s \gamma _i P_{ti}(x)\), \(P_{ti}^0(x) = P_{ti}(x)-P_t(x)\) and \({\varvec{P}}_t^0(x) = (P_{ti}^0(x);i=1,\ldots ,s)^T\).
It will be convenient to condition on the allele frequency spectrum \({\mathcal {P}}_t = \{P_t(x); \, x=1,\ldots ,n\}\), writing
where
can be deduced from Lemma 1, using the same definitions of \({\varvec{C}}_X\), \({\varvec{C}}_X^\prime \) and \({\varvec{C}}_Y\) as there. Moreover, \({\varvec{V}}_t(x)\), \(\varvec{\Sigma }_t(x)\), \({\varvec{c}}_Y(x)=(1-2P_t(x))({\varvec{w}}-\varvec{\gamma })\), \(\varvec{\mu }_t(x)\), \(\varvec{\zeta }_t(x)\) and \(\varvec{\varsigma }_t(x)\) are the values of \({\varvec{V}}_t\), \(\varvec{\Sigma }_t\), \({\varvec{c}}_Y\), \(\varvec{\mu }_t\), \(\varvec{\zeta }_t\) and \(\varvec{\varsigma }_t\) at locus \(x\). The remaining three quantities of (142) are the residual terms
It follows from the definitions of \(G_{ST}^\mathrm{{ appr},{\varvec{w}}}\) and \(N_{eV}^\mathrm{{ appr},{\varvec{w}}}\) in (51) and (50) that we can write
with
Taking the difference of (142) and (144), we find that
where in the last step we made a second order Taylor expansion. The first term on the right hand side of (145) can be further approximated as
where in the second step we replaced \(\bar{Y}\) and \(\bar{Z}\) by \(Y^\mathrm{{ appr}}\) and \(Z^\mathrm{{ appr}}\), in the third step we approximated the gene diversity
in (76) by its quasi equilibrium limit (20), which is accurate, by a Law of Large Numbers argument, for large \(n\). In the last step of (147) we introduced the constants \(C_1\) and \(C_2\) in order to simplify notation.
By the definition of \(\epsilon _Y\) and \(\epsilon _Z\) we have \(E_c(\epsilon _Y|{\mathcal {P}}_t) = E_c(\epsilon _Z|{\mathcal {P}}_t)=0\), and, since \(\bar{Y}\) and \(\bar{Z}\) are both functions of \({\mathcal {P}}_t\), it follows that the first two terms of the last line of (145) have zero mean. Since all loci are in linkage equilibrium, the terms on the right hand sides of all three equations in (143) are independent for different \(x\). By Lemma 2 it then follows, after some computations, that
for some constant \(C_3^\prime \), independently of \(n\). Combining (145), (147) and (147), using \(P_t(x)(1-P_t(x))\le 1/4\), \(|\text{ tr }({\varvec{C}}_Y({\varvec{V}}_t(x)-{\varvec{V}}))|\le |{\varvec{C}}_Y|_1 |{\varvec{V}}_t(x)-{\varvec{V}}|_\infty \) and analogous estimates for all \(x=1,\ldots ,n\), we find that
Then we use \(|{\varvec{C}}_Z|_1\le 2\) and \(|{\varvec{C}}_Y|_1\le |{\varvec{w}}-\varvec{\gamma }|_1^2\) and let \(t\rightarrow \infty \), in order to deduce that
since, for instance, the limit \(\lim _{t\rightarrow \infty } E_c(|{\varvec{V}}_t(x)-{\varvec{V}}|_\infty ) = |\Delta {\varvec{V}}|^\mathrm{{ eq}}\) in (48) exists for all \(x\). A similar analysis shows that
where
is an asymptotic upper bound for the remainder terms \(\varvec{\varsigma }_t(x)\), defined in the same way as (45)–(47), and
\(\square \)
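The matrix bounds used in the proof above can be checked numerically. The sketch below assumes that \(|\cdot |_1\) denotes the sum of absolute values of all entries and \(|\cdot |_\infty \) the largest absolute entry (an assumption, since the norms are defined elsewhere in the paper), and that \({\varvec{w}}\) and \(\varvec{\gamma }\) are probability vectors. The test matrix `Delta` is a hypothetical stand-in for \({\varvec{V}}_t(x)-{\varvec{V}}\).

```python
import numpy as np

def entrywise_l1(A):
    """Sum of absolute values of all entries (assumed meaning of |A|_1)."""
    return np.abs(A).sum()

def entrywise_max(A):
    """Largest absolute entry (assumed meaning of |A|_inf)."""
    return np.abs(A).max()

rng = np.random.default_rng(1)
s = 4
w = rng.dirichlet(np.ones(s))      # weight vector, sums to one
gamma = rng.dirichlet(np.ones(s))  # second probability vector

# C_Z = diag(w) - w^T w and C_Y = (w - gamma)^T (w - gamma), with w a row vector
C_Z = np.diag(w) - np.outer(w, w)
a = w - gamma
C_Y = np.outer(a, a)

# Hypothetical symmetric perturbation playing the role of V_t(x) - V
Delta = rng.normal(scale=1e-3, size=(s, s))
Delta = (Delta + Delta.T) / 2

lhs = abs(np.trace(C_Y @ Delta))
rhs = entrywise_l1(C_Y) * entrywise_max(Delta)
print(lhs <= rhs, entrywise_l1(C_Z), entrywise_l1(C_Y))
```

Under these assumptions \(|{\varvec{C}}_Z|_1 = 2(1-\sum _i w_i^2)\le 2\) and \(|{\varvec{C}}_Y|_1 = |{\varvec{w}}-\varvec{\gamma }|_1^2\), so the second bound in the proof holds with equality.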
Appendix F: Details from Sect. 11
Proof of Proposition 4
Let \(Q_{ij,kl}\) denote the probability that two different genes from subpopulations \(i\) and \(j\) have their parents in subpopulations \(k\) and \(l\) respectively, and let \(p_{ijk}\) be the coalescence probability defined in (84).
It is possible to compute \(q_{t+1,ij}\) by conditioning on the parental subpopulations \(k\) and \(l\) one generation back in time, and then looking at the ancestry of the parents \(t\) generations back in time. Since coalescence can only occur when \(k=l\), we find that
This equals the recursion in (82), with
On the other hand, we can rewrite the gene diversity recursion (26) as
since \((1-1/(2Nu_i))^{\{i=j\}}\) is the probability that two genes, drawn with replacement from subpopulations \(i\) and \(j\) in generation \(t+1\) are different, and \(H_{tkl}/(1-1/(2Nu_k))^{\{k=l\}}\) is the probability that two different genes from subpopulations \(k\) and \(l\) in generation \(t\) have different alleles. Hence we see from (26) that
from which (83) follows. \(\square \)
We will derive explicit expressions of the matrix elements \(D_{ij,kl}\) in Proposition 4. To this end, one could calculate the coefficients \(U_{ij,kl}\) of the covariance matrix expansion (25), and then use Propositions 1 and 4 in order to find \(D_{ij,kl}\). Alternatively, one may employ coalescence probabilities and obtain the elements of \({\varvec{D}}\) directly from (82). We use this latter approach in order to prove the following:
Proposition 7
Asymptotically, for large populations and reproduction scenario 2, the elements of \({\varvec{D}}\) have the form
where \(p_{ijk}\) is the coalescence probability (84) that two genes from subpopulations \(i\) and \(j\), that have their parents in \(k\), have the same parent, and
For reproduction scenario 3 with \(\alpha _i\equiv \infty \), it holds that
with coalescence probability \(p_{ijk}=1/(2N_{ek})\), so that \(\sigma _{ijk}(N)\) in (84) equals
Nagylaki (2000) has derived a recursion that generalizes (152) when \(N_{ek}=Nu_k\) for probabilities that concern not only the time when but also the subpopulation where coalescence of two genes from subpopulations \(i\) and \(j\) occurs. The constant \(\sigma _{ijk}(N)\) was defined in Hössjer (2011). As mentioned in Sect. 11.1, it can be interpreted as the coalescence rate of a pair of lines from subpopulations \(i\) and \(j\), when both of these migrate backwards to \(k\).
Proof of Proposition 7
In order to establish (150) and (152), we will use (149), and hence we need to find expressions for \(Q_{ij,kl}\) and \(p_{ijk}\). Starting with reproduction scenario 2, we have
since the two genes are drawn without replacement, and an exact fraction \(b_{ik}\) of the parents of the offspring genes of subpopulation \(i\) originate from subpopulation \(k\), and similarly, an exact fraction \(b_{jl}\) of the genes in \(j\) have their parents in \(l\). We can rewrite (154) more compactly as
It follows for instance from Hössjer (2011) that the coalescence probability \(p_{ijk}\) has the form (84), and this completes the proof of (150).
For reproduction scenario 3 with \(\alpha _i\equiv \infty \), we simply have
since the parental subpopulations are drawn independently for two genes of subpopulations \(i\) and \(j\), from the probability distributions corresponding to rows \(i\) and \(j\) of \({\varvec{B}}\). Moreover, the coalescence probability is \(1/(2N_{ek})\), since this is the probability that the two parents in \(k\) originate from the same gene of a breeder, and this completes the proof of (152). \(\square \)
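For reproduction scenario 3 with \(\alpha _i\equiv \infty \), the ingredients of the proof can be assembled into a small numerical sketch. It assumes, following the prose above, that \(Q_{ij,kl}=b_{ik}b_{jl}\), that \(p_{ijk}=1/(2N_{ek})\), and that \(D_{ij,kl}=Q_{ij,kl}(1-p_{ijk})^{\{k=l\}}\). The migration matrix `B`, the sizes `N_e` and the function names are hypothetical illustration values, and `q` is started at one for every pair of distinct genes.

```python
import numpy as np

def build_D(B, N_e):
    """Assumed form D[(i,j),(k,l)] = b_ik * b_jl * (1 - 1/(2 N_ek))^{k==l}.

    Pairs (i,j) and (k,l) are flattened in row-major order.
    """
    B = np.asarray(B, dtype=float)
    s = B.shape[0]
    Q = np.kron(B, B)  # Q[(i,j),(k,l)] = b_ik * b_jl
    surv = np.ones((s, s))
    # Non-coalescence factor applies only when both parents are in the same k
    surv[np.diag_indices(s)] = 1.0 - 1.0 / (2.0 * np.asarray(N_e, dtype=float))
    return Q * surv.reshape(1, s * s)

def non_coalescence(D, t_max):
    """Iterate q_{t+1} = D q_t from q_0 = 1 and return all iterates."""
    q = np.ones(D.shape[0])
    path = [q]
    for _ in range(t_max):
        q = D @ q
        path.append(q)
    return np.array(path)

B = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]])
N_e = np.array([100.0, 200.0, 300.0])

D = build_D(B, N_e)
path = non_coalescence(D, 50)
print(path[-1].reshape(3, 3))  # non-coalescence probabilities after 50 generations
```

For a single subpopulation (\(s=1\)) the matrix reduces to the scalar \(1-1/(2N_e)\), the familiar per-generation probability that two genes do not coalesce.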
Proof of Theorem 3
We will use (87) in order to prove (88). By Perron–Frobenius’ Theorem, there exists a unique largest eigenvalue \(\lambda \) of \({\varvec{D}}\), with corresponding left and right eigenvectors \({\varvec{l}}=(l_{ij})\) and \({\varvec{r}}=(r_{ij})\), which can be normalized so that
By a Jordan decomposition of \({\varvec{D}}\), it follows that
Our asymptotic analysis \(N\rightarrow \infty \) is equivalent to letting the perturbation parameter
tend to zero. In order to highlight the dependence of \({\varvec{D}}={\varvec{D}}(\varepsilon )\) on \(\varepsilon \), we Taylor expand its elements around \(\varepsilon =0\), as
It follows from (150) that \(\dot{{\varvec{D}}}=(\dot{D}_{ij,kl})\) has elements
for reproduction scenario 2 and
for reproduction scenario 3 with \(\alpha _i\equiv \infty \). Clearly, \({\varvec{D}}(0)={\varvec{B}}\otimes {\varvec{B}}\) is the Kronecker product of \({\varvec{B}}\) with itself for either reproduction scenario. It has largest eigenvalue \(\lambda (0)=1\), since \({\varvec{B}}\) is the transition matrix of an irreducible Markov chain, with a unique largest eigenvalue 1. Moreover, the form of the left and right eigenvectors \({\varvec{l}}={\varvec{l}}(\varepsilon )\) and \({\varvec{r}}={\varvec{r}}(\varepsilon )\) can be deduced from the left and right eigenvectors of \({\varvec{B}}\) when \(\varepsilon =0\), as
It follows from perturbation theory of matrices (see for instance Nagylaki 1980 and Van der Aa et al. 2007), that
where
for reproduction scenario 2, with \(C\) as defined in (89). A similar (but simpler) analysis shows that \(\dot{\lambda }= -C\) for reproduction scenario 3 with \(\alpha _i\equiv \infty \). In view of (87), this implies
as \(\varepsilon \rightarrow 0\), or equivalently, as \(N\rightarrow \infty \), thereby proving (88). In the fifth equality of (156) we used that \({\varvec{r}}={\varvec{r}}(\varepsilon )={\underline{{\mathbf{1}}}} + o(1)\) as \(\varepsilon \rightarrow 0\), and in the sixth equality \({\varvec{W}}_T{\underline{{\mathbf{1}}}} = \sum _{i,j} w_iw_j = 1\), regardless of the choice of weight vector \({\varvec{w}}\).
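The perturbation argument above can be illustrated numerically for reproduction scenario 3 with \(\alpha _i\equiv \infty \). The sketch assumes \(N_{ek}=Nu_k\), takes \(\varepsilon =1/(2N)\) as the perturbation parameter, and uses \(D_{ij,kl}(\varepsilon )=b_{ik}b_{jl}(1-\varepsilon /u_k)^{\{k=l\}}\); all three are assumptions made for illustration, as are the values of `B` and `u`. It also assumes that \(\varvec{\gamma }\) is the left eigenvector of \({\varvec{B}}\) with eigenvalue 1. The derivative is computed from the standard first order formula \(\dot{\lambda } = {\varvec{l}}\dot{{\varvec{D}}}{\varvec{r}}/({\varvec{l}}{\varvec{r}})\).

```python
import numpy as np

def left_stationary(B):
    """Left eigenvector of B for eigenvalue 1, normalized to sum to one."""
    vals, vecs = np.linalg.eig(B.T)
    k = np.argmin(np.abs(vals - 1.0))
    g = np.real(vecs[:, k])
    return g / g.sum()

B = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]])
u = np.array([0.2, 0.3, 0.5])  # hypothetical relative subpopulation sizes
s = B.shape[0]
gamma = left_stationary(B)

# Unperturbed matrix D(0) = B (x) B and its eigenvectors for eigenvalue 1
D0 = np.kron(B, B)
r0 = np.ones(s * s)          # right eigenvector: vector of ones
l0 = np.kron(gamma, gamma)   # left eigenvector: gamma (x) gamma

# Derivative of D with respect to eps under the assumed form of D(eps)
ind = np.zeros((s, s))
ind[np.diag_indices(s)] = 1.0 / u
D_dot = -D0 * ind.reshape(1, s * s)

lam_dot = (l0 @ D_dot @ r0) / (l0 @ r0)

eps = 1e-6  # corresponds to N = 500000 when eps = 1/(2N)
lam = np.max(np.real(np.linalg.eigvals(D0 + eps * D_dot)))
print(lam_dot, lam, 1.0 + eps * lam_dot)
```

With these choices the first order coefficient evaluates to \(-\sum _k \gamma _k^2/u_k\), and the largest eigenvalue of \({\varvec{D}}(\varepsilon )\) agrees with \(1+\varepsilon \dot{\lambda }\) up to terms of order \(\varepsilon ^2\).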
We now turn to the proof of (90). It follows from Table 3 that \(\Vert {\varvec{U}}\Vert =O(N^{-1})\) for both reproduction scenarios 2 and 3 (with \(\alpha _i\equiv \infty \)). Invoking the upper part of (33) and (38), we deduce that
where the last step follows from Proposition 3 and the fact that the migration rates are kept fixed. Inserting the last expression into (50), we find that
where
It thus remains to verify, for both reproduction scenarios, that \(C^\prime = C\). Starting with reproduction scenario 2, we find from Table 3 that
with \(C_{kij}=\text{ Cov }(\nu _{ki}^l,\nu _{kj}^l)\). By the assumptions of the theorem, the quantities \(\sigma _{ijk}(N)\) in (151) will converge as \(N\rightarrow \infty \). Since the migration rates in \({\varvec{M}}\) are fixed, it follows that the covariances \(C_{kij}=C_{kij}(N)\) will converge as well. With a slight abuse of notation, we write \(C_{kij}\) also for the asymptotic \(N\rightarrow \infty \) limits. Inserting (158) into (157), we find that
On the other hand, it follows from the definition of \(\sigma _{ijk}\) in (151), that each covariance term \(C_{kij}\) can be rewritten as
Inserting (159) into (157), it follows, after some computations, that
and in view of (156), this proves (90).
For reproduction scenario 3 with \(\alpha _i\equiv \infty \), it follows from Table 3 that
Insertion of this expression into (157) leads to
where \(\sigma _k=\sigma _{ijk}\) is defined in (153). The last step of (160) follows easily by adding a term \(\sigma _k\) on both sides of Eq. (91). \(\square \)
Given two random variables \(X\) and \(Y\), we put \(E_0(Y/X)^*=E_0(Y)/E_0(X)\), where \(E_0(X)=E(X|{\varvec{P}}_0=P_0\varvec{1})\). This ratio is a prediction of \(Y/X\) given that the allele frequencies of the founder generation are the same in all subpopulations. The following proposition shows that \(\bar{f}_{ST}^{{\varvec{w}}}\) and \(f_{ST}^{{\varvec{w}}}\) are weighted averages over \(t\) of \(E_0(\bar{F}_{ST,t}^{{\varvec{w}}})^{*}\) and \(E_0(F_{ST,t}^{{\varvec{w}}})^{*}\) respectively:
Proposition 8
The matrix \(\bar{{\varvec{H}}}_t = (\bar{H}_{tij})_{i,j=1}^s\) of gene diversities, defined for a pair of distinct genes, satisfies
and the fixation index in (98) is a weighted average
of predictions \(E_0\left( \bar{F}^{{\varvec{w}}} _{ST,t}\right) ^{*}\) of the fixation index (95) over different time horizons \(t\), with weights
Analogously, the matrix \({\varvec{H}}_t = (H_{tij})_{i,j=1}^s\) of gene diversities, when the pair of genes is drawn with replacement, satisfies
and the fixation index (99) is a weighted average
with weights
It is implicit from the proof of Theorem 3 that the weights (165) correspond to a probability distribution with mean \(O(N)\), as discussed in Subsection 11.2.
Proof of Proposition 8
By means of an expansion \(({\varvec{I}}-{\varvec{D}})^{-1}=\sum _{t=0}^\infty {\varvec{D}}^t\), it is clear that (98) can be rewritten as
given an assumption that the \(\mu \rightarrow 0\) approximation in (98) is exact. On the other hand, as in the proof of (82), it follows that we get a gene diversity recursion
instead of (29) when two genes are drawn without replacement. We prove (161) by repeated use of (167). This yields
applying (94) with \(t=0\) in the last step. Inserting the definitions of \(\bar{H}_{Tt}^{{\varvec{w}}}\) and \(\bar{H}_{St}^{{\varvec{w}}}\) into (161), this yields
where the last step follows as in the proof of Theorem 3 (see in particular (156)), since
Hence it follows that
By inserting the last equation into (166) we arrive at (162).
Equations (163) and (164) are derived analogously, although the proof is simpler. The reason is that the \(O(N^{-1})\) remainder terms vanish, since \(\text{ vec }({\varvec{H}}_0)=2P_0(1-P_0){\underline{{\mathbf{1}}}}\) holds exactly when \({\varvec{P}}_0=P_0\varvec{1}\). \(\square \)
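The expansion \(({\varvec{I}}-{\varvec{D}})^{-1}=\sum _{t=0}^\infty {\varvec{D}}^t\) used at the start of the proof holds whenever the spectral radius of \({\varvec{D}}\) is below one. A minimal numerical check follows, with a hypothetical random substochastic matrix standing in for \({\varvec{D}}\) (the function name `neumann_sum` is likewise an illustration).

```python
import numpy as np

def neumann_sum(D, terms):
    """Partial sum I + D + D^2 + ... + D^(terms-1)."""
    n = D.shape[0]
    total = np.zeros((n, n))
    power = np.eye(n)
    for _ in range(terms):
        total += power
        power = power @ D
    return total

rng = np.random.default_rng(0)
n = 6
A = rng.random((n, n))
# Rescale so that every row sums to 0.9; the spectral radius is then 0.9
D = 0.9 * A / A.sum(axis=1, keepdims=True)

approx = neumann_sum(D, 400)
exact = np.linalg.inv(np.eye(n) - D)
print(np.max(np.abs(approx - exact)))
```

The truncation error decays geometrically at the rate of the largest eigenvalue, which is why the weights in the proposition have a mean of order \(N\) when that eigenvalue is \(1-O(N^{-1})\).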
Proof of Theorem 4
It will be convenient to rewrite (27) as
where \({\varvec{G}}=(G_{ij,kl})\) has elements \(G_{ij,kl}=b_{ik}b_{jl}\). The Jordan decomposition of \({\varvec{B}}\) in Appendix A implies that \({\varvec{B}}^0{\varvec{B}}^{t-1} = {\varvec{B}}^{t-1}{\varvec{B}}^0 = ({\varvec{B}}^0)^{t}\) for any non-negative integer \(t\). Since \({\varvec{G}}^0 = {\varvec{B}}^0\otimes {\varvec{B}}^0\) and \({\varvec{G}}={\varvec{B}}\otimes {\varvec{B}}\), it is easy to see that this implies
A similar calculation as in the proof of Theorem 3 (see in particular (156)) yields
where \(\lambda \) is the unique largest eigenvalue of \({\varvec{G}}-{\varvec{U}}\). We will also make use of the fact that
which follows since \({\varvec{w}}=\varvec{\gamma }\) and
with \(\varvec{1}\) a column vector of length \(s\), and
Based on these preliminaries, we can rewrite the numerator of (99) as
using (169), (171) and the fact that \(({\varvec{W}}_S-{\varvec{W}}_T){\varvec{G}}^t{\underline{{\mathbf{1}}}} = ({\varvec{W}}_S-{\varvec{W}}_T){\underline{{\mathbf{1}}}} = 0\) in the third step, a change of variables \(\alpha = t-\tau -1\) in the fourth step and (170) in the fifth step. Formula (170) also implies that the denominator of (99) equals
In view of (168), we obtain formula (100) by taking the ratio of the last two displayed equations. In order to prove that \(F_{ST}^\mathrm{{ appr}}\) equals the right hand side of (100) as well, it follows, by the definition of \(\varvec{\Pi }\) in (34), that
which we can rewrite in vector format, as
A similar calculation shows that
which we rewrite as
This yields
where in the first step we used the definition (51) of \(F_{ST}^\mathrm{{ appr},{\varvec{\gamma }}}\), in the third step the expansion (36) of \(\text{ vec }({\varvec{V}})\) and in the fifth step the assumption \(\Vert {\varvec{U}}\Vert =O(N^{-1})\), (172), (173) and the second part of (171).
Finally, formula (101) is proved in the same way as (100), replacing \({\varvec{U}}\) by \(\bar{{\varvec{U}}}={\varvec{B}}\otimes {\varvec{B}}-{\varvec{D}}\) everywhere. \(\square \)
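The matrix identities used at the start of the proof of Theorem 4 can be checked numerically. The sketch assumes that \({\varvec{B}}^0={\varvec{B}}-\varvec{1}\varvec{\gamma }\), with \(\varvec{\gamma }\) the left eigenvector of \({\varvec{B}}\) for eigenvalue 1; this definition belongs to Appendix A and is an assumption here. The matrix `B` is a hypothetical example.

```python
import numpy as np

def left_stationary(B):
    """Left eigenvector of B for eigenvalue 1, normalized to sum to one."""
    vals, vecs = np.linalg.eig(B.T)
    k = np.argmin(np.abs(vals - 1.0))
    g = np.real(vecs[:, k])
    return g / g.sum()

B = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]])
s = B.shape[0]
gamma = left_stationary(B)

# Assumed definition: B^0 removes the projection onto the eigenvalue-1 direction
B0 = B - np.outer(np.ones(s), gamma)

t = 5
B_pow = np.linalg.matrix_power(B, t - 1)
B0_pow = np.linalg.matrix_power(B0, t)

# Kronecker versions: G = B (x) B and G^0 = B^0 (x) B^0
G = np.kron(B, B)
G0 = np.kron(B0, B0)
G_pow = np.linalg.matrix_power(G, t - 1)
G0_pow = np.linalg.matrix_power(G0, t)

print(np.allclose(B0 @ B_pow, B0_pow), np.allclose(G0 @ G_pow, G0_pow))
```

The Kronecker identity follows from the mixed-product property \(({\varvec{A}}\otimes {\varvec{A}})({\varvec{C}}\otimes {\varvec{C}}) = ({\varvec{A}}{\varvec{C}})\otimes ({\varvec{A}}{\varvec{C}})\).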
In order to compare the sizes of the fixation indices when genes are drawn with and without replacement, we formulate the following result:
Proposition 9
The fixation index in (99) can be written as
In particular, for a strong migration limit where \(N\rightarrow \infty \) while the migration rates in \({\varvec{M}}\) are kept fixed, it holds that
In order to illustrate this result, consider the island model under panmixia (\(m=1\)), for which it is well known that \(\bar{f}_{ST}=0\) for the canonical and uniform weighting scheme \(w_i=1/s\), reflecting the fact that subpopulations on the average are identical. However, even under panmixia, there will still be small differences between subpopulations. It is shown in Hössjer (2013) (see also Latter and Sved 1981) that the replacement version \(f_{ST}\) of the fixation index captures this, in terms of a nonzero value \(f_{ST}=(s-1)/(2N) + o(N^{-1})\). It also follows from Hössjer et al. (2013) or (68) that the replacement version of the quasi equilibrium approximation of the fixation index satisfies \(F_{ST}^\mathrm{{ appr}}= (s-1)/(2N)\) under panmixia.
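The order of magnitude \((s-1)/(2N)\) under panmixia can be illustrated by a heuristic Monte Carlo sketch. It is not the dynamic model of the paper: it assumes that each of \(s\) equally sized subpopulations holds \(2N/s\) genes drawn binomially from a common allele frequency `p`, uses uniform weights \(w_i=1/s\), and forms the ratio of the Monte Carlo averages of \(\sum _i w_i(P_i-P^{{\varvec{w}}})^2\) and \(P^{{\varvec{w}}}(1-P^{{\varvec{w}}})\). The function name and parameter values are hypothetical.

```python
import numpy as np

def panmixia_fst_mc(s, N, p, reps, seed=0):
    """Ratio-of-averages estimate of F_ST for one round of binomial sampling."""
    rng = np.random.default_rng(seed)
    genes = 2 * N // s  # genes per subpopulation (assumes s divides 2N)
    freqs = rng.binomial(genes, p, size=(reps, s)) / genes
    total = freqs.mean(axis=1)  # uniformly weighted overall frequency
    num = ((freqs - total[:, None]) ** 2).mean(axis=1)
    den = total * (1.0 - total)
    return num.mean() / den.mean()

s, N = 5, 500
estimate = panmixia_fst_mc(s, N, p=0.5, reps=50000)
reference = (s - 1) / (2 * N)
print(estimate, reference)
```

The estimate should lie close to \((s-1)/(2N)\), up to Monte Carlo error and terms of smaller order in \(1/N\).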
Proof of Proposition 9
We have that
since the probability is \(\left( 1-1/(2Nu_i)\right) ^{\{i=j\}}\) that two genes are not the same when drawn with replacement, and given this, they are different by state with probability \(\bar{h}_{ij}^{{\varvec{w}}}\), as defined in (96). It then follows from (97), and the analogous definitions of \(h_S^{{\varvec{w}}}\) and \(h_T^{{\varvec{w}}}\) in terms of \(h_{ij}^{{\varvec{w}}}\), that
By inserting these two equations into (99), we arrive at (174).
When migration rates are fixed and \(N\rightarrow \infty \), we have \(\bar{h}_{ij}=\bar{h}_T^{{\varvec{w}}}(1+O(N^{-1}))\) for all \(i,j\), and hence (174) implies
which can be simplified to (175). \(\square \)
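The relation between the two versions of gene diversity used in the proof can be turned into a short computation. The sketch assumes \(h_{ij}=(1-1/(2Nu_i))^{\{i=j\}}\bar{h}_{ij}\), as described above, together with the usual definitions \(h_S=\sum _i w_i h_{ii}\), \(h_T=\sum _{i,j} w_iw_j h_{ij}\) and \(f_{ST}=1-h_S/h_T\); the latter three are assumptions standing in for (97) and (99). The function name and input values are hypothetical.

```python
import numpy as np

def fst_with_replacement(h_bar, w, u, N):
    """f_ST for genes drawn with replacement, from without-replacement diversities."""
    h_bar = np.asarray(h_bar, dtype=float)
    w = np.asarray(w, dtype=float)
    u = np.asarray(u, dtype=float)
    s = len(w)
    factor = np.ones((s, s))
    # Probability that two genes drawn with replacement from i are distinct
    factor[np.diag_indices(s)] = 1.0 - 1.0 / (2.0 * N * u)
    h = factor * h_bar
    h_S = np.sum(w * np.diag(h))  # within-subpopulation diversity
    h_T = w @ h @ w               # total diversity
    return 1.0 - h_S / h_T

# Panmixia: all without-replacement diversities equal, uniform weights and sizes
s, N = 10, 1000
h_bar = np.full((s, s), 0.3)
w = np.full(s, 1.0 / s)
u = np.full(s, 1.0 / s)

f_st = fst_with_replacement(h_bar, w, u, N)
print(f_st, (s - 1) / (2 * N))
```

In this panmictic example the computation gives \((s-1)/(2N-1)\), which agrees with \((s-1)/(2N)\) up to \(o(N^{-1})\).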
Cite this article
Hössjer, O., Ryman, N. Quasi equilibrium, variance effective size and fixation index for populations with substructure. J. Math. Biol. 69, 1057–1128 (2014). https://doi.org/10.1007/s00285-013-0728-9
DOI: https://doi.org/10.1007/s00285-013-0728-9
Keywords
- Autoregressive time series
- Island model
- Quasi equilibrium
- Stepping stone models
- Spatial allele frequency fluctuations
- Structured populations
- Temporal allele frequency fluctuations