Assessment of the Global Variance Effective Size of Subdivided Populations, and Its Relation to Other Effective Sizes

The variance effective population size ($N_{eV}$) is frequently used to quantify the expected rate at which a population's allele frequencies change over time. The purpose of this paper is to find expressions for the global $N_{eV}$ of a spatially structured population that are of interest for conservation of species. Since $N_{eV}$ depends on allele frequency change, we start by dividing the cause of allele frequency change into genetic drift within subpopulations (I) and a second component mainly due to migration between subpopulations (II). We investigate in detail how these two components depend on the way in which subpopulations are weighted, as well as their dependence on parameters of the model such as migration rates and local effective and census sizes. It is shown that under certain conditions the impact of II is eliminated, and $N_{eV}$ of the metapopulation is maximized, when subpopulations are weighted proportionally to their long term reproductive contributions.
This maximal $N_{eV}$ is the sought-for global effective size, since it approximates the gene diversity effective size $N_{eGD}$, a quantifier of the rate of loss of genetic diversity that is relevant for conservation of species and populations. We also propose two novel versions of $N_{eV}$, one of which (the backward version of $N_{eV}$) is most stable, exists for most populations, and is closer to $N_{eGD}$ than the classical notion of $N_{eV}$.
Expressions for the optimal length of the time interval for measuring genetic change are developed that make it possible to estimate any version of $N_{eV}$ with maximal accuracy.


Background on Effective Population Sizes
The effective population size $N_e$ is a well known concept (Wright 1931, 1938) that quantifies the rate at which genetic variation of a population is lost over time. This is important in conservation biology, where retention of sufficient levels of genetic diversity to allow adaptation to changing environmental conditions is of major concern for the long term viability and conservation of species and populations (Frankham et al. 2010; Traill et al. 2010; Hoban et al. 2021; Allendorf et al. 2022). Since many populations exhibit some type of geographic substructure, it is crucial to assess in which way and how much this impacts $N_e$. Typically, such a structure is modelled as a metapopulation that consists of a number of more or less connected subpopulations. For short term conservation of species it is mainly genetic drift within and migration between subpopulations that impact $N_e$, whereas mutation and natural selection are usually ignored.
Many versions of effective size have been proposed, as recently discussed by Gilbert and Whitlock (2015), Wang (2016), Waples (2016), Ryman et al. (2019), and Nadachowska-Brzyska et al. (2022). In this paper we focus on the variance effective size $N_{eV}$ (Crow 1954), where loss of genetic variation is quantified in terms of the variance of frequency change of genetic variants (alleles). If genetic data are available from at least two points in time, the temporal method (Krimbas and Tsakas 1971; Nei and Tajima 1981; Pollak 1983; Waples 1989; Jorde and Ryman 2007) can be employed to estimate $N_{eV}$. For this reason $N_{eV}$ is one of the most frequently used notions of effective size, and it is recommended because it is multigenerational (Frankham et al. 2019; Frankham 2021). On the other hand, $N_{eV}$ is typically not the best effective size for assessing the rate at which genetic diversity is lost in substructured populations. Since this rate is an important criterion for conservation of species, this is potentially a drawback of $N_{eV}$ (Ryman et al. 2019).
In order to find out whether versions of $N_{eV}$ for substructured populations exist that are more appropriate for estimating $N_e$ for conservation purposes, a first step is to understand how various parameters of a population genetic model influence $N_{eV}$. To this end, it is important to build a mathematical framework for how the genetic makeup of a population evolves over time, and then find expressions for the variance of allele frequency change. Using such an approach, Whitlock and Barton (1997) noted that $N_{eV}$ is a function of several parameters, such as the local effective sizes of subpopulations under isolation, the migration pattern between subpopulations, and the way in which subpopulations are weighted in order for $N_{eV}$ to reflect, for instance, local or global aspects of the variance effective size. Hössjer et al. (2014) added local census sizes to the model and considered subpopulation weights of general form. Hössjer et al. (2016) noticed that previous analyses of $N_{eV}$ had been overly simplistic and neglected the impact of subpopulation differentiation at the first time point at which genetic data is collected.

Objectives
The purpose of this paper is to find versions of $N_{eV}$ for the metapopulation that are of interest for conservation, by appropriately quantifying the rate of loss of genetic diversity. To this end, we will generalize work of Whitlock and Barton (1997), Ryman et al. (2014) and Hössjer et al. (2014, 2016) and study the variance effective size of structured populations by means of matrix analytic methods, where standardized covariances of allele frequency change and gene diversities (Nei 1973) are updated recursively over time. More specifically we consider three aspects of $N_{eV}$: (i) A careful analysis of allele frequency change and subpopulation weights, which will lead us to a version of $N_{eV}$ that is of interest for conservation, (ii) Introduction of two novel and more stable ways of defining $N_{eV}$, both of which have versions that are of relevance for conservation, (iii) Finding expressions for the length of the interval between the two time points at which genetic data is collected, which is optimal in terms of estimating $N_{eV}$ with maximal accuracy. In the rest of this section, we describe these three steps in more detail.
For the first contribution (i), following Hössjer et al. (2016) we divide expected squared allele frequency change between the two time points at which genetic data is collected into two components I and II, and study conditions under which the impact of II is negligible. In order to motivate more closely this first aspect of our article, we will start by explaining the meaning of these two terms I and II.
The first component I was analyzed by Whitlock and Barton (1997) and Hössjer et al. (2014), and it quantifies how much standardized covariances of allele frequency change increase or how much the gene diversity decreases between the two time points at which genetic data is collected. We will refer to I as the drift term, since it is mainly genetic drift that causes loss of genetic variation, and for this reason I is usually the most important source of genetic change. Indeed, gene diversity decrease I is equivalent to gene identity increase, a haploid approximation of increased inbreeding that is of major concern for short term protection of species (Franklin 1980; Jamieson and Allendorf 2012). For this reason the gene diversity effective size $N_{eGD}$ is of interest for conservation since it only involves the genetic drift term I but not the other term II. $N_{eGD}$ is however of more relevance for long term than for short term conservation since it approximates the additive genetic variance effective size $N_{eAV}$ (Franklin 1980; Hössjer et al. 2016). This is important since the frequently used conservation guideline for long term survival, that stipulates that $N_e$ should be larger than 500 (Franklin 1980; Jamieson and Allendorf 2012) or larger than 1000 (Frankham et al. 2014; Pérez-Pereira et al. 2022), relates to $N_{eAV}$ (Ryman et al. 2019). However, since $N_{eGD}$ (and $N_{eAV}$) is difficult to estimate in practice, it is important to assess how well it is approximated by $N_{eV}$.
The second term II was introduced in Hössjer et al. (2016) and it quantifies how much allele frequency change in the past, before the first time point when data is collected, is correlated with allele frequency change between the two time points of data collection, with a negative correlation corresponding to a positive value of II. We could therefore refer to −II as a correlation between allele frequency change of the past and the present. But since II vanishes when all subpopulations are isolated and additionally the same subpopulation weights are used at the two time points at which genetic data is collected, it follows that II is mainly caused by migration. For this reason we will speak of II as a migration or gene flow term. It is also the case that II is present only when there is subpopulation differentiation at the first time point of data collection.
In order to shed further light on the relation between $N_{eGD}$ and $N_{eV}$, we continue the analysis of Hössjer et al. (2016) and express genetic drift I and gene flow II in terms of how subpopulations are weighted, and also in terms of parameters of the population genetic model such as the local census and effective sizes and the migration rates between subpopulations. In particular, we demonstrate that II is highly dependent on local census sizes, whereas I is virtually independent of them (although local census sizes were introduced in Hössjer et al. (2014), they had little impact on $N_{eV}$ since II was not included as a component of allele frequency change in that article).
It is of particular interest to find conditions under which it is possible to estimate $N_{eV}$ in such a way that II is eliminated. It was shown in Hössjer et al. (2016) that the gene flow contribution II to the variance effective size vanishes when subpopulations are weighted proportionally to their long term reproductive contribution (Hill 1972; Nagylaki 1980; Whitlock and Barton 1997). This means that each subpopulation receives a weight that corresponds to the fraction of ancestors, many generations ago, that originated from this particular subpopulation. When subpopulations are weighted in this way, the overall frequency of an allele in the metapopulation changes over time in such a way that only genetic drift (term I) contributes, whereas the effects of migration into different subpopulations cancel out (II = 0). These so called reproductive subpopulation weights give rise to a version of the variance effective size that we refer to as $N_{eVMeta}$. It turns out that $N_{eVMeta}$ is of particular interest for long term conservation, since under migration-drift equilibrium $N_{eVMeta}$ not only equals $N_{eGD}$ and $N_{eAV}$, but also the eigenvalue effective size $N_{eE}$ (Ewens 1982, 2004), which is known to reflect the long term genetic behavior of a population.
In spite of the relevance of $N_{eVMeta}$ for conservation, it is a challenge to use this effective size in practice since its subpopulation weights involve migration rates between subpopulations, which are difficult to estimate. It is possible, though, to find simplified expressions for I and II under migration-drift equilibrium, using perturbation theory and eigenvalue decomposition of matrices (Horn and Johnson 1985; Friswell 1996; Van der Aa et al. 2007). Although perturbation results for eigenvalues have previously been applied to population genetics (Maruyama 1970a; Nagylaki 1980, 1995; Hössjer 2015), it seems that our perturbation results for eigenvectors are new. Based on this analysis we demonstrate, for some particular models, that it is possible to eliminate the impact II of migration by maximizing the variance effective size with respect to subpopulation weights, so that the corresponding $N_{eV}$ approximates $N_{eVMeta}$.
For the second contribution (ii) of this article, we demonstrate that when the impact II of gene flow is not eliminated, under certain conditions it elevates allele frequency change over long time intervals to such an extent that the traditional (forward) version of $N_{eV}$ is undefined. For this reason we define two novel notions of variance effective size, the intermediate and backward versions of $N_{eV}$. The intermediate version corresponds to a frequently used estimator of variance effective size due to Jorde and Ryman (2007), and although it is more stable than the forward version of $N_{eV}$, it shares the drawback of sometimes being undefined for long time intervals of genetic change. The backward version of $N_{eV}$, on the other hand, exists for most populations, and it is also the version of variance effective size that most closely relates to $N_{eGD}$ and $N_{eE}$, since it lessens the impact of the gene flow term II more than the other two versions of variance effective size. We demonstrate, numerically and analytically, that the forward, intermediate and backward versions of the local variance effective size are very close for large subdivided populations, unless the time interval is very long. On the other hand, the three effective sizes differ substantially for a small and subdivided population, and moderate or large time intervals.
For the third contribution (iii) of this article, we give explicit expressions for the length of the time interval that maximizes the accuracy of estimates of $N_{eGD}$ and all three versions of $N_{eV}$, for any type of subpopulation weights. This optimal length is proportional to the eigenvalue effective size $N_{eE}$, with a constant of proportionality that depends on characteristics of the population as well as the type of effective size being used, including subpopulation weights. This reinforces that the three variance effective sizes behave differently for small subdivided populations (when $N_{eE}$ is small).
Our paper is organized as follows: We start by defining the population genetic model in Sect. 2, and the framework of genetic variation in terms of one single biallelic marker in Sect. 3. This makes it possible in Sect. 4 to introduce the matrix analytic framework for how covariances of allele frequency change, gene diversities and fixation indices evolve over time. The various notions of effective size are introduced in Sect. 5, migration-drift equilibrium is the topic of Sect. 6, the impact of the length of the time interval on effective size is analyzed in Sect. 7, and the optimal time interval in terms of accurately estimating effective size is studied in Sect. 8. An analysis of a real data set in Sect. 9 and a discussion in Sect. 10 conclude the paper. A summary of the most important notation is provided in Table 1, whereas some of the numerical results and all proofs are collected in the appendices.

Population Genetic Model
We will study the genetic composition of a structured (or subdivided) population that evolves over time in terms of non-overlapping generations $t = -T, -T+1, \ldots$, where $-T \le 0$ is a founder generation. The population has s subpopulations $x = 1, \ldots, s$, whose local census sizes $N_{cx}$ and local effective sizes $N_{ex}$ under isolation do not change over time. The subpopulations are not isolated, but rather connected through gene flow, as summarized by an irreducible backward migration matrix $\mathbf{B} = (B_{xy})$ of order s, where $B_{xy}$ is the expected fraction of gene copies in x that in the previous generation migrated from y.
The most well known type of subdivided population is the island model (Wright 1943; Maruyama 1970b) […] of subpopulations are the same as well, so that in each generation $m' = 1 - B_{xx}$ is the fraction of offspring of subpopulation x whose parents migrated from any other subpopulation $\{y;\, y \ne x\}$. On the other hand, m can be thought of as the fraction of offspring of x whose parents originate from a global gene pool, with equal contribution from all subpopulations (including x itself). The one- and two-dimensional stepping stone models (Kimura 1953; Kimura and Weiss 1964; Weiss and Kimura 1965; Durrett 2008) correspond to a subdivided population where migration from y to x ($B_{xy} > 0$) is possible only when these two subpopulations are neighbors.
It is assumed that the population reproduces in such a way that migration precedes fertilization. More specifically, reproduction between generations t and t + 1 involves the following three steps:
1. (Gamete formation) Within each subpopulation x of generation t an infinitely large pre-migration gene pool is constructed as follows: $2N_{ex}$ gene copies (corresponding to $N_{ex}$ diploid breeders) are drawn without replacement from all […]

Table 1 (excerpt)

$\mathbf{e}_x$
Vector of local subpopulation weights ($\mathbf{e}_x = (0, \ldots, 0, 1, 0, \ldots, 0)$) that only assigns a positive weight to subpopulation x (a 1 in position x)

$\boldsymbol{\gamma}$
Vector of reproductive subpopulation weights ($\boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_s)$). $\gamma_x$ is the fraction of ancestors originating from subpopulation x many generations ago

[…]
Generic notation for effective size of type Q at migration-drift equilibrium

$N_{eQ\mathbf{w}\mathbf{v}}$
Effective population size when subpopulations are weighted as $\mathbf{w}$ and $\mathbf{v}$ at the two end points of the interval along which genetic change is assessed

$N_{eQ\mathbf{w}}$
Effective population size when subpopulations are weighted as $\mathbf{w}$ at both end points of the interval along which genetic change is assessed ($= N_{eQ\mathbf{w}\mathbf{w}}$)

$N_{eQMeta}$
Effective size of type Q for the metapopulation ($= N_{eQ\boldsymbol{\gamma}} = N_{eQ\boldsymbol{\gamma}\boldsymbol{\gamma}}$)
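
As a small illustration of the island model of this section, the following Python sketch builds a backward migration matrix in which every offspring gene copy has its parent in a global pool with probability m, the pool drawing equally from all s subpopulations. The function name `island_backward_matrix` and the numerical values are illustrative assumptions, not part of the paper.

```python
import numpy as np


def island_backward_matrix(s: int, m: float) -> np.ndarray:
    """Backward migration matrix B of an island model (illustrative helper).

    With probability 1 - m the parent of a gene copy comes from the own
    subpopulation, and with probability m from a global pool to which all
    s subpopulations (including the own one) contribute equally.
    """
    if s < 1:
        raise ValueError("s must be a positive integer")
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie in [0, 1]")
    return (1.0 - m) * np.eye(s) + (m / s) * np.ones((s, s))


B = island_backward_matrix(s=4, m=0.2)
m_prime = 1.0 - B[0, 0]  # fraction of parents from *other* subpopulations
print(B.sum(axis=1))     # each row of a backward migration matrix sums to one
print(m_prime)           # equals m * (1 - 1/s) up to rounding
```

Under this parametrization the two migration rates of the text are related by m' = m(1 − 1/s), so they nearly coincide when the number of subpopulations is large.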

Genetic Variation at a Biallelic Marker
Our main focus is to study how the genetic composition of the population of Sect. 2 changes between two time points $t$ and $t + \tau$, where $\tau$ is a positive integer. Typically genetic data from many biallelic markers are used to represent the genetic composition at time $t$ and $t + \tau$. For our theoretical investigations in Sects. 3-8, it will be sufficient though to study one single biallelic marker, as a representative of any of the markers that are part of the data set. For this reason we consider a marker with alleles A and a and let $p_t$ be the frequency of allele A in generation t. For a subdivided population we need to keep track of the frequency $p_{tx}$ of A in all subpopulations x at each time point t. In order to obtain one single allele frequency at time $t$ and $t + \tau$, we will weight subpopulations as $\mathbf{w} = (w_1, \ldots, w_s)$ and $\mathbf{v} = (v_1, \ldots, v_s)$ at these two time points, where $w_x$ and $v_x$ are non-negative numbers satisfying
$$\sum_{x=1}^{s} w_x = \sum_{x=1}^{s} v_x = 1.$$
The accompanying subpopulation weighted frequencies of A, at time $t$ and $t + \tau$, are
$$p_t = \sum_{x=1}^{s} w_x p_{tx}, \qquad p_{t+\tau} = \sum_{x=1}^{s} v_x p_{t+\tau,x}. \tag{1}$$
The subpopulation weights in (1) play a crucial role in this paper. They may for instance reflect the sampling scheme of time points $t$ and $t + \tau$, although this is not necessary. Local subpopulation weights at time t correspond to giving some subpopulation x full weight ($w_x = 1$), whereas none of the other subpopulations contribute to $p_t$ ($w_y = 0$ for any $y \ne x$). With vector notation this is phrased as $\mathbf{w} = \mathbf{e}_x$, where $\mathbf{e}_x = (0, \ldots, 0, 1, 0, \ldots, 0)$ has a one in position x and zeros elsewhere.
Similarly, $v_x = 1$ if subpopulation x receives full weight at time $t + \tau$, or equivalently $\mathbf{v} = \mathbf{e}_x$. Global subpopulation weights at time $t$ and $t + \tau$ assign positive values $w_x > 0$ and $v_x > 0$ respectively, to all subpopulations x. When the long term evolution of the population is of interest, it is appropriate to use reproductive subpopulation weights $\mathbf{w} = \mathbf{v} = \boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_s)$ at both time points, since a fraction $\gamma_x$ of all gene copies originated from subpopulation x many generations ago (Nagylaki 1980, 2000; Hössjer and Ryman 2014). This weight vector is the equilibrium distribution of a Markov chain with state space $\{1, \ldots, s\}$ and transition matrix $\mathbf{B}$, and it corresponds to a probability distribution for the subpopulation ancestry of a gene copy. Assuming that $\mathbf{B}$ is irreducible, $\boldsymbol{\gamma}$ is the unique probability vector satisfying $\boldsymbol{\gamma} = \boldsymbol{\gamma}\mathbf{B}$, with $\boldsymbol{\gamma} = (1, \ldots, 1)/s$ for the island model. Consequently, $\gamma_x$ quantifies the long term contribution of x to the metapopulation, or as mentioned above, $\gamma_x$ is the fraction of ancestors that originated from x, many generations back in time.
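
The reproductive weights are the stationary distribution of the backward migration matrix, so they can be computed numerically once the matrix is known. The sketch below is one possible way to do this with NumPy, solving the stationarity condition together with the normalization as a least-squares system; the function names and the 2 × 2 example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def reproductive_weights(B) -> np.ndarray:
    """Probability vector g with g = g B (B assumed irreducible, rows summing to 1)."""
    B = np.asarray(B, dtype=float)
    s = B.shape[0]
    # Stack the s equations (B - I)^T g = 0 with the normalization sum(g) = 1.
    lhs = np.vstack([(B - np.eye(s)).T, np.ones((1, s))])
    rhs = np.zeros(s + 1)
    rhs[-1] = 1.0
    g, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return g


def weighted_frequency(weights, p_sub) -> float:
    """Subpopulation-weighted allele frequency, sum_x w_x * p_x."""
    return float(np.dot(weights, p_sub))


# Hypothetical asymmetric migration between two subpopulations.
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])
gamma = reproductive_weights(B)
print(gamma)                                  # approximately (0.75, 0.25)
print(weighted_frequency(gamma, [0.2, 0.6]))  # approximately 0.30
```

Passing a unit vector as `weights` to `weighted_frequency` returns the local frequency of a single subpopulation, which corresponds to the local weighting scheme.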

Standardized Covariances, Gene Diversities, and Fixation Indices
In this section we define a number of concepts needed in Sect. 5 when various types of effective size are introduced. Following Hössjer et al. (2016), assume that all subpopulations have the same frequency $p_{-T,x} = p$ of allele A at the founder generation at $t = -T$. This is no essential restriction, since we will mainly consider equilibrium conditions when $T \to \infty$.

Standardized Covariances
The standardized covariance between a pair x, y of subpopulations at time point $t \in \{-T, -T+1, \ldots\}$ is defined as
$$f_{txy} = \frac{\operatorname{Cov}(p_{tx}, p_{ty})}{p(1-p)}, \qquad h_{txy} = 1 - f_{txy}. \tag{2}$$
Equivalently, $f_{txy}$ is the correlation coefficient between the alleles of two gene copies drawn independently from subpopulations x and y at time t (with replacement if x = y), see for instance Cockerham (1969). It was shown in Hössjer et al. (2014) that the column vector $\mathbf{h}_t = (h_{txy})$ of length $s^2$ satisfies a recursive relation
$$\mathbf{h}_{t+1} = \mathbf{A}\mathbf{h}_t, \tag{3}$$
where $\mathbf{A} = (A_{xy,zu})$ is a square matrix of order $s^2$ whose elements are given in (4).
Similar types of recursions were originally developed by Malécot (1951), see also Whitlock and Barton (1997). Since all standardized covariances vanish at the founder generation, it follows that
$$\mathbf{h}_{-T} = \mathbf{1}, \tag{5}$$
where $\mathbf{1}$ is a vector of $s^2$ ones. Notice that the initial condition (5) and the linear recursion (3) determine the value of $h_{txy}$ for all t, x, y.

Gene Diversities
The gene identity (gene diversity) $F_{txy}$ ($H_{txy}$) between a pair of subpopulations at time t is the probability that two randomly chosen gene copies of subpopulations x and y, drawn with replacement if x = y, have the same (different) alleles. It turns out that the time recursive behavior of gene diversities is very similar to that of standardized covariances. In order to motivate this we notice that allele frequencies at time $t > -T$ are unknown from the perspective of the base generation $-T$, and therefore
$$H_{txy} = E\left[p_{tx}(1 - p_{ty}) + p_{ty}(1 - p_{tx})\right].$$
This implies that the gene diversities at time $t$ and $t + \tau$ are given by
$$H_t = \sum_{x,y} w_x w_y H_{txy}, \qquad H_{t+\tau} = \sum_{x,y} v_x v_y H_{t+\tau,xy}. \tag{6}$$
By this we mean that $H_t$ is the probability that two gene copies, drawn randomly with replacement from the population at time t, have different alleles, given that $w_x$ is the probability of drawing each gene from x. Likewise, $H_{t+\tau}$ is the probability that two randomly drawn gene copies at time $t + \tau$ have different alleles, if subpopulations are chosen with probabilities $v_x$. It was shown in Hössjer et al. (2014) that the column vector $\mathbf{H}_t = (H_{txy})$ of length $s^2$ satisfies the same recursion as in (3), i.e.
$$\mathbf{H}_{t+1} = \mathbf{A}\mathbf{H}_t \tag{7}$$
for $t = -T, -T+1, \ldots$. Since $p_{-T,x} = p$ by assumption, we have that $H_{-T,xy} = 2p(1-p)$ for all x, y, and consequently
$$\mathbf{H}_{-T} = 2p(1-p)\mathbf{1}. \tag{8}$$
Comparing (3) and (5) with (7) and (8), we find that the numbers $h_{txy}$ obtained from the standardized covariances are equivalent to the gene diversities $H_{txy}$, up to a multiplicative constant, i.e.
$$H_{txy} = 2p(1-p)h_{txy} \tag{9}$$
for all t, x, y.
A concept closely related to (6) is the collection of gene diversities without replacement. They are denoted $\tilde{H}_t$ and $\tilde{H}_{t+\tau}$ and defined in (10) as the probabilities that two gene copies, drawn randomly without replacement at the same time point $t$ and $t + \tau$ respectively, have different alleles. Likewise, $\tilde{H}_{txy}$ is the probability that two gene copies, drawn at random without replacement from x and y at time t, have different alleles. It was shown in Hössjer et al. (2014) that the column vector $\tilde{\mathbf{H}}_t = (\tilde{H}_{txy})$ of gene diversities without replacement satisfies a recursion
$$\tilde{\mathbf{H}}_{t+1} = \mathbf{D}\tilde{\mathbf{H}}_t \tag{11}$$
for $t = -T, -T+1, \ldots$, where $\mathbf{D} = (D_{xy,zu})$ is a square matrix of order $s^2$ whose elements are given in (12). Note that the definitions of $\tilde{H}_{txy}$ and $H_{txy}$ are the same when $x \ne y$. Moreover, since $(2N_{cx} - 1)/(2N_{cx})$ is the probability that two gene copies, drawn randomly with replacement from x, are different copies, it follows that
$$H_{txy} = \begin{cases} \dfrac{2N_{cx} - 1}{2N_{cx}}\,\tilde{H}_{txx}, & x = y, \\ \tilde{H}_{txy}, & x \ne y, \end{cases} \tag{13}$$
for all x, y. In particular, a comparison between (8) and (13) reveals that $\tilde{\mathbf{H}}_{-T}$ is virtually independent of the census sizes $N_{cx}$ when these are large. Since the elements (12) of the linear recursion matrix $\mathbf{D}$ do not involve any census sizes, it follows that $\tilde{H}_{txy}$ are virtually independent of census sizes as well, for any t, x, y. Making use of (13) again, we conclude that the gene diversities $H_{txy}$ are virtually independent of the local census sizes as well.

Fixation Index
The most well known measure of genetic differences between subpopulations is the fixation index $F_{ST}$ (Malécot 1948; Wright 1949; Weir and Cockerham 1984; Bhatia et al. 2013). Here we will use a version of the fixation index referred to as the coefficient of gene differentiation by Nei (1973) and subsequently generalized to multiallelic loci in Nei (1977). The fixation index is conveniently defined in terms of allele frequency differences between subpopulations, and we will study $F_{ST}$ in each generation t from the perspective of the base generation $-T$, so that allele frequencies at $t > -T$ are unknown. Following the argument in Hössjer et al. (2016), the fixation index at time t is then predicted by (14), where in the last step of (14) we first divided the numerator and denominator by $p(1 - p)$, and then invoked the definition of $h_{txy} = 1 - f_{txy}$ in (2). In order for the fixation index to be nonzero, it is required that at least two subpopulation weights $w_x$ are nonzero. We will mainly use (14) in the context of reproductive population weights $\mathbf{w} = \boldsymbol{\gamma}$.
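
To make the role of the subpopulation weights concrete, the following Python sketch computes a weighted coefficient of gene differentiation from a matrix of pairwise gene diversities. It assumes the usual Nei-type form (total diversity minus within-subpopulation diversity, divided by total diversity); this is an assumption about the general shape of the predictor, not a transcription of the paper's formula, and the function name and input values are hypothetical.

```python
import numpy as np


def weighted_fixation_index(h, w) -> float:
    """Nei-type differentiation (H_T - H_S) / H_T for weights w (assumed form).

    h[x, y] is a pairwise gene diversity (or the proportional quantity h_txy);
    a common factor such as 2p(1-p) cancels in the ratio.
    """
    h = np.asarray(h, dtype=float)
    w = np.asarray(w, dtype=float)
    h_within = float(np.dot(w, np.diag(h)))  # H_S: weighted within-subpopulation diversity
    h_total = float(w @ h @ w)               # H_T: diversity of the weighted total
    if h_total <= 0.0:
        raise ValueError("total gene diversity must be positive")
    return (h_total - h_within) / h_total


h = np.array([[0.5, 1.0],
              [1.0, 0.5]])
print(weighted_fixation_index(h, [0.5, 0.5]))  # positive: two subpopulations carry weight
print(weighted_fixation_index(h, [1.0, 0.0]))  # zero: only one subpopulation carries weight
```

The second call illustrates the remark in the text that the index vanishes unless at least two subpopulations receive nonzero weight.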

Effective Sizes
The idea of effective size is to find a simple population that serves as a yardstick and shares some properties with the structured population of interest. The Wright-Fisher population (WF) is usually used for this purpose. It is a special case of the model of Sect. 2 that corresponds to a homogeneous population ($s = 1$) with equal census and effective size ($N_{c1} = N_{e1} = N$). An effective size of type Q (denoted $N_{eQ}$) is the size of a WF population that exhibits the same value of a certain quantity Q as the given structured population. Typically Q quantifies how fast the genetic composition of the population changes between time points $t$ and $t + \tau$, and we will assume that it takes the value $1 - (1 - 1/(2N))^{\tau}$ for a WF population of size N, so that
$$Q = 1 - \left(1 - \frac{1}{2N_{eQ}}\right)^{\tau}. \tag{15}$$
Solving for the effective size in (15) we find that
$$N_{eQ} = \frac{1}{2\left[1 - (1 - Q)^{1/\tau}\right]}. \tag{16}$$
Equation (16) is very close to a formula for the effective size that appears at the bottom of page 525 of Luikart et al. (1999). Since they have $2N_{eQ} + 1$ rather than $2N_{eQ}$ in the denominator of (15), they end up with an additional term $-0.5$ in the expression for $N_{eQ}$, and they also include extra terms that correct for estimation bias of $Q(\tau)$ due to having finite samples of genetic data at time points $t$ and $t + \tau$.
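
The Wright-Fisher yardstick can be checked numerically for the case where Q is the relative loss of gene diversity. The sketch below computes, by exact binomial summation, the expected gene diversity (sampling with replacement) after one generation of binomial sampling of 2N gene copies, and compares the per-generation retention with 1 − 1/(2N). The function names and the values N = 10, p = 0.3 are illustrative assumptions.

```python
from math import comb


def expected_gene_diversity_next(N: int, p: float) -> float:
    """Exact E[2 p'(1 - p')] when p' = K / (2N) and K ~ Binomial(2N, p)."""
    n = 2 * N
    total = 0.0
    for k in range(n + 1):
        prob = comb(n, k) * p**k * (1.0 - p) ** (n - k)
        freq = k / n
        total += prob * 2.0 * freq * (1.0 - freq)
    return total


def wf_quantity(N: int, tau: int) -> float:
    """Value of Q after tau generations in a Wright-Fisher population of size N."""
    return 1.0 - (1.0 - 1.0 / (2 * N)) ** tau


N, p = 10, 0.3
retained = expected_gene_diversity_next(N, p) / (2.0 * p * (1.0 - p))
print(retained)            # agrees with 1 - 1/(2N) up to rounding
print(wf_quantity(N, 1))   # the corresponding one-generation value of Q
```

Iterating the one-generation retention factor over several generations gives the geometric decline that underlies the multiplicative definition of effective size.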
When $N_{eQ} \gg \tau$, the right hand side of (15) is well approximated by a first order Taylor expansion of $g(x) = 1 - (1 - x)^{\tau} \approx \tau x$ around $x = 0$. This gives rise to the simpler and approximate definition
$$N_{eQ,\mathrm{add}} = \frac{\tau}{2Q}. \tag{17}$$
Sometimes (17) is referred to as the additive approach (Waples 1989; Luikart et al. 1999), as opposed to the exact multiplicative approach (15). Although the additive approximation often works well, it can sometimes be inaccurate when $\tau$ gets large, in particular for populations that experience bottlenecks (Richards and Leberg 1996; Luikart et al. 1999). Another important difference between the multiplicative and additive approaches is that $N_{eQ,\mathrm{add}}$ always exists, as long as Q is positive, whereas in order for $N_{eQ}$ to have a finite positive value we must require $0 < Q < 1$. Although $Q < 1$ is guaranteed for a Wright-Fisher population, for a subdivided population in general Q may sometimes exceed 1.
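
The difference between the multiplicative and additive approaches can be seen in a few lines of Python. In this sketch `tau` denotes the number of generations between the two time points, and both function names are hypothetical; the multiplicative version refuses values of Q outside the open unit interval, mirroring the existence condition stated above.

```python
def effective_size_multiplicative(Q: float, tau: int) -> float:
    """Solve Q = 1 - (1 - 1/(2 N))**tau for N; requires 0 < Q < 1."""
    if tau < 1:
        raise ValueError("tau must be a positive integer")
    if not 0.0 < Q < 1.0:
        raise ValueError("the multiplicative effective size requires 0 < Q < 1")
    return 1.0 / (2.0 * (1.0 - (1.0 - Q) ** (1.0 / tau)))


def effective_size_additive(Q: float, tau: int) -> float:
    """First-order approximation N = tau / (2 Q); requires Q > 0."""
    if tau < 1:
        raise ValueError("tau must be a positive integer")
    if Q <= 0.0:
        raise ValueError("the additive effective size requires Q > 0")
    return tau / (2.0 * Q)


# Wright-Fisher population of size 100 observed over 5 generations.
N, tau = 100, 5
Q = 1.0 - (1.0 - 1.0 / (2 * N)) ** tau
print(effective_size_multiplicative(Q, tau))  # recovers N up to rounding
print(effective_size_additive(Q, tau))        # slightly larger than N
```

For these values the additive approximation overshoots by roughly one percent; the discrepancy grows as the interval becomes long relative to the effective size.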
In this paper we will mainly focus on loss of gene diversity ($Q = GD$) and variance of allele frequency change ($Q = V$). But we will also consider the eigenvalue effective size, for which $Q = E$ corresponds to the largest eigenvalue of a certain matrix. This effective size does not follow the general pattern (15) and (16) of genetic change between two time points $t$ and $t + \tau$, but rather it quantifies the long term loss of genetic diversity at migration-drift equilibrium.

Notation for Local and Global Effective Sizes
We will assume that subpopulations are weighted as $\mathbf{w}$ and $\mathbf{v}$ at time $t$ and $t + \tau$, and in order to highlight the impact of these subpopulation weights we sometimes write $N_{eQ} = N_{eQ\mathbf{w}\mathbf{v}}$, and in particular $N_{eQ} = N_{eQ\mathbf{w}}$ when the same weight vector $\mathbf{w} = \mathbf{v}$ is used at both time points. For an effective size of type $Q \in \{GD, V\}$ we also adopt the notation of Laikre et al. (2016) and write $N_{eQMeta} = N_{eQ\boldsymbol{\gamma}}$ for the effective population size of the metapopulation. This corresponds to using reproductive weights at both time points ($\mathbf{w} = \mathbf{v} = \boldsymbol{\gamma}$). The quantity $N_{eQRx} = N_{eQ\mathbf{e}_x}$ refers to the realized local effective size of subpopulation x, and it corresponds to using the same local weight vector at both time points ($\mathbf{w} = \mathbf{v} = \mathbf{e}_x$). The term realized was introduced in Laikre et al. (2016) and Ryman et al. (2019) to emphasize the fact that due to migration $N_{eQRx}$ typically differs from $N_{ex}$, although the two quantities are identical when x is isolated from the other subpopulations. When two different subpopulations x and y receive full weight at time points $t$ and $t + \tau$, i.e. $\mathbf{w} = \mathbf{e}_x$, $\mathbf{v} = \mathbf{e}_y$, and $x \ne y$, we write $N_{eQRxy} = N_{eQ\mathbf{e}_x\mathbf{e}_y}$ for the corresponding realized effective size.
The eigenvalue effective size $N_{eE}$, on the other hand, is a property of the metapopulation, and therefore it does not involve subpopulation weights.

Gene Diversity Effective Size
The gene diversity effective size $N_{eGD}$ between the two time points $t$ and $t + \tau$ is defined as the size of an ideal Wright-Fisher population that exhibits the same relative gene diversity decline (locally or globally for the metapopulation), during this time interval, as the studied structured population. In mathematical terms, this corresponds to the quantity
$$N_{eGD} = \frac{1}{2\left[1 - (1 - I)^{1/\tau}\right]}, \tag{18}$$
where $I = I_{T+t}(\tau) = 1 - H_{t+\tau}/H_t$, the relative decline of gene diversity, quantifies how much genetic drift there has been between generations $t$ and $t + \tau$.
19 Page 14 of 49 The effective size in ( 18) was referred as a haploid inbreeding effective size with replacement in Hössjer et al. (2014), since gene diversity decrease is equivalent to gene identity increase, a haploid analogue of increased inbreeding.It was shown in Hössjer et al. (2016) that N eGD is a good approximation of the additive genetic variance effective size N eAV , which is of interest for long term conservation of species.Recall from the discussion at the end of Sect.4.2 that all H txy and H t+ ,xy are essentially independent of the local census sizes of all subpopulations.From this it follows that H t , H t+ , I, and N eGD are functions of the migration rates in B and the local effective sizes N ex , whereas they are essentially independent of the local census sizes N cx .Since I is nonzero even when all subpopulations are isolated, the contribution of all N ex is most fundamental to I, and for this reason we will refer to it as a genetic drift term.
Since the gene diversities H t and H t+ are non-negative, it follows that the term I does not exceed unity ( I ≤ 1 ).It may happen though that I is negative when local subpopulations weights of x are used at both time points ( w = v = e x ) and migration into x causes the gene diversity to increase ( H t+ > H t ).Then formally N eGD = ∞.
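The definition above can be illustrated numerically. The following is a minimal sketch, assuming that the gene diversity of an ideal Wright-Fisher population of size $N$ decays by a factor $1 - 1/(2N)$ per generation, so that $N_{eGD} = 1/(2[1 - (1 - I)^{1/\tau}])$ with $I = 1 - H_{t+\tau}/H_t$. The function name and the example numbers are hypothetical and only serve as an illustration.

```python
import math

def gene_diversity_effective_size(h_start, h_end, tau):
    """N_eGD from gene diversities at time t (h_start) and t + tau (h_end)."""
    if h_start <= 0 or h_end < 0 or tau <= 0:
        raise ValueError("need h_start > 0, h_end >= 0 and tau > 0")
    decline = 1.0 - h_end / h_start                 # the genetic drift term I
    if decline <= 0.0:
        return math.inf                             # gene diversity did not decrease
    per_generation = 1.0 - (1.0 - decline) ** (1.0 / tau)
    if per_generation <= 0.0:
        return math.inf                             # decline too small to resolve numerically
    return 1.0 / (2.0 * per_generation)

# Illustrative check: an ideal Wright-Fisher population of size 50 over 5 generations.
n, tau = 50, 5
h0 = 0.4                                            # hypothetical initial gene diversity
h5 = h0 * (1.0 - 1.0 / (2 * n)) ** tau
print(gene_diversity_effective_size(h0, h5, tau))
```

When gene diversity increases over the interval, for instance because of migration into a single subpopulation, the function returns infinity, in line with the convention $N_{eGD} = \infty$.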

Forward Approach
The variance effective size $N_{eV}$ is the size of an ideal and spatially homogeneous population whose standardized variance of allele frequency change between time points $t$ and $t + \tau$ is the same as in the studied structured population, see for instance Sect. 7.6.3 of Crow and Kimura (1970). It is instructive to first introduce $N_{eV}$ for a population that is either spatially homogeneous ($s = 1$) or has a substructure that is ignored. The traditional definition
$$N_{eV} = \frac{1}{2\left[1 - \left(1 - \dfrac{\mathrm{Var}(p_{t+\tau} \mid p_t)}{p_t(1 - p_t)}\right)^{1/\tau}\right]} \tag{19}$$
quantifies variance of allele frequency change conditionally on allele frequencies of generation $t$. If mutations and selection of a homogeneous population are ignored, then typically allele frequency change of the past (before time $t$) is uncorrelated with allele frequency change of the present (between time points $t$ and $t + \tau$). This implies that $E(p_{t+\tau} \mid p_t) = p_t$, so that the variance in (19) equals $E[(p_{t+\tau} - p_t)^2 \mid p_t]$. It turns out that the latter quantity is preferable to use in more general settings (such as a subdivided population) when possibly $E(p_{t+\tau} \mid p_t) \neq p_t$, due to the fact that allele frequency change of the past might be correlated with allele frequency change between time points $t$ and $t + \tau$. We therefore define the variance effective size of a subdivided population (with subpopulation weights $w$ and $v$) as
$$N_{eV} = \frac{1}{2\left[1 - (1 - F)^{1/\tau}\right]}, \qquad F = \frac{E\left[(p_{t+\tau} - p_t)^2\right]}{E\left[p_t(1 - p_t)\right]}. \tag{20}$$
Equation (20) differs from (19) in that the numerator and denominator of the genetic drift term $F$ are averaged with respect to $p_t$. Indeed, it is well known (Ewens 1982; Hössjer and Ryman 2014; Hössjer et al. 2014, 2016) that typically $E[(p_{t+\tau} - p_t)^2 \mid p_t] / [p_t(1 - p_t)]$ is not a fixed number for a structured population, but rather a function of $p_t$. This makes the more general definition of genetic drift in (20) preferable for a metapopulation with subpopulations, since the impact of $p_t$ is averaged out. We will refer to (20) as the forward definition of $N_{eV}$, since allele frequency change is normalized, in the denominator, as a function of allele frequencies at time $t$, the left end point of the interval $[t, t + \tau]$, and from the perspective of this time point the allele frequency change in the numerator of (20) takes place forwards in time. Note that the traditional definition (19) of variance effective size is based on the forward approach as well, and it can be seen that (20) is a generalization of (19). In particular, when $\tau = 1$ and the population is homogeneous, both (19) and (20) reduce to the well known formula $N_{eV} = p_t(1 - p_t) / [2\,\mathrm{Var}(p_{t+1} \mid p_t)]$ of Crow and Kimura (1970, Eq. 7.6.3.25).
Following Hössjer et al. (2016), where the special case $w = v$ was treated, we rewrite the right hand side of (20) as
$$N_{eV} = \frac{1}{2\left[1 - (1 - I - II)^{1/\tau}\right]}, \tag{21}$$
so that $F = I + II$. The first term $I$ on the right hand side of (21) is identical to the genetic drift term $I$ that appears in the definition (18) of the gene diversity effective size. Indeed, it follows from (2) and (21) that $I$ can be expressed in terms of $h_{txy}$ and $h_{t+\tau,xy}$ for all pairs $x, y$ of subpopulations, which in view of (9) are proportional to the corresponding gene diversities that appear in the genetic drift term $I$ of (18).
The second term $II$ of (21) is only present in a subdivided population, and therefore it follows from (18) and (21) that $N_{eGD} = N_{eV}$ for homogeneous populations. For a subdivided population, $-II$ accounts for the correlation between allele frequency change up to time $t$, and the allele frequency change that takes place between time points $t$ and $t + \tau$. We could therefore refer to $-II$ as a correlation between past and present allele frequency change. Extending the argument in Hössjer et al. (2016), where the case $w = v$ was treated, one finds that
$$II = \frac{2\sum_{x,y}\left((vB^{\tau})_x - w_x\right) w_y h_{txy}}{\sum_{x,y} w_x w_y h_{txy}}. \tag{23}$$
It follows from (23) that $II = 0$ when all subpopulations are isolated ($vB^{\tau} = v$) and the subpopulation weights are the same at time points $t$ and $t + \tau$ ($w = v$). We will therefore often refer to $II$ as a migration or gene flow term, since it is impacted by migration in an essential way. Equation (20) implies that the standardized amount of allele frequency change is non-negative, i.e. $F = I + II \ge 0$. It turns out that the gene flow term $II$ is typically non-negative as well, since migration tends to induce a negative correlation between past and present allele frequency change when subpopulations with large allele frequencies receive inflow from other subpopulations with lower frequencies of the same allele. The consequence of such a negative correlation, or positive $II$, is to inflate the expected squared allele frequency change $F$. Since past and present allele frequency changes of a homogeneous population ($s = 1$) are uncorrelated, such a population must have $II = 0$ and $N_{eV} = N_{eGD}$. The same is true when subpopulations are weighted according to their long term reproductive ability ($w = v = \gamma$), since positive and negative allele frequency changes in different subpopulations will then cancel out in such a way that allele frequency change before time $t$ is uncorrelated with the one that takes place over the interval $[t, t + \tau]$. On the other hand, $II$ is typically positive when one subpopulation $x$ receives full weight ($w = v = e_x$), and this will lower $N_{eVRx}$ below $N_{eGD}$. The magnitude of $II$ for local subpopulation weights depends on the amount of subpopulation differentiation at time $t$ (as quantified by $F_{ST,t}$) and the amount of gene flow between the subpopulations. It follows from (14) that this amount of subpopulation differentiation is reflected in terms of how much larger $h_{txy}$ for pairs of different subpopulations $x \neq y$ are compared to all $h_{txx}$. In the extreme case when all elements of $h_t$ are the same it follows that $F_{ST,t} = II = 0$ and consequently $N_{eV} = N_{eGD}$. This happens for instance when the first time point $t$ of $[t, t + \tau]$ is the founder generation ($t = -T$). On the other hand, when $II > 0$ it may happen that $F = 1 \Leftrightarrow II = 1 - I$ or $F > 1 \Leftrightarrow II > 1 - I$, which we formally write as $N_{eV} = 0$ and $N_{eV} = -\infty$ respectively.
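The conventions for the forward definition can be summarized in a small sketch. It assumes that the forward variance effective size has the form $N_{eV} = 1/(2[1 - (1 - I - II)^{1/\tau}])$, with the formal values $0$ and $-\infty$ when $F = I + II$ equals or exceeds 1. The function name and the numerical inputs are hypothetical.

```python
import math

def forward_variance_effective_size(drift, gene_flow, tau):
    """Forward N_eV from the drift term I and the gene flow term II."""
    f = drift + gene_flow                     # standardized change F = I + II
    if f < 0.0:
        raise ValueError("F = I + II must be non-negative")
    if f == 1.0:
        return 0.0                            # formal value when F = 1
    if f > 1.0:
        return -math.inf                      # formal value when F > 1
    rate = 1.0 - (1.0 - f) ** (1.0 / tau)     # standardized change per generation
    if rate <= 0.0:
        return math.inf                       # no detectable allele frequency change
    return 1.0 / (2.0 * rate)

# Hypothetical inputs: the same drift term with and without a positive gene flow term.
print(forward_variance_effective_size(0.02, 0.0, 1))
print(forward_variance_effective_size(0.02, 0.03, 1))
```

With $II = 0$ the result coincides with the gene diversity effective size for the same $I$, whereas a positive $II$ lowers the forward variance effective size.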

Intermediate Approach
The forward definition (20) of the variance effective size relies on a standardized measure $F = I + II$ of expected squared allele frequency change, which sometimes exceeds 1. This is due to the fact that the denominator of $F$ in (20) is inflated when the allele frequency at the first time point $t$ of the interval $[t, t + \tau]$ is close to 0 or 1. For this reason, when $N_{eV}$ is estimated from data by the temporal method, allele frequency change is usually standardized in such a way that allele frequencies at both time points $t$ and $t + \tau$ are used. In particular, the approach of Pollak (1983) and Jorde and Ryman (2007) corresponds to a definition
$$F^{int} = \frac{E\left[(p_{t+\tau} - p_t)^2\right]}{E\left[\bar{p}(1 - \bar{p})\right]}, \qquad \bar{p} = \frac{p_t + p_{t+\tau}}{2}, \tag{24}$$
of standardized expected squared allele frequency change, whose denominator involves allele frequencies $p_t$ and $p_{t+\tau}$ at both time points $t$ and $t + \tau$. We refer to (24) as the intermediate version of the standardized expected squared allele frequency change, since the allele frequency change of the numerator is forward or backward in time, from the perspective of time point $t$ and $t + \tau$ respectively. Let $N^{int}_{eV}$ refer to the corresponding intermediate version of variance effective size that makes use of $F^{int}$ rather than $F$. It is not possible to define $N^{int}_{eV}$ by simply replacing $F$ with $F^{int}$ in (20), since for a Wright-Fisher population, such a procedure would not retain the population size. Instead, following Jorde and Ryman (2007) we put
$$N^{int}_{eV} = \frac{1}{2\left[1 - \left(1 - \dfrac{F^{int}}{1 + F^{int}/4}\right)^{1/\tau}\right]}. \tag{25}$$
It can be seen that the intermediate approach is somewhat more stable than the forward approach. Indeed, the ratio $F^{int}/(1 + F^{int}/4)$ on the right hand side of (25) is less than 1 whenever $F^{int} < 4/3$, so that $N^{int}_{eV}$ exists whenever $0 < F^{int} < 4/3$. In order to analyze $N^{int}_{eV}$ more closely, we need an expression for the genetic drift term $F^{int}$ in (24). To this end, we have to replace the denominator $E[p_t(1 - p_t)]$ of (20) by the denominator $E[\bar{p}(1 - \bar{p})]$ of (24). Since $\bar{p}(1 - \bar{p}) = [p_t(1 - p_t) + p_{t+\tau}(1 - p_{t+\tau})]/2 + (p_{t+\tau} - p_t)^2/4$, the ratio of these two denominators is
$$\frac{E\left[\bar{p}(1 - \bar{p})\right]}{E\left[p_t(1 - p_t)\right]} = 1 - \frac{I}{4} + \frac{II}{4}. \tag{26}$$
Inserting (21) and (26) into (24) we find that
$$F^{int} = \frac{I + II}{1 - I/4 + II/4}. \tag{27}$$
When the last equation is plugged into (25), an expression
$$N^{int}_{eV} = \frac{1}{2\left[1 - \left(1 - \dfrac{I + II}{1 + II/2}\right)^{1/\tau}\right]} \tag{28}$$
is obtained for the intermediate definition of the variance effective size. From this it follows that the threshold for the intermediate version of the variance effective size not to exist ($N^{int}_{eV} = -\infty$) is twice as high ($II > 2(1 - I)$) as compared to the forward version of this effective size.
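The correction used in the intermediate approach can be checked numerically. The sketch below assumes that the intermediate standardized change $F^{int}$ is mapped to $Q = F^{int}/(1 + F^{int}/4)$ before the effective size is computed, which is the mapping that makes an ideal Wright-Fisher population retain its size and that gives the existence threshold $F^{int} < 4/3$. Function names and numbers are hypothetical.

```python
import math

def intermediate_q(f_int):
    """Map F_int to the standardized change Q used to define N_eV^int."""
    return f_int / (1.0 + f_int / 4.0)

def intermediate_variance_effective_size(f_int, tau):
    q = intermediate_q(f_int)
    if q <= 0.0:
        return math.inf
    if q >= 1.0:
        return 0.0 if q == 1.0 else -math.inf   # formal values, F_int >= 4/3
    rate = 1.0 - (1.0 - q) ** (1.0 / tau)
    return math.inf if rate <= 0.0 else 1.0 / (2.0 * rate)

# Wright-Fisher check: with Q = 1 - (1 - 1/(2N))^tau, conditioning on p_t gives
# E[pbar(1 - pbar)] = p_t(1 - p_t)(1 - Q/4), so that F_int = Q / (1 - Q/4).
n, tau = 50, 5
q_true = 1.0 - (1.0 - 1.0 / (2 * n)) ** tau
f_int = q_true / (1.0 - q_true / 4.0)
print(intermediate_variance_effective_size(f_int, tau))
```

The Wright-Fisher check recovers the population size up to rounding, which is the property that motivates the correction factor $1 + F^{int}/4$.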

Backward Approach
In analogy with (24) and (25), we also introduce a novel backward definition of $N_{eV}$. In the first step expected squared allele frequency change
$$F^{back} = \frac{E\left[(p_{t+\tau} - p_t)^2\right]}{E\left[p_{t+\tau}(1 - p_{t+\tau})\right]} \tag{29}$$
is normalized using allele frequencies from the right end point $t + \tau$ of the time interval along which genetic change is monitored. From the horizon of an observer at this time point, (29) describes what happened in the past, since the expected squared allele frequency change of the numerator is applied to a time period of the past. Let $N^{back}_{eV}$ denote the variance effective size that makes use of $F^{back}$ rather than $F$. In order for $N^{back}_{eV}$ to retain the size of a Wright-Fisher population, we need to define it as
$$N^{back}_{eV} = \frac{1}{2\left[1 - \left(1 - \dfrac{F^{back}}{1 + F^{back}}\right)^{1/\tau}\right]}. \tag{30}$$
It follows that $N^{back}_{eV}$ exists for all scenarios such that the standardized expected squared allele frequency change between generations $t$ and $t + \tau$ satisfies $0 < F^{back} < \infty$, since this implies $0 < Q(\tau) < 1$ for $Q(\tau) = F^{back}/(1 + F^{back})$. For this reason the backward approach (30) gives a more stable definition of variance effective size than the forward and intermediate definitions in (20) and (25).
In order to study $N^{back}_{eV}$ more closely, we start by deriving an expression for $F^{back}$ in (29). To this end, we have to replace the denominator $E[p_t(1 - p_t)]$ of (20) by the denominator $E[p_{t+\tau}(1 - p_{t+\tau})]$ of (29). By similar calculations as in (26), we find that the ratio of these two denominators is
$$\frac{E\left[p_{t+\tau}(1 - p_{t+\tau})\right]}{E\left[p_t(1 - p_t)\right]} = 1 - I. \tag{31}$$
In a two-step procedure, we first insert (21) and (31) into (29) and find that
$$F^{back} = \frac{I + II}{1 - I}. \tag{32}$$
When the last equation is plugged into (30), a formula
$$N^{back}_{eV} = \frac{1}{2\left[1 - \left(1 - \dfrac{I + II}{1 + II}\right)^{1/\tau}\right]} \tag{33}$$
for the backward definition of the variance effective size is derived. It follows that $N^{back}_{eV}$ exists under the very mild requirements $I + II > 0$ and $I < 1$, since this implies $0 < Q(\tau) < 1$.
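The stability of the backward approach can be illustrated with a short sketch. It assumes the mapping $Q = F^{back}/(1 + F^{back})$, which lies strictly between 0 and 1 for every positive and finite $F^{back}$. The function name and the inputs are hypothetical.

```python
import math

def backward_variance_effective_size(f_back, tau):
    """Backward N_eV from F_back, using 1 - Q = 1 / (1 + F_back)."""
    if f_back <= 0.0:
        return math.inf                        # no allele frequency change
    survival = 1.0 / (1.0 + f_back)            # equals 1 - Q, always in (0, 1)
    rate = 1.0 - survival ** (1.0 / tau)
    return math.inf if rate <= 0.0 else 1.0 / (2.0 * rate)

# Wright-Fisher check: conditioning on p_t, E[p_{t+tau}(1 - p_{t+tau})] equals
# p_t(1 - p_t)(1 - Q), so that F_back = Q / (1 - Q).
n, tau = 50, 5
q_true = 1.0 - (1.0 - 1.0 / (2 * n)) ** tau
f_back = q_true / (1.0 - q_true)
print(backward_variance_effective_size(f_back, tau))

# Even a very large standardized change gives a finite, positive effective size.
print(backward_variance_effective_size(1e6, 10))
```

In contrast to the forward and intermediate versions, no value of $F^{back}$ leads to the formal values $0$ or $-\infty$.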

Eigenvalue Effective Size
The eigenvalue effective size $N_{eE}$ corresponds to the long term rate at which genetic variability is lost. The formal definition of the eigenvalue effective size is
$$N_{eE} = \frac{1}{2(1 - \lambda)}, \tag{34}$$
where $\lambda = \lambda_3(P)$ is the largest non-unit eigenvalue and the third largest eigenvalue overall of the transition matrix $P$ of $\{p_t = (p_{t1}, \ldots, p_{ts});\ t = -T, -T + 1, \ldots\}$, the vector-valued Markov chain of allele frequencies in all subpopulations. This Markov chain is defined on a huge state space of size $\prod_{x=1}^{s}(2N_{cx} + 1)$. Tufto et al. (1996) and Tufto and Hindar (2003) used a slightly different definition of $\lambda$ in (34), as the largest eigenvalue of the much smaller matrix $A$ that appears in the linear recursion for one minus standardized covariances as well as for gene diversities (cf. (3) and (7)). Indeed, by the Perron-Frobenius Theorem $A$ has a unique, real-valued, and positive eigenvalue of multiplicity 1, which is strictly larger than the modulus of all other eigenvalues of $A$. It follows from work of Whitlock and Barton (1997) and Hössjer (2015) that
$$\lambda_3(P) = \lambda_{\max}(A). \tag{35}$$
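As a rough illustration of how an eigenvalue effective size is obtained from a dominant eigenvalue, the sketch below applies power iteration to a small matrix with positive entries and then evaluates $1/(2(1 - \lambda))$. The matrices used here are hypothetical toy examples and are not the matrix $A$ of the model, whose construction is given elsewhere in the paper.

```python
def largest_eigenvalue(a, iterations=5000):
    """Dominant eigenvalue of a square matrix with positive entries (power iteration)."""
    n = len(a)
    v = [1.0 / n] * n                          # positive start vector summing to one
    lam = 0.0
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)                           # v sums to one, so sum(A v) estimates lambda
        v = [x / lam for x in w]               # renormalize so that v sums to one again
    return lam

def eigenvalue_effective_size(a):
    return 1.0 / (2.0 * (1.0 - largest_eigenvalue(a)))

# Hypothetical 2 x 2 examples with dominant eigenvalue below one.
toy_symmetric = [[0.98, 0.01], [0.01, 0.98]]
toy_asymmetric = [[0.90, 0.05], [0.02, 0.95]]
print(eigenvalue_effective_size(toy_symmetric))
print(largest_eigenvalue(toy_asymmetric))
```

Power iteration is used only to keep the sketch free of external dependencies; any standard eigenvalue routine would serve the same purpose.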

Relations Between Effective Sizes
It is clear from the definition (18) of the gene diversity effective size and the three versions (21), (28), and (33) of the variance effective size that whenever the gene flow term $II$ is non-negative ($II \ge 0$) the values of $Q(\tau)$ for these four effective sizes satisfy
$$I \le \frac{I + II}{1 + II} \le \frac{I + II}{1 + II/2} \le I + II,$$
making use of the fact that $I \le 1$ because of (18), and that $I + II \ge 0$ must hold as a consequence of (21). But since $N_{eQ}$ is a strictly decreasing function of $Q(\tau)$ in (16), it follows that
$$N_{eV} \le N^{int}_{eV} \le N^{back}_{eV} \le N_{eGD}. \tag{36}$$
These inequalities involve the possibility that some effective sizes have values $-\infty$, $0$, or $\infty$, whenever $Q(\tau) > 1$, $Q(\tau) = 1$, and $Q(\tau) \le 0$ respectively, as discussed above. There is no general relation between $N_{eE}$ and the four effective sizes in (36). We will find however that under migration-drift equilibrium $N_{eE}$ equals $N_{eGD}$ as well as $N_{eV}$ with reproductive subpopulation weights.
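The ordering of the four standardized quantities can be verified numerically. The sketch assumes the forms $I$, $(I + II)/(1 + II)$, $(I + II)/(1 + II/2)$, and $I + II$ for the gene diversity, backward, intermediate, and forward versions respectively; the grid of values for $I$ and $II$ is hypothetical.

```python
def standardized_changes(drift, gene_flow):
    """Q for the gene diversity, backward, intermediate and forward definitions."""
    total = drift + gene_flow
    return (drift,
            total / (1.0 + gene_flow),
            total / (1.0 + gene_flow / 2.0),
            total)

def is_ordered(values, tol=1e-12):
    """True when the sequence is non-decreasing up to rounding error."""
    return all(a <= b + tol for a, b in zip(values, values[1:]))

# Check the chain of inequalities on a small grid with I <= 1 and II >= 0.
for drift in (0.01, 0.1, 0.5, 0.9):
    for gene_flow in (0.0, 0.05, 0.3, 2.0):
        assert is_ordered(standardized_changes(drift, gene_flow))
print("ordering holds on the grid")
```

Since each effective size is a decreasing function of its standardized quantity, the reverse ordering holds for the effective sizes themselves.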

Migration-Drift Equilibrium
Migration-drift equilibrium occurs when many generations have elapsed between the founder generation and the first generation $t$ of the interval over which genetic change is assessed, so that a balance between genetic drift within and migration between subpopulations is obtained. Mathematically, this corresponds to keeping $t$ fixed while $T \to \infty$. Recall from (35) that $A$ has a unique, real-valued, and largest eigenvalue $\lambda$. Let $r = (r_{xy})$ be the corresponding right eigenvector of $A$ with eigenvalue $\lambda$, whose elements, by the Perron-Frobenius Theorem, are real-valued and positive. In view of (5), it follows that $h_{-T} = 1_{s^2} = Cr + r'$ for some constant $C > 0$, where $r'$ is a linear combination of the other right eigenvectors of $A$. Consequently, it follows from the linear recursion (3) that
$$h_t \approx C\lambda^{T+t} r \tag{37}$$
is an increasingly accurate approximation as $T$ gets large. For this reason the migration-drift properties of the metapopulation will only involve $\lambda$ and $r$.
Example 1 (Symmetric migration and equally large subpopulations) In order to find more explicit expressions for $r$, we will consider a class of structured populations that includes the island and stepping stone models as special cases. These populations have subpopulations with equally large local census sizes ($N_{cx} = N_c$) and equally large local effective sizes under isolation ($N_{ex} = N_e$). The backward migration rates $B_{xy}$ may depend on the pair $x, y$ of subpopulations, but it is assumed that they are the same in both directions between any such pair. Consequently, the backward migration matrix $B$ is symmetric ($B_{xy} = B_{yx}$ for all $x \neq y$). Since we also assume that $B$ is irreducible, this implies that an asymptotic distribution $\gamma = 1_s^T/s$ exists for the Markov chain with transition matrix $B$, where $1_s = (1, \ldots, 1)^T$ is a column vector of $s$ ones. Moreover, $B$ has real-valued eigenvalues. Let $l_1, \ldots, l_s$ be the corresponding orthonormal system of left eigenvectors $l_i$ of $B$, expressed as $l_i = (l_{ix};\ x = 1, \ldots, s)$. It is shown in Appendix B.1 that the column vectors $l_{ij}^T = (l_{ix} l_{jy};\ 1 \le x, y \le s)^T$ of length $s^2$ form a convenient orthonormal system of basis functions to use in order to analyze the right eigenvector $r$ of $A$, for a system with symmetric migration.
The island model is an instance of symmetric migration, with
$$B = (1 - m) I_s + \frac{m}{s} 1_s 1_s^T \tag{38}$$
and $I_s$ the identity matrix of order $s$. The non-unit eigenvalues of this migration matrix (39) are all equal to $1 - m$. The circular stepping stone model is a second example of symmetric migration, where any subpopulation $x$ receives a fraction $m/2$ of genes from each of its two neighboring subpopulations $x - 1$ and $x + 1$ modulo $s$. This corresponds to a backward migration matrix
$$B_{xy} = \begin{cases} 1 - m, & y = x, \\ m/2, & y = x \pm 1 \ (\mathrm{mod}\ s), \\ 0, & \text{otherwise}. \end{cases} \tag{40}$$
The matrix in (40) is a circulant matrix, and Fourier analysis of such matrices has frequently been used in population genetics (Malécot 1951; Maruyama 1970a; Rousset 2004; Hössjer 2014). For instance, it is shown in Hössjer (2014) that the eigenvalues of (40) are of the form $1 - m\left(1 - \cos(2\pi[(i + 1)/2]/s)\right)$, with $[(i + 1)/2]$ the integer part of $(i + 1)/2$. Expressions for the eigenvalues of the two-dimensional (torus) stepping stone model can be found in Hössjer (2014). ◻
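The two migration models of Example 1 can be written down explicitly. The sketch below builds the backward migration matrices, assuming $B_{xy} = m/s$ for $x \neq y$ and $B_{xx} = 1 - (s - 1)m/s$ for the island model, and neighbor rates $m/2$ on a circle for the stepping stone model. It then checks symmetry, unit row sums, and that a contrast between two subpopulations is an eigenvector of the island matrix with eigenvalue $1 - m$. Function names and parameter values are hypothetical.

```python
def island_matrix(s, m):
    """Island model: B_xy = m/s for x != y and B_xx = 1 - (s - 1) m / s."""
    return [[(1.0 - m if x == y else 0.0) + m / s for y in range(s)]
            for x in range(s)]

def circular_stepping_stone_matrix(s, m):
    """Each subpopulation receives a fraction m/2 from each of its two neighbors."""
    if s < 3:
        raise ValueError("need at least three subpopulations on the circle")
    b = [[0.0] * s for _ in range(s)]
    for x in range(s):
        b[x][x] = 1.0 - m
        b[x][(x - 1) % s] = m / 2.0
        b[x][(x + 1) % s] = m / 2.0
    return b

def is_symmetric(b, tol=1e-12):
    return all(abs(b[x][y] - b[y][x]) <= tol
               for x in range(len(b)) for y in range(len(b)))

def rows_sum_to_one(b, tol=1e-12):
    return all(abs(sum(row) - 1.0) <= tol for row in b)

def mat_vec(b, u):
    return [sum(bxy * uy for bxy, uy in zip(row, u)) for row in b]

s, m = 10, 0.1
island = island_matrix(s, m)
stones = circular_stepping_stone_matrix(s, m)
contrast = [1.0, -1.0] + [0.0] * (s - 2)       # difference between two subpopulations
image = mat_vec(island, contrast)              # should equal (1 - m) * contrast
print(image[:3])
```

Because both matrices are symmetric with unit row sums, the uniform vector with entries $1/s$ is stationary, in agreement with equal reproductive weights for these models.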

Subpopulation Differentiation
In order to find an expression for the fixation index $F_{ST,t}$ under migration-drift equilibrium, we insert (37) into (14) and let $T \to \infty$. This yields the equilibrium value $F^{eq}_{ST}$ of the fixation index, where the superscript eq is an acronym for equilibrium. It is shown in Appendix B.2 that for reproductive weights $w = \gamma$ and the symmetric model of Example 1, the approximation (43) of $F^{eq}_{ST}$ is accurate for large local population sizes when subpopulations are connected by strong migration. For the island model (38) we insert (39) into (43) and obtain (44), where $\tilde{N}$ in (45) is a harmonic average of the local census and effective sizes. Formula (44) is accurate when $m$ is not too small. For improved island model approximations of $F^{eq}_{ST}$, see Hössjer et al. (2013).

Genetic Drift and Migration
Next we will analyze how the genetic drift term $I = I_{T+t}(\tau)$ and the gene flow term $II = II_{T+t}(\tau)$ behave as $T \to \infty$. From (3), (22), and (37) we deduce that
$$I_{T+t}(\tau) \to I_\infty(\tau) = 1 - \lambda^{\tau} \tag{46}$$
and
$$II_{T+t}(\tau) \to II_\infty(\tau) = \frac{2\sum_{x,y}\left((vB^{\tau})_x - w_x\right) w_y r_{xy}}{\sum_{x,y} w_x w_y r_{xy}} \tag{47}$$
when $t$ and $t + \tau$ are kept fixed while $T \to \infty$.
It is shown in Appendix B.3 that the equilibrium gene flow term (47) simplifies to the expression (48) for symmetric migration (cf. Example 1), with $w l_i^T$ and $v l_i^T$ the coefficients of $l_i$ for $w$ and $v$, when these two weight vectors are expanded as a linear combination of the left eigenvectors $l_i$ of $B$. Formula (48) is accurate when the subpopulations are connected by strong migration. For the island model (38) and (39), formula (48) simplifies to (49), with $\tilde{N}$ as in (45). This formula is accurate as long as $m$ is not too small. In particular, if $1 \le k \le s$ subpopulations receive equal weight $1/k$ at time points $t$ and $t + \tau$, and $\max(2k - s, 0) \le l \le k$ of these overlap, it follows that $|w|^2 = 1/k$ and $wv^T = l/k^2$. Insertion into (49) gives (50). Notice in particular that the right hand side of (50) vanishes when $k = l = s$. This corresponds to using equal weights $w_x = v_x = 1/s$ of all subpopulations at both time points $t$ and $t + \tau$, which are the reproductive weights for the island model. In Sects. 6.3 and 6.4 we will use (46)-(50) in order to derive explicit expressions for the gene diversity and variance effective sizes under migration-drift equilibrium.
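The two identities $|w|^2 = 1/k$ and $wv^T = l/k^2$ for equally weighted, partially overlapping sets of subpopulations are easy to confirm with exact arithmetic. The sketch below uses a hypothetical labeling in which the first set consists of subpopulations $0, \ldots, k - 1$ and the second set is shifted so that exactly $l$ subpopulations are shared.

```python
from fractions import Fraction

def overlapping_weights(s, k, l):
    """Weight vectors w and v with k equally weighted subpopulations, l of them shared."""
    if not (1 <= k <= s and max(2 * k - s, 0) <= l <= k):
        raise ValueError("need 1 <= k <= s and max(2k - s, 0) <= l <= k")
    first = set(range(k))                      # weighted at time t
    second = set(range(k - l, 2 * k - l))      # weighted at time t + tau
    w = [Fraction(1, k) if x in first else Fraction(0) for x in range(s)]
    v = [Fraction(1, k) if x in second else Fraction(0) for x in range(s)]
    return w, v

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

w, v = overlapping_weights(10, 4, 2)
print(dot(w, w), dot(w, v))                    # compare with 1/k and l/k**2
```

The admissible range $\max(2k - s, 0) \le l \le k$ reflects that two sets of size $k$ among $s$ subpopulations must share at least $2k - s$ members.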

Gene Diversity Effective Size
It follows from (18) and (46) that the gene diversity effective size equals the eigenvalue effective size under migration-drift equilibrium, since
$$N^{eq}_{eGD} = \frac{1}{2\left[1 - (1 - I_\infty(\tau))^{1/\tau}\right]} = \frac{1}{2(1 - \lambda)} = N_{eE}. \tag{51}$$
Notice in particular that since the equilibrium limit $I_\infty(\tau)$ of the drift term in (46) does not involve the subpopulation weighting scheme $w$, (51) holds regardless of which $w$ we use to define $N_{eGD}$.

Forward Approach
The two equations (46) and (47) have interesting implications for the asymptotic limit of the forward version of the variance effective size $N_{eV}$ at migration-drift equilibrium. It follows from (20), (21), (46), and (47) that
$$N^{eq}_{eV} = \frac{1}{2\left[1 - \left(\lambda^{\tau} - II_\infty(\tau)\right)^{1/\tau}\right]} \tag{52}$$
for all $\tau$ such that $0 < I_\infty(\tau) + II_\infty(\tau) < 1$, or equivalently that
$$-(1 - \lambda^{\tau}) < II_\infty(\tau) < \lambda^{\tau} \tag{53}$$
holds. We may apply (52) to any kind of weighting scheme. Since $N_{eVMeta}$ is based on reproductive weights $w = v = \gamma$, and $\gamma = \gamma B$, it follows from (23) that $II_{T+t}(\tau) = 0$ for any $T \ge 0$, and hence $II_\infty(\tau) = 0$. Insertion into (52) gives
$$N^{eq}_{eVMeta} = \frac{1}{2(1 - \lambda)} = N_{eE}. \tag{54}$$
For local subpopulation weights we insert $v = w = e_x$ into the definition of $II_\infty(\tau)$ in (47). In conjunction with (52) this gives the equilibrium value $N^{eq}_{eVRx}$ of the realized variance effective size of subpopulation $x$, for all $\tau$ such that (53) holds. It is proved in Appendix B.4 that the variance effective size at migration-drift equilibrium satisfies
$$N^{eq}_{eVwv} \le N^{eq}_{eVw} \tag{55}$$
for the island model, and subpopulation weights $w$ and $v$ at time points $t$ and $t + \tau$ such that $wv^T \le |w|^2$, with equality in (55) if and only if $wv^T = |w|^2$. The intuition behind (55) is that $II_\infty(\tau)$ is elevated when different subpopulation weights are used in generations $t$ and $t + \tau$, since the negative correlation between allele frequency change of the past and present then gets stronger, so that the variance effective size gets smaller. We also verify in Appendix B.4 that
$$N^{eq}_{eVw} \le N_{eE} \tag{56}$$
for the symmetric migration models of Example 1, with equality if and only if reproductive weights ($w = v = \gamma$) are used at time points $t$ and $t + \tau$. The intuition behind (56) is that the gene flow term $II_\infty(\tau)$ is positive as soon as non-reproductive weights $w = v \neq \gamma$ are used, so that $N_{eVw}$ gets smaller. We also conjecture that results similar to (55) and (56) hold more generally than for island and symmetric migration models respectively.
It is instructive to illustrate (55) and (56) for an island model where $1 \le k \le s$ subpopulations are assigned equal weight $1/k$ at both time points $t$ and $t + \tau$, and $l$ of these subpopulations overlap. Insertion of the equilibrium migration term $II_\infty(\tau)$ in (50) into (52) yields an explicit formula (57) for $N^{eq}_{eV}$. This formula shows very explicitly how much $N^{eq}_{eV}$ differs from $N_{eE}$, as a function of $k$ and $l$. For fixed $k$, $N^{eq}_{eV}$ is maximized in (57) when the same subpopulation weights are used at both time points ($l = k$), in agreement with (55). When $k = l$, we notice that $N^{eq}_{eV}$ attains its maximum $N_{eE}$ when reproductive weights are used at both time points, which corresponds to $k = s$ and $w = \gamma = 1_s^T/s$, in agreement with (56).

Intermediate Approach
For the intermediate approach, we have, analogously to (52), that the variance effective size at equilibrium is
$$N^{int,eq}_{eV} = \frac{1}{2\left[1 - \left(1 - \dfrac{1 - \lambda^{\tau} + II_\infty(\tau)}{1 + II_\infty(\tau)/2}\right)^{1/\tau}\right]} \tag{58}$$
for all $\tau$ such that
$$0 < \frac{1 - \lambda^{\tau} + II_\infty(\tau)}{1 + II_\infty(\tau)/2} < 1, \tag{59}$$
which is a less stringent condition than (53) for the variance effective size to exist.
Since $II_\infty(\tau) = 0$ for reproductive weights, it follows that $N^{int,eq}_{eVMeta}$ converges to $N_{eE}$ as migration-drift equilibrium is approached, as in (54). The local realized variance effective size $N^{int,eq}_{eVRx}$ at equilibrium is obtained by inserting $w = v = e_x$ into the definition of $II_\infty(\tau)$ in (58). Formulas (55) and (56) hold for the intermediate version of the variance effective size as well, and explicit expressions of $N^{int,eq}_{eV}$ for the island model are obtained by inserting (50) into (58).

Backward Approach
For the backward approach, we find that the variance effective size at equilibrium
$$N^{back,eq}_{eV} = \frac{1}{2\left[1 - \left(\dfrac{\lambda^{\tau}}{1 + II_\infty(\tau)}\right)^{1/\tau}\right]} \tag{60}$$
exists for time intervals of any length $\tau$. This equilibrium value is derived in the same way as (52) and (58). For local subpopulation weights we insert $w = v = e_x$ into the definition of $II_\infty(\tau)$ in Eq. (60) in order to obtain $N^{back,eq}_{eVRx}$. Formulas (55) and (56) hold for the backward version of the variance effective size as well, and explicit expressions of $N^{back,eq}_{eV}$ for the island model are obtained by inserting (50) into (60).

The Length of the Time Interval
In this section we analyze how the length $\tau$ of the time interval impacts the gene diversity and variance effective sizes. We will focus on the two extreme scenarios of consecutive generations ($\tau = 1$) and long time intervals ($\tau \to \infty$).

Consecutive Generations
For ease of notation, we will sometimes write $I_{T+t}(\tau) = I(\tau)$ and $II_{T+t}(\tau) = II(\tau)$ for the genetic drift and gene flow terms that appear in the definitions of the gene diversity and variance effective sizes. When these effective sizes reflect changes between two consecutive generations ($\tau = 1$), formulas (18), (21), (28), and (33) simplify to
$$N_{eGD} = \frac{1}{2I(1)} \longrightarrow \frac{1}{2I_\infty(1)} = N_{eE}, \tag{61}$$
$$N_{eV} = \frac{1}{2\left[I(1) + II(1)\right]} \longrightarrow \frac{1}{2\left[I_\infty(1) + II_\infty(1)\right]}, \tag{62}$$
$$N^{int}_{eV} = \frac{1 + II(1)/2}{2\left[I(1) + II(1)\right]} \longrightarrow \frac{1 + II_\infty(1)/2}{2\left[I_\infty(1) + II_\infty(1)\right]}, \tag{63}$$
and
$$N^{back}_{eV} = \frac{1 + II(1)}{2\left[I(1) + II(1)\right]} \longrightarrow \frac{1 + II_\infty(1)}{2\left[I_\infty(1) + II_\infty(1)\right]} \tag{64}$$
respectively, where the right hand sides of (61)-(64) refer to migration-drift equilibrium ($T \to \infty$), with $I_\infty(1)$ and $II_\infty(1)$ the drift and gene flow terms in (46) and (47) at equilibrium, for two consecutive generations. Typically the gene flow term is small ($II(1) \ll 1$) unless there is much migration between the subpopulations and a large amount of subpopulation differentiation at time $t$. Consequently, for most scenarios of practical interest the three versions of variance effective size are practically the same when the expected squared allele frequency change between two consecutive generations is analyzed. This is illustrated in Tables 2 and 3 for the realized local variance effective size of an island model with $s = 10$ subpopulations at migration-drift equilibrium, corresponding to the right hand side of (62)-(64). Whereas the same subpopulation weights are used at both time points in Table 2 ($w = v = e_x$ or $k = l = 1$), this is not the case in Table 3 ($w = e_x$, $v = e_y$, $x \neq y$ or $k = 1$, $l = 0$). Note in particular that all three versions of the realized local variance effective size depend strongly on the local census size. This phenomenon is discussed in Ryman et al. (2023), and in the present framework it can be explained as follows: The term $II_\infty(1)$ is approximated by (49) and (50) for the island model, and it depends on the amount of migration into $x$ or $y$ from the other subpopulations as well as the amount of subpopulation differentiation $F^{eq}_{ST}$ in (44). The larger the migration rate and the local census size are, the smaller is the amount of subpopulation differentiation, and the smaller is the gene flow term $II_\infty(1)$ at equilibrium, so that the variance effective size approaches the eigenvalue effective size.
A more analytical interpretation of the results of Table 2 is obtained by inserting $\tau = 1$ and $w = v = e_x$ into (49) and the right hand side of (62). This yields the approximation (66). On the other hand, in order to approximate the results of Table 3, we insert $\tau = 1$, $w = e_x$, and $v = e_y$ into the right hand side of (62). This yields the approximation (67). Formulas (66) and (67) are also obtained from (57), with $\tau = 1$, $k = 1$, and $l = 1$ or $l = 0$ respectively. They are accurate when the migration rate $m$ is not too small. In particular, under panmixia it follows from (57) that (68) holds regardless of the local weight vectors $w = e_x$ and $v = e_y$ of the subpopulations at time points $t$ and $t + 1$. Note that (68) is the limit of (66) and (67) when $m \to 1$, and that (68) approaches $N_{eE}$ when $N_c \to \infty$.
Table 2 Values of the realized local variance effective size $N_{eVRx}$ at migration-drift equilibrium for a time interval of length $\tau = 1$, so that the same subpopulation $x$ receives full weight at the two end points of the interval. An island model with $s = 10$ subpopulations is used, with local effective size $N_{ex} = 50$ under isolation, local census size $N_{cx}$ and migration parameter $m$ (where $B_{xy} = m/s$ when $x \neq y$ and $B_{xx} = 1 - (s - 1)m/s$). The three methods of computing $N_{eVRx}$ refer to the forward approach (= For, the right hand side of (62)), the intermediate approach (= Int, the right hand side of (63)), and the backward approach (= Back, the right hand side of (64)). A more explicit approximation of $N_{eVRx}$, for the forward approach, appears in (66).
Table 3 The three methods of computing $N_{eVRxy}$ refer to the forward approach (= For, the right hand side of (62)), the intermediate approach (= Int, the right hand side of (63)), and the backward approach (= Back, the right hand side of (64)). A more explicit approximation of $N_{eVRxy}$, for the forward approach, appears in (67).

Long Time Intervals
When the length $\tau$ of the time interval gets large, it may happen that the standardized allele frequency change of the forward and intermediate versions of the variance effective size satisfy $Q(\tau) \ge 1$, so that the corresponding effective size equals $0$ or $-\infty$. In this subsection we will provide formulas for the maximal length $\tau_{\max,Q}$ of the time interval for which each type $Q$ of effective size exists under migration-drift equilibrium. Since the gene diversity effective size equals the eigenvalue effective size at equilibrium, for time intervals of any length (cf. (51)), it follows that $\tau_{\max,GD} = \infty$. For the forward version of the variance effective size, it follows from (53) that
$$\tau_{\max,V} \approx 2N_{eE}\log\left(1/II_\infty\right), \tag{69}$$
where
$$II_\infty = \lim_{\tau \to \infty} II_\infty(\tau) \tag{70}$$
and the right hand side of (69) is interpreted as plus infinity whenever $II_\infty$ is zero or negative. The approximation in (69) is accurate for large $N_{eE}$. It implies that $N^{eq}_{eV}$ exists for time intervals up to a maximal length that is proportional to $N_{eE}$. For the intermediate version of the variance effective size, we similarly deduce
$$\tau_{\max,V}^{int} \approx 2N_{eE}\log\left(2/II_\infty\right) \tag{71}$$
from formula (59). We recall from (60) that the backward version of the variance effective size exists at equilibrium for time intervals of any length, so that $\tau_{\max,V}^{back} = \infty$. In particular, by increasing the length $\tau$ of the time interval in (60) we find that
$$N^{back,eq}_{eV} \to N_{eE} \quad \text{as } \tau \to \infty \tag{72}$$
for all types of subpopulation weights $w$.
Figure 1 illustrates the eigenvalue effective size $N_{eE}$, and the forward, intermediate, and backward versions of the realized local variance effective size over time intervals $[0, \tau]$ of increasing length when the population is at migration-drift equilibrium. The model is an island model with $s = 10$ subpopulations and the migration rate equals $m = 0.1$. It can be seen that $N_{eVRx}$ and $N^{int}_{eVRx}$ initially increase as $\tau$ grows, until they reach a maximum, start to decline and eventually do not exist. In contrast, $N^{back}_{eVRx}$ always exists and increases monotonically to $N_{eE}$ as the length of the time interval grows, in agreement with (72). The corresponding variance effective sizes $N_{eVMeta}$, $N^{int}_{eVMeta}$ and $N^{back}_{eVMeta}$ of the metapopulation, based on equal subpopulation weights $w_x = 1/s$, equal $N_{eE}$ for all values of $\tau$.
The three realized local variance effective sizes of Fig. 1 are almost the same for time intervals of length up to 200-300 generations, which is at least tenfold the time span typically employed in the context of genetic conservation. However, it is shown in Appendix A that for a small and subdivided population, the three effective sizes may differ substantially for time intervals of length 5-10 generations. More generally, the value of $\tau$ for which the three effective sizes significantly start to differ is proportional to $N_{eE}$. For populations that are either very small locally or experience a severe bottleneck, it may therefore be of interest to use the most stable version $N^{back}_{eVRx}$ of the realized local variance effective size.

Estimation of Effective Sizes
In this section we will investigate how the length $\tau$ of the time interval impacts the accuracy of an estimator of the gene diversity and variance effective sizes at equilibrium ($N^{eq}_{eQ}$, $Q \in \{GD, V\}$). Our theoretical analysis is complementary to the simulation results of Luikart et al. (1999), where interval lengths with maximal accuracy, for a variance effective size estimator, were derived for a population going through a bottleneck. Here we stick to the model introduced in Sect. 2, with time-invariant population sizes of all subpopulations. We start by introducing the value, at migration-drift equilibrium, of the quantity $Q$ used to define each effective size. More specifically, in this section $Q(\tau)$ corresponds to the limit, when $T \to \infty$, of the quantities $I$, $F$, $F^{int}/(1 + F^{int}/4)$, and $F^{back}/(1 + F^{back})$ that appear in (18), (20), (25), and (30) respectively. A necessary requirement for $N^{eq}_{eQ}$ to exist is that $0 < Q(\tau) < 1$. Recall from Sect. 7.2 that this is not always the case for the forward and intermediate versions of the variance effective size.
In order to estimate $N^{eq}_{eQ}$ from data, let $\hat{Q}(\tau) = Q(\tau) + \varepsilon$ be an estimate of $Q(\tau)$, based on samples of sizes $n_t$ and $n_{t+\tau}$ at time points $t$ and $t + \tau$. We will assume that the estimation error $\varepsilon$ of $\hat{Q}(\tau)$ is a random variable with $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$. Typically $\sigma^2$ is inversely proportional to the number of biallelic markers used to estimate $Q(\tau)$, with a proportionality constant that is a monotone increasing function of $1/(2n_t)$ and $1/(2n_{t+\tau})$ (Waples 1989). Our objective is to estimate the asymptotic amount of genetic drift per generation at equilibrium,
$$Q = \frac{1}{2N^{eq}_{eQ}} = 1 - (1 - Q(\tau))^{1/\tau} = h(Q(\tau)),$$
where $h$ is the inverse of $g$. By a first order Taylor expansion of $h$, it can be seen that the error of the estimate $\hat{Q} = h(\hat{Q}(\tau))$ has an approximate variance
$$\mathrm{Var}(\hat{Q}) \approx \left(\frac{dh(Q(\tau))}{dQ(\tau)}\right)^2 \sigma^2. \tag{74}$$
Our objective is to express $\mathrm{Var}(\hat{Q})$ as a function of $\tau$ for each quantity $Q$ and weighting scheme $w$. The variance in (74) will initially decrease with $\tau$, since for short time intervals $Q(\tau) \ll 1$ and consequently
$$\frac{dh(Q(\tau))}{dQ(\tau)} \approx \frac{1}{\tau}. \tag{75}$$
When $\tau$ gets larger and $Q(\tau)$ approaches 1, the variance in (74) will reach a minimum and then start to increase. For this reason it is of interest to find approximate expressions for the interval length $\tau_{opt,Q}$ that minimizes the estimation variance in (74). As we will find below, $\mathrm{Var}(\hat{Q})$ is a function of $N_{eE}$ and the equilibrium gene flow term $II_\infty(\tau)$, defined in (47). It turns out that the optimal time interval will have a length that is proportional to $N_{eE}$.
For a system with strong migration between its subpopulations (Nagylaki 1980), $II_\infty(\tau)$ approaches the asymptotic limit (70) so quickly that the length of the transient period is small in comparison to $N_{eE}$. For a population with strong migration we will therefore approximate $\tau_{opt,Q}$ in (76) by minimizing a simplified version of $dh(Q(\tau))/dQ(\tau)$ with respect to $\tau$, where $II_\infty(\tau)$ is replaced by the constant $II_\infty$ in (70). As a complement to (76) we also define $\tau_{C,Q}$ as the largest value of $\tau$ for which the standard deviation of the estimate of $Q$ has not exceeded its minimum by a factor of at least $C > 1$. In particular, $\tau_{\infty,Q}$ is closely related to $\tau_{\max,Q}$.

Gene Diversity Effective Size
For the gene diversity effective size we recall from (18) and (46) the equilibrium expression for $Q(\tau)$. Insertion of this expression into (74) yields the variance (78) of $\hat{Q}$. Minimizing (78) with respect to $\tau$, we find that the optimal length of the time interval, when estimating the gene diversity effective size, is approximately $\tau_{opt,GD} = 2N_{eE}$ (79).
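The location of the minimum can be checked with a small grid search. This is a sketch under two assumptions that are not restated in this section: that $Q(\tau) = 1 - (1 - 1/(2N_{eE}))^\tau$ at equilibrium for gene diversity, and that $g(Q) = 1 - (1 - Q)^\tau$. With these, $dh/dQ$ reduces to $\lambda^{1-\tau}/\tau$ with $\lambda = 1 - 1/(2N_{eE})$. The function names are hypothetical; the values of $N_{eE}$ are those reported in Figs. 1 and 5.

```python
def relative_sd(tau, n_eE):
    """sd(Q_hat)/sigma = dh/dQ at the ASSUMED equilibrium Q(tau)."""
    lam = 1.0 - 1.0 / (2.0 * n_eE)  # per-generation decay factor
    # Q(tau) = 1 - lam**tau and h'(y) = (1 - y)**(1/tau - 1) / tau
    # combine to lam**(1 - tau) / tau.
    return lam ** (1 - tau) / tau


def optimal_interval(n_eE, tau_max=None):
    """Integer interval length minimizing the relative standard deviation."""
    if tau_max is None:
        tau_max = int(10 * n_eE)
    return min(range(1, tau_max + 1), key=lambda tau: relative_sd(tau, n_eE))


tau_opt_large = optimal_interval(519.28)  # island model of Fig. 1
tau_opt_small = optimal_interval(22.42)   # island model of Fig. 5
```

In both cases the grid minimum lies within one generation of $2N_{eE}$, in line with the optimal interval length stated above.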

Forward Approach
The quantity $Q(\tau)$ of the forward version of the variance effective size is obtained from (20), (46), and (47). The resulting formula leads to the variance expression (81). Equating to 0 the derivative with respect to $\tau$ of a simplified version of (81) (where $II_\infty(\tau)$ is replaced by $II_\infty$), and assuming that $N_{eV}$ is large, it can be shown that the optimal interval length is $\tau_{opt,V} \approx d_{opt,V} \cdot 2N_{eE}$ (82) whenever $II_\infty \ge 0$, where $d_{opt,V} = d_{opt,V}(II_\infty)$ solves equation (83). We interpret $II_\infty$ as a number that quantifies how much migration between subpopulations impacts the variance of allele frequency change. It can be seen from (83) that $d_{opt,V}$ is a decreasing function of $II_\infty$, with $d_{opt,V} = 1$ for $N^{eq}_{eVMeta}$ and $II_\infty = 0$, whereas $d_{opt,V} \to 0$ as $II_\infty \to 1$. It follows from (83) that $d_{opt,V} < \log(II_\infty^{-1})$, and consequently, the optimal interval (76) is shorter than the length $\tau_{max,V}$ of the maximal interval in (69) for which $N^{eq}_{eV}$ exists.
Figure 2 is a log-log plot of the normalized standard deviation $\mathrm{Var}(\hat{Q})^{1/2}/\sigma$ of $\hat{Q}$ as a function of $\tau$ for an island model with $s = 10$ subpopulations, when estimating the variance effective size of the metapopulation and a local population respectively. The linear decay to the left of the figure, for smaller $\tau$, corresponds to $\mathrm{Var}(\hat{Q})^{1/2}$ being inversely proportional to $\tau$ for intervals of short length, in agreement with (75). Note in particular the vertical asymptote of the dash-dotted curve. This corresponds to the fact that $\mathrm{Var}(\hat{Q})$ diverges when $\tau$ approaches the length of intervals for which $N^{eq}_{eVRx}$ is no longer defined (cf. Fig. 1).

Intermediate Approach
For the intermediate definition of the variance effective size we proceed similarly as in Sect. 8.2.1. From (25), (28), (46), and (47) we obtain an expression for $Q(\tau)$, which leads to the variance expression (85). Replacing $II_\infty(\tau)$ in (85) by $II_\infty$ and minimizing with respect to $\tau$, it follows that the optimal interval length is $\tau^{int}_{opt,V} \approx d^{int}_{opt,V} \cdot 2N_{eE}$ (86), with $d^{int}_{opt,V}$ given by (87). Notice that $d^{int}_{opt,V} = 1$ when $II_\infty = 0$, whereas $d^{int}_{opt,V} < 1$ is larger than the solution $d_{opt,V}$ of (83) whenever $II_\infty > 0$. This verifies that it is possible to estimate the variance effective size with high accuracy over longer time intervals when the intermediate approach is used, compared to using the forward approach.
Figure 3 illustrates $\mathrm{Var}(\hat{Q})^{1/2}/\sigma$ for an island model with $s = 10$ subpopulations. The linear decay to the left of the figure, for smaller $\tau$, corresponds to $\mathrm{Var}(\hat{Q})^{1/2}$ being inversely proportional to $\tau$ for intervals of short length, in agreement with (75). The vertical asymptote of the dash-dotted curve corresponds to the fact that $\mathrm{Var}(\hat{Q})$ diverges when $\tau$ approaches the length of intervals for which $N^{int,eq}_{eVRx}$ is no longer defined (cf. Fig. 1).

Backward Approach
In order to find $Q(\tau)$ for the backward definition of the variance effective size we combine (30), (33), (46), and (47). This leads to the variance expression (89), whose minimizer is proportional to $2N_{eE}$, as in (76). For large $N_{eE}$, the proportionality constant is $d^{back}_{opt,V} \approx 1$. Consequently, the length of the optimal interval for the backward version of the variance effective size is approximately $2N_{eE}$ (90), similarly as for the gene diversity effective size in (79). Comparing (90) with (83) and (87) we find that the optimal time interval of the backward approach is longer than the corresponding optimal intervals of the forward and intermediate definitions of the variance effective size.

Estimation of Variance Effective Size from a Real Data Set
In order to illustrate how the variance effective size depends on the chosen subpopulation weights, in this section we analyze a genetic data set of brown trout (Salmo trutta) from the Swedish lake Ånnjön. This data set is part of a large longitudinal study comprising 27 different lakes that are located in protected areas of Jämtland County in the central part of Sweden (cf. Andersson et al. (2022) for more details). Biallelic markers are sampled from 96 distinct loci, scattered along the whole genome (all 40 chromosomes) of brown trout, at two time points, corresponding to data that was collected in 1976 and 2017 respectively. Using estimates of the generation time of brown trout, it is assumed that these two time points are approximately $\tau = 6$ generations apart. The Structure software (v. 2.3.4; Pritchard et al. 2000; Falush et al. 2003) was used to identify $s = 3$ cryptic subpopulations within Ånnjön.
The variance effective size is estimated as in Jorde and Ryman (2007). This estimator is defined for a homogeneous population. Its properties for subdivided populations, where the allele frequency at each time point and locus is a weighted average of allele frequencies from all subpopulations, were studied in Ryman et al. (2014, 2023). As mentioned in Sect. 5.3.2, the JR07-estimator targets the intermediate version $N^{int}_{eV}$ of the variance effective size. We will write $\hat{N}^{int}_{eVwv}$ to denote the version of the JR07-estimator that makes use of subpopulation weights $w$ and $v$ at the two time points at which data was collected. It is based on estimates

$$\hat{p}_{tl} = \sum_{x=1}^{s} w_x \hat{p}_{txl}, \qquad \hat{p}_{t+\tau,l} = \sum_{x=1}^{s} v_x \hat{p}_{t+\tau,xl} \qquad (91)$$

of the subpopulation weighted allele frequencies at loci $l = 1, \ldots, L = 96$ and time points $t$ and $t + \tau = t + 6$ respectively. Here $\hat{p}_{txl}$ refers to the estimated allele frequency at locus $l$ in subpopulation $x$ at time $t$, based on a sample of size $n_{tx}$. In order to define $\hat{N}^{int}_{eVwv}$ we also need to introduce sample sizes $n_t$ and $n_{t+\tau}$ at time $t$ and $t + \tau$. To this end, we use subpopulation weighted harmonic averages (92) of the subpopulation specific sample sizes. The rationale for (92) is that $\mathrm{Var}(\hat{p}_{txl}) = p_{txl}(1 - p_{txl})/(2n_{tx})$, and therefore the variances of the estimated subpopulation weighted allele frequencies in (91) are approximately the same as for homogeneous, binomial samples of sizes $n_t$ and $n_{t+\tau}$.

For all other scenarios in Table 4, the same subpopulation weights are used at both time points ($w = v$). The (locus averaged) sample sizes at the first time point are $n_{t1} = 30$, $n_{t2} = 9.9$, and $n_{t3} = 9$.

Recall from the discussion of Sect. 6.4.2 that equations (55) and (56) are also valid for the intermediate version of the variance effective size, if the system is in migration-drift equilibrium. The findings of Table 4 could therefore indicate that the reproductive weights $\gamma$ are close to $e_2$, so that $N^{eq}_{eVMeta} = N_{eE}$ is close to $\hat{N}^{int}_{eVe_2} = 590$. According to Sect. 3, $\gamma = (\gamma_1, \gamma_2, \gamma_3)$ contains the long term genetic contributions from the three subpopulations. If our conclusion $\gamma_2 \approx 1$ is correct, this indicates that $x = 2$ is a source population from which most or all genetic material originates (i.e. unidirectional migration from 2 to 1 and 3). However, for at least two reasons, this is so far only a conjecture. Firstly, more data analysis, with larger sample sizes and more loci, is needed in order to confirm the conclusion that 2 is a source population. Although the JR07-estimator corrects for the sampling effect, the low sample sizes (for $x = 2$ in particular) of this data set indicate that the results of Table 4 are somewhat uncertain. A separate analysis, based on the (wrong) assumption that all sample sizes are very large, gives a maximal variance effective size $\hat{N}^{int}_{eVwv}$ when the three subpopulations are weighted close to uniformly ($w_x = v_x \approx 1/3$ for $x = 1, 2, 3$) at both time points, with a correspondingly much lower value of $N_{eE}$. Secondly, the theoretical results (55) and (56) have only been proved for populations in migration-drift equilibrium, with (55) derived for island models and (56) for models with symmetric migration between subpopulations.
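The weighted allele frequencies and the accompanying sample sizes can be sketched as follows. This is a hedged illustration, not the JR07-estimator itself. The weighted frequency follows the verbal definition of the subpopulation weighted allele frequency. The sample size formula $n_t = 1/\sum_x (w_x^2/n_{tx})$ is an assumption: it is the harmonic-type average obtained by matching $\sum_x w_x^2\, p(1-p)/(2n_{tx})$ to the binomial variance $p(1-p)/(2n_t)$ when allele frequencies are similar across subpopulations, and it may differ in detail from the paper's Eq. (92). Function names are hypothetical.

```python
def weighted_frequency(weights, freqs):
    """Subpopulation weighted allele frequency: sum_x w_x * p_x."""
    return sum(w * p for w, p in zip(weights, freqs))


def effective_sample_size(weights, sizes):
    """ASSUMED form of the weighted harmonic sample size.

    Chosen so that a binomial sample of this size has the same variance
    as the weighted average of independent subpopulation estimates.
    """
    return 1.0 / sum(w * w / n for w, n in zip(weights, sizes))


# Locus averaged sample sizes at the first time point (from the text).
n_first = (30.0, 9.9, 9.0)

n_local_2 = effective_sample_size((0.0, 1.0, 0.0), n_first)      # only x = 2
n_uniform = effective_sample_size((1 / 3, 1 / 3, 1 / 3), n_first)
p_example = weighted_frequency((0.2, 0.5, 0.3), (0.1, 0.4, 0.6))  # made-up frequencies
```

With full weight on subpopulation 2 the effective sample size reduces to $n_{t2} = 9.9$, which illustrates why the estimate for $w = v = e_2$ rests on a small sample.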

Discussion
In this paper we study the variance effective size $N_{eV}$ of a substructured population, with particular focus on the size of the metapopulation ($N_{eVMeta}$). Our main findings are (i) that the version of $N_{eV}$ that is of interest for conservation can, under certain conditions, be found by maximizing the variance effective size with respect to subpopulation weights, in order to minimize the impact of migration and approximate $N_{eGD}$, (ii) that two new and more stable versions of $N_{eV}$ can be defined, and (iii) that the length of the optimal time window of $N_{eV}$, in terms of estimation accuracy, can be derived.
As a major tool for understanding the properties of $N_{eV}$, we analyze in detail two components of expected squared allele frequency change, defined in equations (21)-(23). The first term I is caused by genetic drift in subpopulations between the two time points at which genetic data is collected, whereas the second term II (or more precisely $-$II) quantifies a correlation between allele frequency change of the past and present. We refer to II as a migration or gene flow term, since it is mainly caused by gene flow between subpopulations, when these are assigned the same weights at both time points at which allele frequencies are estimated from data. General expressions are obtained for how the genetic drift and gene flow terms I and II involve the local census sizes and local effective sizes of subpopulations, the migration pattern between subpopulations, and the way in which subpopulations are weighted at the two time points between which genetic change is monitored.
The variance effective size is traditionally defined as in (20), so that expected squared allele frequency change is normalized by its expectation, a normalization that involves allele frequencies at the first time point at which genetic data is collected. We refer to this as the forward version of $N_{eV}$, since it corresponds to a forward time perspective on how allele frequency change is normalized. As mentioned under (ii), in this article we also introduce, in (25) and (30), two other notions of variance effective size, the intermediate and backward versions of $N_{eV}$, for which allele frequency change is normalized based on expected allele frequency change at both or only the last time point at which genetic data is collected.
The abovementioned three versions of $N_{eV}$ are very close when the interval between the two time points at which genetic data is collected is small, but they start to differ substantially for intervals with a length that is at least of the same order as the eigenvalue effective size $N_{eE}$. Two numerical examples are given in this paper in order to illustrate this. The first example represents a large metapopulation with $s = 10$ subpopulations of size 50, for which 200-300 generations are required for the three versions of $N_{eV}$ to differ substantially. The second example represents a small metapopulation with $s = 2$ subpopulations of size 10, for which fewer than 10 generations are sufficient for the three versions of $N_{eV}$ to differ substantially. We also show that the backward version of $N_{eV}$ is the most stable one and exists under general conditions, for time intervals of any length. In addition, as mentioned under (iii), we derive in (76), (86), and (90) the length of the optimal time interval for which the forward, intermediate and backward versions of $N_{eV}$ are estimated with maximal accuracy.
As mentioned under (i), a major implication of our work is that the variance effective size of a substructured population, with appropriately chosen subpopulation weights, is relevant for conservation applications. In more detail, let $\hat{N}_{eVw}$ be an estimate of the variance effective size, based on using the same subpopulation weight vector $w = v$ at both time points at which genetic data is collected. We conjecture that

$$\hat{N}_{eE} = \max_{w} \hat{N}_{eVw} \qquad (93)$$

is an estimate of the eigenvalue effective size $N_{eE}$ for some population systems close to migration-drift equilibrium. The rationale for (93) is equation (56), which implies that the variance effective size $N_{eVw}$, based on using the same subpopulation weights $w = v$ at both time points at which genetic data is collected, is maximized for reproductive subpopulation weights $w = \gamma$. This follows from the fact that $N_{eVw}$ is maximized when the gene flow term II vanishes, which happens for reproductive subpopulation weights $\gamma$. The conservation relevance of (93) follows from the fact that (a) $N_{eV}$ is closely related to the gene diversity effective size $N_{eGD}$, (b) $N_{eGD}$ equals $N_{eE}$ under migration-drift equilibrium, and (c) $N_{eGD}$ also approximates the additive genetic variance effective size $N_{eAV}$, which is of particular interest for long term conservation (Hössjer et al. 2016). Because of the conservation relevance of (93), it is of interest to develop software that automatically performs the maximization of this equation in order to compute $\hat{N}_{eE}$.

The reproductive weights $\gamma$ depend on the migration pattern between the subpopulations, which typically is unknown. However, equation (93) suggests that it is possible to estimate $\gamma$ indirectly (without first estimating migration rates between subpopulations) as the subpopulation weights that maximize $\hat{N}_{eVw}$. If all subpopulations contribute to the long term reproduction of the metapopulation, all components of $\gamma$ are positive. Whenever this is the case, in order to compute the estimator of $N_{eE}$ in (93), it is required that genetic data is collected from all subpopulations at the two time points between which genetic change is monitored. On the other hand, analysis of the dataset in Sect. 9 indicates that one of the subpopulations might be a source, since the maximum of (93) occurs when this subpopulation is assigned a maximal weight of 1. If this is a correct interpretation of the biological situation, only data from this subpopulation is needed in order to estimate $N_{eE}$. However, in order to confirm this conclusion a larger dataset is needed, and the validity of (93) must be investigated for situations where our present theoretical assumptions (migration-drift equilibrium and symmetric backward migration rates $B_{xy} = B_{yx}$ between all pairs $x, y$ of subpopulations, which implies $\gamma = (1/s, \ldots, 1/s)$) fail.

Several extensions of our work are possible. Firstly, it is possible to investigate whether the present conditions (migration-drift equilibrium and symmetric migration) for equations (56) and (93) can be extended to structured populations of more general form.
Secondly, it is of interest to develop a multilocus estimator of the backward version $N^{back}_{eV}$ of the variance effective size, which analogously to the JR07-estimator of $N^{int}_{eV}$ in Jorde and Ryman (2007) adjusts for finite sampling. Thirdly, for conservation purposes it is important to study the relation between $N_{eV}$, $N_{eGD}$, and $N_{eAV}$ for more general models. We have emphasized that $N_{eV}$, with reproductive subpopulation weights $w = v = \gamma$, is closely related to $N_{eGD}$ and $N_{eAV}$ (and also to $N_{eE}$ under migration-drift equilibrium). However, this is based on the assumption that $N_{eAV}$ refers to the change of additive genetic variance of a quantitative trait with no epistasis (Hössjer et al. 2016). It is therefore of interest to give more general expressions for $N_{eAV}$ when epistasis is taken into account. We conjecture that $N_{eAV}$ is still very similar to $N_{eGD}$, and to $N_{eV}$ with reproductive weights, for models with epistasis, since all three of these effective sizes only involve the drift term I, whereas $N_{eV}$ with other subpopulation weights will be different, since it also involves the correlation term II between past and present allele frequency change.
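The maximization over subpopulation weights suggested by (93) could be automated along the following lines. This is a minimal sketch, not existing software: `estimate_nev` stands for any user-supplied function that returns an estimate of the variance effective size for a weight vector $w = v$ (for instance a JR07-type estimator), and the grid search over the weight simplex is an illustrative choice. The toy estimator used at the end is made up purely to exercise the code.

```python
from itertools import product


def simplex_grid(s, steps):
    """All weight vectors of length s with entries k/steps that sum to 1."""
    for combo in product(range(steps + 1), repeat=s - 1):
        rest = steps - sum(combo)
        if rest >= 0:
            yield tuple(c / steps for c in combo) + (rest / steps,)


def maximize_over_weights(estimate_nev, s, steps=20):
    """Grid search for the weights w that maximize estimate_nev(w)."""
    best_w, best_val = None, float("-inf")
    for w in simplex_grid(s, steps):
        val = estimate_nev(w)
        if val > best_val:
            best_w, best_val = w, val
    return best_w, best_val


# Hypothetical stand-in for a real estimator: largest when all weight
# is placed on the second of three subpopulations.
def toy_estimator(w):
    return 590.0 - 200.0 * (w[0] + w[2])


w_hat, n_hat = maximize_over_weights(toy_estimator, s=3)
```

A grid search scales poorly with the number of subpopulations $s$, so for larger systems a constrained numerical optimizer would be a natural replacement. The maximizing weight vector doubles as an indirect estimate of the reproductive weights.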

A: Appendix with Numerical Examples for a Small and Subdivided Population
In this appendix we demonstrate that the forward, intermediate, and backward versions of the local realized variance effective size sometimes differ substantially for a small and subdivided population, even for time intervals $[0, \tau]$ of moderate (and biologically realistic) lengths. This is illustrated in Fig. 5 for an island model with $s = 2$ subpopulations of sizes $N_{ex} = N_{cx} = 10$, and a migration rate of $m = 0.1$. It can be seen that the three effective sizes differ considerably already for $\tau = 5$, with values $N_{eVRx} = 9.71$, $N^{int}_{eVRx} = 10.39$, and $N^{back}_{eVRx} = 11.07$, and even more for $\tau = 10$, with values $N_{eVRx} = 9.98$, $N^{int}_{eVRx} = 11.25$, and $N^{back}_{eVRx} = 12.51$. The forward and intermediate versions of the realized variance effective size exist for intervals of length up to $\tau = 52$ and 83 generations respectively, whereas the backward version of the realized variance effective size exists for intervals of any length.
The corresponding plot of the accuracy of estimates of genetic drift per generation, for intervals of varying length, also reveals a substantial difference between the three versions of variance effective size (cf. Fig. 6).

B.1: Series Expansion of r
We will view the right eigenvector $r = r(\varepsilon)$ of $A$ as a function of $\varepsilon = 1/(2N_{eE})$. Assuming that $0 < N_{eE} \le \infty$ is large, or equivalently that $\varepsilon \ge 0$ is small, we apply perturbation theory of matrices (Maruyama 1970a; Nagylaki 1980, 1995; Hössjer 2015) in order to find a linear approximation $r(\varepsilon) \approx r(0) + \varepsilon \dot{r}$.
Put $e_x = N_{eE}/N_{ex}$, $c_x = N_{eE}/N_{cx}$ and rewrite the elements of $A = A(\varepsilon)$ in (4) as in (94), with the auxiliary quantities given in (95). Note in particular that the two exponents $1(x = y)$ and $1(z = u)$ appear in $A_{xy,zu}(\varepsilon)$ as a consequence of (4). For this reason $A_{xy,zu}(\varepsilon)$ varies with $\varepsilon$, and $\dot{A}_{xy,zu} \ne 0$, only when at least one of the two conditions $x = y$ and $z = u$ is satisfied. It follows from (94) that $r(0) = 1_{s^2} = 1$ is a right eigenvector of $A(0)$ with eigenvalue 1, and therefore $1 + \varepsilon \dot{r}$ is a valid first order Taylor approximation of $r(\varepsilon)$ (Eq. (96)). In order to find a more explicit expression for $\dot{r}$, we will rewrite it as a linear combination of a system of orthonormal basis functions. To this end, recall that $l_i = (l_{i1}, \ldots, l_{is})$ is a left eigenvector of $B$ with a real-valued eigenvalue, and that $\{l_i\}_{i=1}^{s}$ is an orthonormal system of basis functions for $\mathbb{R}^s$. Define, for each $1 \le i, j \le s$, a row vector $l_{ij} = (l_{ij,xy} = l_{ix} l_{jy}; 1 \le x, y \le s)$ of length $s^2$. Then $\{l_{ij}; 1 \le i, j \le s\}$ forms an orthonormal system of basis functions for $\mathbb{R}^{s^2}$. We also introduce $\alpha_{ij} = l_{ij}\dot{r}$ as the coefficient of $l_{ij}^T$ in the basis function expansion (98) of $\dot{r}$. Since $l_{11} = 1^T/s$ is proportional to $r(0)^T = 1^T$, we may assume that a change from $\varepsilon = 0$ to $\varepsilon > 0$ results in a perturbation $r(\varepsilon) - r(0)$ orthogonal to $r(0)$. This corresponds to an assumption $0 = \alpha_{11} = l_{11}\dot{r} = 1^T\dot{r}/s$. In order to find a more explicit expression for all $\{\alpha_{ij}; (i, j) \ne (1, 1)\}$, we will derive a linear system of equations for the components of $\dot{r} = (\dot{r}_{xy})$, based on an analysis of how $r(\varepsilon)$ is perturbed after a small change of $\varepsilon$ away from zero. From the definition of $\varepsilon$, and of the eigenvalue effective size in (34), it follows that $\lambda(\varepsilon) = 1 - \varepsilon$. Together with (96), this makes it possible to rewrite $\lambda(\varepsilon) r(\varepsilon) = A(\varepsilon) r(\varepsilon)$ as (99). Equating the linear $\varepsilon$-terms of the left- and right-hand sides of (99) we find, after some rearrangements, Eq. (100). Because of (94) and (95), component $xy$ of Eq. (100) takes the form (101), where the term $1(x = y)$ of the second step is inherited from (95). In the last step of (101) we assumed $N_{ex} = N_e$ and $N_{cx} = N_c$, so that $e_x = e$ and $c_x = c$. Assume that $(i, j) \ne (1, 1)$. Multiplying the left- and right-hand sides of (101) with $l_{ij,xy}$, and summing jointly over $x$ and $y$, we obtain the expression (102) for $\alpha_{ij}$.

Inserting (96) into (42) we obtain (103), where in the second step of (103) we used $w = \gamma = 1_s/s$ and the fact that $\sum_{xy} \dot{r}_{xy} = s\alpha_{11} = 0$. In the third step of (103) we inserted the series expansion (98) of $\dot{r}$, in the fourth step we utilized (102), and in the final step we invoked the definitions of $c = N_{eE}/N_c$, $e = N_{eE}/N_e$, and $\varepsilon = 1/(2N_{eE})$. This completes the proof since the right-hand side of (103) agrees with (43).
Inserting (98) and (105) into (104) we obtain (106). Then we insert the expression (102) for $\alpha_{ij}$, that was derived in Appendix B.1, into (106). This yields (107). By substituting the definitions of $\varepsilon = 1/(2N_{eE})$, $c = N_{eE}/N_c$ and $e = N_{eE}/N_e$ into (107) we finally arrive at (48).
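The claim in Appendix B.1 that the product vectors $l_{ij,xy} = l_{ix} l_{jy}$ inherit orthonormality from $\{l_i\}$ can be checked numerically. The snippet below is a small sketch for $s = 2$. The basis vectors are an illustrative choice (the eigenvectors of a symmetric two-population migration matrix), indices are 0-based so that `(0, 0)` corresponds to $l_{11}$, and the helper names are hypothetical.

```python
import math


def product_basis(basis):
    """Vectors l_ij of length s*s with entries l_ix * l_jy."""
    s = len(basis)
    return {
        (i, j): [basis[i][x] * basis[j][y] for x in range(s) for y in range(s)]
        for i in range(s)
        for j in range(s)
    }


def dot(a, b):
    """Euclidean inner product of two equally long sequences."""
    return sum(u * v for u, v in zip(a, b))


s = 2
root = 1.0 / math.sqrt(s)
l = [[root, root], [root, -root]]  # orthonormal basis of R^2
l2 = product_basis(l)

# Gram matrix of the s*s product vectors; it should be the identity.
gram = {(a, b): dot(l2[a], l2[b]) for a in l2 for b in l2}
```

The first product vector has all entries equal to $1/s$, which is the sense in which $l_{11}$ is proportional to $r(0) = 1$.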

B.4: Proofs of Equations (55) and (56)
Recall that Eq. (55) stipulates that the variance effective size at equilibrium, for the island model, is maximized when the same subpopulation weights are used at time points $t$ and $t + \tau$ ($w = v$). Equation (56), on the other hand, is a claim that whenever the same subpopulation weights are used at both time points ($w = v$), the variance effective size at equilibrium is maximized for reproductive subpopulation weights ($w = v = \gamma$). In order to prove these claims we will make use of (52), which states that the variance effective size $N^{eq}_{eV}$ under equilibrium is a strictly decreasing function of the equilibrium gene flow term $II_\infty(\tau)$ of Eq. (47). That is, we need to translate (55) and (56) into analogous inequalities for the equilibrium gene flow term in (47). To this end, we will highlight that this term is a function of the weight vectors $w$ and $v$ at time points $t$ and $t + \tau$. More specifically, we will write $II_\infty(\tau) = II_{\infty wv}(\tau)$ and $II_{\infty w}(\tau) = II_{\infty ww}(\tau)$. Then, in order to prove (55) it suffices to establish inequality (108) for the island model whenever $wv^T \le |w|^2$, with equality in (108) if and only if $wv^T = |w|^2$. And in order to establish (56) it suffices to prove the corresponding inequality (109) for the symmetric migration models of Example 1, with equality if and only if $w = \gamma$. But (108) follows immediately from (49), whereas (109) is deduced from (48), with identical expansion coefficients for $w$ and $v$ (since $w = v$ is assumed), and the fact that $w = \gamma$ if and only if the coefficients with indices $2, \ldots, s$ all vanish.
Acknowledgements The authors wish to thank the handling editor and two reviewers for valuable suggestions that considerably improved the quality of the manuscript. Financial support from the Swedish

(108) $II_{\infty wv}(\tau) \ge II_{\infty w}(\tau)$,

for which $N_{ex} = N_e$ and $N_{cx} = N_c$ are the same for all subpopulations. The migration rates $B_{xy} = m'/(s - 1) = m/s$ between all pairs $x \ne y$

Table 1 A summary of the most important notation
$x$: Index of a subpopulation ($\in \{1, \ldots, s\}$)
$t$: Time index, in units of generations ($\in \{-T, -T + 1, \ldots\}$)
$T$: Number of generations ago ($t = -T$) when the founder population lived
$\tau$: Length of time interval along which genetic change is assessed
$B_{xy}$: Backward migration rate from $x$ to $y$, or the fraction of parents of individuals in subpopulation $x$ that originate from subpopulation $y$ one generation ago
$B$: Square matrix $(B_{xy})_{x,y=1}^{s}$ of order $s$ with all backward migration rates
$m'$: Migration rate of island model ($B_{xy} = m'/(s - 1)$ when $x \ne y$)
$m$: Fraction of parents in the island model that originate from the whole population one generation ago ($= sm'/(s - 1)$)
$w_x$: Weight of subpopulation $x$ at the start ($= t$) of the time interval along which genetic change is assessed
$w$: Vector of subpopulation weights ($= (w_1, \ldots, w_s)$) at the start of the time interval along which genetic change is assessed
$v_x$: Weight of subpopulation $x$ at the end ($= t + \tau$) of the time interval along which genetic change is assessed
$v$: Vector of subpopulation weights ($= (v_1, \ldots, v_s)$) at the end of the time interval along which genetic change is assessed
$e_x$

Fig. 1
Fig. 1 The figure plots effective sizes for an island model with $s = 10$, $N_{ex} = N_{cx} = 50$ and $m = 0.1$, when $t = 0$ corresponds to migration-drift equilibrium ($T \to \infty$). The horizontal solid line corresponds to $N_{eE} = 519.28$. The three curves correspond to $N_{eVRx}$ (dotted), $N^{int}_{eVRx}$ (dash-dotted) and $N^{back}_{eVRx}$ (dashed) for intervals $[0, \tau]$ of increasing length. $N_{eVRx}$ increases with $\tau$ at first, then it starts to drop until $\tau = \tau_{max} = 2431$, and for longer intervals $N_{eVRx}$ does not exist. In comparison, formula (69) predicts $\tau_{max,V} = \log(II_\infty^{-1}) \cdot 2N_{eE} = 2432.8$. In a similar fashion $N^{int}_{eVRx}$ increases with $\tau$ at first, then it starts to drop until $\tau = \tau_{max} = 3151$, and after this generation $N^{int}_{eVRx}$ does not exist. In comparison, formula (71) predicts $\tau^{int}_{max,V} = \log(2 II_\infty^{-1}) \cdot 2N_{eE} = 3152.7$. On the other hand, $N^{back}_{eVRx}$ increases monotonically to $N_{eE}$ as $\tau \to \infty$

Figure 4 illustrates $\mathrm{Var}(\hat{Q})^{1/2}/\sigma$ for an island model with $s = 10$ subpopulations. The linear decay to the left of the figure, for smaller $\tau$, corresponds to $\mathrm{Var}(\hat{Q})^{1/2}$ being inversely proportional to $\tau$ for intervals of short length, in agreement with (75).
eVR3 = 343 correspond to choosing local weights $w = v = e_x$ for $x = 1, 2, 3$. It can be seen that the intermediate version of the variance effective population size is maximized for local weights of subpopulation 2, i.e. $\hat{N}^{int}_{eVwv} = 590$, for $w = v = e_2 = (0, 1, 0)$.
develop software that automatically performs the maximization of this equation in order to compute $\hat{N}_{eE}$.

Fig. 5
Fig. 5 The figure plots effective sizes for an island model with $s = 2$, $N_{ex} = N_{cx} = 10$ and $m = 0.1$, when $t = 0$ corresponds to migration-drift equilibrium ($T \to \infty$). The horizontal solid line depicts $N_{eE} = 22.42$. The three curves correspond to $N_{eVRx}$ (dotted), $N^{int}_{eVRx}$ (dash-dotted) and $N^{back}_{eVRx}$ (dashed) for intervals $[0, \tau]$ of increasing length. $N_{eVRx}$ increases with $\tau$ at first, then it starts to drop until $\tau = \tau_{max} = 52$, and for longer intervals $N_{eVRx}$ does not exist. In comparison, formula (69) predicts $\tau_{max,V} = \log(II_\infty^{-1}) \cdot 2N_{eE} = 52.96$. In a similar fashion $N^{int}_{eVRx}$ increases with $\tau$ at first, then it starts to drop until $\tau = \tau_{max} = 83$, and after this generation $N^{int}_{eVRx}$ does not exist. In comparison, formula (71) predicts $\tau^{int}_{max,V} = \log(2 II_\infty^{-1}) \cdot 2N_{eE} = 84.04$. On the other hand, $N^{back}_{eVRx}$ increases monotonically to $N_{eE}$ as $\tau \to \infty$
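The predictions of formulas (69) and (71), as quoted in the captions of Figs. 1 and 5, can be cross-checked against each other. The sketch below back-solves $II_\infty$ from the reported forward limit and then evaluates the intermediate limit; since $II_\infty$ itself is not reported in the captions, the recovered value is only as accurate as the rounded inputs. Function names are hypothetical.

```python
import math


def tau_max_forward(ii_inf, n_eE):
    """Formula (69) as quoted in the captions: log(1/II_inf) * 2 * N_eE."""
    return math.log(1.0 / ii_inf) * 2.0 * n_eE


def tau_max_intermediate(ii_inf, n_eE):
    """Formula (71) as quoted in the captions: log(2/II_inf) * 2 * N_eE."""
    return math.log(2.0 / ii_inf) * 2.0 * n_eE


def ii_inf_from_forward(tau_max, n_eE):
    """Invert formula (69) to recover II_inf from a reported tau_max."""
    return math.exp(-tau_max / (2.0 * n_eE))


# (N_eE, reported forward limit) for Fig. 1 and Fig. 5.
reported = {"Fig. 1": (519.28, 2432.8), "Fig. 5": (22.42, 52.96)}
predicted_int = {}
for name, (n_eE, tau_fwd) in reported.items():
    ii_inf = ii_inf_from_forward(tau_fwd, n_eE)
    predicted_int[name] = tau_max_intermediate(ii_inf, n_eE)
```

The two limits always differ by $\log(2) \cdot 2N_{eE}$, so the intermediate version gains a fixed number of generations over the forward version regardless of $II_\infty$.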

Fig. 6
Fig. 6 The figure shows a log-log plot of the normalized standard deviation $\mathrm{Var}(\hat{Q})^{1/2}/\sigma$ for estimating $Q = 1/(2N^{method,eq}_{eVw})$, the average amount of genetic drift per generation at equilibrium, for time intervals $[0, \tau]$. The population is an island model with $s = 2$, $N_{ex} = N_{cx} = 10$, and $m = 0.1$. The dotted, dashed and dash-dotted lines correspond to variance effective sizes of a local population ($w = x$), using the forward, intermediate and backward methods respectively, whereas the solid line (the same for all three methods) corresponds to the metapopulation ($w = \mathrm{Meta}$). The optimal time intervals have lengths $\tau_{opt} = 20$, $\tau^{int}_{opt} = 29$, and $\tau^{back}_{opt} = 44$ for the local curves of the forward, intermediate and backward methods, and $\tau_{opt,Meta} = 44$ for the metapopulation. The corresponding approximations are $d_{opt} \cdot 2N_{eE} = 22.24$, $d^{int}_{opt} \cdot 2N_{eE} = 31.07$, $d^{back}_{opt} \cdot 2N_{eE} = 44.83$, and $d_{opt,Meta} \cdot 2N_{eE} = 44.83$ respectively. The largest time intervals for which the forward and intermediate effective sizes can be estimated are $\tau_{max,V} = 52$ and $\tau^{int}_{max,V} = 83$ respectively
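The proportionality constants implied by the caption of Fig. 6 follow from dividing the reported approximations by $2N_{eE}$. The snippet below is plain arithmetic on the reported numbers, with $N_{eE} = 22.42$ taken from Fig. 5; variable names are illustrative.

```python
N_EE = 22.42  # eigenvalue effective size reported in Fig. 5

# Approximate optimal interval lengths d * 2 * N_eE from the caption of Fig. 6.
approx_opt = {"forward": 22.24, "intermediate": 31.07, "backward": 44.83}

# Implied proportionality constants d for each method.
d = {method: value / (2.0 * N_EE) for method, value in approx_opt.items()}

# Exact optimal lengths from the same caption, for comparison.
exact_opt = {"forward": 20, "intermediate": 29, "backward": 44}
gap = {method: approx_opt[method] - exact_opt[method] for method in approx_opt}
```

The implied constants increase from the forward to the backward method, with the backward constant close to 1, in line with the ordering of optimal interval lengths derived in Sect. 8.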

Table 1
$N_{eQRx}$: Realized local effective size of type $Q$ for subpopulation $x$ ($= N_{eQe_x} = N_{eQe_xe_x}$). It equals $N_{ex}$ when $x$ is isolated from the other subpopulations
$N_{eQRxy}$: Realized local effective size of type $Q$ when subpopulations $x$ and $y$ ($x \ne y$) receive full weight at the two end points of the time interval ($= N_{eQe_xe_y}$)

Table 3
Values of the realized local variance effective size $N_{eVRxy}$ at migration-drift equilibrium, for a time interval of length $\tau = 1$, so that different subpopulations $x$ and $y$ receive full weight at the two end points of the interval. An island model with $s = 10$ subpopulations is used, with local effective size $N_{ex} = 50$ under isolation, local census size $N_{cx}$ and migration parameter $m$ (where $B_{xy} = m/s$ when $x \ne y$ and 500

Table 4
Estimated variance effective sizes $\hat{N}^{int}_{eVwv}$, based on subpopulation weights $w$ and $v$ at time points $t$ and $t + 6$, for the brown trout data set of lake Ånnjön, with $s = 3$ cryptic subpopulations