Trunk perimeter correlates with genetic bottleneck intensity and the level of genetic diversity in populations of Taxus baccata L

Taxus baccata remnants established recently tend to contribute less to the species’ overall genetic variation than historical populations because they are subjected to a greater impact of the founder effect and genetic isolation. As tree trunk perimeter is a rough indicator of genetic variation in a population, this measure should be considered in conservation programs. Genetic variation within Taxus baccata (L.) populations is not associated with the current census size but correlates well with the effective size, suggesting that genetic drift intensity reflects variation in demographic histories. We hypothesize that recently established populations are subjected to greater bottleneck than old remnants. Using the mean trunk perimeter as a surrogate of tree age, we test whether the demographic history and genetic variation are associated with the mean tree age. Using 18 microsatellite markers, we analyze the genetic diversity and demographic history of 11 yew populations in Poland to assess the relationship between the mean trunk perimeter and the inferred genetic parameters. Populations reveal significant differences in levels of genetic variation and in the intensity and time of genetic bottleneck. After excluding an apparent outlier, the genetic variation is significantly greater while the bottleneck intensity lower in populations with a greater perimeter. Due to continuous species decline and increasing fragmentation, the non-uniform contribution of yew remnants to the overall genetic variation tends to decrease together with the mean tree age. Germplasm collections for the species should take into account tree perimeter as a rough indicator of the genetic variation of a population.


Introduction
Preserving genetic variation is crucial for protecting endangered species and populations (Aravanopoulos 2016;Ralls et al. 2018), especially to mitigate effects of progressing anthropogenic changes, such as increased exploitation and fragmentation of habitats, as well as the on-going climate changes. Even if it is difficult to predict responses of wild populations to environmental threats, the standing genetic variation is believed to be a fundamental driver for adaptive changes in the future (Pelletier and Coltman 2018).
In addition to natural selection, gene flow among populations and genetic drift are the main factors shaping the genetic resources of a population (Johnson et al. 2018). Generally, gene exchange between populations helps to maintain the genetic variation within populations, neutralizing the negative effect of genetic erosion due to local genetic drift (Willi et al. 2006). As a result, gene flow can maintain or even increase a population's adaptive potential (Swindell and Bouzat 2006). Furthermore, gene flow can counteract inbreeding depression (Ingvarsson and Whitlock 2000) and stochastic demographic fluctuations (Alleaume-Benharira et al. 2006), which otherwise can push isolated populations towards extinction (Brook et al. 2002).
In a meta-population with extinction and recolonization dynamics (turnover dynamics; Haag et al. 2005), newly established (pioneer) populations are characterized by low genetic diversity due to the founder effect (McCauley et al. 1995). Under a sufficient gene flow rate relative to the extinction rate, a pioneer population gains genetic variation, reaching the migration-drift equilibrium eventually. Stochastic population turnover causes a meta-population to consist of populations of different age and different levels of accumulated genetic diversity. Consequently, at any given snapshot in time, a positive relationship between population age and genetic diversity is expected across a metapopulation. However, when gene flow rates are low and/or turnover rates are high, the age vs. genetic diversity relationship may be undetectable because populations go extinct before they start reaching the drift-migration equilibrium (Whitlock 1992;McCauley et al. 1995). Therefore, it is not surprising that empirical studies provide only partial support for the "population age vs. genetic diversity" relationship (e.g., Schneller and Holderegger 1996;Powolny et al. 2016). Natural populations do not conform to the "age vs. genetic diversity" pattern when numerous source populations are located a short distance from a colonized patch (Helsen et al. 2013), or/and when a species has the potential to long-distance dispersal (Raffl et al. 2008;Pluess 2011). In such cases, newly founded populations tend to rapidly reach high genetic diversity and low divergence at the early successional stage and quickly become indistinguishable from source populations (Lefèvre et al. 2004). In this respect, trees seem to be especially "resistant" to the founder effect during colonization (Austerlitz et al. 2000;Born et al. 2008;Elleouet and Aitken 2019;Lander et al. 2020), not only due to high dispersal capabilities but also to specific life-history traits which alleviate genetic drift associated with founder effects.
Taxus baccata L. can serve as a good example of a species characterized by a well-fragmented meta-population subjected to the extinction-recolonization dynamics. T. baccata is a dioecious conifer having a wide range of natural distribution across Europe, the Middle East, and North Africa (Thomas and Polwart 2003). Nevertheless, throughout its range, the natural yew's occurrence is limited to small, isolated populations (Thomas and Polwart 2003;Thomas and Garcia-Martínez 2015;Iszkuło et al. 2016). It is believed that such a high fragmentation is mainly the result of adverse climate conditions in the Holocene, intensive browsing, forest management, and the exploitation of this species in the past (e.g., Linares 2013). Studies on the ancient range of the yew showed that until 2500-5000 years ago, it was a fairly common species in Europe (Deforce and Bastiaens 2007), although the abundance was likely relatively variable across the range (Uzquiano et al. 2015;Iszkuło et al. 2016). The progressive specialization of land use, including the forest areas' transformation into agricultural fields and monoculture standards in the recent past, caused that today, T. baccata is a rare species in natural sites throughout its distribution. On the other hand, T. baccata is very common as an ornamental plant. For this reason, taking into account anthropogenic "populations", it is difficult to consider the yew an endangered species. Nevertheless, the future condition of natural populations is unclear, especially in the context of climate change and the current scale of isolation of individual sites (Thomas and Garcia-Martínez, 2015;Chybicki and Oleksa 2018).
T. baccata is characterized by high genetic variation (Myking et al. 2009;Dubreuil et al. 2010;González-Martínez et al. 2010;Litkowiec et al. 2015), but also a strong genetic differentiation among populations (Myking et al. 2009;Dubreuil et al. 2010;González-Martínez et al. 2010;Burgarella et al. 2012;Litkowiec et al. 2018). So far, the studies indicated that the current genetic structure is shaped by a genetic drift that is often driven by genetic bottlenecks (Chybicki et al. 2012;Litkowiec et al. 2018). It seems that this is a direct effect of isolation and limited gene flow both among (Hilfiker et al. 2004;Myking et al. 2009;Dubreuil et al. 2010) and within populations (Lavabre et al. 2016;Chybicki and Oleksa 2018). Limited seed and pollen dispersal create a spatial genetic structure within natural populations Chybicki et al. 2016), which also results in mating between relatives (Chybicki et al. 2019). Interestingly, contemporary levels of biparental inbreeding co-vary with distance to the nearest male and tree girth, but does not depend on the census population size (Chybicki et al. 2019). The effect of distance to the nearest male can be associated with spatially restricted mating patterns (Chybicki and Oleksa 2018) and the spatial genetic structure Chybicki et al. 2016). On the other hand, the decrease in biparental inbreeding level together with the increase in mean tree girth suggests there is a trend towards decreased opportunities and/or possibilities for sib-ship mating in older yew remnants. Such a pattern can result from the maintenance of sufficient frequency of unrelated individuals or sufficient frequency of lethal factors (or both) in old populations, suggesting that the level of genetic diversity may co-vary with the mean tree age in a population.
The variation in the ratio of effective to census population size observed previously (Chybicki et al. 2012;Litkowiec et al. 2018) suggested different rates of genetic drift across yew stands, likely due to differences in demographic histories. Here, we studied whether demographic histories and genetic diversity are associated with the mean adult tree age. Because naturally occurring T. baccata is under protection in Poland, the trunk perimeter was used as a non-invasive but still highly predictive surrogate of tree age (Chen et al. 2020). Consistently with theoretical predictions for a metapopulation with extinction-recolonization dynamics, we hypothesized that genetic diversity increases with the mean adult tree age. In addition, noting that the yew is a very longlived tree species, we assumed that remnants characterized by older adult trees are representatives of a more abundant historical distribution of the species, while populations with younger adult trees (i.e., presumably established recently) show effects of a strong bottleneck due to a limited number of recent seed sources. Therefore, we hypothesized that molecular signs of genetic bottleneck decrease together with the increase in mean tree age.

Plant sampling and DNA analysis
This study examined 11 populations of Taxus baccata in Poland. Populations were located in protected areas, except for Kórnik. However, yew trees in Kórnik have been left to regenerate freely, and can therefore be considered an undisturbed, semi-natural (naturalized) population (Iszkuło and Boratyński 2005). Except for two remnants (Cisowa Góra A and B) which are only 1 km apart, populations are well separated, and the mean distance between populations is 238 km (see Fig. 1D). All populations were characterized by the mean trunk perimeter. For this purpose, to avoid the impact of sex ratio differences between samples, only female trees (15 random adult female trees per population) were measured at breast height (1.3 m) ( Table 1). In every population, we selected only mature, well seed-bearing trees, representing possibly the oldest individuals within a given location. In this way, we attempted to assess the maximum tree perimeter, reflecting the oldest possible demographic events. Among the study populations, only three were previously characterized for the tree age based on dendrochronological measurements: Cisowa Góra A (173-218 years), Kórnik (105-148 years), and Rokita (49-51 years) (based on two trees per site; Cedro and Iszkuło 2011). Based on these limited data, tree age and tree perimeter show good agreement with each other, despite substantial differences in habitat characteristics among the three populations. Within each site, 17 to 58 trees were sampled (needles), yielding a total of 384 individuals. The total genomic DNA was extracted using GeneMATRIX Plant & Fungi DNA Purification Kit (EURx, Poland). Eighteen EST (Expressed Sequence Tag) microsatellite sequences (Ueno et al., 2015) were used for genotyping (Chybicki et al. 2020) using the protocol described in Chybicki et al. (2019). All the permissions required to perform the above actions in protected areas were received from dedicated Regional Directorates for Environmental Protection according to the Polish Environmental law.

Genetic diversity
The genetic diversity was characterized by the number of alleles (A), observed heterozygosity (H o ), and expected heterozygosity (H e ) using the INEST 2.2 software (Chybicki and Burczyk 2009) and the allelic richness (AR) using FSTAT 2.9.4 (Goudet 1995). For each population, the inbreeding coefficients were estimated jointly with null allele frequencies (p null ) using the Bayesian approach implemented in the INEST 2.2 software (Chybicki and Burczyk 2009). To determine whether inbreeding and/or null alleles were important factors influencing the observed genotypic structure, the Deviation Information Criterion (DIC) was used to select the best model for a given population. To test the homogeneity in genetic variation among populations, we used the Friedman test (the nonparametric equivalent of the repeated measures ANOVA) with AR and H e across markers used as independent blocking variables and populations used as independent treatments.

Genetic structure
To characterize the overall strength of the genetic structure, we ran the classic Analysis of Molecular Variance (AMOVA; Excoffier et al. 1992). The significance of F ST index was determined based on 10,000 permutations. We also ran the F ST -outlier test to verify whether microsatellite markers follow the neutral model of evolution. The null distribution of F ST under neutrality was determined based on 20,000 coalescent simulations. Both tests were performed using Arlequin 3.5 (Excoffier and Lischer, 2010).
Subsequently, using STRU CTU RE 2.3.4 (Pritchard et al. 2000), we performed clustering of individual genotypes into gene pools, ignoring the information about sampling localities. The number of clusters (K) was assumed to vary from 1 to 15. For each K, ten runs were performed, with the burn-in length of 250,000 and 250,000 iterations. The best K was determined using the approach described recently in Puechmaille (2016) because the MEDMEDK, MEDMEAK, MAXMEDK, and MAXMEAK estimators tend to be more accurate than the original approach (Pritchard et al. 2000) as well as the ∆K method (Evanno et al. 2005) for both evenly and unevenly sampled datasets. Puechmaille estimators were obtained using the StructureSelector (Li and Liu 2018). The best K was selected as the number of clusters for which Puechmaille's estimators show the maximum value consistently. Finally, the population admixture coefficient (Admix), an indicator of historical gene flow, was computed under the best K using the Gini-Simpson index.

Demographic parameters
The effective population size (N e ) was assessed using two estimators revealing the best statistical properties out of the available methods (Wang 2016). The first method, implemented in Colony 2.0.6.5 program (Wang 2004), is based on frequencies of full-and half-sib relationships reconstructed using genetic data (SF-N e ; Jones and Wang 2010). The second estimator, available in the NeEstimator 2.1 computer program (Do et al. 2014), is based on the level of linkage disequilibrium generated due to genetic drift (LD-N e ; Waples 2006). The observed MR values were tested against the demographic equilibrium value MR eq using the Wilcoxon signed-rank test. MR eq was computed based on the coalescent simulations (analogously to Cornuet and Luikart 1996), assuming the two-phase mutation model (with the default settings in INEST 2.2). The intensity of the bottleneck was assessed using ∆ MR = MR eq -MR. Because of MR < MR eq under bottleneck (Garza and Williamson 2001), the larger ∆ MR value, the higher signal of the bottleneck. The homogeneity in genetic bottleneck signal among populations was evaluated using the Friedman test with ∆ MR across markers used as an independent blocking variable and populations used as independent treatments.
Demographic histories of the studied populations were also characterized using an Approximate Bayesian Computation approach (ABC) implemented in DIYABC 2.1.0 (Cornuet et al. 2014). Due to limited genetic data (as for the robust inference based on the ABC approach), we considered only two simple, a priori equally likely demographic models: either a population evolved without any change in size or a single change in population size was assumed to occur at the time of t generations ago. Each population was analyzed independently, ignoring any gene flow between populations. Based on the estimates of the effective population size obtained independently, the current population size (N 0 ) was assumed to follow the uniform distribution within the range of 4-1000. For the alternative model, we assumed the historical (ancestral) effective size (N A ) to follow the uniform distribution within the range of 10-100,000. In addition, for the alternative model, the demographic change was assumed to occur t generations back in the past, with t sampled from the uniform distribution within the range 1-10,000. The default settings were used regarding the mutation model, except for the average mutation frequency minimum (set to 10 -5 ) and the individual locus mutation frequency minimum (set to 10 -6 ). The mean A, the mean H e , the mean variance in allele size, and the mean MR were used as summary statistics for the analysis. To approximate the posterior distribution, 2000 bestmatching samples out of 1,000,000 random coalescent samples were used. Model probability was estimated using the logistic regression approach. To characterize population demography, we focused on the compound parameters: N 0 μ, N A μ, and τ = t μ because their ratios inform about the ratios of the natural parameters, regardless the assumption about the mutation rate μ (given that the mutation rate μ is uniform across populations). Parameter estimates were obtained after the log transformation.

Trunk perimeter vs. genetic parameters
Finally, we assessed the relationship between the mean trunk perimeter and the genetic parameters inferred in the abovedescribed analyses. As an indicator of association, we used Spearman's rank correlation because it allows potential nonlinear relationships to be taken into account.

Genetic diversity within populations
The average number of alleles within the study populations ranged from 2.3 (Bogdaniec) to 4.8 (Cisowa Góra B and Czarne), with a mean of 4.1 ( Table 2). The allelic richness (based on a minimum of 17 samples) ranged from 2.3 (Bogdaniec) to 4.6 (Cisowa Góra A), with a mean of 3.8. On average, the observed heterozygosity (mean = 0.564, range: 0.508-0.622) was close to the mean expected heterozygosity (mean = 0.564, range: 0.434-0.627) across populations. However, at the population level, we observed homozygosity excess. According to the Bayesian analysis results, the excess in homozygosity was mostly due to null alleles (Table 2). Only three populations showed signs of inbreeding, but the estimated inbreeding level corrected for the presence of null alleles was low (between 0.027 and 0.028). Despite the significant contribution of null alleles to the observed genotypic structure, their population-specific mean frequency across markers was rather low, ranging from 0 to 7.2% with a grand mean of 3.0%. With respect to the genetic variation, the study populations deviated significantly from homogeneity in AR (p value < 0.001) but not in H e (p value = 0.060), as revealed by the Friedman test.

Genetic structure
The AMOVA analysis showed that the genetic variation is significantly structured into among-and within-population components (Table 3), with the differences among populations (sampling locations) explaining c. 11% of the total genetic variation. The F ST -outlier test revealed no departures of the microsatellite markers from the null distribution, suggesting that the observed genetic structure was not influenced by selection at any locus (Appendix Fig. 4). After processing the STRU CTU RE analysis results with the ΔK approach, K = 2 appeared to be the most plausible number of genetic clusters. However, due to the presence of multiple peaks (Fig. 1), the analysis suggested a hierarchical genetic structure. The MaxMean estimator showed that the most optimal number of clusters K = 9. With K = 9 as the optimal number of clusters, individuals were grouped into sampling locations, except for Wierzchlas and Czersk (29 km apart). Based on the STRU CTU RE results, the admixture coefficient ranged from 0.05 to 0.62, with a mean of 0.34. The level of admixture appeared to vary consistently with the expected heterozygosity.

Demographic parameters
The observed values of M-ratio (MR) were below the values expected for the mutation-drift equilibrium (MR eq ) for all populations (Table 4). The Wilcoxon signed-rank test revealed that the MR deficiency (MR < MR eq ) was significant in all cases, providing strong support for the presence of bottleneck event(s) in population histories. However, the Friedman test revealed that populations were significantly heterogeneous with respect to bottleneck intensity (p value = 0.010).
The SF and the LD estimators appeared to be generally concordant with each other (r Spearman = 0.836; p value = 0.001), yielding very close harmonic means across populations of 17.1 vs. 17.3. Based on the harmonic mean N e for the two methods, effective population size ranged from 4.0 (Bogdaniec) to 80.1 (Czersk), with the harmonic mean across populations of 17.2. Interestingly, the two populations (Bogdaniec and Rokita)   (Table 4).
Consistently with the MR-based bottleneck test, the coalescent-based ABC approach showed that the scenario with a single change in population size was significantly better than the null model (no change) ( Table 5). Although the estimates of demographic parameters obtained under the best model showed wide credible intervals, in all cases, the historical population size was significantly greater than the current population size. The UPGMA clustering of populations based on the Euclidean distance between (log-transformed) posterior modes of coalescent parameters revealed the existence of two main demographic groups ( Fig. 2A). The first group (5 populations) was characterized by a very recent genetic bottleneck (Fig. 2C). The second group (6 populations) embraced populations characterized by a moderate-tohigh time to bottleneck. Concerning the time to bottleneck event, populations in the second group were characterized by quite heterogeneous demographic histories, as suggested in Fig. 2C, although the two subgroups had insufficient bootstrap support of 70%. The linear regression analysis for the relationship between the estimate of N 0 μ and the harmonic mean N e showed that the slope was significantly greater than zero (b 1 = 0.0048; SE = 0.0019) while the intercept was non-significant (b 0 = 0.0778; SE = 0.0776) (Fig. 2B). Using the regression slope b 1 as the estimate of mutation rate μ to re-scale the estimated demographic history parameters (Table 5), we found that the most recent bottleneck occurred Table 4 Results of the M-ratio test for genetic bottleneck and estimates of effective population size of Taxus baccata populations. MR, the Garza-Williamson statistics (M-ratio); Δ MR , the difference between the equilibrium value of MR and the observed value; p value, the p value for the bottleneck test; LD-N e , the linkage disequi-librium-based estimate of effective size; SF-N e , the sibling frequencybased effective size; Mean N e , the harmonic mean of N e across two methods; N e /N, the ratio of the mean effective to census population size; 95% CI, the 95% confidence interval  (Fig. 2C). The estimated ancestral population size ranged between 147 (Bogdaniec) and 1154 (Cisowa Góra B) with the harmonic mean of 383. The genetic structure parameters (AR, H e , and Admix) appeared to co-vary quite well with the mean trunk perimeter (Fig. 3). With the remarkable exception of Czersk (CS), study populations with a greater perimeter tended to show a greater allelic richness as well as genetic diversity (H e ) and admixture of gene pool (Admix). The CS population appeared to represent an apparent outlier also in correlations between the trunk perimeter and demographic parameters (Fig. 3). After ignoring CS, the trunk perimeter appeared to correlate significantly (positively) with all genetic structure parameters (AR, H e , and Admix) and the following demographic parameters: negatively with the indicator of bottleneck Δ MR and positively with N e , N 0 μ, and N 0 /N A .

The main considerations
The study populations showed high genetic differentiation, with gene pools corresponding mostly to sampling locations. Also, they showed a significant but variable signal of a genetic bottleneck. Generally, except for a single population, populations characterized by a greater trunk perimeter were genetically more variable and they tended to have a greater effective population size. Moreover, populations with a greater trunk perimeter tended to show a lower deficiency in the Garza-Williamson index and a higher ratio of current to historical population size, suggesting that younger populations are characterized by stronger genetic bottleneck effects.
In this study, the mean trunk perimeter (girth) was used as a surrogate of tree age. Generally, the tree girth is a measure of radial tree growth that reflects a combined effect of the age (as a primary factor) and the annual growth rate (as a confounding factor), which may differ both among sites and among individuals within a single site (e.g., George et al. 2019). While the impact of individual variation in growth rates can be reduced via averaging, the impact of among-site variation cannot be neglected. Such differences in tree-ring widths were observed among the three populations included in this study (see Cedro and Iszkuło 2011), and attributed to temperature and precipitation. In addition, growth rates may also vary due to light conditions (Devaney et al. 2015), which seem to be different in the study sites due to forest structure differences (pers. obs.). Therefore, tree girth is not a perfect measure of tree age. On the other hand, the perimeter of a tree Besides ecophysiological factors (De Swaef et al. 2015), a variable response of trees to their environments due to variation in genetically-based preadaptation can shape the trunk perimeter's inter-site variation (Lu et al. 2014). Assuming that a greater trunk perimeter reflects better individual adaptation (more efficient resource acquisition), natural selection may potentially explain the observed relationship between tree girth and genetic parameters. Keeping in mind that the microsatellite markers used in this study were developed based on the transcriptome sequence (Ueno et al. 2015), natural selection could directly impact the pattern of genetic diversity, especially through heterozygosity advantage (Ledig et al. 1983;Bush et al. 1987;Ellis and Burke 2007). However, according to the F ST -outlier test, the genetic markers did not show any traces of natural selection, making the selection rather an unreliable factor, especially since it would have to act at most loci simultaneously. Therefore, taking the abovementioned limitations into account, we believe that the relationship between genetic parameters and the perimeter observed in this study reflects the effect associated with the mean tree age.
The observed patterns of genetic diversity followed quite well the predictions for a meta-population with extinctionrecolonization dynamics. Hence, the results imply that the species' genetic structure is shaped by a balanced impact of gene flow and turnover dynamics. The long-term scattered distribution coupled with a limited dispersal potential (relative to spatial isolation; Chybicki and Oleksa 2018) suggests that gene flow among yew populations could be low. Such a conclusion is in line with the findings of earlier studies (Myking et al. 2009;Dubreuil et al. 2010;González-Martínez et al. 2010;Chybicki et al. 2012). According to the STRU CTU RE results, older populations showed higher admixture levels suggesting that populations have exchanged genes since their establishment. On the other hand, young populations (except for the outlier) showed homogeneous gene pools, with little or no sign of gene flow. Assuming constantly low gene exchange over generations, the observed "population age vs. genetic diversity" relationship could evolve under low extinction rates to Fig. 3 The relationship between the mean trunk perimeter and genetic parameters in Taxus baccata populations. AR, the mean allelic richness; H e , the mean expected heterozygosity; Admix, the admixture rate inferred based on STRU CTU RE (for K = 9); Δ MR , the difference between expected (under demographic stability) and observed value of the Garza-Williamson bottleneck index (MR); N e , the effective population size (the harmonic mean across 2 estimators); N e /N, the ratio of effective to census population size; DIY-ABC parameter estimates: N 0 μ, the current (mutation-scaled) effective size; N 0 /N A , the ratio of current to ancestral (before demographic event) effective size; τ, the time to demographic event backward in time (in generations × mutation rate). p values show the significance of the Spearman correlation analysis performed with the Czersk population being included and excluded in the analysis enable older populations to approach the migration-drift equilibrium. It seems that exceptional yew longevity and ability to regenerate from branches (Thomas and Polwart 2003) may facilitate slow turnover dynamics. Alternatively, relatively high genetic diversity, admixture, and effective size observed in old populations could directly result from greater species abundance in the Subboreal. In such conditions, suitable habitats could be colonized with seeds from numerous, potentially large source populations, leading to the pattern of high initial genetic diversity and weak founder effect (Raffl et al. 2008;Pluess 2011;Elleouet and Aitken 2019). Consequently, it cannot be excluded that the observed genetic diversity of old populations is mostly reminiscent of the ancestral gene pool dated back to the optimum of species abundance, and represents a quasi-stationary maximum point due to limited gene flow.
The long-lasting species decline, proceeding since the Subboreal, is generally well reflected in demographic history results. All populations showed signs of a genetic bottleneck, as revealed by a simple test with the Garza-Williamson index. The post hoc clustering of DIYABC results showed that differences in the inferred demographic histories between populations were mostly due to bottleneck time and not the ancestral population's size. However, as the time to demographic event was not related to trunk perimeter, older populations did not necessarily show a longer time to bottleneck. On the other hand, the positive association between trunk perimeter and the ratio of current to ancestral population size suggests that older populations tended to reveal lower bottleneck intensity. It should be mentioned that the DIYABC results were obtained based on a very simple demographic model (single demographic event, no gene flow), so any conclusions regarding demographic histories should be drawn with caution. Nonetheless, when coupled with the results on genetic diversity and effective population size, they support the scenario that older populations are better representatives of the ancestral gene pool than recently established ones.
Our results showed that, in the current meta-population of T. baccata, old populations seem to preserve enough genetic variation to be sources of gene diversity for newly colonized sites. This is important especially for the sites revealing strong founder effect such as Rokita and Bogdaniec. These two populations revealed not only the lowest genetic diversity and effective size but also the apparent excess in heterozygosity, which can be attributed to a small number of breeders (Balloux 2004). Interestingly, despite the low genetic variation, high census numbers in Rokita and Bogdaniec demonstrate that these populations do not show signs of population decline. It is not clear, however, how the populations will respond in the long-term. According to our earlier study, Rokita showed an excess of biparental inbreeding at the seed stage, beyond the predictions based on simple ecological determinants such as population density and trunk perimeter (Chybicki et al. 2019). Thus, without a recurrent gene flow, they are potentially at risk of inbreeding depression. However, parentage analysis at the seedling stage demonstrated that the dispersal potential of yew may limit the increase in genetic diversity, especially in the context of the species' strong spatial isolation (Chybicki and Oleksa 2018). Therefore, comparative assessment of the genetic variation of successive generations is a pending goal of our future study.

The outlier population
Czersk was characterized by high genetic variation and large effective population size despite being relatively young, which makes it an outlier population in terms of the relationship between population age and genetic diversity. According to the STRU CTU RE analysis, Czersk clustered with Wierzchlas, which is well known as one of the oldest and the largest (if trees of age > 100 years are counted) yew remnants in Europe. Such a genetic homogeneity suggests either intensive gene flow between these populations or an artificial origin of Czersk. According to predictions on pollen and seed dispersal (Chybicki and Oleksa 2018), it is very unlikely that Czersk and Wierzchlas form a panmictic unit, given that pollen needs to overcome both the distance of 29 km and the potential barrier of the forest canopy. On the other hand, the forest in Czersk is characterized by a high tree species richness, and especially the presence of (non-native) Douglas fir, suggesting that the stand was planted. Our genetic data, collected in this and the previous study (Chybicki et al. 2012), show that there is no spatial genetic structure in Czersk (Appendix Fig. 5), supporting an artificial origin. If the yew population was established by humans, the plant material was almost surely taken from Wierzchlas, which is the closest yew remnant and a protected nature reserve since the 1820s (Tobolski 2002). The high genetic variation in Czersk suggests that the plant material was collected, representatively, as if the goal was to protect gene resources of the Wierzchlas reserve. According to the reserve documentation, the yew decline in Wierzchlas has been noted for the last 100 years (from 5500 in 1910 to < 3000 living trees at present). Hence, it might be reasonable to consider there has been the need for exsitu conservation for a long time, but we cannot go beyond speculations whether such efforts were actually made.

Conclusions
Previous studies showed that yew populations characterized by the large census number unnecessarily represent the richest source of genetic variation (Chybicki et al. 2012;Litkowiec et al. 2018). Our study demonstrated that the trunk perimeter, which is a readily available tree measure, can be used as a proxy for genetic variation in yew populations. We also showed that the perimeter could be used as a simple indicator of demographic history in the study species. Revealing connection between population history and genetic parameters, the study showed that genetic structure in T. baccata is in agreement with predictions for a metapopulation with extinction-recolonization dynamics, at least in areas of historically scattered species distribution (cf. Gargiulo et al. 2019). Keeping in mind that additional studies are required to fully understand the observed pattern, we believe that our results can help optimize conservation efforts and manage the species' genetic resources. Considering the significant genetic differentiation, gene bank collections should optimally include as many populations as possible (Hoban 2019). However, because yew populations do not contribute equally to the overall genetic pool, as revealed in this study, in order to achieve the maximum genetic variation, the sampling intensity across conservation units may be proportional to the mean trunk perimeter. In this way, the ancestral gene pool of the species would likely be captured more representatively. Future studies can focus on the applicability of the observed trend to the other species subjected to historical fragmentation (e.g., Eucalyptus caesia; Bezemer et al. 2019). The spatial genetic structure within the Czersk population assessed based on two different data sets. The dataset from Chybicki et al. (2012) included 66 trees genotyped at 5 microsatellite markers. The dataset from this study included 30 trees genotyped at 18 microsatellite markers. As a measure of kinship, the Nason kinship coefficient was used (Loiselle et al., 1995). Limits of the 95% confidence interval for a random spatial genetic structure were computed based on 10,000 random permutations of tree positions using INEST 2.2 software ( Limits of 95% confidence interval of a randomized SGS