Tree Genetics & Genomes, Volume 10, Issue 6, pp 1723–1737

Bayesian approach reveals confounding effects of population size and seasonality on outcrossing rates in a fragmented subalpine conifer

Open Access
Original Paper

DOI: 10.1007/s11295-014-0792-3

Cite this article as:
Chybicki, I.J. & Dzialuk, A. Tree Genetics & Genomes (2014) 10: 1723. doi:10.1007/s11295-014-0792-3


Mating systems have long been recognized as key factors determining genetic structure within and between populations. Outcrossing promotes genetic diversity and gene flow between populations, whereas inbreeding decreases recombination rates, facilitating fixation of co-adapted genes. In small populations, selfing moderates the pollen limitation caused by low mate availability, but at the cost of increased inbreeding depression. These conflicts are of more than theoretical interest; they are critical for the management of endangered species. To help design conservation strategies for managing the gene pool of fragmented populations of Pinus cembra, a protected species in Poland, we characterized pollen flow and mating structure using nuclear microsatellite markers. We demonstrated that P. cembra in the studied stands of the Tatra Mts. is characterized by an average outcrossing rate (t) of 0.72. Using a newly developed Bayesian method, and unlike with existing approaches, we found that population size and seasonal variation had confounding effects on outcrossing rates. In concordance with predictions, large populations showed significantly higher outcrossing rates (t = 0.89) than smaller ones (t = 0.51). The temporal variation revealed in the outcrossing rate might be linked with the masting behavior of the species. On the other hand, outcrossing rates were not associated with the trunk diameter of a mother tree. Our study also demonstrated that biparental inbreeding is a significant component of the mating system. We further show that pollen dispersal follows a fat-tailed distribution (with an average dispersal distance of 1,267 m), so that at least some long-distance pollen dispersal must be occurring. Overall, we conclude that the high inbreeding (both selfing and mating between relatives) found in P. cembra buffers against pollen limitation. We argue that small, isolated stands can be at risk of gene pool erosion, despite the potential for long-distance pollen and seed dispersal.


Keywords: Population fragmentation; Mating system; Biparental inbreeding; Pollen dispersal; Bayesian method; Pinus cembra


Mating system has long been recognized as a key factor determining genetic structure within and between populations (Wright 1931; Jain and Marshall 1985; Mitton 1992). In plants, two mating systems are generally recognized: self- and cross-fertilization. Outcrossing promotes genetic diversity and gene flow between populations (Mitton 1992). Moreover, outcrossing is advantageous because it allows individuals to avoid homozygosity for deleterious mutations, which leads to inbreeding depression (Lande and Schemske 1985; Charlesworth and Charlesworth 1987). However, inbreeding depression may not be the main factor affecting mating system. The survey by Goodwillie et al. (2005) showed that many species under strong inbreeding depression maintain high selfing rates. Conversely, some species favor outcrossing even if inbreeding depression is low. Self-fertilization can also be advantageous because it results in a 50 % gain in gene transmission as compared with outcrossing. Self-fertilization increases individual homozygosity and consequently decreases recombination rates (Jain 1976; Wright et al. 2008). Hence, self-fertilization can be an efficient mechanism that enhances fixation of co-adapted gene complexes (Lande and Schemske 1985). All these forces shape the distribution of selfing rates in natural populations. Interestingly, this distribution is more or less continuous in animal-pollinated species, whereas in wind-pollinated species it is clearly bimodal, with both extremes over-represented as compared with the total (i.e., irrespective of pollination mode) distribution (Vogler and Kalisz 2001 and references therein). Thus, it seems that extreme mating systems prevail in the case of wind pollination.

Among other factors influencing mating systems, self-incompatibility and pollen availability need particular attention (Young et al. 2012). In many plants (e.g., Brassicaceae, Rosaceae, or Solanaceae), genetically determined self-incompatibility is an efficient way to recognize and reject self-pollen, preventing not only self-fertilization but also biparental inbreeding (reviewed in Takayama and Isogai 2005). The second factor is related to the hypothesis of reproductive assurance. Predominantly outcrossing species may experience lower fitness when pollen sources or pollinators are scarce. In this case, selfing can be selected as a means to assure reproduction, even at the cost of increased inbreeding depression (Lloyd 1979; Holsinger 1996; Morgan and Wilson 2005; Porcher and Lande 2005). Also, the mass action model predicts that selfing is both density- and frequency-dependent because the rate of self-fertilization increases as the rate of self-pollination (i.e., the amount of self-pollen relative to total pollen on receptive stigmas) increases (Holsinger 1991). For these reasons, many species display a mixed mating system, combining the advantages of both extreme reproductive strategies (Goodwillie et al. 2005).

The factors that influence mating system can shape variation in outcrossing rates between populations and seasons or even among individuals. Such variation can arise, for example, from differences in the amount of pollen available in different populations or seasons. It has often been demonstrated that population density can affect selfing rates (Farris and Mitton 1984; Knowles et al. 1987; El-Kassaby and Jaquish 1996; Robledo-Arnuncio et al. 2004), including through the breakdown of the self-incompatibility system (Kamm et al. 2011). Generally, variation in self-fertilization can lead to unpredictable oscillations in inbreeding rates (Coelho and Vencovsky 2003) and thus may cause stronger inbreeding effects than predicted by simple models. In small populations, variation in outcrossing rates (t) can result in fluctuation of t between generations (Jain and Marshall 1985), leading to high variation in fixation rates.

Here, we investigated the impact of the size of isolated population fragments on outcrossing rate. As a case study, we used Swiss stone pine populations located in the Tatra Mts., Poland. Swiss stone pine (Pinus cembra L.) is a European coniferous tree occupying the timberline ecotone in the Alps and the Carpathians. Unlike in the Alpine core range, where P. cembra is locally abundant, small isolates are relatively common in the Carpathians. At the end of the nineteenth century, the main priority of stone pine restoration in Poland was genetic rescue by sowing and planting stone pines of local but also Alpine or even Siberian origin (Paryski 1971). Since 1946, stone pine has been protected by Polish law, and since 1954, when the Tatra National Park was established, the whole Polish population of the species has been included within the nature reserve (Zwijacz-Kozica and Żywiec 2007). Unfortunately, stone pine is still believed to be a threatened species in the Polish Tatras (Zwijacz-Kozica and Żywiec 2007). Due to negative anthropogenic impact, most of its stands have been decimated, and as a consequence, stone pine occurs today mainly in almost inaccessible areas (Zwijacz-Kozica and Żywiec 2007), on poor, non-calcareous bedrock, mainly granite (Myczkowski and Bednarz 1974). At present, there are about 12,000 individuals of stone pine in the entire Tatra region (Jamnicky 1981). However, only 30 % of the total is located in the Polish Tatras, growing mostly as isolated groups (Chmiel 1996).

To date, the mating system of P. cembra in the Tatra Mts. has not been characterized. Results for Alpine and East Carpathian populations revealed that P. cembra has slightly lower outcrossing rates than other pines (Lewandowski and Burczyk 2000; Politov et al. 2008; Salzer and Gugerli 2012; cf. Restoux et al. 2008), with an average of about 75 %. The outcrossing rate at fertilization can be even lower than that estimated from genotyped seeds (or seedlings), because early-acting inbreeding depression can lead to abortion of selfed embryos (Goodwillie et al. 2005). In fact, no correlation was found between fitness components and selfing rates measured at the seed stage in P. cembra (Salzer and Gugerli 2012). However, the results indicated that the rate of early abortion of embryos due to inbreeding depression was higher in small peripheral populations than in more continuous ones. Interestingly, a detailed study in the Alps showed that population fragmentation significantly reduces outcrossing (Salzer and Gugerli 2012). This can occur as a result of pollen limitation. However, pollen limitation requires pollen production and/or dispersal to be significantly reduced in comparison with a large, more continuous population.

In this study, we aimed to verify the hypotheses that (a) outcrossing rates were positively associated with population size and (b) outcrossing rates were invariant across seasons. For this purpose, a new method was developed that accounts for multiple confounding factors of outcrossing rates, measured on either a continuous or a nominal scale. Furthermore, we verified whether mating between relatives contributed to the total rate of inbreeding. Finally, we assessed the pollen dispersal kernel, with special emphasis on whether pollen has the potential for long-distance dispersal.

Materials and methods

Study species

The Swiss stone pine (P. cembra) is a monoecious five-needled pine growing at the timberline in the European mountains. Gene flow can theoretically be expected to be extensive in Swiss stone pine, since the species is predominantly outcrossed, with wind-dispersed pollen, and its wingless seeds are almost exclusively dispersed by the far-ranging European nutcracker (Nucifraga caryocatactes; Ulber et al. 2004). The spatial genetic structure within populations of P. cembra can show aggregation of genetically related individuals as a result of the caching behavior of the nutcracker. Kin-structured regeneration can potentially foster mating among relatives (Salzer 2011). Moreover, flowering and seed production in Swiss stone pine occur every 2–3 years, but only a single year in 4–10 is an abundant mast year (Ulber et al. 2004). Consequently, temporal fluctuations in mating patterns are possible.

Study material

Because P. cembra in the Tatras occupies the timberline (typically at 1,300–1,700 m a.s.l.), the borders of subpopulations can be easily delineated through the system of mountain valleys. Consequently, subpopulations are represented by geographically isolated stands. We sought to collect seeds within 4 stands for 3 successive seasons (Fig. 1). Among the selected localities, Dolina Białego and Dolina Suchej Kasprowej represent relatively small populations, with about 50 and 350 adult individuals, respectively. The other two, Morskie Oko and Dolina Waksmundzka, represent relatively large populations, with about 1,500 adult individuals each (Myczkowski and Bednarz 1974). The mean geographic distance between stands was 7.3 km. Due to hazardous sampling conditions (a very steep slope or cliffs), only five mother trees were initially selected within each locality. The selected trees were kept the same over the three sampling seasons. In order to avoid additional sources of variation, we did not sample nearest neighbors or trees close to the subpopulation border. The average distance between trees within a subpopulation ranged from 28 m (Morskie Oko) to 109 m (Dolina Białego), with an average of 61 m.
Fig. 1 Location of populations of Pinus cembra in the Tatra Mts. and the species' natural range in Europe (gray-shaded areas). Distribution map courtesy of EUFORGEN

In June, four metal net bags of 40 cm × 50 cm were put on branches with young cones on each mother tree to protect seeds against harvesting by the nutcracker (N. caryocatactes L.). Two-year-old seeds were sampled in October. However, due to low seed productivity as well as incidental loss of material (destruction of bags) or tree fall (a single case), only 13 individuals (mother trees) were successfully sampled. In total, 1,929 seeds were collected (522 in 2008, 1,167 in 2009, and 240 in 2010). To keep the seed samples as balanced as possible, only a random subsample of 1,129 seeds was analyzed. The number of seeds used for genotyping ranged from 31 to 131, with an average of 87 per mother tree (Table 1). From each seed, the embryo was extracted and used for isolation of total genomic DNA (using the CTAB protocol; Doyle and Doyle 1990).
Table 1 Number of seeds analyzed in particular years and populations

Population | Outcrossed (%)
Small populations |
Dolina Białego | 15 (27.3)
Dolina Suchej Kasprowej | 17 (54.8)
Dolina Suchej Kasprowej | 57 (52.3)
Dolina Suchej Kasprowej | 43 (82.7)
Dolina Suchej Kasprowej | 23 (21.5)
Dolina Suchej Kasprowej | 92 (70.2)
Large populations |
Dolina Waksmundzka | 74 (80.4)
Dolina Waksmundzka | 125 (96.2)
Morskie Oko | 29 (82.9)
Morskie Oko | 75 (86.2)
Morskie Oko | 77 (82.8)
Morskie Oko | 105 (90.5)
Morskie Oko | 80 (87.9)
Total | 812 (71.9)

Each row corresponds to one mother tree. The values of the remaining columns (UTM coordinates (a), DBH (b), and numbers of seeds per year) are missing from this copy.

Dashes indicate years when seeds were not available. The column “Outcrossed” shows the number of seeds being a result of outcrossing (inferred based on the Bayesian paternity analysis; for details, see the “Materials and methods” section)

(a) Geographic position of a mother tree

(b) Diameter at breast height of a mother tree (in cm)

(c) Not available


Embryos extracted from seeds were genotyped at 7 nuclear microsatellite markers (Pc1b, Pc3, Pc7, Pc18, Pc22, Pc23, and Pc35) developed for P. cembra (Salzer et al. 2009) using the Multiplex Master Mix (Qiagen) according to the manufacturer’s instructions. To describe the overall variation of the genetic markers, the following parameters were computed: the number of alleles (A) and the observed (Ho) and expected (He) heterozygosity. Additionally, to describe the power of the genetic markers, we estimated the exclusion probability (Eq. 1 in Jamieson and Taylor 1997).
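As a minimal sketch of how the per-locus diversity statistics named above are commonly computed (the function names and the toy genotypes are illustrative assumptions, not the authors' code, and no small-sample correction is applied), expected heterozygosity can be taken as 1 − Σ p_k² and observed heterozygosity as the share of heterozygous individuals:

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of diploid genotypes (a, b)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def number_of_alleles(genotypes):
    """A: number of distinct alleles observed at the locus."""
    return len(allele_frequencies(genotypes))


def observed_heterozygosity(genotypes):
    """Ho: proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes):
    """He: 1 minus the sum of squared allele frequencies (uncorrected)."""
    freqs = allele_frequencies(genotypes)
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical genotypes at one microsatellite locus (allele sizes in bp).
toy_locus = [(101, 101), (101, 105), (105, 105), (101, 101)]
print(number_of_alleles(toy_locus),
      observed_heterozygosity(toy_locus),
      expected_heterozygosity(toy_locus))
```

In this toy sample Ho is lower than He, the same direction of deviation (heterozygote deficiency) that a partially selfing population is expected to show.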

Analysis of outcrossing rates

Outcrossing rates were estimated using several approaches. Owing to the sampling design, we were able to organize seeds into groups according to population size and year of seed collection. Therefore, besides the overall average, outcrossing rates were estimated separately per population size and per year. Both multilocus (tm) and single-locus (ts) estimates were obtained using the maximum likelihood approach implemented in MLTR 3.4 (Ritland 2002). The single-locus approach is useful to quantify the overall level of inbreeding (i.e., due to both selfing and mating between relatives), while tm is strictly related to selfing. Therefore, the difference tm − ts can be used to measure the level of biparental inbreeding (Ritland 2002). The correlation of selfing rate among loci, rs, provides additional information about biparental inbreeding. If there is no biparental inbreeding, then rs = 1. Additionally, (1 − rs) is approximately the fraction of inbreeding due to biparental inbreeding. Standard errors of the estimates were computed using the bootstrap procedure (with 1,000 bootstrap samples).
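The two derived quantities and the bootstrap standard error can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the resampling unit (here, the elements of a generic sample) may differ from what MLTR actually resamples, and the numeric inputs are the overall estimates reported in the Results of this paper.

```python
import random


def biparental_inbreeding(tm, ts):
    """Difference between multilocus and single-locus outcrossing rates."""
    return tm - ts


def biparental_share(rs):
    """Approximate fraction of inbreeding due to mating between relatives."""
    return 1.0 - rs


def bootstrap_se(sample, estimator, n_boot=1000, seed=0):
    """Standard deviation of an estimator over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    mean = sum(estimates) / n_boot
    variance = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return variance ** 0.5


# Overall estimates reported in this paper: tm = 0.725, ts = 0.593, rs = 0.673.
print(round(biparental_inbreeding(0.725, 0.593), 3))
print(round(biparental_share(0.673), 3))
```

With the reported overall estimates, tm − ts is about 0.13 and 1 − rs is about 0.33.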

Effects of population size and seasonal variation and individual outcrossing probabilities

Simple statistical comparisons of mean outcrossing rates between seasons and between large and small populations account for a single factor at a time. However, when the two factors have confounding effects, such comparisons can be misleading. If it were known exactly whether a given seed resulted from outcrossing or from self-fertilization, one could use multiple logistic regression with nominal explanatory variables (i.e., season and population size; Table 1). In our case, however, such information was not available a priori. To cope with this, we developed a regression analysis based on the mixed mating model, in which unknown outcrossing rates were expressed as a function of the explanatory variables.

Generally, in order to perform the regression analysis, K − 1 non-redundant code (or design) variables need to be created for each nominal variable, where K is the number of levels observed for that variable (Hosmer and Lemeshow 2000). To create the codes, the method of “deviations from means” (also known as “effects coding”) was chosen (Hosmer and Lemeshow 2000) because in this case regression slopes are informative about the direction and strength of the effects of individual levels relative to the mean effect of a nominal variable. It should be stressed, however, that the overall statistical behavior (e.g., the overall fit) of the model is invariant with respect to the choice of coding system. In our case, the variable season had three levels (2008, 2009, 2010), while the variable population size had two levels (small, large; Table 1). Therefore, in total, four code variables were created (Table 2).
Table 2 Specification of the code (design) variables for the nominal predictors (season and population size) and the corresponding regression function terms

Variable | Level | Code (design) variables | Regression function term
Season | 1 (2008) | xs1 = 1, xs2 = 0 | βs1xs1 + βs2xs2 = βs1
Season | 2 (2009) | xs1 = 0, xs2 = 1 | βs1xs1 + βs2xs2 = βs2
Season | 3 (2010) | xs1 = −1, xs2 = −1 | βs1xs1 + βs2xs2 = − (βs1 + βs2)
Population size | 1 (small) | xp1 = 1 | βp1xp1 = βp1
Population size | 2 (large) | xp1 = −1 | βp1xp1 = − βp1
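The coding scheme of Table 2 can be sketched in a few lines. The helper names below are hypothetical and the slope values in the usage line are made up for illustration; the sketch only shows that, under effects coding, the last level of a variable receives the negative sum of the other slopes.

```python
def effects_code(level, n_levels):
    """Deviation-from-means code for a 1-based level of a nominal variable.

    Returns n_levels - 1 values; the last level is coded as all -1.
    """
    if not 1 <= level <= n_levels:
        raise ValueError("level out of range")
    if level == n_levels:
        return [-1] * (n_levels - 1)
    return [1 if k == level else 0 for k in range(1, n_levels)]


def linear_term(level, slopes):
    """Sum of slope * code over the design variables of one nominal predictor."""
    code = effects_code(level, len(slopes) + 1)
    return sum(b * x for b, x in zip(slopes, code))


# Season has three levels, so two design variables; hypothetical slopes.
season_slopes = [0.2, 0.5]
for level in (1, 2, 3):
    print(level, effects_code(level, 3), linear_term(level, season_slopes))
```

For level 3 (2010) the term equals −(βs1 + βs2), matching the last season row of Table 2.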

In addition, we included diameter at breast height (DBH) as a potential factor affecting outcrossing rate. Here, DBH was used as a proxy for male fecundity (the amount of pollen produced by a tree). According to the mass action model (Holsinger 1991), in which the self-fertilization rate is proportional to the amount of self-pollen relative to total pollen, male fecundity may have a confounding, and presumably negative, effect (see, e.g., Shea 1987 for empirical results) on the expected outcrossing proportions, potentially affecting the results for small and large populations.

Finally, using the logistic regression model, the outcrossing rate (a response variable) for the m − th (m = 1, 2, 3) season, the n − th (n = 1, 2) population size, and the maternal DBH equal to d was formulated as follows:
$$ t_{mnd}=\frac{\exp\left(\alpha+\beta_{sm}+\beta_{pn}+\gamma d\right)}{1+\exp\left(\alpha+\beta_{sm}+\beta_{pn}+\gamma d\right)} \tag{1} $$

where α is the intercept, βsm is the regression slope coefficient for the effect of the m − th level of the season variable, βpn is the regression slope coefficient for the effect of the n − th level of the population size variable, γ is the regression slope coefficient for the effect of DBH, and d is the DBH of a mother tree. Note that in Eq. (1), βs3 = − (βs1 + βs2) and βp2 = − βp1 (see Table 2), so that the number of unconstrained (estimable) β coefficients remains equal to 4 (the same as the number of design variables).
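A minimal sketch of Eq. (1), assuming the effects-coding constraints stated above, is given below. The function name, argument layout, and all parameter values in the usage lines are illustrative assumptions rather than estimates from this study.

```python
import math


def outcrossing_rate(alpha, beta_season, beta_popsize, gamma, season, popsize, dbh):
    """Logistic outcrossing rate for a 1-based season and population-size level.

    beta_season holds (beta_s1, beta_s2) and beta_popsize holds (beta_p1,);
    the slope of the last level is the negative sum of the others.
    """
    b_s = list(beta_season) + [-sum(beta_season)]
    b_p = list(beta_popsize) + [-sum(beta_popsize)]
    eta = alpha + b_s[season - 1] + b_p[popsize - 1] + gamma * dbh
    # exp(eta) / (1 + exp(eta)) written in the equivalent form 1 / (1 + exp(-eta))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values: a negative beta_p1 lowers t in small populations (level 1)
# and raises it by the same amount on the logit scale in large ones (level 2).
t_small = outcrossing_rate(1.0, (0.1, -0.3), (-0.9,), 0.0, season=1, popsize=1, dbh=49.0)
t_large = outcrossing_rate(1.0, (0.1, -0.3), (-0.9,), 0.0, season=1, popsize=2, dbh=49.0)
print(round(t_small, 3), round(t_large, 3))
```

Because the slopes act on the logit scale, equal and opposite coefficients for the two population-size levels do not translate into equal and opposite changes in t unless the remaining terms sum to zero.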

To estimate the regression parameters, we adopted the Bayesian approach developed recently as a framework to model plant mating systems (Chybicki 2013; Chybicki and Burczyk 2013). The approach is generally based on the (mixed mating) probability model, similar to that used in MLTR, in which the probability of the multilocus genotype Oij of the j − th offspring in the i − th maternal family is as follows:
$$ \Pr\left(O_{ij}\mid m=\mathrm{Season}_{ij},\, n=\mathrm{Popsize}_{i},\, d=\mathrm{DBH}_{i}\right)=\left(1-t_{mnd}\right)\Pr\left(O_{ij}\mid M_i\right)+t_{mnd}\Pr\left(O_{ij}\mid M_i,\mathbf{P}\right) \tag{2} $$
where tmnd is the probability that a seed collected in the m − th year and the n − th population type, taken from a mother tree of DBH equal to d, was produced through outcrossing; Pr(Oij|Mi) is the Mendelian probability of the offspring genotype after self-fertilization given the maternal genotype Mi; Pr(Oij|Mi, P) is the Mendelian probability of the offspring genotype after outcrossing given the maternal genotype Mi and the background pollen pool P, represented by the matrix of allele frequencies; Seasonij is the collection year of the j − th seed in the i − th family; Popsizei is the level of the population size variable into which the i − th mother tree is classified (see Table 2); and DBHi is the diameter at breast height of the i − th mother tree. For tree 11, the missing DBH value was replaced with the arithmetic mean (49 cm). In Eq. (2), tmnd was substituted with the right-hand side of Eq. (1), so that the likelihood of the model
$$ L\left(\mathbf{O};\alpha,\beta,\mathbf{P}\right)=\prod_i\prod_j \Pr\left(O_{ij}\mid m=\mathrm{Season}_{ij},\, n=\mathrm{Popsize}_{i},\, d=\mathrm{DBH}_{i}\right) \tag{3} $$

was actually a function of regression coefficients and not outcrossing rates (as in MLTR).
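The mixed mating likelihood above can be illustrated with a small sketch for codominant markers. It assumes unlinked loci, no genotyping errors or null alleles, and an outcross pollen pool described only by allele frequencies; all function names and the toy data are hypothetical and are not taken from the CEMBRA program.

```python
import math


def gamete_probs(genotype):
    """Mendelian gamete distribution {allele: probability} of a diploid parent."""
    a, b = genotype
    return {a: 1.0} if a == b else {a: 0.5, b: 0.5}


def offspring_prob(offspring, maternal_gametes, paternal_gametes):
    """Probability of an unordered offspring genotype at one locus."""
    target = sorted(offspring)
    return sum(pm * pp
               for m, pm in maternal_gametes.items()
               for p, pp in paternal_gametes.items()
               if sorted((m, p)) == target)


def seed_likelihood(t, offspring, mother, pollen_pool):
    """(1 - t) Pr(O | M) + t Pr(O | M, P), multiplied over loci within each term.

    offspring and mother are lists of per-locus genotypes; pollen_pool is a
    list of {allele: frequency} dictionaries, one per locus.
    """
    p_self = 1.0
    p_out = 1.0
    for o, m, freqs in zip(offspring, mother, pollen_pool):
        egg = gamete_probs(m)
        p_self *= offspring_prob(o, egg, egg)    # pollen comes from the mother
        p_out *= offspring_prob(o, egg, freqs)   # pollen drawn from the pool
    return (1.0 - t) * p_self + t * p_out


def log_likelihood(seeds, pollen_pool):
    """Sum of log-likelihoods over (t, offspring, mother) triples."""
    return sum(math.log(seed_likelihood(t, o, m, pollen_pool)) for t, o, m in seeds)


# Toy single-locus example with hypothetical allele frequencies.
pool = [{1: 0.3, 2: 0.5, 3: 0.2}]
print(seed_likelihood(0.8, [(1, 1)], [(1, 2)], pool))
```

A seed carrying an allele absent from its mother has zero probability under selfing, so its likelihood is driven entirely by the outcrossing term; this is what makes multilocus genotypes informative about t.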

The parameters were estimated using a Gibbs sampler (a class of Markov chain Monte Carlo, or MCMC, algorithms). In this approach, the posterior distribution is approximated by a sequence of parameter values cyclically drawn using the specified sampling scheme. A uniform Dirichlet distribution was taken as the prior for Pl (i.e., the vector of background allele frequencies at the l − th locus), while for the regression parameters α, β, and γ, a zero-centered normal distribution with a variance of 10 was taken. For technical reasons (cf. Chybicki 2013; Chybicki and Burczyk 2013), the auxiliary variables A and X were introduced. Each element Alk of the matrix A stored the number of copies of the k − th allele at the l − th locus in the background pollen pool. Note that A stores counts of alleles present in outcross pollen gametes only (Chybicki 2013). The elements Xij of X equaled 1 or 0 if the j − th seed in the i − th family was produced through outcrossing or selfing, respectively. The auxiliary variables are normally unknown, but they can be inferred from the data given t and P, using a Bernoulli scheme described in detail in Chybicki (2013) and Chybicki and Burczyk (2013). Here, we only stress that, as a consequence of the algorithm, P was estimated simultaneously with the remaining parameters (as in the MLTR software). The vector {Alk} has a multinomial distribution with the parameter Pl. Thus, because the Dirichlet distribution is a conjugate prior for the multinomial distribution, Pl could be sampled directly from the posterior multinomial-Dirichlet distribution, leading to an efficient Gibbs sampler for allele frequencies. Summing up, a single MCMC cycle started with inferring A and X. Then, for every l − th locus, Pl was drawn using the Gibbs sampler. Finally, the parameters α, β, and γ were updated using the Metropolis–Hastings algorithm, in which new values were proposed using the reflective random walk (Hoff 2009).
The final estimates (posterior means and quantiles) were obtained after 1,000,000 cycles, keeping every 250th update and disregarding the first 100,000 updates as burn-in. A given nominal variable was considered statistically significant if any of the slope coefficients estimated for this variable differed significantly from zero (based on the credible interval). As a cross-validation of this procedure, we performed a Bayesian model comparison based on the DIC (Deviance Information Criterion; Spiegelhalter et al. 2002). For this purpose, additional regression analyses were performed based on simplified (nested) models, obtained by removing one variable (season or population size) or both variables (the null model).
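Two of the sampling steps described above can be sketched with the standard library alone. This is an illustration under assumptions, not the CEMBRA implementation: the function names are hypothetical, the Metropolis–Hastings update with the reflective random walk is not reproduced, and the count of retained draws assumes that thinning is applied after the burn-in is discarded.

```python
import random


def outcross_posterior(t, p_self, p_out):
    """Probability that a seed is outcrossed given t and both genotype probabilities."""
    numerator = t * p_out
    return numerator / (numerator + (1.0 - t) * p_self)


def draw_indicator(t, p_self, p_out, rng):
    """Bernoulli draw of X_ij (1 = outcrossed, 0 = selfed)."""
    return 1 if rng.random() < outcross_posterior(t, p_self, p_out) else 0


def draw_allele_frequencies(allele_counts, rng, prior=1.0):
    """Draw P_l from Dirichlet(counts + prior) via normalized gamma variates.

    prior = 1.0 corresponds to the uniform Dirichlet prior.
    """
    draws = [rng.gammavariate(count + prior, 1.0) for count in allele_counts]
    total = sum(draws)
    return [g / total for g in draws]


def retained_samples(n_cycles, burn_in, thin):
    """Number of draws kept when thinning follows the burn-in."""
    return (n_cycles - burn_in) // thin


rng = random.Random(42)
print(outcross_posterior(0.8, 0.25, 0.15))
print(draw_allele_frequencies([12, 3, 0], rng))   # hypothetical allele counts A_lk
print(retained_samples(1_000_000, 100_000, 250))
```

Under the stated assumption about thinning, the settings reported above leave 3,600 draws for computing posterior means and quantiles.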

One advantage of the algorithm described above was the possibility to estimate the posterior probability that the j − th individual in the i − th family was produced through outcrossing (Pout). Pout was estimated as the mean Xij over MCMC cycles, disregarding the first 100,000 values for burn-in. Finally, every progeny individual with Pout > 0.95 was classified as being outcrossed. The Bayesian approach was implemented in the computer program written in Delphi/Pascal (CEMBRA program is available at
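The classification rule just described amounts to averaging the sampled indicators and applying a threshold. A minimal sketch, with hypothetical function names and a made-up indicator chain, could look like this:

```python
def outcrossing_posterior_probability(indicator_chain, burn_in=0):
    """Pout: mean of the sampled X_ij values after discarding the burn-in."""
    kept = indicator_chain[burn_in:]
    return sum(kept) / len(kept)


def classify_outcrossed(pout, threshold=0.95):
    """A seed counts as outcrossed only if Pout strictly exceeds the threshold."""
    return pout > threshold


# Hypothetical chain: 5 burn-in draws followed by 20 retained draws.
chain = [0, 1, 0, 0, 1] + [1] * 19 + [0]
pout = outcrossing_posterior_probability(chain, burn_in=5)
print(pout, classify_outcrossed(pout))
```

Because the rule is a strict inequality, a seed with Pout exactly equal to 0.95 is not classified as outcrossed in this sketch.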

Analysis of pollen dispersal

Because only maternal families were available, we used the indirect approach called KinDist (Robledo-Arnuncio et al. 2006) to make inferences about the pollen dispersal kernel. KinDist uses the theoretical relationship between the pollen dispersal kernel and the standardized kinship between pollen gametes, inferred from an offspring’s genotype after subtracting the maternal input. This approach was chosen because, unlike TwoGener (Smouse et al. 2001), it does not require the effective density of pollen parents (De) to be known. We stress that in our case, due to the problematic delimitation of a mating population (see Dzialuk et al. 2014), De cannot be practically assessed. During the analysis, we focused particularly on assessing whether the dispersal kernel is thin- or fat-tailed, that is, whether long-distance pollen dispersal (LDD) is a likely phenomenon in the study system. Therefore, we assumed that pollen dispersal follows an exponential family kernel (the exponential-power kernel; Austerlitz et al. 2004), defined in terms of two parameters: the shape (b) and the scale (a). The exponential-power kernel is known for its flexibility in terms of kurtosis (Austerlitz et al. 2004). Generally, if b > 1, the kernel is considered thin-tailed (b = 2 leads to the normal distribution). Kernels of this kind are characterized by a very low probability of LDD. On the other hand, if b < 1, the kernel is considered fat-tailed. In this case, the probability of LDD is relatively high. Therefore, in order to infer the potential for LDD, we compared three models: normal, exponential, and exponential-power (fixing b at 2, fixing b at 1, and treating b as an estimable parameter, respectively). The analysis was performed using the POLDISP package (Robledo-Arnuncio et al. 2007). Following the recommendation of the authors (Robledo-Arnuncio et al. 2007), at the initial step we checked whether pair-wise paternity correlation coefficients decreased significantly with distance. For this purpose, the Spearman rank correlation coefficient was used, with p values estimated using a permutation procedure. Also, the threshold distance between mother trees, beyond which there is likely no correlation in paternity between sibships (Robledo-Arnuncio et al. 2006), was set to 6 km. This value was equal to the distance at which the linear regression function of the observed correlation of paternity started to be negative (Fig. 2b). The distance of 6 km also corresponds roughly to the maximum distance between nearest populations in this study. Nonetheless, setting the threshold distance to 1 km, i.e., treating between-population pollen pools as uncorrelated a priori, did not change the results (not shown).
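For orientation, the exponential-power kernel can be written down explicitly. The sketch below assumes the two-dimensional parameterization commonly attributed to Austerlitz et al. (2004), in which the density per unit area at distance r is b / (2π a² Γ(2/b)) · exp(−(r/a)^b) and the mean dispersal distance is a Γ(3/b) / Γ(2/b); the function names are hypothetical and the parameter values are not estimates from this study.

```python
import math


def exp_power_density(r, a, b):
    """Two-dimensional exponential-power kernel: density per unit area at distance r."""
    norm = b / (2.0 * math.pi * a * a * math.gamma(2.0 / b))
    return norm * math.exp(-((r / a) ** b))


def mean_dispersal_distance(a, b):
    """Mean distance travelled under the kernel with scale a and shape b."""
    return a * math.gamma(3.0 / b) / math.gamma(2.0 / b)


def tail_type(b):
    """Classify the kernel tail from the shape parameter."""
    if b < 1.0:
        return "fat-tailed"
    if b > 1.0:
        return "thin-tailed"
    return "exponential"


# Same scale, three shapes: normal (b = 2), exponential (b = 1), fat-tailed (b = 0.5).
for shape in (2.0, 1.0, 0.5):
    print(shape, tail_type(shape), mean_dispersal_distance(100.0, shape))
```

Holding the scale fixed, lowering b increases the mean distance sharply, which is why the estimated shape parameter carries most of the information about the potential for LDD.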
Fig. 2 Genetic structure of pollen gametes (pooled data) and the inferred pollen dispersal based on the KinDist approach. The upper panel shows the correlation of paternity as a function of distance between mother trees for the entire sample of progeny (a) and the subsample of outcrossed progeny (b). The best-fitting regression lines together with the Spearman rank correlation coefficients are also shown. The lower panel shows the pollen dispersal kernel (probability density; given on a log scale; c) and the cumulative probability distribution function (CDF; d) of dispersal distance (given on a log scale) estimated assuming the three models of dispersal. Dashed lines show the results based on the entire sample, while solid lines show the results based on the outcrossed progeny. Note that according to the model comparison procedure, the exponential-power kernel showed the best fit (see the text for details)
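The rank-correlation check behind the coefficients shown in Fig. 2 can be sketched as follows. This is a simplified illustration: the function names are hypothetical, the data in the usage lines are made up, and the plain shuffling of one variable is an assumption that ignores the non-independence of pair-wise values sharing a mother tree, which the permutation scheme used in the study may have handled differently.

```python
import random


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p_value(x, y, n_perm=999, seed=1):
    """One-sided p value for a decrease of y with x, by shuffling y."""
    rng = random.Random(seed)
    observed = spearman(x, y)
    shuffled = list(y)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if spearman(x, shuffled) <= observed:
            hits += 1
    return hits / (n_perm + 1)


# Hypothetical distances (m) and pair-wise paternity correlations.
distance = [30, 60, 110, 900, 4000, 7000]
paternity = [0.20, 0.15, 0.12, 0.05, 0.01, 0.00]
print(spearman(distance, paternity), permutation_p_value(distance, paternity))
```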

Because POLDISP offers no explicit tool for model comparison, we used the general theory developed for a non-linear regression (Burnham and Anderson 2002). First, for the ith model we estimated the log-likelihood, as \( {\ell}_i\left(\widehat{\theta}\right)=-\frac{1}{2}n\left( \log \left({\widehat{\sigma}}_i^2\right)+ \log \left(2\pi \right)+1\right) \), where \( {\widehat{\sigma}}^2 \) was estimated as the sum of squares of residuals (SSR) divided by the number of data points for the ith dispersal model (SSR obtained as the standard output of POLDISP). Subsequently, we estimated Akaike Information Criterion values as \( AI{C}_i=-2{\ell}_i\left(\widehat{\theta}\right)+2K \), where K equalled the number of estimated parameters for the ith model (including \( {\widehat{\sigma_i}}^2 \)). The model with the smallest AICi was considered the best one.
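The model comparison described above reduces to a few lines of arithmetic. In the sketch below, the function names are hypothetical and the SSR values and the number of data points are made up for illustration; K counts the kernel parameters plus the residual variance, so it is 2 for the normal and exponential models and 3 for the exponential-power model.

```python
import math


def gaussian_log_likelihood(ssr, n):
    """Maximized Gaussian log-likelihood with residual variance SSR / n."""
    sigma2 = ssr / n
    return -0.5 * n * (math.log(sigma2) + math.log(2.0 * math.pi) + 1.0)


def aic(ssr, n, k):
    """Akaike Information Criterion: -2 log-likelihood + 2 K."""
    return -2.0 * gaussian_log_likelihood(ssr, n) + 2.0 * k


def best_model(fits, n):
    """Return the name of the model with the smallest AIC.

    fits maps a model name to a (ssr, k) tuple.
    """
    return min(fits, key=lambda name: aic(fits[name][0], n, fits[name][1]))


# Hypothetical SSR values for three dispersal models fitted to 78 data points.
fits = {
    "normal": (0.50, 2),
    "exponential": (0.40, 2),
    "exponential-power": (0.20, 3),
}
for name, (ssr, k) in fits.items():
    print(name, round(aic(ssr, 78, k), 2))
print(best_model(fits, 78))
```

The extra parameter of the exponential-power model costs two AIC units, so it is preferred only when it reduces the SSR enough to offset that penalty.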

Both TwoGener and KinDist work under the assumption of random (or null) selfing (Austerlitz et al. 2004; Robledo-Arnuncio et al. 2006; Carpentier 2011). Because in our case this assumption was generally violated (see the “Results” section), the above estimation procedures were applied to two data sets. First, all progeny individuals were used as data (the untrimmed sample). In the second case, only individuals produced through apparent outcrossing were used. They were identified using the Bayesian approach described above.


Results

Genetic polymorphism

In total, 97 alleles were detected in progeny genotypes, ranging from 4 to 23 per locus (data not shown). Overall expected heterozygosities varied from 0.164 to 0.849, with the average 0.562. Observed heterozygosities spanned between 0.144 and 0.705, with the average 0.394. We observed significant deficiency of heterozygotes for all markers. The multilocus exclusion probability equaled 0.98499.

Average outcrossing rates

The overall multilocus outcrossing rate tm estimated jointly for all mother trees and seasons equaled 0.725 and was significantly different from both extremes (0 and 1; Table 3). When estimated separately for small and large populations, i.e., ignoring the seasonal effect, the average tm equaled 0.513 and 0.886, respectively. For comparison, tm values averaged per sampling year showed little variation, ranging from 0.661 to 0.773. However, the season-specific outcrossing rates did not account for the effect of population size. The overall single-locus outcrossing rate equaled 0.593 and was significantly different from both 0 and 1. For small and large populations, the average ts equaled 0.404 and 0.769, respectively, while it ranged between 0.519 and 0.679 among sampling years. All single-locus outcrossing rates were significantly lower than the corresponding multilocus estimates (Table 3). The overall correlation of selfing among loci equaled 0.673 (±0.056). Large subpopulations had significantly higher rs than small subpopulations. At the seasonal level, rs ranged between 0.644 (2010) and 0.691 (2009). All rs estimates deviated significantly from both 0 and 1.
Table 3 Estimates of outcrossing rate together with standard errors (SE)

Columns: Estimation level; tm; ts; tm − ts; rs (cell values are missing from this copy)

Asterisks denote significant difference (α = 0.05; after the Šidák correction for multiple tests) between per-size or per-year averages and the overall average

tm the multi-locus outcrossing rate, ts the single-locus outcrossing rate, rs the correlation of selfing among loci

(a) Grouping by population size disregarded seasonal variation of outcrossing rates (Small populations: Dolina Białego, Dolina Suchej Kasprowej; Large populations: Dolina Waksmundzka, Morskie Oko)

(b) Grouping by collection year disregarded variation in outcrossing rates due to population size

Table 4

Effects of population size and seasonal variation on outcrossing rates



(Table body not preserved in the source. Surviving column headings: Regression parameters; \( \overline{D} \). Surviving model rows: Population size + season + DBH (the full model); Population size + season; Population size.)

Model, the regression model for outcrossing rates; Estimate, the Bayesian estimate (mean, the posterior mean; CI2.5% and CI97.5%, bounds of the 95 % credible interval); Regression parameters, the intercept (α) and the slopes of the logistic regression function (βsm, the regression slope coefficient for the effect of the m-th level in the season variable; βpn, the regression slope coefficient for the effect of the n-th level in the population size variable; γ, the regression slope coefficient for the effect of DBH); \( \overline{D} \), the measure of the model fit; pD, the effective number of parameters in the model; DIC, the Deviance Information Criterion measuring the model quality

In addition, we performed comparisons between population size-specific or year-specific estimates and the overall average (obtained for the entire data set) using the Z test (and bootstrap standard errors). The procedure revealed that tm and ts for large and small populations were significantly higher and lower, respectively, as compared with the overall averages. Also, rs for large subpopulations was significantly lower than the overall average.
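The comparison described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: it assumes that the Z statistic is the difference between two estimates divided by the square root of the sum of their squared bootstrap standard errors, and that the Šidák-adjusted per-test level is 1 − (1 − α)^(1/m). The standard errors and the number of tests m are hypothetical placeholders; only the point estimates 0.886 and 0.725 are taken from the text.

```python
import math

def sidak_alpha(alpha, m):
    """Per-test significance level after the Sidak correction for m tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

def z_test(est_a, se_a, est_b, se_b):
    """Two-sided Z test for the difference between two estimates.

    Assumes the two estimates are independent and approximately normal.
    """
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p value
    return z, p

# Point estimates from the text; standard errors and m are HYPOTHETICAL.
tm_large, se_large = 0.886, 0.030
tm_overall, se_overall = 0.725, 0.025
m_tests = 5

z, p = z_test(tm_large, se_large, tm_overall, se_overall)
alpha_adj = sidak_alpha(0.05, m_tests)
print(f"Z = {z:.2f}, p = {p:.2g}, adjusted alpha = {alpha_adj:.4f}")
```

With real bootstrap standard errors in place of the placeholders, a difference would be flagged as significant when p falls below the adjusted level.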

Mating system parameters were also estimated by grouping data simultaneously per year and population size. These estimates revealed that seeds sampled in 2009 and 2010 consistently showed the lowest and the highest outcrossing rates, respectively, regardless of population size. This suggests that mating system parameters might vary seasonally. Nonetheless, based on the simple statistical comparisons, we were unable to confirm the significance of this pattern (Table 3).

Effects of population size and seasonal variation and individual outcrossing probabilities

Six candidate regression models were used to explain the observed variation in outcrossing rates (Table 4). The DBH model (DIC = 20644) showed a quality comparable to that of the null model (DIC = 20646). Although the Season model (DIC = 20640) showed a significant improvement of fit compared with the null model, it was clearly worse than the Population size model (DIC = 20458). This result was in agreement with the result of simple comparisons of mean outcrossing rates, described above. The model accounting for both seasonality and the effect of population size revealed the best quality (DIC = 20439), indicating that both factors had confounding effects on the observed outcrossing rates. The addition of DBH did not improve fit substantially (\( \overline{D} \)), while at the same time decreasing the quality (DIC) of the model. Moreover, the credible interval for the slope γ included 0, indicating that outcrossing rates were not associated with DBH of mother trees.
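The ranking of the candidate models can be reproduced from the DIC values quoted above. The short sketch below simply orders the models and reports each one's distance from the best (lowest) DIC. The full model is omitted because its DIC is not quoted in the text; the dictionary keys are labels chosen here for illustration.

```python
# DIC values as quoted in the text (the full model's DIC is not quoted).
dic = {
    "null": 20646,
    "DBH": 20644,
    "Season": 20640,
    "Population size": 20458,
    "Population size + season": 20439,
}

def delta_dic(dic_by_model):
    """Return models ordered from best to worst with DIC differences to the best."""
    best = min(dic_by_model.values())
    ordered = sorted(dic_by_model.items(), key=lambda kv: kv[1])
    return {model: value - best for model, value in ordered}

ranked = delta_dic(dic)
for model, delta in ranked.items():
    print(f"{model:26s} DIC = {dic[model]}  delta = {delta}")
```

The gap between the Population size model and the Season model is far larger than the gap between the Season model and the null model, which matches the verbal comparison in the paragraph.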

Based on the best model (Population size + season), in the case of the Season variable, the slope coefficient βs2 for the season 2009 was significantly lower than 0. It thus indicated that outcrossing rate, or more precisely: outcrossing/selfing odds, was significantly lower in the subsample of seeds collected in that season compared to the average. Taking exp(βs2), one may predict that the outcrossing/selfing odds relative to the geometric mean odds was about 0.56 in 2009. Similarly, βp1 for the small population size was significantly negative, indicating that outcrossing rates were significantly lower in seeds collected in small populations than in seeds taken from large populations. In this case, the outcrossing/selfing odds ratio for small populations equalled 0.34.
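To make the odds scale more tangible, the sketch below converts between outcrossing rates and outcrossing/selfing odds and applies the two odds factors quoted above (0.56 and 0.34). The baseline rate of 0.725 is the overall average outcrossing rate and is used here only as a convenient reference; the model's own reference is the geometric mean odds, which the text does not quote, so the resulting rates are illustrative rather than model predictions.

```python
import math

def to_odds(t):
    """Outcrossing/selfing odds for an outcrossing rate t (0 < t < 1)."""
    return t / (1.0 - t)

def from_odds(odds):
    """Outcrossing rate corresponding to given outcrossing/selfing odds."""
    return odds / (1.0 + odds)

def shift_rate(baseline_t, odds_factor):
    """Rate obtained after multiplying the baseline odds by exp(beta)."""
    return from_odds(to_odds(baseline_t) * odds_factor)

baseline = 0.725          # ASSUMED reference: the overall average tm
factor_2009 = 0.56        # exp(beta_s2), from the text
factor_small = 0.34       # odds factor for small populations, from the text

print(f"slope for 2009:  {math.log(factor_2009):.3f}")
print(f"slope for small: {math.log(factor_small):.3f}")
print(f"illustrative rate in 2009:  {shift_rate(baseline, factor_2009):.3f}")
print(f"illustrative rate, small:   {shift_rate(baseline, factor_small):.3f}")
```

Because the factors act on odds rather than on rates, the same factor produces a smaller absolute change in the rate when the baseline is close to 0 or 1.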

As a by-product of the Bayesian estimation, a series of posterior probabilities of each progeny having been produced through outcrossing (Pout) was obtained. Using a criterion of Pout > 0.95, we separated 812 (out of 1129) individuals that were produced through apparent outcrossing (Table 1). Among the 317 individuals classified as selfed, the posterior probability of being produced through outcrossing ranged from 0 to 0.794, with an average of 0.029. In other words, the Bayesian procedure only exceptionally resulted in ambiguous paternities, and the vast majority of individuals were classified as being selfed or outcrossed with nearly 100 % certainty. It is worth mentioning that the models with and without over-dispersion in family outcrossing rates gave exactly the same number of selfed vs. outcrossed individuals.

Pollen dispersal kernel

Generally, we found a significant decrease in the correlation of paternity between sibships with distance (Fig. 2). However, when only outcrossed seeds were retained, the association between the correlation of paternity and the distance appeared to be stronger (the Spearman rank correlation rs = −0.353; p value = 0.002) than for the total sample of progeny (rs = −0.263; p value = 0.020).

When all pollen gametes (i.e., all progeny) were used in the analysis, the estimated average pollen dispersal distance varied from 10.1 to 69.1 m, depending on the shape of the exponential-power function (Table 5). According to the model comparison procedure, the full exponential-power kernel (i.e., the one for which both scale and shape parameters were estimated) appeared to be the optimal one, with an Akaike weight of 59.3 %. For comparison, the normal kernel had the lowest Akaike weight of 12.8 %. The median dispersal distance reached up to 32 m (Fig. 2), suggesting that pollen dispersal is restricted mostly to the local neighborhood of a mother tree.
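Akaike weights such as those quoted above follow from the differences in AIC between candidate kernels. The sketch below shows the standard calculation and its inverse, which recovers the AIC difference implied by two weights. The AIC values passed to `akaike_weights` are hypothetical, since the tabulated values are not available in the text; only the weights 59.3 % and 12.8 % are taken from the paragraph.

```python
import math

def akaike_weights(aic_values):
    """Akaike weights w_i = exp(-d_i / 2) / sum_j exp(-d_j / 2), d_i = AIC_i - min AIC."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

def delta_aic_from_weights(w_model, w_best):
    """AIC difference between a model and the best model implied by their weights."""
    return -2.0 * math.log(w_model / w_best)

# HYPOTHETICAL AIC values, for demonstration only.
weights = akaike_weights([10.0, 13.0, 11.5])
print([round(w, 3) for w in weights])

# Weights quoted in the text: full exponential-power (0.593) vs. normal kernel (0.128).
print(f"implied AIC difference: {delta_aic_from_weights(0.128, 0.593):.2f}")
```

An implied difference of roughly three AIC units indicates that the normal kernel is less supported than the full exponential-power kernel, but not overwhelmingly so, in the analysis of the total progeny.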
Table 5

Estimates of the pollen dispersal kernel

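(Table body not preserved in the source. Surviving column headings: Dispersal kernel; \( {\ell}_i\left(\widehat{\theta}\right) \). Surviving row groups: Total progeny; Outcrossed progeny only, each including an Exponential power row.)
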
a, the scale parameter of the dispersal kernel; b, the shape parameter of the dispersal kernel; δ, the mean distance of pollen dispersal (in meters); Q50%, the median distance of pollen dispersal (in meters); RSS, the sum of squares of residuals; \( {\ell}_i\left(\widehat{\theta}\right) \), the likelihood of the parameter estimates; AICi, Akaike Information Criterion; wi, the Akaike weight for the i-th model

a Italicized values denote that these parameters were treated as constants during estimation

Generally, the analysis based on the outcross pollen gametes gave a scale of pollen dispersal one order of magnitude larger than the untrimmed procedure. The mean and median dispersal distances ranged between 66.1 and 1,266.5 m and between 62.2 and 458.8 m, respectively (Table 5). However, also in this case, the full exponential-power kernel appeared to be the optimal one, with an Akaike weight of 99.4 %. Interestingly, the shape parameter b = 0.206 was similar to that obtained based on the untrimmed procedure.


We showed that P. cembra in the Tatra Mts. is characterized by a mixed mating system. The average outcrossing rate of 72 % is comparable to results for populations from the Eastern Carpathians (Politov and Krutovskii 1994; Politov et al. 2008), while slightly lower than outcrossing rates found for populations in the Alps (81 %; Lewandowski and Burczyk 2000; Salzer and Gugerli 2012). Furthermore, using the newly developed method, we detected confounding effects of population size and seasonal variation on outcrossing rates. In concordance with the expectations, small populations were characterized by a lower outcrossing rate. However, we found no association between DBH of a mother tree and the outcrossing rate. Also, our study demonstrated that biparental inbreeding is a significant component of the mating system. Interestingly, biparental inbreeding was found in both small and large populations. Finally, we inferred that pollen dispersal follows a heavy-tailed distribution, enabling long-distance pollen dispersal.

Density-dependent mating systems are well documented in the literature on plant mating systems. Most often, low stand density significantly increases the proportion of selfed progeny (Rajora et al. 2002; Robledo-Arnuncio et al. 2004; Mimura and Aitken 2007). Comparing core and peripheral populations in the Swiss Alps, Salzer and Gugerli (2012) found sharp differences in selfing rates (9 % vs. 35 %), indicating that pollen availability must be positively associated with population size in P. cembra. Other reports for P. cembra seem to be in agreement with this statement (Lewandowski and Burczyk 2000; Politov et al. 2008). In our case, significant differences in outcrossing rates found between small and large populations in the Tatras are in line with these conclusions. According to the reproductive assurance hypothesis (Lloyd 1979), when outcross pollen is unavailable, selfing allows seed production to continue. In our case, small subpopulations of P. cembra in the Tatras can experience outcross pollen limitation because sporadic mature trees are scattered in the forest so that inter-individual distances (even between nearest neighbors) are longer and, due to the presence of other species, harder to traverse (Milleron et al. 2012) as compared with large, dense subpopulations (especially Morskie Oko).

The average outcrossing rate found in this study translates into the expected inbreeding level F = (1 – 0.72)/(1 + 0.72) = 0.16. In the Ukrainian Carpathians, Politov et al. (2008) found a good correspondence between outcrossing rates and inbreeding levels estimated at the embryonic stage. However, low outcrossing rates do not necessarily lead to high inbreeding at the adult stage. In Abies balsamea, the population at the highest elevation revealed the lowest outcrossing rate measured at the seed stage (Neal and Adams 1985). However, the adult population showed an excess of heterozygotes, suggesting that selection favors heterozygotes in this marginal environment. The results for P. cembra fully support this prediction (Salzer and Gugerli 2012) with respect to populations in the Swiss Alps. In the case of the Tatras, a related study revealed that the population of mature individuals is only slightly inbred, with the overall average F = 0.051 (Dzialuk et al. 2014). Using the equilibrium equation (Ritland 1990), we can estimate the relative fitness of selfed individuals to be as low as w = 2tF/((1 – t)(1 – F)) = 0.283. Hence, it seems reasonable to expect that inbred individuals will be outcompeted by non-inbred ones during seedling development, as often observed in pine species (Savolainen and Hedrick 1995; Koelewijn et al. 1999; Bower and Aitken 2007). On the other hand, evolutionary theory predicts that recurrent selfing accompanied by intensive selection against inbred individuals can lead to genetic purging (Crnokrak and Barrett 2002). In such conditions, the number of deleterious alleles is significantly reduced, opening the possibility for increased self-fertilization and inbreeding without negative consequences for fitness. An experimental study on Scots pine provided good support for this theory (Kärkkäinen et al. 1996). In our case, small and isolated populations of P. cembra seem to be particularly prone to genetic purging, assuming they will continue to experience outcross pollen limitation.
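The two calculations in this paragraph can be checked with a few lines of code. The sketch below evaluates the expected inbreeding level F = (1 − t)/(1 + t) and the relative fitness of selfed individuals w = 2tF/((1 − t)(1 − F)) exactly as written in the text; the function names are chosen here for illustration. The quoted value w = 0.283 is reproduced when the unrounded overall estimate tm = 0.725 is used together with the adult F = 0.051.

```python
def expected_inbreeding(t):
    """Expected inbreeding coefficient under mixed mating with outcrossing rate t."""
    return (1.0 - t) / (1.0 + t)

def relative_fitness_selfed(t, f_adult):
    """Relative fitness of selfed progeny, w = 2tF / ((1 - t)(1 - F))."""
    return 2.0 * t * f_adult / ((1.0 - t) * (1.0 - f_adult))

f_expected = expected_inbreeding(0.72)          # outcrossing rate as quoted in the text
w = relative_fitness_selfed(0.725, 0.051)       # unrounded tm and adult F from the text

print(f"expected F = {f_expected:.3f}")
print(f"relative fitness of selfed individuals w = {w:.3f}")
```

The contrast between the expected F at the seed stage and the much lower F observed in adults is what drives w well below 1.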

It should be stressed that the reported selfing rates can be underestimated, because our estimates did not account for seed abortion, which is highly correlated with self-fertilization (Koski 1971; Savolainen et al. 1992; Kärkkäinen et al. 1999). The total probability of survival of a seed can be expressed as P(B) = sws + (1 − s)wt, where s is the actual self-fertilization rate, and ws and wt are the proportions of progeny that survived after selfing and outcrossing, respectively. After Ritland (1990), let w = ws/wt. Then, the (posterior) probability that a seed results from self-fertilization, given that it survived, is equal to \( P\left(A\Big|B\right)=\frac{sw}{1-s\left(1-w\right)} \). In fact, P(A|B) is equivalent to the effective self-fertilization rate \( \widehat{s} \) typically estimated based on genotypes of viable progeny. Consequently, the true selfing rate in small and large populations in the Tatras adjusted for seed abortion, i.e., estimated as \( s=\frac{\widehat{s}}{\widehat{s}+w\left(1-\widehat{s}\right)} \), can be as high as 0.788 and 0.313, respectively. Interestingly, these estimates are in good agreement with equivalent estimates for peripheral and core populations in the Alps (Salzer and Gugerli 2012; see the Appendix). According to the mass action model (Holsinger 1991), where the frequency of self-fertilization is the proportion of self pollen in the total pollen received by a given tree, in small populations only 21 % of the pollen pool of a given tree is not produced by that tree. For comparison, in large populations the outcross pollen is expected to prevail (69 % of the total). It thus seems that pollen limitation is among the main threats in small subpopulations of P. cembra.
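The adjustment for seed abortion can be illustrated with a short sketch that implements the two formulas above: the effective selfing rate among surviving seeds, and its inverse, which recovers the true selfing rate. As an assumption made here for the example, the effective selfing rate is taken as the complement of the multilocus outcrossing rate (1 − 0.886 = 0.114 for large populations), and w = 0.283 is the relative fitness estimated earlier. The function names are illustrative.

```python
def effective_selfing(s, w):
    """Selfing rate among surviving seeds, P(A|B) = s*w / (1 - s*(1 - w))."""
    return s * w / (1.0 - s * (1.0 - w))

def true_selfing(s_hat, w):
    """Selfing rate at fertilization, s = s_hat / (s_hat + w*(1 - s_hat))."""
    return s_hat / (s_hat + w * (1.0 - s_hat))

w = 0.283                      # relative fitness of selfed progeny (from the text)
s_hat_large = 1.0 - 0.886      # ASSUMED: effective selfing = 1 - tm for large populations

s_large = true_selfing(s_hat_large, w)
print(f"adjusted selfing rate, large populations: {s_large:.3f}")

# Round trip: applying the forward formula recovers the effective rate.
print(f"back-transformed effective rate: {effective_selfing(s_large, w):.3f}")
```

The smaller w is, the larger the gap between the selfing rate observed in viable seeds and the selfing rate at fertilization.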

Knowledge about pollen dispersal in P. cembra is extremely scarce. Using a parentage assignment approach, Salzer (2011) revealed that within a marginal stand of P. cembra in the Alps, effective pollen dispersal distance was, depending on the conditions, <200 m or about 315 m on average (maximum distance of 659 m). Our own analysis revealed that the total pollen dispersal follows a heavy-tailed dispersal kernel, confirming the great potential of P. cembra for long-distance pollen dispersal. The mean dispersal of about 1 km inferred in this study was threefold higher than previously found. However, estimates reported by Salzer (2011) did not account for pollen immigration, which reached about 40 % from at least 1 km. Thus, the total effective pollen dispersal in the Alps could be higher. Using the cumulative distribution function (CDF) for the exponential-power kernel (Fig. 2) estimated in this study, we can infer that in the Tatras about 69 % of pollen travels up to 1 km, while 31 % would be immigrating from at least 1 km. Thus, our results are in line with the findings based on the direct approach by Salzer (2011). They also correspond well with general knowledge that species with saccate pollen are characterized by great pollen dispersal potential (Schwendemann et al. 2007). In the case of P. cembra, the potential can be further increased by suitable environmental conditions, such as high wind speed. Nonetheless, at a regional scale, the specific configuration of P. cembra populations would eventually determine the potential for long-distance dispersal. A more detailed, parentage-based study would be helpful to compare small and large populations in this respect.
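The proportion of pollen travelling less than 1 km can be approximated from the reported kernel parameters. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes the common two-dimensional exponential-power parameterisation, in which the mean distance is δ = a·Γ(3/b)/Γ(2/b) and the CDF of distance is the regularized lower incomplete gamma function P(2/b, (r/a)^b). The scale a is not quoted in the text, so it is derived here from the reported δ = 1,266.5 m and b = 0.206. The incomplete gamma function is evaluated with its power series so that only the standard library is needed.

```python
import math

def reg_lower_gamma(s, x, tol=1e-12, max_terms=100000):
    """Regularized lower incomplete gamma P(s, x), evaluated by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= x / (s + n)
        total += term
    return total * math.exp(-x + s * math.log(x) - math.lgamma(s))

def scale_from_mean(delta, b):
    """Scale a implied by the mean distance: delta = a * Gamma(3/b) / Gamma(2/b)."""
    return delta * math.exp(math.lgamma(2.0 / b) - math.lgamma(3.0 / b))

def dispersal_cdf(r, a, b):
    """P(distance <= r) under the ASSUMED 2-D exponential-power kernel."""
    return reg_lower_gamma(2.0 / b, (r / a) ** b)

b = 0.206          # shape parameter, from the text
delta = 1266.5     # mean dispersal distance in metres, from the text
a = scale_from_mean(delta, b)

within_1km = dispersal_cdf(1000.0, a, b)
print(f"a = {a:.4g} m, P(r <= 1 km) = {within_1km:.2f}")
```

Under these assumptions the computed proportion should come out close to the 69 % quoted above; the very small implied scale a is typical of strongly fat-tailed kernels with b well below 1.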

An important component of plant mating systems is mating between relatives, leading to biparental inbreeding (e.g., Degen et al. 2004; Hirao 2010; Fenster et al. 2003). Interestingly, except for our study, no strong signatures of biparental inbreeding in P. cembra have been shown (Lewandowski and Burczyk 2000; Politov et al. 2008). The between-locus correlation of selfing (rs) can be interpreted as roughly equal to the proportion of inbreeding due to selfing, and (1 − rs) as the proportion of inbreeding due to mating between relatives (Ritland 2002). Consequently, small subpopulations in the Tatras showed a lower proportion of mating between relatives as compared with large ones. However, a lower proportion of consanguineous mating is somewhat counterintuitive, because typically one can expect that biparental inbreeding tends to increase as the number of mates decreases (Morgante et al. 1991; Takayama and Isoagi 2005). One possible scenario can be proposed taking into account the result that pollen dispersal was extensive. In this case, because short distances are under-represented in small subpopulations (due to lower density), it seems very likely that most outcross pollen travels long distances, favoring gene flow between unrelated individuals. Similar observations were previously made for isolated trees of Picea glauca (O’Connel et al. 2007). In contrast, in large and denser subpopulations, where short and intermediate distances are well represented, mating between close neighbors can occur relatively frequently (see Fig. 2). However, in such conditions, biparental inbreeding can evolve only if relatives tend to be spatially closer than at random (Fenster et al. 2003). In the case of P. cembra, seeds are typically dispersed by birds (N. caryocatactes; Tomback et al. 1993) and squirrels (Sciurus vulgaris). Given that seeds of P. cembra have no structures facilitating dispersal, such a mechanism allows efficient colonization of new habitats. Birds typically bury several seeds together (Tomback et al. 1993), enabling the development of a family structure. In fact, strong spatial genetic structure was found in populations of P. cembra in the Swiss Alps (Salzer 2011), providing some support for this prediction. However, we cannot be sure that strong spatial genetic structure is also present in the study populations, where clumps of individuals due to seed caching seem to be relatively rare.

Temporal variation in outcrossing rates is often found in natural conifer populations (e.g., Shea 1987; Cheliak et al. 1985; El-Kassaby et al. 1993). It is due to various reasons, with environmental variation being most often indicated. However, seasonality in outcrossing rates inferred in our study might also be caused by the masting behaviour of the species (Ulber et al. 2004). In fact, the significantly lower outcrossing rate in 2009 coincided with the large number of seeds collected in that year, relative to the other years (see M&M). Although we are rather skeptical about a definite causal link between these two characteristics, it might be reasonable to expect that higher fecundity leads to a lower outcrossing rate (Shea 1987). Nonetheless, similarly to O’Connel et al. (2006), we failed to show that tree size, used as a proxy for fecundity, is associated with outcrossing rates. Therefore, further studies are needed to confirm the role of both male fecundity and masting behavior in shaping mating system parameters of P. cembra.

Usually, P. cembra in the Tatras occupies hardly accessible locations. In consequence, our study was based on a limited number of mother trees and localities. Although inferences on mating system parameters were in concordance with predictions, we need to stress that a small number of sampled mother trees may preclude very strong conclusions, especially regarding conservation strategies for the species. Special caution is needed in the case of pollen dispersal inferences. In order to achieve sufficient statistical power, the parameters of the pollen dispersal kernel were estimated based on the entire sample, without regard to temporal or spatial variation of the dispersal, which might be present in our case. Thus, the estimated dispersal kernel represents rather the average across different population types and different seasons. On the other hand, the indirect approach (KINDIST) used in this study requires correlation of paternity to be a continuous and significant function of the distance between mother trees (Robledo-Arnuncio et al. 2006, 2007). Moreover, the latter is assumed to result entirely from pollen dispersal and not population genetic structure (Austerlitz and Smouse 2001). The absence of correlation between genetic and geographic distance found in the Tatras (Dzialuk et al. 2014) suggests that the observed pollen structure was indeed a result of limited pollen dispersal. Nonetheless, due to discontinuous sampling (see Fig. 2), the estimation procedure might suffer from bias, which cannot be assessed based on the existing data.

As a threatened species, P. cembra in the Tatras is under strict protection. However, in situ conservation has been questioned as an efficient and satisfactory means to preserve gene pools. Our results show that gene flow through pollen should be generally sufficient to counterbalance genetic drift. In fact, studying genetic structure of P. cembra in the Tatras, we detected very weak genetic differentiation (Dzialuk et al. 2014), suggesting that gene exchange between subpopulations is probably high. On the other hand, based on mating system parameters, we argued that small and isolated subpopulations of P. cembra both in the Alps and Carpathians can experience severe local pollen limitation, which can dramatically lower seed production (Salzer and Gugerli 2012) and fitness of individuals due to inbreeding. In particular, seed yield can be reduced (Salzer and Gugerli 2012), lowering (variance) effective population size. However, because tree species seem to be generally “resistant” to the impact of fragmentation on the genetic diversity (O’Connel et al. 2006; Kramer et al. 2008), the question of whether high adult genetic homogeneity in the Tatras (Dzialuk et al. 2014) will be preserved in future generations remains open. To answer this, we need to know if potentially high pollen-mediated gene flow can preclude erosion of the gene pool in small populations. Therefore, the next-generation genetic structure remains a pending issue for future studies.


This study was supported by the research grant from the Polish National Science Centre (NN304 129336) to AD. We are grateful to the Tatra National Park, and especially to Mr. Tomasz Mączka, for his help in the field work. We thank Katarzyna Meyza and Ewa Sztupecka for their assistance in the laboratory work, Stewart Berlocher for his help with English editing, and three reviewers for their constructive comments on the first version of the manuscript.

Conflict of interest

The authors declare no conflict of interest.

Data archiving

If the manuscript is accepted, genotype data will be submitted to Dryad. All genotype data are deposited at ResearchGate (doi:10.13140/2.1.1198.2727).

Copyright information

© The Author(s) 2014

Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Authors and Affiliations

  1. Department of Genetics, Kazimierz Wielki University, Bydgoszcz, Poland
