One approach to the considerable challenge of conserving biodiversity in general, and the genetic resources of forest tree species in particular, is to study the genetic structure and polymorphism of oak populations and their peculiarities, which are governed largely by origin.

In the current study, we analyzed the progeny of sessile oak [Quercus petraea (Matt.) Liebl.], pedunculate oak (Q. robur L.), and their hybrids in the Trakas forest (Lithuania). Sessile oak overlaps with pedunculate oak in regions within the Atlantic and continental climate zones. Because the two oak species grow under different climatic conditions, the populations formed under these conditions are characterized by high intraspecific genetic variation and differ in terms of phenology, growth rate, shape, and other characteristics (Kleinschmit 1993). Many oak species are sympatric, and mating may take place between them. The large number of intermediate forms and hybrids makes it more difficult to differentiate between the two focal oak species. However, Kleinschmit et al. (1995) managed to uncover small differences in morphological and genetic characteristics between these species. These authors regarded sessile and pedunculate oaks as ecotypes of the same biological species, whereas Aas (1996) reasoned that, taxonomically and ecologically, they are two distinct species. Muir and Schlötterer (2005) also regarded these oaks as distinct species with a common ancestor since there is minor genetic differentiation between them. The results obtained by Gugerli et al. (2007) tend to support the hypothesis that sessile and pedunculate oaks are distinct taxa that arose from a common ancestor. Genetic diversity in autochthonous populations should be assessed at different hierarchical levels, namely, at the levels of species complexes (species belonging to the same botanical section with largely overlapping natural ranges and natural hybridization in all pairwise combinations), species within complexes, and populations within species (Kremer and Petit 1993).

In most cases, identifying whether the trees are sessile or pedunculate oaks is based on an analysis of only morphological characteristics, such as the shape of the leaf and bud, the length of the petiole, the color and texture of acorns, the crown shape, and the distribution of skeletal branches (Kremer and Goenaga 2002). However, the use of a morphological method as a diagnostic criterion in the analysis of the interspecific hybridization of oaks has a number of limitations, which are primarily due to the presence of a wide range of transitional forms of hybrid genotypes (Q. robur × Q. petraea), the lack of clear discreteness in terms of the morphological features used, and the range of intraspecific variability, which make it difficult to assign individuals to a particular taxon. As a result, typing results are often characterized by a high level of subjectivity (Petit et al. 2004).

Continuous, intensive hybridization in regions where pedunculate oak and sessile oak co-occur should influence the population genetic structure of stands by increasing the level of heterozygosity (due to hybrid offspring), the degree of genotypic diversity, and internal population subdivision and differentiation. An analysis of the mating system in mixed oak stands revealed an imbalance in the incidence of genotypic variants, which in turn pointed to the asymmetric nature of gene flow in populations (Neophytou et al. 2010). The selective nature of crossing can be explained by the presence of reproductive isolation: either pre-zygotic (e.g., a difference in the timing of flowering, competition during the germination of pollen grains or fertilization) or post-zygotic (e.g., reduced viability or death of hybrids at different stages of ontogeny). Subsequent studies, based on a series of artificial crosses and evaluation of the hybrid offspring, showed that pre-zygotic barriers make the greatest contribution to reproductive isolation (Lepais and Gerber 2011). Thus, for example, in contrast to the egg cells of pedunculate oak, those of sessile oak are preferentially fertilized by sperm cells of the same species (Petit et al. 2004).

Despite the extensive molecular genetic studies of the hybridization processes of sessile and pedunculate oaks that have considered the genetic structure of populations, the phylogeny of species, and so forth, new species-specific markers still remain to be found. Moreover, the variability detected by DNA analysis methods tends to be geographically specific, so the obtained data cannot be used in other regions (for the purposes of breeding selection and diagnosis of genotypes). Additionally, some aspects of the genetic structure of hybrid offspring from various types of backcrossing remain unexplored.

Based on these factors, we studied the features of the genetic structure of the hybrid offspring of Q. robur × Q. petraea of different origins. We used simple sequence repeats (SSRs [microsatellites]) and randomly amplified polymorphic DNA (RAPD) to reveal a wider range of ongoing processes of hybridization. The obtained data can be used to design long-term strategies and breeding programs to conserve genetic resources of the oak species under study.

Materials and methods

The objects of the study were individuals from 40 half-sib Quercus families of different origins grown in a field trial of the Institute of Forestry, Lithuanian Research Centre for Agriculture and Forestry, in the Pajiesis forest district of the Dubrava forest enterprise (Kaunas region). The site (site 27; 22 m2 [0.3 ha]; planted in 2009) has mesoeutrophic gleyic soils with temporary overmoisture (Lc). For the comparative analysis, the individuals were divided into six groups based on their taxonomic identity and maternal plant (based on a preliminary study of leaf morphological traits): No. 1 Q. petraea (M: Q. petraea) (19 families, 68 trees); No. 2 Q. robur (M: Q. robur) (5 families, 27 trees); No. 3 Hybrid (M: Q. petraea) (3 families, 13 trees); No. 4 Hybrid (M: Q. robur) (2 families, 10 trees); No. 5 Q. petraea (M: Hybrid) (3 families, 11 trees); and No. 6 Hybrid (M: Hybrid) (8 families, 38 trees). The taxonomic identity of the oaks was determined in the Trakas forest before seed was collected for the progeny test (Baliuckas 2000). Oak trees were classified as either pedunculate or sessile using the computer program EICHE 1.0 (Degen and Reinholdt 1999). The parent oak trees were 79–112 years old.

Total DNA was obtained from leaf lamina using a modified CTAB protocol (Padutov et al. 2007). Molecular genetic analysis was performed with two approaches: (1) separate loci characterized by wide allelic diversity (SSR loci were used as the markers) and (2) dispersed genomic loci, which were characterized by low allelic diversity (RAPD loci were used as the markers). To obtain reliable results in the RAPD analysis, the following requirements were fulfilled: only brightly colored specific fractions were used, each sample underwent triple amplification, and a preliminary analysis of total DNA using universal ITS and 16S rRNA primers was conducted to ensure the absence of fungal and bacterial contaminants (White et al. 1990; Lu et al. 2000).

For PCR amplification, High-Fidelity DNA Polymerase (5 Taq:1 Pfu) Master Mix (Primetech, Belarus) and the protocol recommended by the manufacturer were used. A set of oligonucleotide sequences from a number of publications (Moreau et al. 1994; Steinkellner et al. 1997) were used as SSR and RAPD primers.

The SSR amplicons were separated electrophoretically using an ABI PRISM 310 Genetic Analyser (Life Technologies, Carlsbad, CA, USA) according to the instructions of the manufacturer. RAPD bands were analyzed in 1.4% agarose using 1.5 × TBE (Tris–borate–EDTA) buffer and standard procedure, followed by staining with ethidium bromide and visualization under UV light (Padutov et al. 2007).

Genetic and statistical data processing was performed using the software PopGen32 (Yeh 1999). The genetic structure of the studied oak groups was expressed as frequencies of allelic variants. A number of statistical indicators describing the level of variability and the degree of subdivision were used to assess the basic population genetic parameters (Padutov 2001). For the analysis of genetic differentiation, the Nei coefficient of genetic distance (DN) was used (Nei 1972).
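As an illustration of the distance measure named above, the following minimal sketch computes Nei's (1972) standard genetic distance, D = -ln(I), from per-locus allele frequencies. It is not the PopGen32 implementation; the function name, the data layout (one dictionary of allele frequencies per locus), and the example frequencies are assumptions made for illustration only.

```python
import math


def nei_distance(group_x, group_y):
    """Nei's (1972) standard genetic distance between two groups.

    Each argument is a list of loci; each locus is a dict that maps an
    allele label to its frequency in that group (hypothetical layout).
    """
    if len(group_x) != len(group_y):
        raise ValueError("both groups must be scored at the same loci")
    jx = jy = jxy = 0.0
    for locus_x, locus_y in zip(group_x, group_y):
        alleles = set(locus_x) | set(locus_y)
        jx += sum(locus_x.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(locus_y.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(locus_x.get(a, 0.0) * locus_y.get(a, 0.0) for a in alleles)
    if jxy == 0.0:
        return math.inf  # no shared alleles: identity is zero
    identity = jxy / math.sqrt(jx * jy)  # Nei's normalized identity I
    return -math.log(identity)


# Illustrative frequencies only, not data from this study.
fixed = [{"A": 1.0}]
mixed = [{"A": 0.5, "B": 0.5}]
print(f"{nei_distance(fixed, mixed):.4f}")  # 0.3466
```

Because the sums over loci appear in both the numerator and the denominator, dividing each by the number of loci (as in the original definition) would give the same result.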

Data interpretation, sample genotyping, statistical analyses, and calculation of the main genetic parameters were carried out in accordance with generally accepted systems (Gillet 1999).

Results and discussion

Frequencies of alleles in families of Quercus

During preliminary testing of the set of microsatellite loci, four SSR markers were selected: ssrQpZAG15, ssrQpZAG9, ssrQpZAG46, and ssrQpZAG7. These markers showed statistically significant differences in allelic or genotypic diversity between Q. robur and Q. petraea. Among the RAPD primers, the most informative were OP-R11, OP-R05, OP-R10, and OP-G06, which yielded diagnostically significant loci.

In the comparative SSR analysis of the allelic diversity of nuclear DNA from the Quercus groups, the microsatellite loci had different levels of variation, from a low of 9 variants (ssrQpZAG15) to a high of 32 (ssrQpZAG46). Loci ssrQpZAG9 and ssrQpZAG7 were intermediate, with 23 and 20 variants, respectively. Notably, most of the identified allelic variants for each microsatellite locus were present at a low frequency or were rare. On the one hand, these variants reflect local geographic variability and can be used to estimate the gene flow between populations; on the other, they increase the degree of intraspecific subdivision, which reduces the diagnostic value of a marker for interspecific differences between Q. robur and Q. petraea. In addition, increased mutability of a genetic marker increases the probability of allelic variant homoplasy and, as a result, can lead to a false decrease in the level of interspecific differentiation.

Assessment of the level of genetic differentiation

Based on the allele frequencies, the coefficients of genetic distance between groups were calculated, and a dendrogram was drawn (Fig. 1) to reflect the genetic and taxonomic relationships among Q. robur, Q. petraea, and hybrid genotypes of different origins (Petit et al. 2003). In the dendrogram, the clustering of groups by level of genetic similarity coincides with the biological characteristics of each offspring type (by origin). Thus, the closest genetic structures occur among offspring identified by morphological features as belonging to Q. petraea, despite their different origins. This phenomenon is explained by the peculiarities of backcrossing in interspecific oak hybrids and has been widely discussed (Petit et al. 2003). Based on the genetic structure of mixed populations, a model of the resurrection of Q. petraea has been proposed in which hybrids are saturated by repeated crossing with sessile oak. The following biological peculiarities of pedunculate and sessile oaks support this model: the presence of asymmetric gene flow due to assortative crossbreeding of hybrids (due to the coincidence of flowering periods) and the fertilization of interspecific oak hybrids by pollen of Q. petraea (at the stages of pollen tube germination and zygote formation). The apparent absence, among the offspring obtained from hybrid maternal plants, of individuals with the morphological characteristics of pedunculate oak also supports the selective nature of fertilization.

Fig. 1 Genetic differentiation among progeny groups based on SSRs. DN is the Nei coefficient of genetic distance. On the right, the identified species is given first; the maternal type (M) is in parentheses.

As noted by some authors (Kremer and Goenaga 2002), the key element of this hybridization model, which determines how effectively crossbred hybrids are saturated with Q. petraea features, is that only a limited number of morphological and physiological traits are species-specific for sessile and pedunculate oak and that these traits are localized in several linkage groups; this allows the original morphotypes to be recreated within a small number of generations. The model is also supported by the absence of a large number of intermediate morphological forms of Q. robur and Q. petraea in mixed populations (Kremer and Goenaga 2002).

The second cluster is represented by hybrids originating from hybrid and Q. robur maternal trees, while hybrids obtained from Q. petraea were set apart on the dendrogram. These results can also be explained by the assortative nature of mating between pedunculate oak and sessile oak. As noted earlier, the analysis of artificial pollination showed that maternal trees of Q. robur could be fertilized with equal probability by pollen of Q. robur and Q. petraea, whereas maternal trees of Q. petraea were pollinated preferentially by male gametes of their own species. At the same time, a small number of hybrid offspring of Q. petraea × Q. robur are also formed (Abadie et al. 2012). Moreover, both maternal and paternal genotypes are equally likely to participate in the formation of the plant karyotype. In the case of group 4 (hybrid plants originating from Q. robur maternal plants), pollination occurs with pollen from Q. petraea; group 6 is represented by hybrid genotypes originating from hybrid maternal plants (i.e., this group is mixed); and group 3 is formed by hybrid genotypes, carrying haplotypes of Q. robur or Q. petraea, that originate from Q. petraea as the maternal component. Thus, Q. robur acts as the pollinator in this group. Furthermore, as indicated earlier, the maternal trees of Q. petraea show a certain degree of reproductive isolation in relation to Q. robur. Genotypic compatibility is necessary for fertilization, i.e., certain variants of Q. robur × Q. petraea genotypes are predisposed to interbreeding (Abadie et al. 2012). A detailed analysis of the genetic structure of this group revealed a high degree of genotypic monomorphy (with allelic variants peculiar to Q. robur), which causes greater similarity between families of hybrid genotypes and pedunculate oak and confirms the assumption of genotypic compatibility when crossing Q. robur and Q. petraea.
In addition, the presence of the paternal haplotype in group 3 makes these families more similar in genetic structure to Q. petraea, and mixed group 6 occupies an intermediate position in the clustering. The largest genetic differences, as one would expect, are observed for families represented by genotypes belonging to Q. robur that are also classified as Q. robur based on morphological characteristics.

The use of a set of RAPD loci as genetic markers also revealed differences in genetic structure among the studied oak families of different origins; these differences concerned the representation and frequency of allelic variants in one group or another. Significant differences among the groups of families were found for loci with a coefficient of genetic subdivision (GST) exceeding 0.09. By contrast, RAPD markers with low values of the subdivision parameter were characterized by negligible levels of differentiation between the groups, and the dendrogram clustering based on them was not statistically significant and varied greatly between loci.

The clustering structure obtained with loci having GST > 0.09 (57% of the total) was similar in most cases; thus, only this set of markers could be used to study the degree of genetic differentiation of the Quercus families.
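The selection of loci by their subdivision coefficient can be sketched as follows. This is a minimal example that assumes the common definition GST = (HT - HS) / HT, with HS and HT computed from unweighted group allele frequencies; the authors' software may apply sample-size corrections that are not reproduced here. The locus names and frequencies are hypothetical.

```python
def gst(locus_freqs):
    """Coefficient of genetic subdivision for one locus.

    locus_freqs: one dict per group, mapping allele label -> frequency.
    Groups are weighted equally (an assumption of this sketch).
    """
    alleles = set().union(*locus_freqs)
    k = len(locus_freqs)
    # HS: mean expected heterozygosity within groups
    hs = sum(1.0 - sum(g.get(a, 0.0) ** 2 for a in alleles)
             for g in locus_freqs) / k
    # HT: expected heterozygosity of the pooled (mean) allele frequencies
    ht = 1.0 - sum((sum(g.get(a, 0.0) for g in locus_freqs) / k) ** 2
                   for a in alleles)
    return 0.0 if ht == 0.0 else (ht - hs) / ht


# Hypothetical diallelic loci ("1" = band present, "0" = band absent).
loci = {
    "locus_a": [{"1": 0.8, "0": 0.2}, {"1": 0.4, "0": 0.6}],
    "locus_b": [{"1": 0.5, "0": 0.5}, {"1": 0.5, "0": 0.5}],
}
informative = [name for name, freqs in loci.items() if gst(freqs) > 0.09]
print(informative)  # ['locus_a']
```

Here the 0.09 threshold is the one reported in the text; only loci above it would be retained for clustering.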

The structure of the dendrogram (Fig. 2) shows that all studied groups could be divided into two separate clusters: families represented by the seed progeny of pedunculate oak and those with a mixed origin of genotypes from Q. petraea and hybrid maternal plants. At the same time, the placement of families within clusters deviated somewhat from the distribution expected on the basis of external features. This discrepancy can apparently be explained by an imbalance of genotypic structure within most of the groups with a hybrid origin. Because RAPD loci are dominant, genotypes and allele frequencies cannot be observed directly and must be calculated theoretically, which, for samples with unbalanced genotypic structures, introduces significant inaccuracies in the parameters of genetic polymorphism, subdivision, and differentiation. Thus, discrepancies between the results of the clustering and morphological analyses may indicate an imbalance of genotypic structure within groups. This assumption was further supported by the large value of the inbreeding coefficient (FIS) of 19.4% (varying from 14 to 25% among loci and 0.1–43% among families within groups), which was obtained with microsatellite markers. Here, the maximum FIS value was obtained for the group 4 families (Hybrid [M: Q. robur]), which presumably explains the location of this group in the dendrogram. Additionally, in assessing the processes of interspecific hybridization, the diallelic nature of RAPD markers had several advantages over the multiple allelism of SSR loci. In particular, interspecific differences in the frequency of amplified allelic variants of RAPD loci were larger than those shown by the SSR markers. Moreover, the likelihood of homoplasy of the dominant allele is extremely low in RAPD analysis compared with microsatellite analysis.
Importantly, the diagnostically significant RAPD loci in most hybrids were presented by the dominant allele of one of their parental taxa, which directly indicated the specific components in the genotypes of hybrid offspring.
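The effect of an unbalanced genotypic structure on allele frequencies estimated from dominant markers can be illustrated with a short sketch. It assumes the standard approach in which the frequency q of the null (band-absent) allele is taken as the square root of the proportion of band-absent individuals, which is exact only under Hardy–Weinberg proportions. The function names are hypothetical; the FIS value of 0.194 is the mean reported above, and the true q of 0.5 is an arbitrary example.

```python
import math


def null_allele_freq_hwe(absent_fraction):
    """Estimate the null-allele frequency q from the fraction of
    band-absent individuals, assuming Hardy-Weinberg proportions."""
    return math.sqrt(absent_fraction)


def band_absent_fraction(q, f_is):
    """Expected fraction of null homozygotes when the inbreeding
    coefficient is f_is: q^2 + f_is * p * q."""
    p = 1.0 - q
    return q * q + f_is * p * q


true_q = 0.5   # arbitrary example value
f_is = 0.194   # mean FIS reported for the SSR loci
estimate = null_allele_freq_hwe(band_absent_fraction(true_q, f_is))
print(f"true q = {true_q}, estimated q = {estimate:.3f}")
# true q = 0.5, estimated q = 0.546
```

With an excess of homozygotes, the square-root estimator overstates q, so diversity and differentiation parameters derived from it inherit the bias.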

Fig. 2 Genetic differentiation among progeny groups based on the RAPD assay. DN is the Nei coefficient of genetic distance. On the right, the identified species is given first; the maternal type (M) is in parentheses.

Recently, the identification and study of loci flanked by inverted repeats (including RAPDs) have attracted special interest because their structural organization resembles that of mobile genetic elements (MGEs; e.g., transposons, retrotransposons). MGEs are widely represented in the genomes of most plants and determine the formation and manifestation of many morphological and physiological traits (Kumar and Bennetzen 1999).

Analysis of parameters of genetic diversity based on SSR loci

As seen in Fig. 3, the lowest level of observed heterozygosity occurred in hybrid offspring originating from the original species, namely, Q. robur and Q. petraea, which on the one hand contradicts the biological concept of the hybridization process, but on the other hand, can be explained by the limited number of individuals participating in interbreeding due to the reproductive barrier associated with different flowering periods, progamic incompatibility, and other factors (Abadie et al. 2012). Additionally, the level of expected heterozygosity was higher in the case of maternal plants of Q. robur than for Q. petraea, thereby further confirming the selective nature of the pollination of sessile oak.

Fig. 3 Observed and expected heterozygosity of the studied progeny groups (1–6) based on SSRs. Groups: 1, Q. petraea (M: Q. petraea); 2, Q. robur (M: Q. robur); 3, hybrid (M: Q. petraea); 4, hybrid (M: Q. robur); 5, Q. petraea (M: hybrid); 6, hybrid (M: hybrid).

The highest level of observed heterozygosity among hybrid offspring was found in half-sibling families originating from hybrid maternal plants. The increase in the number of heterozygous offspring may have been due to the greater number of variants of genotypes involved in crossing and the high reproductive plasticity of the whole group.

As noted earlier, the group of half-siblings of sessile oak obtained from hybrid maternal plants was homogeneous in its genetic structure, which led to the similarity of the values of the expected and observed heterozygosity parameters. For groups of families represented by Q. robur and Q. petraea species, the level of observed heterozygosity was slightly lower than expected, indicating the presence of inbreeding processes during natural pollination.
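The comparison of observed and expected heterozygosity can be sketched for a single codominant locus as follows. This is a minimal example using the uncorrected estimator He = 1 - Σp² and FIS = 1 - Ho/He; the function name and the allele sizes are hypothetical and do not come from the study data.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Ho, He, FIS) for one codominant locus.

    genotypes: list of (allele, allele) tuples, one per individual.
    """
    n = len(genotypes)
    ho = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for g in genotypes for allele in g)
    he = 1.0 - sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    fis = 1.0 - ho / he if he > 0 else 0.0
    return ho, he, fis


# Hypothetical SSR genotypes (allele sizes in bp).
sample = [(100, 100)] * 4 + [(100, 104)] * 2 + [(104, 104)] * 4
ho, he, fis = heterozygosity(sample)
print(f"Ho = {ho:.3f}, He = {he:.3f}, FIS = {fis:.3f}")
# Ho = 0.200, He = 0.500, FIS = 0.600
```

A positive FIS, as in this example, corresponds to the heterozygote deficit that the text interprets as a sign of inbreeding.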

The level of polymorphism determined by the RAPD assay, as expected, was lower than the values obtained during the SSR analysis. The proportion of polymorphic loci (according to the 99% criterion) ranged from 79% (Hybrid M: Q. robur) to 100% (Q. robur/M: Q. robur and Q. petraea/M: Q. petraea). The average number of alleles per locus varied from 1.790 (Hybrid M: Q. robur) to 2.000 (Q. robur/M: Q. robur and Q. petraea/M: Q. petraea). Nei’s genetic diversity (Nei 1978) for the studied groups varied from 0.277 (Hybrid M: Q. robur) to 0.324 (Q. robur/M: Q. robur) and was 0.331 overall.

A detailed analysis of the parameters of genetic polymorphism among individual loci revealed regularities in the distribution of Nei’s genetic diversity that were dependent on the diagnostic significance of the markers (Nei 1978). As an example, Fig. 4 shows loci OP-R11 (1078) and OP-R10 (792): each species-specific locus showed the highest level of genetic diversity in one of the groups represented by non-hybrid genotypes. In the alternative non-hybrid group, the same locus was characterized by the lowest level of genetic polymorphism, and hybrid genotypes occupied an intermediate position, as expected for markers with dominant inheritance (Gillet 1999).

Fig. 4 Nei’s genetic diversity based on RAPD loci OP-R11(1078) and OP-R10(792) in groups of Quercus families. Groups: 1, Q. petraea (M: Q. petraea); 2, Q. robur (M: Q. robur); 3, hybrid (M: Q. petraea); 4, hybrid (M: Q. robur); 5, Q. petraea (M: hybrid); 6, hybrid (M: hybrid).

Study of genotypic structure

The departure from panmixia in natural plantations leads to the formation of a genotypic structure in the progeny that differs from that of the original maternal population. An additional factor that contributes to the transformation of the genotypic structure of a population is introgressive hybridization, which entails the emergence of new alleles and the formation of new allelic and locus combinations and permutations (Petit et al. 2003).

The genotypic structure of family groups was analyzed in two directions: estimation of the dominant composition of the genotypes in each of the groups and compliance of the genotype distribution with Hardy–Weinberg equilibrium. Based on the frequency of dominant (common) variants of the genotypes, Euclidean distance coefficients were calculated, and a dendrogram was drawn (Fig. 5).

Fig. 5 Clustering based on the level of similarity of genotypic structures of groups of Quercus using SSRs. On the left, the identified species is given first; the maternal type (M) is in parentheses.

As shown in the dendrogram, the greatest similarities in the frequency of dominant genotypes occurred between families of hybrid offspring originating from sessile oak and, conversely, sessile oak originating from hybrid maternal trees. Furthermore, the cluster was joined by the group of families of Q. petraea (M: Q. petraea). On separate branches, in decreasing similarity, it was joined by Hybrid (M: Q. robur), Q. robur (M: Q. robur), and Hybrid (M: Hybrid). The comparative analysis of the structures of the dendrograms obtained by allele and genotype frequencies showed that the structure of clustering in the first and second cases differed significantly (data not shown). This difference was primarily due to the greater diversity of genotypes than of allelic polymorphisms. In addition, as previously noted, the use of rare and unique variants of genotypes (the total share of which reaches 40%) for comparison was not possible, which in turn made a significant contribution to the results of clustering.

To determine the imbalance of the genotypic structure, we calculated the deviations from the theoretically expected patterns for each variant (Brown and Feldman 1981). As shown in Fig. 6, each graph includes a peak characterizing the variants of genotypes by the largest differences between their observed and expected frequencies. The left side (from the peak) of the area is represented by genotypes for which the calculated values of the frequency of the genotypes were higher than the observed values. Here, the x-axis represents the share of these variants in the group, and the y-axis shows the level of deviation, presented in unit fractions. The right area (from the peak) is represented by genotypes that were characterized by an excess of variants compared to the panmictic population.
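The comparison of observed genotype frequencies with those expected under panmixia can be sketched as below. The example takes the plain difference between the observed frequency and the Hardy–Weinberg expectation for each genotype variant at one locus; it is not necessarily the statistic of Brown and Feldman (1981) used for Fig. 6, and the function name and genotype data are hypothetical.

```python
from collections import Counter


def genotype_deviations(genotypes):
    """Observed minus Hardy-Weinberg expected frequency per genotype.

    genotypes: list of (allele, allele) tuples for one locus.
    Negative values mark a deficit, positive values an excess,
    relative to a panmictic population.
    """
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    freq = {a: c / (2 * n) for a, c in allele_counts.items()}
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    alleles = sorted(freq)
    deviations = {}
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            expected = freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]
            deviations[(a, b)] = observed.get((a, b), 0) / n - expected
    return deviations


# Hypothetical SSR genotypes (allele sizes in bp).
sample = [(100, 100)] * 4 + [(100, 104)] * 2 + [(104, 104)] * 4
for genotype, dev in genotype_deviations(sample).items():
    print(genotype, f"{dev:+.2f}")
# (100, 100) +0.15
# (100, 104) -0.30
# (104, 104) +0.15
```

In this example the homozygotes are in excess and the heterozygote is in deficit, which is the pattern described for the Q. robur (M: Q. robur) group.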

Fig. 6 Distribution of deviations (σp) of frequencies of genotype variants (NV) in groups of Quercus based on SSRs.

The lowest level of imbalance was found for the group Q. petraea (M: Q. petraea). For most genotypes, the observed frequency deviated from the calculated frequency by no more than 0.5%. In the group containing the half-sib offspring Q. robur (M: Q. robur), the average level of structural imbalance was also negligible (< 1.5%) and was mainly due to an excess of homozygous genotypes and a deficit of heterozygous ones. Analysis of the primary sample of oak trees at the loci used did not reveal null alleles at a frequency exceeding 1%. For the other studied groups, the discrepancy in the incidence of genotypic variants was more significant, indicating the assortative nature of crosses and the dominance of certain combinations of alleles in the offspring.


Based on the genetic structure of hybrid offspring of Q. robur × Q. petraea of various origins, the following conclusions can be drawn. First, interspecific hybridization of pedunculate oak and sessile oak in mixed plantations does not proceed stochastically; rather, it involves overcoming a reproductive barrier that, according to data from the literature, is based on phenological, physiological, and other differences. Second, because crossing during introgression is asymmetric, a limited number of individuals are involved, which affects the genetic structure of the hybrid offspring: the values of the variability indices and of intrapopulation subdivision are lower than those observed in stands of the original species.

Since sessile oak is adapted to grow in less fertile and drier sites, planting of this tree species should be increased in the abandoned agricultural areas in southern Lithuania. The changes resulting from global warming are likely to be more favorable for sessile oak and will also provide wider possibilities for its usage in forestry. The obtained scientific information should inform the design of strategies for germplasm conservation, long-term breeding strategies and programs, and practical breeding on an eco-genetic basis specifically for these tree species.