Introduction

The heritable basis of intraspecific and interspecific variation of forest trees was of interest long before the first tools allowing its explicit study appeared. In contrast to other wild-growing plants, the existence of genetic variation has direct economic implications in trees. At the advent of modern forestry, associated with extensive planting of forests, seeds and plants were procured from the most easily accessible sources and transferred across species’ ranges without limitation. This approach frequently led to economic disaster for forest owners in terms of huge losses in timber production and other benefits from forests. Therefore, field experiments such as common gardens rapidly became a tool guiding the choice of seed sources and transfers of forest reproductive materials. The first field trial testing populations of various origins (provenances) of Scots pine (Pinus sylvestris L.) collected across Europe was established as early as 1745 by the French navy officer Duhamel du Monceau (Langlet 1971) and was followed by similar experiments aiming at the selection of seed sources for reforestation. Of course, for a forester, the main selection criterion is yield rather than Darwinian fitness (König 2005); however, if a newly established forest stand survives until the rotation age (typically more than 100 years) and produces a high volume of biomass, this can be regarded as a sign that it is site-adapted. International coordination of provenance research was also among the prominent motivations for establishing the International Union of Forest Research Organizations (IUFRO) in 1892 (Seppälä 1998). On the territory of present-day Slovakia, IUFRO-coordinated provenance experiments were established as early as 1909 (Konôpka et al. 2013). Since then, numerous national as well as international provenance trials and progeny tests with practically all commercially important tree species have been established in Slovakia.

The problem with common-garden experiments is that the commonly assessed phenotypic traits are typically complex and polygenic and are affected by the environment, which cannot be fully controlled in field experiments; therefore, they cannot be used to detect genetic variation at the genic level. With the development of methods for the electrophoretic separation of proteins, the analysis of allelic variants at genes controlling the synthesis of enzymes (isoenzymes/isozymes/allozymes) became the first widespread tool (Feret and Bergmann 1976), and allozyme markers were considered a standard for population genetic studies in forest trees until the late 1990s. Even though the potential of analysing the hereditary material itself was recognized quite early (Hall et al. 1976), the use of DNA-based markers in forest trees expanded quickly only after polymerase chain reaction (PCR) became widely available.

In Slovakia (more exactly, Czechoslovakia), the boom of marker studies was embraced quite early: the first isozyme lab dealing with trees was established at the Institute of Dendrobiology of the Slovak Academy of Sciences in Mlyňany in the mid-1980s (later moved to the Institute of Plant Genetics and Biotechnology in Nitra), and with a small delay, another lab followed at the University of Forestry and Wood Technology in Zvolen (today the Technical University in Zvolen). At the Forestry Research Institute in Zvolen, a separate research group was established, although it was technically dependent on the university lab. Later, allozymes were gradually replaced by various types of DNA markers.

Due to their longevity, slow generation turnover and huge dimensions at adult age, forest trees have never been favourite models for studies of fundamental phenomena in genetics. The overwhelming majority of studies in forest trees used genetic tools to address the same issues as botanists usually do: evolutionary and migration history of species, adaptation to the environment, gene flow, hybridization, etc. The progress in ‘population genetics of forest trees’ since the appearance of genetic markers has in fact been identical with progress in botany, not only in Slovakia but also elsewhere in the world.

The aim of this review is to summarize (by no means exhaustively) the most important outcomes of studies employing various types of genetic markers in forest trees in Slovakia, not necessarily based only on local biological materials.

Descriptive studies

As in other cases of exploring unknown terrain, marker studies started with the description of variation patterns and the assessment of variation levels. The very first allozyme study coming from Slovakia was that of Kormuťák (1981) on isozyme variation in Abies species. Firs, and especially silver fir (Abies alba Mill.), remained the main object of interest during this initial stage (Kormuťák et al. 1982, 1993; Matúšová 1995; Ziegenhagen et al. 1996). These studies were largely restricted to the assessment of variation levels, and their very local geographical scale did not allow any deeper insights into the evolutionary context, even though such attempts were sometimes made.

In Zvolen, in accordance with the university profile, initial studies were more focused on forestry-relevant issues such as the effects of management on the genetic structures of forest stands or seed orchard crops, again mainly on a local scale. A study in Norway spruce (Picea abies Karst.) revealed that old-growth stands and naturally regenerated managed stands showed similar levels of genetic variation, while stands originating from reforestation were less genetically diverse and strongly but randomly differentiated, probably due to small effective population sizes of seed sources and the resulting genetic drift (Gömöry 1992). Studies in seed orchards revealed contrasting self-fertilization rates and levels of genetic contamination of the seed crop by background pollen in European larch (Larix decidua Mill.) and Scots pine, resulting from different properties of pollen grains (different size and the absence of air sacs in larch; Paule and Gömöry 1992). The issue of seed orchard management was later revisited: effective population sizes of seed orchards of Pinus sp. were assessed based on male and female flowering intensity and flowering phenology, and the effects of management measures on the genetic structures of their crops were assessed using allozyme markers (Gömöry et al. 2003; Machanská et al. 2013).
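To illustrate how flowering intensity can translate into an effective population size, the following is a minimal sketch of the widely used effective number of parents, 1/Σp_i², where p_i is the gametic share of clone i. It assumes that each clone's share is the mean of its female and male flowering shares and it ignores flowering phenology, so it is not the procedure of the cited studies. The function name and the flowering scores are hypothetical.

```python
def effective_number(female, male):
    """Effective number of parents, 1 / sum(p_i^2).

    female, male: flowering intensity scores per clone (non-negative, any unit).
    Assumption: a clone's gametic share p_i is the mean of its female and male
    shares; overlap in flowering phenology is ignored in this sketch.
    """
    f_total, m_total = sum(female), sum(male)
    shares = [0.5 * (f / f_total + m / m_total) for f, m in zip(female, male)]
    return 1.0 / sum(p * p for p in shares)


# Hypothetical orchard with four clones and unequal flowering
female_scores = [40, 30, 20, 10]
male_scores = [70, 10, 10, 10]
n_e = effective_number(female_scores, male_scores)
print(f"census number of clones: {len(female_scores)}")
print(f"effective number of parents: {n_e:.2f}")
```

When all clones flower equally, the effective number equals the census number; the more the contributions are skewed towards a few clones, the lower it becomes.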

Phylogeny and postglacial migration

For the research group in Zvolen, beech taxa in western Eurasia (Fagus sylvatica L. sensu lato) have been a favourite object of interest since the early 1990s, when large-scale monitoring of genetic variation of this species started in cooperation with the University of Bordeaux, France. Through international collaboration and the group's own collections, a set of 280 population samples covering the eastern half of the distribution of F. sylvatica subsp. sylvatica (European beech) and the whole range of F. sylvatica subsp. orientalis (eastern or oriental beech) was assembled and analyzed using allozyme markers. The resulting studies revealed that oriental beech is far more differentiated and more diverse than European beech and is composed of three geographically separated genetic lineages, distributed in the Alborz Mts. in Iran, the Caucasus and Asia Minor (extending to eastern Greece and Bulgaria), with European beech being genetically closest to the Turkish lineage. A substantially higher allelic richness in oriental beech is an indication that the Pleistocene glaciations affected its gene pool less severely than that of European beech (Gömöry et al. 2007; Gömöry and Paule 2010). On the other hand, climatic fluctuations during the Pliocene and the Pleistocene caused expansions and retreats of distribution ranges, accompanied by contacts and hybridization between regional genetic lineages; the most prominent example is beech in Crimea, sometimes considered a separate taxon, F. taurica Popl. Subsequent development of mathematical and computational tools allowed explicit testing of phylogenetic scenarios for this taxonomic complex by Approximate Bayesian Computation (ABC). The best-supported scenario placed the splits of the genetic lineages of oriental beech in the Early Pleistocene (1.62–1.87 My), while the clade leading to European beech diverged from the Turkish lineage at 1.18 My. The origin of Crimean beech from hybridization between the Caucasian and the European lineage was confirmed, the timing of this event being 144 ky (Late Pleistocene). On the other hand, another suspected hybrid lineage, the Balkan beech (treated as a separate taxon, F. moesiaca Czeczott, by local botanists), is just a separate clade that diverged from the Turkish lineage during the Cromer interglacial (817 ky), and the current European beech occupying a major part of Europe split from the Balkan lineage at the beginning of the Weichsel/Würm glacial (96 ky) (Gömöry et al. 2018).

When the 5th FP EU project Fossilva was organized at the beginning of the 2000s, aiming at the reconstruction of the Holocene migration of the main European tree species based on a combination of paleobotanical and genetic methods, the F. sylvatica allozyme dataset, along with a similar dataset on A. alba gathered at the Forestry Research Institute, Zvolen, was included in the analyses. In contrast with earlier assumptions, both types of data showed that the principal glacial refugia, from which the major part of the distribution ranges of temperate tree species were colonized, were not located in the very southern parts of the main southern European peninsulas (Balkan, Apennine and Iberian) but farther north. In the case of beech, the main refugium was situated at the eastern foothills of the Alps, from where beech migrated to almost the whole current range. The spread from refugia in Italy and the southern Balkans was blocked by large river valleys (Po, Danube). The refugia at the French Mediterranean coast started to expand too late (6–7 ky), when the surrounding areas had already been colonized by the Slovenian/Istrian genetic lineage. Finally, the refugia in the Cantabrian mountains (documented by macrofossils, the pollen record as well as allozyme data) did not expand at all: their population sizes were probably too small, and the resulting genetic drift may have depleted their gene pools of adaptively important alleles; they were overlaid by newcomers from the main refugium (Magri et al. 2006).

The migration history of silver fir is partly different. The principal refugia contributing to the colonization of the current range were located in the northern Apennines and the southern Balkans; allozyme data suggest a third important refugium in the northwestern Balkans, but this is not sufficiently documented by the fossil record (Liepelt et al. 2009). Maternally inherited mitochondrial markers allowed the migration trails to be traced in considerable detail. The Pannonian lowland forms a large gap in the distribution of fir. The contact zone east of this gap (the Carpathians) was found to be very short (~70 km), and mixed-haplotype populations are found only in close proximity to the boundary between mitochondrial lineages. On the other hand, the boundary in the west (the Dinarides) is long and complicated, as the Balkan lineage migrated along the coast as far as northeastern Italy, while the Central-European lineage colonized the interior as far as eastern Bosnia (Gömöry et al. 2004). Origin from different refugia also has practical implications: dendrochronological analyses documented that genetic lineages differ in growth patterns, resulting from different sensitivity to climate and industrial pollution (Bošeľa et al. 2016).

In the genus Abies, species richness in the Mediterranean area is especially high in the eastern part. There is, however, no consensus on the taxonomic treatment of local populations. The species status of A. nordmanniana (Steven) Spach and A. cilicica (Antoine & Kotschy) Carr. is undisputed, while A. equi-trojani Asch. et Sint. ex Boiss. and A. bornmuelleriana Mattf. are considered separate species by local botanists but were revised as subspecies of A. nordmanniana by Coode and Cullen (1965). The phylogeny of this complex was studied by Hrivnák et al. (2017) using nuclear microsatellites (the eastern lineage of A. alba Mill. and A. cephalonica Loud. were included for comparison). ABC simulations showed that A. alba and A. nordmanniana diverged from a common ancestor 3.73 My BP. A. bornmuelleriana split from the southern A. nordmanniana lineage 3.08 My BP, i.e., both events occurred in the Pliocene. In A. nordmanniana, two genetic lineages (cryptic subspecies) were revealed, separated by the ridge of the Greater Caucasus, which diverged 692 ky BP. Finally, A. equi-trojani appeared 621 ky BP, i.e., in the Early Pleistocene. The divergence times indicate that A. bornmuelleriana would deserve species rather than subspecies status. The two subspecies of A. cilicica showed a level of genetic differentiation even higher than that among the other studied species; on the other hand, ABC simulations dated their divergence to 695 ky BP, which is consistent with subspecies rank in this study.

On a genus-wide scale, phylogeny of Abies was addressed in a PCR-RFLP study of chloroplast DNA by Kormuťák et al. (2004), covering 15 Asian, 6 North-American and 7 Mediterranean species. The analysis revealed a strong divergence of the Mediterranean group from the rest, differentiation within the North-American group (especially between sections Balsameae and Grandes), and less variation among the Asian firs, with the exception of A. mariesii Mast.

A peculiar species studied was Picea omorika (Panč.) Purk, a narrow endemic of deep river valleys in Serbia and Bosnia and Herzegovina. In spite of its minuscule range, two genetic lineages were found within the Bosnian populations (Ballian et al. 2006). A chaotic pattern of genetic differentiation and a low allelic richness are indications of strong genetic drift in this Tertiary relic, which occupied a large part of Europe during the Pleistocene (Ravazzi 2002), but, as a species adapted to a cold and moist climate, remained trapped in favourable sites within the refugial area and was not able to spread across the Balkans with their generally warm and summer-dry climate.

Gene flow and hybridization

Hybridization has always been a high-priority topic for the research team in Mlyňany/Nitra. The initial object of interest was the genus Abies, where pioneering work has been done in artificial hybridization, profiting from rich resources of the Arboretum Mlyňany (Kormuťák 1985, 2004; Kormuťák et al. 2008b, 2013). Cytological aspects of interspecific hybridization in firs were studied as well (Kormuťák 1986; Kormuťák et al. 2002). Genetic markers (allozymes, PCR-RFLP cpDNA) were initially used just to verify the hybrid status of the offspring (Kormuťák et al. 1992, 1993) and to confirm paternal inheritance of plastids (Salaj et al. 1998). Later, the focus shifted to natural hybridization of pines, mainly the studies of hybrid swarms of P. sylvestris L. and P. mugo Turra. Again, PCR-RFLP cpDNA markers were initially used for validation of hybrid status and quantification of hybrid seeds in open-pollinated progenies collected in nature (Kormuťák et al. 2005, 2008a; Maňka et al. 2015). However, development and verification of species-specific cpDNA markers combined with nuclear markers and morphometry allowed assessing the degree and direction of introgression and revealed maternal inheritance of plastid DNA in P. mugo × P. sylvestris crosses, which is exceptional in the Pinaceae (Kormuťák et al. 2017, 2018, 2020).

Another species-rich taxon characterized by high levels of interspecific gene flow is the oaks. In Slovakia, the species of Quercus L. subg. Lepidobalanus (Endl.) Oerst. share substantial parts of their genomes. A combination of morphometry and allozyme analysis in mixed populations showed that even individuals unambiguously classified as Q. robur L. based on morphological traits may contain up to 90% of genes typical for Q. petraea (Matt.) Liebl. and vice versa; to a lesser extent, this is also true for Q. pubescens Willd. (Gömöry and Schmidtová 2007). Whether this is a result of shared ancestral polymorphisms or recent gene flow is still a matter of controversy (Muir and Schlötterer 2005 vs. Lexer et al. 2006). In any case, not all portions of the genome show identical levels of interspecific differentiation: while differentiation is very small in most allozyme genes, the gene for glucose dehydrogenase exhibited a coefficient of differentiation of 0.47. Taking into account the contrasting ecological requirements of sessile vs. pedunculate oak, such a skewed distribution of the coefficient of differentiation suggests that the gene in question may either be under selection pressure itself or be linked to adaptively significant genes (Gömöry et al. 2001).
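As an illustration of what a coefficient of differentiation measures, the sketch below computes Nei's G_ST = (H_T − H_S)/H_T for a single locus from allele frequencies in two groups. The allele frequencies are hypothetical and are not data from Gömöry et al. (2001), and the estimator used in that study may differ; the function names are illustrative only.

```python
def expected_heterozygosity(freqs):
    """Gene diversity 1 - sum(p_i^2) for one population at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def gst(populations):
    """Nei's G_ST = (H_T - H_S) / H_T with equal population weights.

    populations: list of allele-frequency lists (same allele order in each).
    """
    n_pops = len(populations)
    n_alleles = len(populations[0])
    # H_S: mean within-population gene diversity
    h_s = sum(expected_heterozygosity(p) for p in populations) / n_pops
    # H_T: gene diversity computed from the mean allele frequencies
    mean_freqs = [sum(p[i] for p in populations) / n_pops for i in range(n_alleles)]
    h_t = expected_heterozygosity(mean_freqs)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical two-allele loci scored in two species
weak = gst([[0.55, 0.45], [0.45, 0.55]])
strong = gst([[0.90, 0.10], [0.15, 0.85]])
print(f"weakly differentiated locus:   G_ST = {weak:.3f}")
print(f"strongly differentiated locus: G_ST = {strong:.3f}")
```

A locus whose value stands far above the bulk of the other loci, as reported for glucose dehydrogenase, is the kind of outlier that motivates the selection hypothesis in the text.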

The genetic architecture of hybrid zones is another topic associated with gene exchange between differentiated genetic lineages. As stated above, there are plenty of contact lines between the ranges of different but interfertile species or of intraspecific genetic lineages originating, for instance, from different refugia. One of them is the contact zone of A. alba and A. cephalonica in the southern Balkans, forming a hybrid taxon A. borisii-regis Mattf. Krajmerová et al. (2016) studied this zone employing a combination of a maternally inherited mitochondrial marker and nuclear microsatellites. The distribution of maternal lineages (reflecting colonization by seed dispersal) showed a sharp boundary between the A. alba and A. cephalonica lineages around 39.10°N, with a very steep latitudinal cline (width of 0.20° ≈ 22 km). In contrast, the distribution of nuclear gene pools (reflecting gene flow by pollen) suggests a broad introgression zone with a centre at 40.14°N and a width of 2.32° ≈ 255 km. As no gene or gene pool specific to A. borisii-regis was found, this taxon represents a hybrid swarm rather than a separate species.
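To make the notions of cline centre and width concrete, the following is a minimal sketch of the sigmoid (tanh) cline often fitted in hybrid-zone studies, in which width is defined as the inverse of the maximum slope. Whether Krajmerová et al. (2016) fitted exactly this model is an assumption here; the function names and the conversion factor of roughly 111 km per degree of latitude are illustrative, so the kilometre values only approximately reproduce those given in the text.

```python
import math

KM_PER_DEG_LAT = 111.0  # assumption: rough mean length of one degree of latitude


def cline_frequency(x, centre, width):
    """Expected frequency of the northern gene pool at latitude x (tanh cline).

    Width is defined as 1 / (maximum slope), so the slope at the centre is 1/width.
    """
    return 0.5 * (1.0 + math.tanh(2.0 * (x - centre) / width))


def width_in_km(width_deg):
    """Convert a latitudinal cline width from degrees to kilometres (approximate)."""
    return width_deg * KM_PER_DEG_LAT


# Centre and width reported in the text for the nuclear (pollen-mediated) cline
centre, width = 40.14, 2.32
for lat in (38.0, 39.0, 40.14, 41.0, 42.0):
    print(f"{lat:6.2f} N  p = {cline_frequency(lat, centre, width):.3f}")
print(f"nuclear cline width ~ {width_in_km(width):.0f} km")
print(f"mitochondrial cline width ~ {width_in_km(0.20):.0f} km")
```

Under this parameterization a narrow cline means an abrupt switch between lineages over a short distance, whereas a wide cline means a gradual transition, which is the contrast drawn between seed-mediated and pollen-mediated gene flow.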

The contact zone of the silver fir genetic lineages coming from the Balkan and Central-European refugia may serve as another example. Liepelt et al. (2002) treated the whole of central Europe as a single hybrid zone between these lineages, overlooking the fact that the Pannonian Plain did not harbour any fir populations during the last glacial and the Holocene and that colonization proceeded along separate migration trails east and west of this gap. This led to unrealistically high estimates of longitudinal cline widths both for seed-dispersed mitochondrial genes (1.72° ≈ 130 km) and for pollen-dispersed chloroplast genes (25.47° ≈ 1990 km). A re-analysis of the eastern (Carpathian) contact zone by Gömöry et al. (2012a) showed cline widths more comparable with those found for A. borisii-regis (18 km and 120 km for seed and pollen dispersal, respectively). Moreover, clines in another anemochorous tree, Fraxinus excelsior L., were of a similar order of magnitude (36 km and 275 km for seed and pollen dispersal, respectively) (Gömöry et al. 2012b).

There is a striking difference in the geographical patterns of hybrid zones between wind-dispersed and animal-dispersed species. In both cases above (representing wind-dispersed trees), the contact zone is not necessarily straight, but it is always sharp: populations containing representatives of both maternal lineages are very scarce and appear only in the closest vicinity of the contact line. This suggests that the colonization fronts in anemochorous species are typically compact, and when they meet, gene flow by seeds across the meeting line is very limited, while further movement of the colonization front is blocked. In contrast, in zoochorous trees such as beech or oaks, colonization is accomplished by long-distance dispersal events, when a few beechnuts or acorns are transported far from the colonization front (usually by birds), buried or lost, and give rise to a new local subpopulation. The outcome of this dispersal strategy is a broad hybrid zone, where pure patches of either haplotype alternate (Magri et al. 2006). In oaks, the same haplotype is even shared across species (Petit et al. 1997).

Adaptation

Since Darwin’s times, adaptation to a specific niche through natural selection has been considered the most important mechanism of evolution. In forest trees, taking into account their commercial importance and their role as dominant components of forest ecosystems, the knowledge of adaptation mechanisms is of great practical importance.

Industrial pollution, especially sulphur dioxide, has long been the main factor devastating forests in Central Europe. Longauer et al. (2001) compared the genetic structures of subsets of healthy vs. declining trees of P. abies, A. alba and F. sylvatica in pollution-affected stands. The most pronounced effects were found in spruce: paradoxically, declining trees were richer in alleles and more heterozygous than healthy trees. As the declining trees are all dead by now, this means that selection induced by pollution (in this particular case a combination of SO2 and heavy metals) caused depletion of the gene pools of the affected stands.

Common-garden experiments may serve as very useful objects for studying adaptation. First, in the case of trees, they typically contain materials collected across a large part of a species’ range. Second, all population samples are planted in identical macrosites, i.e. a large part of environmental effects on the assessed traits is eliminated. Gömöry et al. (2015) used the Slovak plot of a Europe-wide beech provenance experiment to identify traits under selection pressure by comparison of differentiation at quantitative traits (approximated by the coefficient of phenotypic differentiation PST) and differentiation at neutral markers (FST). Local adaptation driven by diversifying selection was found for all vegetative-phenology-related traits such as budburst or leaf discoloration date (as indicated by PST > FST), whereas stabilizing selection seems to operate on some physiological traits, e.g. stomatal conductance or transpiration rate (PST < FST).
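The comparison of PST with FST can be sketched as follows, assuming the common definition PST = (c/h²)σ²_B / ((c/h²)σ²_B + 2σ²_W), where σ²_B and σ²_W are the between- and within-population variance components of a trait. The sketch estimates them from a balanced one-way ANOVA and sets the unknown ratio c/h² to 1, which is an assumption rather than the setting of Gömöry et al. (2015). The trait values, the FST value and the function name are hypothetical.

```python
def pst(groups, c_over_h2=1.0):
    """Phenotypic differentiation P_ST from a balanced one-way layout.

    groups: equally sized lists of trait values, one list per provenance.
    c_over_h2: assumed ratio c/h^2 (unknown in practice; 1.0 is a common default).
    """
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    # ANOVA estimate of the between-population component, floored at zero
    var_between = max(0.0, (ms_between - ms_within) / n)
    var_within = ms_within
    denom = c_over_h2 * var_between + 2.0 * var_within
    return 0.0 if denom == 0 else c_over_h2 * var_between / denom


# Hypothetical budburst dates (day of year) for three provenances
budburst = [[108, 110, 111, 109], [115, 117, 116, 118], [122, 121, 124, 123]]
fst_neutral = 0.05  # hypothetical neutral-marker differentiation
value = pst(budburst)
verdict = "diversifying" if value > fst_neutral else "stabilizing or no"
print(f"P_ST = {value:.2f}, F_ST = {fst_neutral:.2f}: consistent with {verdict} selection")
```

Because c/h² is rarely known, such a comparison is usually repeated over a range of plausible values of this ratio before a trait is interpreted as being under selection.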

To get a deeper insight into the issue of adaptation, Sanger sequencing of candidate genes was used in P. abies, F. sylvatica and A. alba. The candidate genes chosen had shown promising results in earlier studies, albeit in different environmental contexts and under different experimental setups. Again, the studies profited from the collection of materials in provenance trials and earlier scoring of growth and phenology, as well as detailed measurements of physiological traits (Romšáková et al. 2012; Krajmerová et al. 2017; Hrivnák et al. 2019; Konôpková et al. 2019). The results were not always consistent among methods (FST-outlier approach vs. environmental association analysis), which is, however, quite common in this type of study. In some genes, such as dehydrin-controlling loci, signals of climate-driven selection at several single-nucleotide polymorphisms (SNPs) were to be expected (Krajmerová et al. 2017). In other cases, the exact mechanisms of action of the different alleles and their protein products showing significant associations with climate or adaptive phenotypic traits still need to be identified.

Perspectives

The problem with all markers used in the above studies is their limited representativeness, limited in terms of genome coverage rather than coverage of species’ ranges. The numbers of marker loci typically range between 10 and 30 for allozymes and nuclear microsatellites, and with a few exceptions nothing is known about their positions in the genomes. On the one hand, the advantage is that the type of marker that is optimal for the respective type of study can be chosen (e.g., neutral markers for phylogenetic studies, maternally inherited markers to follow migration, etc.). On the other hand, generalization and upscaling to a genome-wide scale is hardly possible. The same applies to Sanger sequencing of particular genes or DNA regions. In forest tree genomics, the candidate gene approach has been preferred because of the rapid decay of linkage disequilibrium in tree populations, which makes association studies difficult. The tested candidate genes are usually orthologs or functional equivalents of genes identified in model species (such as Arabidopsis, tobacco, maize or Populus trichocarpa) as responsive to environmental stress or associated with phenotypic traits such as phenology, growth, resistance to drought, heat or diseases, etc. Prior to application in natural populations, they require validation by gene expression or environmental association studies (Neale and Kremer 2011).

The progress of next-generation sequencing (NGS) methods opened much broader possibilities in all fields mentioned in the above sections, as it allowed extending the scale to the whole genome. Of course, resequencing complete genomes is still far from being an operational method of research at the population scale with a justifiable cost-benefit balance (especially in the main European conifers with genome sizes of ~20 Gbp). Reduced-representation approaches such as restriction-site associated DNA sequencing (RAD-Seq) allow the identification of 10⁴–10⁵ SNPs regularly scattered across the genome at reasonable costs. The Zvolen lab applied double-digest RAD-Seq in evolutionary studies in Alnus glutinosa (L.) Gaertn., F. sylvatica and P. abies, with phylogeny, genomic architecture in hybrid zones, and climatic and edaphic adaptation as the targeted phenomena. Although travel limitations associated with the Covid-19 crisis hampered the completion of material collection and the publication of results, preliminary results indicate that SNPs of adaptive value can be identified in all studied species.

Of course, NGS offers much broader possibilities than just reduced-representation sequencing. Reference genomes, indispensable for many purposes in genetic research, are still available only for a limited number of forest trees. The Zvolen lab participated in the sequencing of the Abies alba genome (Mosca et al. 2019). Recently, a consortium of research institutions in Europe (European Reference Genome Atlas; ERGA) has been established to generate reference genome assemblies for diverse European species inhabiting aquatic and terrestrial ecosystems, ranging from threatened, endemic and keystone species to species of economic importance. Rapidly decreasing sequencing costs may open a window of opportunity for Slovak research teams to apply NGS in genome-wide association studies, identification of epigenetic changes, transcriptome analysis and other fields.