Introduction

Corylus avellana L. (hazel or hazelnut) is a long-lived, widespread, multistemmed shrub. Its geographic distribution extends from the Mediterranean coast of North Africa northward to Britain and Scandinavia and from the Atlantic coast of Europe eastward to the Ural Mountains (Persson et al. 2004; Kasapligil 1972). Hazelnut is monoecious, dichogamous and wind-pollinated (Germain 1994). As in other plant species, the distribution of hazel in Europe has been strongly shaped by postglacial recolonization (Huntley and Birks 1983; Huntley 1990; Palmé and Vendramin 2002). In recent decades, analyses of spatial genetic structures in forest tree and shrub species have yielded valuable information about this process, especially about the number and approximate location of glacial refugia and the expansion from them into the central and northern parts of Europe (see, for example, Petit et al. 2002, 2003; Magri et al. 2006; Tollefsrud et al. 2008).

The most recent studies of genetic structures in natural populations of hazel are those of Palmé and Vendramin (2002) and Persson et al. (2004). Palmé and Vendramin analysed chloroplast DNA (cpDNA) variation in 26 European populations. Their findings indicate rapid expansion of hazel from one large refugial area in the southwest of France or from different scattered refugia in the west of France into most of Europe excluding Italy and the Balkans. Persson et al. (2004) analysed isozyme genetic variation and identified effects of historical bottlenecks in marginal populations combined with effects of vegetative reproduction. Because hazel is one of the world’s major nut crops, several investigations have dealt with the characterization of hazelnut cultivars (Boccacci et al. 2008; Boccacci and Botta 2009; Gökirmak et al. 2009). Human impact on hazelnut is assumed to date back to the Mesolithic era (10,000–6,000 years BP). Hazel is known to have been cultivated by the Romans, but the history of the domestication of C. avellana is still under debate (Tallantire 2002; Boccacci and Botta 2009; Kuster 2000). The strong demand for and intensive trade in hazelnuts suggest a strong human impact on the distribution of hazel genotypes, and thus on the genetic structures of hazel, all over Europe. Extant genetic structures are therefore considered to have resulted from both natural processes, such as postglacial remigration and local adaptation, and human impact through domestication and transfer of germplasm.

In Germany there is an ongoing debate concerning this issue (Anonymous 2004; Kowarik and Seitz 2003; Spethmann 2003), particularly because of the new German federal conservation law (BNatSchG 2010), which aims to conserve regional genetic structures and “autochthonous” populations. In our survey we analysed genetic variation in hazel in most parts of Germany with nuclear markers (isozymes as codominant markers and amplified fragment length polymorphisms, AFLPs, as dominant markers) and chloroplast markers (cpDNA-SSRs). This is the first systematic description of genetic variation within and between populations of hazel from different regions in Germany.

Based on the analysis of population differentiation with three different marker systems, we aim to contribute to the discussion on the role of natural versus human impacts on genetic structures of hazel, with a main focus on Germany. Specifically, we tested the hypotheses that (1) German populations are not differentiated from hazel populations in southern and southeastern Europe, and (2) the spatial distribution of populations has had no impact on their genetic structures.

Materials and methods

Populations sampled

Plant material was collected in early spring 2009. Populations were selected by local forest research centres if age, size, site conditions and plant associations indicated nativeness and presumable autochthony and if anthropogenic influences could be excluded as far as possible. Usually these populations are associated with forests and far from human settlements.

In part these populations had already been selected for the conservation of genetic resources. In total 20 populations (Table 1) comprising 18 German populations (for locations see Fig. 1), one Italian population and one Hungarian population, were collected with up to 100 randomly selected samples in each. All plants were sampled if the population size was below 100.

Table 1 Populations sampled, abbreviations, sample size and latitude and longitude
Fig. 1 Geographic distribution of the investigated stands of hazel in Germany

Methods

Isozymes were extracted from fresh buds and were separated and stained using standard procedures of isozyme electrophoresis (Wendel and Weeden 1989) with slight modifications. In total seven isozyme systems were investigated encoding 11 gene loci (Pgm-A, Pgi-B, Pgi-C, Got-B, Adh-A, Mdh-A, Mdh-B, Mdh-C, Skdh-A, 6Pgdh-A, 6Pgdh-B). Up to 100 samples per population were analysed (Table 1).

Frozen material of each sample was used for DNA extraction. DNA was extracted from 50 samples per population using a DNeasy 96 Plant kit (Qiagen, Hilden, Germany). Samples were genotyped with AFLPs (Vos et al. 1995, with slight modification as described by Gailing and von Wühlisch 2004); a subset of 20 samples per population was investigated by cpDNA-SSR analysis (Weising and Gardner 1999; Palmé and Vendramin 2002). Initially, different AFLP primer combinations were analysed in a subset of 48 samples comprising three German populations and those from Italy and Hungary to check variability and reproducibility of the detected fragments. Two pairs of standard AFLP primers with three selective nucleotides (primer 1 EcoRI-ACT/MseI-GAA; primer 2 EcoRI-ACA/MseI-GAA; nomenclature according to Keygene, http://wheat.pw.usda.gov/ggpages/keygeneAFLPs.html) that gave reliable and highly reproducible results were selected for further studies. Ccmp primers (ccmp1–ccmp10, except ccmp9) were used to analyse cpDNA variation. DNA fragments were separated on an ABI 3100 Genetic Analyzer with the internal size standard GS 500 ROX (Applied Biosystems). The fragments were scored using Genescan and Genotyper software (Applied Biosystems).

Genetic variation at codominant isozymes, dominant AFLPs and cpDNA haplotypes was analysed with the software GENALEX 6.41 (Peakall and Smouse 2006), GSED (Gillet 2010), NTSYS 2.01d (Applied Biostatistics Inc.; copyright 1986–1997) and POPGENE (Yeh and Boyle 1999). AFLP data were transformed into a zero/one matrix and allele frequencies were estimated assuming random mating. Genetic variation within populations was characterized in terms of Na (number of different alleles), Ne (effective number of alleles, \( 1/\sum p_{i}^{2} \)), He (expected heterozygosity), and PPL (percentage of polymorphic loci). The ability of single loci to differentiate between stands and regions was estimated by single-locus FST values calculated with the software POPGENE. Analysis of molecular variance (AMOVA, ΦPT), which is conceptually related to FST or GST, was used to estimate genetic variation among populations (Excoffier et al. 1992; Huff et al. 1993; Peakall et al. 1995; Michalakis and Excoffier 1996 according to Peakall and Smouse 2006).

Genetic distance d0 (Gregorius 1974) was calculated with GSED. The distance d0 varies between 0 and 1: it is zero if the genetic structures of two populations are identical and reaches its maximum value of 1 if the two populations have no allele in common. Unweighted pair-group method with arithmetic mean (UPGMA) cluster analysis was applied using d0.

Correlation graphs of geographic distances (x axis, kilometres) and genetic distances (y axis, d0) were plotted using GENALEX 6.41. Isolation by distance was investigated with the Mantel test (Mantel 1967) and spatial genetic structure analysis (999 permutations each) as implemented in this software. Spatial genetic structure analysis generates the autocorrelation coefficient r, which is bounded by −1 and +1 and displayed as a correlogram. The coefficient r is related to Moran’s I (Peakall and Smouse 2006; Peakall et al. 2003; Moran 1950). For our data the maximum geographic distance was divided into ten distance classes of equal size in kilometres. Significant spatial genetic substructure within a distance class is detected if the r value lies outside the 95 % confidence interval in the correlogram.

Genetic boundaries, namely areas where genetic structures show an abrupt rate of change, were identified with Monmonier’s algorithm using the software BARRIER version 2.2 (Manni et al. 2004). A rough map was constructed by Voronoï tessellation, which assigns to each sample (population) a polygonal neighbourhood consisting of those points on a plane that are closer to that sample than to any other. The analysis was performed using the default routine without changing the edges of the triangulation. Six barriers were computed in hierarchical order (a; b; c;…).
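The within-population measures Ne and He, and the estimation of allele frequencies for dominant markers under random mating, can be illustrated with a minimal Python sketch. It is not the code used by GENALEX or POPGENE: the function names and all frequencies are hypothetical, He is computed without a sample-size correction, and the square-root estimate for dominant markers assumes Hardy-Weinberg proportions with one band-present and one null allele per fragment.

```python
from math import sqrt


def effective_number(freqs):
    """Ne = 1 / sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 / sum(p * p for p in freqs)


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), without sample-size correction (assumption)."""
    return 1.0 - sum(p * p for p in freqs)


def dominant_marker_frequencies(band_present, n_individuals):
    """Square-root estimate for a dominant marker such as an AFLP fragment.

    Under random mating the band-absent phenotype has frequency q^2,
    so q = sqrt(share of individuals without the band) and p = 1 - q.
    """
    q = sqrt((n_individuals - band_present) / n_individuals)
    return 1.0 - q, q


# Hypothetical codominant locus with three alleles
isozyme_freqs = [0.5, 0.3, 0.2]
ne = effective_number(isozyme_freqs)
he = expected_heterozygosity(isozyme_freqs)

# Hypothetical AFLP fragment: band present in 32 of 50 individuals
p, q = dominant_marker_frequencies(band_present=32, n_individuals=50)
he_aflp = expected_heterozygosity([p, q])

print(f"Ne = {ne:.3f}, He = {he:.3f}")
print(f"AFLP: p = {p:.2f}, q = {q:.2f}, He = {he_aflp:.2f}")
```

Because a dominant marker has at most two allelic states, its He cannot exceed 0.5, which is one reason why diversity estimates from AFLPs and from multi-allelic isozymes are not directly comparable.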

Results

Genetic variation within and between populations

In total, 181 AFLP fragments, 11 isozyme gene loci and 9 cpDNA-SSRs were analysed. Within-population genetic variation (Na, Ne, He, and PPL; Table 2) was highest for isozymes, followed by AFLPs, and very low for cpDNA-SSRs. The hierarchical distribution of genetic variation within and between populations differed strongly between nuclear and cpDNA markers. The biparentally inherited nuclear markers revealed low between-population genetic variation, with ΦPT values of 0.0351 (P = 0.001) for isozymes and 0.0347 (P = 0.001) for AFLPs, whereas between-population differentiation was very high for the uniparentally inherited cpDNA-SSRs, with a ΦPT value of 0.933 (P = 0.001). The numbers of private alleles (unique to a single population) and locally common alleles were highest in AFLPs, followed by isozymes and cpDNA-SSRs (Table 3). To estimate the ability of a genetic marker system to detect genetic differentiation and spatial genetic structures, single-locus FST values were calculated for all isozyme, AFLP and cpDNA-SSR loci (Fig. 2). Maximum single-locus FST values of 100 % were observed for cpDNA-SSRs. AFLPs showed much lower values of up to 24 %. The lowest FST values were found for isozyme gene loci, with maximum values of about 4 %.

Table 2 Mean values over all loci and populations of Na (number of different alleles), Ne (allelic diversity measure, \( 1/\sum p_{i}^{2} \)), He (expected heterozygosity), and PPL (percentage of polymorphic loci)
Table 3 Results of AMOVA and mean values over all loci and populations for private and locally common alleles
Fig. 2 Ability of genetic markers to distinguish between stands of hazelnut estimated by single-locus FST values. Single-locus FST values are ordered according to increasing FST values for AFLPs, isozymes and cpDNA-SSRs
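How a single locus can reach an FST of 100 % is easiest to see in a small calculation. The sketch below uses the textbook definition FST = (HT − HS)/HT with unweighted population means; this is an assumption for illustration, the estimator implemented in POPGENE may differ in detail, and the allele frequencies are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst_single_locus(pop_freqs):
    """FST = (HT - HS) / HT for one locus.

    pop_freqs holds one list of allele frequencies per population,
    with the alleles in the same order in every list.
    """
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # HS: mean within-population heterozygosity
    h_s = sum(heterozygosity(f) for f in pop_freqs) / n_pops
    # HT: heterozygosity of the pooled (mean) allele frequencies
    mean_freqs = [sum(f[a] for f in pop_freqs) / n_pops for a in range(n_alleles)]
    h_t = heterozygosity(mean_freqs)
    if h_t == 0.0:
        return 0.0  # monomorphic locus: no variation to partition
    return (h_t - h_s) / h_t


# Two populations fixed for different variants: complete differentiation
fst_fixed = fst_single_locus([[1.0, 0.0], [0.0, 1.0]])

# Two populations with similar frequencies: weak differentiation
fst_similar = fst_single_locus([[0.55, 0.45], [0.45, 0.55]])

print(f"fixed differences: FST = {fst_fixed:.2f}")
print(f"similar frequencies: FST = {fst_similar:.2f}")
```

The first case corresponds to a locus at which populations share no variant at all, the second to the small frequency shifts typical of the isozyme loci.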

cpDNA geographic structures

The analysis of cpDNA genetic variation revealed three haplotypes (Table 4). No genetic variation was found within or between the German populations and the population from Hungary; all of these populations possess only haplotype H1. The Italian population is completely differentiated, with 70 % haplotype H2 and 30 % haplotype H3.

Table 4 Observed fragment lengths (bp) and deduced haplotypes in hazel
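The haplotype frequencies reported above translate directly into the genetic distance d0 described in the methods. The following sketch assumes the usual form of Gregorius' distance, half the sum of absolute frequency differences over all types; the function name and the dictionary layout are illustrative choices, not part of GSED.

```python
def d0(freqs_a, freqs_b):
    """Genetic distance d0: half the sum of absolute frequency differences.

    freqs_a and freqs_b map type names (alleles or haplotypes) to relative
    frequencies; a type missing from one population has frequency 0 there.
    """
    types = set(freqs_a) | set(freqs_b)
    return 0.5 * sum(abs(freqs_a.get(t, 0.0) - freqs_b.get(t, 0.0)) for t in types)


# cpDNA haplotype frequencies as reported in the text
german = {"H1": 1.0}      # every German population
hungarian = {"H1": 1.0}
italian = {"H2": 0.7, "H3": 0.3}

d_german_hungarian = d0(german, hungarian)
d_german_italian = d0(german, italian)

print(f"Germany vs Hungary: d0 = {d_german_hungarian:.2f}")
print(f"Germany vs Italy:   d0 = {d_german_italian:.2f}")
```

The distance between the German and Hungarian populations is zero because their haplotype compositions are identical, whereas the Italian population reaches the maximum distance because it shares no haplotype with the others.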

Spatial genetic structure analysis based on nuclear markers

The Mantel test revealed a clear correlation between geographic and genetic distance matrices for AFLPs if all populations are considered (R² = 0.64, P = 0.001; Fig. 3a). A weaker, but still significant correlation is observed if the two stands from Italy and Hungary are excluded from the analysis (R² = 0.20, P = 0.004; Fig. 3b). The respective values for isozymes are R² = 0.452 (P = 0.001) for all populations and R² = 0.099 (P = 0.01) for the German populations only.

Fig. 3 Correlation of geographic (x axis, kilometres) and genetic distances (y axis, d0 from AFLP data) for all populations (a) and the German populations (b)
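The logic of a Mantel test with 999 permutations can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GENALEX implementation: the test is one-sided for a positive correlation, the observed value is counted as one of the permutations, and the coordinates and genetic distances are invented.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def upper_triangle(matrix, order=None):
    """Pairwise values above the diagonal, optionally after relabelling populations."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(geo, gen, permutations=999, seed=0):
    """Return (r, one-sided P) for the correlation of two distance matrices."""
    rng = random.Random(seed)
    x = upper_triangle(geo)
    r_obs = pearson(x, upper_triangle(gen))
    hits = 1  # the observed arrangement counts as one permutation
    n = len(geo)
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # permute rows and columns of gen together
        if pearson(x, upper_triangle(gen, order)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical example: six populations along a transect (positions in km)
positions_km = [0, 100, 250, 400, 600, 900]
n = len(positions_km)
geo = [[abs(a - b) for b in positions_km] for a in positions_km]
# Invented genetic distances that increase with geographic distance plus some scatter
gen = [[0.0 if i == j else 0.05 + 0.0002 * geo[i][j] + 0.005 * ((i * j) % 3)
        for j in range(n)] for i in range(n)]

r_obs, p_value = mantel(geo, gen)
print(f"r = {r_obs:.3f}, R^2 = {r_obs ** 2:.3f}, P = {p_value:.3f}")
```

With 999 permutations the smallest attainable P value is 1/1000, which matches the P = 0.001 reported for the strongest correlations.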

The general pattern of correlated geographic and genetic variation was further analysed using UPGMA cluster analysis based on AFLPs (Fig. 4). The analysis revealed a strong differentiation between the German populations and the two populations from Italy and Hungary. The Hungarian population forms a separate branch, whereas the German populations cluster with the Italian one, which is nevertheless clearly differentiated from them.

Fig. 4 UPGMA cluster analysis based on the genetic distance d0 (x axis) between AFLP structures
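For readers unfamiliar with UPGMA, the following minimal Python sketch shows the clustering rule on a hypothetical distance matrix: the two closest clusters are merged repeatedly, and the distance from the new cluster to every other cluster is the size-weighted mean of the two old distances. The labels A to D and all distances are invented, and the function reports the distance at which clusters merge rather than drawing a dendrogram.

```python
def upgma(labels, matrix):
    """UPGMA clustering; returns a list of (members_a, members_b, merge_distance)."""
    clusters = {i: (label,) for i, label in enumerate(labels)}
    n = len(labels)
    dist = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # pick the pair of clusters with the smallest average distance
        (a, b), height = min(dist.items(), key=lambda item: item[1])
        size_a, size_b = len(clusters[a]), len(clusters[b])
        updated = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            # size-weighted mean equals the mean over all member pairs
            updated[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        dist = {pair: d for pair, d in dist.items() if a not in pair and b not in pair}
        dist.update(updated)
        merges.append((clusters[a], clusters[b], height))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Hypothetical pairwise distances between four populations
labels = ["A", "B", "C", "D"]
matrix = [
    [0.00, 0.10, 0.30, 0.50],
    [0.10, 0.00, 0.34, 0.54],
    [0.30, 0.34, 0.00, 0.52],
    [0.50, 0.54, 0.52, 0.00],
]

merges = upgma(labels, matrix)
for members_a, members_b, height in merges:
    print(f"{members_a} + {members_b} at distance {height:.2f}")
```

In this example A and B merge first, C joins them next, and D is attached last as a separate branch, the same kind of topology as a single strongly differentiated population joining a group of similar ones.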

Populations from the same geographic region, indicated by the same first two letters in the population code, are often genetically similar and group together; for example, (BB-01, BB-02, BB-03), SH-FK, (ND-FG, ND-FB), (NW-WV, NW-MB), (TH-BB, TH-GT). Potential genetic substructures were analysed by spatial autocorrelation (r) between geographic and AFLP genetic structures. The results are illustrated in Fig. 5 for all populations and in Fig. 6 for the German populations only. For all populations, genetic substructures were detected in the distance classes 91–182 km and 728–819 km, where r exceeds the upper or the lower limit of the 95 % confidence interval (Fig. 5). Autocorrelation among the German populations shows clear substructure in the 112 km distance class (Fig. 6).

Fig. 5 Correlogram of spatial genetic structures (AFLPs) of hazel populations with ten distance classes each of 91 km within a maximum distance of 910 km including one population from Hungary and one from Italy

Fig. 6 Correlogram of spatial genetic structures (AFLPs) for German hazel populations with ten distance classes each of 56 km within a maximum distance of 560 km
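The distance classes of the two correlograms follow from dividing the maximum distance into ten equal bins. The short Python sketch below shows how a pairwise distance is assigned to its class; it assumes that classes are labelled by their upper end point and that a distance lying exactly on a boundary belongs to the lower class. The function name and the example distances are hypothetical.

```python
from math import ceil


def distance_class_end(distance_km, class_width_km, n_classes=10):
    """Upper end point (km) of the equal-width distance class containing a distance."""
    if distance_km <= 0:
        raise ValueError("pairwise distance must be positive")
    # distances beyond the last boundary are placed in the last class
    k = min(ceil(distance_km / class_width_km), n_classes)
    return k * class_width_km


# All populations: ten classes of 91 km (maximum distance 910 km)
ends_all = [distance_class_end(d, 91) for d in (150, 800)]

# German populations only: ten classes of 56 km (maximum distance 560 km)
end_german = distance_class_end(110, 56)

print(ends_all, end_german)
```

A pair of populations about 110 km apart therefore falls into the 112 km class of the German correlogram.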

The calculation of six genetic barriers based on Monmonier’s algorithm (Manni et al. 2004) delineates areas with pronounced genetic differences (Fig. 7). First in the hierarchical order are barriers a and b, which separate the populations from Hungary (UNG) and Italy (ITA).

Fig. 7 Map of polygonal neighbourhoods (thin lines) of the sampled populations (nos. 1–20) and genetic barriers (bold lines) in hierarchical alphabetical order a–f

Barrier c delineates the most northwestern population BB-01, followed by barrier d which shows that the most southwestern part represented by RP-BO is genetically different. Further barriers underline the differentiation of populations from Brandenburg (BB-02 and BB-03, barrier e) and indicate that in the centre of the German hazel distribution there are still genetically differentiated populations such as TH-MS from Thuringia.

Discussion

Comparison of the different genetic marker systems revealed clearly different patterns of within- and between-population genetic differentiation in hazel. The observed values at isozyme gene loci are similar to those found previously (Rumpf 2002; Persson et al. 2004). AFLP data are difficult to compare because previous investigations used AFLPs as genetic fingerprints for the characterization of C. avellana accessions only (Ferrari et al. 2005; Chen et al. 2005; Kafkas et al. 2009). AFLP investigations at the population level in other long-lived woody plant species showed higher values of within-species variation for Fagus sylvatica (PPL 76–97 %, He 0.21–0.27) and similar levels in some tropical tree species (Shorea parviflora and S. leprosula) with PPL values of 52 and 53 % and He values at the population level of 0.14 and 0.16 (Papageorgiou et al. 2008; Cao et al. 2006). Very little genetic variation within populations was exhibited by analysing cpDNA-SSRs. Similar results were reported by Palmé and Vendramin (2002). Regarding the ability of the marker systems to differentiate between populations, the overall genetic differences between populations are similar for biparentally inherited isozymes and AFLPs at about 3.5 %.

The maternally inherited cpDNA markers show a high level of between-population genetic differentiation of 93 %. A comparison of the different marker systems concerning the ability of single loci to detect pronounced and statistically significant differences between populations (Table 3) shows that some cpDNA-SSRs reach the maximum FST value of 100 %, completely differentiating at least one population. Furthermore, more than 40 % of the analysed AFLP loci exhibit higher FST values than any isozyme gene locus and 22 % show values equal to or higher than 10 %. Over all populations, 1 isozyme, 41 AFLP and 4 cpDNA loci showed statistically significant differentiation. Among the populations from Germany, cpDNA markers show no differentiation at all, but 21 AFLP loci were still highly differentiated and showed pronounced differences between single populations. This result reflects the potential of AFLPs to differentiate hazel populations. Earlier investigations based on AFLPs have shown similar results for other tree species (Papageorgiou et al. 2008; Stefenon et al. 2007; Cao et al. 2006).

Spatial genetic structure analysis

Chloroplast DNA is generally maternally inherited in angiosperms and, therefore, dispersed by seeds only. Because recolonization of habitats occurs through seeds, cpDNA markers provide information on changes in species distribution in the past that is unaffected by subsequent pollen movements (Petit et al. 2003). In population genetic surveys these features in general lead to low within-population but high between-population genetic variation (Dumolin-Lapègue et al. 1997). As in previous investigations in hazel (Palmé and Vendramin 2002), our data show very low levels of within-population genetic variation but a clear geographic pattern of the few cpDNA haplotypes. In contrast to previous investigations, our data show only a single haplotype shared by all German populations and the Hungarian one. The Italian population is composed of two unique haplotypes and is thus completely differentiated from all other populations. A clear separation between Italian and Hungarian (“the Balkan”) populations and those of the rest of Europe is not supported by our data. This seems to be because of a problem in clearly delineating the Balkan region geographically. In our survey the Balkan region, as a synonym for southeast Europe, is represented by one population sample from Hungary, whereas in the study by Palmé and Vendramin (2002) four population samples from Slovakia, Croatia, Romania and Greece were included. Thus, our results do not contradict the previous results, and the analysis of more populations could help to elucidate the “fine-scale” genetic structures in this area.

Complete differentiation of the Italian population was also reported by Palmé and Vendramin (2002). They argue that Italy is excluded as a possible source of postglacial recolonization of the more northern parts of Europe. The assumption that for recolonization the Alps are stronger barriers than previously assumed is supported by the results of Magri et al. (2006), who found that beech recolonization north of the Alps took place without refugial populations from Italy. The cpDNA markers investigated in hazel are able to identify large-scale genetic structures at the European level, but are not suitable for identifying spatial structures at a finer scale in Germany, even if a comparably large sample size of 20 samples per population is analysed, as shown in this survey.

Our investigations of nuclear markers reveal clear spatial genetic structures between hazel populations. As implied by the general comparison of the different marker sets and shown by our results, AFLPs are well suited in this regard. Mantel tests for all populations showed high correlations between geographic and genetic distances. The Mantel test based on isozymes showed a weak but significant correlation, but it was not possible to draw a more detailed picture of regional geographic/genetic differentiation. This result is supported by the most recent work of Persson et al. (2004), who analysed 40 populations of hazel ranging from Scandinavia to Croatia and from the UK to Slovakia. The authors reported a “lack of regional differences in allele frequencies” and low genetic variation between populations, especially in Central Europe with GST values between 3 and 5 %, which is in accordance with the findings of our study. Similar results were obtained in beech analysing isozyme genetic variation (Konnert et al. 2000).

In contrast the cluster analysis for AFLPs showed a higher resolution of regional spatial structures because most of the geographically adjacent populations are also genetically adjacent. One of a few exceptions is the population RP-BO from Rhineland-Palatinate. Different resolutions in the detection of spatial structures with different marker sets were also shown in beech populations in Germany. No or only weak structuring was observed with isozymes, whereas highly polymorphic SSR markers revealed strong clustering of different regions (Konnert et al. 2000; Rajendra 2011).

The occurrence of spatial genetic structures in different regions is supported by autocorrelation analysis and calculation of genetic barriers. Autocorrelation analysis showed spatial genetic structures in two distance classes, which refer first to the genetic differentiation between the German populations and the two southern populations, and second to the regional genetic structure within a distance of about 110 km. Delineation of genetically differentiated areas by genetic barriers indicates pronounced differences among German hazel populations, at least in the most northwestern and the southwestern parts of the German distribution. Concerning the two populations from southern Europe, AFLP data surprisingly showed the Italian population to be relatively close to the German populations, whereas the Hungarian population is highly differentiated. This result seems to contradict the result of the cpDNA analysis, which showed that the German and Hungarian populations share the same haplotype. If AFLPs represent mostly nuclear genetic information, which is very likely if random amplification of the large nuclear and the comparably small chloroplast genome is assumed (Schroeder and Degen 2008), this result possibly suggests that nuclear genetic structures of German hazel populations were influenced by more than one glacial refugium. It is conceivable that hazel initially spread from one refugium in southwest Europe to the northeast and was later influenced by pollen flow from other refugia.

In summary, our data show that the working hypotheses of no isolation by distance between German and southern populations, and among the German populations alone, must be rejected. Present genetic structures of hazel in Germany indicate that the effects of gene flow resulting from human activity, of hybridization events due to domestication since Roman times, and of the strong trade in hazel germplasm are more restricted than expected. Our results indicate a general trend for isolation by distance and, moreover, genetic differentiation between hazel populations of different geographic regions of Germany.