Quercus robur (L.) and Quercus petraea (Matt.) are very common tree species in many European forests. Their distribution ranges from Scandinavia to the Iberian Peninsula, and from Great Britain to Eastern and South-Eastern Europe (San-Miguel-Ayanz et al. 2016). The range of Quercus robur extends a few hundred kilometres further east, up to the South Ural region in Russia. Both species are economically significant and provide high-quality hardwood for timber construction, furniture and barrels.

Quercus robur and Q. petraea have been genetically well studied across their West-European distribution range, but gaps remain, especially in the East (Degen et al. 2019). Chloroplast RFLPs, microsatellites and, more recently, SNPs have been used as gene markers to study genetic differentiation and population genetic processes such as gene flow, mating system, and the impact of postglacial re-colonisation and human seed transfer (Konig et al. 2002; Petit et al. 2002; Buschbom et al. 2011; Gerber et al. 2014). Although large sets comprising thousands to millions of SNPs have been developed for the two oak species (Leroy et al. 2019; Lepoittevin et al. 2015), there is still a need for smaller sets of highly geographically informative SNPs that can be genotyped cost-effectively.

For SNP discovery, we used leaf or cambium material from 95 Q. robur and Q. petraea trees originating from across Europe, including Ukraine and Russia (Table 1). Sampling was more intensive in Germany to capture within-country variation. DNA was extracted according to Dumolin et al. (1995). Double digest restriction site associated DNA sequencing (ddRAD) (Peterson et al. 2012) was conducted on all samples to detect SNPs in the nuclear genome (Floragenex, Portland, USA). Among the 26,074 putative SNP loci obtained by ddRADseq, only those with a minimum flanking region of 50 bp around the SNP and a maximum of two neighbouring SNPs were selected for further analysis (3,648 loci). Further data cleaning removed loci with more than 10% missing data or a minor allele frequency below 1%. Data were grouped by species, country, and state within Germany to conduct discriminant analysis and to detect the loci with the highest contribution. This allowed the selection of SNP loci with a geographical signal. We additionally examined samples from Germany and Russia to select loci with both high expected heterozygosity and positive Fis to avoid parapatric loci. All analyses were conducted in R 3.6.0 using, among others, the packages vcfR, poppr, adegenet and hierfstat. A final selection of 168 loci was used to design four MassARRAY® iPLEX™ multiplexes (Assay Design Suite v2.0 [Agena Bioscience™, San Diego, USA]), which included a total of 130 loci (Supplementary Material S1).
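The missingness and minor allele frequency filter described above can be illustrated with a minimal Python sketch. It is not the authors' R pipeline. It assumes biallelic genotypes coded as the number of alternate alleles (0, 1 or 2, with `None` for a missing call), and it assumes a locus is dropped as soon as it fails either threshold. The function names and the toy loci are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def missing_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples without a genotype call at this locus."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele among called genotypes (0.0 if none are called)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def keep_locus(
    genotypes: Sequence[Genotype],
    max_missing: float = 0.10,  # threshold from the text: more than 10% missing is dropped
    min_maf: float = 0.01,      # threshold from the text: MAF below 1% is dropped
) -> bool:
    """Return True if the locus passes both filters."""
    return (
        missing_rate(genotypes) <= max_missing
        and minor_allele_frequency(genotypes) >= min_maf
    )


# Hypothetical toy data: three loci genotyped in ten trees.
toy_loci: Dict[str, List[Genotype]] = {
    "locus_a": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1],        # complete and polymorphic
    "locus_b": [0, 0, None, None, 1, 0, 0, 0, 0, 0],  # 20% missing calls
    "locus_c": [0] * 10,                              # monomorphic
}

retained = [name for name, g in toy_loci.items() if keep_locus(g)]
```

In this toy data set only `locus_a` is retained; `locus_b` fails the missingness threshold and `locus_c` fails the frequency threshold.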

Table 1 Quercus robur and Q. petraea samples used for double digest restriction site associated DNA sequencing (ddRAD)

We chose to test our newly developed markers on 190 Q. robur trees from 19 locations in the Russian part of its distribution range (Table 2). All samples were run on a MassARRAY® iPLEX™ platform (Agena Bioscience™, San Diego, USA) using the iPLEX™ GOLD chemistry. Genotyping was conducted with Typer Viewer v.4.0.24.71 (Agena Bioscience™, San Diego, USA). For each locus, we estimated the percentage of amplification, observed heterozygosity (Ho), within-population gene diversity (Hs) (Nei 1987), Fis, Fst (Weir and Cockerham 1984) and average differentiation among sampling locations (Gregorius 1987). Significance levels for Fis and Fst were tested with 10,000 randomizations (Supplementary Material S2). Analyses were conducted with GDA_NT (Degen, unpublished) and Fstat (Goudet 1995). A total of 119 markers were usable. Among those, four loci showed an amplification rate lower than 80% and 24 loci were not polymorphic in the screened Russian samples. Additionally, nine loci deviated significantly from Hardy-Weinberg equilibrium (HWE) in the tested samples. Differentiation was low (Fst = 0.03) but significant (Goudet et al. 1996).
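As an illustration of the per-locus statistics listed above, the following Python sketch computes the amplification (call) rate, observed heterozygosity, expected heterozygosity and Fis for a single sample of biallelic genotypes. It is a simplified stand-in, not the estimators implemented in GDA_NT or Fstat: expected heterozygosity is taken as the uncorrected 2pq rather than Nei's (1987) Hs, Fis is computed as 1 - Ho/He, and no randomization test is performed. The genotype coding (0, 1 or 2 alternate alleles, `None` for a failed call) and the function name are assumptions made for this example.

```python
from typing import Dict, Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = failed call.
Genotype = Optional[int]


def locus_summary(genotypes: Sequence[Genotype]) -> Dict[str, Optional[float]]:
    """Summarise one biallelic locus in one sample of trees.

    He is the uncorrected 2pq (a simplification, not Nei's Hs), and
    Fis = 1 - Ho/He is left undefined (None) for monomorphic loci.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        raise ValueError("no genotype calls at this locus")

    p = sum(called) / (2 * n)              # alternate allele frequency
    ho = sum(g == 1 for g in called) / n   # observed share of heterozygotes
    he = 2 * p * (1 - p)                   # expected under Hardy-Weinberg
    fis = None if he == 0 else 1 - ho / he

    return {"call_rate": n / len(genotypes), "Ho": ho, "He": he, "Fis": fis}


# Hypothetical examples.
balanced = locus_summary([0, 1, 1, 2, None])  # genotype proportions match 2pq
deficit = locus_summary([0, 0, 2, 2])         # no heterozygotes observed
fixed = locus_summary([0, 0, 0])              # monomorphic locus
```

A positive Fis, as in `deficit`, indicates fewer heterozygotes than expected, while a value near zero, as in `balanced`, is consistent with Hardy-Weinberg proportions.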

Table 2 Spatial location of the 19 Russian populations included in the screening of 130 SNP loci

We successfully developed new SNP markers for Q. robur and Q. petraea, which will be useful for population genetic studies at the European level. Additional screening of samples from other regions will be needed to assess whether a single set of SNP markers is sufficient to cover the whole distribution range of these species and to track the origin of forest reproductive material.