Assessing molecular diversity among 87 species of the Quercus L. genus by RAPD markers

Oaks (Quercus sp.) are among the most ecologically and economically important woody Angiosperms of the northern hemisphere. Nowadays, the reduction of Quercus biodiversity is becoming a matter of global concern, and several oak species have been included in the IUCN Red List of Threatened Species. Consequently, characterization and preservation strategies for the oak germplasm are largely promoted. Thus, in this work, the genetic diversity existing among 87 different Quercus species was assessed using the RAPD markers, in order to better typify these specimens, to show the amazing DNA variability of this plant genus, and to confirm or infer new putative molecular correlations. Our data were discussed taking into consideration the phylogenetic and phylogeographic relationships previously proposed by the literature. In general, the obtained results corroborated that the evolutionary pattern of Quercus genus has been extremely intricate and continues to change rapidly, making it difficult to be fully resolved. The evidence collected in the present investigation would confirm the complex evolution of the oaks, due to their high migration capacity, divergence rate, and hybridization propensity. This research, performed on a so large series of species, represents a positive contribution for highlighting the genetic diversity within collections of Quercus germplasm and favouring ex-situ conservation programmes.


Introduction
Population genetic studies performed on Quercus L. species (oaks), native to north America and Asia or present in the continuous distributions of the continental Europe, have suggested that the extraordinary biodiversity of this genus has been successfully subjected to several macro-and microevolutionary processes (Kremer and Hipp 2020).
Quercus genus includes both deciduous and evergreen plants and extends from cold latitudes to tropical areas, mainly in the northern hemisphere.About 450 species of oaks have been discovered all over the world, with a high percentage in the Americas (65%), where they represent about 20% and 29% of the US and Mexican forest biomass, respectively.
Vol:. ( 1234567890) The literature has classified oaks repeatedly (Aldrich and Cavender-Bares 2011;Gailing and Curtu 2014;Cavender-Bares 2019).The last validated version has proposed the existence of two subgenus: Cerris and Quercus (Denk et al. 2017).The first one would include three monophyletic groups: (i) cycle cup oaks (section Cyclobalanopsis) occurring in Asia, (ii) cerris oaks (section Cerris) and (iii) ilex oaks (section Ilex), both located in Eurasia and north Africa.On the other hand, the second subgenus would be made up of five clusters: (i) live/evergreen oaks (section Virentes), (ii) species of the section Ponticae and (iii) white oaks (section Quercus) distributed across the Americas and Eurasia (the latter also in north Africa), (iv) red oaks (section Lobatae) and (v) golden cup oaks (section Protobalanus) restricted to the Americas.
Quercus genus shows a wide range of morphological and physiological traits.Nevertheless, in some cases, the difference among closely related specimens is often not clearcut, making challenging their identification.For instance, white oaks usually exhibit similar characters, presenting overlapping systematic and ecological aspects (Gailing and Curtu 2014;McVay et al. 2017).This phenomenon is linked to the fact that oaks are prone to interspecific hybridization, introgression, and intricate speciation patterns, which involve incomplete lineage sorting (Pang et al. 2019).In addition, the sympatric occurrence of several Quercus species also promotes interbreeding and taxonomic confusion (Beatty et al. 2016).By contrast, the highest rates of species diversification have been detected in response to landscape changes and migration events into new territories (Kremer and Hipp 2020).
For centuries, oak classification has been based entirely on morphological traits, until the development of molecular approaches has changed remarkably the phylogeny of this plant genus (Denk et al. 2017).Among all, DNA analysis has been recognised as the most powerful tool for reconstructing Quercus evolutionary steps (Ardi et al. 2012).Dominant markers, such as AFLPs (Amplified fragment length polymorphisms), ISSRs (Inter Simple Sequence Repeats), and RAPDs (Random Amplification of Polymorphic DNAs), have generally provided higher levels of discrimination than other techniques, even in presence of hybrid species (Cotti 2008).
AFLP has been one of the first method applied to Quercus genus, for instance, to: (i) highlight the similarity existing among four California red oak species (i.e., Q. agrifolia, Q. parvula, Q. kelloggii, and Q. wislizeni) (Dodd and Kashani 2003); (ii) assess the gene flow between Q. suber and Q. ilex samples from Portugal (Coelho et al. 2006); (iii) evaluate the degree of differentiation within indigenous populations of Flemish Q. petraea and Q. robur (Coart et al. 2002); (iv) investigate the hybridization rate between Q. crispula and Q. dentata, two dominant components of the cool-temperate forests across Japan (Ishida et al. 2003).
Although modern techniques, like the restrictionsite associated DNA next-generation sequencing (RAD-seq), have been usefully applied for obtaining large-scale information about biogeography, niche modification, and diversification patterns of the American and European oaks (Hipp et al. 2018;2020), RAPD method has been deemed as one of the best systems to investigate the evolution of the Quercus species, due to the high level of polymorphisms present in oak genomes and the easy applicability of this molecular analysis (Aykut 2016).In this regard, RAPDs have been employed for detecting genetic similarities among Q.robur, Q. macranthera, Q. infectoria, Q. magnosquamata, and Q. libani from forests of the north-west Iran (Ardi et al. 2012), for solving taxonomic issues relative to three Turkish evergreen oak species (i.e., Q. coccifera, Q. ilex, and Q. aucheri) (Aykut et al. 2013;Aykut 2016), or for distinguishing Q. crassifolia and Q. crassipes Mexican exemplars with partially overlapping distributions (Tovar-Sánchez and Oyama 2004;González-Rodríguez et al. 2004).
Nowadays, the degradation rate of oak biodiversity is becoming a matter of global concern.Uncorrected land exploitation, over-cutting, over-grazing, and climatic changes are the main factors that contribute to the destruction of Quercus woods and trigger the threat of extinction to many species of this genus (Jawarneh et al. 2013).Therefore, characterization and preservation strategies of the oak genetics are largely favoured and promoted.In light of this, starting from 2010, a collection of about 430 oak exemplars, which represent 87 non-hybrid species deriving from different areas of the world, have been bedded out at the Botanical Gardens of Rome Tor Vergata (Rome, Italy).Thus, the aim of the current work consisted in the assessment of the genetic diversity of these 87 oak species by RAPDs, in order to: (i) better typify them at molecular level; (ii) highlight the amazing DNA variability existing in this plant genus; (iii) confirm or infer new putative genetic correlations among the studied specimens.Therefore, the obtained results represent the action of a project aimed at characterizing and promoting the ex-situ conservation of the Quercus germplasm.

Plant material and ethical statements
The plant material was kindly provided by Prof. Antonella Canini (Botanical Gardens of Rome Tor Vergata, Rome, Italy; Lat: 41.84758, Long: 12.64385).The sampling occurred in Spring 2019 and consisted in the collection of leaves (without symptoms of pathology) from 87 oak species present in the living collection of the Botanical Gardens of Rome Tor Vergata (Supplementary Information 1).Each sampled specimen, today accessible and growing at the Botanical Gardens of Rome Tor Vergata, was tagged and included in a dedicated register; botanical vouchers (codes: Qsp1-430) are available at the Botanical Gardens of Rome Tor Vergata.For each species, from 3 to 5 individuals were sampled and analysed, to validate accuracy, repeatability, and validity of methods and results.After picking, plant tissues were carefully washed with sterile water, powdered by mortar and pestle in presence of liquid nitrogen, and stored at − 80 °C until the use.The experimental research performed in this study complies with relevant institutional, national, and international guidelines and legislation; in particular, the collection of the plant material was carried out in respect of the IUCN Policy Statement.In addition, the current work promotes the conservation status of the investigated species not already available in public collection.

DNA extraction and quantitation
Total DNA was isolated from 100 mg of powdered plant material by using the NucleoSpin Plant II kit (Macherey-Nagel), according to manufacturers' instructions.However, this protocol was optimized and adapted for oak tissues, due to their high levels in secondary metabolites which negatively affect yield and purity of the nucleic acids.Thus, before the application of the kit, the sample was subjected to a preliminary purification step based on the CTAB method reported in Lefort and Douglas (1999) and Gismondi et al. (2012), with some modifications.In detail, the sample was resuspended in the extraction buffer (50 mM Tris-HCl pH 8.0; 20 mM EDTA; 0.7 M NaCl; 1% CTAB; 1% PVP; 2% SDS; 0.7% β-mercaptoethanol) for 15 min at 65 °C, diluted with a quarter of volume by chloroform/isoamyl alcohol (24:1; v/v), vortexed, and centrifuged at 11.000 g for 10 min.Finally, the aqueous phase was recovered and treated as described in the guidelines of the extraction kit.DNA concentration and purity were determined by a NanoDrop spectrophotometer (Thermo Scientific), reading the absorbance of the samples at 260 nm and 280 nm.

Agarose gel electrophoresis
After the amplification, PCR products were fractioned by electrophoresis on 2% agarose gel prepared using pH 8.5 TAE buffer (40 mM Tris-HCl; 1 mM EDTA; 20 mM acetic acid) and containing ethidium bromide (10 mg/mL).The run was carried out at 80 Volts for 120 min.DNA fragments were visualized under UV-light and photographed by a VersaDoc MP 4000 system (Bio-Rad).The molecular weight of the amplicons was estimated using the Mass Ruler DNA marker (Thermo Scientific) as reference.

Bioinformatics analysis
Polymorphic bands were detected in all species.Thus, RAPD profiles were defined using a binary method in which presence or absence of specific DNA fragments was coded as 1 or 0, respectively.Total number of bands, number of polymorphic bands, polymorphism in percentage, mean of the band frequencies, polymorphic information content value, marker index, and resolving power were calculated as reported in Guasmi et al. ( 2012), Ghosh et al. (2014), and Chesnokov and Artemyeva (2015).To simplify the analysis, each taxon was renamed using an internal lab-code (see Supplementary Information 2).The relationships among the genetic profiles were established using the reduced median-median joining (RM-MJ; r = 2, epsilon = 0) approach (Bandelt et al. 1999) implemented by Network software (v.10.0).Moreover, the matrix produced from the binomial data was employed to calculate the phylogenetic distance of the specimens, according to Nei and Li (1979) coefficient and UPGMA method, through the FreeTree program (Hampl et al. 2001).The relative dendrogram was drawn and viewed by TREEVIEW software (Page 1996).

Results and discussion
Oaks are among the most ecologically and economically important woody Angiosperms of the northern hemisphere.Therefore, the preservation of the Quercus genus appears fundamental for maintaining as balanced the environmental equilibria.Indeed, loss of biodiversity can cause severe consequences, including landscape fragmentation, phytosanitary problems, genetic erosion, and habitat degradation.To avoid such type of phenomena, conservation strategies and characterization studies of plant exemplars are crucial.For these reasons, in the present work, 87 different oak species were analysed by RAPD markers, to develop a specific identification system and to investigate the genetic diversity present in the oak germplasm collection present at the Botanical Gardens of Rome Tor Vergata for ex-situ conservation purposes.
The first step of the investigation included the setting up of the amplification protocol.Several combinations of primers were tested (see Materials and Methods section); among all, the most informative profiles, or rather those with the clearest and sharpest bands, were obtained using OPA-1 primer alone (Fig. 1A).This condition was further examined on varying two different annealing temperatures (i.e., 52 and 54 °C) and two MgCl 2 concentrations (i.e., 0.5 and 0.75 mM) (Fig. 1B).The best procedure, in terms of richness in bands, appeared to be that characterized by an annealing temperature of 52 °C in presence of 0.75 mM MgCl 2 .Thus, this amplification system was selected to be applied on DNAs purified from the 87 oak species.The reproducibility of the banding patterns, generated using the standardized RAPD approach reported above, was checked by comparing the molecular profiles of the same oak DNA isolated in different extraction procedures and carrying out inter-laboratory replicates.The polymorphic nature of the Quercus DNAs can be observed in Fig. 1C, which shows OPA-1 genetic schemes for some representative specimens.To validate the experimental protocol, PCR controls were always carried out (Fig. 1D).
The selected RAPD primer produced 11 bands with sizes going from approximately 400 to 1100 bp.The fragments out of this range were not considered during the analysis.The binary profiles reflecting the RAPD results for all oak samples were reported in Supplementary Information 2. The molecular pattern showing the highest number of amplicons was relative to Q. buckleyi (i.e., 11/11), while Q. grisea and Q. pubescens were the species characterized by the simplest profiles (i.e., 3/11).The most represented amplicons were those at 600 and 750 bp, which appeared in 74 and 73 species out of 87, respectively.By contrast, the band at 550 bp, detectable only in 31 taxa, was the less conserved among all.
RAPD profiles revealed that 92.8% of Lobatae species were characterized by the band at 400 bp, while 96.5% by the amplicon of 600 bp.On the other hand, the signal at 950 bp was absent in 78.5% of them.The bands at 600 bp and 900 bp typified 84.2% of the samples belonging to Quercus section.Surprisingly, the same percentage of specimens presented the absence of the fragment at 700 bp.About Cerris section, 88.2% of the oaks exhibited the longest amplicon (i.e., 1100 bp), while 94.1% the band at 750 bp.The amplicon at 580 bp was the less frequent for the same species, being detectable only in 11.8% of the molecular patterns.The reduction of the frequency (64.7%) of the band at 600 bp for the species included in the Cerris section, as well as other differences observed among distinct clusters, could be linked to migration and diversification events, considering the excellent adaptative capacity of oaks to several environments and climate conditions and the continuous gene flow existing within this genus (Cavender-Bares 2019).
The RAPD method employed in this study and based on the application of OPA-1 primer could be considered highly discriminant for oak species, according to the efficiency parameters calculated and reported in Table 1.In particular, OPA-1 was able to amplify a good number of bands, compared to other primers tested in RAPD analyses for Quercus species (Ardi et al. 2012), which were all polymorphic, allowing to reveal a great genetic variability in our samples.Indeed, the mean of the band frequencies (i.e., 0.576) suggested a general heterogeneity in the distribution of the amplicons among the species.In addition, the PIC value, often used as a measure for the informativeness of a molecular marker, was very high (0.488), taking into consideration that dominant markers present a maximum PIC value of 0.5 (Serrote et al. 2020).Marker index and resolving power values, which show the capacity of the selected primer to detect genetic variability across the investigated plant material (Ghosh et al. 2014), revealed that OPA-1 had a discrete discriminatory potential, even with respect to other RAPD primers investigated in different species (Tonk et al. 2014).
To reduce the complexity of the obtained data, a network (Fig. 1E) was built by applying RM-MJ technique and giving equal scores to each character (locus/band).The outcome put in prominence the existence of 79 different genetic profiles out of 87 analysed species, confirming the high discriminating power of the selected RAPD method.Indeed, 90.1% of the samples appeared in the graph as single nodes (named using the lab-codes reported in Supplementary Information 2).In five cases (i.e., A1, A2, A11, A16, and A49), node size was bigger than the basal form, as proportional to the number of samples which included, representing taxa with identical molecular patterns.The largest node (tagged A16) contained 4 species (Q.turbinella A16, Q. alentejana A64, Q. petraea ssp.iberica A65, and Q. hartwissiana A67) which possess a very diversified area of origin but whose genetic proximity can be justified by the fact that they are all white oaks (Quercus section).The second node for dimension (tagged A2) consisted in the RAPD profile shared by 3 taxa of red oaks (Q.gravesii A2, Q. imbricaria A26, and Q. shumardi A42) belonging to Lobatae section and native of central and northern America.Finally, three nodes were made up of couples of species: node tagged A1 (Q.seemannii A1 and Q. canbyi A6); node tagged A11 (Q.agrifolia A11 and Q. rysophylla A17); node tagged A49 (Q.glauca A49 and Q. coccifera ssp.rivas-martinezii A70).The similarity showed by Q. agrifolia and Q. rysophylla might be expected, as they are both native to the same area (i.e., Mexico) and the only evergreen red oaks among those investigated in this study.The overlapping genotypes of Q. seemannii and Q. canbyi could suggest a common ancestor, also taking into account their membership to Lobatae section and their origin from Mexico.Surprisingly, Q. glauca and Q. coccifera ssp.rivas-martinezii exhibited the same RAPD fingerprint, although belonging to different sections, namely Cyclobalanopsis and Cerris, respectively.
The positioning of the species in the network did not allow to group the taxa according to their sections (e.g., Quercus, Cerris, Lobatae); however, the distribution pattern reflected the geographical localisation and/or the ecological role of the specimens, as suggested by other literature works.Indeed, such type of clustering has been already observed and explored by Kremer and Hipp (2020), looking for the ancestor of American white and red oaks.In the same context, the molecular distance registered between Q. fabrei (A83) and Q. montana (A45) is noteworthy.In fact, even if these two species belong to Lobatae section, they are native of different continents, in that order, Asia and America.Thus, events of evolutionary convergence, maybe dictated by specific environmental pressures, might have led the selection of peculiar genetic schemes closely correlated.
Other two nodes, that is A31 (Q.marilandica) and A69 (Q.virgiliana), distinguish themselves from the central core, spreading together toward an alternative direction.It is difficult to clearly understand the molecular and/or ecological mechanisms underlying this phenomenon, considering that A31 is a species of the Lobatae section native to central and eastern USA while A69 refers to Quercus section and originates from western Mediterranean.
RAPD data were also used to construct a similarity matrix (Supplementary Information 3), based on the coefficients of Nei and Li (1979), for assessing the genetic relationship existing among the 87 Quercus species considered in this work.The similarity indices ranged from a maximum of 1, when two RAPD profiles were identical, to 0, in case of couples of species with no band in common.Zero values were found for the following pairs: Q. trojana ssp.euboica (A68) and Q. pubescens (A75); Q. pontica (A84) and Q. sinuata var.breviloba (A12); Q. pontica (A84) and Q. rotundifolia (A54); Q. trojana ssp.euboica (A68) and Q. pontica (A84).These results cannot be explained as a consequence of membership to different sections, confirming the evidence inferred by the network analysis about the incapacity of the present RAPD approach (i.e., OPA-1 primer) to discriminate oak sub-genera and relative clusters.
Starting from the previous similarity matrix and using the unweighted pair-group method with arithmetic averages (UPGMA), an unrooted dendrogram (Fig. 2) showing the calculated genetic distances among the species was produced.Logically, the samples with identical genetic profiles clustered together (e.g., A1, and A6; A2, A26, and A42; etc.), as already observed and described in detail for the network (i.e., bigger nodes).
In the phylogenetic tree, Q. pontica (A84) appeared as the most genetically distinct species among all.This outcome could be expected, considering that Q. pontica has a very restricted geographical distribution, which confines itself on the Caucasus mountains of the western Asia (Denk et al. 2017).Indeed, this taxon has been considered as a Tertiary relict species markedly separated from the other white oaks, according to an analysis carried out on the nuclear ribosomal Internal Transcribed Spacer (ITS) barcode gene (Denk and Grimm 2010).
In the dendrogram, the major part of Lobatae characterized the upper region of the graph, while species from the Quercus section were mainly located in central branches.However, in generally, oaks did not group distinctly based on their classification in sections, either in case of subspecies (i.e., Q. petraea, Q. robur, and Q. coccifera), but most probably because of a mixture of other factors, first of all the sharing of similar habitats.Paying attention to this aspect, it could be noticed that phylogeographic schemes were not fully respected; however, oaks are renowned for posing issues about the definition of species boundaries, which lead to taxonomic problems in classifying the specimens (Kashani and Dodd 2002).To justify this complex outcome, one should keep in mind that a lot of studies have described that the gene flow acts as a strong homogenizing agent able to reduce the genetic distance among Quercus species.Moreover, when species are not hybrids (like those investigated in this study), other elements, like shared ancestral polymorphisms and historical introgression events, could make the phylogeny more complicated and unclear (Tovar-Sánchez and Oyama 2004;McVay et al. 2017).
Particular attention should be paid to Q. crenata, a species strongly threatened with extinction.Indeed, nowadays, only about a thousand of exemplars have been recorded in Italy (where it is protected) and a few in France.Recent genetic studies have proposed this species as the remainder of the ancestral form from which Q. suber would have evolved (Simeone et al. 2018), rejecting the theory of a stabilised hybridization between Q. suber and Q. cerris (Bellarosa et al. 2005).RAPD data reported in this research would be more in line with the first hypothesis, due to the higher proximity of Q. crenata (A77) to Q. suber accession, although this ambiguous species appeared quite isolated in the dendrogram from both its potential relatives.
Q. alentejana is another important taxon still under investigation.Some scholars consider it as a new independent species, others like a complex hybrid between Q. pyrenaica and Q. marianica.The scientific evidence here described did not reveal a high molecular similarity between Q. alentejana and Q. pyrenaica; however, no conclusion can be drawn because, unfortunately, Q. marianica was not among the studied samples.
The last peculiar aspect to be discussed in this work is relative to Q. robusta, whose presence is mainly limited to American stands of small size.The reduced distribution makes it as a species to be protected.According to Govaerts and Frodin (1998), Q. robusta would be the result of the hybridization between Q. emoryi and Q. gravesii.This assumption seems to be not corroborated by the phylogenetic reconstruction reported in the current study, which showed a considerable genetic distance between Q. robusta and Q. gravesii.Thus, it is possible that Q. robusta differentiated deeply from its ancestors or that the proposed hybridization model is not correct.In this regard, Muller (1952) had already confirmed this second hypothesis, explaining that Q. emoryi and Q. gravesii would not be able to fertilize each other, despite they share the same habitats and/or spatial regions.
In conclusion, we assessed the molecular diversity existing among 87 different Quercus species by applying a molecular method based on RAPD markers.With some exceptions, our data were generally coherent with the phylogenetic and phylogeographic relationships proposed by the previous literature.Our results confirmed that the evolutionary pattern of Quercus genus has been extremely complex and continues to change rapidly, making it difficult to be fully resolved.This phenomenon was surely due to several biological mechanisms typically linked to oaks, such as: (i) capacity to migrate rapidly, promoting ecological consequences on lineage differentiation; (ii) high Vol:.( 1234567890) rates of evolutionary divergence; (iii) hybridization propensity, which contributes to adaptive introgression.
Some oak species included in this study are inserted in the IUCN Red List of Threatened Species, as vulnerable or endangered.Thus, the molecular characterization performed in this study, on a so large series of species, will surely represent a positive contribution for profiling the diagnostic genetic diversity of the Quercus species and for favouring preservation strategies of their genetics, with the final aim to promote ex-situ conservation programmes for oak germplasm.
54 °C) were evaluated.Amplifications were performed using a Bio-Rad IQ5 thermocycler whose conditions were set as follows: (i) 4 min of denaturation at 95 °C; (ii) 60 cycles, each one consisting in 20 s of further denaturation at 95 °C, 20 s of annealing at 52/54 °C, and 45 s of extension at 72 °C; (iii) final extension step at 72 °C for 10 min.

Table 1
Efficiency parameters of the applied methodThe sequence of the primer used for producing RAPD profiles and the relative efficiency parameters were reported: total number of bands (TNB); number of polymorphic bands (NPB); polymorphism in percentage (% Polym); mean of the band frequencies (MBF); polymorphic information content value (PIC); marker index (MI); resolving power (RP)