Genomic-based microsatellite development for Ternstroemia (Pentaphylacaceae) and transferability to other Ericales

Background The genus Ternstroemia is associated with the vulnerable tropical montane cloud forest in Mexico and with other relevant vegetation types worldwide. It contains threatened and pharmacologically important species and has taxonomic issues regarding its species limits. This study describes 38 microsatellite markers generated using a genomic-based approach.
Methods and results We tested 23 of these markers in a natural population of Ternstroemia lineata. These markers are highly polymorphic (all loci polymorphic, with 3–14 alleles per locus and expected heterozygosity between 0.202 and 0.908); most of them are in Hardy-Weinberg equilibrium (19 out of 23) and free of null alleles (18 out of 23). We also found no evidence of linkage among them. Finally, we tested the transferability to six other American species of Ternstroemia, two other Pentaphylacaceae species, and four species from different families within the order Ericales.
Conclusions These molecular resources are promising tools for investigating genetic diversity loss and as barcodes for ethnopharmacological applications and species delimitation in the family Pentaphylacaceae and some Ericales, among other applications.


Introduction
The genus Ternstroemia Mutis ex L.f. contains between 110 and 160 species mainly distributed across tropical and subtropical regions worldwide [1,2]. In Mexico, some Ternstroemia species are considered either diagnostic of or associated with tropical montane cloud forests (TMCF) [3], while other species across the world are widespread in both tropical and temperate forests. According to climate-change scenarios, the TMCF will face severe threats regarding physiological adaptation and survival [4]. Among the species of Ternstroemia, some taxa such as T. dentisepala B.M. Barthol. and T. huasteca B.M. Barthol. are considered a priority for conservation since they are endemic and geographically rare [5]. Ternstroemia huasteca is considered "vulnerable" in the IUCN Red List [6]. Moreover, Mexican species such as T. chalicophila, T. dentisepala, and T. impressa belong to the Ternstroemia lineata species complex, a taxonomic group with unresolved relationships, putatively obscured by ongoing interspecific processes [7].
In Mexico, most genetic diversity research on forest species has focused on timber species, such as the Pinaceae and the genus Quercus. In contrast, knowledge of genetic patterns in other tree species is relatively scarce [8]. Nevertheless, some studies on Mexican TMCF taxa such as Abies [9], Chiranthodendron [10], Liquidambar [11], and Podocarpus [12] have identified historical genetic drift and population isolation during the Pleistocene interglacial periods [13].
The biogeographic history of Ternstroemia remains poorly known because there are no comprehensive explanations of its amphi-Pacific distribution, and efforts such as the study of Rose et al. [14] included few Pentaphylacaceae species. Moreover, there are no population-level studies of any species of the genus, nor specific genetic markers, while in recent years new species, frequently endemic and/or threatened, are increasingly being described [15–18]. In Mexico, several species of Ternstroemia are considered an ethnopharmacological resource known as "té de tila". The dry fruits are used to treat anxiety and insomnia and are also known for their analgesic, anti-inflammatory, and anticonvulsant properties [19]. However, most of these effects result from the neurotoxicity induced by terpenoids and may be considered a health risk [20]. This issue is of particular concern because infusions are sold as mixtures with other species (by local sellers and in street markets) or in sachets (finely ground, in supermarkets).
Currently, there are no published assessments of the impact of the exploitation of Ternstroemia fruits on the genetic diversity and demography of the genus. However, similar systems (trees under some degree of exploitation) such as Aquilaria [21], Cedrus [22], and Dipterocarpus [23] have been genetically evaluated using microsatellites or simple sequence repeats (SSRs). This technique offers advantages such as codominance, high polymorphism, individual resolution, technical simplicity, and ease of application to degraded DNA, such as that obtained from herbarium vouchers [24].
Considering the pharmacological importance and ubiquity of Ternstroemia within the Mexican TMCF, it is urgent to develop specific genetic markers for the genus. Therefore, this manuscript describes the development of a set of 38 nuclear microsatellites. Specifically, we test (1) their utility for estimating genetic diversity in Ternstroemia lineata and (2) their transferability to six other key species of Ternstroemia and to other Ericales.

Tissue collection and DNA isolation
We collected single, young leaves in silica gel from 20 individuals of Ternstroemia lineata from a mixed temperate forest in southern Morelia (Central-Western Mexico), between the localities of San Miguel del Monte and Ichaqueo. The individuals were georeferenced and sampled at least 250 m apart (Fig. 1). Leaves from species other than T. lineata were either opportunistically collected and dried in silica gel or retrieved from herbarium vouchers (such as Freziera sp. and the South American T. asymmetrica and T. subserrata). These two South American species may give insights into the markers' transferability success in non-Mexican species. The Ericales sample included some representative families across the order: Ebenaceae (Diospyros xolocotzii), Fouquieriaceae (Fouquieria splendens), Primulaceae (Rapanea sp.), and Symplocaceae (Symplocos citrea); this selection is intended to provide a rough estimate of cross-amplification success across the order Ericales. We ground the leaf tissue of all the studied species in individual 2 mL microtubes in a Retsch Mill (MM400). DNA was isolated using the CTAB protocol [25].

Microsatellite search using high throughput sequencing
To find the repetitive regions, we outsourced the library preparation and sequencing to All-Genetics & Biology SL (La Coruña, Spain). One individual was randomly chosen as the source sample and used to construct the genomic DNA library. The library was prepared using the Nextera XT DNA enrichment kit (Illumina) with the microsatellite motifs AC, AG, AT, ACG, and ATCT, following the manufacturer's instructions.
The library was then sequenced on the Illumina MiSeq platform (PE300), producing 9,921,869 paired-end reads. The quality of the raw sequencing data was checked using FastQC [26]. Reads were then processed in Geneious 10.2.3 and with in-house scripts. Primers were designed with Primer3 [27,28] as implemented in Geneious 10.2.3 (Biomatters, LTD).
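The in-house scripts used to locate repeats are not described here, so the following is only a minimal Python sketch of how perfect, uninterrupted repeats of the enriched motifs could be located in a read. The function name `find_perfect_ssrs` and the `min_repeats` threshold are hypothetical, and the sketch scans the forward strand only.

```python
import re

# Motifs named in the text as enrichment targets.
MOTIFS = ("AC", "AG", "AT", "ACG", "ATCT")


def find_perfect_ssrs(seq, motifs=MOTIFS, min_repeats=6):
    """Return (motif, start, n_repeats) for each perfect, uninterrupted run.

    min_repeats is a hypothetical threshold, not a value taken from the study.
    Only the given strand is scanned; reverse complements are not considered.
    """
    seq = seq.upper()
    hits = []
    for motif in motifs:
        # (?:MOTIF){n,} matches n or more back-to-back copies of the motif.
        pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_repeats))
        for match in pattern.finditer(seq):
            n_repeats = len(match.group()) // len(motif)
            hits.append((motif, match.start(), n_repeats))
    # Report hits in order of position along the read.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    read = "GGG" + "AC" * 8 + "TTT" + "AG" * 3 + "CCC"
    print(find_perfect_ssrs(read))
```

In this toy read, only the AC run passes the threshold; the three AG copies are too short to be reported. Flanking sequence around each hit would then be passed to a primer-design step such as Primer3.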

Validation and cross-amplification
First, we filtered the database to select only perfect and uninterrupted motifs, and then we randomly selected 38 candidate primer pairs. These 38 loci contained only dinucleotide repeats; by far, the most common (35 loci
Genetic parameters such as expected and observed heterozygosity (He, Ho), allelic richness (Na), and deviation from Hardy-Weinberg equilibrium were calculated in GenAlEx 6.503 [29]. Polymorphic information content (PIC) was calculated using PIC_CALC [30]. We tested linkage disequilibrium among all 23 chosen loci using the association index (r_d) of Agapow and Burt [31] implemented in the R package poppr [32]. We checked for the presence of null alleles using the R package PopGenReport [33]. Finally, we interpolated individual heterozygosity using Empirical Bayesian kriging [34] in ArcGIS 10.3 (ESRI, USA) to explore the marker set's potential in fine-scale genetic approaches. This approach automatically calculated the semivariogram from 1000 simulations using a standard circular neighborhood search (10–15 neighboring data points, radius = 0.013).
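As an illustration of the per-locus diversity parameters named above, the following Python sketch computes the number of alleles, Ho, He, and PIC from diploid genotypes at a single locus. It is not the GenAlEx or PIC_CALC code: it assumes the plain estimator He = 1 - Σp² (no small-sample correction) and the PIC formula of Botstein et al. (1980), and the function name `locus_stats` and the allele sizes in the usage example are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summarize one locus from a list of diploid genotypes.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Returns a dict with Na (number of alleles), Ho, He, and PIC.
    """
    n_individuals = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n_individuals
    freqs = [count / total_alleles for count in allele_counts.values()]

    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n_individuals

    # Expected heterozygosity (assumed uncorrected form): 1 - sum of squared frequencies.
    he = 1.0 - sum(p * p for p in freqs)

    # PIC (Botstein et al. 1980): He minus the sum over allele pairs of 2 * p_i^2 * p_j^2.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {"Na": len(allele_counts), "Ho": ho, "He": he, "PIC": he - cross}


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) for four individuals at one locus.
    example = [(150, 150), (150, 152), (152, 154), (150, 154)]
    print(locus_stats(example))
```

Because PIC subtracts a non-negative term from He, it never exceeds He for the same locus, which offers a quick sanity check on tabulated values.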
Since we selected the 23 evaluated markers for their consistent amplification, the minimum amplification success of this subset for Pentaphylacaceae was 91.3%. The interpolated map of genetic diversity (measured as individual heterozygosity) showed a clear declining trend from the south-west (Fig. 2). Therefore, these markers are a suitable set for fine-scale population genetic research in this family. They can also help clarify the taxonomic limits among troublesome groups such as the Ternstroemia lineata species complex (e.g., T. chalicophila Loes., T. dentisepala, and T. impressa Lundell, tested in this study), for which efforts using traditional phylogenetic markers have been insufficient. They are also useful for assessing genetic diversity in threatened species, as is the case of T. huasteca.