Findings

Background

Microsatellite markers, or simple sequence repeats, are widely applicable as DNA-based markers for population genetics studies. Moreover, their cost-effective development has been increasingly facilitated by applying next-generation sequencing (NGS) technologies [20].

Distylium lepidotum Nakai (Hamamelidaceae) is a small tree endemic to the oceanic Ogasawara Islands in the northwestern Pacific Ocean. The species is the dominant tree in the Distylium–Pouteria dry scrub [18], which is inhabited by Boninoxya anijimensis Ishikawa, a locust recorded as a new genus and species [8]. The locust is monophagous, feeding solely on D. lepidotum [8, 9]. Although the locust is distributed only on Anijima Island of the Ogasawara Islands, it has been exposed there to alien predatory species such as Anolis carolinensis. Conservation measures, including benign introduction of B. anijimensis to islands of the Ogasawara group other than Anijima Island, are needed to protect the B. anijimensis populations. As D. lepidotum is the essential food source of the locust, it may be possible to transplant the tree to support such introductions. Therefore, it is important to reveal the genetic structure of D. lepidotum so that any genetic disturbance caused by transplanting can be minimized. Here, we developed microsatellite markers to investigate the genetic diversity and structure of D. lepidotum.

Methods

Microsatellite markers were developed for D. lepidotum using an Illumina MiSeq Desktop Sequencer (Illumina, San Diego, CA, USA). Total genomic DNA was extracted using a DNeasy Plant Mini Kit (QIAGEN, Hilden, Germany) from one silica-gel-dried D. lepidotum leaf sample collected from Chibusayama (26°39′17.4″N 142°10′03.6″E) on Hahajima Island of the Ogasawara Islands. A shotgun library was prepared using the Nextera DNA Sample Preparation Kit v2 (Illumina), and the raw de novo sequencing data were obtained using the MiSeq Reagent Kit v2 (500 cycles) (Illumina). The raw reads were sorted by index, extra sequences (adapters and indices) were trimmed, and FASTQ files were generated using MiSeq Reporter v.2.5.1 (Illumina). The paired-end reads were merged using PEAR 0.9.6 [21] with default parameter settings. After the paired-end assembly, low-quality reads (those in which fewer than 95 % of bases had a Phred quality score of at least 30) were removed using the script fastq_quality_filter included in the FASTX-Toolkit v.0.0.14 [7]. The resulting FASTQ files were converted to FASTA format using the ShortRead package [12]. A total of 1,734,031 contigs with an average length of 241 bp were obtained.
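The read-level quality criterion described above can be illustrated with a minimal Python sketch. It is not the FASTX-Toolkit code; the function name `passes_quality_filter` is hypothetical, and the sketch assumes Phred+33 quality encoding and that a read is kept when at least 95 % of its bases have a quality score of 30 or higher.

```python
def passes_quality_filter(quality, min_q=30, min_pct=95.0, offset=33):
    """Return True if enough bases of one read reach the quality threshold.

    quality : FASTQ quality string of a single read
    offset  : ASCII offset of the encoding (assumed here to be Phred+33)
    """
    if not quality:
        return False
    # Count bases whose decoded Phred score is at least min_q.
    good = sum(1 for ch in quality if ord(ch) - offset >= min_q)
    return 100.0 * good / len(quality) >= min_pct


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
print(passes_quality_filter("I" * 19 + "#"))   # 95 % of bases at Q >= 30
print(passes_quality_filter("I" * 18 + "##"))  # 90 % of bases at Q >= 30
```

Under these assumptions the first read is retained and the second is discarded.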

The microsatellites were identified and the primer pairs were designed with QDD2.1 [11]. A total of 41,367 unique sequences containing pure/compound microsatellite regions (2–6 nucleotide motifs with >5 repeats) and primer-designable flanking regions were selected. The primer pairs were designed with Primer3 [17], as implemented in QDD2.1, using the following criteria: (1) polymerase chain reaction (PCR) product size of 90–500 bp and (2) primer lengths of 20–27 bp, melting temperature of 57–63 °C, and GC content of 20–80 %. In total, 18,239 microsatellite primer pairs were designed.
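The motif criterion stated above (2–6 nucleotide motifs with more than five repeats) can be sketched with a regular expression. This is only an illustration of the criterion for pure repeats, not the algorithm used by QDD2.1; the names `SSR_PATTERN` and `find_microsatellites` are hypothetical, and compound microsatellites and flanking-region checks are not handled.

```python
import re

# A 2-6 nt motif followed by at least five further copies (six or more in total).
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{5,}")


def find_microsatellites(seq):
    """Return (start, motif, repeat_count) for each pure repeat found in seq."""
    hits = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs that would otherwise match as e.g. "AA" repeats.
        if len(set(motif)) == 1:
            continue
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


example = "TTGCA" + "AG" * 12 + "CCGTA"
print(find_microsatellites(example))
```

For the example sequence, the sketch reports a single (AG)12 repeat starting at position 5 (0-based).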

Amplification and polymorphism were tested for 48 primer pairs, which were selected on the basis of microsatellite type (a single dinucleotide motif with more than ten repeats), design type (“A” or “B” in QDD2.1), and PCR product size suitable for multiplex amplification (Table 1). Four universal primers with different fluorescent tags designed by Blacket et al. [1] were prepared, and the corresponding universal sequence was attached as a tail to the 5′ end of each forward primer. In addition, a PIG-tail (5′-GTTT-3′, 5′-GTT-3′, 5′-GT-3′, or 5′-G-3′) was added to each reverse primer so that its 5′ end read 5′-GTTT-3′, to reduce stuttering due to inconsistent addition of adenine by Taq DNA polymerase [2].
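The choice among the four PIG-tail variants can be sketched as follows. The rule shown here is an assumption inferred from the description above: the shortest prefix of GTTT is added that makes the 5′ end of the reverse primer read 5′-GTTT-3′. The function name `add_pig_tail` and the example sequences are hypothetical and are not primers from this study.

```python
PIG = "GTTT"


def add_pig_tail(primer):
    """Prepend the shortest part of GTTT so the primer starts with GTTT."""
    primer = primer.upper()
    for k in range(len(PIG) + 1):
        # If the primer already begins with the last (4 - k) bases of GTTT,
        # only the first k bases need to be added.
        if primer.startswith(PIG[k:]):
            return PIG[:k] + primer
    raise AssertionError("unreachable: k = 4 always matches")


print(add_pig_tail("TTTACGGA"))  # only G is added
print(add_pig_tail("ACGGATCC"))  # the full GTTT is added
```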

Table 1 Characteristics of the 32 microsatellite markers developed for Distylium lepidotum

PCR amplification was performed using the QIAGEN Multiplex PCR Kit. Multiplex PCRs were performed for each of the four primer pair sets using the following thermal cycle conditions: initial denaturation for 15 min at 95 °C; 35 cycles of denaturation for 30 s at 95 °C, annealing for 1.5 min at 57 °C, and extension for 1 min at 72 °C; and a final extension for 30 min at 60 °C. The PCR products were separated by capillary electrophoresis on an ABI3130 Genetic Analyzer (Life Technologies, Waltham, MA, USA) with the GeneScan 600 LIZ Size Standard (Life Technologies). The fragments were sized using GeneMapper 4.0 (Life Technologies).

Finally, we tested two populations from Chichijima and Hahajima Islands in the central part of the Ogasawara Islands to evaluate the allelic polymorphisms: 24 individuals from Asahiyama (27°05′40.7″N 142°12′35.6″E) on Chichijima Island and 20 individuals from Omotohama (26°37′28.9″N 142°10′41.7″E) on Hahajima Island. Voucher specimens of the representative individuals were deposited in the Makino Herbarium (MAK) of the Tokyo Metropolitan University, Japan (Asahiyama: no. MAK436933; Omotohama: no. MAK436934). The number of alleles per locus (N A), observed heterozygosity (H O), expected heterozygosity (H E), and fixation index (F IS) were calculated to characterize each locus using GenAlEx 6.501 [13]. Deviation from Hardy–Weinberg equilibrium (HWE) at each locus in each population and linkage disequilibrium (LD) between each locus pair in each population were tested with Genepop 4.0 [16]. In addition, the null allele frequencies (F Null) were estimated with CERVUS 3.07 [10]. To examine genetic differentiation between the two populations, Weir and Cockerham’s [19] estimate of pairwise F ST was calculated using FSTAT 2.9.3.2 [6]. The deviation of the pairwise F ST from zero was tested based on 1000 randomizations. Genetic structure was also evaluated by a Bayesian clustering method implemented in STRUCTURE 2.3.4 [4, 5, 15]. Each Markov chain Monte Carlo run consisted of 100,000 burn-in steps followed by 100,000 iterations. Ten replicate runs were performed at each K value from one to five under an admixture model with correlated allele frequencies. The log-likelihood of each run and the rate of change in the log-likelihoods between adjacent K values, ΔK [3], were calculated and compared across the range of K values to determine the best fit for the data.
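The per-locus statistics named above follow standard textbook definitions, which can be sketched in Python for a single locus in a single population. This is an illustration only, not GenAlEx output: it assumes H E = 1 − Σp², with p the allele frequencies, and F IS = 1 − H O/H E, without small-sample correction. The function name `locus_stats` and the genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Compute N_A, H_O, H_E and F_IS for one locus.

    genotypes : list of (allele1, allele2) tuples for diploid individuals,
                with missing genotypes already removed.
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    # Observed heterozygosity: proportion of heterozygous individuals.
    h_o = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under HWE (no sample-size correction).
    h_e = 1.0 - sum((c / total) ** 2 for c in counts.values())
    f_is = 1.0 - h_o / h_e if h_e > 0 else float("nan")
    return {"N_A": len(counts), "H_O": h_o, "H_E": h_e, "F_IS": f_is}


# Hypothetical fragment sizes (bp) for four individuals.
print(locus_stats([(100, 100), (100, 102), (102, 102), (100, 102)]))
```

A positive F IS indicates a heterozygote deficit, which is the pattern expected under either null alleles or inbreeding.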

Results and discussion

Of the 48 tested microsatellite markers, 32 primer pairs were polymorphic among the 44 individuals (Table 1). N A ranged from three to 22 alleles in the Chichijima population and from one to nine alleles in the Hahajima population (Table 2). H E ranged from 0.156 to 0.940 in the Chichijima population and from 0.368 to 0.845 in the Hahajima population (Table 2). Locus Isu07063 was monomorphic in the Hahajima population; only one allele was found in six samples, and the remaining 14 samples were not successfully amplified, suggesting the existence of null alleles. In addition, F Null at this locus was high (Table 2). The Isu00524 locus deviated significantly from HWE in both populations. Significant deviations from HWE in either the Chichijima or the Hahajima population were also detected at several other loci (Table 2; Isu04069, Isu07049, Isu10193, Isu12265, Isu15054, and Isu16805). These loci possibly involved null alleles, because null alleles are a common cause of apparent deviations from HWE [14]. Indeed, F Null values were high at most of these loci (Table 2). However, these HWE deviations may instead have been caused by inbreeding, which can often occur in small populations. In either case, these loci should be used cautiously in further analyses. No significant LD was observed between the markers in either population.

Table 2 Genetic diversity of the 32 microsatellite markers in the two Distylium lepidotum populations

Of the 397 alleles detected in total, 193 were found only in the Chichijima population and 53 were found only in the Hahajima population, leaving 151 alleles shared by both populations. In addition, the two populations were significantly differentiated (F ST = 0.0971). In the Bayesian clustering analysis, ΔK was highest at K = 2 (ΔK = 121.4; Appendix). The Chichijima population was almost entirely composed of cluster I (dark gray), whereas the Hahajima population generally comprised cluster II (light gray) (Fig. 1). However, because admixture was observed in some individuals of the Hahajima population, infrequent gene flow between the islands might occur. These data indicate that the markers can be used to analyze population genetic structure in future studies.
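The ΔK statistic reported above can be computed from the log-likelihoods of replicate STRUCTURE runs. The following Python sketch shows one common way of doing so, assuming ΔK(K) = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], where L is the mean log-likelihood across replicates and s is its standard deviation. The function name `delta_k` and the log-likelihood values are hypothetical and are not results of this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} from {K: [log-likelihood of each replicate run]}."""
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        # Delta K needs both neighbours, so it is undefined at the
        # smallest and largest K tested.
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical log-likelihoods for three replicates at K = 1 to 3.
runs = {
    1: [-100.0, -100.0, -100.0],
    2: [-80.0, -82.0, -78.0],
    3: [-79.0, -81.0, -77.0],
}
print(delta_k(runs))
```

With K tested from one to five, as in this study, the statistic is therefore defined only for K = 2 to 4.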

Fig. 1 Results of Bayesian clustering with STRUCTURE at K = 2 for the two Distylium lepidotum populations. Vertical columns represent individual plants, and the heights of bars of each color are proportional to the posterior means of estimated admixture proportions. For population localities, see Table 1

Conclusions

These 32 novel microsatellite markers will be valuable for elucidating the genetic diversity and structure of D. lepidotum, because they are sufficiently polymorphic and clearly distinguish the two populations. Such genetic data will be useful for conserving D. lepidotum, which is essential as the food source of the endangered locust species on the Ogasawara Islands.