Rapid development of 56 novel microsatellite markers for the benthic freshwater bug Aphelocheirus aestivalis using Illumina paired-end sequencing data and M13-tailed primers

The freshwater true bug Aphelocheirus aestivalis (Aphelocheiridae) is widely distributed in Europe but occurs rather locally and often in isolated populations. Moreover, it is threatened with extinction in parts of its range. Unfortunately, little is known about the genetic diversity and population structure due to the lack of molecular tools for this species. Thus, to overcome the limitations, a whole-genome sequencing has been performed to identify polymorphic microsatellite markers for A. aestivalis. The whole-genome sequencing has been performed with the Illumina MiSeq platform. Obtained paired-end reads were processed and overlapped into 2,378,426 sequences, and the subset of 267 sequences containing microsatellite motifs were then used for in silico primer designing. Finally, 56 microsatellite markers were determined and 34 of them were polymorphic. Analyses performed in two samples (collected from Drawa and Gowienica rivers, respectively) showed that the number of alleles per locus ranged from 2 to 21, and the observed and expected heterozygosity varied from 0 to 0.933 and 0.064 to 0.931, respectively. The microsatellite markers developed in the present study provide new suitable tools available for the scientific community to study A. aestivalis population dynamics. The assessment of its genetic diversity and population structure will provide important data, that can be used in population management and conservation efforts, elucidating the broad- and fine-scale population genetic structure of A. aestivalis.


Introduction
The benthic freshwater true bug Aphelocheirus aestivalis (Fabricius 1794) and two endemic congeneric species from the Iberian Peninsula, A. murcius (Nieser and Millán 1989) and A. occidentalis (Nieser and Millán 1989) [1] are the only representatives of the family Aphelocheiridae in Europe. A. aestivalis is distributed throughout Europe [2], where lives in the middle and upper reaches of streams and rivers with β-mesosaprobic waters [3,4]. The individuals of A. aestivalis occurring in Europe are predominantly micropterous [2]. Due to the fact that only a very few macropterous specimens capable of flight have been collected in the past [5], it has been assumed that A. aestivalis disperses mainly through the watercourses [6]. The remarkable downstream drift of nymphs and adults was observed e.g. in the Danube [6] and the Rhein rivers [5].
Although A. aestivalis is relatively tolerant of changes in some physical and chemical characteristics of water, it is mainly threatened by industrial and agricultural pollution and the regulation of watercourses [7]. In some European countries, A. aestivalis is considered an endangered species and is included in local and national Red Lists or Red Data Books [2]. Successful conservation efforts require not only an understanding of the biology of A. aestivalis and its environmental requirements (e.g. [2,6,8,9]) but also the knowledge on the distribution of the genetic diversity in its populations. Therefore, information about the genetic diversity and differentiation should become incorporated into ecological research and further conservation strategies. However, only a few molecular data that analyze the distribution of A. aestivalis have been published so far. For instance, fragmentary data have been obtained during a comprehensive study based on DNA barcodes in aquatic Heteroptera [10], the differentiation among related species [3,11], and in approaches leading to the effectiveness evaluation of new molecular tools [12]. More recently, the variability of selected mitochondrial and nuclear markers was evaluated in five populations of A. aestivalis [13]. Analyses revealed low genetic diversity at both levels. Interestingly, endosymbiotic bacteria of Wolbachia have been detected in all tested specimens, and its presence could correlate with decreasing mitochondrial diversity of A. aestivalis. However, the low genetic variation at both mitochondrial and nuclear level that has been observed in tested samples can also be a result of populations' close relationships, past demographic phenomena, or a specific feature of A. aestivalis.
In the past, microsatellite markers have emerged as one of the most popular and effective molecular tools [14]. Given their codominant and neutral nature, biparental mode of inheritance, and high level of polymorphism, microsatellites have been widely used for determining the genetic divergence among populations, estimation of population structure, and genetic diversity in different insect species (e.g. [15]). Nevertheless, such nuclear markers have not been developed for A. aestivalis so far, and thus, the structure of its populations, as well as the genetic divergence among them, have not been determined.
To overcome the limitations described above, a set of novel microsatellite markers has been developed for A. aestivalis. Moreover, a fast and cost-effective protocol for species-specific microsatellite markers discovery using Illumina MiSeq sequencing, and the amplification of selected loci with M13-tailed forward primers is presented in this paper.

Sample collection and DNA extraction
Specimens of A. aestivalis were collected in 2017 from various rivers located in Central and Western Europe (Table 1). A. aestivalis is not an endangered or protected species in the sampling area and sampling activities were not performed at locations where specific permission is required. The samples were collected using dragnet, scraping 0.2 m 2 of the bottom. The specimens were picked in situ from the collected material. All specimens were separately preserved in 70% ethanol and stored at − 20 °C until further analyses.
Genomic DNA was extracted from specimens using the High Pure PCR Template Preparation Kit (Roche Diagnostics GmbH, Mannheim, Germany) according to the manufacturer's protocol. Genetic material was extracted from the abdominal tissue of all collected individuals.

Microsatellites identification
Total genomic DNA isolated from four individuals collected from different populations (marked as DS, DR, SP, and BR) was pooled and used for library preparation using the TNEBNext® DNA Library Prep Master Mix Set for Illumina® (New England BioLabs Inc.). The sequencing was performed with an Illumina MiSeq sequencer at Genomed (Warsaw, Poland). A 2 × 250 base pair (bp) read mode was used during sequencing. The obtained sequences were trimmed and filtered with the Cutadapt v.1.12 software package [16].
The obtained sequences (2.4 Mega reads) were used for microsatellites detection with QDD v.3.1.2 [17]. Primers were initially designed in the QDD pipeline, but its further analyses (including determining the annealing temperature and the ability to form homo-and heterodimers) were performed with the FastPCR [18] and OligoAnalyzer [19] software.

Microsatellites amplification and analyses
All forward primers were tagged with M13(-21) (5′-TGT AAA ACG ACG GCC AGT -3′) tails at the 5′ end [20]. PCR reactions were performed in 20 µL volume containing 0.8x JumpStart Taq ReadyMix (1 U JumpStart Taq DNA polymerase, 4 mM Tris-HCl, 20 mM KCl, 0.6 mM MgCl 2 , 0.08 mM dNTP; Sigma-Aldrich, Germany), 0.4 µM of forward and reverse primers, 0.3 µM of fluorescently labeled (6-FAM, HEX or TAMRA) M13 primer and ~100 ng of      After the suitable annealing temperatures were determined, the screening for polymorphic and consistently amplifiable loci was performed. In the first step, DNA extracted from 14 A. aestivalis individuals collected from seven sampling sites from Central and Western Europe (Poland: specimens from Brda (BR), Słopica (DS), Drawa (DR), Stary Potok (SP) rivers, and two sampling sites from Krąpiel river (KK and KI, respectively), and Portugal: specimens from Corgo river (CR); two individuals per sample;) was used as templates for amplification. Then, microsatellite loci with amplification success were chosen for routine genotyping. Amplification products were run on an ABI Prism 310 DNA Sequencer (Applied Biosystems, Carlsbad, CA, USA) and analyzed with PeakScanner v.2.0 (Applied Biosystems) using GS-500 (ROX) as a size standard. Routine genotyping was performed using template DNA extracted from specimens collected from Drawa (DR; 30 individuals) and Gowienica (GOW; 30 individuals) rivers. Details for developed microsatellite loci are given in Table 2.
Basic summary statistics for each locus (i.e. allele size, number of alleles per locus (Na), number of private alleles, observed (H O ) and expected (H E ) heterozygosities, as well as tests for Hardy-Weinberg equilibrium (HWE) were calculated in GenAlEx v. 6.503 [21] based on results obtained for two samples collected from Drawa and Gowienica rivers, respectively. Tests for linkage disequilibrium were performed using Genepop v. 4.7.5 [22,23]. Micro-Checker v. 2.2.3 software was used to examine large allele dropout, stuttering, and null alleles as potential sources of error [24]. The significant values for all diversity tests of significance were corrected by the sequential Bonferroni's procedure [25].

Results and discussion
The advances in high-throughput sequencing technologies enable the rapid and cost-effective development of microsatellite markers for a vast number of different taxa. Currently, Illumina has becomes the first-choice platform used to accomplish microsatellite isolation. In the present study, the A. aestivalis whole-genome library was sequenced using the Illumina MiSeq protocol. Obtained paired-end reads were processed and overlapped into 2,378,426 sequences. A total of 234,634 sequences contained at least one microsatellite. Further analyses revealed among them 34,583 unique sequences contained microsatellites and for 15,049 of them, primer designing was possible. Those sequences contained three pentanucleotide motifs, 43 tetranucleotide motifs, 1030 trinucleotide motifs, and 13984 dinucleotide motifs. Mononucleotide repeats were excluded from analyses. Selected 267 sequences containing microsatellite motifs were used for in silico primer designing. Among them, two contained pentanucleotide repeats, 43-tetranucleotide repeats, 56trinucleotide repeats, and 166-dinucleotide repeats. For the preliminary screening under laboratory conditions, 75 loci out of these 267 potential microsatellite markers were tested in 14 A. aestivalis individuals collected from different sampling sites. Primers pairs that failed to amplify target loci, as well as primers that amplified regions of unexpected size, or amplified markers showing stuttering patterns during genotyping, were discarded. A final suite of 56 microsatellite loci yielded consistent amplification (Table 2). Sequences containing these markers were deposited in GenBank under accession numbers MT988404-MT988459.
Polymorphism of 56 selected microsatellite loci was then analyzed among specimens collected from two populations inhabiting unconnected rivers (Drawa and Gowienica, respectively). Analyses revealed 22 monomorphic loci and 34 loci which were polymorphic and consistently amplifiable. Further analyses of these informative loci showed that the total number of observed alleles per locus ranged from 2 to 21 in separated samples, and from 2 to 22 in the overall sample (Table 3). Private alleles were noticed in all but four loci (not observed at loci Aaesti12, Aaesti15, Aaesti19, and Aaesti22). Observed heterozygosity ranged from 0.000 (at loci Aaesti5 and Aaesti21 only two different homozygotes were observed per each locus among individuals from the Drawa river and similarly, two different homozygotes were   . Within the overall sample, the same parameters ranged from 0.000 to 0.898 and from 0.065 to 0.935, respectively. Null alleles were identified in 26 microsatellite loci and that in turn may resulted in a heterozygote deficiency (Table 3) [26]. The test of linkage disequilibrium showed a non-random association among alleles at 7 pairs of identified loci (i.e. Aaesti5-Aaesti28, Aaesti8-Aaesti16, Aaesti8-Aaesti22, Aaesti-16-Aaesti69, Aaesti8-Aaesti72, Aaesti19-Aaesti26, and Aaesti59-Aaesti72). This may be a result of physical linkage (loci may be located on the same chromosome, perhaps in close proximity to each other) or other phenomena (e.g. drift, selection, and/or gene flow). Consequently, one of the two associated markers should be excluded from further population genetic analyses in order to avoid the overstating of population structure [27]. The departures out of Hardy-Weinberg equilibrium after Bonferroni correction were found in all but thirteen microsatellite loci. These deviations were mainly due to a deficit of heterozygotes. Low heterozygosity estimated for most identified polymorphic microsatellite markers may be the effect of small size of selected A. aestivalis populations, inbreeding and minimal or null immigration of new individuals into populations. On the other hand, observed heterozygosities calculated for some loci (e.g. Aaesti40, Aaesti55, and Aaesti64) are relatively high, indicating that tested A. aestivalis populations are not devoid of the intra-population genetic variation. In turn, analyses for connected samples revealed the departures out of Hardy-Weinberg equilibrium after Bonferroni correction only in two identified loci.
Microsatellite markers have proven to be crucial in understanding population genetics and ecology of many species, including freshwater insects (e.g. [28]). Within this group, A. aestivalis has a potential as a model organism for the study on the impact of past demographic phenomena, phylogeographic relationships, and infection with endosymbiotic Wolbachia on the genetic diversity of slowly dispersing freshwater insects. In the present study, a set of 56 microsatellite markers have been identified for freshwater A. aestivalis and 34 of them were polymorphic. These markers provide new suitable tools available for the scientific community to study A. aestivalis population dynamics. Finally, the assessment of its genetic diversity and population structure will provide important data, that can be used in population management and conservation efforts, elucidating the broad-and fine-scale population genetic structure of A. aestivalis.