Introduction

Tiger beetles of the genus Cicindela are large, diurnal, predatory insects that tend to prefer sandy habitats near bodies of water, such as river edges and coastal beaches [1]. Many species along the North American Atlantic coast are declining because adult and larval beach habitat is being destroyed by increased development and recreational use, erosion, and sea level rise. The federally threatened northeastern beach tiger beetle Cicindela dorsalis dorsalis was once described as occurring in great swarms along beaches from Martha’s Vineyard, Massachusetts (MA) to New Jersey (NJ), and was a common inhabitant of coastal beaches from MA south to Virginia (VA), but it is now extirpated from much of its native range (United States Fish and Wildlife Service (USFWS) [2]). The native range of the white beach tiger beetle C. d. media overlaps with that of C. d. dorsalis and extends from NJ south to Florida (FL). Although this subspecies is also declining, it is generally considered more abundant than C. d. dorsalis [3]. The Puritan tiger beetle C. puritana is federally listed as threatened and historically ranged from the Chesapeake Bay to Connecticut (CT), but it is now reduced to a few isolated populations in Maryland (MD) and CT. While other tiger beetles co-occur with C. d. media, C. d. dorsalis, and C. puritana, these three taxa are currently the focus of intense conservation efforts. To support their conservation, we developed a suite of microsatellite loci for population genetic research to facilitate estimation of gene flow, genetic diversity, and the existence of metapopulations.

Main text

Methods

Multiple genomic shotgun DNA libraries of single individuals and pooled conspecifics were prepared from C. d. media, C. d. dorsalis, and C. puritana collected from throughout their native ranges. All samples were collected by the USFWS and provided to the U.S. Geological Survey (USGS) Leetown Science Center as whole beetles preserved in 95% ethanol. DNA was extracted from the head of each individual beetle using the DNeasy Blood and Tissue Kit (Qiagen, Germantown, MD). DNA was quantified using a NanoDrop spectrophotometer (ThermoFisher Scientific, Frederick, MD) and used to construct libraries for Ion Torrent PGM sequencing. Sequence reads were generated from C. puritana (n = 1), C. d. media (n = 1), and C. d. dorsalis (n = 7) across 11 Ion Torrent sequencing chips. An additional library was sequenced on a 454 Junior for n = 1 C. d. dorsalis. All sequencing was performed at the USGS Leetown Science Center, Kearneysville, WV.

All sequence reads were imported into Qiagen CLC Genomics Workbench (ver 6.5.1). Quality and length trimming were performed with the following settings: ambiguous limit = 2, ambiguous trim = yes, quality limit = 0.015, minimum number of nucleotides in reads = 20, discard short reads = yes, remove 5′ or 3′ nucleotides = no. All quality trimmed C. d. media and C. d. dorsalis reads were concatenated into one file, and all quality trimmed C. puritana reads were concatenated into a separate file. We pooled the C. d. media and C. d. dorsalis samples because they are closely related subspecies, and microsatellite loci from one subspecies would have a high chance of cross-amplifying in the other. Each fasta file was screened for di-, tri-, tetra-, penta-, and hexanucleotide microsatellite repeat motifs in the program QDD [4]. Settings for QDD included searching for a minimum of five repeats per motif and a minimum sequence length of 80. The output of QDD included thousands of candidate microsatellite loci and primers designed using the integrated PRIMER 3 code [5]. From the two lists of candidate microsatellite loci, we chose to test primers for 30 loci in C. d. media/C. d. dorsalis and 31 loci in C. puritana. Dinucleotide loci were avoided. Each sequence with a candidate microsatellite was searched with BLAST against the NCBI nt database, and none of the sequences that matched nt showed strong similarity to organisms other than insects. Microsatellite loci were initially screened individually using M13 tailed primers [6].

Polymerase chain reactions were performed in 25 μl volumes, consisting of 10 ng of DNA, 1X PCR Buffer (Promega, Madison, WI), 0.25 μM of labeled forward primer, 0.5 μM of unlabeled reverse primer, 0.1 μM of labeled M13, 2.0 mM MgCl2, 0.2 mM of each dNTP, 0.25 units/μl Bovine Serum Albumin (New England Biolabs, Ipswich, MA), and 0.06 units/μl of Taq polymerase (Promega), using the following cycling conditions: 94 °C for 15 min, 29 cycles of 94 °C for 1 min, 58 °C for 45 s, and 72 °C for 45 s, 5 cycles of 94 °C for 1 min, 52 °C for 45 s, and 72 °C for 45 s, all followed by 72 °C for 10 min. PCR products for each locus were electrophoresed separately on an ABI 3130 Genetic Analyzer (ThermoFisher Scientific) automated DNA sequencer. Alleles were called using GeneMapper (ver. 4) (ThermoFisher Scientific) following the protocols described in King et al. [7].
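
The repeat-motif screen described earlier in this section (di- to hexanucleotide motifs, at least five repeats, reads of at least 80 bases) can be illustrated with a minimal Python sketch. This is not QDD's algorithm: it is a simple regular-expression scan for perfect tandem repeats, and the function names and constants are hypothetical and introduced only for illustration. It also reports dinucleotide repeats, which in this study were excluded later, when loci were selected for testing.

```python
import re

MIN_REPEATS = 5       # minimum repeats per motif (setting stated in the text)
MIN_READ_LENGTH = 80  # minimum sequence length (setting stated in the text)

# A motif of 2-6 bases followed by at least MIN_REPEATS - 1 further copies.
# The lazy quantifier makes the shortest possible motif match first.
_REPEAT = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))


def _is_primitive(motif):
    """True if the motif is not a repeat of a shorter unit (rejects AA, ATAT)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_microsatellites(read):
    """Return (start, motif, repeat_count) for each perfect repeat in a read."""
    read = read.upper()
    if len(read) < MIN_READ_LENGTH:
        return []
    hits = []
    for match in _REPEAT.finditer(read):
        motif = match.group(1)
        if _is_primitive(motif):  # skips mononucleotide runs such as AAAAAAAAAA
            repeats = len(match.group(0)) // len(motif)
            hits.append((match.start(), motif, repeats))
    return hits
```

Calling `find_microsatellites` on each trimmed read gives the zero-based start position, the motif, and the number of perfect repeats. Imperfect or compound repeats, which dedicated tools handle, are outside the scope of this sketch.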

The thirty C. d. media and C. d. dorsalis loci were initially tested on a sample of n = 8 C. d. dorsalis from Martha’s Vineyard, MA collected in 2013, and n = 8 from Cedar Island, MD collected in 2013. The thirty-one C. puritana loci were tested on n = 8 individuals collected from Little Cove Point, MD in 2013. Based on the amplification characteristics and levels of polymorphism within these test populations, 17 loci for C. d. media/C. d. dorsalis and eight loci for C. puritana were chosen for optimization in larger population samples (Tables 1 and 3). A multiplex PCR was designed for the C. d. dorsalis/C. d. media loci using the software Multiplex Manager [8], allowing the 17 loci to be run in four separate multiplex reactions (Table 1). Each multiplex PCR used the following concentrations of reagents in a 15 µl reaction: 1.6X PCR Buffer (Promega, Madison, WI), 0.08 units/µl Taq polymerase (Promega), 0.2 µM of each forward and reverse primer, 0.3 mM dNTPs, and 3.75 mM MgCl2. Multiplexes 1 and 3 used an annealing temperature of 56 °C, whereas multiplexes 2 and 4 used 58 °C. Thermal cycling conditions were as follows: 94 °C for 2 min, 34 cycles of 94 °C for 30 s, 56/58 °C for 30 s, and 72 °C for 90 s, followed by a final extension at 72 °C for 10 min. No multiplexed reactions were developed for the C. puritana microsatellite loci, which were genotyped using M13 tailed primers.

Table 1 Characteristics of 17 microsatellite loci in two collections of Cicindela dorsalis dorsalis, and one collection of C. dorsalis media

Data analyses

Final testing of the C. d. media/C. d. dorsalis microsatellite panel used population samples of n = 24 C. d. media from Fisherman’s Island, Virginia (FI; 37.086 N, −75.947 W), n = 20 C. d. dorsalis from Cedar Island, Maryland (CI; 37.937 N, −75.892 W), and n = 20 C. d. dorsalis from Martha’s Vineyard, MA (MV; 41.3498 N, −70.464 W). For C. puritana, population samples of n = 20 from Connecticut River, CT (location withheld) and n = 20 from Little Cove Point, MD (38.38635 N, −76.385 W) were genotyped. All genotype data were analyzed in MICRO-CHECKER (ver 2.2.3) to assess the occurrence of null alleles, large allele dropout, and scoring errors [9]. Exact tests in GENEPOP [10] were used to determine whether the distribution of genotypes at each locus conformed to Hardy–Weinberg equilibrium (HWE). Multi-locus tests of conformance to HWE were completed using Fisher’s method in GENEPOP. Linkage disequilibrium (LD) was tested for all pairs of loci using contingency tables in GENEPOP. All HWE and LD tests in GENEPOP used the default Markov chain parameters. Significance levels for HWE and LD tests were adjusted using the sequential Bonferroni correction. To assess genetic diversity, observed and unbiased expected heterozygosity and the effective number of alleles were calculated in GenAlEx ver 6.5 [11, 12]. Finally, to evaluate the extent of genetic differentiation among populations, we calculated pairwise \(F_{\text{ST}}^{{\prime }}\) in GenAlEx.
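
As an illustration of the multiple-testing adjustment mentioned above, the following Python sketch applies a sequential Bonferroni correction, assuming the common step-down (Holm) form of the procedure. The function name, the significance level of 0.05, and the example P-values are assumptions made for illustration; they are not taken from this study.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True if it stays significant.

    P-values are ranked from smallest to largest; the smallest is compared
    with alpha / k, the next with alpha / (k - 1), and so on. Testing stops
    at the first P-value that exceeds its threshold.
    """
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # all remaining (larger) P-values are non-significant
    return significant


# Hypothetical P-values for four loci (illustrative only).
example = [0.001, 0.04, 0.012, 0.30]
flags = sequential_bonferroni(example)
```

With these hypothetical inputs, the thresholds are 0.0125, 0.0167, 0.025, and 0.05 in rank order, so the first and third tests remain significant and the other two do not.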

Results and discussion

Raw sequencing reads from all specimens are deposited in the NCBI short read archive as BioSamples under the NCBI BioProject PRJNA563672 for C. d. media and C. d. dorsalis, and BioProject PRJNA563686 for C. puritana. Among the 9,703,887 quality trimmed C. puritana reads processed by QDD, 238,322 contained putative microsatellites. Similarly, among the 5,569,580 quality trimmed C. d. media/C. d. dorsalis reads, 66,576 were identified by QDD as containing putative microsatellites.

Summary statistics of the genotypes collected from 17 multiplexed loci tested in three population samples of C. d. dorsalis and C. d. media are presented in Table 1. There were no missing data. MICRO-CHECKER identified locus Cdo15 as having potential scoring errors in addition to possible null alleles, while a few other loci were flagged as possibly having null alleles. There was no evidence of linkage disequilibrium among locus pairs within or among collections. Several loci were monomorphic in one of the C. d. dorsalis collections, precluding tests of HWE in GENEPOP for these loci. All populations were out of HWE based on Fisher’s method examining P-values across all loci. The most polymorphic locus was Cdo13, with seven alleles in C. d. media, and the number of alleles averaged across loci was higher in C. d. media, at four, than in the C. d. dorsalis collections, at approximately two. The expected heterozygosity averaged across loci was low and similar across the three collections, ranging from 0.20 to 0.29, and the effective number of alleles was small, reflecting the low levels of heterozygosity. Pair-wise estimates of genetic differentiation (\(F_{\text{ST}}^{{\prime }}\)) were high and statistically significant among all collections, ranging from 0.334 to 0.767 (Table 2). This suggests a high level of genetic differentiation and the suitability of these loci for characterizing population structure.
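
The link between low heterozygosity and a small effective number of alleles can be made concrete with a short Python sketch of the per-locus diversity statistics. It assumes the standard frequency-based definitions: expected heterozygosity He = 1 − Σp², unbiased He = He × 2N/(2N − 1), and effective number of alleles Ne = 1/Σp². The function name and the allele sizes in the example are hypothetical and are not data from this study.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Diversity statistics for one locus in one collection.

    genotypes: non-empty list of (allele_a, allele_b) tuples,
    one tuple per diploid individual.
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    # Sum of squared allele frequencies (expected homozygosity).
    sum_p2 = sum((c / (2 * n)) ** 2 for c in counts.values())
    he = 1 - sum_p2
    return {
        "Na": len(counts),                              # number of alleles
        "Ho": sum(a != b for a, b in genotypes) / n,    # observed heterozygosity
        "He": he,                                       # expected heterozygosity
        "uHe": he * 2 * n / (2 * n - 1),                # unbiased expected heterozygosity
        "Ne": 1 / sum_p2,                               # effective number of alleles
    }


# Hypothetical locus: six 150/150 homozygotes and two 150/154 heterozygotes.
stats = locus_diversity([(150, 150)] * 6 + [(150, 154)] * 2)
```

In this hypothetical case the rare allele has a frequency of 0.125, giving He of about 0.22 and Ne of 1.28. A locus with two alleles therefore behaves, in diversity terms, as if it had only slightly more than one equally frequent allele.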

Table 2 Matrix of pair-wise \(F_{\text{ST}}^{{\prime }}\) values (below diagonal) and P-values (above diagonal) between a collection of Cicindela dorsalis media, and two collections of C. d. dorsalis

Complete genotypes were also obtained for the eight loci screened in two population samples of C. puritana (Table 3). Some loci were identified as having null alleles by MICRO-CHECKER, but no loci were flagged as having scoring errors. All loci were polymorphic in at least one population. The Little Cove Point (LCP) collection was out of HWE, while the Connecticut River collection was in HWE based on Fisher’s method examining all loci. As with the C. d. dorsalis and C. d. media loci, some of the C. puritana loci were not sufficiently polymorphic for HWE testing in GENEPOP. There was no evidence of linkage disequilibrium among locus pairs within or among collections. The most polymorphic locus was CpuQ2, with six alleles in the LCP collection. While the average number of alleles was similar across populations, the number of alleles at each locus varied between populations with no consistent pattern. Observed and expected heterozygosity, as well as the effective number of alleles, were similar and low in the two populations. Pair-wise \(F_{\text{ST}}^{{\prime }}\) was large at 0.789 (P < 0.001) between the two C. puritana populations.
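
The multi-locus HWE tests reported above combine single-locus P-values with Fisher’s method. The following Python sketch shows the textbook form of that calculation: the statistic is −2 Σ ln(P), which is compared with a chi-square distribution with 2k degrees of freedom for k loci. It is not GENEPOP’s code, the function name is hypothetical, and the sketch assumes every input P-value is greater than zero.

```python
import math


def fisher_combined(p_values):
    """Combine independent P-values with Fisher's method.

    Returns (chi_square, degrees_of_freedom, combined_p).
    Assumes a non-empty list with every P-value in (0, 1].
    """
    k = len(p_values)
    chi2 = -2.0 * sum(math.log(p) for p in p_values)
    half = chi2 / 2.0
    # For even degrees of freedom (2k), the chi-square upper tail has the
    # closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return chi2, 2 * k, math.exp(-half) * total
```

With a single locus the combined P-value equals the input P-value, which is a convenient check on the implementation. Loci that are monomorphic, and so cannot be tested, would simply be left out of the input list.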

Table 3 Characteristics of eight microsatellite loci in two collections of Cicindela puritana

Overall, the results of the initial application of these loci to a small set of samples herein suggest that they will have utility for assessing population structure and patterns of gene flow in other populations of Cicindela tiger beetles. In addition, the shotgun genomic sequencing approach we employed identified thousands of candidate loci, allowing for the development of additional markers if needed.

Limitations

The number of populations and individuals examined so far is modest. Therefore, application of these microsatellite markers to additional populations of C. d. media, C. d. dorsalis, and C. puritana will reveal whether the levels of variation seen, such as a relatively small number of alleles per locus and low levels of heterozygosity, are typical among populations within these taxa. For a locus like Cdo15 in C. d. media and C. d. dorsalis, identified by MICRO-CHECKER as having potential scoring errors, genotyping of more populations will help resolve whether this reflects a true scoring error or an artifact of small sample size. Also, some individual loci strongly deviated from HWE, and in most cases this was due to a heterozygote deficiency, which most likely reflects null alleles, though multiple processes such as non-random sampling can contribute to single-locus departures from HWE [13]. Genotyping additional populations with larger samples of individuals will help identify loci with consistent patterns of departure from HWE, the causes of which can be investigated further.