Development of markers using microsatellite loci of two rove beetle species, Paederus fuscipes Curtis and Aleochara (Aleochara) curtula Goeze (Coleoptera: Staphylinidae), followed by analyses of genetic diversity and population structure

Background The family Staphylinidae is the most speciose beetle group in the world. The outbreaks of two staphylinid species, Paederus fuscipes and Aleochara (Aleochara) curtula, were recently reported in South Korea. None of research about molecular markers and genetic diversity have been conducted in these two species. Objective To develop microsatellite markers and analyze the genetic diversity and population structures of two rove beetle species. Methods NGS was used to sequence whole genomes of two species, Paederus fuscipes and Aleochara (Aleochara) curtula. Microsatellite loci were selected with flanking primer sequences. Specimens of P. fuscipes and A. curtula were collected from three localities, respectively. Genetic diversity and population structure were analyzed using the newly developed microsatellite markers. Results The number of alleles ranged 5.727–6.636 (average 6.242) and 2.182–5.364 (average 4.091), expected heterozygosity ranged 0.560–0.582 (average 0.570) and 0.368–0.564 (average 0.498), observed heterozygosity ranged 0.458–0.497 (average 0.472) and 0.418–0.644 (average 0.537) in P. fuscipes and A. curtula, respectively. Population structure indicates that individuals of A. curtula are clustered to groups where they were collected, but those of P. fuscipes are not. Conclusion Population structures of P. fuscipes were shallow. In A. curtula, however, it was apparent that the genetic compositions of the populations are different significantly depending on collection localities. Supplementary Information The online version contains supplementary material available at 10.1007/s13258-022-01293-2.


Introduction
Staphylinidae Latreille, the rove beetles, is one of the largest family represented by approximately 63,600 species (Irmler et al. 2018). Among 32 subfamilies of rove beetles, the subfamily Paederinae Fleming contains 5962 species within 225 genera (Anlaş and Çevik 2008). The genus Paederus Fabricius, a widespread speciose group, comprises approximately 530 described species, and 128 species in eight subgenera are recorded in Palaearctic region (Assing 2020). The subfamily Aleocharinae Fleming is the largest staphylinid subfamily including approximately 16,200 species (Leschen and Newton 2015). The genus Aleochara Gravenhorst is the one of the most speciose genera comprising about 540 species within 19 subgenera (Yamamoto and Maruyama 2016).
The "Paederus dermatitis" damages have been reported continuously (Kim et al. 1989;Lim et al. 1996) and abundance of A. curtula also increased in 2018-2019. Despite their continuous outbreaks, studies of genetic diversity or genetic structure remain extremely limited. Microsatellites are DNA sequences consisting of 1-6 tandemly repeated nucleotides and can be used as informative DNA markers (Kwak et al. 2020). Several research about the genetic diversity of populations using microsatellite markers have been conducted for the use of conservation for endangered species (Kim et al. 2019;Kwak et al. 2020) or monitoring the domesticated organisms including crops and livestock (Biswas et al. 2020;Zhang et al. 2021).
In this study, we developed microsatellite markers of P. fuscipes and A. curtula for the first time. Genetic diversity was analyzed by genotyping polymorphic alleles using the newly developed microsatellite markers. We also visualized population structure. This work can provide insights for identifying characteristics of certain population within the species.

Collecting samples and DNA extraction
The specimens of Paederus fuscipes were collected manually on wet flatland including rice paddies from Taean

DNA sequencing and Microsatellite identification
After extracting the genomic DNA of P. fuscipes and A. curtula, we checked the quality of DNA by 1% agarose gel electrophoresis and PicoGreen ® dsDNA Assay (Invitrogen). DNA library was prepared according to Illumina Truseq DNA PCR-Free Library prep protocol. The quality of the libraries was verified by capillary electrophoresis (Bioanalyzer, Agilent). The library was clustered on the Illumina cBOT station and sequenced paired end for 101 cycles on the Novaseq 6000 sequencer according to the Illumina cluster and sequencing protocols.
Appropriate microsatellite loci from the genome sequence were searched using the QDD program (Meglécz et al. 2010), and flanking primer sets were also detected. Appropriate primer sequences were selected with high effective diversity according to the microsatellite development workflow (Lepais et al. 2020).

Development of microsatellite markers
The forward primers used to amplify the microsatellite loci was labeled with a fluorescent dye, 6FAM. To validate the selected primers, Polymerase chain reaction (PCR) was performed in a total volume of 20 μl containing 0.4 μl genomic DNA, 8 μl each for forward and reverse primers and 10 μl BioFACT™ Lamp Taq PCR Master Mix 2 (BIOFACT, Daejeon, Korea), with an initial activation of 5 min at 95 °C, 30 cycles of 30 s at 95 °C, 30 s at the locus-specific annealing temperature, 30 s at 72 °C and a final extension at 72 °C for 5 min using a Thermal Cycler (VeritiPro, Marsiling, Singapore). Genotyping the obtained alleles was performed using ABI PRISM 3730XL Analyzer and GeneMapper ® Software Version 4.0.

Data analysis
Genetic diversity parameters including the average number of alleles (Ad), expected heterozygosity (H e ), observed heterozygosity (H o ), and polymorphic information content (PIC) were analyzed using PowerMarker V3.25 (Liu and Muse 2005), which was also conducted for reconstructing UPGMA and NJ trees. GenAlEx 6.503 (Peakall and Smouse 2012) was used for AMOVA (analysis of molecular variance), PCoA (principal coordinates analysis), and calculating genetic distances (F st ) among populations.
The population structure was analyzed using STRU CTU RE 2.3.4 (Porras-Hurtado et al. 2013). To predict the optinum K value, the appropriate number of clusters constituting the structure, the program was run in three repetitions for each K from 2 to 10, with a burn-in period of 100,000 and 200,000 Markov chain Monte Carlo (MCMC) iterations. CONVERT (Glaubitz 2004) was used to transfer text files to input data for STRU CTU RE 2.3.4. software. Appropriate K values were determined using the web-based software STRU CTU RE HARVESTER (Earl and vonHoldt 2012).  (Tables S2, S3).

Population structure
Based on STRU CTU RE HARVERSTER, appropriate clusters were determined to be K = 7 or K = 4 in P. fuscipes and K = 2 or K = 3 in A. curtula, respectively (Fig. 1a, d). The structure was not divided separately neither K = 7 nor K = 4 in P. fuscipes (Fig. 1b). However, that was divided into two (K = 2) or three (K = 3) groups in A. curtula (Fig. 1e). PCoA also shows scattered plots for P. fuscipes but clustered plots for A. curtula (Fig. 1c, f). The results of AMOVA suggests relatively high genetic differentiation with 26% of variation among populations of A. curtula, while P. fuscipes shows low population differentiation ( Table 2). The highest genetic distance (F st ) was 0.005 between TA and SC in P. fuscipes, 0.348 between MJ and GJ in A. curtula (Fig. 2).
The H e of SC was relatively higher than the other populations, and GJ was the minimum. The minimum H o was GJ, however, the maximum was JE (Table 1). Considering the H e is calculated theoretically assuming random mating, it cannot be able to exactly match or corresponding the heterozygosity within actual populations and gene diversity (Gregorius 1978).
Genetic diversity among populations of P. fuscipes was almost identical and it showed shallow structure both K = 7 and K = 4 (Fig. 1b). Although the physical distances among three localities, genetic distances were extremely low (Fig. 2). It suggests that gene flows of P. fuscipes individuals were less limited and there were robust genetic communications. Populations of A. curtula showed strong structure. Estimated K value supported two genetic groups (Fig. 1d, e), which coincide with the pie chart of Fig. 2 (right). Especially GJ (Gyeongju), the region where the outbreak occurred, indicated low genetic diversity and extremely high genetic distances against the others.

Conclusion
Geographically isolated, individuals of GJ are restricted to communicate with those of other localities. In addition, the evolution within genus Aleochara took place rapidly. Molecular phylogeny suggested that the diverse species originated in relatively short period (Maus et al. 2001).
Our results provided basic data for a quick monitoring technique dealing with future outbreaks, but also discovering potential subspecies of A. curtula. It is expected that microsatellite markers can be applied to evaluate genetic diversity of more than population level within the family Staphylinidae.  Genetic structures are displayed as pie charts. F st values between TA-YS and SC-YS was calculated to be 0 because of a few individual samples 1 3 otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.