Introduction

Sweat bees (Hymenoptera: Halictidae) are widespread pollinators which exhibit an unusually large range of social behaviours from non-social, where each female nests alone, to eusocial, where a single queen reproduces while the other members of the colony help to rear her offspring [1]. Sweat bees are also unusual in that social and non-social species are often closely related, with multiple evolutionary transitions having occurred between sociality and non-sociality [2]. Sweat bees thus represent excellent models for understanding social evolution [1, 2]. Here we present a new set of microsatellite loci developed from Lasioglossum malachurum (Kirby, 1802), a haplodiploid eusocial species that has been particularly well studied, mainly because it is widely distributed in the Western Palaearctic and because it often occurs in large, dense nesting aggregations that facilitate behavioural research [3,4,5,6]. Microsatellite markers are widely used in social evolution research, for example to investigate population structure, estimate genetic relatedness and assign offspring to parents [7,8,9]. Microsatellite loci have been developed for this species previously [3, 10] but most of them have comparatively low heterozygosities and are difficult to combine into multiplex reactions because of highly specific annealing temperatures and polymerase chain reaction (PCR) mixes. Here, we report 24 new microsatellite markers developed for L. malachurum, 14 of which have been efficiently amplified in two multiplex sets. These markers should substantially aid future studies on sweat bee behaviour and ecology.

Main text

Lasioglossum malachurum females were sampled from a field site at Denton in East Sussex, UK in 2015. Genomic DNA was extracted from head, abdomen and/or legs using an ammonium acetate extraction method [11, 12]. DNA concentration was quantified using a Fluostar Optima fluorimeter and DNA quality was assessed using gel electrophoresis. DNA from one foundress (female M4) from Denton was digested using MboI and the fragments were enriched for dinucleotide and tetranucleotide repeat motifs (following [13]). An Illumina paired-end library was then prepared from this repeat-enriched genomic DNA. The NEBNext Ultra library preparation kit (New England Biolabs Inc. Cat. No. E7370S) protocol was followed and DNA sequencing was conducted using a MiSeq Benchtop Sequencer (Illumina). Primer sets were designed from 53 microsatellite sequences using PRIMER3 v0.4.0 [14]. Sequences were confirmed to be unique using BLAST software [15].

Each 2 µl PCR contained approximately 10 ng of air-dried genomic DNA, 0.2 µM of each primer and 1 µl QIAGEN Multiplex PCR mix (QIAGEN Inc. Cat. No. 20614) following [16]. Because we required loci that could be reliably multiplexed, we designed primers with very similar melting temperatures (± 2 °C) so that all could be amplified at the same annealing temperature (57 °C). The following PCR profile was used: 95 °C for 15 min, followed by 44 cycles of 94 °C for 30 s, 57 °C for 90 s and 72 °C for 90 s, and then a final extension at 60 °C for 30 min. PCR amplification was performed using a DNA Engine Tetrad® Thermal Cycler (MJ Research, Bio-Rad, Hemel Hempstead, Herts, UK). PCR products were genotyped on an ABI 3730 48-well capillary DNA Analyser using the LIZ size standard (Applied Biosystems Inc. Cat. No. 4322682). Alleles were scored using GENEMAPPER v3.7 software (Applied Biosystems Inc.). Of the 53 markers, 24 could be scored reliably across the test sample (23–40 females, all from the same field site at Denton) (Table 1). The remaining 29 were either monomorphic or unreliable under our PCR methodology (Table 2). With more specific optimization, some of these might be usable in future studies. We successfully incorporated 14 of the optimized markers into two multiplex panels (using the above PCR reagents and concentrations) with no dropout or artifacts produced (Table 1).
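For readers programming a thermal cycler, the profile described above can be written down as a small data structure. The following is only an illustrative sketch: the temperatures, durations and cycle number are taken from the text, whereas the step labels (other than the annealing step) and all variable and function names are our own assumptions. The total it computes covers programmed hold times only and ignores ramping between temperatures, so real run times will be somewhat longer.

```python
# Thermal profile as reported in the text.
# Each step is (label, temperature in deg C, duration in seconds).
# Labels other than "annealing" are conventional names assumed here.
INITIAL_STEP = ("initial denaturation", 95, 15 * 60)
CYCLE_STEPS = [
    ("denaturation", 94, 30),
    ("annealing", 57, 90),
    ("extension", 72, 90),
]
N_CYCLES = 44
FINAL_STEP = ("final extension", 60, 30 * 60)


def programmed_hold_seconds():
    """Sum of programmed hold times, excluding ramping between steps."""
    per_cycle = sum(duration for _, _, duration in CYCLE_STEPS)
    return INITIAL_STEP[2] + N_CYCLES * per_cycle + FINAL_STEP[2]


if __name__ == "__main__":
    total = programmed_hold_seconds()
    print(f"Programmed hold time: {total} s ({total / 60:.0f} min)")
```

Under these assumptions the programmed hold time is 11,940 s (199 min), most of which is contributed by the 44 cycles.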

Table 1 Characterisation of 24 new L. malachurum microsatellites
Table 2 Identification of a further 29 markers that were rejected and not considered for multiplex panels

The numbers of alleles and heterozygosities were calculated for each of the 24 loci using CERVUS v3.0.6 [17], with the sample sizes shown in Table 1. Tests for deviation from Hardy–Weinberg equilibrium (HWE) and for linkage disequilibrium (LD) were conducted using GENEPOP web version 4.2 [18]. To correct for multiple testing, the q value was applied to the LD p-values. The q value measures significance in terms of the false discovery rate, in contrast to the conventional Bonferroni correction, which controls the probability of obtaining any false positive across the whole set of tests [19]. Observed heterozygosity ranged from 0.45 to 0.95, with 3–17 alleles per locus (Table 1). Only Lma31 deviated from HWE (p = 0.049). No groups of loci displayed LD, providing no evidence of physical linkage based on the individuals genotyped.
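The summary statistics above were produced with CERVUS and GENEPOP; the sketch below is not those programs' code but a minimal illustration of the quantities involved. It assumes diploid female genotypes stored as pairs of allele sizes, uses a common unbiased estimator of expected heterozygosity (which may differ in detail from what CERVUS reports), and substitutes the Benjamini–Hochberg adjustment for the q value of [19]. The Benjamini–Hochberg adjusted p-value corresponds to a q value under the conservative assumption that every null hypothesis is true, so it should be read as a stand-in rather than the method used here. All function names and the example genotypes and p-values are hypothetical.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of genotyped individuals carrying two different alleles."""
    n_het = sum(1 for a, b in genotypes if a != b)
    return n_het / len(genotypes)


def expected_heterozygosity(genotypes):
    """Unbiased gene diversity at one locus from diploid genotypes."""
    alleles = [allele for genotype in genotypes for allele in genotype]
    n = len(alleles)  # number of gene copies (2N for N diploid females)
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return (n / (n - 1)) * (1.0 - sum_sq)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical genotypes (allele sizes in bp) for four females.
    locus = [(150, 154), (150, 150), (154, 158), (150, 158)]
    print(observed_heterozygosity(locus))
    print(expected_heterozygosity(locus))
    # Hypothetical p-values from four pairwise LD tests.
    print(bh_adjust([0.01, 0.04, 0.03, 0.20]))
```

With 24 loci there are 276 pairwise LD tests, so some form of multiple-testing correction is needed; a false discovery rate approach is generally less conservative than Bonferroni when many tests are performed.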

These loci are likely to be useful for investigating the ecology and behaviour of L. malachurum and also potentially that of other sweat bees. Indeed, we have successfully amplified 22 of the 24 loci in L. calceatum (Scopoli) individuals sampled in the UK; only Lma20 and Lma21 failed to amplify and 17 of the 22 loci that did amplify were polymorphic (Table 1; Davison & Field, in prep.).

Limitations

Due to the relatively short read length of the MiSeq Benchtop Sequencer, we were unable to design primer sets that amplify products longer than 300 bases. This may, however, be somewhat fortuitous: the incorporation of larger markers into multiplex panels often proves problematic, since they are generally harder to amplify than markers with smaller products and are more susceptible to dropout [20].