1 Introduction

The western honey bee (Apis mellifera) is a highly variable species, with approximately 30 recognized subspecies (Ruttner 1988). Within subspecies, there are also ecotypes and breeding lines, which are important for practitioners who use and preserve genetic resources of bees. Identification of honey bee subspecies is not a trivial task because the differences between them are small and hybrids are possible. The first descriptions of the subspecies were based on morphology. Many different morphological traits were used, including the size of various body parts, wing venation and pigmentation (Ruttner et al. 1978; Ruttner 1988). However, forewing venation alone (Francoy et al. 2008; Tofilski 2008) or even the shape of a single wing cell (Francoy et al. 2006) can provide enough information for identification.

In the last 30 years, molecular methods have become increasingly important in identification of honey bee subspecies (Bouga et al. 2011; Meixner et al. 2013). The methods are mainly based on microsatellites (Bodur et al. 2007; Franck et al. 1998; Jensen et al. 2005), allozymes (Bouga et al. 2005; Ivanova et al. 2012; Savit et al. 2006) and mitochondrial DNA (mtDNA) (Garnery et al. 1993; Ilyasov et al. 2011; Gruber et al. 2013). In recent years, the use of single-nucleotide polymorphisms has also become important (Whitfield et al. 2006). Other molecular methods, for example, those based on pheromones (Hepburn and Radloff 1996), are rarely used.

In the past decade, methods of identification based on genetic data achieved significant progress. Earlier studies were based on diagnostic alleles, i.e. alleles which are unique for a certain evolutionary lineage (Garnery et al. 1998). Newer statistical methods employing multiple unlinked loci do not necessarily require that the different subspecies possess unique diagnostic alleles because they aim at delineating homogenous genetic clusters of individuals (e.g. subspecies or populations) on the basis of their genotypes using a Bayesian method (Pritchard et al. 2000). A standard approach involves sampling of genotypes from a number of potential source populations and using these samples to estimate allele frequencies in each of them by finding population groupings minimizing genetic disequilibrium (Falush et al. 2003) or, instead, calculating expected frequencies on the basis of inbreeding rates (Gao et al. 2007). It is then possible to compute the probability that a given genotype originated from one of assumed clusters. Bayesian clustering methods not only are highly effective in distinguishing subspecies that belong to different evolutionary lineages (Jensen et al. 2005; Soland-Reckeweg et al. 2008; Oleksa et al. 2011) but also work well for comparatively closely related populations (Muñoz et al. 2009).

In general, studies based on molecular methods confirmed the results of earlier ones based on morphological methods. Both subspecies and evolutionary branches described by Ruttner (1988) using morphological methods were later largely confirmed using molecular methods (Cornuet and Garnery 1991; Garnery et al. 1992; Franck et al. 2000; Miguel et al. 2011). In some studies, a mixture of different methodologies was used and it was usually suggested that there is agreement between results obtained using morphological and molecular markers (De la Rúa et al. 2007; Miguel et al. 2011). The agreement was particularly strong when geometric morphometrics methods were used (Miguel et al. 2011). However, in some studies, discrepancies between different methods were observed (Franck et al. 2000; De la Rúa et al. 2001; Radloff et al. 2001; Kandemir et al. 2006).

In this study, we compared the results of classification of two honey bee subspecies based on geometric morphometrics of the forewing and on genetic markers. We used two kinds of molecular markers: nuclear microsatellites (also known as simple sequence repeats, SSRs) and the COI-COII intergenic region of the mtDNA. Because most phenotypic traits are determined by nuclear genes, we expected that morphometrics would correlate more strongly with nuclear markers than with mitochondrial markers.

2 Materials and methods

2.1 Sampling

In this study, we used honey bees from feral colonies from the area between Gdańsk and Olsztyn (18°36′–21°1′E, 53°14′–54°24′N) in Northern Poland (Oleksa et al. 2013a). This region was originally inhabited by the western and northern European dark bee, A. m. mellifera. In recent decades, genetic composition of the local population was changed by extensive importation of Carniolan bees A. m. carnica. Since distances between neighbouring colonies were usually more than several hundred metres, we assume that drifting between the colonies was negligible. Approximately ten bees from each colony were taken for the analysis. In total, 696 bees from 66 colonies were analysed.

Bees were stored in 90 % ethanol in a freezer (−20 °C) until morphological examination and DNA extraction. Insect thoraces were used as a source of DNA for molecular analyses, while wings were taken for measurements. Whole-genomic DNA was extracted with the standard Chelex procedure (Walsh et al. 1991) or Insect Easy DNA kit (EZNA) and then subjected to polymerase chain reaction (PCR). To infer the origin of bees, the following two kinds of genetic markers were employed: the COI-COII region of mtDNA and a set of 17 nuclear microsatellite loci.

2.2 mtDNA analysis

The COI-COII region of mtDNA comprises the sequence between the two units of cytochrome oxidase, including the gene of transfer RNA for leucine (tRNAleu) and the non-coding insert, composed of several repeated units referred to as P and Q (Garnery et al. 1993; Franck et al. 1998). Bees from the evolutionary branch C (which includes A. m. carnica) are characterized by only one sequence Q and the lack of sequence P. In contrast, in bees of the M branch (which includes A. m. mellifera), there is at least one sequence P next to the sequence Q. Differences in the number of repeated units allowed for the development of a simple diagnostic test using PCR amplification of the COI-COII region and digestion of the resulting product with the restriction enzyme DraI (Garnery et al. 1993). The COI-COII region was amplified using primers E2 (5′-GGCAGAATAAGTGCATTG-3′) and H2 (5′-CAATATCATTGATGACC-3′) according to the protocol described by Garnery et al. (1993). The PCR was performed in a total volume of 15 μL (7.5 μL of Qiagen PCR Master Mix, BSA, primers E2 and H2 and deionized water to the total volume). To estimate the total size of the amplified fragment, 5 μL of the product was run on a 1 % agarose gel. The remaining part of the product was digested with the restriction enzyme DraI, and the resulting fragments were separated on 2 % agarose gels. Banding patterns were photographed under UV light and analysed using a computerized gel documentation system (Quantity One ver. 4.6.5, Bio Rad, USA). In order to compare mtDNA markers with other methods, a mitotype index was used. The index was assigned a value of one for carnica mitotypes and zero for mellifera mitotypes.

2.3 Microsatellites analysis

Seventeen microsatellite loci (for primers, see Solignac et al. 2003) were amplified in two multiplex reactions: multiplex 1 A113, A24, A7, A88, Ap28, Ap43, Ap55 and Ap66 and multiplex 2 A025, Ac011, Ap090, Ap103, Ap226, Ap238, Ap243, Ap249 and Ap256. Forward primers for these loci were 5′ labelled with fluorescent dyes. Multiplex PCR was performed using the Multiplex PCR Kit (QIAGEN, Inc.) following the recommended protocol in a final reaction volume of 10 μL (5 μL of 2× QIAGEN Multiplex Master Mix, 4 μL of primer mix and 1 μL of template DNA). The PCR cycling started with an initial incubation at 95 °C for 15 min. It was followed by 9 touchdown cycles of 94 °C for 30 s, 60 °C (−0.5 °C per cycle) for 1 min 30 s and 72 °C for 1 min and 24 cycles of 94 °C for 30 s, 55 °C for 1 min 30 s and 72 °C for 1 min. Finally, tubes were incubated at 72 °C for 10 min. The separation of fragments was carried out on automated sequencer ABI PRISM 3130xl (Applied Biosystems) using the internal size standard (LIZ 600, Applied Biosystems). Resulting electropherograms were scored using GeneScan ver. 3.7 and Genotyper ver. 3.7 software (Applied Biosystems).
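The annealing temperatures implied by the cycling programme above can be written out as a short sketch. It assumes that the first touchdown cycle anneals at 60 °C and that each following touchdown cycle is 0.5 °C lower; the function name `annealing_schedule` and its parameters are hypothetical and are used for illustration only.

```python
def annealing_schedule(start=60.0, step=-0.5, touchdown_cycles=9,
                       final_temp=55.0, final_cycles=24):
    """Return the annealing temperature (deg C) of every PCR cycle.

    Assumption: the first touchdown cycle runs at `start`, and the
    temperature changes by `step` in each subsequent touchdown cycle.
    """
    touchdown = [start + step * i for i in range(touchdown_cycles)]
    return touchdown + [final_temp] * final_cycles


schedule = annealing_schedule()
# Under the assumption above, the touchdown phase runs from 60.0 to 56.0 deg C,
# after which the remaining 24 cycles anneal at a constant 55.0 deg C.
```

Under this reading, the programme comprises 33 amplification cycles in total, excluding the initial 15 min incubation and the final 10 min extension.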

Assignment of individual multi-locus genotypes to the two clusters (subspecies A. m. mellifera and A. m. carnica) was performed with a Bayesian clustering method implemented in the InStruct software (Gao et al. 2007). The method assumes that there are K populations, each of which is characterized by a set of allele frequencies at each locus. Individuals in the sample are probabilistically assigned to one of the populations or jointly to two or more populations if their genotypes indicate they are admixed. For each individual, probabilities of belonging to each ancestral population are computed. The Markov chain Monte Carlo (MCMC) method allows the posterior probability distribution of the estimated parameters to be computed. This method is similar to the widely used STRUCTURE algorithm (Pritchard et al. 2000); however, it does not require the assumptions of Hardy–Weinberg and linkage equilibrium. We did not use the STRUCTURE method because our data were composed of groups of worker genotypes. The presence of kin structure violates the assumptions of the Hardy–Weinberg model. InStruct allows for different assumptions on the ancestry of the population (i.e. ancestral populations could be admixed or not admixed). We used the admixture model, which assumes that each individual (i) has inherited some fraction of its genome from ancestors from both subspecies. A burn-in of 100,000 iterations was followed by 200,000 MCMC iterations. In the analysis, all workers were included. For each worker, the probability of assignment to A. m. carnica was estimated. This probability is hereinafter referred to as the “microsatellites index”. The workers and colonies were classified as A. m. carnica if the microsatellites index was larger than 0.5.

2.4 Wing morphometrics

Two forewings of each worker were dissected, mounted in glass photographic frames and scanned with a Nikon Coolscan 5000 ED scanner equipped with an SF-210 slide feeder (image resolution 2400 dpi). For every wing image, the coordinates of 19 vein junctions were determined automatically using DrawWing software (Tofilski 2004). The vein junctions were used as landmarks for geometric morphometrics (Tofilski 2008; Gerula et al. 2009). The position and numbering of the landmarks was the same as in Gerula et al. (2009). Discrimination between subspecies was based on reference samples which were obtained from queen breeders (Gerula et al. 2009). The reference samples were not verified using molecular markers; therefore, they can contain hybrids. For the discrimination between subspecies, the first canonical variate was used (hereinafter referred to as “morphometrics index”). The workers and colonies were classified as A. m. carnica if the morphometrics index was larger than −1.55. The threshold value was calculated as a midpoint between the mean canonical scores for A. m. mellifera (−3.73) and A. m. carnica (0.62) (Gerula et al. 2009).
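The threshold arithmetic described above can be checked with a minimal sketch. Only the two mean canonical scores come from the text; the constant names and the helper `classify_by_morphometrics` are hypothetical.

```python
# Mean canonical scores of the reference samples, as quoted in the text.
MEAN_MELLIFERA = -3.73
MEAN_CARNICA = 0.62

# Midpoint between the two means: (-3.73 + 0.62) / 2 = -1.555,
# which the text reports as -1.55.
THRESHOLD = (MEAN_MELLIFERA + MEAN_CARNICA) / 2


def classify_by_morphometrics(index):
    """Hypothetical helper: label a worker or colony from its morphometrics index."""
    return "carnica" if index > THRESHOLD else "mellifera"
```

For example, `classify_by_morphometrics(0.0)` gives `"carnica"` and `classify_by_morphometrics(-3.0)` gives `"mellifera"`.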

2.5 Testing agreement between methodologies

The agreement between mitotypes, microsatellites and morphometrics was assessed by comparing the number of workers or colonies classified identically. Additionally, the relationship between the indices was compared by regression analysis in PAST ver. 3.0 (Hammer et al. 2001). Analyses were performed for both individuals (workers) and colonies. To characterize colonies, median values based on colony members were computed.
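The two summary steps described above, colony medians and the share of identical classifications, can be sketched as follows. This is an illustration under assumed data layouts, not the procedure run in PAST; the function names and the input format are hypothetical.

```python
from collections import defaultdict
from statistics import median


def colony_medians(records):
    """Median index per colony.

    `records` is assumed to be an iterable of (colony_id, index_value)
    pairs, one pair per worker.
    """
    by_colony = defaultdict(list)
    for colony_id, value in records:
        by_colony[colony_id].append(value)
    return {colony: median(values) for colony, values in by_colony.items()}


def percent_agreement(labels_a, labels_b):
    """Percentage of units given the same subspecies label by two methods."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both methods must classify the same units")
    same = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * same / len(labels_a)
```

With four workers of which three receive the same label from both methods, `percent_agreement` returns 75.0.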

2.6 Identification of hybrids

There are no objective criteria for distinguishing between pure subspecies and hybrids. The identification requires the use of threshold values which are to some degree arbitrary. Here, we have used an approach proposed by Vähä and Primmer (2006), who recommended using a rather low and restrictive threshold value of 0.1 if the aim of the study is to efficiently detect hybrids, even at the expense of misidentification of a proportion of purebreds. Therefore, the workers and colonies were classified as pure A. m. carnica if the microsatellites index was larger than 0.9, as pure A. m. mellifera if the index was smaller than 0.1 and as hybrids if the index was between those values. Workers classified using those criteria were used as reference in the classification based on morphometrics. This allowed us to verify whether morphometrics can be used for identification of hybrids. We have used workers and not colonies for this classification because the number of colonies in the smallest group was much smaller than the number of morphometric variables. First, forward stepwise discriminant analysis was used to select the morphometric variables which most effectively discriminate the groups (Statistica ver. 10.0, StatSoft 2011). Next, the selected variables were used to obtain classification functions (Online Resource, Tables 1 and 2). Leave-one-out cross-validation was used to obtain more reliable results (PAST ver. 3.0, Hammer et al. 2001).
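The three-way rule based on the microsatellites index can be written as a minimal sketch. The cut-offs 0.1 and 0.9 are those given in the text; the function name `classify_purity` and the treatment of values exactly equal to a cut-off as hybrids are assumptions that follow from the wording "larger than" and "smaller than".

```python
LOWER = 0.1  # below this value: pure A. m. mellifera
UPPER = 0.9  # above this value: pure A. m. carnica


def classify_purity(microsatellites_index):
    """Hypothetical helper: three-way label from the microsatellites index."""
    if microsatellites_index > UPPER:
        return "pure carnica"
    if microsatellites_index < LOWER:
        return "pure mellifera"
    # Everything else, including the boundary values, is treated as hybrid.
    return "hybrid"
```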

3 Results

In the 66 analysed colonies, we found 26 (39.4 %) colonies with a mitotype characteristic of the C lineage. The remaining 40 (60.6 %) colonies had M mitotypes. When bees were assigned to the two subspecies based on the assumed threshold for the microsatellite index, 492 (70.7 %) workers and 47 (71.2 %) colonies were classified as A. m. mellifera and 204 (29.3 %) workers and 19 (28.8 %) colonies as A. m. carnica. For the morphometric index, the corresponding values were 512 (73.6 %) workers and 51 (77.3 %) colonies for A. m. mellifera and 184 (26.4 %) workers and 15 (22.7 %) colonies for A. m. carnica (Fig. 1). The frequency distribution of the microsatellites index differed markedly from the distribution of the morphometric index for both workers and colonies (Kolmogorov–Smirnov test, workers D = 0.884, P = 0.0001, Fig. 1b, c; colonies D = 0.894, P = 0.0001, Fig. 1d, e).

Figure 1.

Identification of honey bee subspecies. a Bar plot result of the InStruct assignment test for K = 2 under the admixture model. Colours indicate the relative contribution of each of the two subspecies (A. mellifera mellifera, gray; A. mellifera mellifera, white) recovered from the data for each individual (column) in each colony (delineated with black vertical lines). b–e Distribution of assignment indices based on microsatellites (b, d) and wing morphometrics (c, e). The graphs b and c correspond to indices at the level of workers and graphs d and e correspond to indices at the level of colonies. In all graphs, higher values of the indices mean more similar to A. m. carnica and smaller values mean more similar to A. m. mellifera.

More than three quarters of the colonies (75.76 %) were classified to the same subspecies by all three methods. At the level of workers, the agreement between the three methods was lower (70.11 %). The agreement was highest between microsatellites and morphometrics (Table 1) and the lowest between mitotypes and morphometrics (Table 1).

Table 1 Percent of colonies (upper triangle) and workers (lower triangle) which were classified to the same subspecies by two of the three methods: microsatellites, mitotypes and morphometrics.

Indices of subspecific assignment based on the three methods were highly correlated with each other. The highest correlation (0.753) was between microsatellites index and morphometric index (Table 2, Fig. 2a). The lowest correlation (0.428) was between mitotype index and morphometric index (Table 2, Fig. 2d).

Table 2 Spearman correlation between indices of subspecific assignment based on the following: microsatellites, mitotypes and morphometrics. Upper triangle corresponds to colony level and lower triangle to worker level. All correlations are highly significant (P < 0.0001).
Figure 2.

Relationship between indices of honey bee subspecies identification based on three methods: microsatellites, mitotypes and morphometrics. a, c Indices at the level of colonies. b, d Indices at the level of workers. In the graphs a and b, microsatellites index was logit-transformed. In all graphs, higher values of the indices mean more similar to A. m. carnica and smaller values mean more similar to A. m. mellifera. On the graphs a and b, least-square linear regression lines with their 95 % confidence limits are shown (solid and dotted lines, respectively). e, f Distribution of assignment indices for colonies with carnica and mellifera mitotypes. For each sample, the 25–75 % quartiles are drawn using a box, and the median is shown with a line inside the box. The whiskers indicate data points outside the box and 1.5 times higher (or lower) than 25–75 % quartiles values; data points outside these ranges (“outliers”) are shown as circles.

The relationship between the morphometric index and the microsatellite index was non-linear and S-shaped at the level of both colonies and workers. Therefore, the microsatellite index, which is a probability and ranges from 0 to 1, was logit-transformed to better meet the assumptions of linear regression. There was a highly significant correlation between the morphometric index and the logit-transformed microsatellite index at both the individual and the colony level (Fig. 2a, b). As expected, the relationship was stronger at the colony level: the proportion of the total variation explained by the models equalled 0.50 for individuals and 0.74 for colonies.
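The logit transformation and the proportion of explained variation can be illustrated with a minimal sketch. It is not the PAST analysis itself. The clipping constant `eps`, which keeps the logit finite when a probability equals exactly 0 or 1, is an assumption, as are the function names.

```python
import math


def logit(p, eps=1e-6):
    """Logit of a probability; p is clipped to [eps, 1 - eps] (assumed safeguard)."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def r_squared(x, y):
    """Proportion of variation explained by a least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)
```

In use, the microsatellite indices would first be passed through `logit`, and `r_squared` would then be applied to the morphometric indices and the transformed values. For a simple linear regression, this quantity equals the squared Pearson correlation, so it does not depend on which index is treated as the response.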

In the case of discrimination between pure subspecies and hybrids based on microsatellites, 403 (57.9 %) workers were classified as pure A. m. mellifera, 84 (12.1 %) as pure A. m. carnica and 209 (30.0 %) as hybrids. When the three groups were used as reference samples for classification based on morphometrics, 75.9 % of workers were classified correctly. Classification of hybrids was less accurate than classification of pure subspecies (Table 3).

Table 3 Classification of worker bees as pure subspecies and hybrids using morphometrics.

4 Discussion

We have attempted for the first time to use morphometrics for identification of hybrids between honey bee subspecies. This identification was moderately accurate, but it allowed pure subspecies to be discriminated with higher reliability. One pure subspecies can be misclassified as hybrids but never as the other pure subspecies (Table 3).

The results presented here show significant correlation between the three methods of identification of honey bee subspecies. This corroborates and strengthens previously published conclusions (De la Rúa et al. 2007; Miguel et al. 2011). In this study, we have used for the first time three independent methods of subspecies identification which were based on different reference samples. We have also used an admixed population, which better corresponds with real-life problems of honey bee subspecies conservation.

Particularly strong agreement occurred between methods based on morphometrics and microsatellites. There is no clear answer as to which of the two methods is more accurate because the true assignment of workers to subspecies remains unknown; however, microsatellites have some advantages over morphometrics. First of all, as categorical traits, microsatellite alleles can be interpreted without error (although some issues could arise due to problems with amplification or electrophoresis conditions). Morphological traits, on the other hand, suffer from measurement errors. Second, unlike microsatellites, the morphological traits are affected by the environment, which can obscure differences between subspecies.

As expected, there was a weaker relationship between morphometrics and mitochondrial markers. Although it is often suggested that there is agreement between nuclear and mitochondrial markers (Garnery et al. 1998; Jensen et al. 2005), there are some inconsistencies between the methodologies (Lobo 1995). Because mtDNA is inherited maternally, it is not suitable for identification of hybrids and quantification of level of introgression at the individual level (Garnery et al. 1998; Schneider et al. 2004). For example, Kandemir et al. (2006) detected two distinct mitochondrial lineages C and O, although the overall Cyprus population was relatively homogenous in terms of microsatellites. Similar disagreement between mitochondrial and nuclear markers occurs in Africanized honey bees (Lobo 1995) and in A. m. iberica (Cánovas 2008, 2011). In studies of subspecies identification, it was recommended to use mtDNA only for initial screening (Rortais et al. 2011) or together with morphometrics or nuclear markers (Nielsen et al. 1999; Pinto et al. 2003).

It is sometimes suggested that molecular methods are better than morphological ones (Page 1998); however, not all of them are suitable for discrimination of all honey bee subspecies (Sheppard et al. 1996). In some studies, morphometrics proved to be more effective in identification of subspecies than molecular markers (Oldroyd et al. 1995). Morphological characters were also more suitable for distinguishing ecotypes within subspecies (Strange et al. 2008). The subspecies have been described using morphological characters (Ruttner 1988). The molecular markers were introduced later and not all subspecies (for example, A. m. simensis, Meixner et al. 2011) are described in this way. Some endemic subspecies, for example, A. m. adami, have been hybridized with other subspecies (Harizanis and Bouga 2003), and finding reference samples that enable identification of A. m. adami can be difficult. A similar situation occurs in the case of extinct subspecies or ecotypes. In most cases of archaeological material, only morphometrics can be used (Bloch et al. 2010). A major advantage of morphometrics is its low cost and greater availability. This method is available not only to scientists but also to beekeepers. Wing venation can be measured automatically, which is faster, less labour intensive and more accurate (Tofilski 2004, 2007; Baylac et al. 2008; Francoy et al. 2008). In recent years, there has been significant improvement in the methodology used for morphometric analysis of honey bee wings. Geometric morphometrics has become increasingly popular (Monteiro et al. 2002; Tofilski 2008; Francoy et al. 2009; Miguel et al. 2011; Barour et al. 2011; Kandemir et al. 2011). Geometric morphometrics provides more accurate results (Tofilski 2008; Miguel et al. 2011) and allows better graphical presentation of shape changes.

The results presented here show that classification of a colony to subspecies is more reliable if it is based on more than one individual. This confirms earlier studies which showed a relatively large error of identification based on a single worker (Daly and Balling 1978; Page and Erickson 1985; Tofilski 2008) and supports the recommendation that the identification should be based on more than ten workers from a colony (Meixner et al. 2013). Morphology of an individual is shaped not only by its genetic makeup but also by environmental influences (Daly et al. 1988, 1995; McMullan and Brown 2006). As a consequence, morphological markers are more variable than molecular markers. Variation of the morphological characters can be particularly large in the case of starvation or presence of parasites; however, identification of honey bee subspecies was largely correct even in the case of workers developing under unfavourable conditions (Daly et al. 1995). Nevertheless, if the aim of the study is to assign a colony to subspecies, a large number of individuals needs to be measured in order to reduce the identification error.

In this study, we examined the concordance between assignment of bees based on genetic and morphometric markers, in an area originally occupied by A. m. mellifera. Our results indicate that the native subspecies still predominates in the studied area and pure A. m. carnica are relatively uncommon despite recent queen importation. Perhaps, the observed strong link between nuclear and mitochondrial markers can be explained by the recency of hybridization. Furthermore, partial reproductive isolation between the two subspecies (Oleksa et al. 2013b) could counteract the weakening of the relationship between nuclear and mitochondrial genomes. The question whether the observed relationship between microsatellites and mitochondria or genetic and morphometric indices is unique for the particular population studied or represents more general pattern opens interesting perspectives for future research.

The results presented here show that morphometrics can be used for detection of hybrids between A. m. mellifera and A. m. carnica (but see Guzman-Novoa et al. 1994). The wing measurements are relatively inexpensive and accessible to beekeepers; therefore, they can be an alternative to molecular methods in projects aiming at protection of endangered A. m. mellifera.