1 Introduction

Geographical features in a landscape are often associated with abrupt transitions in species or subspecies distribution. These transitions may be legacies of historical barriers to gene flow caused by the geographical feature, or they may arise from a change in contemporary ecological processes that affect species distribution. The Isthmus of Kra on the Malay peninsula, between 10° 34′ N and 11° 24′ N, is a transition zone between seasonal rainforest and mixed moist deciduous forest (Hughes et al. 2003), and between Indochinese and Sundaic biota (Woodruff 2003). During the early Pliocene (5.5–4.5 million years ago), many plant and animal populations north of the Kra Isthmus were repeatedly separated from populations to the south by changing sea levels, leading to the formation of new species and subspecies (Woodruff 2003). Contemporary distributions often reflect these historical events. For example, three shrew species show a transition zone just south of the Isthmus of Kra, from near Satun to Sai Suri (Roberts 2011). Additionally, the honey bees Apis cerana (Deowanish et al. 1996; Smith and Hagen 1996; Sihanuntavong et al. 1999; Sittipraneed et al. 2001; Warrit et al. 2006) and A. dorsata (Insuan et al. 2007) show significant genetic variation among geographical regions of Thailand, particularly between populations north and south of the Kra Isthmus (Sihanuntavong et al. 1999; Warrit et al. 2006). Interestingly, Varroa jacobsoni, a parasitic mite of A. cerana, also shows different haplotypes north and south of the Kra ecotone (Warrit et al. 2006).

The stingless bees are a large group of tropical eusocial bees of the tribe Meliponini (Kerr and Maule 1964). In contrast to honey bees, which reproduce and migrate by swarms that can travel many tens of kilometers from the natal nest (Koeniger and Koeniger 1980; Nakamura et al. 1991; Dyer and Seeley 1994; Itioka et al. 2001; Paar et al. 2004), stingless bee colonies reproduce by a gradual process of budding, which restricts the distance of daughter colonies from their natal nest to 100 m or so (Michener 1979; Inoue et al. 1984; van Veen and Sommeijer 2000; Roubik 2006; Francisco and Arias 2010). This reproductive behavior means that stingless bee populations tend to show much greater structure than honey bee populations (Francisco et al. 2008, 2014; Francisco and Arias 2010; Brito et al. 2014) and might be expected to reflect the vestiges of past biogeography even more strongly than honey bee populations. Tetragonilla collina Smith, 1857, is one of the most common and broadly distributed stingless bee species in the Indochina region (Sakagami and Khoo 1987) and is found throughout Thailand (Theeraapisakkun et al. 2010). It is therefore an ideal species to examine the hypothesis that the Isthmus of Kra is a transition zone for stingless bee subpopulations that were formerly separated north and south of the Isthmus during the Pleistocene.

Combinations of morphometric and molecular techniques are often used to quantify genetic diversity and to determine intra-generic boundaries of bee subpopulations (Sittipraneed et al. 2001; Arias et al. 2006; Mendes et al. 2007; Tofilski 2008; May-Itzá et al. 2012; Rattanawannee et al. 2012; Wappler et al. 2012). For instance, Francoy et al. (2006) showed that a single wing cell, the radial cell, carries sufficient information to correctly classify three groups of Apis mellifera (Africanized, Italian, and Carniolan) with an accuracy of nearly 99%. Morphometric analysis of wing venation of the Brazilian stingless bee Plebeia remota revealed cryptic species within the population (Francisco et al. 2008). Francoy et al. (2011) proposed that geometric morphometric analysis of wing shape could be used as a first step for assigning genetic lineages and geographic origins to samples of the stingless bee Melipona beecheii.

Here we use a phylogenetic analysis of the mitochondrial COI gene in combination with geometric morphometrics of wing venation to determine whether the T. collina population of Thailand shows a pattern of differentiation about the Kra ecotone that parallels that of A. cerana. If so, this would reinforce the idea that there is a sharp biological division north and south of the ecotone that acts as a barrier to gene flow and enhances differentiation of bee populations. If there is no population subdivision in T. collina, then it may be inferred that the sharp boundary observed in A. cerana arises from a hybrid zone brought about by the reunification of the Sundaland and Indo-Chinese subpopulations of A. cerana in this area after the Pleistocene.

2 Materials and methods

2.1 Sample collection

T. collina workers were sampled from 71 colonies from 25 locations throughout Thailand. Sampling locations were grouped into seven geographical subpopulations (Table I, Figure 1). The bees were collected directly at the nest entrance tube of each colony. For morphometric analysis, at least 15 workers were collected, immediately killed with ethyl acetate, and then stored in 70% (v/v) ethanol. For molecular analysis, bees were immediately preserved in 95% (v/v) ethanol and then kept at −20°C until analysis.

Table I Sampling sites of Tetragonilla collina
Figure 1. Tetragonilla collina collection sites in Thailand. The numbers correspond to those in Table I.

2.2 DNA extraction, amplification, sequencing, and alignment

Genomic DNA was extracted from thoracic muscle of one worker bee per colony (34 colonies) using the DNeasy® Blood & Tissue Kit (Qiagen, Germantown, MD) following the manufacturer’s instructions. We amplified a fragment of the mitochondrial cytochrome oxidase subunit I (COI) in a 50-μL final reaction volume containing 1× PCR master mix (Catalog#K0171, Fermentas Life Science), 200 nmol of forward (LCO1490: 5′-GGTCAACAAATCATAAAGATATTGG-3′) and reverse (HCO2198: 5′-TAAACTTCAGGGTGACCAAAAAATCA-3′) primers (Folmer et al. 1994), and at least 200 ng of DNA template. Thermal profiles consisted of an initial denaturation step of 94 °C for 5 min, followed by 35 cycles of 94 °C for 1 min, 48 °C for 1 min, and 72 °C for 150 s, with a final extension step of 72 °C for 5 min.

Amplified PCR products were purified using the QIAquick® Gel Extraction Kit (Qiagen, Germantown, MD) and directly sequenced by AITbiotech Pty Ltd. (The Rutherford Science Park 1, Singapore). Partial DNA sequences were aligned and edited using MEGA6 v6.06 (Tamura et al. 2013). Sequences have been deposited in GenBank under accessions KU934111–KU934144 (Table I). To obtain outgroups for phylogenetic analysis, we sequenced the same COI fragment from three additional stingless bee species [Tetragonula pagdeni Schwarz, 1939 (GenBank ID: KU934145), Homotrigona fimbriata Smith, 1857 (GenBank ID: KU934146), and Tetrigona apicalis Smith, 1857 (GenBank ID: KU934147)], all collected in Thailand.

2.3 Molecular diversity indices analysis and phylogenetic reconstruction

Measures of genetic diversity including the average number of nucleotide differences (k), number of polymorphic sites (S), haplotype diversity (h) (Nei 1987), and nucleotide diversity (π) (Nei and Li 1979) were obtained using DnaSP v5.0 (Librado and Rozas 2009).
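
As an illustration of what these indices measure, the following minimal Python sketch computes S, k, π, and h for a small aligned data set. It is not the DnaSP implementation: it assumes equal-length sequences with no gaps or missing data, and the toy alignment and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's (1987) haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def pairwise_differences(a, b):
    """Number of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def diversity_indices(seqs):
    """Return (S, k, pi) for equal-length aligned sequences without gaps."""
    length = len(seqs[0])
    # S: number of sites showing more than one nucleotide state.
    s = sum(len({seq[i] for seq in seqs}) > 1 for i in range(length))
    pairs = list(combinations(seqs, 2))
    # k: average number of nucleotide differences over all sequence pairs.
    k = sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)
    # pi: nucleotide diversity, i.e. k expressed per site.
    return s, k, k / length


# Hypothetical toy alignment (not data from this study).
toy = ["ATGCATGC", "ATGCATGC", "ATGTATGC", "TTGTATGC"]
S, k, pi = diversity_indices(toy)
h = haplotype_diversity(toy)
```

For this toy alignment the sketch gives S = 2, k = 7/6, and h = 5/6; real data sets with gaps or ambiguous bases need the site-exclusion rules that dedicated software applies.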

Maximum likelihood (ML) and Bayesian inference (BI) methods were used to reconstruct the phylogenetic relationships among COI haplotypes. The program Kakusan4 (Tanabe 2007), with maximum likelihoods calculated in TREEFINDER (Jobb et al. 2004), was used to estimate the best-fit models of nucleotide substitution, as determined by the Akaike information criterion (AIC; Akaike 1974) for ML and the Bayesian information criterion (BIC; Schwarz 1978) for BI. The ML analysis was performed using the likelihood-ratchet method in TREEFINDER (Jobb et al. 2004), with 1000 bootstrap replicates to estimate branch confidence values. Tree topologies with bootstrap values of 70% or greater were regarded as sufficiently resolved (Huelsenbeck and Hillis 1993). The BI analysis was performed with MrBayes v3.1 (Huelsenbeck and Ronquist 2001), which employs a Metropolis-coupled Markov chain Monte Carlo (MC-MCMC) sampling approach. A four-chain MC-MCMC analysis was run twice in parallel (with default heating values) for one million generations starting with a random tree, and trees were collected every 100 generations. The log-likelihood values of the sample points were plotted against the generation number, and the first 25% of the generations were discarded as “burn-in” samples. The remaining trees were used to estimate consensus tree topology, bipartition posterior probability (bpp), and branch length (Huelsenbeck and Ronquist 2001). A bipartition posterior probability of 0.95 or greater was regarded as significant support for the consensus tree (Larget and Simon 1999).

2.4 Geometric morphometrics

Ten workers were randomly selected from each of 71 colonies for dissection, giving a total of 710 bees analyzed. The right forewing of each bee was dissected and slide-mounted. Wings were photographed with a digital camera attached to a stereomicroscope (Olympus SZX16) under ×25 magnification with the same camera settings and by the same person (AR). The wing images were randomly ordered using tpsUtil v.1.49 (Rohlf 2012) before the two-dimensional coordinates of the landmarks were recorded. A set of 13 homologous landmarks was digitized using the tpsDig2 v.2.16 software (Rohlf 2010) by the same person (AR). Landmarks are shown in Figure 2.

Figure 2. Right forewing of a Tetragonilla collina worker. The arrows indicate the respective position of each of the plotted landmarks.

All specimens were digitized twice by the same person (AR). Procrustes ANOVA (Klingenberg and McIntyre 1998) was then performed using MorphoJ v.1.05c (Klingenberg 2011) to ensure that the observed variation was attributable to biological variation and not to measurement error. Repeatability (R), the proportion of variance due to true variation among individuals in relation to the total variance, was calculated according to Arnqvist and Mårtensson (1998). To reduce the effects of measurement error, repeated measurements were averaged in MorphoJ and used in subsequent analyses.
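
The idea behind repeatability can be sketched for a single variable using the standard one-way ANOVA variance components, with individuals as groups and repeated digitizations as replicates. This is a simplified, univariate illustration under the assumption of a balanced design; it is not the multivariate shape calculation performed here, and the function name and toy values are hypothetical.

```python
def repeatability(measurements):
    """Repeatability R = s2_among / (s2_among + s2_within).

    `measurements` is a list with one entry per individual; each entry is a
    list holding the same number (m) of repeated measurements.
    """
    n = len(measurements)          # number of individuals
    m = len(measurements[0])       # repeated measurements per individual
    grand_mean = sum(sum(row) for row in measurements) / (n * m)
    means = [sum(row) / m for row in measurements]

    # Mean squares from a balanced one-way ANOVA.
    ms_among = m * sum((mu - grand_mean) ** 2 for mu in means) / (n - 1)
    ms_within = sum(
        (x - mu) ** 2 for row, mu in zip(measurements, means) for x in row
    ) / (n * (m - 1))

    # Among-individual variance component; ms_within estimates the
    # measurement-error component directly.
    s2_among = (ms_among - ms_within) / m
    return s2_among / (s2_among + ms_within)


# Hypothetical example: three individuals, each measured twice.
toy = [[10.0, 10.2], [12.0, 11.8], [14.0, 14.2]]
r_toy = repeatability(toy)
```

Values of R close to 1 indicate that nearly all variance lies among individuals rather than between repeated measurements of the same individual.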

Samples were grouped into seven a priori groups according to geographical area (Table I) and into two groups based on the molecular analysis (see Sect. 3). Centroid size (CS) of the landmarks was calculated as an indicator of overall wing size (Zelditch et al. 2004) and used to assess whether bees of different subpopulations differed in size. We compared the CS of the landmarks between the two clades suggested by the COI analysis using an independent-samples t test.
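
Centroid size is the square root of the summed squared distances of all landmarks from their centroid. The short Python sketch below illustrates the calculation for one configuration; the function name and the four-point configuration are hypothetical and stand in for the 13 wing landmarks.

```python
import math


def centroid_size(landmarks):
    """Square root of the summed squared distances of landmarks to their centroid."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in landmarks))


# Hypothetical configuration: corners of a 2 x 2 square.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
cs_square = centroid_size(square)

# Tripling every coordinate triples the centroid size.
tripled = [(3 * x, 3 * y) for x, y in square]
cs_tripled = centroid_size(tripled)
```

Because the measure depends only on distances to the centroid, it is unaffected by where the wing sits in the image or how it is rotated, which is why it serves as a size variable separate from shape.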

To analyze variation in wing shape, the two-dimensional landmark data were first subjected to Procrustes superimposition, which removes variation in scale, position, and orientation that is not attributable to wing shape (Dryden and Mardia 1998). Canonical variate analysis (CVA) was then performed in MorphoJ to examine the relative difference in wing shape among populations. Mahalanobis distances and Procrustes distances between pairs of populations were determined, and the significance of differences was assessed by a permutation test (10,000 iterations). To infer phenotypic relationships of wing-shape variation among populations, a neighbor-joining (NJ) tree (Saitou and Nei 1987) based on Mahalanobis distances between the centroids of each population derived from CVA was constructed using MEGA6 v6.06.
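
The three steps of the superimposition (centering, scaling to unit centroid size, and rotating) can be sketched for a single pair of two-dimensional configurations. The Python example below is an ordinary (pairwise) alignment that represents landmarks as complex numbers; it is only an illustration, not the generalized Procrustes fit to a consensus that MorphoJ performs, and it does not handle reflection. All names and coordinates are hypothetical.

```python
import cmath
import math


def to_shape(landmarks):
    """Center a 2D configuration and scale it to unit centroid size."""
    z = [complex(x, y) for x, y in landmarks]
    mean = sum(z) / len(z)
    centered = [p - mean for p in z]
    size = math.sqrt(sum(abs(p) ** 2 for p in centered))
    return [p / size for p in centered]


def align(reference, target):
    """Rotate `target` onto `reference`; both must come from to_shape()."""
    # The least-squares rotation angle is the argument of
    # sum(reference_i * conj(target_i)).
    inner = sum(r * t.conjugate() for r, t in zip(reference, target))
    rotation = cmath.exp(1j * cmath.phase(inner))
    return [t * rotation for t in target]


def procrustes_distance(reference, aligned):
    """Square root of the summed squared landmark deviations after alignment."""
    return math.sqrt(sum(abs(r - a) ** 2 for r, a in zip(reference, aligned)))


# Hypothetical reference triangle and a copy that is scaled by 2.5,
# rotated by 0.7 rad, and translated by (10, -3).
reference_wing = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
moved_wing = []
for x, y in reference_wing:
    p = complex(x, y) * 2.5 * cmath.exp(0.7j) + complex(10.0, -3.0)
    moved_wing.append((p.real, p.imag))

ref_shape = to_shape(reference_wing)
fitted = align(ref_shape, to_shape(moved_wing))
distance = procrustes_distance(ref_shape, fitted)
```

Because the second configuration differs from the first only in scale, position, and orientation, the distance after superimposition is zero up to rounding error; any remaining distance between real wings reflects differences in shape.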

3 Results

3.1 Molecular data analysis

We obtained a DNA sequence of 519 base pairs (bp) from each of 34 individuals. The nucleotide composition showed high A+T content (average 71.48%). Multiple alignment and pairwise sequence comparisons showed a total of 119 variable sites (89 informative), with 131 single-base substitutions comprising 93 transitions (70.99%) and 38 transversions (29.01%).

Molecular diversity indices are shown in Table II. In total, 19 different haplotypes were detected from the 34 T. collina individuals examined. Haplotype diversity (h) was high overall (0.95), ranging from 0.67 to 1.00 per subpopulation (Table II). Overall nucleotide diversity (π) was 0.073, and ranged from 0.011 to 0.070 per subpopulation. Populations from the east had the highest levels of nucleotide diversity, while those from the south had the lowest (Table II). When the bee samples were divided into two groups based on the two major clades suggested by the phylogenetic analysis of mitochondrial COI sequences (see below), molecular diversity indices of clade A were higher than those of clade B (Table II).

Table II Summary of molecular diversity indices of Thai Tetragonilla collina populations based on the mitochondrial COI gene

3.2 Phylogenetic analysis

The best-fit evolutionary model for the ML tree under the AIC was J2+G+I and that for the BI tree under the BIC was HKY85+G. The topology of the Bayesian tree suggests that Thai populations of T. collina are monophyletic but divided into two major clades: clade A and clade B. Clade A comprises bees from northern, central, northeastern, eastern, and western1 (PHET1 and PHET2) populations, with an ML bootstrap value of 90.7% and a Bayesian posterior probability of 0.55. Clade B includes populations from western2 (SUNGKLA1, SUNGKLA2, and RATRY), Kra, and southern populations with high support values (ML bootstrap value of 89.9% and a Bayesian posterior probability of 1.0) (Figure 3 and Table I).

Figure 3. Phylogenetic relationships of Tetragonilla collina populations in Thailand and outgroups based on Bayesian inference (BI) analysis of the mitochondrial COI gene. Node support values are Bayesian posterior probabilities and ML bootstrap values.

3.3 Geometric morphometric analysis

Procrustes ANOVA (Klingenberg and McIntyre 1998) showed that the measurement error was low relative to overall shape variation. The between-individual mean square significantly exceeded the mean square attributable to measurement error (F 15,598 = 52.76; P < 0.0001). In addition, the repeatability of landmark acquisition was high (R = 0.96).

The bees from seven geographic populations differed significantly in wing centroid size (ANOVA: F 6, 703 = 114.8; P < 0.0001). When the bee samples were divided into two groups based on the two major clades suggested by the phylogenetic analysis of mitochondrial COI sequences (Figure 3), bees of clade A (mean wing CS = 772.64 ± 26.82) had significantly (P < 0.001 two-tailed t test) larger wings than bees from clade B (734.78 ± 21.98) (Figure 4).

Figure 4. Box-and-whisker plot showing forewing centroid size variation of Tetragonilla collina populations in Thailand. Boxes show the median, whiskers show the minimum and maximum observations, and circles show the outliers of the data sets.

The first two canonical variates explained 77.640% of total variance (Table III). Based on permutation tests for Mahalanobis and Procrustes distances, most subpopulation pairs showed significant differences in forewing shape (Table IV). In addition, individuals from clade A were generally well separated from those of clade B (Figure 5).

Table III Eigenvalues and percentages of variance explained by six canonical variates produced from canonical variate analysis (CVA) of Tetragonilla collina wing venation pattern
Table IV Mahalanobis distances and Procrustes distances between Tetragonilla collina populations in Thailand derived from CVA of worker forewings, with P values calculated from 10,000 random permutations per test to examine statistically significant differences between pairs of stingless bee populations
Figure 5. Scatterplot of individual scores for the first two canonical variates derived from the canonical variate analysis (CVA) of landmarks. Wireframes represent the shape change (solid line) from the consensus configuration of landmarks (dashed line) to the extreme negative and positive CV scores. The symbols correspond to those in Fig. 3.

Deviations in shape from the consensus configuration along the first two CV axes of the CVA plot, as represented by wireframe graphs in Figure 5, showed that individuals located in the positive dimension of CV1 had an extended cross vein between the cubitus and vannal vein (cu-v) (landmarks 1 and 2) compared to individuals in the negative dimension. Further, the first branch of cubitus (Cu1) (landmarks 11 and 12) was constricted in the positive group relative to the negative group. Wing shape change along CV2 arose from the distal shift of landmarks 7 and 8 (radial sector) in the positive group when compared to the negative group.

A neighbor-joining tree constructed using Mahalanobis distances between population centroids revealed two distinct groups that are similar to the A and B clades revealed by the mitochondrial phylogenetic analysis (Figure 6).

Figure 6. Neighbor-joining tree based on Mahalanobis distances between population centroids derived from canonical variate analysis.

4 Discussion

Both genetic and morphological data strongly suggest that the T. collina population of Thailand is divided into two distinct clades. One clade (B), which is extant on the Thai-Malay Peninsula, also extends north of the Peninsula along the Myanmar border (Figure 3). This Thai-Malay clade is significantly smaller in wing size than the second clade (A) that occupies all other areas. There is a small area of overlap between clades A and B well north of the Kra Isthmus (Figure 3).

Various studies of the population genetic structure of Thai honey bees have found a biogeographical transition zone between mainland and peninsula populations, centered on a sharp boundary at the Kra ecotone (at Bang Saphan, Prachuap Khiri Khan: 11° 24′ N, 99° 31′ E; and Tup Sa Kae, Prachuap Khiri Khan: 11° 31′ N, 99° 35′ E) (Limbipichai 1990; Deowanish et al. 1996; Smith and Hagen 1999; Warrit et al. 2006). Woodruff (2003) suggests that sea level rises submerged sections of the Thai-Malay Peninsula on at least two occasions: during the early/middle Miocene (24–13 mya) and during the early Pliocene (5.5–4.5 mya). These historical inundations most likely led to the contemporary subdivisions in various honey bee populations, but what about stingless bees?

Thummajitsakul et al. (2008) examined genetic variation and population structure of the arboreal stingless bee, Trigona pagdeni Schwarz, based on molecular markers. They detected differentiation between samples collected north and south of the Isthmus of Kra, but a much stronger differentiation between populations in the northeast with respect to all other populations. This differentiation parallels that seen here in T. collina, and suggests that stingless bees and honey bees differ strongly in their biogeography in Thailand.

In stingless bees, gene flow via females is restricted by their reproductive biology. Stingless bee colonies propagate by establishing a new nest near the parent nest (Roubik 2006). Food and other resources are transferred from the parent colony to the daughter nest over several weeks or months, greatly restricting the dispersal distance (Michener 1979; Inoue et al. 1984; van Veen and Sommeijer 2000) and the extent to which mitochondrial haplotypes can spread per generation via queens (Francisco and Arias 2010; de J. May-Itzá et al. 2012; Nogueira et al. 2014). Males probably disperse further than queens (Paxton 2000; Cameron et al. 2004; Kraus et al. 2008; Mueller et al. 2012), but nonetheless the flight distance of Melipona scutellaris males, for example, is only 0.8–1 km (Carvalho-Zilse and Kerr 2004) and that of Scaptotrigona mexicana males is less than 10 km (Kraus et al. 2008). Therefore, high population subdifferentiation is expected (Francisco et al. 2014) and usually observed (e.g., Franck et al. 2004; Quezada-Euán et al. 2007; Tavares et al. 2007; Thummajitsakul et al. 2008, 2010; de J. May-Itzá et al. 2012; Brito et al. 2014) in stingless bee populations worldwide. Geographical barriers such as rivers, oceans, and mountain ranges can further restrict the dispersal of stingless bees (Brito et al. 2014). Despite limited dispersal, clade B of T. collina appears to have expanded its range into northwest Thailand, where it apparently out-competes the mainland clade. This may be because the forest types of western Thailand have more in common with the Malay Peninsula than with central and northern Thailand (Maxwell 2001, 2004; Wikramanayake et al. 2002).

More important than forest type may be a combination of altitude and climate. The transition zone between clades A and B appears to be the Central Plain of Thailand (Figure 3). This area is very dry in the dry season but endures prolonged periods of inundation during the wet season. As T. collina nests in cavities, either underground (typically beneath a tree) or in a termite mound (Jongjitvimol and Wattanachaiyingcharoen 2007), topography and climate combine to make the Central Plain an inhospitable environment for ground-nesting T. collina. We therefore speculate that this area acts as a natural barrier to gene flow between the two clades and maintains their integrity. Thus, we propose that it is ecological factors rather than historical biogeography that drive the contemporary distribution of the clades of T. collina. In contrast, species like A. cerana that nest in tree cavities are able to survive prolonged periods of flooding, so the Central Plain does not affect gene flow within honey bee species. Nonetheless, the T. pagdeni population of Thailand is also divided by the Central Plain (Thummajitsakul et al. 2008), even though this species nests in trees above ground and is therefore typically unaffected by seasonal flooding.