1 Introduction

Species decline is affecting essential ecosystem services (e.g., pollination); thus the development of biodiversity conservation actions toward preservation of affected species is an urgent need (Cardinale et al. 2012). Distribution data are a key component in developing initiatives aimed at species conservation prioritization (Kujala et al. 2011). However, for most species, especially insects and other invertebrates, this type of data is usually limited or absent (the so-called Wallacean shortfall) (Diniz-Filho et al. 2010). Species distribution modeling (SDM) has been proposed as an effective method for overcoming this shortfall, assisting in estimating the probability of species occurrence using the few available and sometimes biased distribution records (Diniz-Filho et al. 2010).

Species distribution modeling consists in applying algorithms that relate species occurrence to ecological data (mostly climatic and topographic) in order to obtain potential species distribution (Soberón and Peterson 2005). SDM has been applied to address conservation issues including evaluating species representation within natural protected area networks (Nóbrega and De Marco 2011; Bosso et al. 2013) and identifying optimal values of key environmental variables that favor species survival at broad scales (Rojas-Soto et al. 2008). Species potential distribution models have also been used to support the drawing of geographic limits for species considered taxonomically problematic (Raxworthy et al. 2007) and to understand speciation patterns (Peterson et al. 1999).

Bees are among the most important pollinators of both wild plants (Ollerton et al. 2011) and crops (Klein et al. 2007). In tropical America, stingless bee species of the genus Melipona Illiger, 1806 (Meliponini: Apinae), are considered potentially important pollinators due to their perennial nesting behavior, population size (>1000 individuals per colony), polylecty combined with flower constancy, and foraging behavior (buzz pollination) (Slaa et al. 2006). Melipona species are eusocial bees exclusive to the Neotropical region, with ∼60 species in South America and 16 species in Central America and Mexico (CAM). Several species of the genus have cultural and economic importance, since their honey, pollen and wax are frequently used as medicine and food supplements and in religious practices (Ayala et al. 2013). Despite their importance, some Melipona species may be at risk in at least a portion of their distribution areas (Cairns et al. 2005). In the context of a global decline in bee populations, preserving the diversity of wild bees, including Melipona stingless bees, is essential in maintaining high-quality pollination dynamics (Garibaldi et al. 2013).

Studies on Melipona species from CAM have evaluated aspects related to meliponiculture and traditional knowledge (Reyes-González et al. 2014), biology (Macías-Macías et al. 2011), taxonomy and diversity (Ayala 1999; Camargo and Pedro 2008), population genetics (May-Itzá et al. 2012), and conservation threats (Cairns et al. 2005). There exists general knowledge about the distribution of the species in the region (Camargo et al. 1988; Ayala 1999; Roubik and Camargo 2012), yet there are no studies focusing on their distribution patterns or describing the associated ecological characteristics on a broad scale.

Considering the lack of knowledge on the distribution and ecology of Melipona species in CAM, and aiming to provide data to develop further studies concerning conservation strategies for these important native bees, this study was designed with three objectives: 1) to develop broad-scale species distribution models, identifying potential geographic divergence among taxonomically related species; 2) to identify key ecological factors (climate, topography, ecoregions) affecting species distribution; and 3) to evaluate conservation conditions of the species, assessing the proportion of potential species distribution under protection and under agricultural interventions.

2 Materials and methods

2.1 Species and study area

We applied SDM to analyze the geographic patterns of 11 Melipona species occurring in CAM: Melipona beecheii s.l. Bennett, 1831; M. triplaridis Cockerell, 1925; M. colimana Ayala, 1999; M. fasciata Latreille, 1811; M. solani Cockerell, 1912; M. costaricensis Cockerell, 1919; M. fallax Camargo & Pedro, 2009; M. lupitae Ayala, 1999; M. yucatanica Camargo, Moure & Roubik, 1988; M. phenax Cockerell, 1919; and M. micheneri Schwarz, 1951.

It is possible that M. beecheii s.l. (hereafter M. beecheii) is a species complex (May-Itzá et al. 2012), but here it is treated as a single taxon. Species for which available locality information was insufficient (n < 5) were not included in this study (i.e., M. carrikeri Cockerell, 1919; M. insularis Roubik and Camargo, 2012, and M. torrida Friese, 1917).

The study area corresponds to the known distribution of the Melipona genus in CAM, which runs from the Mexican states of Sinaloa and Veracruz to southern Panamá. However, because most of these species show restricted distribution patterns within this geographic area, and in order to avoid excessively extrapolated predictions that are difficult to interpret (Barve et al. 2011), we restricted the study area for each species to the ecoregions where the species has been collected plus the adjacent ecoregions. Therefore, the entire study region (CAM) was stratified into six modeling zones (Fig. S1, Online Resource 1). M. beecheii has been collected in Cuba and Jamaica, but we did not include these islands in our analysis because of a lack of data collection. Even though some species (i.e., M. fallax, M. micheneri, M. phenax) have been reported as occurring in the northern part of South America, this region was not included because of a lack of data on species occurrence and because a previous analysis (not included) failed to predict the distribution of these species within that area.

2.2 Species record data

Locality records were mainly retrieved from label information of specimens belonging to the collections listed in Table S1 (Online Resource 1). Other record sources included personal fieldwork and donations from scientific collectors (Table S2, Online Resource 1).

When necessary, geographic coordinates were obtained by means of regional and global atlases supported by the use of geographic information systems (GIS) and Google Earth (Google Inc. 2015). A total of 323 unique localities were used to generate the distribution models presented here (Table S3, Online Resource 1).

2.3 Predictive variables

Nineteen bioclimatic variables widely used for generating species distribution models (annual and seasonal trends in precipitation and temperature) were retrieved from the WorldClim database (Hijmans et al. 2005; http://www.worldclim.org/current) at 30 arc-second (∼0.01 × 0.01 degrees) spatial resolution. This resolution was chosen to match our species occurrence data resolution (Peterson et al. 2011), and also considering that Melipona species have a relatively small flight range (≤2 km) (Araújo et al. 2004). In addition, four topographic variables (aspect, compound topographic index [CTI], elevation, and slope) were downloaded from the U.S. Geological Survey’s HYDRO1k database (http://eros.usgs.gov).
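
The relation between the 30 arc-second grid and the ≤2 km flight range can be checked with simple arithmetic. The sketch below is only illustrative: the function names, the spherical-Earth radius, and the example latitude of 20° are assumptions introduced here, not values taken from the study.

```python
import math
from typing import Tuple

ARCSEC_PER_DEGREE = 3600
EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def cell_size_degrees(arcsec: float) -> float:
    """Convert a cell edge given in arc-seconds to decimal degrees."""
    return arcsec / ARCSEC_PER_DEGREE


def cell_size_km(arcsec: float, latitude_deg: float = 0.0) -> Tuple[float, float]:
    """Return the (north-south, east-west) cell edge length in km at a latitude."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    north_south = cell_size_degrees(arcsec) * km_per_degree
    # East-west length shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


degrees = cell_size_degrees(30)
# 20 degrees is an illustrative latitude only, not a value from the study.
ns_km, ew_km = cell_size_km(30, latitude_deg=20.0)
```

Under these assumptions a 30 arc-second cell is about 0.0083 degrees on a side and slightly under 1 km north to south, which is consistent with the "∼0.01 × 0.01 degrees" approximation and smaller than the reported flight range.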

An ecoregion layer (Olson et al. 2001) was used as a predictive (categorical) variable for stratifying zones according to the extent of species occurrence, and to identify species’ biogeographic affinities. Spatial data were managed and analyzed using ArcGIS v.10.

Principal component analysis was applied to the 19 bioclimatic variables in order to reduce collinearity and to generate a smaller set of uncorrelated predictive variables (Dormann et al. 2013). A total of five principal components (PCs) were generated, accounting for 100 % of the total variation for each modeling zone (Table S4; Online Resource 1).

The five PCs generated from the bioclimatic data set and the four topographic variables mentioned above were used as predictor variables for developing the ecological niche models.
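
The PCA step can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow used in the study (which relied on ArcGIS): it assumes the variables are standardized before the decomposition, uses NumPy, and runs on synthetic data, so the `pca` function and all values are hypothetical.

```python
import numpy as np


def pca(X: np.ndarray, n_components: int):
    """PCA on standardized columns of X (rows = grid cells, columns = variables).

    Returns the component scores and the fraction of total variance
    explained by each retained component.
    """
    # Standardize each variable (assumption: PCA on the correlation matrix).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs[:, :n_components]
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, explained


# Synthetic stand-in for 19 bioclimatic variables sampled at 500 grid cells.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 19))
X[:, 1] = 2 * X[:, 0] + rng.normal(scale=0.1, size=500)  # a strongly collinear pair
scores, explained = pca(X, n_components=5)
```

The resulting score columns are mutually uncorrelated, which is the property that motivates replacing the original collinear variables with PCs.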

2.4 Species distribution modeling

To generate the distribution model for each species, a consensus approach was applied (Araújo and New 2007; Marmion et al. 2009), using five algorithms: support vector machine, generalized linear model, artificial neural network, GARP, and Maxent. Models generated with the first three methods were obtained by calculating “presence vs. background” models using the ModEco software package (Guo and Liu 2010). GARP and Maxent models were generated using the DesktopGarp (Stockwell and Peters 1999) and Maxent v. 3.3.3k (Phillips et al. 2006) software packages, respectively. A brief description of each algorithm is provided in Online Resource 2, and Table S9 (Online Resource 2) shows the parameters used for running each of the five predictive algorithms.

2.5 Accuracy assessment

Accuracy assessment for each species and each algorithm was performed by calculating the area under the curve (AUC) of the receiver operating characteristic (ROC), which evaluates commission vs. omission errors (Fielding and Bell 1997). An AUC value of ≤ 0.5 indicates that the model performed no better than random, whereas an AUC of 1 indicates a perfect performance, and an AUC of ≥ 0.7 reflects an acceptable model. Accuracy was also assessed by the significance in the proportion of areas predicted (Chi-square test; Anderson and Martínez-Meyer 2004). For species with more than 20 records (M. fallax, M. costaricensis, M. solani, M. beecheii), 25 % of the total was separated as an independent test sample, while for the remaining species, model accuracy was estimated based on the location of training samples.
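
For readers unfamiliar with the AUC statistic, it can be computed as the probability that a randomly chosen presence site receives a higher predicted value than a randomly chosen background site, with ties counted as one half. The sketch below illustrates that definition only; the function name and the example values are hypothetical, and this is not the routine used by the modeling packages in the study.

```python
def auc(presence_scores, background_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical predicted values at test presences and at background cells.
example_auc = auc([0.9, 0.4], [0.5, 0.1])  # 3 of 4 pairs ranked correctly -> 0.75
```

A value of 0.5 corresponds to a model that ranks presences no better than random, and a value of 1 to a model that ranks every presence above every background cell.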

2.6 Model ensemble

The median consensus method (Marmion et al. 2009) was applied to integrate the models generated from the five algorithms into one model for each of the 11 species. Before combining the models, a simple method ([cell value – minimum]/[maximum – minimum]) was applied to standardize the continuous range of values to a scale of 0–1. With the exception of GARP, the algorithms generate models with a potential range of continuous values from 0 to 1. Because the ten best models of GARP were added to obtain a model with corresponding values (i.e., 0–10 as potential range), a transformation was applied so that these models would have the same scale as the other four algorithms.
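
The standardization and median consensus steps can be sketched as follows. This is a minimal illustration that treats each model as a flat list of cell values; the function names and the toy values are assumptions made here, and the actual raster processing in the study was carried out in GIS software.

```python
from statistics import median


def min_max(values):
    """Rescale values to 0-1 using (value - minimum) / (maximum - minimum)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant surface carries no ranking information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def median_consensus(models):
    """Cell-wise median of several models after min-max standardization.

    `models` is a list of equal-length lists, one list per algorithm.
    """
    scaled = [min_max(m) for m in models]
    return [median(cell) for cell in zip(*scaled)]


# Toy example with three grid cells; the first model mimics a summed GARP
# output on a 0-10 scale, the others are already on a 0-1 scale.
models = [
    [0, 5, 10],
    [0.0, 0.25, 1.0],
    [0.0, 0.5, 1.0],
    [0.0, 0.75, 1.0],
    [0.0, 1.0, 1.0],
]
consensus = median_consensus(models)  # [0.0, 0.5, 1.0]
```

Because the rescaling is applied to each model separately, the summed GARP output ends up on the same 0–1 scale as the other four algorithms before the median is taken.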

By examination and visualization of thresholds calculated by each of the algorithms, expert opinion was used to determine the thresholds for each species that would define binary models (presence/absence) of potential distribution (Table S3, Online Resource 1). The presence boundaries of the binary maps were generalized to facilitate posterior analysis. Physiographic features were used to describe distribution patterns: Marshall’s classification (2007) was used for Central America and Cervantes-Zamora et al.’s (1990) for Mexico.

2.7 Environmental and conservation characterization of species predicted presence

In order to analyze the range of climatic and topographic conditions modeled as suitable for the presence of each species, binary models were overlaid on the climate and topography layers using ArcGIS v.10. Maxent outputs were used to identify the most important variables that predicted each species distribution, and the most closely correlated bioclimatic variables defining each key PC were identified.

Finally, for evaluating the percentages of species potential distribution subject to protection and agricultural interventions, data on natural protected area boundaries (ProtectedPlanet.net 2015) and the proportion of cropland (Ramankutty et al. 2010) were also spatially associated to predicted presence areas for each species.
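
The thresholding and overlay logic described above can be sketched as follows. This is an illustration only: it assumes equal-area grid cells stored as flat lists, and the threshold, the suitability values, and the protected-area mask are hypothetical rather than taken from Table S3 or from the NPA data.

```python
def binarize(suitability, threshold):
    """Convert continuous consensus values into presence (True) / absence (False)."""
    return [value >= threshold for value in suitability]


def share_of_presence(presence, mask):
    """Fraction of predicted-presence cells that also fall inside `mask`.

    Assumes all cells have equal area, so cell counts stand in for area.
    """
    n_present = sum(presence)
    if n_present == 0:
        return 0.0
    overlap = sum(1 for p, m in zip(presence, mask) if p and m)
    return overlap / n_present


# Hypothetical consensus values for five cells and a hypothetical NPA mask.
suitability = [0.1, 0.6, 0.8, 0.9, 0.3]
protected = [True, True, False, False, False]
presence = binarize(suitability, threshold=0.5)
protected_share = share_of_presence(presence, protected)  # 1 of 3 presence cells
```

The same function can be reused with a cropland-intensity class mask in place of the protected-area mask to obtain the share of predicted presence in each cropland interval.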

3 Results

We obtained distribution models (maps) for 11 species of the genus Melipona in CAM. The information provided by the models, along with further studies, will be useful for understanding the conservation status of the species and identifying priority areas for conservation planning.

3.1 Species potential distribution models

Species models for which accuracy assessment was calculated (n > 20, Table S3) performed better than expected by chance (e.g., AUC > 0.8975, Table S3). According to the location of species records (Fig. S1), core areas identified as having highest habitat suitability for Melipona species (darker shaded areas; Figure 1) are scattered and restricted mainly around collecting localities (e.g., Figure 1h, i), although some species models revealed presence areas in regions with no neighboring collecting records (e.g., M. beecheii along the Pacific coast of Mexico, states of Guerrero and Michoacán; Figure 1a).

Figure 1.

Consensus distribution models of Melipona species of CAM. Panels a and b correspond to subgenus Melikerria; c to f to subgenus Michmelia; h, i and j to subgenus Melipona, and k to subgenus Eomelipona. Probability estimates of species occurrence are expressed as a continuum; darker colors indicate higher prediction values and lighter colors indicate lower values. The highest prediction interval (darkest shade) was defined using an expert-based threshold (shown as the lower interval limit). Panel l shows the natural protected area (NPA) network within the study region (ProtectedPlanet.net 2015).

Models follow particular discontinuous patterns along their respective modeling zone. For example, models for species of the subgenus Michmelia reveal that M. colimana is clearly distributed north of M. fasciata (Figure 1c, d, respectively), with each species occupying distinctive physiographic provinces (Table S5, Online Resource 1).

According to the models, species are expected to occur at distinct elevation ranges (Fig. S2, Online Resource 1). For instance, M. colimana and M. fasciata tend to be high-elevation species (65 % of distribution at 1650–2750 m and 550–2200 m, respectively), while other species from the same subgenus, such as M. solani, seem to be low-elevation species (<419 m) (Fig. S2 c, d, e, respectively).

With regard to the biotic conditions associated with the distribution models for each species, potential distribution areas occur in 44 ecoregions (Table S6, Online Resource 1) which correspond to five biomes: deserts and xeric shrublands (DXS), tropical and subtropical coniferous forests (TSCF), tropical and subtropical moist broadleaf forests (TSMBF), tropical and subtropical dry broadleaf forests (TSDBF), and mangroves (M) (Fig. S3, Online Resource 1). Closely related phylogenetic species may share a dominant biome type, yet the variety of ecoregions associated with each species reflects conditions of biotic heterogeneity (Table S6, Fig. S3). For example, models of M. yucatanica and M. lupitae (subgenus Melipona) show a predicted potential distribution mainly within TSDBF (>56 % and >95 %, respectively), but M. yucatanica is distributed in a greater variety of ecoregions.

3.2 Environmental characterization of species predicted presence

The predictive variables (PC1 to PC5, slope, DEM, aspect, and CTI) that contributed the most in generating the species models differed among species (Table S7, Online Resource 1). However, the most important correlated climatic variables included within the main PCs were recurrent among the six geographic zones (Table S4). Examination of Tables S4 and S7 shows that for most of the species, precipitation-related variables (bc12, annual precipitation; bc13, precipitation of wettest month; and bc16, precipitation of wettest quarter) were the most important in building the models, whereas temperature-related variables (bc4, temperature seasonality; and bc7, temperature annual range) were important only for M. beecheii, M. fasciata, and M. yucatanica. The elevation variable (DEM) accounted for >40 % in producing the models for M. triplaridis, M. colimana, M. phenax, and M. micheneri.

Patterns regarding the type and values of key bioclimatic variables determining the distribution model characteristics (e.g., “optimum” habitat values) differed among species (Fig. S4, Online Resource 1). Some species experience a narrow optimal range of climatic conditions, while others seem to be more tolerant, having more widespread distribution along the bioclimatic gradient. Some variables are important in delimiting the potential distribution for several species in the same subgenus, but each species is expected to be distributed mainly within specific variable intervals (e.g., subgenus Michmelia Fig. S4a, b, c, d).

The species M. yucatanica shows a specific pattern with respect to variable bc4, having two well-defined “optimal condition” ranges (437–900 and 1825–2288) (Fig. S4t). These intervals roughly correspond spatially to two main areas along the Pacific coast of Mexico, Guatemala, El Salvador, and Nicaragua, and along southeastern Mexico (Veracruz, Campeche, Yucatán, and Quintana Roo states).

3.3 Conservation characterization of species predicted presence

For most of the species, less than 20 % of the predicted presence is included within recognized natural protected areas (NPAs) (Fig. S5, Online Resource 1). The species with the lowest proportional area included within NPAs (<1 %) is M. yucatanica, while M. fallax shows the highest (53 %) (Fig. S5i and g, respectively). Interestingly, a small proportion of protected fragments is larger than 100 km² (Table S8, Online Resource 1).

The proportion of predicted areas included in NPAs (Fig. S5) revealed two characteristic scenarios resulting from the combination of extent and location of modeled species occurrence and existing NPAs: For some species, such as M. fallax and M. costaricensis, for which a large proportion of predicted occurrence is confined to Costa Rica and Panamá (>37 % and >33 % for each species in each country), a large portion of such area (>45 % and >29 %) is under protection (Fig. S5f and g). On the other hand, M. colimana, M. fasciata, and M. lupitae, for which potential distribution is located exclusively in Mexico, exhibit a respective 18 %, 5 %, and 16 % of their distribution under protection (Fig. S5c, d, h).

A large proportion (>60 %) of predicted distribution areas for several species (M. beecheii, M. triplaridis, M. fasciata, M. solani, M. costaricensis, M. fallax, M. lupitae, M. phenax, and M. micheneri) is located in areas with the lowest cropland intensity (0–20 %). In contrast, M. yucatanica and M. colimana show ∼60 % of their distribution in areas with higher (0–40 %) cropland intervention (Fig. S6; Fig. S7, Online Resource 1).

4 Discussion

In this work, we modeled potential distribution for 11 Melipona species of CAM, identifying key associated environmental and ecological conditions. The scattered appearance of predicted core areas may be related in part to the merging of the models, but also to the fragmented availability of optimal conditions, as has been suggested for other Melipona species (e.g., M. capixaba; Serra et al. 2012).

One important assumption in modeling distribution is that the data used to produce models come from source populations (Phillips et al. 2006) whose area of distribution is determined mainly by the interplay of biotic and abiotic characteristics that allow positive growth rates of the species, and the area accessible to the species over time (Soberón and Peterson 2005). It has also been shown that for geographically restricted species, small sample size may yield accurate models provided that the sample is not biased and accurately represents the geographic extent of the species (Tessarolo et al. 2014). Here, models of species with small sample size (e.g., M. lupitae, M. micheneri) show distribution patterns that appear to be restricted to areas around collection localities, which may be associated with the rarity of the species and their likely narrow environmental tolerance. Moreover, in order to avoid under- or over-prediction (Barve et al. 2011), we stratified the entire study area into zones that may better represent the areas that have been accessible to each species; therefore, we presume that our models are a close approximation to the species’ potential distribution patterns.

In contrast, for species with wider distribution such as M. beecheii, areas were predicted for regions where no collection records exist, suggesting poor surveying in those regions or climatic and/or geographic barriers that have prevented species from reaching those areas. Another possible explanation is that some occurrence data may come from sink habitats—that is, habitats that do not meet the required biotic and abiotic conditions for species growth, but that remain within the area accessible to the species (Soberón and Peterson 2005). Such data may produce models that overestimate the distribution area of a species, predicting its presence, albeit with lower probability, in areas not suitable for its persistence (Soley-Guardia et al. 2014).

Despite the low reliability of models based on spatially biased samples of any size, surveys should be conducted at sites predicted as suitable in order to increase the chance of collection, or at least to confirm the absence of the species at those places (Le Lay et al. 2010). Although these types of models are not appropriate for setting forth conservation strategies (Wisz et al. 2008), they have been useful for discovering new species and new populations of known species (Raxworthy et al. 2003).

Recent surveys of M. beecheii and M. lupitae carried out at locations where models confirm documented species occurrence (e.g., Chamela, Jalisco, and Balsas basin, Michoacán, respectively) have failed to supply new specimens. Several hypotheses can explain this situation, including a decrease in species population due to intervening factors such as increased competition for floral resources as a result of the relatively recent regional spread of the invasive Africanized honey bee (Cairns et al. 2005).

Some of the modeled species are frequently used in meliponiculture, an activity that includes the transport of nests to new localities by humans. These actions should be taken into account because of the “sink records” effect associated with over-predicted distribution areas, which in reality are unsuitable for the species (Soley-Guardia et al. 2014). This may be the case with M. beecheii, for which predicted distribution values in the central part of the Yucatán Peninsula seem unexpectedly low, given that it is largely reared in the region. Based on our findings, the suggestion by Cairns et al. (2005) of a possible decline in M. beecheii populations in this region could be re-interpreted as a retraction from its actual distribution due to specific environmental and ecological conditions. If that is the case, this supports the importance of employing SDM to better understand the biogeographic patterns of bee species.

The differential contribution of precipitation- and temperature-related variables in building the models of each species may have important implications for species future survival, as projections based on the MMD-A1B models (multi-model data sets for IPCC scenario A1B) expect a rise in temperature (1.8–5 °C) and a reduction in precipitation in CAM (Christensen et al. 2007).

Some authors have found a high degree of niche conservatism between allopatric closely related species (Peterson et al. 1999), while others have found considerable niche divergence (Raxworthy et al. 2007). Here, we found that the distribution model for M. yucatanica (Figure 1i) shows two clearly divergent areas in both geographic and environmental space, which correspond to genetically differentiated populations (May-Itzá et al. 2010). These findings have conservation implications, since specific management strategies could be planned for each population, but they also support the suggestion of the existence of two distinct species. Moreover, even when their modeling areas partially overlap, models for some morphologically similar and phylogenetically closely related species show clearly disjunct distribution areas associated with particular environmental variables and ecological conditions (Figure 1c vs. d, d vs. e, e vs. f, and h vs. i; Fig. S4), whereas other closely related species show overlapping distribution (Figure 1f vs. g). Based on these findings, and following other studies that have shown that allopatric sister bee species differ in their potential distribution and realized niche characteristics (Silva et al. 2014), an integrative approach is recommended for reviewing the taxonomic status of these Melipona species, including niche analysis methods to assess niche characteristics and explain speciation patterns, in order to produce more robust species hypotheses.

The relatively small proportion of predicted presence in NPAs (<20 %) and the fact that total areas show different levels of crop intensity represent challenging scenarios for the conservation of Melipona species in CAM; these species depend on forest for nesting, as they rely on mature trees (>20 cm in diameter) with suitable cavities (Antonini and Martins 2003). Guatemala, where annual deforestation rates reach 1.53 % (Tuy et al. 2009), is an example of how the areas of low cropland intensity that characterize most Melipona species models are coming under increasing land use transformation linked to crop intensity, which along with the use of pesticides may negatively affect bee populations (Slaa et al. 2006).

Considering that a large proportion of predicted distribution is located in areas mapped as low cropland intensity (Fig. S6, S7), one might hypothesize that such areas may represent suitable habitats for Melipona species because of the interaction between the flowering dynamics of agricultural practices and the availability of nesting cavities necessary for reproduction and colony survival that natural and semi-natural habitats provide. This may be especially important in species with restricted potential distribution (e.g., M. micheneri, M. lupitae) or with a narrow tolerance of environmental conditions (e.g., M. fasciata, M. colimana), which may be more vulnerable to environmental changes associated with deforestation, other land use changes, and climate change.

Although Melipona species depend on forest for nesting, they are polylectic, and are thus able to exploit floral resources available in transformed habitats (Brown and De Oliveira 2014). Further studies and conservation strategies should focus on evaluating the roles of traditional crop systems (as opposed to industrialized systems) and habitat heterogeneity (Fahrig et al. 2011) along the different core areas of the distribution models proposed here.

Melipona species models show three main areas of potential distribution in CAM—southern Central America, inner Central America, and south-central Mexico (north of the Tehuantepec Isthmus)—which correspond approximately to three main areas of terrestrial endemism (Marshall and Liebherr 2000). The fact that these patterns are shared with those of other stingless bee species (Ayala et al. 2013), and that Melipona species are able to make use of floral resources existing in transformed habitats, may suggest that conservation strategies could be promoted by favoring landscape heterogeneity in the form of agroforestry systems (Hoehn et al. 2010) or low-intensity cropland (Schüepp et al. 2012). This approach could be combined with the creation of new NPAs such as biosphere reserves for which human interventions are considered part of conservation practices (Guillen-Calvo et al. 2009), particularly on the Pacific coast, since there are only a few scattered NPAs (Figure 1l).

A final implication of both the size (i.e., the NPAs include individual areas >100 km²) and the location of NPAs relative to the proposed species distribution models is the potential availability of species distribution “expansion areas”, particularly for scenarios where species distribution patterns may change, whether due to climate change (Thomas et al. 2012) or human economic activities (e.g., expansion of cropland).