New insights into the geographic patterns of functional role and taxonomic richness of ants from Mexico

One approach to prioritising conservation efforts is to identify hotspots that are either species-rich or that harbour a high proportion of narrow-ranged species with varying functional roles. Given this, we developed full species distribution models for ants as tools for identifying biodiversity hotspot areas in Mexico. We predicted ant species richness and potential areas of endemism using the most complete data available from across the Americas (i.e., both North and South America) and habitat/ecological specificity (functional diversity) in a species distribution model (SDM). In Mexico, the ecoregions most relevant for ant species richness and endemism are Mesoamerica, Petenes, Sierra Madre Oriental, Trans-Mexican Volcanic, and the Sierra Madre Occidental. For hypogeic and arboreal ants, hotspots were localized in tropical biomes, whereas for epigeic and subterranean ants, hotspots were most prevalent in the tropics and deserts. Moreover, the geographical patterns suggest that some hotspots for ants converge with hotspots proposed for other invertebrates and vertebrates. Our study highlights the importance of hotspot areas for biodiversity conservation, and provides data and maps for ant conservation programs.


Introduction
The current biodiversity crisis has made the description and mapping of biological diversity a priority (Rands et al. 2010; Guénard et al. 2012). Because so many species within biological communities remain poorly studied, much conservation effort has come to focus on a few of the better-known taxa (Shipley and McGuire 2022). Identifying patterns of species distribution within these few taxa can thus be a first step toward successful conservation. This procedure would allow biogeographical clusters or regions to be established from species assemblage data, and relevant conservation units to be designated (Blackman et al. 2021).
One approach to prioritising conservation efforts is to identify hotspots that harbour many species, or relatively restricted geographical areas that harbour many narrow-ranged species. Another approach is to identify critical ecoregions: regions under threat of habitat conversion and climate change (Van Proosdij et al. 2016; Walters et al. 2019). The protection of such hotspots should decrease extinction probabilities for a larger number of species than would be protected in areas of similar size with lower diversity (Walters et al. 2019) and lower chances of habitat modification. The evolutionary and ecological processes responsible for centres of endemism are complex and are linked to climatic, geological, and biogeographic history (Harrison and Noss 2017). Endemic species are often small-ranged and unique to a given region (Myers et al. 2000), and they can act as indicator species for their local biological communities (Shipley and McGuire 2022). Identifying and describing hotspots, or regions with exceptionally high rates of endemism, can help to understand the physical and biological drivers of endemism, prioritise certain broad-scale regions for conservation, and focus the efforts of stakeholders (Keppel et al. 2018; Shipley and McGuire 2022). At the same time, complicating factors such as the similarity of patterns across taxa and the congruence of endemism hotspots with species richness must be considered (Villalobos et al. 2013; Van Proosdij et al. 2016). Identifying biodiversity hotspots requires information on species distributions, which is often derived from locality data (Walters et al. 2019). Nevertheless, for many world regions, especially the tropics, little is known about the spatial distribution of most individual species (i.e., the Wallacean shortfall) or about the spatial distribution of diversity and its geographic patterns (Blackman et al. 2021). It is also difficult to estimate the spatial distribution of biodiversity directly from sampling points (Walters et al. 2019).
Species distribution models (SDMs) offer a solution, as they link a limited number of observations to environmental data with high spatial resolution. SDMs have also been broadly used to identify and prioritise areas (e.g., biodiversity and endemism hotspots) in need of restoration or preservation (Walters et al. 2019).
Ants are one of the most prominent insect groups in terrestrial ecosystems given their many evolutionary and ecological interactions with other species (Moreau et al. 2006). Accordingly, ants provide many ecological services and participate directly or indirectly in different ecosystem processes (Del Toro et al. 2012) such as seed dispersal, predation, pollination, soil movement, and decomposition (Aguilar-Méndez et al. 2021). Ant assemblages have also been suggested as environmental indicators for conservation monitoring programs: they respond rapidly to environmental changes (Rocha-Ortega and Favila 2013), exhibit a pronounced vertical stratification (arboreal, epigeic, hypogeic, and subterranean), and present different ecological traits with sensitivity to environmental impacts (Rocha-Ortega et al. 2017; Rocha-Ortega and Coronel-Arellano 2019). Worldwide there are an estimated 15,489 species of ants (www.antweb.org, www.antwiki.org), with 887 species reported for Mexico, divided into nine subfamilies (Dáttilo et al. 2019). Only four Mexican ant species have been categorized as threatened or endangered by the IUCN (IUCN 2020), and none is listed in the Mexican government's national list of species at risk, the NOM-059-SEMARNAT-2010, updated in 2019. Yet ants are likely to suffer a major conservation crisis given the high degradation of ecosystems by human actions (Challenger et al. 2009). Adequate data on geographical patterns exist for only a few Mexican ant species (Johnson and Ward 2002; Aguilar-Méndez et al. 2021). In Mexico, the southwestern and Mesoamerica ecoregions have been well studied in terms of ant research and have been suggested as biodiversity hotspots (Kass et al. 2022). Moreover, to make insightful conservation decisions, it is a priority to know the distribution of all or most ant species across Mexico.
Therefore, using the full records for the Americas (i.e., both North and South America) of each ant species reported for Mexico in Dáttilo et al. (2019), https://antmaps.org, and www.antwiki.com, we analysed the geographic patterns of ant species richness and weighted endemism for Mexico. Our analyses include the calculation and comparison of species richness and endemism for ant subfamilies and ecomorphs (i.e., a proxy of functional diversity).

Study area
Mexico is a highly biodiverse country for insects (Morrone and Márquez 2008). The Mexican transition zone (MTZ) corresponds to a mixture and overlap of the biotas of the Nearctic and Neotropical regions (Morrone 2015). The insect biota of Mexico is made up of five subsets of geographic distribution patterns: Paleoamerican, Mexican Plateau, Nearctic, Mountain Mesoamerican, and Typical Neotropical biotas (Halffter and Morrone 2017). The Mexican biota is therefore an excellent study system for addressing the questions posed above. We used all records of each species across the Americas to understand which ant species occur in Mexico and which are restricted to it.

Species distribution data
We used the ant species list of Dáttilo et al. (2019) to build our database. For Mexico, we extracted 10,282 records from the database of Dáttilo et al. (2019); for the Americas as a whole (i.e., both North and South America), we obtained 32,859 records from the AntWeb portal (https://www.antweb.org), which allows records to be managed, downloaded, and used. During sorting, we removed duplicate records, records that were not georeferenced, records georeferenced with fewer than 2 decimal places, and records with latitude and longitude coordinates of (0,0). To reduce spatial autocorrelation, we filtered the geographic records of each species by eliminating records that were less than 30, 50, 80, or 100 km apart, using the thin data function in the R environment (R Core Team 2017). We chose the distance for each species according to how clustered its original records were, so that the retained records were evenly distributed. From all species found, we chose to model those with more than 10 records after filtering. Of the 547 species modelled, records of 60 species were separated by more than 30 km, records of 139 species by more than 50 km, records of 208 species by more than 80 km, and records of 169 species by more than 100 km (Supplementary material- Table 1).
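The distance-based filtering described above can be illustrated with a minimal sketch. It is not the R routine used in the study: it assumes a simple greedy rule (keep a record only if it lies at least the chosen distance from every record already kept), great-circle distances on a spherical Earth, and hypothetical function names and coordinates chosen only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def thin_records(records, min_km):
    """Greedy thinning (an assumed rule, not the study's algorithm).

    Skips (0, 0) coordinates and keeps a record only if it is at least
    min_km away from every record kept so far. Exact duplicates are
    dropped automatically because their distance is 0.
    """
    kept = []
    for rec in records:
        if rec == (0.0, 0.0):
            continue
        if all(haversine_km(rec, k) >= min_km for k in kept):
            kept.append(rec)
    return kept


# Hypothetical records for one species: two nearby points, one distant
# point, one (0, 0) placeholder, and one exact duplicate.
example_records = [
    (19.43, -99.13),
    (19.50, -99.20),
    (20.67, -103.35),
    (0.0, 0.0),
    (19.43, -99.13),
]
thinned = thin_records(example_records, min_km=30)
print(thinned)  # [(19.43, -99.13), (20.67, -103.35)]

# The study modelled only species with more than 10 records after filtering.
print(len(thinned) > 10)  # False
```

A greedy rule depends on the order of the input records; randomised thinning procedures repeat the selection many times and keep the largest retained set, which this sketch does not attempt.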

Ecomorphs as proxies of functional diversity
Species were assigned to a vertical stratum if they nest or forage within it (Supplementary material- Table 1). The arboreal stratum was defined as any living or dead part of a tree; the epigeic stratum as any part flush with the soil surface and without organic material (e.g., rocks and aboveground soil); the hypogeic stratum as any part of the ground with organic material such as leaf litter, twigs, or logs; and the subterranean stratum as non-superficial ground, where colonies are observed mostly belowground (e.g., Nomamyrmex), particularly in dry biomes where organic material is frequently absent (Sosiak and Barden 2021). This information was taken from https://www.antwiki.org/wiki/Welcome and https://www.antweb.org/.

Environmental data
To model the distributional range of each species, we first defined a reference region M to calibrate the niche model. For each of the 547 species, we extracted the World Wildlife Fund Inc (WWF) ecoregions (Olson et al. 2001) (Fig. 1) that make up M, taken within the minimum convex polygon that surrounds all the territory occupied by the species (i.e., its geographic records). Then, we used 15 WorldClim environmental variables (http://www.worldclim.org) (Fick and Hijmans 2017). We excluded the following four WorldClim variables because they contain spatial artifacts (Escobar et al. 2014): mean temperature of the most humid quarter, mean temperature of the least humid quarter, mean specific humidity of the warmest quarter, and mean specific humidity of the coldest quarter. We also included elevation (https://cgiarcsi.community/data/srtm-90m-digital-elevation-database-v4-1), average gross primary production between 2001 and 2018 (http://globalecology.unh.edu/data/GOSIF-GPP.html), and soil type and soil pH (https://daac.ornl.gov/SOILS/guides/HWSD.html), the latter two being variables closely related to ant distribution (Jacquemin et al. 2012; Rocha-Ortega and García-Martínez 2018). All environmental variables were adjusted to a resolution of 0.5°. Out of these 19 variables, we retained for each species those with pairwise Spearman's |rho| < 0.8 within the extent of its training area, to avoid overfitting the models due to multicollinearity. We calculated the correlations using the vifcor function of the usdm package (Naimi and Araújo 2016) in the R environment.
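The correlation-based variable selection can be sketched as follows. This is an illustration under stated assumptions, not a reimplementation of vifcor: it uses a simple greedy rule (a variable is retained only if its Spearman |rho| with every variable already retained is below 0.8), and the variable names and values are hypothetical.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_uncorrelated(variables, threshold=0.8):
    """Greedy filter (assumed rule): keep a variable only if its |rho|
    with every previously kept variable is below the threshold."""
    kept = []
    for name, values in variables.items():
        if all(abs(spearman(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values sampled at six cells of one species' training area.
example_variables = {
    "mean_temperature": [10, 12, 15, 18, 21, 25],
    "elevation": [2400, 2100, 1500, 900, 600, 100],
    "soil_ph": [6.5, 5.0, 7.2, 5.8, 6.9, 6.1],
}
print(select_uncorrelated(example_variables))
# ['mean_temperature', 'soil_ph']
```

Here elevation is dropped because it decreases monotonically with temperature (rho = -1), whereas soil pH is nearly uncorrelated with temperature and is kept. The outcome of a greedy filter depends on the order in which variables are considered.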

Species distribution models
Model building. A total of 547 SDMs (Supplementary material- Table 1) were generated using MaxEnt (Phillips et al. 2006), which has been shown to outperform other methods when using presence-only data like ours, even when applied to small data sets (Elith et al. 2006; Van Proosdij et al. 2016). We modified the MaxEnt default settings by allowing only linear and quadratic features for all sample sizes and excluding hinge, product, and threshold features to prevent overparameterization of the models (Merow et al. 2013). Pseudoabsences were selected from raster cells with at least one database record. Continuous SDMs were converted to binary models using the 'minimum training presence' threshold, which is the lowest predicted suitability value at any occurrence point. In essence, it assumes that the least suitable habitat in which the species has been collected marks the minimum suitability for the species. We generated and evaluated the models using the sdm function of the package of the same name in the R environment (R Core Team 2017). Model performance was evaluated using the true skill statistic (TSS) in the sdm package (Naimi and Araújo 2016). To obtain the data for the analyses of Mexican patterns, we clipped the models to the Mexican boundaries.
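The thresholding and evaluation steps can be illustrated with a short sketch. It assumes the standard definition TSS = sensitivity + specificity - 1 and uses hypothetical suitability values; it does not reproduce the sdm package's evaluation procedure.

```python
def minimum_training_presence(presence_suitability):
    """Lowest predicted suitability among the training occurrence points."""
    return min(presence_suitability)


def binarize(suitability, threshold):
    """Convert continuous suitability to presence (1) / absence (0)."""
    return [1 if s >= threshold else 0 for s in suitability]


def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1.

    Assumes `observed` contains at least one presence (1) and one
    absence or pseudoabsence (0).
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


# Hypothetical model output at four occurrences and four pseudoabsences.
presence_scores = [0.62, 0.35, 0.80, 0.47]
pseudoabsence_scores = [0.10, 0.40, 0.05, 0.20]

threshold = minimum_training_presence(presence_scores)
observed = [1] * len(presence_scores) + [0] * len(pseudoabsence_scores)
predicted = binarize(presence_scores + pseudoabsence_scores, threshold)

print(threshold)                                  # 0.35
print(true_skill_statistic(observed, predicted))  # 0.75
```

When the threshold is derived from the same occurrences used for evaluation, every presence is classified correctly by construction, so sensitivity equals 1 and TSS reduces to specificity, as in this example.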

Species richness and corrected weighted endemism
Species richness (SR) and corrected weighted endemism (CWE) were calculated for different taxonomic (9 subfamilies) and vertical stratification (arboreal, epigeic, hypogeic, and subterranean) subsets of our estimated range maps (SDMs). We also included 123 species with 5 to 9 records after filtering, which are too few for building an SDM but help to complement the biodiversity knowledge (Supplementary material- Table 2). We used a hexagonal sampling grid because the hexagon is the most complex regular polygon suitable for such a grid, which results in less orientation bias in our analyses (Brown et al. 2016). CWE measures endemism by inversely weighting the proportion of endemics by their range size (species with smaller ranges are weighted more than those with large ranges) and dividing this value by the local species richness. Both SR and CWE were calculated using SDMtoolbox v1 (Brown 2014). Then, we mapped SR and CWE across Mexico using the raster package (Hijmans and Van Etten 2021). Tatuidris tatusia (subfamily: Agroecomyrmecinae) was included in the calculation and map for the subfamily Amblyoponinae, although it is not phylogenetically related to this subfamily. We did this because Amblyoponinae includes rare species, and the inclusion allows the distribution of T. tatusia to be visualized. Next, we matched the SR and CWE values with the WWF ecoregions (Olson et al. 2001), using the raster package (Hijmans and Van Etten 2021). Finally, we analysed the relationship between local CWE values (hexagonal grid) and SR, using linear regression.
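As an illustration of the two indices, the sketch below computes SR and CWE per grid cell from binary ranges. It assumes that range size is measured as the number of occupied cells and that weighted endemism is the sum of 1/range size over the species present in a cell; SDMtoolbox may differ in detail, and the species names and cell identifiers are hypothetical.

```python
from collections import defaultdict


def richness_and_cwe(ranges):
    """Compute species richness and corrected weighted endemism per cell.

    `ranges` maps each species to the set of grid-cell ids where its
    binary model predicts presence. Range size is taken as the number
    of occupied cells (an assumption of this sketch).
    """
    richness = defaultdict(int)
    weighted = defaultdict(float)
    for species, cells in ranges.items():
        if not cells:
            continue  # a species with no predicted cells contributes nothing
        weight = 1.0 / len(cells)  # small ranges receive larger weights
        for cell in cells:
            richness[cell] += 1
            weighted[cell] += weight
    # Correct weighted endemism by dividing by local richness.
    cwe = {cell: weighted[cell] / richness[cell] for cell in richness}
    return dict(richness), cwe


# Hypothetical ranges over four hexagonal cells.
example_ranges = {
    "species_a": {1, 2, 3, 4},  # widespread
    "species_b": {1, 2},
    "species_c": {1},           # restricted to a single cell
}
sr, cwe = richness_and_cwe(example_ranges)
for cell in sorted(sr):
    print(cell, sr[cell], round(cwe[cell], 3))
```

In this example cell 1 has both the highest richness (3 species) and the highest CWE, because it contains the single-cell species, while cells 3 and 4 hold only the widespread species and score lowest on both indices. The two indices need not coincide in general, since CWE is divided by local richness.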

Species richness
Based on the resulting hotspot maps for Mexico, the most species-rich areas for ant subfamilies were found in the south-eastern part of the country: the Sierra Madre del Sur, Petenes, and Mesoamerica (i.e., Oaxaca, Chiapas, and Central America) regions. Other regions in the central-north of Mexico with high species richness were Sierra Madre Oriental and Occidental, Altiplano and Chihuahuan deserts, and the Trans-Mexican Volcanic region (Fig. 2). For the Amblyoponinae, Ectatomminae, Ponerinae, Proceratiinae, and Pseudomyrmecinae subfamilies, hotspots were located mainly in tropical biomes in south-eastern regions, while SR was low in dry areas of the north. The speciose regions for these latter subfamilies were Sierra Madre Occidental, Oriental and Sur, Mesoamerica, Petenes, and the Trans-Mexican Volcanic region; only a single subfamily, Pseudomyrmecinae, was also speciose in Yucatan (Fig. 2).

Corrected weighted endemism
Hotspots of endemism in Mexico were in the southeast region and mountain systems: Mesoamerica, Petenes, Balsas, Sierra Madre Oriental, Occidental, and Sur regions. From centre to north, high endemism was found in the Trans-Mexican Volcanic region, and the Sonoran and Chihuahuan deserts (Fig. 3; Table 3).

Species richness and endemism of ecomorphs
Higher richness of hypogeic and arboreal ecomorphs (Fig. 4) was found in south-eastern Mexico in regions such as Mesoamerica, Petenes, and Sierra Madre Oriental. Medium values of richness were found in Trans-Mexican Volcanic and Sierra Madre Occidental. Subterranean and epigeic ecomorphs exhibited high to intermediate values of SR in Mesoamerica, Petenes, Sierra Madre Oriental and Occidental, Trans-Mexican Volcanic, and the Sonoran and Chihuahuan deserts (Fig. 4). Subterranean ecomorphs also showed high values of SR in the Sinaloa dry forest. Endemism of hypogeic and arboreal ecomorphs showed intermediate values across Mexico, and high values in Mesoamerica, Petenes, and Sierra Madre Oriental. Arboreal ecomorphs also exhibited high CWE values in the California Floristic Province (Fig. 4). Moreover, both subterranean and epigeic  Table 4). We found a weak linear relationship (R² = 0.13) between CWE and SR in North America (F = 20,269, p < 0.001).

Discussion
We have illustrated how ant assemblages are structured within the context of the main ecoregions of Mexico. We showed that the ecoregions in the southeast and the principal mountain systems represented the most important hotspots of SR for the different ant subfamilies. The high SR in the mountain systems of Mexico might be because these are transition zones, where species of different groups, including the entomofauna from the Nearctic and Neotropical regions, have mixed (Halffter and Morrone 2017). In addition, high species richness in the tropical biomes of southeastern Mexico might be explained by the fact that ants originated and diversified in tropical angiosperm-dominated forests in the Neotropics (Moreau and Bell 2013). Mesoamerica and the south of Mexico had already been identified as hotspots for ant SR where robust sampling is not an issue (Kass et al. 2022). In addition, we found that the Sonoran and Chihuahuan deserts are regions with high ant SR.
Overall, we found only a weak relationship between SR and CWE for Mexico. Nevertheless, a few ecoregions in Mexico presented high values of both SR and CWE. For instance, mountain systems (i.e., Sierra Madre Occidental, Oriental, Sur, and Trans-Mexican Volcanic regions), the southeast of Mexico (i.e., Mesoamerica and Petenes regions), and the Chihuahuan desert share high values of both SR and CWE, whilst Altiplano is rich in ant species, and Balsas and the Sonoran desert are important for endemic ants. Identifying endemism hotspots not only helps to understand the physical and biological drivers of endemism but also to prioritise certain broad-scale regions for conservation, given that endemic or range-limited species often act as "indicator species" for their local biological communities (Shipley and McGuire 2022). The mismatch between SR and CWE hotspots is expected because CWE emphasises areas with a high proportion of taxa with restricted distributional ranges, which are not necessarily high in species richness (Crisp and Linder 2001). Nevertheless, our results highlight that some ecoregions in Mexico are high in both species richness and endemism. This is likely due to the combination of biogeography and diversity of biomes, which allows the transition of biotas (Morrone 2015) with different origins and different climatic and environmental preferences. The result is regions that are not only rich in species but also hold a high proportion of species of restricted distribution. In this regard, the conservation priority of these particular ecoregions is high.
We found hotspots of higher SR for hypogeic and arboreal ants especially in the tropics, while for epigeic and subterranean ants, hotspots occurred in the tropics and deserts. In ants, morphological adaptation to an ecological niche may represent an ecomorph. These ecomorphs may have evolved across lineages (convergence) that are phylogenetically disparate but that share a suite of specific traits to occupy available niches (Sosiak and Barden 2021). The ecomorphs in our study (i.e., nesting and foraging niche) therefore seem to recover suites of traits that are linked to ecology and functional role. Accordingly, hypogeic and arboreal ants, as well as epigeic and subterranean ants, each share a set of traits (Sosiak and Barden 2021) that drive their species distribution. The definition of biodiversity 'hotspots' has been intuitive (Thompson et al. 2021). However, studying functional groups can be helpful to assess how a guild responds to disturbance, as guild members may have similar functional roles in the assemblages (Cardoso et al. 2011). Thus, this functional information can help to construct a more robust hotspot definition. Furthermore, we offer new insights into the distribution of functional role hotspots that contrast with values of taxonomic richness. Tropical biomes in ecoregions such as Mesoamerica, Petenes, Sierra Madre Oriental, Trans-Mexican Volcanic, and the Sierra Madre Occidental are remarkable because they combine high SR values for different subfamilies and all ecomorphs. Moreover, these regions also maintained a high proportion of hypogeic and arboreal species with small ranges, which is not the case for subterranean and epigeic ants. Overall, Mexico is poor in endemism for subterranean and epigeic ants, but rich in species, not only in tropical biomes but also in the Sonoran and Chihuahuan deserts.
Currently, tropical ecosystems hold many hotspots for all ant functional roles, but deserts are another important ecosystem for ant conservation, particularly for epigeic and subterranean ants.
Finally, our study has implications for conservation by adding information on centres of SR and CWE for taxonomic groups and functional roles of ants. Our results also suggest a congruence with hotspots for other taxonomic groups in Mexico, such as vertebrates (Myers et al. 2000; Kass et al. 2022), and for functional roles (Brodie et al. 2021). In this regard, our results indicate that future hotspot-based conservation of ants should concentrate on the tropics and some deserts. In Mexico, all orders of insects have been sampled more intensively since the end of the last century (1981-2020). However, further sampling and taxonomic work is still needed (Rocha-Ortega et al. 2021). Given that taxonomic resolution for ants at the species level is likely to be uneven and/or biased, much work is needed (Dáttilo et al. 2019). The protection of ants is urgent given the multiple threats they currently face (see Rocha-Ortega et al. 2021). In particular, studies should focus on the north of Mexico, while protection should be emphasized in both desert and tropical forest biomes.
Acknowledgements
The authors thank Jorge León-Cortés for the invitation and Wesley Dáttilo for sharing ant data. This paper was partly financed by a PAPIIT project (Grant No. IN204921).
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.