Introduction

Spatial variation of phenotypes within species forms a fundamental basis of trait evolution. Critical biotic and abiotic factors inevitably vary based on fine-scale and large-scale spatial modifiers, interacting to generate the selective landscape for each population. Abiotic climate variables such as temperature and moisture have been well-studied in their potential to lead to rapid evolution across a wide variety of traits, such as morphology and thermophysiology (Finkel et al. 2005; Meiri 2011; Pienaar et al. 2013; Gutiérrez-Pinto et al. 2014; Baken et al. 2020; Velasco et al. 2020). While these climatic predictors indeed predict a great deal of trait evolution, biotic variables tied to space are also instrumental to trait evolution dynamics (Brockhurst et al. 2014). Interspecific competition for space famously drives the radiation of ecomorphology and physiology in Caribbean anoles (Losos 2011; Gunderson et al. 2018; Salazar et al. 2019), and pollinator densities predict environmental suitability for plants (Duffy and Johnson 2017). Frequency-dependent predator selection drives the predator–prey arms races in systems such as garter snakes and newts (Brodie and Brodie 1999), and is important for the evolution of color and pattern across fish (Gordon et al. 2012), butterflies (Mallet and Gilbert 1995; Chouteau et al. 2017; Ruxton et al. 2019), hymenopterans (Chatelain et al. 2023), and frogs (Saporito et al. 2007; Stuckert et al. 2014a; Lawrence et al. 2019). Such biotic drivers can be more difficult to adequately measure compared to abiotic variables, despite their equally important role in determining fitness outcomes. Contextualizing evolution in terms of spatial variation in selection pressure explains many elements of intraspecific variation and sets the stage for understanding the macroevolution of phenotypes. Thus, it is important to account for spatial variation in selection from both abiotic and biotic sources when investigating trait evolution.

One especially well-studied example of spatially driven phenotype evolution is mimicry, in which multiple species evolve common warning signals that advertise unpalatability to predators. Müllerian mimicry is characterized by the convergent evolution of warning signals by co-occurring toxic species (Müller 1878), and has evolved across toxic vertebrate and invertebrate taxa (Springer and Smith-Vaniz 1972; Mallet and Gilbert 1995; Symula et al. 2001; Chiari et al. 2004; Sanders et al. 2006; Chatelain et al. 2023). Maintenance of Müllerian mimicry is driven by predator learning tied to positive frequency-dependent selection: population densities of the model and mimic species determine how often predators encounter a warning signal, and more commonly encountered signals generated by higher population densities of model and mimic species will more effectively prevent predation on aposematic organisms (Endler and Mappes 2004; Sherratt 2008). Higher predator densities should lead to more intense predator selection, contributing toward the evolution of mimicry. However, these dynamics can be difficult to study in many systems because of logistical challenges in accessing research areas, intensity of surveys required to estimate population densities, and issues detecting behaviorally cryptic species to establish the prior knowledge necessary to answer questions. Some empirical studies have unveiled important dynamics of the evolution of mimicry in a limited set of taxa, particularly Heliconius butterflies (Brown and Benson 1974; Mallet 1999; Joron et al. 2006; Van Belleghem et al. 2021). However, fine-scale dynamics of color pattern evolution can be more difficult to assess in understudied or cryptic taxa, leaving open questions about how mimicry evolution interacts with ecological context.

Predictive modeling methods can offer a solution to capturing these multifaceted elements of organismal evolution even in a dearth of knowledge about all variables. Ecological niche modeling can capture undefined elements of an organismal niche (Peterson et al. 2011). Recent advances have seen many studies incorporating aspects of organismal biology into niche models derived from climate to improve their predictions, including physiology (Martínez et al. 2015; Feng and Papes 2017; Bosch-Belmar et al. 2021), life history (Chefaoui et al. 2019), and genetic relatedness (Ikeda et al. 2017; Bothwell et al. 2021). However, aspects of organismal evolution from phylogenetic tools can also be broadly integrated into the ecological niche modeling toolkit (Graham et al. 2004; Evans et al. 2009; Smith and Donoghue 2010; Guillory and Brown 2021; McHugh et al. 2022; Folk et al. 2023). Rather than refining estimates of organismal niches, phylogenetic tools can be used to leverage niche models to answer questions about organismal evolution and generate hypotheses about ecological zones of selection projected into geographic space. In the case of the evolution of Müllerian mimicry, ecological niche models can be combined with phylogenetic methods to estimate how selection pressure from biotic variables like predators may be shaping geographic ranges and phenotypes without measuring the strength of predator selection empirically.

Poison frogs (Family: Dendrobatidae) are a fitting group in which to test the ability of predictive modeling tools to identify geographic zones of Müllerian mimicry. Dendrobatidae is a New World family of frogs, many of which exhibit aposematic coloration and a few of which have evolved mimicry (Grant et al. 2006; Rojas 2017). Poison frog predators differ based on target taxon, geography, and habitat heterogeneity (Maan and Cummings 2012). Most predators are diurnal and suspected to be operating on visual cues; birds, fish, snakes, spiders, crabs, and beetle larvae have all been implicated in preying on poison frogs (Master 1999; Ringler et al. 2010; Alvarado et al. 2013; Rojas 2017). Birds are suspected to exert strongest selection toward convergent warning signals in poison frogs because of their high visual acuity and ability to distinguish aposematic signals (Maan and Cummings 2012). Indeed, clay model studies indicate that avian predation is a risk for poison frogs (Chouteau and Angers 2011; Dreher et al. 2015; Lawrence et al. 2019). Despite these predation pressures likely favoring common warning signals, many poison frog species are highly polytypic (Summers et al. 2003; Brown et al. 2011; Rojas and Endler 2013). Sexual selection is hypothesized to act as a diversifying force on color patterns, pushing against aposematism selection (Maan and Cummings 2008). Poison frogs offer a compelling opportunity to study how the interacting biotic forces of predator selection and sexual selection, combined with dynamics of sympatry, generate color pattern occurrences across geography.

Here, we focus on the genus Ranitomeya, an Amazonian dendrobatid genus currently consisting of 16 recognized species (Brown et al. 2011; Muell et al. 2022). A relatively young clade, Ranitomeya experienced rapid speciation in the Andean lowlands and expanded their range through the Amazonian basin within the last 12 million years (Santos et al. 2009; Muell et al. 2022). Many species are concentrated around central Peru, although some species possess wide distributions reaching eastward to eastern Brazil and French Guiana (Muell et al. 2022). Some Ranitomeya species are highly polytypic (Lorioux-Chevalier et al. 2023), and Müllerian mimicry is suspected to exist in around half the species in the group on the basis of color and pattern (Brown et al. 2011). To our knowledge, there are no published observations of predation on any Ranitomeya, but clay model studies indicate that birds are strong candidates (Chouteau and Angers 2011).

Four color pattern syndromes in particular have repeatedly evolved across Ranitomeya: striped, spotted, redhead, and banded (Fig. 1). Biotic factors in the form of sexual selection and predator selection for aposematism likely exert strong influence on color pattern diversification or lack thereof in poison frogs (Endler and Rojas 2009; Maan and Cummings 2009; Nokelainen et al. 2012). Extensive experimental work has shown that R. imitator has diversified into all four of these color patterns along various axes to mimic three different congeneric species, providing empirical evidence of selection-mediated mimicry (Symula et al. 2001; Yeager et al. 2012; Twomey et al. 2013, 2014, 2020; Stuckert et al. 2014a, 2014b). Other species of Ranitomeya show similar signatures of overlapping ranges and convergent color patterns, such as the redheaded R. reticulata with R. amazonica near Iquitos, Peru, and R. toraro with populations of R. uakarii in the Serra do Divisor region of Brazil (Brown et al. 2011). However, non-correlative evidence of selection in these species of Ranitomeya or any potential mimic pairs across taxa can be difficult to quantify empirically due to limits on access to inhabited areas and detection of individuals in sparse populations. Thus, the presence of resources and repeated possible instances of mimicry make Ranitomeya a fitting system in which to address this question.

Fig. 1

Example images of each pattern type from Brown et al. (2011). In these examples, each image shows a frog that has a ranking of 0 for all other categories. Frogs can be categorized with a nonzero value for multiple categories

In this study, we explore the spatial distribution of color pattern morphotypes in Ranitomeya poison frogs to predict regions of mimicry evolution, using a novel combination of phylogenetic and ecological modeling tools. We use a previously generated time-calibrated tree to calculate rates of phenotypic evolution for the four major recurring color pattern syndromes subject to Müllerian mimicry in Ranitomeya poison frogs. We also estimate ecological niche models for 14 species, plus a paraphyletic population of R. uakarii (Muell et al. 2022), using the most complete geographic occurrence dataset available. Finally, we combine the lineage-specific estimates of color pattern evolutionary rates with morph-derived niche model results to contextualize rates of evolution in ecological space, generating geographic maps that show where the most rapidly evolving lineages are most likely to co-occur for each of the four major phenotypic syndromes. Our study adds to a growing body of literature integrating ecological niche modeling with other methods, and serves as an effective hypothesis generation tool regarding the ecological and evolutionary context under which mimicry evolves. Our modeling approach can be applied to traits across study systems where organismal evolution is strongly tied to space.

Methods

Mimicry maps

We produced a Müllerian mimicry map for each of the major four phenotypes (see Table 1) by joining raster layers containing niche models corresponding to morphs of lineages showing those phenotypes (e.g., spotted individuals in R. imitator). A niche model for a species morph was included in a mimicry map if its corresponding phenotype had a non-zero value for that phenotype. Each morph-level raster layer added to a mimicry map was multiplied by the value of the mean relative rate of phenotypic evolution (see below for calculation) for that phenotype of the lineages representing those populations. If multiple populations of the same species with the same phenotype were represented in the phylogeny, their rates of evolution were averaged to get the weight value for the raster layer represented by that morph. Cumulative maps for each major phenotype display the geographic ranges of Ranitomeya populations with the same phenotypes. High suitability areas are determined according to their rates of evolution in combination with the likelihood of occurrence. This produced a map of environmental suitability for frogs sharing a common phenotypic pattern. Higher suitability values correspond to higher likelihood of observing Müllerian mimicry. Each species’ presence in the mimicry map was weighted by the mean relative rate of phenotypic evolution that that lineage underwent, with lineages of species that diverged greatly equating to higher mimetic selection. Both higher species diversity at a site and higher evolutionary rates will contribute to higher spatial predictions of mimicry suitability. In areas where species diversity is low and evolutionary rates are high, or vice versa, moderate suitability is expected, and areas where both species diversity and evolutionary rates are low will have low suitability. 
Because the generation of mimicry maps is the result of simple multiplication based on average evolutionary rates, we expect that suitability values will behave according to our predictions, though we acknowledge that these multiplications were not tested against other weighting methods (e.g., arithmetic means) and encourage future studies to evaluate effects of different weighting schemes on resulting visualizations of mimicry. Overall, mimicry maps use empirically-derived data to identify areas with the highest potential for mimicry selection within a spatial context, enabling broad comparisons of areas under potential selection in a way framed by genomic context.
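The weighting and joining steps described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: it assumes that morph-level niche models are available as equally sized arrays of suitability values, that "joining" layers means summing them cell by cell, and that cells outside a morph's modeled range are stored as NaN. The function name `mimicry_map` and the toy values are hypothetical.

```python
import numpy as np

def mimicry_map(suitability_layers, relative_rates):
    """Sum morph-level suitability rasters, each weighted by its relative rate.

    suitability_layers: list of 2-D arrays on the same grid (assumed).
    relative_rates: one mean relative evolutionary rate per layer.
    """
    if len(suitability_layers) != len(relative_rates):
        raise ValueError("need exactly one rate per raster layer")
    total = np.zeros_like(suitability_layers[0], dtype=float)
    for layer, rate in zip(suitability_layers, relative_rates):
        # Assumption: NaN marks cells outside a morph's range; count them as 0.
        total += rate * np.nan_to_num(layer, nan=0.0)
    return total

# Toy 2x2 rasters for two hypothetical morphs that share one phenotype.
morph_a = np.array([[0.9, 0.1], [0.0, 0.5]])
morph_b = np.array([[0.8, np.nan], [0.2, 0.5]])
striped = mimicry_map([morph_a, morph_b], [1.0, 0.5])
```

In this toy case the top-left cell scores highest because both morphs are likely to occur there and one of them carries a high rate weight, which matches the expectation that richness and rate jointly raise predicted suitability.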

Table 1 Description of definitions for ratings of each major pattern type

Quantifying color patterns

We aimed to analyze as many individuals as possible, with a minimum target of 10 individuals per species-morph combination; however, in some cases morphs were categorized based on few observations (Table S1). Because standardized photos from all representative populations do not exist, we could not use reflectance values to quantify color patterns. Instead, we divided color pattern into four different gross categories based on the repeatedly evolved syndromes: striped, spotted, redhead, and banded (Brown et al. 2011). For each population of frogs represented in our phylogeny, we used an image of the most common phenotype found in a sampled population to assign the population a pattern value of 0, 1, or 2 for each of the phenotypic syndromes (see Table 1 for definitions). Generally, a value of ‘0’ for a given category refers to an absence of that phenotype, a ‘1’ refers to a partial display of characteristics of the category, and a ‘2’ refers to a color pattern strongly signaling that category of the pattern. By assigning each population a value for all four phenotypes, we are able to characterize frogs with patterns containing features of multiple categories (e.g., a partial redhead with stripes down the dorsum). We derived images for populations from Brown et al. (2011) as well as georeferenced online occurrence data in cases of more recently sampled populations. Most analyzed photographs were taken with a Nikon D90 DSLR with a 100 mm macro lens at a distance of 10–20 cm with an R1 Wireless Close-up Speedlight System. Color correction cards were included in photographs when possible, and images were color corrected accordingly.

In any instance where populations of a species did not share the same combination of scores for each category, we identified a different morph (Table S1). For example, if one population of R. variabilis exhibited a ‘2’ for spots while another population exhibited a ‘0’ for spots, we would separate them into two different morphs. While intrapopulation variation does exist, we chose to represent the most common color pattern of a given population to characterize the most common color pattern predators encounter, which is most likely to be under positive frequency-dependent selection for mimicry. The broad categorization of the phenotypes into discrete categories allows us to represent intrapopulation variation by the dominant color pattern exhibited that encompasses all intrapopulation variation, rather than potentially biasing intraspecific representation in the phylogeny by quantifying only a single representative photo. Although we did not utilize image software for extremely fine-scale phenotype ratings, visual modeling indicates that human perceptions of texture and color for poison frogs are similar to those of major predators (Barnett et al. 2018). Overall, our characterizations should generally assess a predator-level view of dorsal and head patterns.
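The morph-splitting rule can be illustrated with a short Python sketch. It assumes each population has already been given a four-value score in the order (striped, spotted, redhead, banded); the function name `assign_morphs`, the population labels, and the scores are hypothetical and are not taken from Table S1.

```python
from collections import defaultdict

CATEGORIES = ("striped", "spotted", "redhead", "banded")

def assign_morphs(population_scores):
    """Group populations of one species by identical four-category scores.

    population_scores: dict mapping population name -> tuple of four
    integers in {0, 1, 2}, ordered as in CATEGORIES.
    """
    morphs = defaultdict(list)
    for population, scores in population_scores.items():
        if len(scores) != len(CATEGORIES) or any(s not in (0, 1, 2) for s in scores):
            raise ValueError(f"invalid scores for {population}: {scores}")
        # Populations with the same score combination belong to one morph.
        morphs[tuple(scores)].append(population)
    return dict(morphs)

# Hypothetical populations of a single species, for illustration only.
example = {
    "pop_A": (2, 0, 0, 0),  # fully striped
    "pop_B": (0, 2, 0, 0),  # fully spotted
    "pop_C": (2, 0, 0, 0),  # fully striped, same morph as pop_A
}
morphs = assign_morphs(example)
```

Here the species would be split into two morphs, one striped (pop_A and pop_C) and one spotted (pop_B).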

Evolutionary rate estimation

We used BAMM to estimate rates of evolution for each of the four phenotype categories separately (Rabosky 2014). We considered each of the four color pattern categories as separately evolving traits to identify core rate shifts within all estimated rate configurations among lineages. Studies on melanin gene expression suggest that different genes and pathways are responsible for the locations of melanized dorsum areas among color morphs in Ranitomeya and other genera within Dendrobatidae (Stuckert et al. 2019; Rodríguez et al. 2020; Linderoth et al. 2023). For this reason, it is plausible that genomic targets of selection vary among our four phenotypic syndromes. Furthermore, many populations represented in our phylogeny have color patterns ranked with non-zero values for more than one of the four color pattern categories, and separating rate analyses for each morph allows us to evaluate how melanized regions in those configurations contribute independently to a given population’s color pattern, capturing more variation in color patterns overall. Non-binary discrete traits such as our color pattern categorizations can be difficult to model because the discrete nature of trait categorization produces trait distributions that violate assumptions of Brownian motion, preventing the model from converging. To solve this issue, we simulated random values bounded from −0.125 to 0.125 and added them to our discrete trait values for each color morph. Adding these simulated values allows the model to converge by creating enough noise for the Brownian motion model without compromising assumptions of rate shifts, because the differences in the random noise are much smaller than a shift in a discrete value.
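The noise-addition step can be sketched as follows. The text states only the bounds of the simulated values, so the uniform distribution used here is an assumption, as are the function name `jitter_scores`, the seed, and the example scores.

```python
import numpy as np

def jitter_scores(scores, half_width=0.125, seed=0):
    """Add small bounded noise to discrete 0/1/2 trait scores.

    Assumption: noise is drawn uniformly from [-half_width, half_width);
    the source specifies the bounds but not the distribution.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    noise = rng.uniform(-half_width, half_width, size=scores.shape)
    return scores + noise

# Hypothetical tip scores for one color pattern category.
tip_scores = [0, 0, 1, 2, 2, 0]
jittered = jitter_scores(tip_scores)
```

Because the noise never exceeds 0.125 in magnitude, any two tips with different discrete scores remain at least 0.75 apart after jittering, so rounding recovers the original categories.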

We used a previously generated time-calibrated phylogenetic tree for Ranitomeya by Muell et al. (2022) for all analyses, which contains 65 tips and two outgroups and contains the best sampling of Ranitomeya, as well as being based on the most robust genomic dataset to date, broadly spanning representative variation in geography and color patterns. We note that we did not test how sensitive our approach was to sampling bias in the distribution of tips, as can occur for analyses of speciation rates in BAMM (Sun et al. 2020), and encourage future studies to be wary of sampling bias in evolutionary rate conclusions. For each morph, we determined our priors using the ‘setBAMMpriors()’ function in the BAMMtools R package (Rabosky et al. 2014). This function used our trait data to estimate suitable priors for the beta initial value and beta shift values. We used default, conservative priors for all MCMC functions: we specified 1 expected core rate shift for all traits; initial beta values (mean parameter of exponential) of 0.751 (striped), 1.698 (spotted), 1.499 (redhead), and 2.503 (banded); beta shift prior of 0.078 for all traits (standard deviation of rate shift parameter), and uniform prior density on our distribution of ancestral character states. The initial beta is calculated as five times the maximum likelihood estimate of the variance of a Brownian motion model fit to the tip states, causing the value to change among phenotypes based on tip state distributions. We chose these conservative priors because Ranitomeya is a relatively young clade at around 12 million years since the most recent common ancestor, meaning that the default assumption would be for very few core shifts to occur over a short evolutionary timespan. 
We computed the 95% credible shift set to determine how many of the core rate shift configurations account for 95% of the observed samples from the posterior, and assessed support for the best distribution by observing which rate shift configurations occurred most often in the posterior samples. We also assessed support for how many core rate shifts occurred by calculating Bayes factors for each candidate number of shifts and determined the best rate shift configuration candidate based on differences in Bayes factors. Rate shifts are considered core if their marginal odds ratio is higher than 5, meaning they are at least 5 times as likely to be observed as the prior when accounting for the length of a branch (Shi and Rabosky 2015). We ran each analysis for 30 million generations to reach convergence and we used BAMMtools to summarize posterior distributions. We calculated branch-specific evolutionary rates at the tips of the phylogeny for each phenotype category, which we use to represent extant rates of evolution. For each color pattern category, we then subsetted the branch-specific evolutionary rates only to populations that had a nonzero value for that category, and scaled the remaining distribution of evolutionary rates based on distance to one another to quantify a set of relative evolutionary rates only for tips in the phylogeny that showed that color pattern. These scaled relative rates were used in mimicry map generation, such that raster layers are scaled to communicate how much more quickly frogs are evolving compared to one another, rather than an absolute value.
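The subsetting and rescaling of tip rates can be sketched in Python. The exact scaling formula is not given in the text, so this sketch assumes division by the largest retained rate, which preserves how much faster one lineage evolves than another; min-max scaling or division by the mean would be other plausible readings. The averaging of tips within a morph follows the description in the mimicry map section. All names and values are hypothetical.

```python
from statistics import mean

def relative_rates(tip_rates, tip_scores):
    """Keep tips with a nonzero score and rescale their rates to a maximum of 1.

    tip_rates: dict tip -> branch-specific rate at the tip.
    tip_scores: dict tip -> 0/1/2 score for one color pattern category.
    Assumption: scaling divides by the largest retained rate.
    """
    kept = {tip: rate for tip, rate in tip_rates.items() if tip_scores[tip] > 0}
    if not kept:
        return {}
    top = max(kept.values())
    return {tip: rate / top for tip, rate in kept.items()}

def morph_weight(relative, tips_in_morph):
    """Average the relative rates of all tips representing one morph."""
    return mean(relative[tip] for tip in tips_in_morph)

# Hypothetical tips: t3 lacks the phenotype and is dropped before scaling.
rates = {"t1": 0.2, "t2": 0.4, "t3": 0.1}
scores = {"t1": 2, "t2": 1, "t3": 0}
rel = relative_rates(rates, scores)
weight = morph_weight(rel, ["t1", "t2"])
```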

Niche modeling

We used a total of 644 points for Ranitomeya from a long-term dataset collected by ourselves and collaborators to build our niche models. These points broadly represent the genetic, morphological, and geographic diversity observed in the genus and include all species. We used 19 climate layers from CHELSA as our climate variables (Karger et al. 2017), as well as 9 soil layers from SoilGrids.org (Poggio et al. 2021). All environmental data were at 30 arc-second resolution (ca. 1 km). Ecological niche models were generated in MaxEnt 3.4.3 (Phillips et al. 2006) as implemented in SDMtoolbox version 2.0 (Brown et al. 2017) in ArcGIS version 10.7.1. We rarefied our dataset by removing all points that were within 10 km of another point from the same species, to avoid biasing our models toward more highly sampled areas. We implemented minimum convex polygons with buffer distances of 200 km and produced bias files for each species in our analysis to minimize the likelihood of our models returning misleadingly high AUC scores. Bias files limit the geographic area for targeted background selection to areas around occurrence points, preventing the model from identifying suitable areas that are impossible to colonize. We performed spatial jackknifing with a 10th percentile training presence to cross-validate model performance and minimize overfitting for all species for which at least 15 occurrence points were retained after rarefying the dataset, and spatially subsampled all species with fewer than 15 points available. Two species, R. defleri and R. yavaricola, had fewer than 5 occurrence points, so instead of building predictive models we created 20 km buffers around all known occurrences and treated these areas as suitable regions. 
Lastly, we evaluated eight regularization multipliers (0.5, 1, 1.5, 2, 2.5, 3, 4, 5) and five combinations of model feature class complexities (linear; linear and quadratic; hinge; linear, quadratic and hinge; linear, quadratic, hinge and product) to fine tune optimal model parameters for each species (Shcheglovitova and Anderson 2013; Radosavljevic and Anderson 2014).
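The 10 km rarefaction step can be illustrated with a simple greedy filter. This is a sketch of the general idea only; it is not the SDMtoolbox routine, whose algorithm may retain a different subset of points. It assumes coordinates in decimal degrees and great-circle (haversine) distances, and the species labels and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def rarefy(points, min_km=10.0):
    """Greedily drop points closer than min_km to a kept point of the same species.

    points: list of (species, latitude, longitude) tuples.
    """
    kept = []
    for species, lat, lon in points:
        too_close = any(
            kept_species == species and haversine_km(lat, lon, klat, klon) < min_km
            for kept_species, klat, klon in kept
        )
        if not too_close:
            kept.append((species, lat, lon))
    return kept

# Hypothetical records: the second sp1 point is about 1 km from the first.
records = [
    ("sp1", -3.75, -73.25),
    ("sp1", -3.76, -73.25),
    ("sp2", -3.76, -73.25),
    ("sp1", -4.00, -73.25),
]
thinned = rarefy(records)
```

Points from different species are never compared, so co-occurring species keep their records even when they share a locality.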

Once we estimated species-level niche models, we split each species-level niche model based on morph categorization. Populations within a species described with the same four-digit categorization for morphs described above were retained in the same niche model, and populations with differing color pattern categorization ratings were split out based on geography into their own separate niche model, using methods from Rosauer et al. (2009). Briefly, these methods constrain the niche model to grid cells encompassing only the occurrence points in the combined range of the frogs within a species containing the same color pattern categorization, applying a sliding window approach (or inverse distance weighting) to assign morphs to grid cells where morphs are contained, allowing niche models to be constrained only to the ranges of phylogenetically-delimited subunits (Laffan and Crisp 2003). This strategy allows us to retain suitable habitat estimates for each morph based on species-level environmental associations, rather than calculating separate niche models for each morph and making a false assumption that different color patterns correspond to different physiologies and behaviors.
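The idea of constraining a species-level model to morph ranges can be sketched as below. This is a deliberate simplification and not the procedure of Rosauer et al. (2009): each grid cell is assigned to the morph of its nearest occurrence point, using planar coordinates, in place of the sliding window or inverse distance weighting described above. The function name, coordinates, and values are hypothetical.

```python
import numpy as np

def split_by_morph(suitability, cell_xy, occ_xy, occ_morph):
    """Split one species-level suitability raster into per-morph rasters.

    suitability: 2-D array of species-level suitability.
    cell_xy: (n_cells, 2) array of cell-center coordinates, in row-major order.
    occ_xy: (n_occ, 2) array of occurrence coordinates (planar, assumed).
    occ_morph: list of morph labels, one per occurrence.
    Assumption: each cell belongs to the morph of its nearest occurrence.
    """
    # Pairwise distances between cells and occurrences: shape (n_cells, n_occ).
    dist = np.linalg.norm(cell_xy[:, None, :] - occ_xy[None, :, :], axis=2)
    nearest = np.asarray(occ_morph)[dist.argmin(axis=1)]
    flat = suitability.ravel()
    rasters = {}
    for morph in sorted(set(occ_morph)):
        # Keep species-level suitability inside the morph's cells, zero elsewhere.
        rasters[morph] = np.where(nearest == morph, flat, 0.0).reshape(suitability.shape)
    return rasters

# Toy 1x4 grid with one striped record at x=0 and one spotted record at x=3.
suitability = np.array([[0.2, 0.4, 0.6, 0.8]])
cell_xy = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
occ_xy = np.array([[0.0, 0.0], [3.0, 0.0]])
results = split_by_morph(suitability, cell_xy, occ_xy, ["striped", "spotted"])
```

Each morph raster keeps the species-level suitability values within its own cells, so the environmental associations are shared across morphs and only the geographic extent differs.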

Results

Evolutionary rate estimation

All of our BAMM analyses converged, with effective sample size values for post burn-in number of shifts configurations and post burn-in log-likelihood well over 200. The striped phenotype showed the least heterogeneity in evolutionary rate (Fig. S1). The maximum shift credibility configuration, which is the number of shifts with the highest marginal probability within the 95% credible shift set, was 0 rate shifts on the phylogeny (70.6% of posterior), whereas the core rate shift configuration with the second- and third-highest support were for 1 rate shift on two different branches (8.2% and 8.1% of the posterior respectively). In the most common rate shift configuration, while there were 0 core rate shifts observed, a small yet uniform gradual increase in evolutionary rate was detected across the phylogeny. The second-highest supported configuration contained a single rate increase leading to the clade containing R. imitator and its sister clade containing R. vanzolinii, R. yavaricola, and R. cyanovittata, and the third-highest supported included a single decrease in rate on the branch leading to R. defleri. Striped frogs represent the majority of populations sampled in the tips of our tree and are evenly spread across the phylogeny, possibly contributing to estimation of 0 rate shifts total.

For the spotted phenotype, we found highest support for 10 core shifts (6% of posterior; Fig. S2) and second-highest support for 11 shifts (2% of posterior). This higher variability in rate shift configurations may be due to the localization of spotted phenotypes across the phylogeny at short tips and few sister populations with a common spotted phenotype, leading to many possible locations on branches with high marginal odds ratios that could be the location of rate shifts (Fig. S2). The six most recent shifts occurred on terminal branches, all leading to populations that show more spots in comparison to closely related lineages. The 10 core rate shifts detected are all increases in rate proportionate to existing background rates. The highest supported rate shifts (marginal odds ratios > 70) occurred on branches leading to spotted R. imitator, the two R. vanzolinii, and spotted R. variabilis near Saposoa, Peru.

In the redhead phenotype, which is similarly dispersed across the phylogeny compared to the spotted phenotype, we found the highest support for 7 core rate shifts (30% of posterior) and second-highest support for 7 core rate shifts in a slightly different configuration (9% of posterior). These rate shifts all occurred more recently in time, mostly taking place on a branch leading to a single population within a species exhibiting the redhead phenotype, with the exceptions of one rate shift taking place on the branch leading to all R. fantastica and R. summersi populations as well as another shift leading to all R. benedicta populations (Fig. S3). Lastly, for the banded phenotype, we found the highest support for 3 core shifts (27% of posterior) and second-highest support for 2 rate shifts (21% of posterior). These core shifts took place leading to the branch with the banded R. imitator, one population of R. fantastica, and a branch leading to 3 of the 4 R. summersi populations (Fig. S4).

Mimicry maps

After rarefying our dataset, we retained a total of 319 occurrence points to generate our niche models. Of the species for which we generated niche models, 7 species had 15 or more points and underwent spatial jackknifing, while 7 had fewer than 15 datapoints and were subsampled instead. Many of these latter 7 species have insular ranges, which led to the low number of datapoints retained after rarefaction. The AUC scores for all our models were high (average of 0.793) and omission error rates were low (average of 0.17), indicating high model performance between training and test data for most species (see Table S2 for species-specific metrics). Seven species had variable pattern categorization across their range and were split by geography: R. amazonica had 3 separate maps (one striped, one partially striped, one redhead), R. fantastica had 3 maps (one striped, one redhead, one partially banded), R. imitator had 4 maps (one for each phenotype), R. reticulata had 2 maps (one redhead, one redhead partially spotted), R. summersi had 3 maps (one banded, one partially redhead banded, one not matching any phenotype), R. variabilis had 2 maps (one striped and one spotted), and R. ventrimaculata had 2 striped maps. The remaining species each had a common color pattern across populations.

Each of our four mimicry maps revealed areas of high suitability for mimicry selection zones across the Ranitomeya distribution in South America (Figs. 2 and 3). We included 15 niche models of 13 species in the striped model (one representative for each putative species, except for R. amazonica and R. ventrimaculata, which each had two striped morphs). The striped mimic map shows 3 localized zones with the highest suitability for mimic populations. The first mimic zone flanks the Amazon River for around 50 km north and 50 km south of Iquitos, Peru in the Department of Loreto, containing partial ranges of R. amazonica, R. flavovittata, R. reticulata, some R. uakarii, some R. variabilis, and R. ventrimaculata. The second mimic zone occurs south of the first zone, on the border of Peru and Brazil near Acre, Brazil and Divisora, encompassing partial ranges of R. cyanovittata, southern R. uakarii, R. toraro, and R. variabilis. The third mimic zone is in south-central Ecuador in the Morona-Santiago and Pastaza provinces, and only contains fully striped R. amazonica and fully striped R. ventrimaculata. Of these zones, the mimic zones in Peru show about double the suitability for mimicry compared to the zone identified in Ecuador. For the spotted map, we used 8 niche models from 8 species and identified one major mimic zone in San Martín, Peru, where R. imitator and R. variabilis overlap. For the redhead map, we used 8 niche models from 7 species (two R. amazonica maps). The redhead map identified two major mimic zones, one of which spans the range of R. imitator and R. fantastica in San Martín (similar to the spotted R. imitator range), and the other of which occurs north of the Amazon River east of Iquitos, Peru in the Department of Loreto, in the range of R. reticulata and redhead R. amazonica. Lastly, for the banded map, we used 3 niche models from 3 species. 
The map is restricted to the central San Martín region, and we similarly identified one major hotspot near Chazuta in San Martín, Peru in the overlapping range of banded R. imitator and R. summersi.

Fig. 2

A Spatial mimetic phenotype prevalence. Warmer colors depict spatial areas where the most rapidly evolving morphs of species are predicted to occur. This map is a composite of the spotted, redhead, and banded spatial mimetic selection predictions. The striped morph was not included here because of the general lack of recent selection on this phenotype (note exception in box C). Arrows and box depict regions of highest predicted mimetic selection. Globe depicts extent of study. B Map of species richness in Ranitomeya. C Example of the observed mimetic radiations in San Martín, Peru. The species depicted are R. imitator (I as spotted, striped and banded phenotypes), R. variabilis (V as both striped and spotted phenotypes), R. summersi (S as banded phenotype) and R. fantastica (F as striped phenotype).

Fig. 3

Spatial mimetic phenotype prevalence for each phenotype modeled. Warmer colors depict spatial areas where the most rapidly evolving morphs of Ranitomeya species are predicted to occur. A banded, B spotted, C redhead and D striped

Discussion

In this study, we broadly assessed the prevalence of four mimetic phenotypes in Ranitomeya poison frogs using a novel methodological approach. We estimated rates of phenotypic evolution and built ecological niche models for all species of Ranitomeya poison frogs to identify the geographic areas in South America where Müllerian mimicry is most likely to evolve in this genus. Mimicry maps weighted by rates of phenotypic evolution highlight the geographic areas with the highest probability of mimetic selection among species. They recover known areas of mimetic selection between R. imitator and its congeners (Fig. 2C), as well as two previously suspected areas of mimetic selection: between redhead R. amazonica and R. reticulata near Iquitos, Peru, and between R. uakarii and R. toraro in the Serra do Divisor in western Brazil (Brown et al. 2011; Figs. 2A and 3). One additional potential mimic zone was identified for striped R. amazonica and R. ventrimaculata in Ecuador, pointing to an area where future empirical studies on selection for mimicry could be concentrated. These results also show where predicted mimicry declines around each mimic zone, generating hypotheses for further empirical study, such as surveying predator densities or measuring gradients in how frequently color patterns match a mimic phenotype.

Our methods identified the geographic areas where Müllerian mimicry was most likely. Predation pressure may be higher at these localities, which could be due to either increased predator densities or increased detectability of prey. Increased predator densities could be driven by biotic factors, such as an increase in prey resources other than frogs. Increased detectability could be driven by abiotic factors, for example, a more open structural microhabitat that allows for more rapid detection of frogs, particularly by avian predators (Andersson et al. 2009). Alternatively, populations of co-mimic species may be at higher densities in these localities, ultimately allowing for accelerated predator learning (Allen and Greenwood 1988; Endler and Greenwood 1988). A synergistic combination of any of the above factors may also be possible. Comparing predator–prey dynamics through field ecology studies at the sites where predicted mimetic selection is highest versus lowest will be the most effective means of testing the relative importance of each factor in driving mimicry evolution. Species richness also corresponded tightly to areas of high mimetic phenotype prevalence (r² = 0.89 for the composite map of spotted + redhead + banded phenotypes and r² = 0.94 for the composite of all four phenotypes), which is not unexpected (Fig. 2B). However, it does mean that these predictions are less sensitive to regions with lower species diversity, which are numerically downweighted because each species’ values are summed. Mimetic maps in which pixels were scaled by species richness (i.e., dividing each pixel by its corresponding species richness value) also support the main areas of mimetic selection (Figure S7). Moreover, because mimetic selection requires two or more species to mimic one another, communities containing multiple aposematic species are inherently more likely to evolve Müllerian mimicry than regions containing only one or a few toxic aposematic species.
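
The map arithmetic discussed in this paragraph can be illustrated with a minimal sketch. The Python snippet below sums per-species suitability rasters weighted by a relative evolutionary rate, divides each pixel by species richness, and computes r² between richness and the composite. The toy arrays, the 0.5 presence threshold used to derive richness, and all function names are hypothetical and chosen only for illustration; they are not the data or the exact pipeline used in this study.

```python
import numpy as np

def composite_mimicry_map(suitability, rate_weights):
    """Sum per-species suitability rasters, each multiplied by its rate weight."""
    suitability = np.asarray(suitability, dtype=float)  # shape (n_species, rows, cols)
    weights = np.asarray(rate_weights, dtype=float)     # shape (n_species,)
    return np.tensordot(weights, suitability, axes=1)   # shape (rows, cols)

def species_richness(suitability, threshold=0.5):
    """Count species at each pixel; the presence threshold is an assumption."""
    return (np.asarray(suitability, dtype=float) >= threshold).sum(axis=0)

def richness_scaled(composite, richness):
    """Divide each pixel by its richness; pixels with no species are set to 0."""
    composite = np.asarray(composite, dtype=float)
    richness = np.asarray(richness)
    out = np.zeros_like(composite)
    np.divide(composite, richness, out=out, where=richness > 0)
    return out

def r_squared(x, y):
    """Squared Pearson correlation between two rasters, compared pixel by pixel."""
    r = np.corrcoef(np.ravel(x), np.ravel(y))[0, 1]
    return r ** 2

# Hypothetical 2 x 2 suitability rasters for two species.
suitability = np.array([
    [[0.9, 0.1], [0.6, 0.0]],
    [[0.8, 0.7], [0.2, 0.0]],
])
rate_weights = [1.0, 0.5]  # hypothetical relative evolutionary rates

composite = composite_mimicry_map(suitability, rate_weights)
richness = species_richness(suitability)
scaled = richness_scaled(composite, richness)
fit = r_squared(richness, composite)
```

Because the composite is a sum across species, pixels with more overlapping species accumulate larger values by construction, which is why a richness-scaled version is a useful cross-check.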

Our evolutionary rate estimates provide insights into the evolution of aposematism by identifying branches on which rate changes may have occurred. The striped phenotype appears to be the most probable ancestral phenotype, which is supported by ancestral character estimation (Muell and Brown unpub. data). Thus, in most cases, but not all, lineages diverged from this phenotype (Figs. S1 and S8). The most notable exception is the likely loss of bright dorsal coloration in R. fantastica from nearby Pongo de Cainarachi (see striped frogs in Fig. 1C). In most other locations, R. fantastica has considerably more intense and complete orange coloration throughout the head. While this could be a result of female mate choice rather than selection for aposematism, the co-occurrence of this phenotype with other mimetic phenotypes strongly suggests that it was driven by mimetic selection (Brown et al. 2011). Biogeographic simulations could reveal whether identified rate shifts occur around the same time that ancestral lineages of Ranitomeya (or other dendrobatids that may share visual signals) arrived in sympatry with one another (Landis et al. 2021, 2022), which would support the hypothesis that predator-induced frequency-dependent selection increases rates of evolution and maintains Müllerian mimicry. Evolutionary rates that are independent of novel range overlap and unassociated with any abiotic variable could suggest that other factors, such as sexual selection or the introduction of novel predators, are driving rates of evolution toward certain phenotypes. Testing these hypotheses using additional modeling in Ranitomeya, as well as analyses in other dendrobatid genera (or other aposematic taxa), can help address the overall biotic and abiotic factors that contribute to the evolution of Müllerian mimicry.

With the exception of the striped phenotype, we detected a much higher number of core rate shifts across posterior distributions of our traits than typically observed for a tree with only 67 tips. Color pattern can be a highly labile trait, meaning we would expect higher numbers and higher frequencies of rate shifts, consistent with the patterns we observed in all morphs except the striped phenotype. Higher rates of evolution may be captured by traits that had evolved to one state in a lineage and later reverted to another state, which in this case would depend on phylogeographic histories. However, our color pattern quantification may have mischaracterized phenotypes by discretizing inherently continuous variables, leading to misleadingly high numbers of rate shifts. Quantifying color patterns using morphometrics could reveal whether this is the case. Additionally, if Ranitomeya color patterns continue to be treated as discrete, rates of evolution might be modeled more accurately using a method explicitly designed for discrete traits, such as threshold models or continuous-time Markov models (Pagel and Meade 2006; Felsenstein 2012). Ultimately, because we rely on scaled relative rates rather than absolute rates, our mimicry maps should be robust to imperfect modeling of phenotype evolution. Regardless, color pattern rate shifts should be interpreted on the basis of multiple competing best rate shift configurations produced by BAMM and validated with additional methodological approaches designed explicitly for modeling discrete character evolution.
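
To illustrate why relative rates can buffer some modeling error, the short Python sketch below rescales a set of rates by their maximum and shows that a uniform proportional error in the absolute rates leaves the scaled values unchanged. Dividing by the maximum is an assumed scaling rule adopted here for illustration, and the rate values and function name are hypothetical; the exact scaling used in this study may differ.

```python
import numpy as np

def relative_rates(rates):
    """Scale rates so the fastest lineage has weight 1 (assumed scaling rule)."""
    rates = np.asarray(rates, dtype=float)
    return rates / rates.max()

# Hypothetical absolute rates for three lineages.
rates = np.array([0.02, 0.10, 0.05])

# Suppose the model overestimates every absolute rate by the same factor.
overestimated = 3.0 * rates

# The proportional error cancels, so the relative weights are identical.
same_weights = np.allclose(relative_rates(rates), relative_rates(overestimated))
```

This invariance holds only for errors that are proportional across lineages. Errors that change the ranking or the ratios of rates among lineages would still alter the resulting weights.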

Mimetic selection is not the only possible driver of color pattern evolution. Certain genetic architectures may also lead to a recurrence of similar phenotypes under common environments independent of selective pressure. For example, elevation is associated with recurrence of large spots in highland populations of R. variabilis, possibly suggesting a role of developmental plasticity (Brown et al. 2011), where there is physical linkage between genes controlling phenotypes and genes associated with metabolism that are selected for in cooler montane habitats. The specific genetic pathways responsible for color pattern production might also vary based on evolutionary history (Twomey et al. 2023). Further, we caution against assuming that predator selection is always driving color pattern evolution. Depending on the strength and predictability of predator selection, selection exerted by abiotic factors (or other biotic factors such as resource availability) could be affecting alleles unrelated to color pattern more strongly, and genes responsible for color patterns could be pulled along by linkage. Pioneering studies on the functional genomics of color pattern will be essential to determining the role of genetic architecture in color pattern evolution (Stuckert et al. 2019, 2021; Rodríguez et al. 2020; Twomey et al. 2020; Linderoth et al. 2023). Isolation by distance is also likely responsible for color pattern variation across populations in R. imitator (Twomey et al. 2015, 2016). Many Ranitomeya species occupy much larger geographic ranges than R. imitator (Brown et al. 2011; Muell et al. 2022), meaning there is high potential for isolation by distance to affect color patterns in Ranitomeya species that have yet to be investigated. Despite the implicit role of predator selection in driving mimicry evolution, the importance of neutral evolutionary processes in shaping these phenotypes should not be underestimated. 
Population-level inquiries into the evolutionary drivers of color pattern in understudied Ranitomeya species are needed to examine whether inferences drawn from studies of the R. imitator model system apply across Ranitomeya and other dendrobatids.

Color alone can be as important a warning signal as pattern in poison frogs; however, we were unable to examine mimicry of color because we lacked reflectance data. We strongly suggest that future analyses quantify reflectance to test whether mimicry zones identified from color evolution match those we found using color pattern data, especially since coloration may be among the most important elements of mimicry in Ranitomeya (Lorioux-Chevalier et al. 2023). Our methods could be repeated for any color-related phenotype, such as the proportion of the dorsum occupied by different colors or brightness from reflectance values, to assess explicitly the role of color in driving mimetic selection. Additionally, in some areas that our mimicry maps highlighted as likely for certain patterns, frogs of the highlighted species remain elusive. Sampling bias due to the difficulty of accessing some potential habitats may have biased the mimicry maps; however, these biases cannot be accounted for without additional sampling. More field expeditions can validate whether observations match our theoretical modeling. The specificity of maps will also depend on the population densities of both predators and mimic species at sites of sympatry.

Our modeling strategy has applications beyond mimicry in poison frogs and beyond the taxonomic scale of a single genus. Combining evolutionary rates with niche models should enable the prediction of geographic distributions for any phenotype with an ecological correlation to a geographic area. For example, while the focal phenotype in our study is a color pattern under putative frequency-dependent selection, one could also use this method to estimate how rates of physiological evolution are expected to covary with structural microhabitat, predict changes in life history based on elevation, or even model convergence of disease dynamics across geography. All of these research questions could be asked of anuran trait evolution or of trait evolution across taxa. In any study system characterized by unknowns and difficulty of study, predictive modeling with methods such as ours offers an avenue for rigorous hypothesis generation and conservation-related decision-making. Overall, we present a tractable method to characterize spatial phenotype prevalence patterns. By merging phenotypic data with spatial and evolutionary genomic methods, we can better understand how these factors potentially synergize within and across species, which provides novel insight into the drivers of color pattern evolution.