Introduction

Variation in phenotypic characters, such as morphology, plumage, behavior, and vocalizations, has been traditionally used as evidence of differentiation, especially at lower taxonomic levels. Phenotypic variation has frequently been the basis for initial taxonomic and evolutionary hypotheses, which are later tested (and either corroborated or contradicted) using molecular analyses (Paterlini 2007; Gill 2014). When populations are divided by conspicuous geographic barriers, genetic differentiation generally coincides with geographic isolation and molecular analyses have frequently supported prior phenotype-based taxonomic proposals (García-Moreno et al. 2004; Navarro-Sigüenza et al. 2008; Barrera-Guzmán et al. 2012; Ortiz-Ramírez et al. 2016). In other cases, genetic variation does not correspond with differentiation patterns based on traditional phenotypic characters or even distributional ranges (Ottvall et al. 2002; Cortés-Rodríguez et al. 2008; Rocha-Méndez et al. 2018). These mixed results have underlined the need to reassess taxonomy in several biotic groups (Fritz et al. 2007; Puebla-Olivares et al. 2008; McCormack et al. 2008).

Geographic barriers are not the only cause of population divergence; divergence can also be promoted or maintained by ecological adaptation to different environments, leading to ecological niche specialization (Moritz et al. 2000; Wiens 2004; Wiens and Graham 2005; Warren et al. 2008). Furthermore, although it is well known that divergence occurs mainly in allopatry (e.g., Zink et al. 2000; Smith et al. 2014; Cracraft et al. 2020), the processes maintaining diversity are less clear (see Peterson et al. 1999), especially in continuous habitats. One example is the continental lowlands, which were temporarily fragmented due to Neogene and Quaternary climatic changes and subsequently rejoined (Weir 2006; Voelker et al. 2010; Smith et al. 2014; Sánchez-González et al. 2015). Recent studies suggest that diversity in Mesoamerican seasonally dry tropical forests has been maintained through persistent but temporarily isolated stable climatic areas (Castillo-Chora et al. 2021). Therefore, allopatry and climatic changes may act in concert at different stages of differentiation; isolation may be the initial catalyst of divergence, which is then further promoted and maintained by habitat dynamics due to climatic changes (Peterson et al. 1999; Hosner et al. 2014; Cabanne et al. 2016; Moreno-Contreras et al. 2020).

The Mesoamerican biogeographic dominion (sensu Morrone 2017), which is included in the Caribbean subregion (Morrone 2001), has been consistently recognized as a biodiversity hotspot in which a high number of species are found in a relatively small area (Myers et al. 2000; Sarkar et al. 2009). The Mesoamerican biogeographic dominion extends from the tropical lowlands of Mexico to northern Nicaragua and is characterized by a highly complex topography, which has promoted the evolution of differentiated populations separated by either lowlands or mountains (Peterson and Navarro 2000; Ornelas et al. 2013; Navarro-Sigüenza et al. 2014; Rocha-Méndez et al. 2019). In most cases, phenotypic differentiation of these populations has been deemed too weak to recognize them as full species (e.g., Howell and Webb 1995; American Ornithologists’ Union 1998), such that variation has mostly been recognized at the subspecies level. The frequency of this practice in ornithology has led to a rampant underestimation of Mesoamerican avian diversity (Navarro-Sigüenza and Peterson 2004), especially in the isolated dry forests of Mexico’s western Pacific Slope (Peterson and Navarro 2000; Navarro-Sigüenza and Peterson 2004; Ríos-Muñoz and Navarro-Sigüenza 2012; Navarro-Sigüenza et al. 2014; Prieto-Torres et al. 2019a, b). Recent research has demonstrated that many of these western Mexican bird taxa fulfill criteria for independent species status (e.g., Arbeláez-Cortes and Navarro-Sigüenza 2013; Montaño-Rendón et al. 2015; Vázquez-López et al. 2020), and it is now evident that the region’s ecological and historical dynamics have promoted unexpectedly high levels of differentiation in birds (Castillo-Chora et al. 2021; Sánchez-González et al. 2021) and other vertebrates (e.g., Reyes-Velasco et al. 2013; Arcangeli et al. 2018).

The Squirrel Cuckoo (Piaya cayana) is an arboreal species that is widely distributed in the Neotropics, both geographically (from both coasts in Mexico to northern Argentina) and ecologically (from lowland rainforest to dry forests, gallery forest, mangroves, and areas with scattered trees; Howell and Webb 1995; Payne 1997, 2005). It is also a highly polytypic species, with 14 subspecies recognized based on size, plumage color, and the color of the bare skin around the eye (Sharpe 1900; Ridgway 1916; Peters 1940). Two well-differentiated subspecies are distributed in the Mexican lowlands (Fig. 1). One of them is P. c. mexicana, endemic to the western Pacific slope of Mexico (southern Sonora to southeastern Oaxaca), characterized by lighter cinnamon upperparts and a long tail with a rufous underside. The other is P. c. thermophila, which is distributed in forests from northeastern Mexico to western Colombia and is characterized by darker cinnamon upperparts and a shorter tail with a black underside (Peters 1940; Howell and Webb 1995; Payne 2005). The geographic distribution of these two subspecies overlaps in a small area in the southeastern part of the Isthmus of Tehuantepec (Binford 1989) where there is an ecotone between dry and more mesic environments, suggesting a role of ecological conditions in their divergence. Intergradation between P. c. mexicana and P. c. thermophila is practically unknown. Both forms were initially described as full species (Sharpe 1900), only to be recognized as subspecies within a larger P. cayana complex, because the difference in the color of the upperparts was deemed minor (Stone 1908). Ridgway (1916) later argued that these two forms should be granted full species status based on the absence of intermediate birds in the contact zone. It was not until 70 years later that Binford (1989) reported, without further details, at least four “definite intermediates” from Tehuantepec to Tapanatepec in southeastern Oaxaca. However, Binford himself wrote that “the abruptness and apparent rarity of intergradation suggest that these two forms might be separate species”. Some recent checklists have considered these taxa as subspecies (American Ornithologists’ Union 1998), while others have considered them full species (del Hoyo et al. 2017; Gill and Donsker 2017).

Fig. 1

Geographic distribution and phylogenetics of the Squirrel Cuckoo (Piaya cayana). A mtDNA Bayesian phylogenetic tree based on a concatenated dataset of ND2 (1041 bp) and a fragment of Cytb (622 bp). Numbers above branches indicate divergence time (in red), and bootstrap/posterior probability values (in black). The dark gray dotted circle represents the split between South American and Mesoamerican clades, while the light gray circle represents the split between the mexicana and thermophila lineages; B geographic distribution (P. c. thermophila in green, P. c. mexicana in orange) and mitochondrial DNA population samples (black dots) in Mexico and Central America. Dotted lines depict both the accessible (or M) and projection areas for the models and the niche divergence analyses for the two evolutionary lineages. Brown shading indicates areas at least 1000 m above sea level (m.a.s.l.); and C haplotype network. Colors correspond to the vertical bar in the phylogenetic tree

Here, we assess the significance of phenotypic differentiation between P. c. mexicana and P. c. thermophila, and the role that ecological factors may have played in the divergence of the two taxa in Mexico. We also assess the phylogenetic and phylogeographic relationships within both groups, to: (1) estimate the divergence time between the two taxa, (2) estimate the genetic diversity and gene flow between them, and (3) examine whether genetic variation may be associated with barriers to gene flow and ecological niche differentiation across the historical distribution throughout the region, with an emphasis on the Isthmus of Tehuantepec.

Materials and methods

Lab procedures and molecular phylogenetic analyses

We used tissue samples and museum skins of P. cayana collected from different localities throughout the distributional range of P. c. mexicana and P. c. thermophila in Mexico (Fig. 1), housed at the Museo de Zoología, Facultad de Ciencias and the Instituto de Biología, of the Universidad Nacional Autónoma de México (UNAM). We supplemented our study with tissue from both Mexico and South America loaned by the Biodiversity Institute, University of Kansas, and the Academy of Natural Sciences of Drexel University, both located in the United States. For outgroups, we also used sequences deposited in GenBank (http://www.ncbi.nlm.nih.gov/genbank/).

We extracted DNA from tissue samples using DNeasy kits (Qiagen, Valencia, California) following the manufacturer’s protocols. We amplified two mitochondrial DNA (mtDNA) genes, the NADH dehydrogenase subunit 2 [ND2, 1041 base pairs (bp)] and a fragment of 622 bp from the cytochrome b (Cytb), via PCR (Polymerase Chain Reaction) using standard thermocycling protocols in 12 µl reactions. We used the following primers: L5215 (Hackett 1996), H5766 (Sorenson et al. 1999), L5758 (Johnson and Sorenson 1998), 487L (Sánchez-González and Moyle 2011), and H6313 (Johnson and Sorenson 1998) for ND2; and L14996 and H15646 (Sorenson et al. 1999) for Cytb. We visualized PCR products on agarose gels (1.3%) stained with ethidium bromide (0.5 µl/ml). Purification and sequencing were conducted at the High-Throughput Genomics Unit at the University of Washington, Seattle, USA (http://www.htseq.org/index.html). Sequence chromatograms were edited in Geneious 7.0 (http://www.geneious.com/) and then aligned in MUSCLE (Edgar 2004) using Cytb and ND2 sequences deposited in GenBank. We restricted our analyses to mitochondrial sequences due to limited funding. We are aware that including nuclear genome data can improve the robustness of hypotheses of systematics and gene flow compared to using mitochondrial genes alone (Nichols 2002; Funk and Omland 2003; Ballard and Whitlock 2004; Degnan and Rosenberg 2009); however, mtDNA has been suggested to be reliable for determining geographical and taxonomic limits in closely related taxa with respect to lineage divergence, population structure, and phylogeographical patterns (Zink and Barrowclough 2008, and references therein).

For most analyses, we included only individuals for which we sequenced both genes (Supplemental File 1). To better resolve the phylogenetic placement and degree of divergence of the Central American taxa within the polytypic P. cayana, separate analyses containing only Cytb sequences were carried out (Supplemental File 1), given that South American populations are much better represented for this gene in GenBank. We estimated phylogenetic relationships based on a concatenated matrix (ND2 + Cytb gene sequences) and for the Cytb sequences using Maximum Likelihood (ML) as implemented in RAxML 7.0.3 (Stamatakis 2006) and Bayesian Inference (BI) using MrBayes 3.2.2 (Ronquist and Huelsenbeck 2003). Substitution models and partition schemes were estimated separately for ML and BI to increase the accuracy of the phylogenetic inference (Table S1); these were estimated and selected in PartitionFinder 1.1.1 (Lanfear et al. 2012) via the Bayesian Information Criterion (BIC). For Cytb analyses, we used the same models determined for this gene in the concatenated analysis (Table S1). Node support for the ML analyses was assessed using 1 × 10⁴ non-parametric bootstrap iterations. We ran 10 × 10⁶ generations for the BI analysis using four MCMC chains and sampled parameters and trees every 1000 generations. After checking for convergence and stationarity in Tracer 1.6 (Rambaut et al. 2014), the first 3000 trees were discarded as burn-in. The rest of the trees were used to produce a single 50% majority rule consensus tree. Based on a previous higher level phylogenetic hypothesis for the Cuculiformes (Sorenson and Payne 2005), we used GenBank (http://www.ncbi.nlm.nih.gov/genbank/) sequences of the following species as outgroups: Piaya melanogaster, Coccyzus minor, and Coccycua minuta.

Because of our limited sampling of the South American group for the two mitochondrial genes in our analyses, we further tested the monophyly of the three P. cayana groups (see below and Results) using the widespread sampling of South American Cytb sequences in Smith et al. (2014) available in GenBank. We supplemented this mostly South American dataset with our more complete sampling of Mesoamerican individuals and reanalyzed the whole dataset using the same parameters as in Smith et al. (2014).

Divergence time estimations

We estimated divergence time for the Cuculiformes in BEAST 1.8 (Drummond et al. 2012). We used the following substitution rates for our mtDNA markers: 0.029 (0.024–0.033) substitutions/site/lineage/Million years (s/s/l/My) for ND2, and 0.016 (0.014–0.019) s/s/l/My for Cytb (Lerner et al. 2011; Lim and Sheldon 2011). Substitution models were specifically estimated for this analysis and selected with a Bayesian Information Criterion (BIC), as implemented in PartitionFinder (Lanfear et al. 2012). Clock model selection for our dataset was performed using Bayes factors computed from log-likelihood values obtained with a stepping-stone method (Xie et al. 2011) in MrBayes 3.2.2 (Ronquist and Huelsenbeck 2003), which allows testing whether the data fit a strict or a relaxed molecular clock. We ran analyses for 50 × 10⁶ generations, sampling every 1000 generations, and inspected chain convergence in Tracer 1.6 (Rambaut et al. 2014). Stepping-stone analyses favored an uncorrelated relaxed clock (log-likelihood: −14,511.77) over a strict clock (log-likelihood: −14,516.36), following Kass and Raftery (1995). We thus implemented a relaxed molecular clock and a birth–death model for the speciation process. As calibration points, we used the most recent common ancestor (MRCA) of Cuculiformes, which has an estimated age of 44 million years ago (mya) following Prum et al. (2015) and Claramunt and Cracraft (2015); the divergence between Geococcyx and Neomorphus, around 7.8 ± 1 mya (Smith and Klicka 2010); and the divergence between Cuculus poliocephalus and the rest of the species in the genus Cuculus, around 4.5 ± 1 mya (Fossøy et al. 2016). We ran two independent runs for 50 × 10⁶ generations, sampling every 1000 generations; chain convergence was inspected in Tracer 1.6 (Rambaut et al. 2014). Phylogenetic dated trees from the two independent runs were combined using a burn-in of 25% in LogCombiner 1.8 (Drummond et al. 2012), from which we generated a Maximum Clade Credibility Tree in TreeAnnotator 1.8 (Drummond et al. 2012).
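The clock-model comparison can be made explicit with a short calculation. The sketch below assumes that the two reported log-likelihoods are natural-log marginal likelihoods from the stepping-stone runs, and it maps the resulting 2 ln(BF) onto the evidence categories of Kass and Raftery (1995). The function names are illustrative and are not part of any software used in the study.

```python
# Marginal log-likelihoods as reported in the text.
# Assumption: both are natural-log marginal likelihoods.
LOG_ML_RELAXED = -14511.77
LOG_ML_STRICT = -14516.36


def two_ln_bayes_factor(log_ml_a: float, log_ml_b: float) -> float:
    """Return 2 * ln(BF) for model A over model B."""
    return 2.0 * (log_ml_a - log_ml_b)


def kass_raftery_label(two_ln_bf: float) -> str:
    """Map 2 ln(BF) onto the Kass and Raftery (1995) evidence categories."""
    if two_ln_bf < 0:
        return "evidence favors the other model"
    if two_ln_bf < 2:
        return "not worth more than a bare mention"
    if two_ln_bf < 6:
        return "positive"
    if two_ln_bf < 10:
        return "strong"
    return "very strong"


support = two_ln_bayes_factor(LOG_ML_RELAXED, LOG_ML_STRICT)
print(f"2 ln(BF) = {support:.2f} ({kass_raftery_label(support)})")
```

Under these assumptions the difference of 4.59 log units gives 2 ln(BF) of about 9.18, which falls in the "strong" category for the relaxed clock.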

Phylogeographic analyses

We divided our samples into three groups (Western, Eastern, and South America) for phylogeographic analyses following the topology shown by the phylogenetic trees. As the South American group had a high genetic diversity (Smith et al. 2014), which likely includes several well-differentiated taxa, we included this group only for informative purposes. For each of these groups, we calculated haplotype number (h), nucleotide diversity (π), and haplotype diversity (Hd) as measures of genetic diversity. We also measured the average number of substitutions per site (Dxy) with a Jukes–Cantor correction and the net number of substitutions per site (Da) as measures of genetic differentiation. All these statistics were estimated in DnaSP 5.1 (Librado and Rozas 2009).
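For readers unfamiliar with these summary statistics, the following is a minimal sketch of how they can be computed from aligned sequences. It is not DnaSP's implementation: it assumes equal-length, gap-free sequences, applies the Jukes–Cantor correction to each pairwise distance before averaging, and uses two invented toy groups (`west`, `east`) rather than real data.

```python
from collections import Counter
from itertools import combinations, product
from math import log


def p_distance(a: str, b: str) -> float:
    """Proportion of sites that differ between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p."""
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


def haplotype_diversity(seqs: list[str]) -> float:
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freq_sq = sum((count / n) ** 2 for count in Counter(seqs).values())
    return n / (n - 1) * (1.0 - freq_sq)


def nucleotide_diversity(seqs: list[str], correct: bool = False) -> float:
    """Mean pairwise distance within a group (pi)."""
    dists = [p_distance(a, b) for a, b in combinations(seqs, 2)]
    if correct:
        dists = [jukes_cantor(d) for d in dists]
    return sum(dists) / len(dists)


def dxy(group_a: list[str], group_b: list[str]) -> float:
    """Mean Jukes-Cantor distance between sequences of two groups."""
    dists = [jukes_cantor(p_distance(a, b)) for a, b in product(group_a, group_b)]
    return sum(dists) / len(dists)


def da(group_a: list[str], group_b: list[str]) -> float:
    """Net divergence: Dxy minus the mean within-group diversity."""
    within = (nucleotide_diversity(group_a, True) + nucleotide_diversity(group_b, True)) / 2
    return dxy(group_a, group_b) - within


# Toy alignment, invented for illustration only.
west = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGTAC"]
east = ["ACGAACGTGC", "ACGAACGTGC", "ACGAACGTGT"]
print(len(set(west)), haplotype_diversity(west), nucleotide_diversity(west))
print(dxy(west, east), da(west, east))
```

Da subtracts the average within-group diversity from Dxy, so it isolates the divergence accumulated since the groups split.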

We also explored the genetic structure within P. cayana using a three-way AMOVA on the previously described groups. The significance of the AMOVA results was assessed through 10,000 non-parametric permutations. In addition, we estimated Fst as a measure of population structure (Weir and Cockerham 1984), which takes values ranging from panmixia (0) to a complete partition of genetic variation among subpopulations (1) following Hamilton (2009). We also assessed the significance of the Fst parameter with 10,000 permutations. Both AMOVA and Fst analyses were conducted in Arlequin 3.5.1.3 (Excoffier and Lischer 2010).

Since derived haplotypes may coexist with ancestral haplotypes within a single population, it has been suggested that there may be reticulated relationships that are not represented in bifurcated intraspecific phylogenetic trees (Posada and Crandall 2001). Haplotype networks are thus an additional option for representing these reticulated patterns. We constructed haplotype networks for the concatenated and the Cytb analyses based on the statistical parsimony algorithm implemented in PopART 1.7 (Leigh and Bryant 2015).

Morphometrics

We measured eight morphometric characters from 110 P. cayana study skins from Mexico housed in scientific collections (Museo de Zoología of the Facultad de Ciencias and the Colección Nacional de Aves of the Instituto de Biología, both at the Universidad Nacional Autónoma de México). The selected characters are likely related to habitat use, foraging (Miles and Ricklefs 1984; Jønsson et al. 2015) and communication in Cuculiformes (Payne 1997) and are considered reliable indicators of size and the least prone to deformation during the skinning process (Zink and Remsen 1986). We measured the following characters: bill length (BL), width (BW), and depth (BD), all taken at the distal end of the nostril; wing chord (WC); and the lengths of the tarsus (TL), hallux (HAL), tail (TAIL), and of the white tips at the end of the underside of the tail (TIPS). Most measurements were taken with a Mitutoyo CD-6″ digital caliper to the nearest 0.01 mm, except for WC and TAIL, for which we used a 50-cm ruler (accurate to the nearest 1 mm). We ensured the repeatability of measurements by measuring a series of 10 specimens chosen at random during three different sessions on consecutive days until Pearson correlation tests showed only positive and significant correlations (Benítez-Díaz 1993). We measured only adult male specimens to avoid variation due to age (Yezerinac et al. 1992) and sex (Holmes 1986).

As we measured only Mexican specimens, for all subsequent statistical analyses, specimens included only representatives of the Western lineage (cinnamon undertail feathers, from the Pacific slope of western Mexico) and the Eastern lineage (black undertail feathers, from the Gulf Slope and the Yucatan Peninsula). We considered Pearson’s correlation coefficient values > 0.75 between measured characters to indicate redundancy (Zink and Remsen 1986; Rising and Somers 1989). This threshold was not met for any of the pairs of measured characters, so we included all measured variables in all subsequent analyses. We tested for significant morphometric differences between the two lineages using a Mann–Whitney U test (Fay and Proschan 2010) and visualized these differences using box-and-whisker plots. We conducted all these analyses in R (R Core Team 2014). Some measurements were absent because of damaged specimens (e.g., lacking tail feathers). We therefore used a probabilistic Principal Components Analysis (PPCA), as implemented in pcaMethods (Stacklies et al. 2007) for R (R Core Team 2014) to allow the inclusion of individuals with missing data.
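As an illustration of the univariate comparison, the sketch below computes the Mann–Whitney U statistic from midranks, together with a two-sided p-value from the plain normal approximation (no tie or continuity correction). It is a simplified stand-in for the test run in R, and the tail lengths are invented values, not measurements from the study.

```python
from math import erfc, sqrt


def mann_whitney_u(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (U_x, U_y), using midranks for tied values."""
    combined = sorted((value, group) for group, values in enumerate((x, y)) for value in values)
    rank_sum_x = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2.0
    return u_x, len(x) * len(y) - u_x


def normal_approx_p(u: float, n_x: int, n_y: int) -> float:
    """Two-sided p-value from the uncorrected normal approximation."""
    mean = n_x * n_y / 2.0
    sd = sqrt(n_x * n_y * (n_x + n_y + 1) / 12.0)
    return erfc(abs(u - mean) / sd / sqrt(2.0))


# Hypothetical tail lengths (mm) for two groups; illustration only.
western_tail = [305.0, 312.0, 298.0, 320.0, 310.0]
eastern_tail = [270.0, 281.0, 265.0, 290.0, 275.0]
u_west, u_east = mann_whitney_u(western_tail, eastern_tail)
print(u_west, u_east, normal_approx_p(u_west, len(western_tail), len(eastern_tail)))
```

With samples as small as in this toy example an exact test would be preferable; the normal approximation is shown only because it fits in a few lines.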

Paleodistributions and niche divergence analyses

To assess the variation of geographical and environmental conditions throughout the distribution of P. cayana in Mexico, we used ecological niche modelling (ENM) to approximate the Grinnellian niches (Soberon and Peterson 2005) occupied by each lineage. Georeferenced records in these analyses corresponded to specimens used in the morphometric and genetic analyses as well as occurrence records from P. cayana museum specimens included in the Global Biodiversity Information Facility database (GBIF.org 2021; Supplemental Materials I). To correct for spatial bias and possible correlated localities due to a high density of collected specimens (see Peterson et al. 2011), we retained only one presence point (i.e., an independent locality) within a buffer of ~7 km² (the mean distance among occurrence records). We also omitted records that lacked bioclimatic data (see below). The final spatially rarefied dataset consisted of 175 records for P. c. mexicana and 545 for P. c. thermophila. We also defined an accessibility area (or “M” according to the BAM diagram; Soberon and Peterson 2005; Barve et al. 2011) for each evolutionary lineage, based on occurrence records matching the terrestrial ecoregions (Dinerstein et al. 2017) and the biogeographic provinces of the Neotropical region (Morrone 2014). We assumed that these regions define the historically accessible area for each lineage in the geographical space (Svenning and Skov 2004). To characterize the ecological niche of the two lineages, we used nine uncorrelated variables at a spatial resolution of 0.041665° (~5 km²) obtained from WorldClim 1.4 (Hijmans et al. 2005): mean diurnal range (Bio 02), annual temperature range (Bio 07), mean temperature of the wettest quarter (Bio 08), annual precipitation (Bio 12), precipitation of wettest month (Bio 13), precipitation of driest month (Bio 14), precipitation seasonality (Bio 15), precipitation of driest quarter (Bio 17), and precipitation of coldest quarter (Bio 19). The selection of these climatic variables was based on the Pearson’s correlation coefficient (r < 0.8) and Variance Inflation Factor (VIF < 10), as implemented in the “corrplot” (Wei and Simko 2017) and “usdm” (Naimi 2015) R libraries.
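The spatial rarefaction step can be sketched as a greedy distance filter. The text does not specify the exact algorithm, so the version below is only one plausible reading: it interprets the buffer as a minimum separation of 7 km between retained localities, uses great-circle (haversine) distances, and keeps records in input order. The coordinates are invented.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def thin_records(records: list[tuple[float, float]], min_km: float = 7.0) -> list[tuple[float, float]]:
    """Keep a record only if it lies at least min_km from every record already kept."""
    kept: list[tuple[float, float]] = []
    for rec in records:
        if all(haversine_km(rec, other) >= min_km for other in kept):
            kept.append(rec)
    return kept


# Hypothetical occurrence points (lat, lon); illustration only.
points = [(16.00, -95.00), (16.01, -95.00), (17.00, -95.00)]
print(thin_records(points))
```

Because the filter is greedy, the retained set depends on the order of the input records; shuffling the input or repeating the procedure is a common way to check that sensitivity.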

Using MaxEnt 3.4.1k (Phillips et al. 2006) and the “kuenm” R package (Cobos et al. 2019), we defined ENMs representing the spatio-temporal potential geographic distribution during the Late Pleistocene climate fluctuations, which allowed us to identify the potential historical range extent, connectivity, and stability areas for each lineage. To do this, we applied a four-step methodological approach (see Peterson et al. 2011): (1) calibration of ENM for each lineage in the present (hereafter Recent) to assess the model complexity and select the best modelling parameters (see Cobos et al. 2019); (2) selection and validation of models in terms of omission (under-prediction) errors, the partial-ROC test, and AIC values (Anderson et al. 2003; Peterson et al. 2008; Cobos et al. 2019); (3) projection of the defined suitable environmental conditions to climatic conditions during the Mid-Holocene (Mid-Hol, 6000 Ya), and the Late Pleistocene climate fluctuations including the Last Glacial Maximum (LGM; 21,000 Ya) and Last Interglacial (LIG, ~120,000–140,000 Ya); and (4) conversion of the logistic suitability values into a binary (presence–absence) map by setting a decision threshold equivalent to the 10th percentile training presence logistic threshold (see Liu et al. 2013, 2016). Past climate data were extracted from three general circulation models: CCSM4 (Collins et al. 2004), MIROC-ESM (Hasumi and Emori 2004), and MPI-ESM-P (Baehr et al. 2015). We tested the reliability of our model projections by calculating the mobility-oriented parity (MOP) metric (Owens et al. 2013), which offers a view of the novelty of past climate conditions relative to present-day conditions within the calibration area for each lineage (see Alkishe et al. 2017 for a complete explanation). Detailed descriptions of the methods used to determine ENMs and MOP for each lineage are provided in Supplemental File II. All scripts and input information are available at https://github.com/davidprietorres/piaya_cayana_paleodistribuion.
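Step (4) can be illustrated with a small sketch. It assumes that the threshold is the suitability value below which 10% of the training presences fall, and that cells at or above the threshold are scored as present; MaxEnt's own computation may differ in how it handles ties and rounding. All values are invented.

```python
from math import floor


def percentile_presence_threshold(presence_scores: list[float], fraction: float = 0.10) -> float:
    """Suitability value that leaves `fraction` of training presences below it."""
    ordered = sorted(presence_scores)
    index = min(floor(len(ordered) * fraction), len(ordered) - 1)
    return ordered[index]


def binarize(grid: list[list[float]], threshold: float) -> list[list[int]]:
    """Convert a suitability grid into a presence (1) / absence (0) map."""
    return [[1 if value >= threshold else 0 for value in row] for row in grid]


# Hypothetical suitability at ten training presences and on a 2 x 2 grid.
training = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
suitability = [[0.05, 0.20], [0.50, 0.19]]
threshold = percentile_presence_threshold(training)
print(threshold, binarize(suitability, threshold))
```

With ten training presences, this rule omits exactly one of them (the lowest score), which is the 10% omission the threshold is designed to accept.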

To understand the role of geographical drivers of lineage diversification in P. cayana in Mesoamerica, we assessed the degree of geographical matching between the potential ranges for the different lineages predicted across each climatic scenario (i.e., alloprediction values). Thus, models are expected to predict similar potential distribution ranges if the climatic niche for lineages is similarly distributed and historically congruent despite geographical barriers (Mota-Vargas and Rojas-Soto 2016). Also, we obtained long-term climatic stability areas or refugia for each lineage by overlapping the binary maps obtained for each climatic scenario (Carnaval et al. 2009); a cell or pixel is considered climatically stable if lineages are predicted to occur there in all of the historical scenarios analyzed (Terribile et al. 2012). This procedure allows the analysis of the significance of climatic stability/instability areas in the evolution and maintenance of lineage diversity (e.g., Carnaval and Moritz 2008; Carnaval et al. 2009; Cabanne et al. 2016; Castillo-Chora et al. 2021).
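The stability criterion amounts to a cell-wise logical AND across the binary maps. The sketch below assumes that all maps share the same grid and uses invented 2 × 3 rasters keyed by scenario name; it also reports the stable area as a fraction of the Recent predicted range.

```python
def stable_cells(binary_maps: dict[str, list[list[int]]]) -> list[list[int]]:
    """Mark a cell as stable (1) only if it is predicted present in every scenario."""
    scenarios = list(binary_maps.values())
    n_rows, n_cols = len(scenarios[0]), len(scenarios[0][0])
    return [
        [int(all(grid[r][c] == 1 for grid in scenarios)) for c in range(n_cols)]
        for r in range(n_rows)
    ]


def stable_fraction(stable: list[list[int]], recent: list[list[int]]) -> float:
    """Share of the Recent predicted range that is climatically stable."""
    return sum(map(sum, stable)) / sum(map(sum, recent))


# Hypothetical binary predictions for one lineage; illustration only.
maps = {
    "Recent": [[1, 1, 0], [1, 1, 0]],
    "Mid-Hol": [[1, 1, 0], [0, 1, 0]],
    "LGM": [[1, 0, 0], [0, 1, 1]],
    "LIG": [[1, 1, 1], [0, 1, 0]],
}
stable = stable_cells(maps)
print(stable, stable_fraction(stable, maps["Recent"]))
```

Because every stable cell is also present in the Recent map, the fraction is bounded between 0 and 1.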

We also estimated the climate niche overlap between the lineages using the PCA-env method (Broennimann et al. 2012). We used a principal component analysis (PCA) to transform the environmental space of the selected environmental variables into a two-dimensional space defined by the first and second principal components (Strubbe et al. 2015). We then calculated the niche overlap between lineages directly from the ecological niche space (Warren et al. 2008) using the mean of Schoener’s D, which ranges from 0 (no niche overlap) to 1 (complete niche overlap). We compared this value with randomly generated simulated values from “niche identity test” and “background similarity test” metrics (see Warren et al. 2008; Broennimann et al. 2012). The hypothesis of niche conservatism between lineages is accepted only if the observed Schoener’s D value is significantly higher (P < 0.05; i.e., more similar) than values expected for simulated overlap from the 1,000 pseudo-replicated data sets allowing random shifts. All analyses were estimated in “ecospat v. 3.0” (Di Cola et al. 2017) for R (R Core Team 2014). Finally, we used linear and blob range-break tests to test whether the geographical/ecological boundaries between the Western and Eastern lineages may be associated with steep environmental gradients, as implemented in ENMTools 1.0 (Warren et al. 2021) for R (R Core Team 2014). We ran 100 pseudoreplicates for each test to generate a null distribution, which we compared to the observed overlap value via a one-tailed test (see Glor and Warren 2011).
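Schoener’s D itself is a simple quantity once each lineage’s occupancy of the gridded environmental space has been estimated. The sketch below assumes two occupancy grids flattened to lists over the same cells and omits the kernel-density smoothing of the PCA-env framework; the example values are invented.

```python
def schoener_d(occupancy_a: list[float], occupancy_b: list[float]) -> float:
    """Schoener's D = 1 - 0.5 * sum(|p_a - p_b|) over cells, with each grid scaled to sum to 1."""
    total_a, total_b = sum(occupancy_a), sum(occupancy_b)
    return 1.0 - 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(occupancy_a, occupancy_b)
    )


# Hypothetical occupancy of three cells of environmental space; illustration only.
western = [0.5, 0.5, 0.0]
eastern = [0.0, 0.5, 0.5]
print(schoener_d(western, eastern))
```

In the identity and similarity tests, the same statistic is recomputed on randomized data to build the null distribution against which the observed value is compared.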

Results

Phylogenetic analyses

We obtained 32 ND2 and Cytb sequences from 26 localities (1–4 samples per locality) distributed across the geographic range of P. cayana (Fig. 1, Supplemental File 1). Sequences are deposited in GenBank under accession numbers OM169114–OM169145 for ND2 and OM169146–OM169177 for Cytb.

Phylogenetic trees obtained for both ML and BI from the concatenated dataset showed highly similar topologies; in both cases, two well-supported main clades were recovered (Fig. 1A): a Mesoamerican clade grouping all samples from Mexico and Central America was sister to a monophyletic group including all samples from South America (South American clade). The Mesoamerican clade was further subdivided into two reciprocally monophyletic and well-supported clades: one grouped all samples from the western Pacific Slope of Mexico (Western lineage), and a second group including all samples from the Gulf Slope, the Yucatan Peninsula, and Central America (Eastern lineage). The geographic structure observed in these two Mesoamerican clades matches the distribution of the two recognized Mexican subspecies P. c. mexicana and P. c. thermophila. However, the phylogenetic tree derived from complementary analysis based on the Cytb dataset revealed that the Chocoan P. c. nigricrissa was placed with marginal support (PP = 0.72) as sister of the Mexican endemic P. c. mexicana (Supplemental Fig. 1). Results of our overall analyses based on both the concatenated and Cytb trees outline the significance of the Andes as a barrier, and therefore, a Trans-Andean clade (including P. c. mexicana, P. c. thermophila and the Chocoan P. c. nigricrissa) and a Cis-Andean clade (including all South American samples in this work) might be recognized. In addition, both datasets also showed two clades with Mesoamerican P. cayana, supporting a clear split between the western endemic P. c. mexicana and the Gulf slope-Central American P. c. thermophila.

Phylogeographic analyses

The three studied clades showed high Hd values (> 0.8), lower values for nucleotide diversity (π < 0.2), and a large number of haplotypes (Table S2). Genetic differentiation among lineages based on the concatenated dataset is deep, with Dxy values for the genetic distance between the Mesoamerican clade and the South American clade higher than 3%. In comparison, the genetic distance between the Western and Eastern lineages is 2.4% (Table 1). According to the Cytb dataset analysis, the Dxy value between the Trans-Andean clade and the Cis-Andean clade was 3.1%, while Dxy was 2.6% between the Western lineage and the Chocoan nigricrissa, 3.1% between the Eastern and Western lineages, and 2.8% between the Eastern lineage and nigricrissa. The AMOVA analysis revealed that most of the genetic variation (73.14%) is distributed between the Eastern and Western lineages within the Mesoamerican clade. In comparison, a non-significant portion of the variation (17.27%) is distributed between the Mesoamerican and the South American lineages (Table 2).

Table 1 Genetic differentiation for P. cayana.
Table 2 Analysis of Molecular Variance (AMOVA) for the lineages in the phylogenetic tree

The haplotype network showed a clear geographic structure matching the clades in the phylogenetic trees (Fig. 1C). No haplotypes were shared between different lineages, and we detected large numbers of mutational steps separating the different groups; 32 mutational steps separated the South American clade from the Mesoamerican clade; and within the Mesoamerican clade, 35 mutational steps and a hypothetical haplotype separated the Western and the Eastern lineages (Fig. 1C). A star-like pattern is evident in the Eastern lineage. We detected no clear pattern in the case of the Western lineage, for which several hypothetical or unsampled haplotypes were determined. The inclusion of the single Chocoan sequence showed a haplotype network in which this sample was also placed as closer to the Western lineage, within a Trans-Andean clade (Supplemental Fig. 2).

Fig. 2

Morphometrics for the two Mesoamerican lineages. Orange depicts the mean and samples for the Western lineage (P. c. mexicana); green indicates the mean and samples for the Eastern lineage (P. c. thermophila). Left: boxplots comparing the two lineages for each character (Mann–Whitney U tests). Abbreviations: Bill Width (BW), Bill Length (BL), Bill Depth (BD), Tail Length (TAIL), Length of white tips in the tail (TIPS), Hallux Length (HAL), Tarsus Length (TL), and Wing Chord (WC). Right: principal component analysis plot showing the distribution of individuals in each lineage in the morphological space

Divergence time

Our results suggest that the South American and the Mesoamerican clades diverged during the Late Miocene–Early Pliocene, about 4.7 mya (6.5–3.2 mya, 95% HPD). Within the Mesoamerican clade, the Western and the Eastern lineages diverged around 1.24 mya (1.8–0.8 mya, 95% HPD), during the Early Pleistocene (Fig. 1).

Morphometrics

We measured a total of 94 study skins (Supplemental File I). Boxplots for univariate analyses showed similar measurements between the lineages for all of the characters, except for TAIL (Fig. 2). The Mann–Whitney U test showed significant differences in TAIL, bill length (BL), and bill width (BW). The Western lineage (P. c. mexicana) had a significantly longer tail and shorter and narrower bill than the Eastern lineage (P. c. thermophila).

Pearson correlation tests showed no significant correlations between pairs of morphological measurements (Table S3), so we included all of the variables in the PCA. The first three components of the PCA explained 83.44% of the total variation: 58.24% in PC1, 19.02% in PC2, and 6.18% in PC3 (Table S4). For PC1, significant loadings were concentrated in TIPS and TAIL, while for PC2, significant loadings were detected for BL and BW. Plotting individual values of the first two components suggests separation into two phenotypic groups corresponding to the Eastern and Western lineages (Fig. 2).

Distribution modelling and paleodistribution

ENMs yielded a good fit for the Recent distribution of both the Western (partial-ROC test = 1.34; omission rate = 3.5%; AICc = 2754.29) and the Eastern (partial-ROC test = 1.16; omission rate = 4.6%; AICc = 9360.96) lineages. These results indicate high model performance, showing that the models are better than random and thus statistically descriptive of the climatic conditions defining the ecological niche of each lineage. According to the jackknife test and the variable contribution values in MaxEnt, the most important variables for model building for the Western lineage were Mean Diurnal Temperature Range (Bio 02, 25.8%), Temperature Annual Range (Bio 07, 22.9%), and Precipitation Seasonality (Bio 15, 17.3%). For the Eastern lineage, the most important variables were Mean Diurnal Temperature Range (Bio 02, 32.2%), Annual Precipitation (Bio 12, 18.6%), and Precipitation of the Wettest Month (Bio 13, 16.4%).

Mean MOP values for the three past periods suggest that the presence of non-analogous climate areas is not a factor driving the interpretation of our predictions under past environmental conditions (Supplemental File II). Paleodistribution models (Fig. 3) for the Western lineage showed a decrease (~ 12%) in high suitability areas during the LGM compared to Recent, while the suitable area for the Eastern lineage was larger (about 45%) and less fragmented. Overall, predicted potential ranges for both lineages during the LGM showed a lower overlap (on average 54.35%) than predictions for the LIG (77.89%) and mid-Holocene (79.57%) in comparison to their corresponding Recent climate predictions. Alopredictions (i.e., predicted distributions of the Eastern lineage into the range of the Western lineage and vice versa) showed 24.29% overlap, averaging across the four scenarios (30.78% for the Recent, 21.56% for the mid-Holocene, 15.52% for the LGM, and 29.28% for the LIG), suggesting that environmental differentiation has been maintained since at least the Late Pleistocene. Our results showed that the distribution of suitable areas for the two lineages was highly fragmented and ecologically differentiated during the LGM, with areas of suitable habitat separated by the Isthmus of Tehuantepec, unlike Recent, mid-Holocene, and LIG predictions (Fig. 3). Predicted areas of climatic stability for each lineage (crossed areas within the Recent distribution map, Fig. 3) covered 38.62% of the Recent distribution for the Western lineage and 52.56% for the Eastern lineage, showing an overlap of 10.55% and 5.15% with the predicted potential ranges, respectively.
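
The average aloprediction overlap reported above can be reproduced from the four per-scenario values. The short Python sketch below is only an arithmetic check; the variable names are illustrative, and exact decimal arithmetic with half-up rounding is an assumption about how the mean was rounded to two places.

```python
from decimal import Decimal, ROUND_HALF_UP

# Aloprediction overlap (%) per climate scenario, as reported in the text.
overlap = {
    "Recent": Decimal("30.78"),
    "mid-Holocene": Decimal("21.56"),
    "LGM": Decimal("15.52"),
    "LIG": Decimal("29.28"),
}

# Unweighted mean across the four scenarios.
mean_overlap = sum(overlap.values()) / len(overlap)

# Round to two decimals (half-up rounding is assumed here).
mean_rounded = mean_overlap.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Scenario with the smallest overlap between lineages.
lowest = min(overlap, key=overlap.get)

print(f"mean overlap: {mean_rounded}%")
print(f"lowest overlap: {lowest} ({overlap[lowest]}%)")
```

The minimum falls in the LGM, consistent with the stronger ecological separation of the lineages described for that period.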

Fig. 3

Ecological niche model projected onto the geographic areas for P. mexicana (orange) and P. c. thermophila (green). Upper maps depict suitability areas for each lineage under Recent climatic conditions; crossed lines depict regions of historical climatic stability. Lower maps depict the predicted regions of historical distribution for climatic niches across four time periods, and the aloprediction areas (darker orange and darker green) between lineages across the Late Pleistocene (LIG and LGM), the Holocene and the Recent. Bars in the upper right correspond to the size of the areas predicted by models for each lineage, and their consensus areas (blue color) across the four paleoclimate scenarios

Ecological niche similarity analyses

The PCA-env indicated that 80.11% of the environmental variation is explained by the first two principal components (PC1 = 66.05% and PC2 = 14.06%). The occurrence density surfaces (black spots) in environmental space, as determined by PCA-env, showed a clear difference in position between the lineages (Fig. 4A). The observed Schoener’s D value was 0.09 and was statistically non-significant in both directions (P = 0.17, from P. c. mexicana to P. c. thermophila; and P = 0.52, from P. c. thermophila to P. c. mexicana), indicating that the observed niches were no more similar than expected by chance (Fig. 4B, C). This allowed us to reject the hypothesis of niche equivalency and similarity (i.e., niche conservatism) between lineages, suggesting that the lineages occupy different environmental niches. Finally, the linear and blob range-break tests detected significant environmental differences (P < 0.05) separating the Western and Eastern lineages.
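
For readers unfamiliar with the overlap metric, Schoener's D is commonly computed as D = 1 − 0.5 Σ|p1,i − p2,i|, where p1,i and p2,i are the occurrence densities of each lineage in cell i of the gridded environmental space, each normalized to sum to 1. The Python sketch below illustrates this formula only; it is not the software used in this study, the function name `schoeners_d` is an assumption, and the density values are hypothetical.

```python
def schoeners_d(density_a, density_b):
    """Schoener's D between two occurrence-density grids (flattened).

    Each grid is normalized to sum to 1 before comparison, so D ranges
    from 0 (no overlap) to 1 (identical densities).
    """
    if len(density_a) != len(density_b):
        raise ValueError("both grids must have the same number of cells")
    total_a = sum(density_a)
    total_b = sum(density_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each grid needs a positive total density")
    abs_diff = sum(abs(a / total_a - b / total_b)
                   for a, b in zip(density_a, density_b))
    return 1.0 - 0.5 * abs_diff


# Hypothetical densities over five environmental cells (not study data).
western = [0.0, 1.0, 6.0, 3.0, 0.0]
eastern = [4.0, 5.0, 1.0, 0.0, 0.0]
print(f"Schoener's D = {schoeners_d(western, eastern):.2f}")
```

Values near 0, such as the 0.09 observed here, indicate that the two lineages concentrate their occurrences in largely different parts of the environmental space.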

Fig. 4

Equivalence and similarity tests in the environmental space for Squirrel Cuckoo Piaya cayana lineages in Mesoamerica. A Principal Component Analysis (PCA) of ecological niche for the lineages and the corresponding variable contribution values. Grey shading in the upper right plots depicts the density of the occurrences of each lineage by cell in the ecological space; the solid and dashed lines indicate 100 and 50% of the available (background) environment, respectively. B Plots of the similarity test comparing the ecological niche of the two lineages in both directions (P. mexicana vs. P. c. thermophila and vice versa). We only present values for the D metric for both tests. In all plots, the arrow with a red diamond represents the observed similarity value between niches; the gray columns represent 1000 randomly simulated expected values. The P value is shown in each plot (n.s., non-significant)

Discussion

Our phylogenetic analyses based on the concatenated dataset showed the existence of two main clades within P. cayana. One clade grouped all samples distributed in Mesoamerica, while the second grouped all samples from South America. The Mesoamerican clade was further subdivided into two reciprocally monophyletic groups divided by the Isthmus of Tehuantepec in southern Mexico: P. c. mexicana to the west and south and P. c. thermophila to the north and east. The pattern of differentiation into two clades is congruent with the geographic limits previously established for the two currently accepted subspecies (Ridgway 1916; Peters 1940), and presumably corresponds with contemporary ecological barriers set by a humidity gradient in the Isthmus of Tehuantepec region (Binford 1989). In addition, our complementary Cytb analyses also showed these Mesoamerican and South American clades, but placed the Chocoan sample as sister to the endemic P. c. mexicana, although with marginal support. The placement of Chocoan nigricrissa within the Mesoamerican clade corroborates the recognition of the Andes as the main barrier for diversification in P. cayana. The inclusion of nigricrissa in a broader Trans-Andean clade and sister to mexicana further supports the evolutionary divergence of the Western lineage, and may even support the species-level recognition of P. thermophila. Further analyses based on both more specimens and complete genomic datasets are needed to clarify the phylogenetic position of the Chocoan population of the polytypic P. cayana.

Some phylogeographic studies of montane birds (see Barber and Klicka 2010) and other biological groups (Ornelas et al. 2013) have suggested multiple pulses of diversification across the Isthmus of Tehuantepec, in which some of the co-distributed taxa have experienced either simultaneous or different divergence events. These studies have dated some pulses to the Early Pleistocene (1.8–0.78 mya), coinciding with our estimates of divergence between the Western and the Eastern lineages in Mesoamerica. The probable simultaneous divergence for biotas across the Isthmus of Tehuantepec (based on the overlap in the confidence intervals) suggests a vicariant event that limited gene flow for both montane and lowland populations on both sides of the Isthmus (see Miller et al. 2005). This vicariant pattern is relatively well understood in montane populations, for which an east-to-west differentiation axis has been observed (Sullivan et al. 2000; Barber and Klicka 2010; Ornelas et al. 2013). However, it is still poorly understood in lowland birds, in which both east–west and north–south axes have been detected (e.g., Phillips 1986; Binford 1989; Ríos-Muñoz and Navarro-Sigüenza 2012; Prieto-Torres et al. 2019a, b; Mendonça et al. 2021). Differentiation along these axes in this region has been related to a possible seaway across the southern Isthmus of Tehuantepec that may have affected both lowland and highland vertebrates with different dispersal capabilities. However, this hypothesis has been challenged (Mulcahy et al. 2006; Barber and Klicka 2010; but see Durham et al. 1955; Whitmore and Stewart 1965; Miller et al. 2005), opening the way for alternative explanations. Thus, ecological differentiation has become a natural subject of research (see Hernández-Canchola and León-Paniagua 2017; Hernández-Romero et al. 2018; Moreno-Contreras et al. 2020).

Several aspects of our results support genetic differentiation between the lineages (i.e., genetic distances, haplotype network, and high levels of Fst), which indicates the presence of a long-lasting vicariant barrier that maintained the differentiation of these taxa. Genetic divergence in isolation in Piaya may have been reinforced by low population sizes, as well as by presumably low dispersal capabilities. This is suggested by the relatively short and rounded wings and the extensive geographic variation throughout the apparent habitat continuity across the Neotropics (Payne 2005; Smith et al. 2014).

Morphometrics

Morphological traits may reveal relationships between organisms and their environments (Hedenström 2008; Jønsson et al. 2015). Considering that both lineages inhabit environmentally different regions (Fig. 4), bill traits may differ between the lineages in response to differences in the prey consumed, which include insects and small vertebrates (Raw 1997; Piratelli and Pereira 2002; Komar and Thurber 2003; Azevedo Silva et al. 2006; Repenning et al. 2009). These prey come from distinctive biotic assemblages in the lowlands of the two slopes. We also detected significant differences in the tail and tail color patterns between P. c. thermophila and P. c. mexicana. These two traits are important in communication and interspecific recognition among cuckoos (Payne 2005). The tail display is presumed to be a critical element of courtship, especially for members of the subfamily Coccyzinae, to which Piaya belongs. The contrast between the tail underside and the white tail tips presumably plays a significant role (Payne 1997). In addition, the color of the underside of the tail seems to be associated with sexual maturity, being pale and less defined in juveniles (Howell and Webb 1995; Payne 2005). Given the significance of the tail in communication and sexual behavior, differences in tail size and in the color of the tail underside may represent prezygotic barriers between the P. c. thermophila and P. c. mexicana lineages.

Niche differentiation and geographical barriers

The background similarity test results suggest that the environmental niches occupied by the Eastern and Western lineages are not more similar than expected by chance. This result matches findings in other co-distributed taxa (e.g., Vázquez-Miranda et al. 2009; Arbeláez-Cortes and Navarro-Sigüenza 2013; Hernández-Canchola and León-Paniagua 2017; Vázquez-López et al. 2020). Since divergence between lineages occurred during the Early Pleistocene, the final uplift of the Trans-Mexican Volcanic Belt may have led to drier conditions across the Pacific Slope (Graham and Dilcher 1995; Becerra 2005; Pennington et al. 2000; Molnar and Cane 2007). These drier environments may have enhanced the adaptation of Piaya populations, leading to geographic restriction and the evolution of many endemic lineages extending across the Pacific Slope tropical dry forests (Becerra 2005; Ramírez-Barrera et al. 2018; Prieto-Torres et al. 2019a, b; Castillo-Chora et al. 2021).

Models under past environmental conditions suggest that during the Quaternary, the Eastern lineage experienced a stronger reduction in high suitability areas than the Western lineage, which has broader habitat suitability areas. Although environmental stability areas for these two taxa were presumably extensive, there seems to have been lower connectivity for the Eastern lineage. Furthermore, there appears to have been little-to-no overlap between the stability areas for each lineage, including in the small area where the two lineages currently overlap in the Isthmus of Tehuantepec (see Fig. 3), which may explain the scarcity of intermediate birds. Because ecological niche tests rejected the niche conservatism hypothesis, it is plausible that the two lineages have inhabited different environmental spaces since at least the Late Pleistocene. Our analyses suggest that, although the Isthmus of Tehuantepec has been most widely recognized as a biogeographic limit for different montane taxa in several vertebrate groups (Sullivan et al. 2000; Barber and Klicka 2010; Ornelas et al. 2013; Rodríguez-Gómez et al. 2013; Ortiz-Ramírez et al. 2016), lowland taxa have also responded to this biogeographic boundary, likely through the evolution and maintenance of ecological niche differences, as we found in Mesoamerican Piaya.

Our analyses of niche divergence, morphometrics, and geographic range suggest that P. c. thermophila and P. c. mexicana are reproductively isolated. However, this will need further corroboration with nuclear and genomic analyses to fully detect reduction or lack of gene flow. Our results using mitochondrial genes provide evidence of little-to-no gene flow at a single locus; this does not rule out substantial gene flow in the nuclear genome, as has been found by several studies showing discrepancies between the mitogenome and the nuclear genome (Nichols 2002; Funk and Omland 2003; Ballard and Whitlock 2004; Degnan and Rosenberg 2009). Thus, studies considering the entire genome or at least a substantial part of it will be needed to fully corroborate a lack of gene flow.

Systematics

Although the taxonomic debate about the P. cayana complex has been active, to date it has relied mostly on phenotypic characters and geographic distribution. However, the study of Smith et al. (2014), based on the Cytb mitochondrial gene, suggested highly structured genetic diversity, with four genetically distinct lineages in South America and at least one in Mesoamerica. In that study, Mesoamerica was represented only by samples of thermophila; thus, relationships among taxa within Mesoamerica were not explored.

Our findings on phylogenetics, gene flow, morphometrics, and ecological niche differentiation, together with previously described differences in coloration patterns (Stone 1908; Ridgway 1916), all support the evolutionary divergence of the thermophila and mexicana lineages. The scarcity of proven intermediate individuals or hybrids despite the existence of a (relatively small) contact zone in the southeastern Isthmus of Tehuantepec may be explained by a lack of gene flow and presumably environmental niche differences. In addition, the Eastern and Western lineages show morphological differences in tail traits that are likely involved in communication and reproductive behavior (Payne 2005), and therefore may contribute to reproductive isolation. Evidence from all of these analyses suggests that the continued treatment of the Mexican lineages of P. cayana as a single species (e.g., Peters 1940; American Ornithologists’ Union 1998) does not accurately reflect the evolutionary divergence of these groups in Mesoamerica. Our mtDNA phylogenetic trees suggest that both P. c. mexicana and P. c. thermophila should be recognized as separate species from P. cayana (del Hoyo et al. 2017; Gill and Donsker 2017). Furthermore, the placement of the Chocoan nigricrissa within the Trans-Andean clade underscores the need for a more complete sampling of both P. c. thermophila and P. c. nigricrissa in southern Central America and the Chocó region before this can be determined. Our Cytb tree suggests that the monophyly of P. c. thermophila still needs to be proven, due to the close geographic proximity of P. c. nigricrissa on the Ecuadorian Pacific Slope of the Andes (Stone 1908).

As evidenced by our set of analyses, a provisional but potentially more accurate taxonomic scheme for Piaya would be the recognition of both mexicana and thermophila (and perhaps nigricrissa) as independent evolutionary lineages that may be recognized under different species concepts (Cracraft 1983; De Queiroz 2007; Gill 2014). Estimates of genetic distance and divergence times in these lineages are similar to or even greater than those found in widely recognized full species of cuckoos or other bird groups (e.g., Johnson and Cicero 2004; Fossøy et al. 2016).

Conclusions

Our phylogenetic analyses suggest that Mexican populations of P. cayana represent at least two separate lineages, which are independent from Cis-Andean South American lineages (see Smith et al. 2014). Each Mesoamerican lineage shows diagnostic characters, and several lines of evidence suggest highly reduced gene flow. Estimated divergence dates indicate that the split between thermophila and mexicana occurred during the Pleistocene (0.78–1.79 mya). The decrease in habitat extent and connectivity promoted by Pleistocene climatic variation may have enhanced pulses of geographic isolation and the evolution of morphometric differences; these differences are probably involved in reproductive behavior and may also have prevented gene flow.