The role of environmental filtering, geographic distance and dispersal barriers in shaping the turnover of plant and animal species in Amazonia

To determine the effect of rivers, environmental conditions, and isolation by distance on the distribution of species in Amazonia. Location: Brazilian Amazonia. Time period: Current. Major taxa studied: Birds, fishes, bats, ants, termites, butterflies, ferns + lycophytes, gingers and palms. We compiled a unique dataset of biotic and abiotic information from 822 plots spread over the Brazilian Amazon. We evaluated the effects of environment, geographic distance and dispersal barriers (rivers) on assemblage composition of animal and plant taxa using multivariate techniques and distance- and raw-data-based regression approaches. Environmental variables (soil/water), geographic distance, and rivers were associated with the distribution of most taxa. The wide and relatively old Amazon River tended to determine differences in community composition for most biological groups. Despite this association, environment and geographic distance were generally more important than rivers in explaining the changes in species composition. The results from multi-taxa comparisons suggest that variation in community composition in Amazonia reflects both dispersal limitation (isolation by distance or by large rivers) and the adaptation of species to local environmental conditions. Larger and older river barriers influenced the distribution of species. However, in general this effect is weaker than the effects of environmental gradients or geographical distance at broad scales in Amazonia, but the relative importance of each of these processes varies among biological groups.


Introduction
Identifying and understanding patterns in species distributions is essential for conservation planning and has long been recognized as crucial for defining conservation strategies in Amazonia (Guisan and Zimmermann 2000;Guisan and Thuiller 2005). Amazonian forests exhibit considerable internal heterogeneity (Emilio et al. 2010), but general knowledge of the distribution of the Amazonian biota is still limited: collection density is low, and taxonomically and geographically biased (Nelson et al. 1990;Hopkins 2007). Moreover, there is no consensus on the role of environmental and historical factors in predicting species composition at different spatial scales. Therefore, biogeographical studies can identify areas with unique sets of species and help to achieve the goal of preserving a representative mosaic of Amazonian habitats and the species they harbor.
How and why species composition varies among sites are some of the most frequent questions in ecology and biogeography. At broad scales, the distribution of organisms in space results from synergistic effects of species adaptations to the environment (Nekola and White 1999;Tuomisto et al. 2003) and diversification due to dispersal limitation (Hubbell 2001;Warren et al. 2014). While deterministic species responses to environmental conditions can give rise to patchy species distributions (Tuomisto et al. 2003), dispersal limitation and allopatric speciation can lead to differences in species composition across barriers and distant areas (Hubbell 2001;Warren et al. 2014).
In Amazonia, the most obvious potential dispersal barriers for terrestrial organisms are large rivers and associated floodplains. Accordingly, the Amazon River and its main tributaries have been recognized as important boundaries for the distribution of vertebrates for more than a century (Wallace 1852;Haffer 1974;Cracraft 1985;Moritz et al. 2000;Ribas et al. 2012;Boubli et al. 2015). The hypothesis that the development of the drainage system was a driver and maintainer of this pattern through allopatric speciation and/or preventing secondary contact between distinct populations (Wallace 1852;Ribas et al. 2012;Naka and Brumfield 2018) has been supported by occurrence (Cracraft 1985;Pomara et al. 2014) and phylogenetic data (Aleixo 2006;Ribas et al. 2012;Fernandes 2013;Naka and Brumfield 2018;Silva et al. 2019) for understory upland forest birds. The unique composition of understory bird and primate communities in different interfluves has led to the division of Amazonia into bird endemism areas delimited by large rivers, such as the Madeira, Tapajós, Rio Negro and Amazonas (Cracraft 1985;da Silva et al. 2005a, b). These divisions are widely used in conservation planning and are among the criteria for the definition of Amazonian ecoregions (Dinerstein et al. 2017).
Although the position of large Amazonian rivers matches the limits of the distributions of many understory birds and primates (Wallace 1852; Boubli et al. 2015; Silva et al. 2019; Maximiano et al. 2020), there has been controversy about the degree to which species distribution patterns are related to rivers, especially when extrapolating to a wide range of organisms (Oliveira et al. 2017; Santorelli et al. 2018). For example, Santorelli et al. (2018) found that, of almost 2000 species in the 14 taxonomic groups studied, only 4 species with detectability above 50% had their distributions delimited by the Madeira River. These 4 species were 2 birds and 2 primates; no plant, invertebrate, or herpetofaunal species were found to have limits associated with the river at the studied localities. Intraspecific genetic structure associated with the position of the Rio Negro was reported for one tree species (Nazareno et al. 2017, 2019). Other studies have documented little to no river-barrier effect on ants (Souza et al. 2016; but see Winston et al. 2017), lizards (Souza et al. 2013), plants (Pomara et al. 2014; Tuomisto et al. 2016) and termites (Dambros et al. 2017) at the community level. For these taxa, the main causes of differences in species distributions were associated with geographic distance (isolation by distance) or environmental differences.
Even in the absence of dispersal barriers, different parts of Amazonia may harbor different floras and faunas simply due to isolation by distance; natural populations are never panmictic because individuals typically disperse only a limited distance from where they are born (Hubbell 2001). In addition, spatially structured environmental heterogeneity related to factors such as soil properties and climate can lead to differences in species-assemblage composition because distinct sets of species are favored in different environmental settings (Leibold and Mikkelson 2002; Tuomisto et al. 2003; Zuquim et al. 2014). Although some studies have controlled for both riverine barrier position and environmental heterogeneity in specific taxa (Pomara et al. 2014; Tuomisto et al. 2016; Maximiano et al. 2020), few studies have so far tried to disentangle the combined influence of dispersal barriers, geographic distance and environmental heterogeneity for a broad range of distinct taxonomic groups (but see Gascon et al. 2000). Nevertheless, to conclude that an observed pattern is due to a single cause, it is important to consider the alternatives. Thus, we here address the still open question of to what degree rivers have constrained species movements through time, and to what degree environmental filtering, triggered by environmental differences among areas sampled on different sides of rivers, drives species distributions for different taxonomic groups (Tuomisto and Ruokolainen 1997; Colwell 2000).
Species responses to the presence of barriers and environmental conditions are influenced by their dispersal capacity and width of tolerance to abiotic gradients (Pomara et al. 2014). However, biogeographic studies in Amazonia have generally tackled only one or a few taxa at a time, limiting their conclusions to the taxonomic group studied. Given that differences in sampling region, sampling design, length of environmental gradients and spatial extent among studies for different taxa can influence the results (Gilbert and Bennett 2010;Tuomisto et al. 2012), comparisons among these studies are questionable. Contrasting findings among different taxonomic groups or species may reflect different responses of taxonomic groups to the environment or to the presence of dispersal barriers. However, they may also be a consequence of differences in sampling schemes or in statistical methods employed (Fortin and Dale 2009). So far, no wide-scale comprehensive multi-taxa standardized assessment of the role of geographical distance, environment, and rivers to Amazonian biodiversity has been carried out. To draw general biogeographic conclusions, data collected using standardized protocols over large areas are necessary.
To understand how riverine barriers, contrasting environments (Tuomisto and Ruokolainen 1997;Tuomisto 2007), and spatial distance relate to the patterns of distribution of species belonging to a broad range of taxonomic groups, we integrate occurrence and abundance data collected using the same spatial grain. The analyses include data on three plant groups, three invertebrate groups, and three vertebrate groups sampled in forests and streams across all the major biogeographic regions of Amazonia. All surveys were based on a standardized sampling design, which allowed comparison of most taxa across the same river boundaries and along the same climatic and geographic gradients.

Sampling design
We compiled data generated by researchers from the Brazilian Biodiversity Research Program (PPBio), which adopts standardized protocols to create a comparable multi-taxa dataset. We sampled nine Amazonian lowland taxa: birds, fishes, bats, ants, termites, butterflies, ferns + lycophytes (hereafter called ferns, for simplicity), gingers (Zingiberales), and palms. A total of 822 plots were sampled, and these were placed in 32 sampling grids of 5 to 72 plots each (Fig. 1). Within each grid, plots were regularly spaced, and the nearest neighboring plots were separated by a distance of 1 km. The distance between grids varied from dozens of kilometers to 1850 km, and grids were spread over an area encompassing about 2 million km². Most plots represented tall, dense, lowland terra-firme tropical forest, but a few were established in white-sand vegetation that has a simpler structure (locally known as Campinas and Campinaranas).
For terrestrial taxa, each plot had a 250-m-long centerline following the terrain contour, to minimize within-plot variation in edaphic conditions. Plot width was adjusted for each taxonomic group due to differences in species density, diversity, and detectability. Aquatic plots were established in forest streams that are not subject to seasonal floods. Fishes were sampled in 50-m-long stretches along each stream (Mendonça et al. 2005). Except for one location, streams surveyed for fishes were located within the same sampling grids where other taxa were sampled and, in most cases, other taxa were sampled in plots adjacent to the streams. Although not all terrestrial taxa were sampled in every plot, all plots were surveyed for at least two biological groups, and each taxon was surveyed in at least two Amazonian endemism areas (Cracraft 1985). Moreover, there was overlap in the geographic distribution of surveys for all taxa. The number of plots sampled for each biological group varied from 45 (butterflies) to 475 (gingers), and the number of sampling grids varied from 4 (birds) to 20 (gingers) (Table S1).
Fig. 1 (caption fragment) ... Bates et al. (2004). Maplets show the plots (red hexagons) sampled for each of the groups

Biological surveys
Birds, fishes, bats, ants, termites, butterflies, ferns, gingers, and palms were sampled along a 250-m central line. The database of each biological group and species identities were carefully reviewed and taxonomically harmonized by specialists.
Understory birds, bats, butterflies, and ants were sampled using traps installed along the 250-m central line. Birds and bats were sampled with mist nets with a mesh of 32 mm for birds and 19 mm for bats. Butterflies were captured using Van Someren-Rydon traps with fermented-fruit baits installed every 50 m along the central line and checked daily for 5 consecutive days. Ants were captured using pitfall traps, leaf-litter samples (Winkler sacks), and sardine baits on plastic plates. These three sampling methods tend to collect distinct ant assemblages according to species foraging mobility, which also reflects dispersal abilities. Pitfalls tend to trap relatively larger mobile forager ants, leaf-litter samples collect small cryptic and specialist ant fauna (Bestelmeyer et al. 2000), and sardine baits capture a small set of dominant ants from both groups (Baccaro et al. 2010). While information about ant dispersal abilities is scarce, there is evidence that larger winged individuals may fly longer distances (Helms 2018). Therefore, analyses were done separately for the ant data obtained with each sampling method, and hereafter ants are referred to as mobile-forager (pitfall), cryptic (leaf-litter), or dominant (bait) ants. Fishes were sampled in 50-m-long aquatic plots that were blocked at both ends with fine-mesh nets, and fishes were then collected during daylight hours using seine and hand nets (Mendonça et al. 2005). Fish species were classified into categories of low or high dispersal capacity that were analyzed separately. We classified fish species with small body size, restricted habitat use and poor swimming capacity as "low dispersal capacity" species, and all others as "high dispersal capacity" species (Radinger and Wolter 2014). The adopted widths of plots for termites, palms, gingers, and ferns were 2, 4, 4, and 5 m, respectively, and all were sampled along the entire 250-m centerline.
Termites were sampled in 5 to 10 sections of 5 × 2 m interspaced along the centerline, with active search for nests lower than 2 m above ground, in tree logs, branches, soil, and leaf-litter. The soil was dug to a maximum depth of 50 cm. For ferns + lycophytes and gingers, all individuals with a leaf longer than 10 cm were counted and identified. Palm individuals with a minimum height of 1 m above ground to the tip of the highest leaf were also identified and counted. A more detailed description of the sampling methodology and species identification for each group can be found in Supporting Information S1 and in Menger et al. (2017; understory

Measured environmental variables: soil and water properties
In every terrestrial plot, topsoil (max. 10 cm depth) samples were taken every 50 m. In each plot, the samples were either mixed and taken to the soil laboratory to be analyzed as a single sample or analyzed separately, in which case average values of the six samples were used to represent the soil characteristics of the plot. The composite sample was analyzed for soil clay content and exchangeable base-cation concentration (Ca, Mg, and K; Na concentrations were below detection limit) in the Thematic Laboratory of Soils and Plants at the National Institute for Amazonian Research (INPA), Manaus. Soil samples were not taken for 18 plots (2% of all plots) and for these, we extracted base-cation concentrations from a digital map (Zuquim 2017) that uses both direct soil measures and estimations based on the occurrence of soil-indicator fern species. To avoid circularity, fern + lycophyte inventories for the 18 plots from which soil information was derived from a digital map were not included in the analysis. Aquatic plots were environmentally characterized by the following water characteristics: pH, electric conductivity, dissolved oxygen, and temperature. These water variables were measured in the center of the stream channel and in the middle of the water column. Electrical conductivity and pH were measured using a portable Aqua-Check™ Water Analyzer Operator (O.I. Analytical, College Station, TX, U.S.A.). Dissolved oxygen and temperature were measured using a Yellow Springs Instruments® (Yellow Springs, OH, U.S.A.) model 58 portable oxygen meter thermometer.

Retrieved environmental variables: tree cover and climate
We obtained percentage tree cover as modeled using the Advanced Very High Resolution Radiometer data (AVHRR; USGS 2017) to account for differences in vegetation structure. Climatic data were extracted from the Climatologies at High Resolution for the Earth's Land Surface Areas (CHELSA; Karger et al. 2017; https://chelsa-climate.org/, accessed 14/May/2017). We chose maximum temperature of the warmest month (bioclim 5), minimum temperature of the coldest month (bioclim 6) and the precipitation of the driest quarter (bioclim 17) out of the 19 bio-climatic (bioclim) variables available in CHELSA (Karger et al. 2017). We selected only these climatic variables to avoid an excessive number of correlated variables in the model and because these are the climatic variables that have been found to be strongly associated with the distributions of several taxa (Janzen 1967; Šímová et al. 2011). Bioclim 6 and 17 were highly correlated (r = 0.65, p < 0.001); therefore, only bioclim 5 and bioclim 17 were included as independent predictors in subsequent models (Dormann et al. 2013).
All non-climatic variables obtained for terrestrial (tree cover, soil clay content, and soil bases) and aquatic plots (pH, electric conductivity, dissolved oxygen and temperature) were only weakly correlated with each other (|r| < 0.36) and were used as independent predictor variables in regression models.

Classification of areas based on Amazonian rivers
Amazonia has been subdivided into areas of endemism based on the distribution of understory upland forest birds (Haffer 1974;Cracraft 1985;da Silva et al. 2005a, b). To test the relevance of these areas for Amazonian biota in general, we assigned each sampling plot to the corresponding bird area of endemism and used area membership as a predictor variable in subsequent analyses. A few plots in southern Amazonia were positioned in areas that have not previously been classified into endemism areas. Because earlier studies suggest that the Teles-Pires River is a dispersal barrier for some bird species (Bates et al. 2004), we added a new region to the South of the Tapajós endemism area using the Teles-Pires River as a boundary between the two areas and assigned the southernmost plots to this endemism area (Fig. 1).

Analyses
The analytical approaches developed to tease apart the relative roles of spatial and environmental variables in species composition can be divided into two conceptually different groups: distance-based methods and raw-data-based methods. In distance-based methods, the response variable is a pairwise dissimilarity matrix (n × n matrix) in which each site is compared with all other sites in turn, and the cell values quantify the degree of compositional dissimilarity between the two corresponding sites. The explanatory variables in distance-based methods are distance matrices quantifying geographical distance or degree of environmental difference between pairs of sites (Lichstein 2007; Tuomisto and Ruokolainen 2006). In raw-data methods, the response variable is a matrix where the rows represent objects (sites) and columns represent descriptors (often species). The explanatory variables are spatial coordinates and measured or estimated values of environmental factors (Borcard et al. 1992; Borcard and Legendre 2002; Dray et al. 2006). Distance-based and raw-data methods answer conceptually different questions: the first asks whether differences in the response variable vary in relation to differences in the other factor, whereas the second asks directly whether the response variable varies in relation to the explanatory variables. There has been controversy about the relative merits and interpretation of the results from these methods (Tuomisto and Ruokolainen 2006; Laliberté 2008). As each approach targets a different null hypothesis, we used both approaches to obtain a more complete view of the distribution patterns of Amazonian biodiversity.
In both the distance-based and raw-data analyses presented below, we included the same predictor variables to test the effect of spatial isolation, environment, and barriers. In distance-based analysis, we used geographical distances calculated from geographical positioning to represent the effects of isolation by distance, the environmental distances to represent the effects of the environment, and differences in areas of endemism between plots to represent the effect of river barriers. In raw-data analyses, spatial coordinates, individual environmental variables, and endemism areas were directly used as predictor variables (see details below).

Distance-based analysis
In a first set of analyses, we asked what determines compositional dissimilarity between plots. We quantified compositional dissimilarity between plots using 1-Jaccard index, which is based on the proportion of unique species out of the total number of species observed in the two plots being compared. The Jaccard-based dissimilarity matrices were calculated for each taxonomic group separately. The explanatory distance matrices were also produced for each taxonomic group separately because different groups were surveyed in partly different sets of plots.
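As an illustration of the dissimilarity measure described above, the following is a minimal sketch in Python (standard library only). The plot names, species labels, and function names are hypothetical, and the treatment of two empty plots is an assumption of the sketch, not something stated in the text.

```python
from itertools import combinations

def jaccard_dissimilarity(species_a, species_b):
    """Return 1 - Jaccard similarity for two collections of species names."""
    a, b = set(species_a), set(species_b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty plots are treated as identical
    return 1.0 - len(a & b) / len(union)

def dissimilarity_matrix(plots):
    """Build a symmetric n x n matrix (list of lists) from a dict of plot -> species."""
    names = sorted(plots)
    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = jaccard_dissimilarity(plots[names[i]], plots[names[j]])
        matrix[i][j] = matrix[j][i] = d
    return names, matrix

# Hypothetical presence data for three plots of one taxonomic group
example_plots = {
    "plot_A": {"sp1", "sp2", "sp3"},
    "plot_B": {"sp2", "sp3", "sp4"},
    "plot_C": {"sp5"},
}
names, jaccard = dissimilarity_matrix(example_plots)
```

In this toy example, plot_A and plot_B share two of four species in total, so their dissimilarity is 0.5, whereas plot_C shares no species with the others and has a dissimilarity of 1.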
To test whether species dissimilarity of plots was related to geographic distance, environmental difference, or separation by a major river, we created distance matrices based on geographic location, environmental variables, and area of endemism. Each plot was georeferenced in the field using GPS and the coordinates were used to construct a matrix of geographic distances. Distance values were logarithmically transformed prior to analysis to account for the tendency of distance decay to become slower at larger distances (Nekola and White 1999;Hubbell 2001;Tuomisto et al. 2003). Environmental-distance matrices were calculated as Euclidean distances based on environmental data. Soil bases were log-transformed prior to the calculation of the environmental distances because soil cations are usually more limiting at lower levels than at higher levels. We calculated the environmental distances independently for each variable as a simple difference between the values of each site. The endemism distance was defined as zero between plots in the same endemism area and one for plots in different endemism areas.
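The construction of the three kinds of predictor distances for a single pair of plots can be sketched as follows. This is an illustrative sketch only: the great-circle (haversine) formula, the natural logarithm, the Earth radius, and all names and example values are assumptions, since the text does not specify how geographic distances were computed or which logarithm base was used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def pairwise_predictors(plot_a, plot_b):
    """Predictor distances for one pair of distinct plots (dicts with hypothetical keys)."""
    geo_km = great_circle_km(plot_a["lat"], plot_a["lon"], plot_b["lat"], plot_b["lon"])
    return {
        # log-transformed geographic distance (natural log assumed; undefined for co-located plots)
        "log_geo_distance": math.log(geo_km),
        # single-variable environmental distance: absolute difference of log-transformed soil bases
        "soil_base_distance": abs(math.log(plot_a["soil_bases"]) - math.log(plot_b["soil_bases"])),
        # endemism distance: 0 within the same endemism area, 1 across areas
        "endemism_distance": 0 if plot_a["endemism_area"] == plot_b["endemism_area"] else 1,
    }

# Hypothetical plots on opposite sides of a river
plot_1 = {"lat": -3.0, "lon": -60.0, "soil_bases": 0.5, "endemism_area": "area_X"}
plot_2 = {"lat": -3.0, "lon": -59.0, "soil_bases": 2.0, "endemism_area": "area_Y"}
distances = pairwise_predictors(plot_1, plot_2)
```

Because each environmental variable is handled separately, the Euclidean distance reduces to an absolute difference between the two plot values.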
For each organismal group, we tested for the association between dissimilarity in species composition and the predictor distance matrices using distance-based multiple regressions and variance partitioning. Significance values were calculated by permutation using the MRM function in the ecodist package of R (Goslee and Urban 2007), which tests for the significance of each predictor variable after controlling for the effect of the other variables in the model. P-values calculated using permutation (i.e. a non-parametric test) do not assume normality in model residuals (Legendre et al. 1994); therefore, we did not test for normality.
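The logic of a permutation test on distance matrices can be sketched in a simplified form. The sketch below is not the MRM function of ecodist: it handles a single predictor matrix rather than multiple regression, uses the Pearson correlation (which equals the standardized regression slope in the single-predictor case), and permutes plot labels of the response matrix so that rows and columns are shuffled together. All function names, the number of permutations, and the toy data are assumptions.

```python
import math
import random

def lower_triangle(matrix, order=None):
    """Flatten the lower triangle, optionally after relabelling plots by `order`."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(1, n) for j in range(i)]

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_test(response, predictor, n_perm=999, seed=1):
    """Two-sided permutation p-value for the association of two distance matrices."""
    rng = random.Random(seed)
    x = lower_triangle(predictor)
    observed = pearson_r(x, lower_triangle(response))
    n = len(response)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)  # same shuffle applied to rows and columns
        permuted = pearson_r(x, lower_triangle(response, order))
        if abs(permuted) >= abs(observed) - 1e-12:
            hits += 1
    return observed, hits / (n_perm + 1)

# Toy example: five plots on a line, dissimilarity proportional to distance
sites = [0, 1, 2, 3, 4]
geo = [[abs(a - b) for b in sites] for a in sites]
comp = [[0.1 * abs(a - b) for b in sites] for a in sites]
r_obs, p_value = permutation_test(comp, geo)
```

Permuting whole plots rather than individual cells preserves the dependence among the pairwise values that involve the same plot, which is why ordinary parametric p-values are not used for distance matrices.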
Differences in climate variables (temperature and precipitation) between plots were correlated with geographical distance (r > 0.6, p < 0.05), so it was not possible to separate their effects on compositional dissimilarity for most groups (Fig. S3). Therefore, we will discuss the potential effects of both variables together. Moreover, the maximum temperature in the warmest month and precipitation in the driest quarter were strongly correlated with geographical distance and were not included in multiple regression models. Because differences in biogeographical regions were correlated with geographical distance, we avoid comparing the coefficients of these variables in a single model. We test the individual effects of these variables in simple regression models using distance matrices (single variable) and compare their shared effects in variance partitioning analyses. We therefore discuss their individual and shared contributions to changes in species composition.

Raw data analysis
In a second set of analyses, we took the raw-data approach and used the first axis of a Principal Coordinates Analysis (PCoA) based on the Jaccard dissimilarity matrix as a response variable for each taxon separately in multiple-regression models. In this case, the explanatory environmental variables for terrestrial organisms were tree cover, soil base-cation concentration, and soil clay content. For fishes, we used water temperature, conductivity, pH, and dissolved oxygen. Endemism area was used as a categorical predictor variable representing the effect of rivers. Latitude and longitude were included as predictors to account for spatial gradients. Even without including high-order polynomials to represent the association of spatial gradients with the response variables, there was no spatial autocorrelation in model residuals for most groups (see results). However, spatial autocorrelation was present in the residuals for termites and gingers even after the inclusion of latitude and longitude as predictors. For these groups we used a Moran Eigenvector Map (MEM) analysis (Dray et al. 2012; Legendre and Gauthier 2014) to account for spatial autocorrelation in the residuals of the response variable. Spatial autocorrelation was calculated using Moran's I. The residual spatial autocorrelation was only associated with fine-scale MEMs (MEMs with negative associated eigenvalue; Dray et al. 2012). Therefore, the removal of spatial autocorrelation did not change the broad-scale results shown in the simpler model using latitude and longitude as predictors. To simplify analyses, use more easily interpretable spatial variables and make the results of all groups comparable, we only used the models with latitude and longitude as predictors. For each taxon, models containing all possible combinations of predictor variables were compared using the corrected Akaike Information Criterion (AICc).
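The comparison of all predictor combinations by AICc can be sketched as below. The sketch assumes a Gaussian linear model fitted by least squares and counts the residual variance as one estimated parameter; the text does not state which implementation or parameter-counting convention was used, and the function and predictor names are hypothetical. Model fitting itself is omitted: the residual sum of squares is taken as given.

```python
import math
from itertools import combinations

def aicc_from_rss(rss, n_obs, n_coef):
    """AICc of a Gaussian least-squares model.

    n_coef counts regression coefficients including the intercept; the
    residual variance is counted as one additional parameter (assumption).
    """
    k = n_coef + 1
    if n_obs - k - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    log_lik = -0.5 * n_obs * (math.log(2 * math.pi) + math.log(rss / n_obs) + 1)
    aic = -2 * log_lik + 2 * k
    return aic + (2 * k * (k + 1)) / (n_obs - k - 1)

def all_predictor_subsets(predictors):
    """Every combination of predictors, from the intercept-only model to the full model."""
    return [subset
            for size in range(len(predictors) + 1)
            for subset in combinations(predictors, size)]

# Hypothetical predictor set for a terrestrial taxon
candidates = all_predictor_subsets(["tree_cover", "soil_bases", "soil_clay"])
```

With three predictors this yields eight candidate models; each would be fitted to the first PCoA axis and ranked by its AICc, with lower values indicating better support.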
All environmental and spatial variables were standardized before analyses to zero mean and unit variance.
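Standardization to zero mean and unit variance can be sketched as follows. The use of the sample standard deviation (n − 1 denominator) is an assumption, as are the function name and the example values.

```python
from statistics import mean, stdev

def standardize(values):
    """Rescale a list of numbers to mean 0 and sample standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # sample SD with n - 1 denominator (assumption)
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]

# Hypothetical clay-content values (%) for five plots
clay_z = standardize([12.0, 35.0, 48.0, 60.0, 75.0])
```

After this rescaling, regression coefficients are expressed in standard-deviation units and can be compared among predictors measured on different scales.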
Biogeographic units are spatially structured, and it was virtually impossible to separate the effect of spatial positioning and area of endemism using raw-data approaches. Therefore, we conducted analyses using geographic positioning only or area of endemism only as predictor variables separately, each one along with environmental predictors. To test for differences in species composition between biogeographic units, we also ran a posteriori Tukey test comparing each pair of regions and correcting p-values for the use of multiple comparisons. Because the Amazon river is the largest and oldest river in the region (Hoorn et al. 2010) and has been hypothesized to have stronger effects on species composition than other rivers (e.g. Fluck et al. 2020), we expected differences between regions separated by the Amazon river to be stronger than between other regions. We present results from the Tukey test with this distinction (see ''Results'' section).
To investigate if differences in geographic extent of the data could affect the observed relationships between taxa and predictor variables, we re-ran all analyses described above with a subset of plots. Out of the whole dataset, we selected plots in order to restrict the geographic extent of the sampling of each taxonomic group to the same geographic extent of palms, which was the group with the smallest extent. Most results were qualitatively similar and are presented in Supporting Information II. We here present the results for the whole dataset analysis and focus on the results that are consistent regardless of the sampling extent.
To compare model coefficients between all predictors and taxonomic groups, all predictor and response variables were standardized (mean = 0; SD = 1) in both distance-based and raw-data-based approaches.
All analyses were carried out in the R environment (R Core Team 2019) using the packages vegan (Oksanen et al. 2018) and ecodist (Goslee and Urban 2007) and functions created to automate the process of running multiple regression models on distance matrices for each taxonomic group.

Results
We found a total of 1,889 species or morphospecies in the 822 sampled plots. Wide-foraging ants were the group with the highest number of species per plot and in total (Table S1). Species-accumulation curves did not approach an asymptote for any of the taxa investigated (Fig. S1). The sampling of all taxa covered practically the same amplitude of climatic gradients, but the sampled gradient in soil-cation concentration was one to two orders of magnitude broader for butterflies, ferns + lycophytes, and gingers than for the other taxa (Table S1).
The strongest decay in compositional similarity with distance was observed for fish and palms (decay from 0 to 500 km: Fish low = 0.23, Fish high = 0.17, Palms = 0.20; Fig. 2; Fig. S2). These taxa also had the highest similarities between nearby sites (intercept for Jaccard index of 0.44 for palms and 0.35 for fishes; Fig. 2). A strong decay in similarity with geographic distance was also observed in ferns + lycophytes and gingers (Fig. 2), but the distance decay was weak or non-existent for bats (Fig. 2; Table 1; Fig. S2; Table S3).
In regression models using distance matrices, geographical distance explained more than 17% of the variation in dissimilarity for all taxa except bats, for which geographic distance explained only 6%. Using this approach, geographical distance was associated with decay in species similarity for all taxa even after controlling for the effect of the measured environmental variables (Table 1), but the association was much weaker for bats than for any other group. If only geographical distance and climate were included in the variation partitioning model, geographical distance alone explained slightly more variation in species composition than climate alone for all groups except bats, in which climate was more important (Fig. S3). However, geographical distance and climate variables were highly correlated (r > 0.6 for all taxa) and explained a similar portion of the variance in species composition for most groups, so it was difficult to disentangle the unique contribution of these variables to species turnover.
When the variance in compositional dissimilarities was partitioned among the predictor variables in multiple regression of distance matrices, endemism area was the factor with the highest unique contribution to explain changes in the dissimilarity of bird species (Fig. 3) and also had some explanatory power for palms, dominant ants, butterflies, and high-dispersal fish, but not for the other groups. Differences in soil exchangeable base-cation concentration and tree cover had a unique contribution to compositional dissimilarities in all taxa, except birds and butterflies. The highest unique contribution of soil cations was observed in mobile-forager ants (pitfall), ferns and palms (Fig. 3, Table 1). Distance/climate explained a large and significant part of the variation in species dissimilarity even after controlling for the effects of the other variables for all groups except birds, bats, and low-dispersal-capacity fishes (Fig. 3). After controlling for spatial/climatic distances, part of the variation in compositional differences of all taxa could also be explained by differences in soil base-cation concentration (terrestrial species), water properties (fishes), tree cover and endemism area (Fig. 3).
Fig. 2 Decay of similarity in species composition with geographical distance for birds, fishes, bats, ants, termites, butterflies, ferns, gingers and palms in Brazilian Amazonia. The similarity in species composition was quantified by subtracting dissimilarity values from unity (1-dissimilarity) using the Jaccard dissimilarity index calculated for each pair of plots
Table 1 Distance-based regression coefficients relating indices of dissimilarity in species composition of birds, fishes, bats, ants, termites, butterflies, ferns, gingers and palms to geographical and environmental distance matrices in multiple regression models in Brazilian Amazonia. Dissimilarity in species composition was measured by the pairwise Jaccard dissimilarity index based on presence and absence data. All predictor and response variables were standardized (sd = 1; mean = 0) for each individual group before analyses. Maximum temperature in the warmest month and precipitation in the driest quarter were strongly correlated with geographical distance and were not included in multiple regression models (see Table S2 for simple regression coefficients). For all taxa except bats, geographical distance was a better predictor of species dissimilarity than climatic variables (Table S2). Significance of coefficients was calculated using permutation tests. Coefficients marked with * and in bold correspond to values that were significant at p-value < 0.05. We defined the ant assemblages as follows: dominant ants were those sampled with bait; mobile-forager ants were those sampled with pitfall traps; and cryptic ants were those sampled with Winkler extractors
Fig. 3 Venn diagrams showing the relative contributions (r²) of three groups of explanatory distance matrices to explaining the variation in community compositional dissimilarities (Jaccard index) of birds, fishes, bats, ants, termites, butterflies, ferns, gingers and palms in Brazilian Amazonia. Relative contributions were determined using multiple regressions on distance matrices. Overlapped areas represent the amount of variance in the response variable that was jointly explained by two or more groups of factors. Environmental distance matrices were calculated separately for each individual predictor variable: tree cover, soil clay concentration, and the sum of exchangeable cations (for terrestrial taxa), and for dissolved oxygen in the water, pH, temperature, and conductivity (for fishes)
Table 2 Association of the first Principal Coordinates Axis (PCoA) of species composition of birds, fishes, bats, ants, termites, butterflies, ferns, gingers and palms with endemism regions, soil clay content, soil bases and tree cover (environmental variables), and latitude and longitude (spatial variables) in Brazilian Amazonia. Coefficients were obtained by using an Analysis of Covariance (ANCOVA). Model selection was based on AICc values and only variables included in the best model for each group are shown. The ranking of all models with delta AICc < 4 is presented as supporting material (Table S3). For termites and gingers, fine-scale spatial autocorrelation was found in model residuals. For these taxa, the use of MEMs instead of latitude and longitude values produced similar results (Table S3)
The first PCoA axis representing the composition of species (Fig. 4) captured between 8% (mobile-forager ants) and 26% (palms) of the compositional dissimilarities between plots. Explanatory variables in raw-data-based regression models were able to capture more than 40% of the variation in community composition summarized in the first PCoA axis for all taxa (Table 2). Birds, palms, cryptic ants, and fishes with high dispersal capacity had more than 80% of the variance in their first PCoA axis explained by the environmental, spatial and biogeographic variables included in the models. For low-dispersal-capacity fishes, bats, dominant and cryptic ants, termites, butterflies, and ferns, less than 10% of the variance in the first PCoA axis could be uniquely explained by endemism areas (Table 2). In contrast, for birds, areas of endemism uniquely explained nearly 20% of the variation in PCoA axis 1, while environmental variables and space per se played virtually no unique role (Table 2). Among animals, birds were also the group with the clearest compositional differences between Inambari and Guiana endemism areas (which are separated by the Amazon River) (Table 2). The variation in species composition between endemism areas limited by the Amazon River was strong for all animal groups except termites (Fig. 5; Tables 2 and S4: contrast between Inambari and Guiana). Among plants, Inambari and Guiana endemism areas had consistently different communities, but the differences between other endemism areas were not consistent among the plant groups (Fig. 5; Tables 2 and S4: contrast between Inambari and Guiana).
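For readers unfamiliar with how a first PCoA axis and its share of compositional variation are obtained, the following Python sketch shows classical PCoA computed from a dissimilarity matrix by double-centering and eigendecomposition. It is an illustration under stated assumptions: the function name `pcoa` is hypothetical, negative eigenvalues (which non-Euclidean indices such as Jaccard can produce) are simply dropped, and the proportion explained is expressed relative to the positive eigenvalues only. The analyses reported here may have used a different convention or correction.

```python
import numpy as np


def pcoa(dist):
    """Classical principal coordinates analysis of a symmetric
    dissimilarity matrix. Returns plot scores and the proportion of
    variation captured by each retained axis."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Gower double-centering of the squared dissimilarities.
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]       # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumption: keep only axes with clearly positive eigenvalues.
    keep = eigvals > 1e-10
    scores = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals[keep].sum()
    return scores, explained


# Hypothetical dissimilarities among four plots.
toy_dist = np.array([
    [0.0, 0.2, 0.6, 0.9],
    [0.2, 0.0, 0.5, 0.8],
    [0.6, 0.5, 0.0, 0.4],
    [0.9, 0.8, 0.4, 0.0],
])
toy_scores, toy_explained = pcoa(toy_dist)
print("Share of variation on axis 1:", round(float(toy_explained[0]), 2))
```

The scores on the first axis (`toy_scores[:, 0]`) are the kind of one-dimensional summary that can then be used as the response variable in a raw-data-based regression.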
When restricting the analysis to the geographical extent of palm samples, most of the results were similar and suggest the same relative role of rivers, environment and geographical distance as observed in the analysis with the full dataset (Supporting Information II). However, restricting the data to central Amazonia, where all taxa have been sampled, reduced the variance explained by soil cations and distance/climate (Supporting Information II, Fig. SII-3), probably due to the shorter length of the gradient sampled as a result of the lack of samples in nutrient-rich soils and in more seasonal areas.

Discussion
Amazonian landscape evolution affects patterns in species distribution in several ways: by determining current environmental conditions (Tuomisto et al. 2003;Pomara et al. 2014), by imposing dispersal barriers (Ribas et al. 2012) and by constraining how far a species can establish from its center of origin (Dambros et al. 2017). When species movement is not limited by geographical barriers or distance, species can be found in all habitats in which they are adapted to survive and reproduce, and local environmental conditions as well as biotic interactions determine species distributions (Hurtt and Pacala 1995;Hubbell 2005).
In Amazonia, studies have associated species distributions with soil conditions (Tuomisto et al. 2003;Pomara et al. 2014), position of large rivers (Haffer 1974;Cracraft 1985;da Silva et al. 2005a, b;Boubli et al. 2015;Silva et al. 2019;Maximiano et al. 2020), climate and geographic distance (Dambros et al. 2017;Fluck et al. 2020). We found that soil conditions and geographic distance were important predictors for all taxonomic groups, but their relative importance varied among taxonomic groups. Below we discuss the congruence and discrepancies among biological groups in their response to these factors.
Fig. 5 Means and confidence intervals of the differences in community composition of birds, fishes, bats, ants, termites, butterflies, ferns, gingers and palms among endemism regions in Brazilian Amazonia. Coefficients shown in the x-axis were estimated as the mean difference between values of the PCoA first axis representing species composition in each sampling plot. Comparison pairs including one endemism area to the North and one to the South of the Amazon river are highlighted in green shadow. Non-highlighted pairs are separated by other Amazonian rivers. See Table S4

Riverine barriers
Rivers have been reported to act as barriers that drive and maintain species diversity of many Amazonian bird, frog and primate species (Burney and Brumfield 2009;Ribas et al. 2012;Boubli et al. 2015;Dias-Terceiro et al. 2015;Moraes et al. 2016;Godinho and da Silva 2018;Naka and Brumfield 2018). However, we found that riverine barriers had only a weak effect on species composition for most of the animal and plant groups studied here. Rivers could explain changes in bird community composition, but the unique effect of rivers on birds (after controlling for other variables) was not very large (4%). The largest unique component of the explained variation in bird compositional changes was related to geographic distance and endemism areas combined (15%). This suggests that when species distribution boundaries coincide with rivers, the barrier effect may actually result from a combination of factors, especially at broad spatial scales. However, when only the plots in central Amazonia were considered, the relative contribution of the Amazon River to changes in species composition increased (Supporting Information SII), indicating that the effect of rivers is more evident over environmentally homogeneous areas. Similarly, when investigating changes in species composition at landscape scales, Maximiano et al. (2020) and Pomara et al. (2014) found strong evidence of bird species turnover across rivers that could not be attributed to differences in the environment or to geographical distance as such.
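The notion of a "unique effect after controlling for other variables" can be made concrete with a short Python sketch of variation partitioning by differences in R². The unique fraction of a predictor is taken as the R² of the full model minus the R² of the model without that predictor. The function names are hypothetical, and the inputs are assumed to be plain vectors (for example, unfolded distance matrices). In a regression on distance matrices, significance would additionally require permutation tests, which this sketch omits.

```python
import numpy as np


def r_squared(y, predictors):
    """R2 of an ordinary least-squares fit of y on the given predictors
    (an intercept is always included)."""
    y = np.asarray(y, dtype=float)
    columns = [np.ones(len(y))] + [np.asarray(p, dtype=float) for p in predictors]
    design = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    centered = y - y.mean()
    return 1.0 - (residuals @ residuals) / (centered @ centered)


def unique_fraction(y, target, others):
    """Variation explained by `target` that is not shared with `others`:
    R2(full model) - R2(model without target)."""
    others = list(others)
    return r_squared(y, [target] + others) - r_squared(y, others)


# Hypothetical unfolded pairwise values for illustration only.
rng = np.random.default_rng(42)
geo_dist = rng.uniform(0, 1, 200)
soil_dist = 0.6 * geo_dist + rng.uniform(0, 1, 200)   # correlated with distance
river = rng.integers(0, 2, 200).astype(float)         # 1 = pair split by a river
dissim = 0.5 * geo_dist + 0.3 * soil_dist + 0.1 * river + rng.normal(0, 0.1, 200)

print("Unique river fraction:", round(unique_fraction(dissim, river, [geo_dist, soil_dist]), 3))
```

Because the predictors are correlated, the unique fractions do not sum to the total R²; the remainder is the jointly explained variation shown as overlapping areas in a Venn diagram.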
In most parts of Amazonia, and for most plant and animal groups, changes in species composition across rivers could equally well be explained by geographical distance or by differences in soil nutrient concentration. Understory birds and palms were the only taxa in which this was not the case, and the presence of rivers had some unique explanatory power. For understory birds, this conforms with results of earlier studies that have demonstrated that rivers are limits to species distribution, and act as primary or secondary barriers (Ribas et al. 2012;Naka and Brumfield 2018;Silva et al. 2019), even when environmental conditions are similar on both sides of the river (Pomara et al. 2014;Maximiano et al. 2020). This result also agrees with what is known about bird ecology. Many Amazonian understory birds avoid open habitats (Laurance et al. 2004), such as those found in floodplains, secondary forests (Antongiovanni and Metzger 2005), and low-canopy forest (Mokross et al. 2018). A more surprising result was the relatively strong effect of river barriers on palms. Birds are important dispersers of palm seeds (Zona and Henderson 1989), which could explain why these two taxa were congruent in their response to the Amazon River. However, data on the occurrence of palm species were restricted to a limited geographic extent, and further studies covering longer environmental gradients and larger geographic distances are needed to clarify the relative roles of rivers and other factors.
In spite of the overall weak effect of rivers on the distribution of most taxa, differences in species composition were consistently observed between the endemism areas separated by the Amazon River for most terrestrial organisms in raw-data-based analyses (Fig. 5). This is consistent with the main west-east trans-Amazonian drainage having started to develop towards its current configuration a few to several million years ago (Miocene: Hoorn et al. 2010, 2017;van Soelen et al. 2017; early Pliocene: Latrubesse et al. 2010; Neogene: Campbell et al. 2006), which allowed time for allopatric speciation or the accumulation of differences in species composition across the margins of the Amazon River (but see Rossetti et al. 2014 for an alternative hypothesis of recent origin in the Pleistocene).
For tributaries, the patterns were not as clear. For example, communities on the two sides of the Tapajós River (Tapajós and Rondonia endemism areas) were not significantly different for any of the groups investigated in this study (Fig. 5, but see Maximiano et al. 2020), which indicates that this river does not isolate communities as effectively as the Amazon River. Tributaries of the Amazon River are generally narrower and have been more dynamic over time (Rossetti et al. 2014;Hoorn et al. 2017), which results in a weaker effect compared to the Amazon River. Our results indicate that the Amazon River, which is the oldest and widest river and has the largest discharge, has a stronger effect on species distribution and observed biogeographic patterns, whereas younger and more dynamic tributaries have weaker or no effect on most biological groups (but see e.g., Maximiano et al. 2020 and Silva et al. 2019).

Geographic distance and environment
Our results are consistent with earlier findings of regular turnover of species along edaphic gradients in Amazonia in several plant (e.g. Tuomisto et al. 2003, 2016;Costa et al. 2009;Zuquim et al. 2012;Cámara-Leret et al. 2017) and animal groups (e.g. Menin et al. 2007;Dias-Terceiro et al. 2015;Dambros et al. 2017). The turnover is possibly a consequence of niche partitioning, in which different species are specialized to different parts of the environmental gradient (Leibold and Mikkelson 2002;Tuomisto et al. 2003;Zuquim et al. 2012). The relative importance of the environment in shaping biological assemblages may vary greatly among taxa depending on species ecological traits (Bie et al. 2012). The unique contribution of soils in explaining community turnover tended to be greater for plants than for animals. Plants obtain nutrients directly from the soil and evolve strategies that optimize the use of local resources, whereas animals obtain nutrients indirectly and have the ability to move in search of food.
Compositional differences of most taxa exhibited a strong association with geographic distance. Although these associations may represent the effect of differences in climate or in other unmeasured spatially-structured environmental variables (Tuomisto et al. 2003), the climatic variation among sites was relatively small (Table S1) and for most groups geographical distance was still important after controlling for differences in climate and soil (Fig. S3; but see bats). Therefore, it is reasonable to expect that the vast geographical distances separating areas within Amazonia are limiting the dispersal of many organisms. Geographic distance and environment may also interact, given that species adaptations to soil conditions that are patchily distributed in Amazonia create a mosaic of habitats that provides different establishment opportunities for propagules once they have reached the area.
Although we could not disentangle the effects of distance and climate, in general, the results were consistent with expectations based on the dispersal ability of the taxonomic groups: species with higher dispersal capacity were more strongly determined by local environmental conditions (Hubbell 2005). Ferns and lycophytes, the plants with the smallest propagules (and presumably the highest dispersal capacity), were more strongly associated with soil gradients than the other plant groups (palms and gingers). Among fishes, those with high dispersal capacity were more strongly associated with water conditions than fishes with low dispersal capacity. Mobile-forager ants had a stronger association with soil conditions than other ant groups, possibly because the other groups live in leaf litter rather than directly on the soil (Fig. 3). Birds were the group with the weakest association with local environmental variables (Fig. 3) but the largest joint effect of space and rivers, which suggests strong dispersal limitation. Interestingly, bats were only weakly associated with the distances in environmental variables measured. Moreover, geographical distance had almost no explanatory power for this group. In contrast to most understory birds, bats often fly long distances in densely-vegetated areas (Trevelin et al. 2013) and it is possible that unmeasured environmental variables, e.g. vegetation-clutter, terrain elevation and food (Marciente et al. 2015;Bobrowiec and Tavares 2017;Capaverde et al. 2018) could be better predictors than the ones included in our models.

Limitations
Sampling multiple groups at common localities along the entire Amazonian region is extremely challenging. Although we have used data obtained over large areas and with high overlap for most groups, some differences in sampling between groups existed. Some groups were not sampled in the more seasonal northern Amazonia or in western Amazonia where abrupt changes in soil conditions occur (Higgins et al. 2011;Tuomisto et al. 2016), and the spatial distribution of sampled plots caused differences among taxa in the length of the gradients sampled. For example, birds and palms were mainly sampled on nutrient-poor soils in central Amazonia and thus, the effect of environment and space may be partially hidden as described in the veiled gradients concept (McCoy 2002). Besides, due to the natural spatial structure of climatic gradients, the strong correlation between climate and space prevented a better assessment of the unique effects of these factors. Finally, the degree of taxonomic knowledge varies between taxonomic groups. Hundreds of species are described in Amazonia each year, especially in invertebrate and plant groups. Species complexes may hide effects of the environment when similar taxa partition their niche. In the case of birds, probably the historically most intensively studied group, analyses at the subspecific level detected a stronger river barrier effect than analyses at the species level (Maximiano et al. 2020). Consideration of these limitations should be included in planning the location of future biological surveys.

Conclusion
So far, most studies of biodiversity-distribution patterns have addressed single taxa (Ribas et al. 2012;Zuquim et al. 2012;Dambros et al. 2017), and the rare attempts to integrate plant and animal groups have been spatially restricted to comparisons across distances of dozens to a few hundred kilometers (Landeiro et al. 2012;Pomara et al. 2014;Tuomisto et al. 2016). We used a comprehensive and standardized broad-scale dataset for several animal and plant taxa to explore how different biological groups perceive the environment and geographical barriers at a semi-continental scale. Our results are consistent with the idea that variation in community composition in Amazonia reflects both dispersal limitation (isolation by distance or large rivers) and the adaptation of species to local environmental conditions. However, the relative importance of each of these processes varies among biological groups. The wide and relatively old Amazon River tended to determine differences in community composition for most biological groups. Soil gradients tended to be relatively good predictors for all plant groups and some animal groups such as ants and termites.
As most studies are undertaken in only a limited number of locations and for a limited number of taxa at a time, there is an urgent need for standardized surveys of biodiversity across the whole Amazonian landscape. There is great uncertainty about the distributions of most Amazonian species. Our results advance the understanding of spatial heterogeneity of Amazonian communities, providing basic information for conservation planning. A good representation of all endemism areas based on river barriers may be an important strategy for planning new conservation units for birds and primates. However, our results indicate that the importance of rivers and environmental heterogeneity in determining patterns of diversity distribution differ greatly among taxa, and that optimized conservation planning needs to be based on data from a variety of organisms with distinct life histories. For example, in addition to locating protected areas in different areas of endemism, regions of distinct habitat types within these areas should also be prioritized, uniting the interfluvial and ecological factors to maximize biodiversity conservation.