Introduction

The species-area relationship (SAR) is arguably the most widely studied scaling law in ecology, having received empirical support from numerous studies spanning different geographical regions and taxa (Drakare et al. 2006; Lomolino and Weiser 2001). The predominant power law form of the SAR was first described by O. Arrhenius in 1921 (Arrhenius 1921). It relates the number of species S to the area of a habitat A as \(S \sim A^z\), where the exponent z varies widely between 0 and 1 (Drakare et al. 2006). A quantitative meta-analysis of a large number of SAR studies estimated its average value as 0.27 (Drakare et al. 2006). The power law was contested in 1922 by a semi-log relationship of the form \(S \sim z \log (A)\) (Gleason 1922). While the power law relationship is more widely reported, the semi-log SAR has also found support from numerous studies (Drakare et al. 2006; Lomolino and Weiser 2001). There have been attempts to explain the power law form based on species distributions (Coleman 1981; Leitner and Rosenzweig 1997; Picard et al. 2004; Šizling and Storch 2004), abundance distributions (Preston 1948) or population dynamics through constraints on immigration (Bastolla et al. 2001; Durrett and Levin 1996). The prevalence of SARs has also been attributed to the combined effects of widely observed abundance distributions and the fact that individuals from the same species cluster together (Martín and Goldenfeld 2006). The semi-log relationship can be recovered from the power law SAR in some limit using species-incidence functions that depend on colonization and extinction rates (Ovaskainen and Hanski 2003). However, there is no unified framework to explain the emergence of these competing SARs.

These scaling relationships are emergent in that they could be described by coarse-grained dynamics of large communities at the species level without reference to finer details and properties of individual organisms. Understanding the assembly of large communities could therefore shed light on the mechanisms that shape these scaling laws. The analysis of large systems has benefitted from many emerging approaches in recent decades. In 1972, P.W. Anderson influenced the philosophy of science by suggesting that ‘more is different’ (Anderson 1972), based on accumulating evidence from various disciplines. This means that the properties of a collective composed of many parts could be drastically different from the properties of the parts themselves. In the same year, R. May used random matrix theory to show that large ecosystems become unstable when their complexity increases beyond a threshold (May 1972), which contradicted the prevailing notion that diversity increases stability. May’s analytical results showed that one cannot have indefinite stability in large and complex ecosystems with many interactions. There is a limit beyond which an ecosystem is not resilient to small perturbations and can exhibit large fluctuations in the population abundances of the constituent species. He defined complexity in terms of connectance and interaction strength of the random matrix that encodes species interactions.

The complex dynamics of such random interaction networks can be modelled using the Generalized Lotka-Volterra (GLV) equations. This model has been employed to uncover theoretical results ranging from identification of structural properties that affect co-existence (Serván et al. 2018) to the study of generic assembly patterns that are consistent across network structures (Barbier et al. 2018; Bunin 2017). Some recent studies have investigated the distribution of the number of coexisting species that results from GLV dynamics of much larger species pools (Serván et al. 2018). Others have explored the progression and boundaries of extinction in large ecosystems (Pettersson et al. 2020). These studies depart from identifying constraints on parameters that result in complete co-existence of all species. When the interaction strength is increased beyond the regime where all species co-exist, the system passes through a phase characterized by single-species extinctions (Pettersson et al. 2020). May’s stability limit marks the end of this phase, beyond which no stable equilibria exist.

We hypothesize that a modified GLV model accounting for spatial scaling could exhibit SARs through the assembly of random communities of different sizes. Our analysis relies on introducing an area parameter to the GLV equations to test this hypothesis. We explore a large part of this area parameter space to recover different numbers of surviving species beyond the regime of complete co-existence. By further allowing demographic immigration in the modified GLV model, we demonstrate that the two widely reported forms of the SAR stem from differences in immigration rates and the skewness towards weak interactions. We discuss the implications of our results in the context of island systems. The differences in the two SAR forms are more significant for smaller islands, especially on distant archipelagoes, which we describe in ‘Immigration shapes SARs: the case of remote archipelagoes’ and Supplementary appendix S3 using data from empirical studies (Diamond 1972; Diamond and Mayr 1976; Gooriah et al. 2020a; Whittaker et al. 2014).

Methods

Generalized Lotka-Volterra with spatial scaling

Fig. 1

Equilibrium abundances of a competitive community of 100 species for increasing values of the interaction strength parameter (\(\sigma\)). Each set of vertical dots represents an assembled community corresponding to a given value of \(\sigma\). The bold black line in the inset traces the corresponding number of surviving species. Note that higher values of \(\sigma\) correspond to more extinctions. The interaction strengths, growth rates and carrying capacities are chosen from normal distributions with means -1, +1 and +1 respectively. The standard deviation is set to 0.2 for each of these distributions.

In its usual form, the GLV model describes the dynamics of species with densities \(y_i\) through the following equations:

$$\begin{aligned} \frac{dy_i}{dt} = r_i y_i(1 - \frac{y_i}{K_i}) + \sigma y_i \sum _{j \ne i} B_{ij}y_j \end{aligned}$$
(1)

where \(K_i\) and \(r_i\) denote the carrying capacity and growth rate of the \(i\)th species. \(B_{ij}\) expresses the pairwise interspecific interaction strength between species i and any other species j, so the full matrix B contains all possible pairwise interaction strengths. Equation 1 implies that in the absence of interactions, each species grows to its carrying capacity \(K_i\). \(\sigma\) is the interaction strength parameter that scales all pairwise interaction strengths and consequently the variance of the interaction matrix B. Note that many studies do not use the \(\sigma\) parameter explicitly in the GLV equations, but rather work directly with the variance of B.
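As a minimal sketch of how Eq. 1 reads in code, the function below evaluates its right-hand side for a list of densities. The function name `glv_rhs` and the two-species parameter values are illustrative assumptions, not taken from the study, and plain Python lists are used only to keep the example dependency-free.

```python
def glv_rhs(y, r, K, B, sigma):
    """Right-hand side of Eq. 1 for densities y (hypothetical helper)."""
    n = len(y)
    dydt = []
    for i in range(n):
        # Interspecific term: sum over j != i of B[i][j] * y[j]
        interaction = sum(B[i][j] * y[j] for j in range(n) if j != i)
        # Logistic self-limitation towards the carrying capacity K[i]
        logistic = r[i] * y[i] * (1.0 - y[i] / K[i])
        dydt.append(logistic + sigma * y[i] * interaction)
    return dydt


# Two illustrative species placed exactly at their carrying capacities
r = [1.0, 1.2]
K = [1.0, 0.8]
B = [[0.0, -1.0], [-1.0, 0.0]]

no_interaction = glv_rhs(K, r, K, B, sigma=0.0)    # nothing changes at y = K
with_competition = glv_rhs(K, r, K, B, sigma=0.2)  # competition pushes both below K
```

With \(\sigma = 0\) the derivatives vanish at \(y_i = K_i\), which is the statement that each species settles at its carrying capacity when interactions are absent. A nonzero \(\sigma\) with negative \(B_{ij}\) makes both derivatives negative at that point.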

For a given value of \(\sigma\) below May’s limit, the system eventually relaxes to a stable equilibrium that represents an assembled community where species densities are resilient to small perturbations (Fig. 1). The equilibrium densities of species at this stable fixed point could either be zero or positive. Fixed points without any extinctions are called feasible solutions, but these are of little concern to us since we are interested in communities with different numbers of surviving species assembled from a species pool.

The number of species in the assembled community decreases monotonically as \(\sigma\) (or equivalently the variance of the interaction matrix) is increased (Fig. 1). This presents an opportunity to recover monotonic SARs using a modified version of the GLV equations. Consider Eq. 1 without the \(\sigma\) parameter. Replacing species densities with absolute biomass abundances then gives:

$$\begin{aligned} \frac{dx_i}{dt} = r_ix_i(1 - \frac{x_i}{K_i}) + \frac{x_i A_0}{A} \sum _{j \ne i} B_{ij}x_j \end{aligned}$$
(2)

where A is the area of a given island, and \(A_0\) parameterises this model for a given ecosystem. We set this parameter equal to 1 from here on. The carrying capacity \(K_i\) is now an absolute quantity instead of a density, which explains why an area factor does not appear in that term. We now assume that the carrying capacities scale non-linearly with area, which modifies Eq. 2 as:

$$\begin{aligned} \frac{dx_i}{dt} = r_ix_i(1 - \frac{x_i}{K_i (\frac{A}{A_{init}})^\gamma }) + \frac{x_i }{A} \sum _{j \ne i} B_{ij}x_j \end{aligned}$$
(3)

where \(A_{init}\) is the minimum area for which the assembled community contains all species from the mainland pool. \(\gamma\) is the carrying capacity scaling parameter. We fix \(\gamma = 0.25\) for the analysis described in this paper (in general, \(\gamma < 0.5\) is consistent with the results that we report). The non-trivial scaling of the carrying capacity implies that in the absence of interactions, the absolute carrying capacities do not change proportionally with area. Equivalently, the equilibrium densities would increase with decreasing areas. This premise is central to the results that we report subsequently. The relevant ecological pattern corresponding to this scaling is the spatial clustering of conspecifics. In Appendix S1 of Supplementary Information, we argue for sub-linear scaling of absolute carrying capacities based on clustering indices such as relative neighbourhood density (Condit et al. 2000; Martín and Goldenfeld 2006; Ostling et al. 2000). Other ecological patterns might also correspond to this non-trivial scaling with areas, but spatial clustering is particularly interesting since it has been previously used to explain the power law SARs (Martín and Goldenfeld 2006; Plotkin et al. 2000). Clustered conspecifics would already have saturated levels of negative density dependence that are expected to change less drastically with change in areas. The carrying capacity term generally captures negative density dependence, but the peculiarity of this term in our model draws connections to the spatial distribution of individuals.
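The two area factors in Eq. 3 can be traced in a short worked sketch. It assumes that densities and abundances are related by \(y_i = x_i/A\) and that the density-based carrying capacity is \(K_i/A\). Both are our reading of the text rather than statements made in it, and \(x_i^{*}\) is notation introduced here for the equilibrium abundance in the absence of interactions.

$$\begin{aligned} \frac{1}{A}\frac{dx_i}{dt}&= r_i \frac{x_i}{A}\left( 1 - \frac{x_i}{K_i}\right) + \frac{x_i}{A} \sum _{j \ne i} B_{ij}\frac{x_j}{A} \\ \Rightarrow \quad \frac{dx_i}{dt}&= r_i x_i\left( 1 - \frac{x_i}{K_i}\right) + \frac{x_i}{A} \sum _{j \ne i} B_{ij}x_j \qquad (\text {Eq. 2 with } A_0 = 1) \\ x_i^{*}&= K_i \left( \frac{A}{A_{init}}\right) ^{\gamma }, \qquad \frac{x_i^{*}}{A} = \frac{K_i}{A_{init}^{\gamma }}\, A^{\gamma - 1} \end{aligned}$$

For \(\gamma = 0.25\), halving the area multiplies the interaction-free abundance by \(2^{-0.25} \approx 0.84\), while the corresponding density rises by \(2^{0.75} \approx 1.68\). This is the sense in which equilibrium densities increase as areas shrink whenever \(\gamma < 1\).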

\(\gamma\) can be used to capture the extent of spatial aggregation of conspecifics: individuals become more spatially aggregated as \(\gamma\) decreases from 1 to 0. Recently Brush and Harte (2021) investigated how the strength of density dependence relates to the spatial aggregation of individuals. Their key result was that higher spatial aggregation corresponds to weaker density dependence. For a fixed area, lower values of \(\gamma\) imply the same: high spatial aggregation and weak negative density dependence. The findings from our study break down for higher values of \(\gamma\) (particularly \(\gamma > 0.75\)), which correspond to weak spatial aggregation.

Clustering could affect interspecific interactions, but we assume that the clusters are well-mixed such that encounter rates depend on the island-level species densities as usual.

We are interested in an ecological setting where a regional pool of species is available to colonize different islands in a region (Kessler and Shnerb 2015). For an island defined by its area, the dynamics resulting from our model culminate in a final community where some species from the regional pool might not be feasible. Islands of different sizes yield communities with different compositions as a consequence. We use our model to simulate ecosystem dynamics as follows:

  1. We pick entries of the interaction matrix \(B_{ij}\) from a normal distribution that is symmetric around a negative mean (we fix mean = -1 and standard deviation = 0.2).

  2. The growth rates \(r_i\) are drawn from a normal distribution with mean = 1 and standard deviation = 0.2. The constraints on interactions and growth rates describe a community of competitive species.

  3. The carrying capacities \(K_i\) are normally distributed with mean = 500 and standard deviation = 30. The parameter \(\gamma\) is fixed as 0.25. The choice of \(B_{ij}, r_i\) and \(K_i\) allows for a large range of areas for which the system relaxes to stable equilibria.

  4. Starting from an initial area, the number of surviving species is plotted against successively smaller island areas A.
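The simulation steps listed above can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the code used in the study: the pool is reduced to 8 species, `a_init = 5000` is chosen for that small pool, the fixed-step fourth-order Runge-Kutta integrator, the initial abundances and the integration length are all our choices, and the helper names `draw_community` and `simulate_island` are hypothetical. The extinction cutoff of \(10^{-5}\) is the one quoted for the simulations.

```python
import random


def draw_community(n, rng):
    """Draw B, r and K from the normal distributions listed in the steps."""
    B = [[0.0 if i == j else rng.gauss(-1.0, 0.2) for j in range(n)]
         for i in range(n)]
    r = [rng.gauss(1.0, 0.2) for _ in range(n)]
    K = [rng.gauss(500.0, 30.0) for _ in range(n)]
    return B, r, K


def simulate_island(area, B, r, K, a_init, gamma=0.25,
                    n_steps=4000, dt=0.05, cutoff=1e-5):
    """Integrate Eq. 3 with fixed-step RK4 and return final abundances."""
    n = len(r)
    # Effective carrying capacities K_i * (A / A_init)^gamma
    k_eff = [K[i] * (area / a_init) ** gamma for i in range(n)]

    def rhs(x):
        out = []
        for i in range(n):
            inter = sum(B[i][j] * x[j] for j in range(n) if j != i)
            out.append(r[i] * x[i] * (1.0 - x[i] / k_eff[i])
                       + x[i] * inter / area)
        return out

    x = [1.0] * n  # assumed initial abundances
    for _ in range(n_steps):
        k1 = rhs(x)
        k2 = rhs([x[i] + 0.5 * dt * k1[i] for i in range(n)])
        k3 = rhs([x[i] + 0.5 * dt * k2[i] for i in range(n)])
        k4 = rhs([x[i] + dt * k3[i] for i in range(n)])
        x = [x[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
             for i in range(n)]
        # Species below the cutoff are treated as extinct and stay at zero
        x = [xi if xi >= cutoff else 0.0 for xi in x]
    return x


rng = random.Random(0)
n_species, a_init = 8, 5000.0  # small illustrative pool, not the paper's 100
B, r, K = draw_community(n_species, rng)

areas = [1e7, 5000.0, 50.0]  # from very large to very small islands
richness = [sum(1 for xi in simulate_island(a, B, r, K, a_init) if xi > 0.0)
            for a in areas]
```

Plotting `richness` against `areas` for a finer grid of areas, and averaging over many draws of the interaction matrix, would give a species-area curve of the kind analysed in the Results.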

We are primarily interested in investigating the properties and processes of community assembly that could possibly influence SARs using our spatially implicit model. In all cases that we describe, we only show comparisons between the power-law and semi-log relationship forms. We perform a non-linear least-squares (NLSQ) analysis to fit and compare these forms using the least_squares function in the ‘scipy.optimize’ package. This function implements the Trust Region Reflective algorithm described in Branch et al. (1999). We also plot the linear regression of the corresponding better form for each of the cases. If there are considerable differences between the parameter estimates from the linear regression and the NLSQ analysis (this is the case only for the power-law estimates from an empirical dataset with few islands; Whittaker et al. 2014), then we perform model averaging using the R package ‘sars’ (Matthews et al. 2019) to discern the better fit.
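A minimal sketch of such a comparison with `scipy.optimize.least_squares` is given below. The synthetic data, the intercept/prefactor `c` in both forms, the starting values and the use of the residual sum of squares as the comparison criterion are assumptions made for illustration; the text does not specify them.

```python
import numpy as np
from scipy.optimize import least_squares


def power_law(p, area):
    """Power-law form S = c * A**z."""
    c, z = p
    return c * area ** z


def semi_log(p, area):
    """Semi-log form S = c + z * log10(A)."""
    c, z = p
    return c + z * np.log10(area)


def fit(model, p0, area, richness):
    """Fit a model by NLSQ with the Trust Region Reflective method."""
    result = least_squares(lambda p: model(p, area) - richness, p0,
                           method="trf")
    rss = float(np.sum(result.fun ** 2))  # residual sum of squares
    return result.x, rss


# Synthetic richness generated from an exact semi-log curve (illustrative)
area = np.logspace(0, 6, 13)
richness = 10.0 + 10.0 * np.log10(area)

params_power, rss_power = fit(power_law, [10.0, 0.2], area, richness)
params_semilog, rss_semilog = fit(semi_log, [1.0, 1.0], area, richness)
better_form = "semi-log" if rss_semilog < rss_power else "power law"
```

Because the synthetic data are semi-log by construction, the semi-log fit recovers the generating slope and leaves a much smaller residual than the power law. With simulated or empirical richness values, the same two calls would be compared in the same way.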

Analogous to the scenario of increasing \(\sigma\), the system relaxes to a unique stable fixed point when the area parameter is above a certain threshold. We obtain the number of surviving species from the fixed point for each value of the area parameter (Fig. 1). We hypothesize that the different numbers of surviving species obtained by varying the area parameter give rise to the widely reported SARs. These relationships are usually studied for one type of species or for species that are placed in the same trophic level. This is consistent with our choice of a competitive interaction matrix. A competitive system could represent functional groups such as pollinators that compete for some common resources. A competitive GLV model with demographic noise has been shown to reproduce the neutral island theories of Wilson-MacArthur and Hubbell (Kessler and Shnerb 2015). The power-law SAR has also been recovered from a spatially explicit extension of the Lotka-Volterra competition model that allowed migration between patches (O’Sullivan et al. 2019).

Spatial scaling patterns with immigration

Immigration slows down the decline in the number of surviving species either by introducing new species (MacArthur and Wilson 1963) or through the rescue effect (Brown and Kodric-Brown 1977), which delays extinctions through incoming individuals of existing species (demographic immigration). What effects do different levels of immigration have on spatial scaling patterns in our model ecosystem? To address this, we redefine our GLV model with an additional term for demographic immigration:

$$\begin{aligned} \frac{dx_i}{dt} = r_ix_i(1 - \frac{x_i}{K_i (\frac{A}{A_{init}})^\gamma }) + \frac{x_i}{A} \sum _{j \ne i} B_{ij}x_j + \lambda e^{-\beta / \sqrt{A} } \end{aligned}$$
(4)

The last term represents the immigration rate. This term has a negligible contribution for smaller values of area, where a species may go extinct without support from the growth and interaction terms. A species is considered extinct in our simulations if its abundance falls below \(10^{-5}\). As the area of an island shrinks, it is less likely to be colonized by immigrant individuals. The immigration term in the above equation has an exponential function that represents varying levels of demographic rescue (Brown and Kodric-Brown 1977) as a function of area. \(\lambda\) is the maximum possible immigration rate on the archipelago being investigated. \(\beta\) is chosen such that islands of different sizes receive disproportionate contributions from immigration, with respect to the extinction cutoff. We fixed \(\beta = 1000\), for which the immigration term contributes \(10^{-5}\) to \(10^{-6}\) for the larger islands that would not support a few species otherwise (Fig. 3). \(\beta\) is also analogous to the characteristic length scale in the spatially extended GLV model described in O’Sullivan et al. (2019). We compared the results for different values of \(\lambda\).
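The size of the immigration term relative to the extinction cutoff can be checked with a few lines of arithmetic. The sketch below evaluates the last term of Eq. 4 for \(\beta = 1000\); the helper name `immigration_rate` and the particular areas are illustrative choices.

```python
import math


def immigration_rate(area, lam, beta=1000.0):
    """Last term of Eq. 4: lam * exp(-beta / sqrt(area))."""
    return lam * math.exp(-beta / math.sqrt(area))


cutoff = 1e-5  # extinction threshold used in the simulations

# Immigration term for two values of lambda across three island areas
table = {
    (lam, area): immigration_rate(area, lam)
    for lam in (0.1, 0.01)
    for area in (15000.0, 5000.0, 1000.0)
}

# Ratio of the immigration term to the extinction cutoff
relative_to_cutoff = {key: rate / cutoff for key, rate in table.items()}
```

For A = 15000 the term is of order \(10^{-5}\) when \(\lambda = 0.1\) and of order \(10^{-6}\) when \(\lambda = 0.01\), so it is comparable to the cutoff. For A = 1000 it is many orders of magnitude smaller, which is why the rescue effect is negligible on the smallest islands.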

We also consider interaction matrices with more realistic sparsity and distributions of interactions between species. Many ecological communities are predominantly composed of weakly interacting species that have important stabilizing effects (Berlow 1999). We study how the preponderance of weak interactions influences SARs. We use interactions drawn from exponential distributions that represent communities with varying skewness towards weak interactions. The rate parameter of the exponential distribution serves as a measure of this skew.
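One way to generate such a matrix is sketched below. The sketch assumes that each off-diagonal entry is present independently with probability equal to the connectance, that ‘rate’ has the usual meaning (mean magnitude 1/rate), and that all interactions are competitive (negative). These are assumptions for illustration, and `sparse_exponential_interactions` is a hypothetical helper name.

```python
import random


def sparse_exponential_interactions(n, connectance, rate, rng):
    """Sparse competitive matrix with exponentially distributed strengths.

    Each off-diagonal entry is nonzero with probability `connectance`;
    its magnitude is drawn from an exponential distribution with the
    given rate (mean 1/rate) and its sign is negative.
    """
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and rng.random() < connectance:
                B[i][j] = -rng.expovariate(rate)
    return B


rng = random.Random(1)
B = sparse_exponential_interactions(100, connectance=0.1, rate=0.5, rng=rng)

# Summary statistics of the realised matrix
links = [-B[i][j] for i in range(100) for j in range(100) if B[i][j] != 0.0]
realised_connectance = len(links) / (100 * 99)
mean_strength = sum(links) / len(links)
```

A larger rate parameter shifts more of the probability mass towards weak interactions, which is the sense in which the rate measures the skew.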

Fig. 2

Species-area plots generated through 50 realizations of the interaction matrix with mean = -1 and variance = 0.2. \(A_{init}\) = 50000. A The semi-log form shows a better fit. B The corresponding linear regression on a semi-log plot, which shows an obvious upper asymptote.

In ‘Immigration shapes SARs: the case of remote archipelagoes’, we discuss our results in the light of two related empirical studies (Diamond 1972; Diamond and Mayr 1976) that exemplify the dependence of SARs on immigration rates. Both studies investigated bird diversity in the Southwest Pacific, but the archipelagoes they cover differ in their remoteness from the ‘source island’ of New Guinea. Our findings for low immigration rates (equivalently remote archipelagoes) are also consistent with the data from the Andaman and Azores Islands (Gooriah et al. 2020a, b; Whittaker et al. 2014; see Supplementary Appendix S3).

The dataset used in Diamond (1972) and Diamond and Mayr (1976) has islands with areas spanning over six orders of magnitude, which makes it possible to differentiate conclusively between competing forms of the SAR. These studies exclude from their analysis ‘isolated’ islands, that is, islands that are far from large islands within the archipelago. Speciation might influence the assembled communities, especially on islands with fewer species. Islands whose avifaunas have not reached equilibrium are not included either. These are recolonized volcanic islands and islands that have undergone overall size contraction or modification of connecting land-bridges in the past c. 10,000 years.

Results

Fig. 3

Species-area plots demonstrating the better fit of power law SAR for intermediate values of immigration rates. Panels A and C show the fits for \(\lambda =0.1\) and \(\lambda =0.01\) respectively for 50 instances of the interaction matrix. Panels B and D correspond to the respective linear regressions on log-log plots. The interaction strength mean and variance are -1 and 0.2 respectively. \(A_{init}\) = 15000

Figure 2 corresponds to the simplest case of an ecosystem with full connectance and no immigration. Starting with 100 species, we plot the number of surviving species for island areas where at least one species goes extinct. In our model, the SAR is best represented by a semi-log function. The curve saturates at an upper asymptote for very high values of the area parameter (Fig. 2).

The slope of the semi-log SAR varies with changes in the means of interactions and growth rates. It is also worth noting that for an intermediate range of areas, even the log-log plot could show a misleadingly good fit for a power law SAR (Fig. 2).
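The point about misleadingly good log-log fits can be illustrated with synthetic data. In the sketch below, richness follows an exact semi-log SAR, and a straight line is then fitted on log-log axes, first to all areas and then only to the larger ones. The intercept, slope and area range are arbitrary illustrative values, and `r_squared` is a hypothetical helper.

```python
import math


def r_squared(x, y):
    """R^2 of the ordinary least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Synthetic richness that follows an exact semi-log SAR
log_area = [float(k) for k in range(7)]        # areas 10^0 ... 10^6
richness = [10.0 + 10.0 * k for k in log_area]
log_richness = [math.log10(s) for s in richness]

r2_loglog_all = r_squared(log_area, log_richness)            # all areas
r2_loglog_large = r_squared(log_area[2:], log_richness[2:])  # areas >= 10^2
r2_semilog_all = r_squared(log_area, richness)               # the true form
```

Here the log-log \(R^2\) rises from roughly 0.93 over the full range to roughly 0.98 once the two smallest areas are dropped, even though the data contain no power law at all. The small areas are what expose the curvature.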

What determines a power law or a semi-log SAR?

Fig. 4

Power law SAR exponent z decreases as \(\gamma\) is increased. The plot is restricted to \(\gamma \le 0.5\), beyond which the relationship form breaks down. \(\lambda = 0.01\) for all values of \(\gamma\)

Our model, in its simplest form, supports the semi-log relationship that is also widely reported in the literature (Drakare et al. 2006). Our analysis suggests that varying levels of immigration lead to different functional forms of the SAR. We start with a very low value of \(\lambda\) and progressively increase it to check the resulting SAR. For very low immigration rates, the semi-log relationship is supported (see Fig. S1 in Supplementary Material), as in the scenario without immigration (Fig. 2). However, there exists an intermediate regime best characterized by a power law (Fig. 3). This form of the SAR also lacks the upper asymptote that we observed in the semi-log fit (Fig. 2). Interestingly, using area in the immigration term instead of its square root does not change the above results (Fig. S2 in Supplementary Material). The exponent z of the power law SAR decreases monotonically with increasing values of \(\gamma\) (Fig. 4). The range of z values (Fig. 4) is consistent with what most empirical studies report (Drakare et al. 2006).

Fig. 5

SAR plots for exponentially distributed interactions with two different rate parameters. All plots correspond to \(\lambda\) = 0.01 and connectance = 0.1, where the entries of the interaction matrix are chosen randomly as an Erdős–Rényi graph. The semi-log form is better supported for rate parameter = 0.5, as demonstrated by the estimates in A (\(A_{init}\) = 15000). Panel C shows the fits for rate parameter = 0.25, where the power law performs better (\(A_{init}\) = 20000). B Linear regression on a semi-log plot using the same simulated data as in panel A. D Log-log plot showing the corresponding linear regression for data in C.

The level of skew towards weak interactions strongly influences SAR shape. Given the same immigration level, a higher skew towards weak interactions favours a semi-log relationship (Fig. 5, Fig. S3 in Supplementary Material). This result does not change for fat-tailed distributions such as the Pareto distribution in the regime where stable solutions exist (Fig. S3 in Supplementary Material).

Discussion

Fig. 6

Immigration rates and skewness towards weak interactions determine SAR forms. The semi-log relationship dominates in the absence of immigration. Higher immigration rates from a source pool result in power law relationships, but these could shift to semi-log SARs if the relative proportion of weak interactions is increased. S, A and z represent the number of species, area and the scaling law exponent respectively. Figure S10 in the Supplementary Information shows a more quantitative form of this trade-off.

We studied competitive communities on island-like systems where inter-island immigration is very low and carrying capacities scale non-trivially as a consequence of conspecific clustering. While clustering could result in such scaling of carrying capacities (Appendix S1 in Supplementary Material), we are unaware of any empirical evidence that supports this assumption. We also assumed particular functional forms of the immigration rates (Eq. 4, Fig. S2 in Supplementary Material) that represent island systems where the rescue effect is negligible on the smallest islands and the immigration rates saturate for very large islands. Our results showed the emergence of the two most widely observed forms of the SAR through differences in immigration rates and skewness towards weak interactions (Fig. 6). Our analysis suggests that such spatial patterns emerge from community dynamics operating differentially on islands of different sizes. This is not surprising since there is emerging evidence that ecological mechanisms affect islands of different sizes disproportionately (Gooriah et al. 2021).

In addition to immigration rates and skewness towards weak interactions, sparsity of the interaction matrix also influences SAR slopes. If all other parameters are kept the same, then communities with sparser interactions result in lower SAR slopes (Fig. 5, Fig. S3 in Supplementary Information). We also find that higher immigration rates correspond to higher SAR slopes, such that the number of surviving species falls off much more sharply with area for large areas. Our model does not capture intra-archipelago immigration that might change how SAR slopes vary with immigration from a mainland (Diamond and Mayr 1976). The SAR slopes we obtain are much more reasonable for choices based on realistic interaction networks (Fig. 5). We expect that some network structures could result in even lower slopes, but without much effect on the SAR form.

Some choices in our model incorporate effects of processes affecting SARs, but other effects might be missed. Clustering might affect interspecific interactions in non-trivial ways, which might be hard to analyse without using a spatially extended setting. Inter-island immigration could have important effects for some archipelagoes, which is not considered in our simplified mean-field model. Other functional forms of the immigration term might be better suited to describe immigration from the mainland.

Immigration shapes SARs: the case of remote archipelagoes

Our results have important implications for island systems, which we illustrate using two extensive empirical studies from the Southwest Pacific (Diamond 1972; Diamond and Mayr 1976). The Solomon archipelago in Diamond and Mayr (1976) is more than 600 km away from the ‘source island’ of New Guinea. The authors assume that intra-archipelago immigration rates are much higher than the immigration rates from the ‘source’ island of New Guinea.

Fig. 7

SAR plots for three groups of non-isolated islands within the Solomon Archipelago. These groups differ in how the islands within them were connected during the Pleistocene period. The islands in Group 3 did not have any history of connections. The semi-log relationship shows a good fit to data (A). The \(R^2\) values for the regression lines are 0.978, 0.982 and 0.955 for Group 1, 2 and 3 respectively. The slopes for the different groups are very similar. Panel B shows a clear departure from a power-law relationship for smaller areas. The linear regression lines indicate a good fit for islands larger than one square mile. In particular, the \(R^2\) value for such islands in group 1 is 0.976 from the power-law SAR. This demonstrates that a naive inference could support a power law, in spite of the islands spanning over four orders of magnitude in area ( > 1 square mile)

They further plot the SAR for three groups of islands within the Solomon Archipelago, which supports a semi-log form. The slope of the SAR is nearly the same across these three groups of islands (Fig. 7). We surmise that the immigration of birds into an island is also balanced by emigration to other islands within the same group. In other words, the system is at a steady-state of zero or very low net immigration within the archipelago. Thus, any effective immigration should emanate from the source island or from islands in other distant archipelagos. The large distance from these other sources implies that the net immigration rates to the Solomon islands are very low. In fact, the authors state that with increasing isolation of an archipelago, the SAR may shift in form from a power function to an exponential (semi-log). The species richness data from the Azores (Whittaker et al. 2014) as well as the Andaman Islands (Gooriah et al. 2020a) also concur with this claim (see Supplementary Appendix S3). The Azores archipelago includes only nine islands but has been extensively studied over many decades. Both relationship forms show good fits to the data, primarily because of the small number of islands, but the semi-log form is more predictive for smaller islands. All of these studies are consistent with our theoretical finding that low immigration rates lead to semi-log SARs.

The dataset in Diamond and Mayr (1976) also has many islands smaller than a few square kilometres, which are usually absent in many SAR studies (Lomolino and Weiser 2001). Both forms of the SAR could show a very good (and similar) fit to data for larger island sizes (Fig. 7). As the authors point out, really small islands should be included in SAR analyses to conclusively identify the correct form of the relationship (Diamond and Mayr 1976).

Another study, of islands that lie 5 to 300 miles from New Guinea, found a power-law SAR (Diamond 1972). Considering that these islands lie closer to the ‘source’ island of New Guinea, the immigration rates are likely to be higher than those for the Solomon Archipelago. This lends support to our theoretical results on the incidence of power-law SARs for higher immigration levels.

Conclusion

Using a simple model of interacting species that incorporates the effect of conspecific clustering, we recover many known features of SARs while also identifying factors that might best explain the variation in these relationships. The two SAR forms might show similar fits to data for a large span of areas, but their differences could be stark for smaller islands, especially when immigration rates from a source pool are low. Our results imply semi-log relationships for low immigration rates, which can arise through factors such as remoteness of an archipelago, as in Diamond and Mayr (1976). Assuming a power law SAR in such situations could yield misleading extinction scenarios, since it would overestimate the species richness for smaller areas. It is extremely important to investigate the effects of habitat loss, especially on small islands in distant archipelagoes, given that islands have witnessed a disproportionately large number of extinctions (Loehle and Eschenbach 2012; Spatz et al. 2017). We hope that our study prompts empirical studies to systematically evaluate the effects of immigration and community structure on species-area relationships.