Seasonal diversity dynamics of a boreal zooplankton community under climate impact

Seasonality and long-term environmental variability affect species diversity through their effects on the dynamics of species. To investigate such effects, we fitted a dynamic and heterogeneous species abundance model generating the lognormal species abundance distribution to an assemblage of freshwater zooplankton sampled five times a year (June–October) during the ice-free period over 28 years (1990–2017) in Lake Atnsjøen (Norway). By applying a multivariate stochastic community dynamics model for describing the fluctuations in abundances, we show that the community dynamics was driven by environmental variability in spring (i.e., June). In contrast, community-level ecological heterogeneity is highest in autumn. The autumn months (i.e., September and October) that rearranged the community are most likely crucial months to monitor long-term changes in community structure. Indeed, noises from early summer are filtered away, making it easier to track long-term changes. The community returned faster towards equilibrium when ecological heterogeneity was the highest (i.e., in September and October). This occurred because of stronger density-regulation in months with highest ecological heterogeneity. The community responded to the long-term warming of water temperature with decreasing species diversity and increasing abundance. Unevenness associated with variabilities in abundances might affect species interactions within the community. These can have consequences for the stability and functioning of the ecosystem. Supplementary Information The online version contains supplementary material available at 10.1007/s00442-022-05165-0.


Online Resource 1
The bivariate lognormal species abundance model The model is based on a general stochastic theory, with speciation/colonization, extinction and environmental populations uctuations (Engen and Lande, 1996). For data sets collected over a relatively short time period, the model can be approximated by a stationary model with a constant given number of species ignoring possible changes in species composition during the sampling period (Engen et al., 2011). Each species is then described by an Ornstein-Uhlenbeck (OU) process (Karlin and Taylor, 1981) in the sense that the dynamics of log abundance X i of species number i is given by, where t denotes time. Here r i is the stochastic growth rate of species i in the absence of density regulation (small population sizes) and δ is the strength of density-regulation, giving carrying capacity K i = e (r i /δ) varying among species. The functions B i (t) and B c (t) denote OU brownian motions in time t with the properties EdB(t) = 0 and EdB(t) 2 = dt. The dB i (t) are noise components that are independent among species, while dB c (t) represents a noise that is common for all species in the community, that is, similarly aecting all species abundances. The common noise may in general have a given correlation ρ sc with each species specic term that is EdB i (t)dB c (t) = ρ sc dt. The noise for a single species can alternatively be written as σ e dA i (t) = σ s dB i (t) + σ c dB c t where σ 2 e = σ 2 s + σ 2 c + 2ρ sc σ s σ c is the total environmental variance for each species single species considered separately and the A i (t) are dependent Brownian motions. The stationary variance for the Ornstein-Uhlenbeck process is σ 2 e /2(δ) and the temporal autocorrelation is the exponential function e −δt . The variances in the yearly change in log population size for a population of size N is generally σ 2 e + σ 2 d /N , where σ 2 e and σ 2 d are the environmental and demographic variance (Engen etal. 1998;Lande etal. 2003). Here, we analyse communities of small organisms with large individual numbers from which only tiny fractions are sampled. So, even if the demographic variance may be of some orders larger than the environmental variance say 10-100 times as large, the temporal variation in variance, due to temporal variation with varying N , can be ignored. On the other hand, demographic stochasticity may lead to some minor spatial variation in abundance. When sampling is done at a given spatial position, such an eect may appear as confounded with the sampling overdispersion. Assuming constant species number during the period of data collection, each X i will be normally distributed with mean value r i /δ and variance σ 2 e /(2δ). We assume that the r i are themselves normally distributed in the community with mean, say r 0 , and variance σ 2 r , the log abundances (X 1 , X 2 , X 3 , . . . , X n ), then is a sample from the normal distribution with mean r 0 /δ and variance σ 2 e /2δ + σ 2 r /δ 2 . The last term is the variance in log carrying capacities among species which are lognormally distributed. This distribution which is unaected by the stochasticity of each species, may be a consequence of species dividing the niche space between them over long time periods. Further, Engen et al. (2002) have shown that two sets of log abundances (X 1 , X 2 , X 3 , . . . , X n ) and (Y 1 , Y 2 , Y 3 , . . . , Y n ) for the same community recorded at time dierences t, can be considered as pairs of log abundances (X 1 , Y 1 ), (X 2 , Y 2 ), . . . , (X n , Y n ) that are independent observations from the bivariate normal distribution with variance σ 2 s /(2δ) + σ 2 r /δ 2 and covariances σ 2 s /(2δ) + σ 2 r /δ 2 . The specic noise σ 2 s represents stochastic components that are independent among species. In addition to variation in carrying capacity that may have be an aspect of long-turn sub-division of the niche space, these terms, which are also generated by species-specic traits varying among species, may be related to niche aspect describing temporal uctuations in the sub-division.
2 Online Resource 2 The Poisson lognormal species abundance distribution To describe integer numbers of individuals in a sample from a community, we rst follow Fisher et al. (1943) and Bulmer (1974) by assuming that the number of individuals of a given species observed in a sample is Poisson distributed with mean proportional to the abundance of the species in the community. The number of individuals, N , sampled for a given species with log abundance x in the community is then Poisson distributed with mean νe x = e x+ln ν , where the parameter ν expresses the sampling intensity. Assuming that abundances of dierent species in the community are lognormally distributed, x is then normally distributed among species with mean µ and variance σ 2 , whereas the log of the Poisson mean, x + ln ν, is normal with mean µ + ln ν and the same variance σ 2 . The number of individuals N sampled from a random species in the community then constitutes a sample from the Poisson lognormal distribution with parameters (µ + ln ν, σ 2 ). The parameter µ in the underlying abundance distribution cannot be estimated unless the sampling intensity is known or can be estimated in some way. The sampling intensity aects the expected number of individuals in the sample, but has no inuence on the form of the distribution as the variance parameter σ 2 is unaected by changing sampling intensities (Engen et al. 2008, Connolly etal. 2009). Following Fisher et al. (1943) and Bulmer (1974), we only consider the numbers of individuals of species that are represented in the sample. Thus, there is an unknown number of species that are present in the community but are absent from a sample from the community. The observed numbers of individuals of the dierent species therefore follow a zero-truncated distribution because the number of species with a count of zero is unknown. The expected fraction of unobserved species given a set of parameters in the Poisson log-normal distribution and a certain sampling is q(0; µ + ln ν, σ 2 ), and the zero-truncated distribution is, q(n; µ + ln ν, σ 2 ) 1 − q(0; µ + ln ν, σ 2 ) (S2) dened for n = 1, 2, . . . The maximum likelihood estimation of the parameters of this distribution was rst derived by Bulmer (1974).

The bivariate lognormal species abundance distribution
If we jointly consider two communities (sampled from dierent locations and/or times), each species will be represented by a realization of a of a bivariate random variable expressing its abundance in the two communities (Engen etal. 2002(Engen etal. , 2008. Still assuming a lognormal species abundance distribution as the marginal distribution in each community, the log abundances in two communities will follow a bivariate normal distribution with parameters (µ 1 ; σ 2 1 , µ 2 ; σ 2 2 ; ρ). The parameters (µ 1 ; σ 2 1 ) and (µ 2 ; σ 2 2 ) represent the marginal lognormal distributions in the two communities. A high positive correlations means that a given species is relatively common (or rare) in the rst community will tend to have similar relative abundance also in the second community. Assuming Poisson sampling leads to a bivariate distribution of the number of individuals of a species in the two samples (N 1 , N 2 ) conditional on presence in at least one of them. Species that are present in both communities but absent in both samples need to be accounted for by using a truncated distribution. The zero-truncated bivariate Poissonlognormal distribution takes the form q(n 1 , n 2 ; µ 1 + ln ν 1 , σ 2 1 , µ 2 + ln ν 2 , σ 2 2 , ρ)) 1 − q(0, 0; µ 1 + ln ν 1 , σ 2 1 , µ 2 + ln ν 2 , σ 2 2 , ρ) , where the function q here is rened for the two dimensional case. The parameters σ 2 1 , σ 2 2 and ρ can be estimated without any knowledge about the unknown sampling intensities ν 1 and ν 2 that may dier among dierent samples. Estimates of µ 1 and µ 2 can only be found if sampling intensities are known. This approach can be generalized to deal with over-dispersion relative to the Poisson. Following Engen et al. (2002), such over-dispersion is handled most eciently by assuming that the number of representatives of a species with log abundance X is Poissondistributed with parameter νV e x , where ln V is itself a normal variate with mean θ 2 /2 and variance θ 2 , so that V is a lognormal variate with mean 1 taking independent values for each single sampling of a species. The distribution of V expressing over-dispersion may include a component due to minor spatial variation generated by demographic stochasticity. With these assumptions the bivariate discrete distribution is still the bivariate Poisson lognormal with the same parameter µ x , µ y and c xy , but with variance parameters σ 2 x + θ 2 and σ 2 y + θ 2 . The R-package poilog (Grøtan and Engen, 2008) Figure S1: Schematic representation of the modelling approach to analyse how environmental variability can aect the maintenance of the species diversity through its eect on community dynamics. The key parameters is the variance of the lognormal distribution (σ 2 ) which enables to quantify the unevenness and can be partitioned to estimate of the components of the community dynamics (see Eq. 5). "H s " represents the functional diversity, "D/0" represents the overdispersion and the demographic stochasticity, "E" represents the environmental stochasticity. The total number of individuals E(N ) can be estimated from the mean parameter µ and the variance parameter σ 2 of the lognormal distribution (see Eq. 6). The black line represents the connection between the parameters of the lognormal species abundance distribution and the community dynamics characteristics; the dotted grey line represents the inuence of seasonality and the long-term environmental variability. The analysis of the eects of environmental variability on species diversity (i.e., unevenness σ 2 ) is realized by using generalized additive mixed models see section "species diversity and environmental uctuations" in the main text and Online Resource 5.

5
Online Resource 4 Zooplankton community in Lake Atnsjøen and environmental variables   Year log(Nt+0.5) Figure S4: Time series of the eight most common species of freshwater zooplankton collected during during the ice-free period from 1990-2017 in the Lake Atnsjøen (Norway).

Species diversity and environmental variables
We chose a set of 5 covariables selected regarding their consistency of measures on every dates of the time-series using the same method and their potential eects on species diversity. Temperature is an important determinant of population growth rate and might aect generation time of zooplankton species (Gillooly, 2000). In Lake Atnsjøen water transparency is partly determined by the phytoplankton biomass (Brettum and Halvorsen, 2004) and by the input of dissolved organic carbon (DOC) which can aect zooplankton communities. The transparency can also aect the predation pressure of sh on the zooplankton, because the planktivorous shes depend on good visibility to detect their zooplankton prey (Mazumder et al., 1990). Phytoplankton biomass is a proxy of primary production that can aect zooplankton abundance and diversity (Hessen et al., 2006). River run-o can have indirect eect on zooplankton community composition by disturbing the chemical and physical properties of the Lake and increasing the zooplankton mortality through wash out (Tvede, 2004). The duration of icecover is important for lake ecological systems as it inuences physical, chemical and biological processes (Walsh et al., 1998).  Figure S6: Species abundance distribution of the pooled zooplankton community of Lake Atnsjøen from 1990 to 2017. Large dots: expected Poissonlognormal distribution, simulated by rst sampling species expected abundances from the tted lognormal and then Poisson sampling numbers of individuals within species; error bars are 95 % condence intervals from repeated simulations. Species abundance data are displayed in discrete bins on a logbase 3 scale, with edges at 3 j /2 for j = 0, 1, 2, 3, ..., containing 1, 2 − 4, 5 − 13, 14 − 40, ... individuals per species (Williams, 1964;Lande et al., 2003). The parameters of the pooled community are estimated from the lognormal distribution using poilog (Grøtan and Engen, 2008). The smoothed curve represents the lognormal distribution tted using poilog.
We estimate each component of the variance of the expected number of individuals in order to identify the component that contributes the most to the sampling variance.
The sampling variance of the total expected number of individuals (i.e., E(N ), Eq. S4) was estimated to be 2.272. The variance of 1 2σ 2 was the highest (1.307) while the variance of µ was lower (0.523) and the covariance between µ and 1 2σ 2 was negative (-0.442). It is the variance in species unevenness which contribute the most to variance to the total expected number of individuals.
14 Estimation of the interannual uctuations of species diversity   Table S3: Linear mixed eect model (n = 110) that describes the relationships between half of the species diversity ( 1 2 σ 2 ) and the expected number of individuals (ln(EN )) including the sampling period as random eect (Fig. 6) Generalized additive mixed eect model Generalized additive mixed model (GAMM) estimate additive non-parametric functions by smoothing splines to model covariate eects (Wood, 2006). A generalized additive mixed model (GAMM) is a generalized linear mixed model (GLMM) in which part of the linear predictor is specied in terms of smooth function of covariates (Lin and Zhang, 1999) such as, where y i is a univariate response at the i th observation. The vector of xed parameters is β; The row of xed eects model matrix is X i . The f j are smooth function of covariates x k ; Z i is a row of a random vector eects model matrix; b ∼ N (0, Ψ θ ) is a vector of random eects coecients with unknown positive denite covariance matrix Ψ θ with parameter θ and i is a residual error vector with a normal distribution.
Environmental predictors were initially included as smooth terms using penalized regression splines with maximally 4 knots (i.e., 3 degrees of freedom) (Wood and Augustin, 2002). To obtain our nal GAMM model we used an hypothesis testing procedure by dropping the least signicant term from the model and ret the model until all terms are signicant and the AIC being minimal (Zuur et al., 2009). We estimated the GAMM by using the function gamm of the mgcv R library.
16 GAMM of the species diversity and environmental variables Table S4: Generalized additive mixed models (n = 140) relating species diversity (σ 2 ) and environmental predictors in the Lake Atnsjøen (Norway) between June and October from 1990 to 2017. The × represents the integration of the variables in the model.   Table S6: Generalized additive mixed models (n = 112) relating species diversity (σ 2 ) and environmental predictors as well as 1-month lagged environmental predictors in the Lake Atnsjøen (Norway) between July and October from 1990 to 2017. The × represents the integration of the variables in the model.
Model Rank Variables 1 2 3 4 5 6 7 8 Temperature The selected GAMM model (Table S5) of the unevenness included years and water temperature was as follows, The selected GAMM model (Table S7) that describes relation between unevenness and 1month lagged water temperature, 1-month lagged phytoplankton and water transparency was as follows, σ 2 i = β 0 + β 1 1-month lagged water temperature+ β 2 1-month lagged phytoplankton+ + β 3 water transparency+ f (1-month lagged water temperature, k=4) + f (1-month lagged phytoplankton, k=4) + f (water transparency, k=4) where σ 2 is the unevenness at the i th observation, βs are the coecient of the xed eect, f and are smooth functions of the predictor variables. The random eects b (i.e., sampling period eect) and residuals i are assumed to be independent and normally distributed. We included additional autocorrelation between observations by introducing an autoregressive correlation structure of order 1 (AR-1) model on the residuals .  Figure S9: Partial plots of smooth function from a general additive mixed model of the predicted relationships between unevenness (σ 2 ) and signicant environmental variables (n = 140, R 2 = 0.23, Table S5) (a) temperature (b) water temperature (c) year. Etchings (rugs) on the inside of the x-axis indicate locations of data points along this axis. Grey shaded area represents 95% GAM specic metric credibility intervals.  Figure S10: Partial plots of smooth function from a general additive mixed eect model of the predicted relationships between unevenness (σ 2 ) and 1− month lagged and without lag environmental variables (n = 112, R 2 = 0.17, Table S7). Etchings (rugs) on the inside of the x axis indicate locations of data points along this axis. Grey shaded area represents 95% GAM specic metric credibility intervals. Linear trend in water temperature Interannual variation in standardized abundance of the most common species October Year (e) Figure S13: Zscores of the most common species through time for each sampling period from 1990 to 2017. A zscore gives an estimation of how much a given data point (here the standardized abundances of a given species) is far from the mean (i.e., mean abundances of a given species over the duration of the study). 23  Figure S14: Estimated values of the slope and their 95% condence intervals of the species for which their uctuations in abundance over time are signicantly explained by (a) the temperature and (b) the water transparency. The estimated slopes were issued from generalized linear models which expressed log abundance of species as function of temperature and water transparency. 24