Effects of choice of baseline on the uncertainty of population and biodiversity indices

Many monitoring programs provide annual indices of relative change over time in some quantitative measure of ecological status, such as population abundance or species richness. These indices are usually scaled relative to a reference year so that they represent change in ecological status compared to this particular year. An issue with this approach is that uncertainty about ecological status in the reference year can propagate into large uncertainty in all other index values. Taking instead the mean of the ecological status over several years as the reference (a reference period) may reduce uncertainty in indices. At present, this approach is not commonly used in practice. I quantitatively evaluate how the choice of reference period affects the uncertainty of two variants of population indices, either estimated independently each year or smoothed over several years, for 100 bird species using monitoring data. Short reference periods containing years early in the series led to reduced uncertainty in independently estimated index values, but not in smoothed indices, compared to using a single reference year. When a long reference period was used, uncertainty was substantially reduced for independently estimated annual indices in particular, but also for smoothed indices. An exception to the reduction in uncertainty with the length of the reference period was found when indices were constrained to be log-linear. Given an appropriate model and indices that are not strictly log-linear, using smoothing and/or reference periods can be useful ways of reducing irrelevant uncertainty in the presentation of indices.


Introduction
Ecological monitoring programs often produce annual estimates of abundance or biodiversity to assess changes in the status of populations or ecosystems (Marsh and Trenham 2008; Fraixedas et al. 2020). Estimating the total number of individuals in a population, or species in a community, is however a difficult task. For birds and butterflies, for instance, regional population estimates may be derived from transect counts where the proportion of missed individuals cannot be quantified and it may not be clear which spatial area the individuals counted belong to (Ralph et al. 1995; van Swaay et al. 2008). Raw abundance estimates therefore often have no easily interpreted scale. To anchor raw estimates of the ecological quantity of interest, and to put a direct focus on temporal change rather than absolute level, they are usually rescaled relative to some baseline into index values. The use of a baseline gives index values a meaningful scale, and indices are interpreted as the proportional change in the ecological quantity relative to the baseline. Each choice of baseline gives a different index, and therefore has different associated uncertainty. A standard choice of the baseline is the first year of study (Gregory et al. 2019) so that indices represent the change in the ecological quantity relative to the first year. An issue with this approach is that uncertainty in the raw estimate in the reference year propagates into the uncertainty of all subsequent index values. This may considerably inflate uncertainty (Buckland and Johnston 2017), and is particularly problematic in cases where uncertainty of the raw estimate in the reference year is larger than in subsequent years, which is often the case if the reference year is the first year of the study and sampling effort increases over time.
One approach to counter this effect is to select a year with a somewhat lower uncertainty as the reference (Fedy and Aldridge 2011), but this still means that uncertainty in the single reference year propagates into all indices. An alternative is to use the mean over multiple years instead of a single year as the reference to try to get a more stably estimated baseline. This kind of reference is not widely used in practice (but see e.g. Carlson et al. 2012; Knape 2016; Gregory et al. 2019). Another approach that has been suggested to reduce the influence of the choice of baseline (Buckland and Johnston 2017) is to use smoothing methods (Siriwardena et al. 1998; Fewster et al. 2000; Buckland et al. 2005; Soldaat et al. 2017; Harrison et al. 2014; Knape 2016). By smoothing indices over multiple years more stable indices may be obtained, but also in this case using a single reference year to anchor the smoothed index is common practice.
A broader empirical and quantitative assessment of how the choice of reference affects the uncertainty of population indices is currently lacking, and is the aim of this study. I compare indices defined from a single reference year to indices defined from reference periods consisting of sequences of years of varying lengths. I examine how the choice of reference period affects the uncertainty of annual indices estimated independently each year, and of smoothed index estimates obtained from GAMMs, for 100 bird species in Sweden.

Methods
For ease of presentation, I discuss abundance indices in the following sections, but the general ideas apply more broadly to ecological indices of temporal change, and particularly to biodiversity indices. Abundance indices are often derived from models with a log-link and, again for the sake of presentation, we assume that we have raw unscaled annual abundance indices over time at the log scale, $\hat{\mu}_1, \hat{\mu}_2, \ldots$ These may, for example, be estimated from fixed year effects in a Poisson GLM or, in the case of smoothed indices, from a GAM with a Poisson response. If year 1 is used as the reference, then a standard relative abundance index in year $t$ is

$$\hat{I}_t = \frac{\exp(\hat{\mu}_t)}{\exp(\hat{\mu}_1)} = \exp(\hat{\mu}_t - \hat{\mu}_1).$$

To evaluate the uncertainty of indices we focus on the variance of $\log(\hat{I}_t)$, from which approximate confidence intervals and standard errors can be computed. The reason for focusing on the variance at the log scale is that it is unaffected by simple scaling of the index, e.g. the variance of $\log(\hat{I}_t)$ is identical to that of $\log(100\hat{I}_t)$. The variance of $\log(\hat{I}_t)$ can be expressed as

$$V(\log \hat{I}_t) = V(\hat{\mu}_t) + V(\hat{\mu}_1) - 2\,\mathrm{Cov}(\hat{\mu}_t, \hat{\mu}_1).$$

This shows that the uncertainty of both $\hat{\mu}_1$ and $\hat{\mu}_t$ contributes to the variance of $\log(\hat{I}_t)$, and that the variance of $\hat{\mu}_1$ will dominate if the uncertainty of $\hat{\mu}_1$ is larger than that of $\hat{\mu}_t$, which is often the case in practice. The approach evaluated here is to try to reduce the part of the variance in $\log(\hat{I}_t)$ that is due to uncertainty about the index in the reference year by using multiple reference years. Using 2 years as the reference we can define an alternative index

$$\hat{I}_t = \frac{\exp(\hat{\mu}_t)}{\frac{1}{2}\left(\exp(\hat{\mu}_1) + \exp(\hat{\mu}_2)\right)}.$$

The arithmetic mean of the raw index in the first 2 years is the reference for this index, and we would often expect the alternative index to have less uncertainty due to the denominator being more precisely estimated. This idea can be extended to use the mean over the first $l$ years as the reference:

$$\hat{I}_t = \frac{\exp(\hat{\mu}_t)}{\frac{1}{l}\sum_{j=1}^{l}\exp(\hat{\mu}_j)}.$$

As $l$ increases we would expect the uncertainty of $\log(\hat{I}_t)$ to decrease further. Whether, and to what extent, this happens in practice is the focus of this paper.
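As a small numerical illustration of how the variance of the reference year enters the single-year index, the sketch below evaluates the variance of the log index for two different reference years. The function name `var_log_index` and the variances in `cov` are hypothetical and chosen only for illustration; they are not estimates from the case study.

```python
import numpy as np

def var_log_index(cov, t, ref):
    """Variance of log(I_t) = mu_t - mu_ref, given the covariance matrix of the mu estimates."""
    return cov[t, t] + cov[ref, ref] - 2.0 * cov[t, ref]

# Hypothetical variances of the log-scale estimates for four years (independent errors).
# The first year is the least precisely estimated, as when few routes are surveyed early on.
cov = np.diag([0.09, 0.04, 0.01, 0.01])

var_ref_first = var_log_index(cov, t=3, ref=0)  # 0.01 + 0.09
var_ref_third = var_log_index(cov, t=3, ref=2)  # 0.01 + 0.01
print(var_ref_first, var_ref_third)
```

With these made-up numbers, most of the variance of the index in the last year comes from the reference year when the imprecise first year is used as the baseline.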
I will explore this in an analysis of monitoring data but first briefly describe how the uncertainty of the reference period indices may be computed in practice.

Computing uncertainty estimates for reference period indices
Compared to single year reference indices, $\log(\hat{I}_t)$ for reference period indices is non-linear as a function of the $\hat{\mu}_j$ in the reference period. This can make it more difficult to compute uncertainty estimates. One approach to do so is to use a delta approximation (see Appendix A). A second approach is simulation methods such as bootstrapping where all $\hat{\mu}_t$ are generated according to some simulation procedure, and the non-linear function $\log(\hat{I}_t)$ is computed from the samples. A third approach is Bayesian methods based on Monte Carlo integration, used for example for indices for the North American breeding bird survey (Link and Sauer 2002). In such cases, Monte Carlo samples of the $\hat{\mu}_t$ will usually be available and $\log(\hat{I}_t)$ can be computed for each sample to yield a posterior distribution. A fourth option would be to use the geometric mean of the $\exp(\hat{\mu}_j)$ as the reference instead of the arithmetic mean. Then $\log(\hat{I}_t)$ would be linear as a function of the $\hat{\mu}_j$ and the uncertainty could be computed from contrasts. Reference period indices are currently implemented in the R-packages rtrim (Bogaart et al. 2020) and poptrend (Knape 2016). The rtrim package uses a delta approximation (Appendix A). The poptrend package, used in the case study on Swedish birds and simulations below, instead uses a simulation approach. Simulated parameter estimates are drawn from a multivariate normal distribution with covariance matrix equal to a covariance matrix for the parameter estimates (Wood 2006b; Mandel 2013; Harrison et al. 2014).
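The simulation approach can be sketched as follows. This is a minimal Python illustration and not the poptrend implementation: it assumes that the log-scale annual estimates and their covariance matrix are available directly, whereas poptrend draws model parameters and derives the indices from those. The function names and the numbers in `mu_hat` and `cov` are hypothetical.

```python
import numpy as np

def log_index_draws(mu_hat, cov, t, ref_years, n_sim=20000, seed=1):
    """Simulated values of log(I_t) for an arithmetic-mean reference period."""
    rng = np.random.default_rng(seed)
    # Draw log-scale annual estimates from a multivariate normal approximation.
    draws = rng.multivariate_normal(mu_hat, cov, size=n_sim)
    # Reference level: arithmetic mean of exp(mu_j) over the reference years.
    ref_mean = np.exp(draws[:, ref_years]).mean(axis=1)
    return draws[:, t] - np.log(ref_mean)

def ci_width(samples, level=0.95):
    """Width of a percentile interval computed from simulated values."""
    lower, upper = np.quantile(samples, [(1 - level) / 2, 1 - (1 - level) / 2])
    return upper - lower

# Hypothetical inputs: ten years, with the early years estimated less precisely.
mu_hat = np.zeros(10)
cov = np.diag([0.09, 0.04, 0.02, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01])

width_single = ci_width(log_index_draws(mu_hat, cov, t=9, ref_years=[0]))
width_period = ci_width(log_index_draws(mu_hat, cov, t=9, ref_years=[0, 1, 2, 3, 4]))
print(width_single, width_period)
```

Under these assumed variances the five-year reference period gives a clearly narrower interval than the single first year, because the mean over several years is more precisely estimated than any single early year.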

Case study
To investigate how multiple reference years affect the uncertainty of population indices in practice, I analyzed data from the Swedish Bird Survey (Lindström and Green 2020). These data consist of annual line transect counts of birds from about 700 survey routes spread across a regular grid over Sweden. Not all routes are surveyed in every year. The survey was initiated in 1996 when 47 routes were surveyed, increasing to 84 in 1997, 166 in 1998, 179 in 1999, and 203 in 2000. The number of routes then continued to increase and 400-500 routes are now surveyed annually. Because so few routes were surveyed in the first 2 years, 1998 is used as the single reference year to compute official indices.
I ranked the species in the survey according to the number of non-zero counts across all years and routes. I then selected the 100 species with the most non-zero counts for analyses of the impact of the baseline on index uncertainty. For each species, I first removed routes that had only zero counts and then fitted (I) a negative binomial regression model with a log link, and route and year as fixed factors and (II) a negative binomial GAMM model with log link, route as a fixed factor, and year both as a smooth function and as a random effect (Knape 2016). The smooth function was implemented as a cubic regression spline. To get a uniform analysis across all 100 species I fixed the number of degrees of freedom in the GAMM at 8. This conforms with other empirical studies of bird census data (Fewster et al. 2000). I fixed the degrees of freedom since model selection of degrees of freedom in the GAMM analysis can lead to smooth functions that are near linear (1 degree of freedom). In this case there is a simple approximate relationship between uncertainty and the choice of reference period, which is examined further in Appendix B. The random time effect in the GAMM model was included to handle short term variation in abundance, and can affect uncertainty of the estimated trend (Knape 2016). The overdispersion parameter of the negative binomial distribution was estimated rather than fixed a priori, but was not included in the covariance matrix used to estimate uncertainty. Indices were then computed from the estimated year effects for (I) (hereafter referred to as 'independently estimated annual indices'), or from the estimated smooth function evaluated at the years of interest (see below) for (II) (referred to as 'smoothed indices').
To evaluate the effect of choice of reference year or period on uncertainty in the resulting indices, I compared the uncertainty of index values representing year 2020 computed from different reference periods. I used two sets of reference periods, one containing years early in the series when the number of surveyed routes was low, and another containing years in the middle part of the series when more routes were surveyed. Specifically, the first set had reference periods of varying lengths that all ended in 2000 (i.e. 2000, 1999-2000, 1998-2000, etc), and the second set had periods that all ended in 2010 (2010, 2009-2010, 2008-2010 etc). The longest reference period used all 15 years from 1996 to 2010. For the same species and model, all reference period indices were computed from the same model fit. In other words, all years were included in model fits irrespective of the reference period.
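The two sets of reference periods can be enumerated with a few lines of code. The helper `reference_periods` below is a hypothetical name used only for this sketch; it lists the periods that end in a given year and extend backwards one year at a time to the start of the survey.

```python
def reference_periods(end_year, first_year=1996):
    """All reference periods ending in end_year, from a single year back to first_year."""
    return [list(range(start, end_year + 1))
            for start in range(end_year, first_year - 1, -1)]

early_set = reference_periods(2000)  # [2000], [1999, 2000], ..., [1996, ..., 2000]
mid_set = reference_periods(2010)    # [2010], [2009, 2010], ..., [1996, ..., 2010]
print(len(early_set), len(mid_set))
```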
For all models and baselines, the amount of uncertainty was measured via the width of a 95% confidence interval for the log scale index in 2020. Data were analyzed using the R-package poptrend (Knape 2016), which uses mgcv (Wood 2006a) as the model fitting engine, and computes confidence intervals using the simulation procedure described above (Wood 2006b;Mandel 2013).
To complement the case study with a more controlled set up, I also analyzed simulated data. As the results were largely similar to the results of the case study the details are provided in Appendix C. R-code for the analyses is available in Supplement 1.

Independently estimated annual indices
In the case of reference periods at the start of the series (ending in 2000), increasing the number of years in the reference period from one to two reduced the width of confidence intervals by between a few percent and almost 60%, with a median of 18% (Fig. 1a). The median reduction when using three years was around 24% compared to the single year 2000, while also including the first 2 years, with fewer routes sampled, on average did not lead to further reductions but instead to slight increases. In a very small number of cases, a reference period including years early in the series led to higher uncertainty than for a single reference year.
Using a single reference year (2010) in the middle of the series gave uncertainty in the same range as when using multiple reference years early in the series (Fig. 1b). With longer reference periods ending in the middle of the series the width of confidence intervals was reduced by up to over 60%, with a median of over 40%. Including the first few years in these reference periods led to only slight increases in uncertainty (Fig. 1b).

Smoothed indices
For reference periods at the start of the series, more baseline years did not have a strong effect on uncertainty in the smoothed indices (Fig. 1c). Indices with reference periods ending in the middle of the series had lower uncertainty (Fig. 1d). This was not only a consequence of more recent years being included in the reference period, as extending the reference period backwards in time gave less index uncertainty than using only year 2010 as the reference (Fig. 1d). This is the opposite of what one would expect for a log-linear index, for which uncertainty is mainly determined by the location of the midpoint of the reference period (Appendix B).
The magnitude of reduction in uncertainty for longer reference periods was lower than for independently estimated annual indices. The maximum reduction in uncertainty was on average around 15% compared to using year 2000 as the reference (Fig. 1d).
All results reported above are scaled by the width of confidence intervals for an index with the single year 2000 as the reference period ((width of the confidence interval for index with reference period)/(width of the confidence interval for index with year 2000 as the reference)). This scaling removes heterogeneity in estimation errors due to species properties, such as abundance and overdispersion. The variability in this scaling factor (the denominator of the ratio) is presented in Fig. 2.
An illustration of the different models fitted to data on willow warblers can be found in Appendix D (Fig. 8).

Discussion
Choosing a baseline to anchor indices of population or biodiversity status is often necessary for meaningful presentation of change. The typical baseline choice is a single year in early parts of the series. The results of this study show that when annual indices are independently estimated, uncertainty can be greatly reduced by redefining the index using a longer reference period as the baseline. A single reference year may give the impression that uncertainty about population change or biodiversity is large, while in fact the main portion of uncertainty comes from asserting the level in the specific reference year. A longer reference period may therefore be beneficial and give a more accurate picture of uncertainty.
Previous studies have suggested that smoothed indices are less sensitive to the choice of reference year or period (Buckland and Johnston 2017). The results of this study confirm this empirically in that the length of reference periods had less impact on uncertainty than for independently estimated annual indices, and for all baselines there was less variation in uncertainty. Even so, smooth indices with long reference periods had, on average, around 15% less uncertainty than smooth indices with a single reference year. Longer reference periods therefore can be useful also for smooth indices, as long as the smooth index is not near linear in which case there is not much to gain from using reference periods instead of a single year (Appendix B).
When deciding on a baseline, the first priority should be a choice that reflects the purpose of the index. If the purpose is to compare the current status to the status in a specific year in the past, then a single reference year is appropriate despite potentially high uncertainty. In other cases, a reference period may fit well with the purpose of the index. Using a moving ten year period as the reference may for example fit well with IUCN red list assessments where change during the last 10 year period is an important criterion, or in the context of consequences of climate change a reference period coinciding with a climate normal period such as 1961-1990 could be suitable. Often there may not be an obvious year or period against which comparisons should be made, as the main purpose of many indices is in understanding how the size of a population, or the biodiversity of a community, has changed relative to the past in a more loose sense. In such cases choosing a baseline so that the index does not convey irrelevant uncertainty should be an important consideration. In light of the results here, using the mean over a large part, or all, of the series, or over the previous ten-year period (Buckland and Johnston 2017) seem like reasonable default choices in such situations.
Alternative suggestions have been to entirely remove the influence of baseline years by focusing on the slope or curvature of estimated smooth index curves (Buckland and Johnston 2017). Specifically, p-values for whether the first or second derivative of the smooth curve deviates from zero may be computed (Fewster et al. 2000), and one may use these to produce indicators for periods where the change in the curve, or the slope of the curve, is significant. Such indicators are a highly useful complement to indices, but address a partially different question. They are indicators for periods of change in status, but do not provide direct estimates of the cumulative magnitude of the change. Both of these are of prime interest for monitoring, and can be simultaneously presented in displays of indicators.
Uncertainty is an important but sometimes neglected component of population indices (Fraixedas et al. 2020). It is imperative that uncertainty estimates account for the main sources of error, which mainly comes down to a sound choice of model for producing raw index estimates. Given that important sources of error in the data have been properly accounted for, indices should be presented in a way that does not include irrelevant uncertainty. Smoothing indices and/or using longer reference periods, are useful approaches to achieve this.

Appendix A: Delta approximation for reference period indices at the log scale
For a scalar function $g$ taking input in the form of a vector $x$, the delta approximation for the variance of $g(X)$, where $X$ is a multivariate random variable with mean vector $\mu$ and variance matrix $\Sigma$, is (Ver Hoef 2012)

$$V(g(X)) \approx \nabla g(\mu)^T \, \Sigma \, \nabla g(\mu). \qquad \text{(A.1)}$$

We first assume that we have estimates of abundance or biodiversity at the log scale, $\mu_t$, as in the main text, and also that we have estimates of uncertainty for these in the form of variances $V(\mu_t)$ and covariances $\mathrm{Cov}(\mu_j, \mu_k)$. The delta approximation gives us a way of approximating the variance of $\log(\bar{I}_t)$ from this information about the $\mu_t$.
Here, we are interested in the function $\log(\bar{I}_t) = \mu_t - \log\left(\frac{1}{m}\sum_{j=1}^{m} e^{\mu_j}\right)$. We restrict attention to the case $t > m$; minor adjustments below would be needed to cover the case $t \le m$. The vector $\nabla \log(\bar{I}_t)$ contains the derivatives of $\log(\bar{I}_t)$ with respect to $\mu_j$ for $j = 1, \ldots, m-1, m, t$. For $j \le m$ the derivative is

$$\frac{\partial \log(\bar{I}_t)}{\partial \mu_j} = -\frac{e^{\mu_j}}{\sum_{k=1}^{m} e^{\mu_k}},$$

and for $t$ the derivative is equal to 1. The gradient therefore becomes:

$$\nabla \log(\bar{I}_t) = \left(-\frac{e^{\mu_1}}{\sum_{k=1}^{m} e^{\mu_k}}, \ldots, -\frac{e^{\mu_m}}{\sum_{k=1}^{m} e^{\mu_k}}, 1\right)^T.$$

The variance matrix $\Sigma$ is defined from the variance and covariance terms $\mathrm{Cov}(\mu_j, \mu_k)$. To compute the approximate variance of $\log(\bar{I}_t)$ in practice it is often convenient to use (A.1) directly by plugging in the gradient and the covariance matrix. However, we can also expand the matrix product to arrive at a direct expression for the variance as shown below.
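The $2m + 1$ terms involving $\mu_t$ in the sum $\nabla \log(\bar{I}_t)^T \, \Sigma \, \nabla \log(\bar{I}_t)$ combine into:

$$V(\mu_t) - \frac{2}{\sum_{k=1}^{m} e^{\mu_k}} \sum_{j=1}^{m} e^{\mu_j}\,\mathrm{Cov}(\mu_j, \mu_t).$$

The $m^2$ terms not involving $\mu_t$ combine into:

$$\frac{1}{\left(\sum_{k=1}^{m} e^{\mu_k}\right)^2} \sum_{j=1}^{m}\sum_{k=1}^{m} e^{\mu_j + \mu_k}\,\mathrm{Cov}(\mu_j, \mu_k).$$

Adding the two sums together gives

$$V(\log \bar{I}_t) \approx V(\mu_t) - \frac{2}{\sum_{k=1}^{m} e^{\mu_k}} \sum_{j=1}^{m} e^{\mu_j}\,\mathrm{Cov}(\mu_j, \mu_t) + \frac{1}{\left(\sum_{k=1}^{m} e^{\mu_k}\right)^2} \sum_{j=1}^{m}\sum_{k=1}^{m} e^{\mu_j + \mu_k}\,\mathrm{Cov}(\mu_j, \mu_k).$$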
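The log-scale delta approximation can be sketched in a few lines of Python by plugging the gradient and the covariance matrix into the quadratic form. The function name `delta_var_log_index` and the example values are hypothetical, and the sketch assumes that the evaluation year lies outside the reference period.

```python
import numpy as np

def delta_var_log_index(mu, cov, ref, t):
    """Delta approximation of V(log I_t) for an arithmetic-mean reference period.

    mu  : log-scale annual estimates
    cov : covariance matrix of mu
    ref : list of positions of the reference years (must not contain t)
    t   : position of the evaluation year
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    weights = np.exp(mu[ref])
    weights = weights / weights.sum()      # e^{mu_j} / sum_k e^{mu_k}
    grad = np.append(-weights, 1.0)        # derivatives w.r.t. mu_ref and mu_t
    idx = list(ref) + [t]
    sub_cov = cov[np.ix_(idx, idx)]        # covariance of (mu_ref, mu_t)
    return float(grad @ sub_cov @ grad)

# Hypothetical estimates for four years with independent errors.
mu = np.log([10.0, 12.0, 11.0, 15.0])
cov = np.diag([0.09, 0.04, 0.02, 0.01])

v_single = delta_var_log_index(mu, cov, ref=[0], t=3)
v_period = delta_var_log_index(mu, cov, ref=[0, 1, 2], t=3)
print(v_single, v_period)
```

With a single reference year the weights reduce to 1, so the function returns the exact variance of the difference between the two log-scale estimates.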

Delta approximation at the arithmetic scale
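The delta approximation can also be used on the arithmetic scale to compute standard errors of $\bar{I}_t$ from the variance of $M_j := e^{\mu_j}$. This is what the rtrim package does to compute the uncertainty for reference period indices. In this case $\bar{I}_t = M_t / \left(\frac{1}{m}\sum_{j=1}^{m} M_j\right)$, the gradient is

$$\nabla \bar{I}_t = \left(-\frac{m M_t}{\left(\sum_{k=1}^{m} M_k\right)^2}, \ldots, -\frac{m M_t}{\left(\sum_{k=1}^{m} M_k\right)^2}, \frac{m}{\sum_{k=1}^{m} M_k}\right)^T$$

and $\Sigma$ is the covariance matrix for the $M_j$.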
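The arithmetic-scale version can be sketched in the same way. This is an illustration of the formula only and is not code from rtrim; the function name `delta_var_index` and the numbers are hypothetical, and the evaluation year is again assumed to lie outside the reference period.

```python
import numpy as np

def delta_var_index(M, cov_M, ref, t):
    """Delta approximation of V(I_t) where I_t = M_t / mean(M_ref)."""
    M = np.asarray(M, dtype=float)
    cov_M = np.asarray(cov_M, dtype=float)
    m = len(ref)
    total = M[ref].sum()
    # Derivative w.r.t. each reference year, then w.r.t. the evaluation year.
    grad = np.append(np.full(m, -m * M[t] / total**2), m / total)
    idx = list(ref) + [t]
    sub_cov = cov_M[np.ix_(idx, idx)]
    return float(grad @ sub_cov @ grad)

# Hypothetical two-year example: I_t = 4 / 2 = 2 with independent errors.
M = [2.0, 4.0]
cov_M = np.diag([0.04, 0.16])
var_index = delta_var_index(M, cov_M, ref=[0], t=1)
se_index = np.sqrt(var_index)
print(var_index, se_index)
```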

Appendix B: Uncertainty of linear indices
We here examine properties of indices that are strictly linear at the log-scale. When the trend line is assumed to be log-linear a simple approximate expression can be derived for the uncertainty of indices as a function of the reference period. One can then show that the relative uncertainty of the linear indices does not depend on the details of the sampling model. Assume that uncertainty is evaluated for year $t_0 + n$ (say 2020 = 2010 + 10 in our case) and that the reference period is (in R-like notation) $t_0 + (m - l + 1) : m$ so that $m$ is the last year of the reference period (relative to $t_0$) and $l$ is the number of years in the reference period. To simplify notation, assume that $t_0$ corresponds to year 2010 and set $t_0 = 0$, which is equivalent to redefining $t$ as the number of years since 2010. The reference period then is simply $(m - l + 1) : m$.
To derive an approximate estimate of the uncertainty of the index in year $n$ we compute the variance of the index

$$\hat{\mu}_n - \frac{1}{l}\sum_{j=m-l+1}^{m} \hat{\mu}_j.$$

Recall that $\hat{\mu}$ was defined at the log scale so that we are effectively using the geometric mean over the reference period rather than the arithmetic mean as in the case study. This is only an approximation of the uncertainty for indices defined from arithmetic mean reference periods, but it holds approximately if the annual indices do not vary considerably, i.e. if the slope of the trend line is not steep. It turns out to work quite well for linear estimates of trends for the 100 species in the case study (Fig. 3).
If the index is derived from a linear model we have $\hat{\mu}_t = \hat{\alpha} + \hat{\beta} t$ for some estimated intercept $\hat{\alpha}$ and slope $\hat{\beta}$. The (log-scale) index in year $n$ is then

$$\hat{\mu}_n - \frac{1}{l}\sum_{j=m-l+1}^{m} \hat{\mu}_j = \hat{\beta}\left(n - m + \frac{l-1}{2}\right)$$

and its variance is

$$\left(n - m + \frac{l-1}{2}\right)^2 V(\hat{\beta}).$$

Uncertainty is measured via the width of confidence intervals for the index in year $n$, which should be approximately proportional to the standard deviation of the index. The uncertainty of the index with reference period $(m - l + 1) : m$ relative to the index with year 0 (corresponding to year 2010) as the reference is then approximately

$$\frac{n - m + (l-1)/2}{n}.$$

This expression suggests that the reduction in uncertainty for linear indices is mainly determined by the proximity of the center (midpoint) of the reference period, $m - (l-1)/2$, to the evaluation year. Note that the variance of the slope estimate, which depends on the details of the sampling model, cancels out of the relative uncertainty calculated above, so this result is independent of the sampling model. To check how linear indices behave for the case study of 100 bird species, I fitted models with a linear effect of year plus random year and site effects at the log-scale and a negative binomial response distribution (analogously to the setup in the main text). Relative uncertainty is shown in Fig. 3, and corresponding scaling factors in Fig. 4. Extending reference periods backwards in time (i.e. keeping $m$ fixed but increasing $l$) leads to increased uncertainty (Fig. 3), which may seem counterintuitive. However, this is due to the resulting backward shift of the center of the reference period. If we alternatively extend reference periods both forward and backward in time while keeping their centers fixed, the uncertainty of linear indices is largely unaffected by the length of the reference period. This is shown in Fig. 5 where reference periods are centered at year 2003 and simultaneously extended both forward and backward in time (2003, 2002-2004, 2001-2005 etc.).
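The approximate relative uncertainty of a log-linear index depends only on the distance between the evaluation year and the midpoint of the reference period, which a short sketch makes concrete. The function name is hypothetical, and years are measured relative to 2010 as in the text above.

```python
def relative_uncertainty_linear(n, m, l):
    """Approximate uncertainty of a log-linear index in year n with reference period
    (m - l + 1):m, relative to the same index with year 0 as the single reference year."""
    return abs(n - m + (l - 1) / 2) / n

n = 10  # year 2020 relative to 2010

# Reference periods ending in 2010 (m = 0) and extended backwards in time.
ending_2010 = [relative_uncertainty_linear(n, 0, l) for l in (1, 5, 15)]

# Reference periods centered at 2003 (2003, 2002-2004, 2001-2005):
# the last year is m = -7 + k and the length is l = 2k + 1.
centered_2003 = [relative_uncertainty_linear(n, -7 + k, 2 * k + 1) for k in (0, 1, 2)]

print(ending_2010, centered_2003)
```

Extending the period backwards from 2010 moves its midpoint away from 2020 and increases the ratio, whereas periods with a fixed midpoint in 2003 all give the same ratio.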
[Figure caption fragment] Uncertainty is measured via the width of 95% confidence intervals at the log scale for indices in year 2020. In panels a and b uncertainty is relative to the uncertainty of independently estimated annual indices using year 2000 as the baseline, and in c and d relative to smooth indices using 2000 as the baseline.

Data for the simulations referred to in the main text (Appendix C) were generated with 100 species, 400 sites and for a time period corresponding to 1996-2020, but with no missing data for any of the site and year combinations. I used an overall intercept of 1 at the log scale and site effects were drawn iid from a normal distribution with mean zero and standard deviation 0.5. Year effects were composed of two parts, a quadratic function peaking in 2006, $-0.002(\text{year} - 2006)^2$, and added to that random iid draws from a normal distribution with mean zero and standard deviation 0.1. The negative binomial size parameter was set to 1 in the simulations. The same models as in the main text were fitted to each simulated data set. Results showed similar patterns in reduction of uncertainty with increasing length of the reference period as for the case study in the main text (Fig. 6, corresponding scaling factors in Fig. 7). However, as the data were simulated with identical sample sizes for all years, independently estimated annual indices had similar uncertainty when the reference period ended in 2000 as when it ended in 2010, and smooth indices with the single reference year 2010 had slightly higher uncertainty than indices with the single reference year 2000.
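The data-generating process described above can be sketched for a single simulated species as follows. This is a Python illustration and not the R code from Supplement 1. It assumes that the negative binomial is parameterized by its mean and a size parameter, so that the variance is mean + mean^2 / size; the function name and the seed are hypothetical.

```python
import numpy as np

def simulate_counts(n_sites=400, first_year=1996, last_year=2020, seed=0):
    """Simulate a sites-by-years matrix of counts for one species."""
    rng = np.random.default_rng(seed)
    years = np.arange(first_year, last_year + 1)
    site_effects = rng.normal(0.0, 0.5, size=n_sites)
    # Quadratic trend peaking in 2006 plus iid year-to-year noise.
    year_effects = -0.002 * (years - 2006) ** 2 + rng.normal(0.0, 0.1, size=years.size)
    # Log-linear predictor with an overall intercept of 1.
    log_mean = 1.0 + site_effects[:, None] + year_effects[None, :]
    mean = np.exp(log_mean)
    size = 1.0  # negative binomial size parameter (assumed mean/size parameterization)
    # NumPy uses (n, p) with mean n * (1 - p) / p, so p = size / (size + mean).
    return rng.negative_binomial(size, size / (size + mean))

counts = simulate_counts()
print(counts.shape)
```

Repeating the call with different seeds would give the replicate data sets, one per simulated species, to which the models are then fitted.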

Appendix D: Example indices
Examples of the different models for data on willow warbler are shown in Fig. 8.

Fig. 8 Comparison of indices and their uncertainty (95% confidence intervals) for the willow warbler using a single reference year early in the series (2000), a reference period with the first five years of the study, a single reference year in the middle part of the series (2010), and a long reference period consisting of all years between 1996 and 2010. The first row shows independently estimated annual indices, the second row shows smooth GAMM indices with 8 degrees of freedom, and the bottom row shows log-linear indices. Green segments show periods of significant increase and red segments significant decrease, computed using finite differences (Fewster et al. 2000).