1 Introduction

Several studies have produced evidence of declines in the diversity and abundance of some pollinators, including managed and unmanaged bees (Biesmeijer et al. 2006; Potts et al. 2010). Multiple causes contribute to this phenomenon, but habitat loss (including loss of floral resources) is one of the most commonly cited factors (Winfree et al. 2009). Floral resource availability is therefore seen as a crucial driver in studies of the abundance of such pollinator populations and the pollination services they provide (Roulston and Goodell 2011). Assessing the floral resources of a highly structured landscape composed of several habitats is key to better understanding the ecosystem services provided by pollinators.

Landscape assessments of floral resource availability make use of spatially explicit land use information comprising land use classes, which represent administrative classes rather than climatically and ecologically meaningful classes. Even for a specific climatic region, the variability of floral coverage for a given land use class can be large from one site to another, due to the species growing at each site, but also to differences in management and in environmental conditions such as solar radiation and soil moisture. An example of this variability can be seen in Fig. 1, which shows semi-natural grassland sites at different locations. Floral resource availability also varies over a growing season, since different plant species bloom in spring compared to early and late summer. This variability is conceptualized in floral resource assessments by dividing a season into different floral periods. This approach is used for example in landscape assessments of pollinator abundances using spatially explicit models (Lonsdorf et al. 2009). These models take into account the spatial structure of the landscape and more specifically the distance between nests and floral resources. Thus both spatial and temporal modelling of floral resources are important when assessing floral resource availability at the landscape level.

Fig. 1 Four different sites falling under the ‘semi-natural grassland’ category in southern Sweden. Source: Maj Rundlöf

Floral cover in a particular land use class or habitat can be estimated from field measurements of floral cover. Information is up-scaled to the landscape level by considering the frequency of habitats in each land use class and shapes of land use class information. A simple model would be to average the repeated measurements obtained during the seasonal period of interest. However, floral cover measurements from the same site are likely to be more similar to each other than to measurements from another site. Failing to account for the dependency structure between data from the same site may lead to a misspecified model, the main consequence being a biased estimate of the error or variability of the final parameter estimates. This can have implications when testing for differences in floral cover between habitats or periods: under-estimating uncertainty could lead to falsely detecting a difference more often than expected.

Due to the lack of data for many land use classes and habitats, a common practice is to rely on expert judgment (see e.g. Lonsdorf et al. 2009; Polce et al. 2013; Koh et al. 2015, among others). Expert elicitation can be carried out by asking experts to provide a ‘best guess’ and, on some occasions, a range of possible values for the parameters of interest (O’Hagan et al. 2006). However, expert elicitation is sensitive to several cognitive biases, and careful attention is needed to avoid or adjust for these. Biased floral cover estimates or disregard of floral cover variability could lower the quality of assessments relying on them. Approaches combining expert judgments and available field measurements are needed to strengthen the assessment of floral resources.

The objective of this study is to derive a model to estimate the spatio-temporal variation in floral coverage in a mosaic landscape consisting of several different land-use types, using both expert judgments and empirical data. The intention is to build a model for floral cover parameters in pollinator abundance assessments such as Lonsdorf et al. (2009). Temporal within-season variability is therefore handled by dividing the season into at least two periods. The model is further developed to (1) quantify variability in land use specific floral cover across the landscape, and to (2) consider possible dependencies in data. The model is applied to floral cover estimation using field measurements from a mosaic agricultural landscape in southern Sweden. Possible dependencies between observations are accounted for using copulas, and based on the results, we compare the performance of alternative copulas in accounting for data dependencies and look for possible biases in expert judgments on floral cover.

2 Material and methods

The starting point for the assessment consists of best guesses and ranges of floral covers obtained from expert elicitation. In addition to these expert judgments, observed floral covers are also available. The possible dependency between repeated measurements obtained at a site but at different occasions during the season is then considered by specifying a model for all the seasonal periods using a hierarchy of copulas to specify the multivariate distribution of observations.

2.1 Model

2.1.1 Variability in floral cover

The model assesses floral cover \(\varPhi \) for each land use class separately. Floral cover is defined as the proportion of a site s that is covered by flowers and is bounded between 0 and 1. Based on previous expert knowledge, a unimodal distribution was deemed appropriate to model the variability of floral coverage, and thus it was modeled as a Beta distributed random variable: \(\varPhi \sim {{\mathrm{Beta}}}(a,b)\). The Beta distribution is very flexible and allows for different shapes from highly skewed to symmetric ones (see Fig. 2) and is suitable to express uncertainty in random proportions. It also includes the particular case of the uniform distribution on the [0, 1] interval, when \(a=b=1\).

Fig. 2 Different shapes of the Beta distribution according to a and b values

A season is divided into at least two seasonal periods (as in e.g. Lonsdorf et al. 2009), and floral cover for seasonal period p is assigned a period-specific Beta distribution with parameters \((a_{p},b_{p})\).

For each land use class and season there are n sites denoted by \(s=1,\dots ,n\), with multiple observations of floral cover. The season is then divided into periods, and each observation made within a period is seen as a repeated measurement of the true (unobserved) floral cover in that land use class during that period (see Fig. 3 for an example where the season is divided into two periods according to the blooming period of oilseed rape).

Fig. 3 a) Location of the sites on the map of the province of Scania, at the southern tip of Sweden, and b) Data structure for a given land use category, where \(\varPhi _p\) is the true (unobserved) floral cover in period p, and \(y_{spr}\) is the r-th repeated observation of floral cover made in period p at site s

In the example below, sites were selected by stratified randomization along a gradient of landscape heterogeneity, and within a given site, data were sampled on four occasions. This sampling scheme introduces a possible dependency between data points sampled from the same site. In the next section a model is formulated which takes this dependency into account when assessing period-specific floral cover for a given land use type.

2.1.2 Dependency structure

Observations from different sites are assumed to be independent since the sampled sites were at least 7 km apart, while we allow for dependency between observations made at the same site but at different times. Further, observations are assumed to have a stronger dependency within seasonal periods than between them, because observations from the same period are closer in time than replicates from the other period. Finally, we did not keep information about the exact dates at which observations were made, but only the corresponding periods. Since the time elapsed between two observations differs from one site to another, this simplification allowed us to treat data from each site in a similar way. Thus a hierarchical dependence structure is assumed, where dependence is higher for data gathered during a seasonal period than between two seasonal periods (see Fig. 4).

2.1.3 A multivariate distribution with the dependency structure

A flexible way to consider the type of dependency described here is to use copulas. In simple terms, a copula is a multivariate probability distribution derived from the marginal distributions of each variable and a dependency structure between these variables. A key point is that a univariate marginal distribution \(\mathscr {D}\) can be uniquely described by a uniform distribution on the range [0, 1], using the inverse of its cumulative probability function \(F^{-1}_{\mathscr {D}}\).

A copula C is a multivariate cumulative distribution function whose margins are uniform over [0, 1] (see Joe 1997 or Nelsen 2006 for an introduction to copulas). From Sklar’s theorem (Sklar 1959), we know that given a set of continuous univariate cumulative distribution functions \(F_i, i=1, \dots , d\) and a copula \(C : [0,1]^d \rightarrow [0,1]\), we can construct a (unique) multivariate cumulative distribution function F with margins \(F_i\), using:

$$\begin{aligned} F(x_1,\dots ,x_d) = C(F_1(x_1),\dots ,F_d(x_d)). \end{aligned}$$
(1)

In other words, any multivariate probability distribution can be expressed as a combination of its marginals and of the dependency structure between these marginals, given by the copula. The uniqueness of the decomposition is guaranteed whenever the margins are continuous.

Fig. 4 Tree-structure of an Archimedean copula: example of the \(C_0(C_1(u_1,u_2),C_2(u_3,u_4))\) nested copula represented as a tree

Several types of copulas are available in the literature, one of the most commonly used being the class of Archimedean copulas. These copulas can be defined explicitly through the use of a generator \(\psi \) and a one- or two-dimensional parameter \(\theta \). The parameter of the generator can also be linked with Kendall’s tau measure of dependence, and with tail dependence coefficients (Joe 1997, see also Hofert and Mächler (2011) for the explicit relationships between these different quantities for some Archimedean families), which are easier to interpret.

However, the main disadvantage of this class of copulas is its exchangeability property, which implies in particular that all margins of the same dimension are identical, so that every pair of variables has the same dependence. Hierarchical Archimedean copulas [first mentioned in Joe (1997), and more recently by Savu and Trede (2010)] are a useful tool to take into account more complex features. The basic idea is that at a given hierarchical level, copulas from the levels above are combined via another copula. In the end, the “root” copula [using the terminology of Hofert and Mächler (2011)] is a copula whose arguments can be other copulas (see Fig. 4 for an example). Although these copulas allow for more flexibility, the resulting hierarchical structure only defines a proper copula if it satisfies a sufficient nesting condition. This condition can be expressed using Kendall’s tau as \(\tau _i > \tau _j\), for a child copula i and its parent copula j. In other words, dependence should be stronger between the elements gathered by the child copula than with the others. For example, the copula in Fig. 4 is constructed to allow dependence between \(u_1\) and \(u_2\) to be stronger than between \(u_1\) and \(u_3\). This is the same type of structure we assumed for our data, with a stronger dependence within the periods, each of them being a child copula, than between the periods, which are then gathered at a higher level using a parent copula.

Thus, an Archimedean copula is defined through a generator function, and different generators correspond to different dependency structures. Figure 5 shows simulated pairwise distributions of a four-dimensional vector whose dependency is given by the hierarchical Archimedean copula defined in Fig. 4. The choice of a copula family can thus influence the model, since families differ in their dependency structures, e.g. in tail dependence.
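The nested structure of Fig. 4 can be illustrated with a minimal sketch for the Clayton family, one of the two families shown in Fig. 5. The sketch assumes the standard bivariate Clayton form \(C(u,v) = (u^{-\theta } + v^{-\theta } - 1)^{-1/\theta }\) and its relationship \(\tau = \theta /(\theta + 2)\) with Kendall’s tau; the function names are hypothetical and are not taken from the HAC package. Equality of the two taus is allowed here, in which case the nested copula reduces to the exchangeable one.

```python
def clayton_theta_from_tau(tau):
    """Clayton parameter for a Kendall's tau in (0, 1), using tau = theta / (theta + 2)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    return 2.0 * tau / (1.0 - tau)


def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula; u and v must lie in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def nested_clayton_cdf(u1, u2, u3, u4, tau_within, tau_between):
    """C_0(C_1(u1, u2), C_2(u3, u4)) with the same parameter for both child copulas."""
    # Nesting condition: the children must be at least as dependent as the parent.
    if tau_between > tau_within:
        raise ValueError("nesting condition violated: need tau_between <= tau_within")
    theta_w = clayton_theta_from_tau(tau_within)
    theta_b = clayton_theta_from_tau(tau_between)
    child_1 = clayton_cdf(u1, u2, theta_w)
    child_2 = clayton_cdf(u3, u4, theta_w)
    return clayton_cdf(child_1, child_2, theta_b)
```

Setting three of the four arguments to 1 returns the remaining argument, which reflects the uniform margins of the copula.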

Fig. 5 Pairwise distributions of a 4-dimensional random vector with Beta marginal distributions and a dependency structure defined by the hierarchical Archimedean copula of Fig. 4. Examples with Joe and Clayton families. a Joe family, b Clayton family

2.2 Parameter estimation

The model was implemented in a Bayesian framework in order to take experts’ judgments into account and learn from data. The Bayesian approach quantifies uncertainty in parameters and model predictions by subjective probabilities. These can be used to construct credible intervals representing uncertainty in relevant floral cover statistics such as the mean, median and spread, or to account for uncertainty when sampling variability in floral cover in simulation studies at the landscape level.

Fig. 6 Directed acyclic graph (DAG) representing the structure of the model for observations made at a given site and for a given land-use category. Observations are represented inside the square, parameters of the model are represented in circles and hyperparameters in squares. Dashed rectangles represent one Archimedean copula

We let \(y_{spr}\) denote the r-th observation of floral cover in period p at site s (see also Fig. 3). To simplify the description, the model is from now on described for two seasonal periods \(p=1,2\) with two measurements each \(r=1,2\).

2.2.1 Bayesian formulation

A Bayesian model is derived from a likelihood and a prior. The likelihood specifies the data generating process given the model parameters. A prior is a probability distribution specifying uncertainty in parameters before seeing the data. Parameters are updated using Bayes’ rule, resulting in posterior distributions.

To make it simpler to use for expert elicitation, a re-parametrized version of the Beta distribution is used, which defines floral cover by its mode and the so-called sample size, defined respectively as \(\mu = (a-1)/(a+b-2)\) and \(s = a + b\), for a \({{\mathrm{Beta}}}(a,b)\). In the sequel, we use either \({{\mathrm{Beta}}}(a,b)\) or \({{\mathrm{Beta}}}(\mu ,s)\) to refer to the corresponding Beta distribution. Note that this parameterization relies on the assumption that the mode exists, and in particular that \(a > 1\) and \(b>1\).
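The mapping between the two parameterizations can be sketched as follows. Inverting \(\mu = (a-1)/(a+b-2)\) and \(s = a+b\) gives \(a = \mu (s-2) + 1\) and \(b = (1-\mu )(s-2) + 1\); the function names are hypothetical and the checks simply enforce \(a>1\) and \(b>1\).

```python
def beta_shape_from_mode(mu, s):
    """Convert (mode, sample size) to the usual Beta shape parameters (a, b)."""
    if not 0.0 < mu < 1.0:
        raise ValueError("the mode mu must lie strictly between 0 and 1")
    if s <= 2.0:
        raise ValueError("the sample size s must exceed 2 so that a > 1 and b > 1")
    a = mu * (s - 2.0) + 1.0
    b = (1.0 - mu) * (s - 2.0) + 1.0
    return a, b


def beta_mode_from_shape(a, b):
    """Convert shape parameters (a, b) to (mode, sample size); requires a > 1 and b > 1."""
    if a <= 1.0 or b <= 1.0:
        raise ValueError("the mode is only defined here for a > 1 and b > 1")
    return (a - 1.0) / (a + b - 2.0), a + b
```

For instance, a mode of 0.25 with sample size 10 corresponds to \({{\mathrm{Beta}}}(3,7)\).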

Likelihood We denote by \(\mathbf {y}_s = (y_{s11}, y_{s12},y_{s21}, y_{s22})\) the set of all the observations made at site s. The likelihood of the model is given by (see also Fig. 6):

$$\begin{aligned} \mathbf {y}_{s} \mid \theta\sim & {} C_{B}(C_{W}(u_{s11},u_{s12}),C_{W}(u_{s21},u_{s22}))\nonumber \\ u_{spr}= & {} F_{p}(y_{spr}) \quad \text {where } F_p = \text {cdf}(\text {Beta}(\mu _p,s_p) ), \end{aligned}$$
(2)

where \(C_{W}\) and \(C_{B}\) are the copulas expressing respectively within- and between-period dependency with corresponding Kendall’s tau \(\tau _W\) and \(\tau _B\), and where \(F_p\) is the cumulative distribution function of a Beta distribution of parameters mode and sample size \((\mu _p,s_p)\). For a given land use category, the parameter vector is then defined as \(\theta = (\mu _1,s_1,\mu _2,s_2,\tau _W,\tau _B)\).

Priors To define priors for the components of \(\theta \), we use the relationship between the parameters of a bivariate Archimedean copula and Kendall’s tau measure of association. Uniform priors are assigned for all parameters.

Lower and upper bounds for the uniform priors \(\pi (\mu _1)\) and \(\pi (\mu _2)\) on the modes of the Beta distributions are based on bounds elicited by the experts:

$$\begin{aligned} \mu _{1}&\sim \mathscr {U}([m_{1},M_{1}]),&\mu _{2}&\sim \mathscr {U}([m_{2},M_{2}]), \end{aligned}$$
(3)

where the quantities \(m_1\), \(m_2\), \(M_1\) and \(M_2\) differ for each habitat and are given in Table 1.

Lower and upper bounds for the uniform priors \(\pi (s_1)\) and \(\pi (s_2)\) on the sample sizes of the Beta distribution are chosen in order to ensure that the support of the prior is large enough:

$$\begin{aligned} s_{1}&\sim \mathscr {U}([2,200])&s_{2}&\sim \mathscr {U}([2,200]). \end{aligned}$$
(4)

We also assume a positive association between the observations, and for the hierarchical Archimedean copula to properly define a copula, we need to ensure that \(\tau _B < \tau _W\). The latter condition is enforced through the conditional distribution of \(\tau _B\) given \(\tau _W\). The resulting joint prior of \((\tau _W,\tau _B)\) is supported on \(D = \{(u,v) \in \mathbb {R}^2, 0 \le v \le u \le 1 \}\), with density \(1/\tau _W\) on D (it is therefore not uniform over D). This results in the following priors \(\pi (\tau _W)\) and \(\pi (\tau _B \mid \tau _W)\) for the copula parameters:

$$\begin{aligned} \tau _{W}&\sim \mathscr {U}([0,1]),&\tau _B | \tau _W&\sim \mathscr {U}([0,\tau _W]) \end{aligned}$$
(5)

Finally, the joint prior distribution for parameter \(\theta \) is:

$$\begin{aligned} \pi (\theta ) = \pi (\mu _1)\pi (\mu _2) \pi (s_1)\pi (s_2)\pi (\tau _B \mid \tau _W )\pi (\tau _W ). \end{aligned}$$
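As an illustration, the joint prior can be evaluated on the log scale as sketched below. The function name and argument layout are hypothetical; the bounds on the modes stand for the elicited values of Table 1, and any numerical bounds passed in an example are placeholders rather than the elicited ones. The term \(-\log \tau _W\) comes from the conditional uniform prior of \(\tau _B\) on \([0,\tau _W]\).

```python
import math


def log_prior(theta, mode_bounds, s_bounds=(2.0, 200.0)):
    """Log of the joint prior for theta = (mu1, s1, mu2, s2, tau_w, tau_b).

    mode_bounds = ((m1, M1), (m2, M2)) are the elicited bounds on the two modes.
    Returns -inf outside the support.
    """
    mu1, s1, mu2, s2, tau_w, tau_b = theta
    (m1, M1), (m2, M2) = mode_bounds
    s_lo, s_hi = s_bounds
    inside = (
        m1 <= mu1 <= M1
        and m2 <= mu2 <= M2
        and s_lo <= s1 <= s_hi
        and s_lo <= s2 <= s_hi
        and 0.0 < tau_w <= 1.0
        and 0.0 <= tau_b <= tau_w
    )
    if not inside:
        return -math.inf
    return (
        -math.log(M1 - m1)            # uniform prior on mu1
        - math.log(M2 - m2)           # uniform prior on mu2
        - 2.0 * math.log(s_hi - s_lo)  # uniform priors on s1 and s2
        - math.log(tau_w)             # tau_b | tau_w uniform on [0, tau_w]
    )
```

Returning \(-\infty \) outside the support means that a candidate violating any constraint is rejected automatically when this function is used inside an acceptance ratio.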

Posterior The posterior distribution of the parameters is defined as:

$$\begin{aligned} \pi (\theta \mid \mathbf {y})&\propto L(\mathbf {y} \mid \theta ) \ \pi (\theta ) = \prod _{s=1}^n L(\mathbf {y}_s \mid \theta ) \ \pi (\theta ), \end{aligned}$$
(6)

where L is the likelihood of the data and \(\pi \) is the prior distribution.

2.2.2 Bayesian updating

Posterior distributions were estimated using a Markov Chain Monte Carlo (MCMC) algorithm. We used the HAC package in R to compute the density of the copula and the (unnormalized) posterior distribution at each iteration of the MCMC algorithm. To improve the efficiency of the algorithm, a hybrid strategy for MCMC sampling based on a local adaptive scheme in a Robbins–Monro stochastic approximation framework, and inspired by Andrieu and Thoms (2008), was implemented.

More precisely, at each iteration of the algorithm, and given the current value of the chain \(\theta ^{m}\), a candidate \(\tilde{\theta }_k\) was generated for the k-th component of \(\theta \), using a univariate adaptive proposal distribution \(q_m^k( \cdot | \theta ^{m})\). This candidate was then accepted with probability

$$\begin{aligned} \alpha _k(\theta _{m},\tilde{\theta }) = \min \left( 1, \frac{\pi (\tilde{\theta }_k \mid \mathbf {y}) \ q_m^k(\theta _{m}^k\mid \tilde{\theta })}{\pi (\theta _{m}^k \mid \mathbf {y}) \ q_m^k(\tilde{\theta }_k \mid \theta _m)} \right) . \end{aligned}$$
(7)

When the proposal distribution is symmetric, the above ratio can be simplified and expressed only using the ratio of posteriors. In our case, the proposal was chosen to be a random walk, either Gaussian or uniform. As mentioned previously, a hybrid strategy was adopted, which means that at each iteration of the algorithm, we randomly selected either the uniform or the Gaussian random walk.

The basic idea of the adaptive algorithm is to adapt the variance of this random walk in order to reach an optimal acceptance probability (Andrieu and Thoms 2008). More precisely, the variance of the random walk proposal is computed as the product of an empirical estimation of the Markov Chain’s covariance matrix and a correction coefficient. This correction coefficient increases when the computed acceptance probability is too large, suggesting that the current variance of the proposal distribution is too small since too many candidates are accepted, which is often due to small moves of the chain. On the other hand, it decreases when the computed acceptance probability is too small, suggesting on the contrary that the current variance of the proposal distribution is too large, resulting in a high rejection rate.

More precisely, following Andrieu and Thoms (2008), a stochastic approximation procedure was adopted to update the algorithm’s parameters, using a time step \(\gamma _m\). If we denote by \(\lambda _m^k\) and \(\upsilon _m^k\) respectively the current mean and variance of the Markov Chain for the k-th component of \(\theta \) at iteration m of the algorithm, and by \(\kappa _m^k\) the corresponding correction coefficient, we have:

$$\begin{aligned} \lambda _{m+1}^k&= \lambda _m^k + \gamma _{m+1} (\theta ^{m+1}_k - \lambda _m^k) \end{aligned}$$
(8)
$$\begin{aligned} \upsilon _{m+1}^k&= \upsilon _m^k + \gamma _{m+1} ((\theta _k^{m+1} - \lambda _m^k)^2 - \upsilon _m^k )\end{aligned}$$
(9)
$$\begin{aligned} \log \kappa _{m+1}^k&= \log \kappa _m^k + \gamma _{m+1} (\alpha _k(\theta _m,\tilde{\theta }) - \alpha ^*). \end{aligned}$$
(10)

The support of the uniform proposal at iteration m of the MCMC algorithm is then defined through its mean and variance as \([\theta _m^k - \sqrt{3 \kappa _m^k \upsilon _m^k} , \theta _m^k + \sqrt{3 \kappa _m^k \upsilon _m^k}]\). The choice of the time step \(\gamma _m\) is crucial to ensure the convergence of the stochastic approximation, and should verify \(\sum _m \gamma _m = \infty \) and \(\sum _m \gamma _m^2 < \infty \), which is the case for sequences of the type \(\gamma _m = 1/m^{a}\), for some \(a \in (1/2,1]\).
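The updates (8) to (10) and the half-width of the uniform proposal can be sketched as below. This is only an illustration of the recursions as written above, not the authors’ implementation: the function names are hypothetical, and the target acceptance probability \(\alpha ^*\) and the exponent of the step size are left as arguments because their values are not given here.

```python
import math


def step_size(m, a=0.7):
    """Step gamma_m = 1 / m**a; the default exponent 0.7 is an arbitrary choice in (1/2, 1]."""
    if not 0.5 < a <= 1.0:
        raise ValueError("the exponent must lie in (1/2, 1]")
    return 1.0 / m ** a


def adapt(lam, ups, log_kappa, theta_new, alpha, gamma, alpha_star):
    """One Robbins-Monro update of the running mean, variance and log correction coefficient."""
    new_lam = lam + gamma * (theta_new - lam)                   # Eq. (8)
    new_ups = ups + gamma * ((theta_new - lam) ** 2 - ups)      # Eq. (9), uses the old mean
    new_log_kappa = log_kappa + gamma * (alpha - alpha_star)    # Eq. (10)
    return new_lam, new_ups, new_log_kappa


def uniform_half_width(log_kappa, ups):
    """Half-width h such that a uniform on [-h, h] has variance kappa * upsilon."""
    return math.sqrt(3.0 * math.exp(log_kappa) * ups)
```

The factor 3 follows from the variance of a uniform distribution on \([-h,h]\), which is \(h^2/3\).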

Several approaches can then be used to take into account the strongly constrained supports of parameters \(\mu _1\), \(\mu _2\), \(\tau _W\) and \(\tau _B\). A first one is to use unconstrained proposals, so that candidates falling outside of the support are rejected through the prior densities in (7). However, several issues may arise in this case. First, it is somewhat inefficient since we allow the algorithm to generate candidates which we already know will be rejected, without the need to compute the acceptance probability. In our context, since the algorithm is already time consuming, it seems wiser to make better use of the resources by generating candidates within the support of the target distribution. Second, when using an adaptive scheme, rejecting candidates which fall outside of the targeted support influences the variance of the proposal, which is adapted at each step of the algorithm according to the current acceptance rate.

Another approach, which was adopted here, is to include the constraints in the proposal. The adaptive proposal is a bit more complex than in the unconstrained case, and in particular it is no longer symmetric, which means that the ratio in (7) can no longer be simplified. When \(q_m^k\) is chosen to be a Gaussian distribution, the variance of the distribution is adapted at each iteration of the MCMC algorithm as described above in (8) to (10), and then constraints on the parameters are taken into account using truncated versions of the Gaussian distribution. When \(q_m^k\) is chosen to be a uniform distribution, the variance of the distribution is adapted in the same way, and then the constraints are accounted for by truncating the upper and lower bounds when necessary (see Fig. 7).
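A minimal sketch of the truncated uniform proposal is given below, together with the log of the proposal ratio \(q(\theta \mid \tilde{\theta })/q(\tilde{\theta } \mid \theta )\) needed in (7) once the proposal is no longer symmetric. The function names are hypothetical, and the sketch assumes that the same half-width is used for the forward and reverse moves within an iteration, that the half-width is positive and that the current value lies within the bounds.

```python
import math
import random


def truncated_interval(center, half_width, lower, upper):
    """Support of the uniform random walk around center, truncated to [lower, upper]."""
    return max(center - half_width, lower), min(center + half_width, upper)


def propose_truncated_uniform(current, half_width, lower, upper, rng=random):
    """Draw a candidate and return it with log q(current | candidate) - log q(candidate | current)."""
    lo, hi = truncated_interval(current, half_width, lower, upper)
    candidate = rng.uniform(lo, hi)
    # The reverse move is drawn from the truncated interval centred on the candidate.
    lo_rev, hi_rev = truncated_interval(candidate, half_width, lower, upper)
    log_q_ratio = math.log(hi - lo) - math.log(hi_rev - lo_rev)
    return candidate, log_q_ratio
```

For \(\tau _B\), the upper bound passed to the function would be the candidate value \(\tilde{\tau }_W\), so that the nesting condition holds by construction. When no truncation occurs on either side, the log ratio is zero and the symmetric case is recovered.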

Fig. 7 Accounting for the constraints on \(\tau _B\) when using a uniform proposal distribution to generate a candidate \(\tilde{\tau }_B\) at iteration m of the MCMC algorithm. Depending on the adaptive variance at iteration m, \(\kappa ^B_{m} \upsilon _{m}^B\), and on the candidate value \(\tilde{\tau }_W\), the lower and upper bounds might need to be truncated. Case (a): the current adaptive variance leads to a uniform distribution which allows the sufficient nesting condition to be fulfilled, so that no truncation is needed. Case (b): the current adaptive variance leads to a uniform distribution with an upper bound which is too high and could lead to candidates \(\tilde{\tau }_B\) that would not fulfil the nesting condition; in this case, the upper bound of the proposal distribution is truncated. a No truncation is needed, b The upper bound is truncated to \(\tau ^m_{B}=\tilde{\tau }_W\)

Moreover, to ensure that the sufficient nesting condition is satisfied, \(\tau _B\) was sampled conditionally on the value of \(\tilde{\tau }_W\) and two cases were accounted for: (i) if \(\tau _B^m < \tilde{\tau }_W\), a random walk (either Gaussian or uniform) was used, taking into account the constraint that \(\tilde{\tau }_B\) should remain smaller than \(\tilde{\tau }_W\) (an illustration in the case of the uniform proposal is given in Fig. 7), and (ii) if \(\tau _B^m \ge \tilde{\tau }_W\), then we set \(\tilde{\tau }_B = \tilde{\tau }_W\). The algorithm is described in more detail in Algorithm 1.

Algorithm 1

2.3 Convergence in the Bayesian updating

Convergence and properties of the chains were assessed using performance measures from the coda package in R. In particular, acceptance rates, autocorrelation and cumulative quantile plots were examined. Convergence was diagnosed using Heidelberger and Welch’s criterion (Heidelberger and Welch 1983), implemented in the heidel.diag function of the coda package.

2.4 The model applied on floral cover assessment

2.4.1 Land use information

Floral cover was estimated for four pollinator-relevant land use types in southern Sweden: oilseed rape, which is a mass-flowering crop and attractive for foraging bees; semi-natural grassland; uncultivated field borders, which are important nesting habitats with a low density of floral resources; and flower strips, which are created with the purpose of providing floral resources to pollinators in intensive agricultural landscapes. The season is divided into two periods: an early period corresponding to the flowering of oilseed rape, and a late period from the end of this flowering period to the end of the growing season.

By definition of the two floral periods, oilseed rape is only blooming in the first period. Therefore floral cover of oilseed rape was only assessed during the first seasonal period, and the model was simplified to only take into account within-period dependency for this land use class.

2.4.2 Floral cover from expert elicitation

A group of experts was asked to assess the floral cover of different habitat types in Sweden. The experts were asked to provide a best guess value, along with lower and upper bounds representing a range of variability or their uncertainty about which values floral cover may take.

The prior models of floral cover, i.e. the priors on the parameter \(\mu \) of the land use class and period specific Beta distributions (see Table 1), were partly based on the experts’ judgments. The lower bounds of the priors were set to zero for semi-natural grassland, flower strips and field edges, while the upper bounds were set to the experts’ judgments on upper bounds. For oilseed rape, the experts judged floral cover to be high, and therefore the prior was set to the range 0.5–1. The experts’ best guesses on floral cover were not used in the priors, so that the posterior mode could be compared with them. However, it is possible to include the best guess when specifying the prior.

Table 1 Lower and upper bounds for the uniform priors on the modes of the Beta distribution for each floral period

2.5 Floral cover data

Data on floral coverage were obtained from a large field study in 2011 and 2012 in 16 sites in Scania (southernmost part of Sweden) (Holzschuh et al. in press). The study consisted of 16 landscape sectors with either a high or a low cover of oilseed rape and an orthogonal gradient of semi-natural habitat. Within each sector, different habitats which were assumed to contain flowers were sampled. In 2012, sown wildflower strips were also planted and surveyed in half of the 16 sites. Observations obtained in 2011 and 2012 were considered as independent replicates, i.e. as if coming from different sites.

The percentage of flowers was recorded on two occasions during the flowering period of oilseed rape (OSR), i.e. between mid-May and the beginning of June, and on two other occasions after the OSR flowering period, i.e. between late June and the beginning of July. For each habitat, observations were made along \(150\times 1\,\hbox {m}^{2}\) transects, and then averaged at the square meter level.

Finally, for semi-natural grassland, field edges and oilseed rape fields, we have 32 vectors of 2 repeated observations for each period, and for flower strips we have 8 vectors of 2 repeated observations for each period. In other words, \(n=32\) for semi-natural grasslands, field edges and oilseed rape fields, and \(n=8\) for flower strips.

2.6 Model selection

We compared the results obtained with different copula families, in order to identify the one that best suits the data, but also to check the robustness of the results concerning the Beta distribution parameters. The latter are of main interest, and it is important to be sure that the choice of the copula family does not have too strong an influence on the marginal distribution parameters. The DIC and WAIC criteria were used to compare the competing models, i.e. those corresponding to different copula families, but also the model where we assumed independence between the observations.

The DIC criterion [Deviance Information Criterion, Spiegelhalter et al. (2002)] is defined as:

$$\begin{aligned} DIC = -2 \left( \log L(\mathbf {y} \mid \hat{\theta } ) - p_{DIC}\right) , \end{aligned}$$
(12)

where \(L(\mathbf {y} \mid \hat{\theta }) \) is the likelihood function evaluated at \(\hat{\theta }\), a Bayesian point-estimate of \(\theta \), and where \(p_{DIC}\) is the effective number of parameters, measuring the model complexity. This quantity can be defined in two different ways (see Gelman et al. 2014), either as:

$$\begin{aligned} p_{DIC} = 2 \left( \log L(\mathbf {y} \mid \hat{\theta }) - \mathbb {E}_{\theta }(\log L(\mathbf {y} \mid \theta )) \right) , \end{aligned}$$

or as:

$$\begin{aligned} p_{DIC,2} = 2 \ \text {Var}_{\theta } \left( \log L(\mathbf {y} \mid \theta ) \right) , \end{aligned}$$

where the expectation and the variance are taken over the posterior distribution of \(\theta \). When the posterior mode is used as the point estimate for \(\theta \), \(p_{DIC}\) can be negative if the posterior mode is far from the posterior mean. On the other hand, the definition of \(p_{DIC,2}\) ensures that it is always positive, but results in a quantity that is less stable numerically. For this reason, we used \(p_{DIC}\) to compute the DIC criterion. \(\hat{\theta }\) is usually taken to be the posterior mean, but as suggested in Spiegelhalter et al. (2002), it can also be the mode or the median of the posterior distribution. Here, DIC was derived from the median of the MCMC sample.
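Given MCMC output, both versions of the criterion can be computed from the log-likelihood evaluated at the point estimate and at each posterior draw. The sketch below is only illustrative: the function name is hypothetical, the posterior expectation is approximated by the average over the draws, and the variance is the sample variance of the log-likelihood over the draws.

```python
from statistics import fmean, variance


def dic(loglik_at_estimate, loglik_draws, use_variance=False):
    """Return (DIC, effective number of parameters).

    loglik_at_estimate: log L(y | theta_hat) at the chosen point estimate.
    loglik_draws: log L(y | theta) for each posterior draw (at least two draws).
    """
    if use_variance:
        # Second definition: twice the posterior variance of the log-likelihood.
        p = 2.0 * variance(loglik_draws)
    else:
        # First definition: twice the gap between the log-likelihood at the
        # point estimate and its posterior mean.
        p = 2.0 * (loglik_at_estimate - fmean(loglik_draws))
    return -2.0 * (loglik_at_estimate - p), p
```

With a log-likelihood of \(-10\) at the point estimate and draws \((-11,-12,-13)\), the first definition gives \(p_{DIC}=4\) and a DIC of 28.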

While the DIC relies on a point estimate for \(\theta \), the WAIC criterion [Watanabe-Akaike Information Criterion, also called the Widely Applicable Information criterion, Watanabe (2010)] is a fully Bayesian criterion, since it relies on the whole posterior distribution of \(\theta \). Its expression is given by:

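$$\begin{aligned} WAIC = -2 \left( \sum _{i=1}^n \log \mathbb {E}_{\theta } (L(y_i \mid \theta )) - p_{WAIC} \right) , \end{aligned}$$
(13)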

where the first term between the parentheses is the log pointwise predictive density, and \(p_{WAIC}\) is the effective number of parameters, accounting for the complexity of the model. Here again, two alternative definitions are possible for this term (Gelman et al. 2014):

$$\begin{aligned} p_{WAIC}&= 2 \sum _{i=1}^n \left[ \log \mathbb {E}_{\theta } (L(y_i \mid \theta )) - \mathbb {E}(\log L(y_i \mid \theta )) \right] \end{aligned}$$
(14)
$$\begin{aligned} p_{WAIC,2}&= \sum _{i=1}^n \text {Var} (\log L(y_i \mid \theta )). \end{aligned}$$
(15)

Gelman et al. (2014) recommend using \(p_{WAIC,2}\) in practice, because it gives results which are closer to what one would get using leave-one-out cross validation. Both were calculated in our case.
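A possible computation of the WAIC from MCMC output is sketched below, assuming the form \(-2\,(\text {lppd} - p_{WAIC})\) of Gelman et al. (2014), where lppd is the log pointwise predictive density. The function name and the input layout (one row of pointwise log-likelihoods per posterior draw) are hypothetical, and posterior expectations and variances are replaced by their sample counterparts over the draws.

```python
import math
from statistics import fmean, variance


def waic(loglik, use_variance=True):
    """Return (WAIC, lppd, effective number of parameters).

    loglik[j][i] is log L(y_i | theta_j) for posterior draw j and observation i;
    at least two draws are required.
    """
    n_obs = len(loglik[0])
    lppd = 0.0
    p = 0.0
    for i in range(n_obs):
        column = [row[i] for row in loglik]
        # Log of the posterior mean likelihood, shifted by the maximum for stability.
        shift = max(column)
        log_mean_lik = shift + math.log(fmean([math.exp(v - shift) for v in column]))
        lppd += log_mean_lik
        if use_variance:
            p += variance(column)                           # Eq. (15)
        else:
            p += 2.0 * (log_mean_lik - fmean(column))       # Eq. (14)
    return -2.0 * (lppd - p), lppd, p
```

In the present model the natural unit for the index i would be the vector of observations from one site, since observations from different sites are assumed to be independent.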

As for the AIC and BIC criteria, one should choose the model with the smallest DIC or WAIC values.

3 Results

3.1 Convergence of the Markov Chains

We obtained similar results when running the algorithm from different starting points, suggesting that the posterior distributions are independent of the initialization of the algorithm. Acceptance rates were all between 0.26 and 0.29, reflecting good mixing of the chains. Figure 8 gives an example of the trace plots obtained when running the algorithm on the semi-natural grassland data with the Clayton copula family. The autocorrelation plots were satisfactory for all variables except for \(\tau _W\), due to the stronger dependency of this parameter on the history of the chain. After 100 000 iterations, the Heidelberger and Welch criterion indicated that every chain had reached its stationary distribution. Thus, convergence was judged acceptable for the subsequent analysis.
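The kind of checks described above (sensitivity to starting points, acceptance rate, autocorrelation) can be sketched on a toy example. The code below is only an illustration under assumptions: it uses a scalar random-walk Metropolis sampler on a standard normal target, which is not the sampler or the model of this study, and it does not implement the Heidelberger and Welch test. All names and tuning values are hypothetical.

```python
import math
import random
import statistics


def random_walk_metropolis(log_target, start, step, n_iter, rng):
    """Scalar random-walk Metropolis; returns (chain, acceptance_rate)."""
    chain, current, current_lp, accepted = [], start, log_target(start), 0
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain.append(current)
    return chain, accepted / n_iter


def autocorrelation(chain, lag):
    """Sample autocorrelation of the chain at a given lag."""
    mean = statistics.fmean(chain)
    centered = [x - mean for x in chain]
    denom = sum(c * c for c in centered)
    num = sum(centered[t] * centered[t + lag] for t in range(len(chain) - lag))
    return num / denom


def log_std_normal(x):
    """Unnormalised log-density of a standard normal (toy target)."""
    return -0.5 * x * x


rng = random.Random(42)
n_iter, burn_in = 20000, 2000
results = {}
for start in (-5.0, 5.0):  # two deliberately distant starting points
    chain, rate = random_walk_metropolis(log_std_normal, start, 2.5, n_iter, rng)
    kept = chain[burn_in:]
    results[start] = (statistics.median(kept), rate, autocorrelation(kept, 50))
    print(f"start={start:+.0f}: median={results[start][0]:.3f}, "
          f"acceptance={rate:.2f}, acf(50)={results[start][2]:.3f}")
```

Similar posterior summaries from distant starting points and an autocorrelation that decays towards zero are consistent with convergence, although they do not prove it.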

Fig. 8 Trace plots of the parameters, on the semi-natural grassland data and with the Clayton copula family

3.2 Model selection

Model comparisons were performed using the DIC and WAIC values obtained for each land use type and each copula family (Table 2). Both criteria led to the selection of the same copula families. Accounting for the dependence gave a better fit to the data (Table 2), except for flower strips. The sample size is much smaller for this land use category, since it was present in only half of the sites and only in 2012. In this case the dependency structure is more difficult to estimate, as shown for example by Huard et al. (2006). For the semi-natural grasslands, the DIC and the WAIC computed with the effective number of parameters \(p_{WAIC}\) (see Sect. 2.6 for the definition of these quantities) gave very similar values for the Clayton copula and the independent model. The difference between these two models is larger when using the WAIC computed with \(p_{WAIC,2}\), and tends to favor the independent model.

Table 2 DIC and WAIC criteria and their associated effective number of parameters, for each habitat type and each copula family. WAIC is calculated using the effective number of parameters \(p_{WAIC}\) and \(WAIC_2\) using \(p_{WAIC,2}\). The values obtained with the different runs of the algorithm were very similar, and we give here the smallest criteria values

When Kendall’s tau is low, identifying the right copula is difficult, particularly for families that share similar characteristics for small values of Kendall’s tau (Huard et al. 2006).

3.3 Floral cover estimates

Floral cover parameters for each land use type and each period were chosen from the copula identified by model selection (Table 3). The corresponding assessments of land use and seasonal period specific floral cover are plotted in Fig. 9, together with expert judgments.

Table 3 Median and 95% credible interval of floral cover for four land use types

Fig. 9 Comparison of the spatial variability of floral cover across a landscape for four land use types and two seasonal periods estimated by our model, and provided by expert judgment when available. The colored regions are 95% prediction intervals based on the posterior distributions of the parameters, and the shape of these regions reflects the different shapes obtained for the Beta distribution, for different values of \(\mu _1\), \(s_1\), \(\mu _2\) and \(s_2\). The solid dark green line is the prediction obtained with the posterior median, and the solid light green line is the triangle distribution obtained using expert knowledge. The ticks on the x-axis mark data on floral cover

In semi-natural grassland and field edges, both the mode and the variance of floral coverage are higher in the first period. This may reflect a loss of floral resources due to grazing and/or cutting, as well as the end of the main flowering period of species that account for a high cover early in the season, such as dandelions. Conversely, the mode and the variance of floral coverage in flower strips are higher during the second period. Floral cover in flower strips was estimated to be lower than in semi-natural grasslands early in the season, and higher late in the season. Floral cover in oilseed rape was estimated to be close to 100%. There is also a very large variability in the floral cover of oilseed rape fields, which might be due to differences in varieties and in sowing dates from one field to the other.

Results were similar across copula families, especially for the main parameters of interest, i.e. the parameters of the Beta distribution, which means that our results are robust with respect to the choice of the copula family. Figure 10 illustrates the posterior distributions obtained for semi-natural grasslands with each copula family.

While the results are consistent across the different copula families, floral cover estimates were different when no data dependencies were considered (Table 4).

Fig. 10 Comparison of the posterior densities obtained with each copula family in semi-natural grasslands

Table 4 Median and 95% credible interval of floral cover when assuming independence between the observations

As illustrated in Fig. 10, the uncertainty associated with each parameter is lower in the independent case, because assuming that the data are independent artificially increases the amount of information in the sample. The modes of the posterior distributions of \(\mu _1\) and \(\mu _2\) showed the larger differences depending on whether or not we accounted for the dependence.

3.4 Expert bias

Expert judgments on the mode of floral cover are represented by triangular distributions using the minimum, maximum and best guess values provided by the experts (Fig. 9). For small floral coverage values, parameters provided by expert judgment are higher than those obtained with our model. For semi-natural grasslands in the first period, the minimum value provided by the experts is even higher than the mode of the Beta distribution estimated with our approach. This indicates that the expert elicitation suffers from low discriminative power, i.e. that low proportions are judged as higher than they are (O’Hagan et al. 2006). In semi-natural grassland and field edges, experts tended to overestimate variability, i.e. they believed there were more flowers than there actually are. In contrast, in the case of oilseed rape fields, the experts’ judgments were less biased but narrower than the posterior variability (Fig. 7).
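The construction of a triangular distribution from the three elicited values, and its comparison with the mode of a Beta distribution, can be sketched as follows. This is only an illustration under assumptions: the elicited values and the Beta shape parameters `a` and `b` are hypothetical numbers, not results of this study, and the standard shape parameterisation is used rather than the \((\mu , s)\) parameterisation of our model.

```python
def triangular_pdf(x, low, mode, high):
    """Density of a triangular distribution defined by (low, mode, high)."""
    if x < low or x > high:
        return 0.0
    if x < mode:
        return 2.0 * (x - low) / ((high - low) * (mode - low))
    if x > mode:
        return 2.0 * (high - x) / ((high - low) * (high - mode))
    return 2.0 / (high - low)  # peak of the density, reached at the mode


def beta_mode(a, b):
    """Mode of a Beta(a, b) distribution, defined for a > 1 and b > 1."""
    if a <= 1.0 or b <= 1.0:
        raise ValueError("the interior mode requires a > 1 and b > 1")
    return (a - 1.0) / (a + b - 2.0)


# Hypothetical elicited values (proportions of floral cover).
expert_min, expert_best, expert_max = 0.05, 0.10, 0.30
# Hypothetical Beta shape parameters standing in for a posterior estimate.
a, b = 2.0, 60.0

model_mode = beta_mode(a, b)
expert_above_model = expert_min > model_mode
print(f"Beta mode = {model_mode:.4f}, expert minimum = {expert_min:.2f}")
print(f"expert minimum exceeds model mode: {expert_above_model}")
print(f"triangular density at best guess = "
      f"{triangular_pdf(expert_best, expert_min, expert_best, expert_max):.2f}")
```

With these illustrative numbers, the whole support of the triangular distribution lies above the mode of the Beta distribution, which is the type of situation described above for semi-natural grasslands in the first period.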

4 Discussion

In this paper, we proposed a model to estimate the floral cover in different land use types by taking into account the dependency between data observed at repeated times in the season. Our results showed that failing to account for the dependency between observations can have an impact on parameter estimates.

The four land use categories represented in our study correspond to very low (semi-natural grassland, flower strips, field edges) or very high (oilseed rape fields) flower coverage, and our results suggest that in such extreme situations experts may be more prone to providing overestimated or underestimated values. For these types of habitats, using empirical evidence and accounting for the potential dependency between the data is therefore crucial in order to provide accurate estimates of the corresponding floral coverage.

The model is based on a partition of the season into two periods, one early and one late in the season, in order to capture the within-season dynamics. The methodology can easily be generalized to more than two periods, for example by adding nodes to the hierarchical copula. The framework of hierarchical Archimedean copulas is very flexible, and one can also think of more complex hierarchical structures to account for other types of data dependency. In particular, although in our case the sampling sites were far enough from each other to assume spatial independence, this might not always be the case. Copulas can also be used to model spatial dependence, for example using the natural graph representation of neighboring points to construct a vine copula (Gräler and Pebesma 2011). Other time-dependent structures can also be implemented, for example conditional copulas that take into account the time elapsed between two observations.

The focus here has been on floral cover to assess floral resources at the landscape scale. The variability of floral coverage in each period was modeled using a Beta distribution, while the dependency structure was accounted for using a nested Archimedean copula. The method is also applicable to other types of data on floral cover, e.g. using more trait-based approaches to quantify floral resources (Hicks et al. 2016). In our case, the marginal distributions were fixed as Beta distributions, but one can also think of more complex models where not only different copula families but also different marginal distributions are compared [see for example Silva and Lopes (2008)]. The model can then be modified to estimate land-use associated nectar and pollen resources across the season or species abundances. Another interesting perspective would be to consider a non-parametric model, where both the marginal distributions and the copula are estimated empirically.

To simplify the model, we assumed that the association between observations was stronger within than between periods, so that a hierarchical structure as described in Fig. 4 could be used. We also assumed that the parameters of the two copulas linking within-period observations were identical. This is a strong assumption, and the small sample sizes make it difficult to validate properly using empirical estimates of Kendall’s tau. However, our main interest here was not to accurately estimate the copula parameters, but rather the parameters of the Beta distributions. Moreover, we obtained consistent and robust results with the different copula families, corresponding to different dependency structures. This enhances confidence in the results.

In contrast to expert elicitation, our approach makes it possible to distinguish between the variability of floral cover across landscapes and the uncertainty in model parameters and in predictions of land use specific floral resources. Any bias discovered in the expert elicitation may be due to the elicitation procedure rather than to the judgment of the experts itself. There are several reasons why expert judgments can be biased, and there are ways to account for them. One issue is the distinction between asking for uncertainty in the mode of floral cover and asking for the full range of variability in floral cover. Experts are actually asked to provide a range covering a mixture of aleatory and epistemic uncertainty, which makes the elicitation harder (O’Hagan et al. 2006). A difference between expert judgments and data can also be explained by observation biases. For example, the floral cover data in semi-natural habitats had actually been sampled from the parts of selected landscapes where floral cover was, from visual inspection, judged as relatively high. This sampling strategy deliberately introduced an overestimation of floral cover in semi-natural habitats, which can be justified by the aim of obtaining an upper estimate of floral cover in habitats that generally have few flowers. In this study, this bias in floral data from semi-natural habitats does not change the conclusions, since the floral cover estimates updated from the data were still lower than the judgments provided by the experts (Section 2.2).

Evaluating the availability of floral resources is a key issue when studying the dynamics of pollinator populations in the landscape. This is particularly true when dealing with spatially explicit pollination models, which use land use type and seasonal period-specific floral resources as inputs. An accurate estimation of these quantities is therefore crucial in order to improve the model predictions, in particular for the main land use types encountered in the landscapes of interest.