1 Introduction

The health and long-term resilience of ecosystems worldwide are at risk due to rising environmental and human impacts (Hoegh-Guldberg et al. 2007; Jackson et al. 2001). Coral reefs are one such ecosystem, impacted by disturbances such as bleaching, crown-of-thorns starfish (CoTS) outbreaks, and cyclones (Hoegh-Guldberg et al. 2007; Sweatman et al. 2011). However, by effectively monitoring such ecosystems, it should be possible to identify their vulnerabilities to different environmental disturbances. The information gained through such monitoring can then inform management practices and policies that reduce the risk of decline (e.g. due to disturbances) and foster more resilient ecosystems (Lovett et al. 2007).

The Australian Institute of Marine Science (AIMS) monitors coral reefs in the Great Barrier Reef (GBR) through its Long-Term Monitoring Program (LTMP). The LTMP surveys are based on a fixed design in that data are gathered from predetermined reefs, and sites within reefs, that do not change over time (Sweatman et al. 2011). As such, the LTMP does not incorporate information gained from previous years of data collection, nor does it allow for disturbance data to be included when selecting reefs for future surveys (Miller et al. 2003). Other ecosystem monitoring programs are similarly based on fixed design approaches (Carstensen et al. 2003; Strobl and Robillard 2008). Thus, there is the potential to enhance these monitoring practices by considering adaptive survey methods, which enable such monitoring programs to incorporate new information (e.g. on disturbances such as bleaching events) when selecting future survey locations.

Kang et al. (2016) proposed such adaptive design methods to lower experimental costs and increase the information being captured on coral cover, an indicator of coral health (Osborne et al. 2011; Bruno and Selig 2007). For this purpose, the authors developed a linear regression model for logit-transformed hard coral cover proportions incorporating variables such as Year, \(\hbox {Year}^2\), Shelf position, and Open/Closed for fishing. The advantage of this model is that it is simple and computationally efficient to implement. However, it is mechanistically too simplistic: it does not capture the influence of disturbances (e.g. cyclones, coral bleaching, or land-based pollution), and it fails to describe the spatial variability inherent in coral reef communities. As such, the adaptive designs found by the authors may not be the most appropriate for future data collection.

When sub-optimal survey designs are used, in general, more data or more survey effort is required to achieve monitoring objectives. As the cost of monitoring is mostly related to collecting samples (Hill and Wilkinson 2004), adopting sub-optimal designs may thus unnecessarily inflate monitoring costs. Furthermore, if the approach does not capture disturbance information, the resulting designs may not collect data to precisely estimate the environmental impact of such disturbances. For these reasons, it is important that decisions about future surveys are based on the most appropriate model for the data that can be developed.

This paper proposes new adaptive design methods to reduce monitoring costs and identify survey locations for quantifying the impacts of time-varying disturbances within an ecosystem, and applies these methods to the GBR as a case study. A holistic approach is utilised, integrating a statistical model with information from previously collected data into the adaptive design process. As a basis for this approach, a model is developed to capture the underlying variability within the collected data, such as spatial variability and time-varying disturbances, and this information is used to guide future data collection. This study does not focus on monitoring objectives specific to the GBR but rather considers a general monitoring objective to demonstrate the benefits of adopting an adaptive design approach. As such, we hope our approach and results are of broad interest and potentially applicable across a variety of environmental monitoring programs where reduced costs and more informative data collection are sought.

2 Case study

Due to the relatively large amount of available data and the large and diverse range of disturbances that have occurred over time, the Whitsunday region of the GBR (Fig. 1) was selected as the region of interest for this paper (Osborne et al. 2011). In this region, three reefs are surveyed in each of the inner, middle, and outer reef habitats, with three sites sampled on each reef (Jonker et al. 2008). Five coral cover observations are collected at each site, one from each of five \(50\times 1\) \(\hbox {m}^2\) transects located at a depth of between 6 and 9 m, separated by at least 10 m, and running parallel to the reef crest (5 observations \(\times\) 3 sites \(\times\) 3 reefs \(\times\) 3 habitats). In some years, surveys could only be partially completed due to adverse weather conditions, resulting in fewer observations. Consequently, the data set comprised a total of 1077 observations collected in 2002, 2004, 2005, 2007, 2009, 2011, 2013, and 2015.

Fig. 1
figure 1

Survey sites for the Long-Term Monitoring Program in the Whitsunday region (right), the survey region within the Great Barrier Reef (left). The figure in the left panel was produced using the eAtlas Web Mapping Service (“Bright Earth eAtlas basemap v1.0 (AIMS)”) and the figure in the right panel was produced using the GBR shape file, which has a nominal scale of 1:250,000 (Great Barrier Reef Marine Park Authority 2014), using the R package ggplot2 (Wickham 2011). The Whitsunday region is divided into three shelf positions: inner- (Hayman, Langford-Bird, and Border Island), middle- (19131S, 19138S, and 20104S), and outer-shelf (Slate, Hyde, and Rebe) reefs. Survey sites in the Whitsunday region are shown by red dots. A small amount of jitter was added to the locations for visualisation purposes

AIMS has been collecting data from coral reefs in the GBR annually between 1993 and 2005, and biennially thereafter, using the LTMP surveys (Sweatman et al. 2008). Prior to 2006, the surveys were undertaken via video, from which still frames were extracted; thereafter, digital photographs were taken at 1 m intervals. To estimate coral cover at the site level, 40 images are randomly selected from the 50 available. Once the 40 images are chosen, the coral cover estimate is obtained by projecting five points onto each selected image, which are then manually classified by a marine scientist as hard coral or not (Jonker et al. 2008).

Based on the available data and discussions with coral experts from AIMS, a total of 14 potential covariates were considered, including natural and anthropogenic disturbances, physico-chemical conditions, and topographic position, each of which might have a direct or indirect influence on coral cover (Table 1). Depending on whether the values of a covariate vary over time, covariates can be classified as either time-varying (1st row of Table 1) or static site-specific (2nd row). The time-varying disturbance covariates comprise temporally and spatially varying data on three major environmental disturbances, namely cyclones, bleaching, and CoTS. We used time-varying disturbance covariates that have been reproduced at a fine spatial scale (0.01\(^{\circ }\) \(\times\) 0.01\(^{\circ }\)) for a broad and consistent representation across the entire GBR (Matthews et al. 2019). The site-specific covariates consist of ten topographic and physico-chemical covariates. Of all these covariates, one is discrete, four are categorical, and the rest are continuous. All the continuous covariates except CoTS were standardised (i.e. mean of 0 and variance of 1) prior to use, as they were measured on different scales. The CoTS variable was treated differently due to the preponderance of zeros (i.e. absence of CoTS); this is discussed later.

Table 1 Summary of the potential covariates considered for modelling hard coral cover. The spatial resolution is recorded in decimal degrees and the temporal resolution is sampling year

3 Design framework

The adaptive design framework considered in this work consists of three key components. The first component involves modelling historical data from the ecological process being monitored (Fig. 2 (left)). For this purpose, a statistical model is fitted to the LTMP data and, as we will be working within a Bayesian inference framework, the resulting posterior distribution is then used as the prior information for design. In the second component, this prior information is exploited to assess the usefulness of newly proposed designs, with monitoring objectives quantified mathematically via a utility function (Chaloner and Verdinelli 1995) (Fig. 2 (middle)). Finally, the third component describes the optimisation approach used in this study and the different reef monitoring scenarios adopted for evaluating existing and proposed monitoring designs (Fig. 2 (right)).

Fig. 2
figure 2

Diagram of the proposed Bayesian adaptive design framework. This consists of three key components: Modelling historical data (left), Propose new alternative design and assess its usefulness (middle), and Optimisation and evaluation of the designs (right)

3.1 Modelling historical data

We model hard coral cover as a function of the covariates described in Section 2. The model used for this purpose is described next.

3.1.1 Fit a statistical model

We fit a spatial Beta regression model to the LTMP hard coral cover data, as such a model is appropriate for bounded data (i.e. proportions) and can accommodate a variety of distributional forms, including symmetric and skewed distributions (Verkuilen and Smithson 2012). In general, the Beta distribution is parameterised in terms of two shape parameters (Ferrari and Cribari-Neto 2004). However, ecologically it seems more sensible to model the mean (and potentially the precision) of the response variable as a function of covariates. Accordingly, we adopted a Beta regression model parameterised in terms of a mean and a precision parameter (Ferrari and Cribari-Neto 2004).

We assume that \(y_{ijk}\sim \text {Beta}(\mu _{ik},\psi )\), where \(y_{ijk}\) denotes the j-th response, from the i-th site, in the k-th sampling year, with \(i=1,\dots ,27\), \(j=1,\dots ,5\), and \(k=1,\dots ,8\). To account for potential relationships between coral cover and covariates (Table 1) and for spatial autocorrelation, the following regression structure was assumed for mean coral cover \(\mu _{ik}\):

$$\begin{aligned} \eta _{ik}=\varvec{G}(\mu _{ik})=\beta _{0}+\varvec{x}^{t}_{i} \varvec{\beta _{d}}+\varvec{z}^{t}_{ik}\varvec{\beta _{z}}+k\beta _{time} +r_{i}, \end{aligned}$$
(1)

where \(\varvec{x}^{t}_{i}=(x_{1i},\dots ,x_{pi})\) represents the p static site-specific covariates at \(\varvec{d}_{i}\) and \(\varvec{z}_{ik}\) represents time-varying disturbances. The terms \(\beta _{0},\,\varvec{\beta _{d}},\,\varvec{\beta _{z}},\) and \(\beta _{time}\) are the intercept and the regression coefficients for the site-specific covariates, time-varying disturbances, and sampling year, respectively. Appropriate covariates to model the probability of hard coral cover were determined based on expert opinion and those found to be significant in previous studies. In addition, plots were used to assess whether certain modelling assumptions were appropriate; e.g. based on Eq. (1), it is assumed that a continuous covariate has a linear relationship with the logit of the probability of hard coral cover (see Supplementary Material S3 for more details). Since the support of the Beta distribution is (0, 1), the linear predictor must be mapped to the interval (0, 1) to model the mean \(\mu _{ik}\) of \(y_{ijk}\). A typical choice for the link function \(G(\cdot )\) is the logit (Lagos-Alvarez et al. 2017), under which mean coral cover is modelled as \(\mu _{ik}=\exp (\eta _{ik})/(1+\exp (\eta _{ik}))\).
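To make the mean–precision parameterisation concrete, the following minimal sketch simulates coral cover proportions under the logit link; the intercept, covariate effect, time trend, and precision values are purely illustrative, not estimates from the paper:

```python
import numpy as np

def inv_logit(eta):
    """Inverse logit link: map the linear predictor onto (0, 1)."""
    return np.exp(eta) / (1.0 + np.exp(eta))

def rbeta_mean_precision(mu, psi, rng):
    """Draw from a Beta distribution parameterised by mean and precision.

    The usual shape parameters are a = mu * psi and b = (1 - mu) * psi,
    giving E[y] = mu and Var(y) = mu * (1 - mu) / (1 + psi).
    """
    return rng.beta(mu * psi, (1.0 - mu) * psi)

# Illustrative parameter values: an intercept, one hypothetical standardised
# site covariate, and a linear time trend over sampling years k = 1..8.
beta0, beta_x, beta_time, psi = -0.5, 0.3, -0.05, 30.0
x_i = 1.2                                  # hypothetical covariate value
eta = beta0 + beta_x * x_i + beta_time * np.arange(1, 9)
mu = inv_logit(eta)                        # mean coral cover per sampling year
rng = np.random.default_rng(1)
y = rbeta_mean_precision(mu, psi, rng)     # simulated cover proportions
```

The conversion to shape parameters is what allows standard Beta samplers to be used with the mean–precision form.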

The precision parameter \(\psi\) is a scalar and is inversely related to the variance of data from a Beta distribution; for a given mean, \(\text {Var}(y)=\mu (1-\mu )/(1+\psi )\). A model for the precision parameter based on covariate information could be developed, or the precision could be held constant across the entire region (Lagos-Alvarez et al. 2017). Both forms of the model were considered for our data, with the former having a linear predictor as described above but with a log link function and distinct parameter values (Smithson and Verkuilen 2006).

Preliminary work on adaptive design for the GBR by Kang et al. (2016) focused on the Cooktown-Lizard Island region. As the reefs in that region were located sufficiently far from each other, the authors neglected spatial dependency when finding adaptive designs. However, this study considers a different region, the Whitsundays. To capture potential spatial dependency in coral cover, a spatially correlated random effect \(\varvec{r}|\varvec{\Sigma _{r}}\sim MVN(0,\varvec{\Sigma _{r}})\) was included based on the distance between sites. The covariance matrix \(\varvec{\Sigma _{r}}\) was constructed from a Gaussian covariance function (Ecker and Gelfand 1997) defined as follows:

$$\begin{aligned} \left[ \varvec{\Sigma _{r}}\right] _{i_{1}i_{2}}=\sigma _{r}^2\exp \left( -\left( \dfrac{h_{i_{1} i_{2}}}{\phi }\right) ^2\right) ,\quad i_{1},i_{2}=1,\dots ,27, \end{aligned}$$
(2)

where \(h_{i_{1} i_{2}}\) is the distance between sites \(i_{1}\) and \(i_{2}, \sigma _{r}^2>0\) is the variance of the spatial process (i.e. the partial sill), and \(\phi >0\) is the range parameter.
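A minimal sketch of building the covariance matrix of Eq. (2) and drawing the spatial random effect, using hypothetical site coordinates and illustrative values for \(\sigma _{r}^2\) and \(\phi\):

```python
import numpy as np

def gaussian_cov(coords, sigma2_r, phi):
    """Covariance matrix from the Gaussian covariance function of Eq. (2):
    sigma2_r * exp(-(h / phi)^2), with h the pairwise distance between sites."""
    diff = coords[:, None, :] - coords[None, :, :]
    h = np.sqrt((diff ** 2).sum(axis=-1))   # pairwise Euclidean distances
    return sigma2_r * np.exp(-(h / phi) ** 2)

# Hypothetical coordinates for three sites (e.g. in km); sigma2_r (partial
# sill) and phi (range) are illustrative values only.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
Sigma_r = gaussian_cov(coords, sigma2_r=0.4, phi=1.5)

# One draw of the spatially correlated random effect r ~ MVN(0, Sigma_r).
r = np.random.default_rng(2).multivariate_normal(np.zeros(3), Sigma_r)
```

Note that the diagonal equals the partial sill \(\sigma _{r}^2\), since \(h_{ii}=0\), and correlation decays smoothly with distance.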

3.1.2 Obtaining the posterior distribution of model parameters

Within a Bayesian framework, all inferences are based on the posterior distribution. This posterior \(p(\varvec{\theta }\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}} )\) is proportional to the prior distribution \(p(\varvec{\theta })\) multiplied by the likelihood \(p(\varvec{y}_{p_{0}}\vert \varvec{\theta }, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})\), where \(\varvec{d}_{p_{0}}\) is the previously used design (or the initial survey design) for data collection, \(\varvec{y}_{p_{0}}\) represents previously collected coral cover data, and \(\varvec{z}_{p_{0}}\) represents previously collected time-varying disturbances. Here, the subscript \(p_{0}\) is used to specifically denote previously collected/historical data. For the purposes of modelling our historical data, we considered vague prior information such that all inferences were essentially data driven, see Table S1 of Supplementary Material S1.

When it comes to finding optimal sampling designs, we will be required to find many posterior distributions, which can be computationally expensive. To circumvent this, throughout this paper we consider a computationally efficient approximation to the posterior distribution known as the Laplace approximation, in which a Multivariate Normal distribution is formed as an approximation to the posterior (see Section S1 in Supplementary Material S1 for more details specific to this application). We assessed the appropriateness of this approximation empirically, and we also note that we have a reasonably large sample size, meaning we can lean on the Bernstein-von Mises theorem (Freedman et al. 1999), which states that, under certain conditions, the posterior distribution converges to a Multivariate Normal distribution as the sample size approaches infinity. Another advantage of using this approximation is that it also provides an approximation to the model evidence, which can be used for model choice as described below.
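The Laplace approximation can be sketched as follows; `neg_log_post` and the starting value are placeholders for an application-specific negative log posterior, not the model of this paper. For a Gaussian target the approximation is exact, which provides a convenient check:

```python
import numpy as np
from scipy import optimize

def laplace_approximation(neg_log_post, theta_init):
    """Approximate a posterior by a Multivariate Normal centred at the mode,
    with covariance given by the inverse Hessian of the negative log
    posterior at the mode. Also returns the Laplace approximation to the
    log model evidence."""
    res = optimize.minimize(neg_log_post, theta_init, method="BFGS")
    mean, cov = res.x, res.hess_inv   # BFGS supplies an approximate inverse Hessian
    p = mean.size
    # log p(y) ~= log p(y, theta_hat) + (p/2) log(2*pi) + (1/2) log |cov|
    log_evidence = (-res.fun + 0.5 * p * np.log(2.0 * np.pi)
                    + 0.5 * np.linalg.slogdet(cov)[1])
    return mean, cov, log_evidence

# Toy check: an unnormalised negative log posterior for N(1, I) in 2 dimensions.
neg_log_post = lambda t: 0.5 * np.sum((t - 1.0) ** 2)
mean, cov, log_ev = laplace_approximation(neg_log_post, np.zeros(2))
```

Here the recovered mean is (1, 1) and the log evidence approximates the Gaussian normalising constant \(\log (2\pi )\).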

Bayesian model selection entails finding the most appropriate model to describe the data. For this purpose, we considered an \(\mathcal {M}\)-closed class of models (Bernardo and Smith 2009) where the true model is assumed to be contained within a finite set of M candidate models \(\{m \in 1, 2, \dots , M\}\). Model formulation was accomplished using a forward stepwise approach, although alternative approaches could be considered for this purpose (Burnham and Anderson 2002; Neuberg 2003). We chose a forward rather than backward stepwise approach as there were many covariates in the data set, and some of these did not appear to be useful (based on the exploratory data analysis). In particular, we started with the null model (i.e. intercept only), first on the mean and then on the precision parameter. We then included covariates (Table 1) one at a time to determine which covariates, if any, improved the model fit based on the posterior model probabilities, where all models were assumed equally likely a priori. The posterior model probabilities, which were evaluated based on the model evidence approximated via the Laplace approximation (MacKay 2003), were thus used to determine the preferred model for the historical data.
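Given the (approximate) log evidence of each candidate model, the posterior model probabilities under equal prior model probabilities follow directly; a small sketch with hypothetical log evidence values:

```python
import numpy as np

def posterior_model_probs(log_evidences):
    """Posterior model probabilities under equal prior model probabilities:
    p(m | y) = p(y | m) / sum_m' p(y | m'), computed on the log scale for
    numerical stability."""
    le = np.asarray(log_evidences, dtype=float)
    le -= le.max()              # guard against underflow when exponentiating
    w = np.exp(le)
    return w / w.sum()

# Hypothetical log evidences for three candidate models in a stepwise sweep.
probs = posterior_model_probs([-1204.3, -1199.8, -1201.1])
best = int(np.argmax(probs))    # index of the preferred model
```

The subtraction of the maximum log evidence changes nothing mathematically but avoids exponentiating large negative numbers.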

3.1.3 Validate the fitted statistical model

A number of approaches were adopted to ensure our fitted statistical model appropriately described the historical data. First, models with and without a spatial component were compared, and it was found that including the spatial term was preferred (based on the posterior model probability). We also compared our model with the one developed in Kang et al. (2016), and found that our model was preferred based on a posterior predictive check. This involved simulating replicated data from each model and then comparing posterior predictive intervals with the observed data (Gelman and Hill 2006). In addition, block cross-validation was considered to evaluate the predictive performance of this model, and a prior sensitivity analysis was also conducted, see Section S2 in Supplementary Material S1 for further details.

3.1.4 Form prior for design

In Bayesian adaptive design, designs are evaluated by simulating supposed future data based on prior information. To simulate such data, parameter values \(\varvec{\theta }\) are needed. For this purpose, the posterior distribution of the final model choice discussed above was used as the prior distribution \(p(\varvec{\theta }\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}} )\) (Table 2). This prior information was used for all subsequent design selections.

Table 2 Definitions of distribution functions

3.2 Propose new alternative design and assess its usefulness

This section describes the approach used to compare and optimise designs in terms of addressing a specified monitoring objective, which relates to the second component of our Bayesian adaptive design framework (Fig. 2 (middle)). In a broad sense, this procedure is similar to a power analysis (MacCallum et al. 1996; Fisher et al. 2019), in which one simulates data with known parameters and then refits a model to determine whether a parameter of interest is significant or not. One of the primary distinctions here is that the “known” values come from a prior distribution, so it is necessary to integrate over the uncertainty in those values. Furthermore, instead of evaluating an indicator variable related to power, a utility function which reflects the aim of data collection is evaluated. Thus, a more general approach is taken in this study, using a utility function to find adaptive designs for a variety of different aims, as described below.

Understanding the objectives of reef monitoring is needed in adaptive design (Nichols and Williams 2006) so that an appropriate utility function can be formulated for the intended purpose of data collection. However, reef monitoring programs have often been created with ambiguous objectives in mind, such as monitoring the state and trends of coral cover without explicitly quantifying how this will be assessed (Kang et al. 2016). Kang et al. (2016) draw attention to a few selected monitoring objectives related to reef monitoring. For example, if there is a specific objective at hand, such as precisely estimating the impact of coral bleaching or a CoTS outbreak, then an estimation utility could be appropriate, and this could focus on all parameters within a model or just specific parameters of interest. Here, we focus on maximising the precision in estimating all model parameters as our measurable objective to find adaptive designs, and to evaluate our approach against alternative methods (Nichols and Williams 2006).

3.2.1 Propose a design

Within our design framework, a survey design defines sites for data collection. As shown in Eq. (1), once a selection of sites has been determined, the values of the static site-specific covariates are known, and these will be used within the coral cover model. We will thus use \(\varvec{d}_{i}\), for \(i = 1,\cdots ,27\) to determine the values of the site-specific covariates across all sites in the Whitsundays. A design can then be defined as \(\varvec{d}=(\varvec{d}_{1}, \varvec{d}_{2},\cdots , \varvec{d}_{i},\cdots ,\varvec{d}_{v})\in R^{n_{s}\times v}\), where \(n_s\) represents the number of site-specific covariates included in the final model choice and v is the total number of sites within a design, noting that a given site may appear multiple times. Later, the notation \(\varvec{d}^L\) will be used to represent the current LTMP design where all the sites are sampled (i.e. \(v = 27\)).

In the context of Bayesian experimental design, the goal is to find a design \(\varvec{d}^*\) that is expected to provide the maximum amount of information to address a specific research question. To quantify such information, a utility function \(u(\varvec{d},\varvec{y}\vert \varvec{d}_{p_{0}}, \varvec{y}_{p_{0}})\) is used, which describes the worth of choosing design \(\varvec{d}\) (i.e. out of all the candidate designs) that yields data \(\varvec{y}\) for achieving the specified objective (Ryan et al. 2016). Such a utility function depends on \(\varvec{y}\) which is not yet observed. Furthermore, in natural ecosystems such as coral reefs, there are additional uncertainties associated with several factors, for example, where and when disturbances will occur (i.e. about time-varying disturbances \(\varvec{z}\)). Due to such uncertainties, a utility function cannot be applied directly to find \(\varvec{d}^*\). Instead, the expectation of the utility function is taken with respect to such unknowns. Accordingly, an expected utility function can be defined as follows (Chaloner and Verdinelli 1995):

$$\begin{aligned} E[u(\varvec{d},\varvec{z},\varvec{y}\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})]=\int _{\varvec{y}}\int _{\varvec{z}}&u(\varvec{d},\varvec{z},\varvec{y}\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}}) p(\varvec{y}\vert \varvec{z},\varvec{d}, \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})\nonumber \\&p(\varvec{z}\vert \varvec{d},\varvec{\kappa }_{p_{0}},\varvec{z}_{p_{0}}, \varvec{d}_{p_{0}}) d\varvec{z} d\varvec{y}. \end{aligned}$$
(3)

The above expected utility is not evaluated based on specific values of time-varying disturbances, but rather evaluated across their distribution. Thus, an assumption must be made about the distribution of unobserved time-varying disturbances; in this case, that they follow a distribution \(p(\varvec{z}\vert \varvec{d},\varvec{\kappa }_{p_{0}},\varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})\), see Section S2 in Supplementary Material S2 for further details.

Given our monitoring goal of maximising the precision of model parameter estimates, we chose a specific utility function for parameter estimation: the Kullback–Leibler divergence (KLD) (Kullback and Leibler 1951; Friel and Pettitt 2008). The KLD is a measure of how one probability distribution differs from another; here, the two distributions considered are the prior and the posterior. We formed the prior distribution \(p(\varvec{\theta }\mid \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})\) above by considering the LTMP data (Section 3.1). The posterior distribution \(p(\varvec{\theta }\mid \varvec{y}, \varvec{z},\varvec{d}, \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})\) was formed by additionally simulating supposed future data based on a proposed design \(\varvec{d}\).

Evaluating the expectation of the KLD between the prior and the posterior provides the means for comparing different designs: the larger the KLD, the more different the two distributions are. Lindley (1956) proposed that such a measure could be used for design selection if one is interested in maximising the expected information gained on the model parameters, and this approach has been adopted by other authors (Cook et al. 2008; Huan and Marzouk 2013). We adopted the KLD utility expressed as follows (Friel and Pettitt 2008):

$$\begin{aligned} \begin{aligned} u(\varvec{d},\varvec{z},\varvec{y}\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})=\int _{\varvec{\theta }}&p(\varvec{\theta }\vert \varvec{y},\varvec{z},\varvec{d}, \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})\\&\times \text {log}\, p(\varvec{y}\vert \varvec{\theta },\varvec{z},\varvec{d}, \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})d\varvec{\theta }\\&-\text {log}\, p(\varvec{y}\vert \varvec{z},\varvec{d}, \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}}). \end{aligned} \end{aligned}$$
(4)
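Under the Laplace approximation, both the prior and the posterior are Multivariate Normal, in which case the KLD has a closed form. A sketch in Python (the means and covariances shown are hypothetical):

```python
import numpy as np

def kld_mvn(mu_post, Sigma_post, mu_prior, Sigma_prior):
    """Closed-form KL(posterior || prior) for two Multivariate Normals:
    0.5 * (tr(S0^-1 S1) + (m0 - m1)' S0^-1 (m0 - m1) - p
           + log|S0| - log|S1|)."""
    p = mu_prior.size
    S0_inv = np.linalg.inv(Sigma_prior)
    diff = mu_prior - mu_post
    return 0.5 * (np.trace(S0_inv @ Sigma_post)
                  + diff @ S0_inv @ diff
                  - p
                  + np.linalg.slogdet(Sigma_prior)[1]
                  - np.linalg.slogdet(Sigma_post)[1])

# A hypothetical posterior that has shifted and tightened relative to the
# prior (as expected after observing informative data) gives a positive KLD.
mu0, S0 = np.zeros(3), np.eye(3)              # prior
mu1, S1 = np.full(3, 0.5), 0.25 * np.eye(3)   # posterior
gain = kld_mvn(mu1, S1, mu0, S0)
```

Identical distributions give a divergence of zero, and any information gain from the supposed future data appears as a positive value.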

3.2.2 Approximating the expected utility function

In general, one cannot evaluate the expectation defined by Eq. (3) directly, and therefore an approximation is needed. A straightforward approach to provide such an approximation is via Monte Carlo integration. Such an approximation can be expressed as follows (Ryan 2003):

$$\begin{aligned} \begin{aligned} E[u(\varvec{d},\varvec{z},\varvec{y}\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})]\approx \dfrac{1}{T}\sum _{t=1}^T u(\varvec{d},\varvec{z}^{(t)},\varvec{y}^{(t)}\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}}), \end{aligned} \end{aligned}$$
(5)

where \(\varvec{z}^{(t)}\sim p(\varvec{z}\vert \varvec{d},\varvec{\kappa }_{p_{0}},\varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})\), \(\varvec{\theta }^{(t)}\) is drawn from the prior distribution \(p(\varvec{\theta }\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})\), and \(\varvec{y}^{(t)}\sim p(\varvec{y}\vert \varvec{\theta }^{(t)}, \varvec{z}^{(t)},\varvec{r}^{(c)},\varvec{d},\varvec{y}_{p_{0}},\varvec{z}_{p_{0}},\varvec{d}_{p_{0}})\).
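The Monte Carlo approximation in Eq. (5) can be sketched generically as follows; the sampler and utility arguments are assumed interfaces standing in for the application-specific distributions, not the authors' implementation:

```python
import numpy as np

def expected_utility(design, utility, sample_theta, sample_disturbance,
                     sample_data, T=200, rng=None):
    """Monte Carlo approximation to Eq. (5): average the utility over T joint
    draws of the unknowns (parameters, time-varying disturbances, and
    supposed future data)."""
    rng = rng or np.random.default_rng()
    total = 0.0
    for _ in range(T):
        theta = sample_theta(rng)               # theta^(t) from the prior
        z = sample_disturbance(design, rng)     # z^(t), future disturbances
        y = sample_data(theta, z, design, rng)  # y^(t), supposed future data
        total += utility(design, z, y)
    return total / T

# Toy check with a constant utility: the Monte Carlo average equals it exactly.
eu = expected_utility(
    design=None,
    utility=lambda d, z, y: 1.0,
    sample_theta=lambda rng: None,
    sample_disturbance=lambda d, rng: None,
    sample_data=lambda th, z, d, rng: None,
    T=50,
)
```

In practice each sampler would draw from the distributions conditioned on the historical data, and `utility` would evaluate Eq. (4).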

Further details of our approach to approximate the above expected utility are given in Supplementary Material S2 along with pseudo-code in Algorithm S1.

3.3 Optimisation and evaluation of the design

This section describes the third component of our Bayesian adaptive design framework: optimisation and evaluation of designs (Fig. 2 (right)). In order to find the optimal design, an approach is needed to maximise \(E[u(\varvec{d},\varvec{z},\varvec{y}\vert \varvec{y}_{p_{0}}, \varvec{z}_{p_{0}}, \varvec{d}_{p_{0}})]\). The procedure used for this maximisation is described next, along with four reef monitoring scenarios to evaluate the resulting designs.

3.3.1 Optimise design

In the examples that follow, we optimise designs within reef monitoring scenarios where there are a number of sites to choose from. Thus, there will be a large but fixed number of potentially optimal designs. Enumerating all possible designs would be computationally infeasible, so we employ an optimisation algorithm. For searching within a fixed number of sites (i.e. a discrete design space), the coordinate-exchange algorithm (Meyer and Nachtsheim 1995) can be used.

To find an optimal design with, for example, v sites using the coordinate-exchange algorithm, we initialise the algorithm with a random design of v sites (i.e. an arbitrary selection of v sites out of a total of 27). The algorithm then optimises one site at a time, holding all other sites fixed. This is achieved by iteratively substituting each alternative site (i.e. out of all the available sites) for the given site; the alternative that maximises the expected utility is selected for inclusion in the design. This process is repeated for all sites in the design. As the optimal choice for each location may change depending on what other sites have been selected, the algorithm iteratively cycles through the whole design a fixed number of times (i.e. a maximum number of iterations) or until no further improvement in the expected utility is observed.
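The coordinate-exchange procedure described above can be sketched as follows; the expected utility is passed in as a black-box function, and the toy utility used in the example is purely illustrative:

```python
import numpy as np

def coordinate_exchange(candidate_sites, v, expected_utility,
                        max_iter=5, rng=None):
    """Coordinate-exchange over a discrete design space: start from a random
    design of v sites, then repeatedly swap each coordinate for the candidate
    site that maximises the expected utility, stopping early when a full
    sweep yields no improvement."""
    rng = rng or np.random.default_rng()
    design = list(rng.choice(candidate_sites, size=v, replace=False))
    best = expected_utility(design)
    for _ in range(max_iter):
        improved = False
        for pos in range(v):
            for site in candidate_sites:
                trial = design.copy()
                trial[pos] = site           # substitute an alternative site
                val = expected_utility(trial)
                if val > best:
                    design, best, improved = trial, val, True
        if not improved:
            break
    return design, best

# Toy utility that prefers large site labels, so the optimum repeats site 27
# (a given site may appear multiple times in a design).
sites = list(range(1, 28))
d_opt, u_opt = coordinate_exchange(sites, v=3,
                                   expected_utility=lambda d: sum(d),
                                   rng=np.random.default_rng(0))
```

Because the search is greedy and the starting design is random, multiple random restarts are typically used in practice.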

3.3.2 Reef monitoring scenarios

First, we consider future disturbance patterns consistent with historical patterns, and find optimal designs using our approach and the approach of Kang et al. (2016). Second, we explore the performance of our designs in comparison to the LTMP design under reduced survey scenarios and a variety of different future disturbance patterns. Where appropriate, the performance of our designs, and of those found by employing the methods of Kang et al. (2016), is also compared against random selection. In all cases, design efficiency was evaluated by comparing our adaptive designs with the LTMP design, designs found using the methods of Kang et al. (2016), and random selection.

3.3.3 Comparison with Kang et al. (2016) designs

Kang et al. (2016) derived adaptive designs using a linear model. Our model differs from theirs by incorporating spatial covariates, a spatial correlation structure, and time-varying disturbances. To make a fair comparison with the approach of Kang et al. (2016), we used their proposed linear model with our selected covariates but without a spatial correlation structure. The resulting designs were then evaluated with respect to our Beta regression model with spatial random effects (Eq. (6)). To compare designs found by adopting the methods of Kang et al. (2016), denoted \(\varvec{d}^K\), with our adaptive designs \(\varvec{d}^*\), the design efficiency was evaluated (see Supplementary Material S2: Section S4, for more details on efficiency evaluations). This efficiency can be interpreted as the proportion of sampling required under \(\varvec{d}^K\) to achieve an equivalent amount of information as given by \(\varvec{d}^*\). A mean efficiency less than 100% would suggest that our designs are expected to yield more precise parameter estimates than designs found using the methods of Kang et al. (2016), and vice versa for a mean efficiency greater than 100%.

3.3.4 Impacts of reduced survey

To further evaluate our proposed design framework, we found optimal designs under reduced survey scenarios. These scenarios were considered to investigate the possibility of lowering survey effort, and the subsequent impact this would have in terms of achieving the monitoring objective. This was investigated to determine which reefs or sites could potentially be dropped from the LTMP, and to assess the associated impact. For this purpose, two approaches were considered: 1) dropping reefs from the LTMP design, and 2) dropping a site from each reef within the LTMP design.

First, to determine the least informative reef, the approximate expected utility was evaluated for all designs formed by excluding one reef (only). Thus, the optimal design will consist of 24 sites (i.e. v = 8 reefs \(\times\) 3 sites = 24 sites). The design that yielded the largest utility was inspected to determine which reef was missing and this reef was then proposed as the least informative reef. This procedure was repeated to determine the next least informative reef, and so on. Second, we investigated the impact of dropping the least informative site from each reef. To do so, we sought an optimal design that consisted of 18 sites (i.e. v = 9 reefs \(\times\) 2 sites = 18 sites), see Table 3 for a description of sites by each reef. For this latter investigation, the design optimisation was performed using the coordinate-exchange algorithm. Such an optimisation approach was not needed for the first investigation as there were a relatively small number of designs to choose from, so an exhaustive search was employed.
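The first (exhaustive) investigation can be sketched as a leave-one-reef-out search; the reef names and the information scores in the toy utility below are hypothetical:

```python
import numpy as np

def least_informative_reef(reefs, expected_utility):
    """Exhaustive leave-one-reef-out search: evaluate the design excluding
    each reef in turn; the exclusion that retains the largest expected
    utility identifies the least informative reef."""
    best_reef, best_value = None, -np.inf
    for reef in reefs:
        design = [r for r in reefs if r != reef]   # the 8 reefs retained
        value = expected_utility(design)
        if value > best_value:
            best_reef, best_value = reef, value
    return best_reef, best_value

# Toy utility (hypothetical): each reef contributes a fixed information
# score, so the least informative reef is the one with the smallest score.
scores = {"reef1": 5.0, "reef2": 4.0, "reef3": 6.0, "reef4": 3.5,
          "reef5": 4.5, "reef6": 5.5, "reef7": 6.5, "reef8": 4.2,
          "reef9": 5.1}
drop, _ = least_informative_reef(list(scores),
                                 lambda d: sum(scores[r] for r in d))
```

Repeating this search on the reduced set of reefs identifies the next least informative reef, and so on, exactly as the sequential procedure above describes.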

Table 3 Reefs in the Whitsunday region and the corresponding site numbers

3.3.5 Impacts of different disturbance scenarios

Ecosystems are often affected by different environmental disturbances. Here, we investigate adaptive designs subject to such disturbances under two scenarios. In the first scenario, disturbance conditions were chosen to reflect historical disturbance patterns in the Whitsunday region; future time-varying covariate values were simulated based solely on parameter values estimated for each site from the previously collected time-varying disturbance data (Supplementary Material S2: Table S1). In the second scenario, we simulated different disturbance conditions to explore how the resulting optimal designs may change. This second scenario includes four schemes, where CoTS disturbance conditions varied as follows:

(i) One site from each reef affected;

(ii) All the sites in the inshore reefs affected;

(iii) All the sites in the middle-shelf reefs affected;

(iv) All the sites in the outer-shelf reefs affected.

Under Scheme (i), we randomly selected one site from each of the nine reefs in the Whitsunday region and set the CoTS disturbance probability at those sites to 1, implying a CoTS outbreak at the corresponding reefs. For the remaining three schemes, we set this CoTS disturbance probability to 1 at each of the inshore (Scheme (ii)), middle-shelf (Scheme (iii)), and outer-shelf (Scheme (iv)) sites, respectively. The corresponding optimal designs were found using the coordinate-exchange algorithm.
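The coordinate-exchange algorithm used here and in Sect. 3.3.4 can be sketched in a simplified, generic form: sweep over the design points, try every candidate site in each position, and keep any swap that improves the utility. The site labels and toy utility below are hypothetical, and this is not the authors' exact implementation:

```python
def coordinate_exchange(design, candidates, utility, max_pass=10):
    """Generic coordinate exchange: for each design coordinate, try every
    candidate site and keep the swap with the highest utility; stop when
    a full pass over the design makes no improvement."""
    design = list(design)
    for _ in range(max_pass):
        improved = False
        for i in range(len(design)):
            best, best_u = design[i], utility(design)
            for c in candidates:
                trial = design[:i] + [c] + design[i + 1:]
                u = utility(trial)
                if u > best_u:
                    best, best_u, improved = c, u, True
            design[i] = best
        if not improved:
            break
    return design

# Toy utility rewards distinct, high-valued sites, so the algorithm
# replaces the duplicated low-value site with two distinct good ones.
value = {"s1": 1, "s2": 5, "s3": 4, "s4": 2}
u = lambda d: sum(value[s] for s in set(d))
result = coordinate_exchange(["s1", "s1"], list(value), u)  # ["s2", "s3"]
```

Like the paper's approach, this only guarantees a local optimum, so in practice multiple random starting designs would typically be used.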

To compare our optimal designs with the LTMP design \(\varvec{d}^L\) under “Impacts of reduced survey” and “Impacts of different disturbance scenarios”, design efficiency was again used. When exploring optimal designs under reduced survey effort, the efficiency was based on the expected utility of the design \(\varvec{d}^L\) (Supplementary Material S2: Eq. S3). The resulting mean efficiency is interpreted as before, with a mean efficiency close to 100% meaning that little information is expected to be lost, relative to the design \(\varvec{d}^L\), by using our reduced survey design. As there is no such survey reduction in the latter investigation, its efficiency was based on the expected utility of the optimal design \(\varvec{d}^*\), i.e. the inverse of Supplementary Material S2: Eq. S3.

4 Results

4.1 Quantifying prior information

Following the procedure described in Section 3.1, the most appropriate coral cover model found was:

$$\begin{aligned} \text {logit}(\mu _{ik})&=\beta _{0}+\beta _{1} \text {Middle-shelf}_i+\beta _{2} \text {Outer-shelf}_{i}+\beta _{3}\text {Open Reef}_{i}\nonumber \\&+\beta _{4} \text {Bathymetry}_{i}+\beta _{5} \text {Chlorophyll}_i+\beta _{6} \text {CRS}\_\text {T}\_\text {AV}_{i} +\beta _{7} \text {Cyclone}_{ik} \nonumber \\&+\beta _{8} \text {Bleaching}_{ik}+\beta _{9}\text {log CoTS}_{ik}+\beta _{10} \text {Time}_{k}+r_i,\nonumber \\&i=1,\cdots ,27\, \text {and}\,k=1,\cdots ,8. \end{aligned}$$
(6)

Baseline categories for categorical covariates are incorporated into the intercept (e.g. inshore for Shelf position; see Table 1 for more details on covariates). A summary of the posterior distribution of the parameters of the above model is provided in Table 5, which shows the posterior means and standard deviations along with 95% credible intervals. The credible intervals indicate that all parameters are significant (i.e. the intervals do not contain zero) except the coefficients for Time, Middle-shelf, and log CoTS. In general, these results are consistent with those reported in similar studies (e.g. Kang et al. (2016); Vercelloni et al. (2017); Peterson et al. (2020); MacNeil et al. (2019)). However, some variation is expected as we focus only on a particular region of the GBR and fit a different model (Table 5).
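As a worked illustration of Eq. (6), the mean cover proportion \(\mu_{ik}\) is obtained by passing the linear predictor (intercept, covariate effects, and the spatial random effect \(r_i\)) through the inverse logit. The coefficient values below are invented for illustration and only a subset of the covariates is shown:

```python
import math

def inv_logit(x):
    """Inverse of the logit link: maps the linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def mean_coral_cover(beta, covariates, r_i):
    """Mean coral cover proportion mu_ik as in Eq. (6): intercept plus
    covariate effects plus the site-level spatial random effect r_i,
    mapped through the inverse logit. Names and values are illustrative
    only, not estimates from the paper."""
    eta = beta["intercept"] + r_i
    for name, value in covariates.items():
        eta += beta[name] * value
    return inv_logit(eta)

# Hypothetical coefficients and covariate values for one site and year.
beta = {"intercept": -1.0, "cyclone": -0.8, "bleaching": -0.5, "time": 0.1}
x = {"cyclone": 1.0, "bleaching": 0.0, "time": 2.0}
mu = mean_coral_cover(beta, x, r_i=0.2)  # inv_logit(-1.4), about 0.198
```

The negative cyclone coefficient here mimics the expected direction of a disturbance effect: a cyclone event lowers the expected cover proportion.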

Table 4 The retained and dropped sites within the optimal design  and the corresponding shelf and reef details. The first column relates to the three shelf positions in the region. The reef names and the corresponding numbers are shown in the next two columns, respectively. The last two columns report the retained and dropped sites from the corresponding reefs under the optimal design
Table 5 Summary of the posterior distribution of the model parameters

4.2 Optimisation and evaluation of the design

4.2.1 Comparison with Kang et al. (2016)

When comparing \(\varvec{d}^K\) with \(\varvec{d}^*\), an average design efficiency of 47 (0.49)% was found, where the standard deviation is given in parentheses. An efficiency of less than 100% is expected here, as both designs were evaluated under the spatial Beta regression model, i.e. the model assumed when finding our design; the appropriateness of including a spatial variability term in the model was justified in the model selection section (Section 3.1). Interestingly, the relatively low efficiency of \(\varvec{d}^K\) compared to \(\varvec{d}^*\) (i.e. 47%) indicates that our optimal design is expected to provide more informative data than data gathered using the approach of Kang et al. (2016). More specifically, this efficiency implies that over twice as much sampling is needed under \(\varvec{d}^K\) as under \(\varvec{d}^*\) to achieve an equivalent amount of information about our monitoring objective.

4.2.2 Impact of reduced surveys

The effect of reduced surveys, achieved by dropping reefs and sites, was compared with the current LTMP design in the Whitsunday region. To begin, dropping reefs was considered, and the design efficiencies for dropping one, two, and three reefs are shown in Fig. 3. When dropping one reef, d5 has the highest mean efficiency, approximately 89 (1.13)% (Fig. 3(a)). One reef is missing in each design here, and the missing reef in the optimal design d5 is Hayman Island. Similarly, when dropping two reefs, d2 has the highest mean efficiency, approximately 81 (0.59)% (Fig. 3(b)); the corresponding missing reefs are Hayman Island and Rebe. Interestingly, some designs remain more than 75% efficient even after dropping three reefs, a one-third reduction in survey effort (Fig. 3(c)).

Fig. 3
figure 3

Design efficiencies after dropping (a) one, (b) two, and (c) three reefs in the Whitsunday region of the Great Barrier Reef. The x-axis represents the designs formulated after dropping one, two, or three reefs, and the y-axis represents the efficiency of those designs with respect to the LTMP design; see the description related to Supplementary Material S2: Eq. S2, for details on how efficiency was estimated for a given optimal design. The black horizontal line marks the 75% efficiency level

It is of interest to understand why particular reefs/sites are less informative than others in the region. Potential reasons include differences in covariate values between reefs/sites, and the distances between reefs/sites (i.e. the spatial effect in the model). In terms of the results related to dropping reefs (discussed in the previous paragraph), it seems that information about a dropped reef may be leveraged from data collected at neighbouring reefs. This can be seen in Fig. 4, where the least informative reef, Hayman Island, is close to Langford-bird, and the second least informative reef, Rebe, is located close to Hyde. Moreover, these reefs are separated by less than the estimated range parameter, suggesting that they are spatially associated; see Supplementary Material S2: Section S6, for a qualitative comparison of the optimal designs. In summary, it appears that our designs contain locations that can leverage information about neighbouring locations (rather than sampling there) due to the spatial dependence within the data.

Fig. 4
figure 4

Spatial locations of the two least informative reefs (sites) in the Whitsunday region. This figure was produced using the GBR shapefile with a nominal scale of 1:250,000 (Great Barrier Reef Marine Park Authority 2014) and the R package ggplot2 (Wickham 2011). Sites on these two reefs are displayed in red. A small amount of jitter was added for visualisation purposes

Next, we consider the results related to dropping the least informative site from each reef. The sites retained by the optimal design are shown in Table 4, and the reader is referred to Fig. 5 for their locations. Although dropping the least informative site from each reef was equivalent to a one-third reduction in survey effort, the optimal design still maintained an approximate mean efficiency of 85 (0.44)%. This small decline in efficiency relative to the decline in survey effort suggests there is value in considering an adaptive design approach when survey effort is being reduced.

Fig. 5
figure 5

Spatial locations of the reefs/sites in the Whitsunday region of the Great Barrier Reef after dropping the least informative site from each reef. This figure was produced using the GBR shapefile with a nominal scale of 1:250,000 (Great Barrier Reef Marine Park Authority 2014) and the R package ggplot2 (Wickham 2011). Red points denote dropped sites from each reef. A small amount of jitter was added for visualisation purposes

Next, we explore potential reasons why one site appears less informative than the other two within a given reef. It is worth noting that sites within a reef are at least 250 m apart (where possible) (Australian Institute of Marine Science 2021), and these distances are relatively constant compared to the distances between reefs. This suggests that the spatial component is not particularly driving site selection. Instead, differences between covariate values are a potential reason for site selection, which we investigate next.

To examine the covariate differences, summaries of the distributions of time-varying and other covariates at each site are shown in Fig. 6(a)–(f). As can be seen, bleaching event proportions vary only between 0.125 and 0.128 among reefs in this region; consequently, site selection may not be driven by differences in bleaching. If we instead consider bathymetry, Sites 11 and 12 on reef 20104S have quite different bathymetry values, which is a potential reason they are retained in the optimal design. Sites 14 and 15 on reef 19138S share similar values for all covariates, which is potentially why the optimal design drops one of them (i.e. Site 15) to retain the two most dissimilar sites. The same argument could be made for sites on the remaining middle-shelf reef, 19131S. In the outer-shelf, the optimal design keeps Sites 16 and 18 on Rebe reef, seemingly due to the contrast in bathymetry and mean temperature values at these sites. Furthermore, the optimal design retains Sites 2 and 3 on Broder Island reef (Table 4), yet sites on this reef share similar features (Fig. 6). The same is true for sites on the remaining two inshore reefs, Langford-bird and Hayman Island. In such cases, the site choice within these reefs may be largely inconsequential. In summary, site selection appears to depend on variability in site-specific features, as this allows the effects of these covariates to be estimated more precisely. It is also potentially a consequence of the sites being relatively similar distances apart (within a reef).

Fig. 6
figure 6

Distributions of time-varying disturbance proportions (a–c) and distributions of other covariates (d–f) at each site in the Whitsunday region of the Great Barrier Reef. Reef names corresponding to the numbers given here are shown in Table 4

4.2.3 Impacts of different disturbance conditions

Two scenarios were considered to find designs that vary over time depending on the effects of environmental disturbances. In Scenario 1, environmental disturbances were simulated to match the historical disturbance patterns in the Whitsunday region. To understand how design points were distributed under this scenario, the locations of sites within the optimal designs (i.e. \(\varvec{d}^{*}\)) were visualised alongside the current LTMP (i.e. \(\varvec{d}^{L}\)) sites (Fig. 7). To help interpret these results, a dot plot was produced (Fig. 8), which shows the number of samples to be collected from each site under the optimal design. Accordingly, the optimal design does not collect data from all sites in the Whitsunday region; instead, it collects more data from a particular selection of sites. The mean design efficiency of \(\varvec{d}^{L}\) compared to \(\varvec{d}^{*}\) was 41 (1.40)%, suggesting that \(\varvec{d}^{*}\) provides highly informative data, with less survey effort than \(\varvec{d}^{L}\), for maximising precision in parameter estimation when disturbance patterns similar to historical patterns are observed.

Fig. 7
figure 7

The optimal design when disturbance patterns match historical disturbance patterns. The Whitsunday region of the Great Barrier Reef is depicted in three parts as Inner- (a), Middle- (b), and Outer-shelf (c) habitats. Green dots and red circles represent sites not selected and selected into the optimal design, respectively. The frequency relates to the red circles and represents the number of samples to be collected from a site. A small amount of jitter was added for visualisation purposes

Fig. 8
figure 8

Number of observations to be collected from each site in the Whitsunday region of the Great Barrier Reef when disturbance patterns match historical disturbance patterns in the region

In considering why certain sites were sampled more than others, and some not at all, one can explore the differences in covariate values and the distances between reefs/sites at the habitat, reef, and site scales. At the habitat scale, the optimal design collects more data from sites at a reef located far away from the other reefs in the same habitat (Fig. 7); conversely, when two reefs are nearby, the optimal design collects fewer data from sites at either reef. At the reef scale, Hayman is the only reef open for fishing in the inshore habitat. To capture the underlying contrast of this reef compared to others, the optimal design collects more data from sites on this reef (i.e. Sites 7, 8, and 9) (Fig. 8). Similarly, the only reef closed to fishing in the outer-shelf habitat is Slate, and thus the optimal design collects more data from sites at this reef. At the site scale, the most diverse sites in the Whitsunday region in terms of covariates are Sites 11 and 27 (Fig. 6). To estimate covariate effects more precisely, the optimal design collects more data from these two sites compared to the other sites in the region (Fig. 8). Overall, it seems that the optimal design collects more data from reefs/sites that are quite dissimilar from others, thereby maximising precision in parameter estimation.

In Scenario 2, we determined optimal designs subject to CoTS disturbance under four survey schemes, i.e. one site from each reef affected, all the sites in the inshore reefs affected, all the sites in the middle-shelf reefs affected, and all the sites in the outer-shelf reefs affected. The mean efficiencies of the design \(\varvec{d}^{L}\) compared to the optimal designs from each of these four schemes were 47 (0.47)%, 48 (0.57)%, 50 (0.59)%, and 51 (0.69)%, respectively. These efficiencies suggest that each \(\varvec{d}^{*}\) is expected to capture approximately twice as much information as \(\varvec{d}^{L}\) with respect to our monitoring objective. The optimal designs selected under these four schemes reflect patterns similar to those discussed above for Scenario 1, so they are given in Supplementary Material S2: Section S5, where further discussion is provided.

In addition, further evaluations of our optimal designs were undertaken in Scenario 1 and Scenario 2 (scheme i), see Supplementary Material S2: Section S6 for further details.

5 Discussion

This study contributes to improving the effectiveness of ecosystem monitoring through the use of adaptive design approaches, and this was demonstrated using hard coral cover data collected on the GBR. We demonstrated how an adaptive design approach can be used to appropriately leverage information from previously collected data, reduce monitoring costs/resources, and capture a relatively large amount of information about coral health. Our results suggest that efficiency can be gained by leveraging information from nearby locations, an insight that could inform ecosystem monitoring practices more generally.

The current study compared the effect of having fewer LTMP sites in the Whitsunday region, either by removing reefs or by removing one site from each reef. Most notably, removing these reefs and sites did not result in a substantial loss of information about the parameters in our coral cover model. This is because our model captures spatial variability; thus, information about coral cover at a particular site/reef can be obtained from neighbouring reefs. For example, Hayman and Langford-bird reefs are located nearby in the inshore habitat (Fig. 4), while the remaining inshore reef, Broder Island, is relatively isolated. Therefore, Hayman reef was identified as the least informative reef in the Whitsunday region while Broder Island was retained. A similar pattern was seen in the outer-shelf habitat, where Hyde and Rebe reefs are close to each other; thus, Rebe reef was identified as one of the least informative reefs. Out of interest, we also compared our optimal designs with those based on the methods proposed by Kang et al. (2016). Our optimal designs performed well, and again appeared to exploit the spatial dependencies in coral cover by leveraging information from nearby locations; see Supplementary Material S2: Section S6, for further details.

Our other objective was to find designs that could change over time, depending on environmental impacts on ecosystems. Using the GBR as a case study, we found adaptive designs when disturbance patterns matched historical and potential future disturbance patterns. From this, we found that data were not collected from all LTMP sites but instead more data were collected, for example, in regions that were more affected by disturbances. The efficiency comparison of such designs with the LTMP suggested that the collected data were highly informative in terms of reaching the monitoring goal, i.e. maximising precision in parameter estimation. Such a finding may have implications for developing adaptive designs to quantify the impact of environmental disturbances in ecosystem monitoring.

In assessing this work, it is important to note that the choice of model can have a significant influence on the optimal design. We demonstrated this by considering two different models for coral cover data, and found a significant reduction in design performance when designs were found assuming a different model for coral cover. This highlights the important role the model plays in determining an optimal design and how well a given design can address a monitoring objective. Given this, we suggest that more ecologically representative or mechanistic models be considered within design, as such models should in general be more appropriate for describing coral cover, and this should lead to additional efficiencies in ecosystem monitoring.

In this paper, we focused on a general monitoring objective: maximising the expected information gained on all model parameters simultaneously. Kang et al. (2016) proposed and implemented other utility functions that could be used in the reef monitoring context and more broadly in other ecosystem monitoring. Given the generic nature of the methods developed in this study, it would, in principle, be straightforward to consider a variety of other monitoring objectives, such as prediction accuracy, trend estimation, or a combination of both. Of interest is whether efficiencies similar to those observed in this paper would be gained from using an adaptive design approach with such utilities.

There is scope to extend the methods presented in this paper. For example, we did not consider the potential spatial and serial correlation of time-varying disturbances, the correlations that may exist between such variables, or the lagged effects of some covariates (such as water quality and SSTA) on coral cover. The effects of such associations could be explored in future studies, and could potentially lead to more informative adaptive designs. Furthermore, previous research has shown that inferences can change depending on the spatial scale and the extent of spatial smoothing considered (Kang et al. 2013, 2014). Therefore, it would be interesting to explore whether such changes significantly impact the chosen optimal design.

5.1 Summary of approach and outcomes

To provide direction for applying our design framework to monitoring other ecosystems, we summarise our approach and some general outcomes. To initialise the process, we suggest defining the research question you wish to answer, or the hypothesis you would like to test, based on the planned future data collection. Such clarity will aid the development of an appropriate analysis model and/or utility function to address this question/hypothesis. Next, propose the statistical model that will be used to analyse the collected data to provide an answer (with uncertainty) to your research question/hypothesis. To complement future data collection, consider whether other sources of information could be incorporated into your analysis to reduce this uncertainty, for example, expert-elicited and/or previously collected data. Then, based on this, different future sampling designs can be proposed and assessed quantitatively in terms of how much they reduce the uncertainty about your research question/hypothesis. Such sampling designs can also be assessed within a variety of scenarios that may be encountered in the future, e.g. scenarios based on a variety of values of the considered time-varying covariates. Through adopting this approach and applying it to the reef monitoring case study, the following general outcomes were found:

  • Considering an appropriate statistical model for the data can lead to sampling efficiency, so we suggest domain experts are included in the model development phase.

  • Contrast data at extreme values of covariates to learn about (linear) effects.

  • Collect data at levels that are most uncertain e.g. if there is relatively large variability between reefs, then collect data on a larger number of reefs rather than collecting multiple samples on the same reef.

  • If the model includes a spatial term and the aim is to maximise the precision of the model parameters, select locations that are not spatially correlated.
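The last point can be screened for with a simple distance check against the estimated spatial range parameter, beyond which locations can be treated as only weakly correlated. This is a rough screening rule, not a substitute for evaluating the design utility, and the coordinates and range value below are hypothetical:

```python
import math

def weakly_correlated(sites, coords, spatial_range):
    """Return True if all pairwise distances between candidate sites
    exceed the estimated spatial range parameter, so the sites can be
    treated as approximately uncorrelated under the spatial model."""
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            (x1, y1), (x2, y2) = coords[sites[i]], coords[sites[j]]
            if math.hypot(x2 - x1, y2 - y1) <= spatial_range:
                return False
    return True

# Hypothetical site coordinates (same distance units as the range).
coords = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (10.0, 10.0)}
ok = weakly_correlated(["a", "b", "c"], coords, spatial_range=5.0)    # True
bad = weakly_correlated(["a", "b", "c"], coords, spatial_range=12.0)  # False
```

When the check fails, one of the offending sites is a candidate for removal, in the spirit of the reef-dropping results above where nearby reefs carried largely redundant information.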

6 Software

A desktop computer running R version 3.5.2 and an HPC system were used for the analysis.