Occupancy modelling as a new approach to assess supranational trends using opportunistic data: a pilot study for the damselfly Calopteryx splendens
- Cite this article as:
- van Strien, A.J., Termaat, T., Kalkman, V. et al. Biodivers Conserv (2013) 22: 673. doi:10.1007/s10531-013-0436-1
There is limited information available on changes in biodiversity at the European scale, because there is a lack of data from standardised monitoring for most species groups. However, a great number of observations made without a standardised field protocol is available in many countries for many species. Such opportunistic data offer an alternative source of information, but unfortunately such data suffer from non-standardised observation effort and geographical bias. Here we describe a new approach to compiling supranational trends using opportunistic data which adjusts for these two major imperfections. The non-standardised observation effort is dealt with by occupancy modelling, and the unequal geographical distribution of sites by a weighting procedure. The damselfly Calopteryx splendens was chosen as our test species. The data were collected from five countries (Ireland, Great Britain, the Netherlands, Belgium and France), covering the period 1990–2008. We used occupancy models to estimate the annual number of occupied 1 × 1 km sites per country. Occupancy models use presence-absence data, account for imperfect detection of species, and thereby correct for between-year variability in observation effort. The occupancy models were run per country in a Bayesian mode of inference using JAGS. The occupancy estimates per country were then aggregated to assess the supranational trend in the number of occupied 1 × 1 km sites. To adjust for the unequal geographical distribution of surveyed sites, we weighted the countries according to the number of sites surveyed and the range of the species per country. The distribution of C. splendens has increased significantly in the combined five countries. Our trial demonstrated that a supranational trend in distribution can be derived from opportunistic data, while adjusting for observation effort and geographical bias. This opens new perspectives for international monitoring of biodiversity.
Keywords: Detection; Monitoring; Distribution; Citizen science data; Odonata; JAGS
Biodiversity is in decline worldwide (Butchart et al. 2010) and this has led to a growing concern for wildlife. Recently, the European Union launched a strategy aimed to halt biodiversity loss in the EU and restore it as far as feasible by 2020 (European Union 2011). In order to assess whether this target will be met, monitoring data are required on the status of many species, preferably at the European scale. However, data from standardised monitoring yielding information on European trends are scarce. Such information is currently mainly available for birds, some butterflies and some mammal species (de Heer et al. 2005; Gregory et al. 2005; European Environmental Agency 2007; van Swaay et al. 2008). For these species annual supranational population indices are available with confidence intervals allowing the statistical testing of trends. For birds and butterflies these species trends are combined into biodiversity indicators (European Environmental Agency 2007).
It seems hardly feasible to collect standardised monitoring data on a large spatial scale for other species groups. Yet, in many countries a great number of opportunistic records is available, i.e., observations collected without standardised field protocol and without a design ensuring the geographical representativeness of sampled sites. The opportunistic records are single records for particular species and day-lists of species, i.e., records of multiple species collected by a single observer on one site and date. In recent years, the number of opportunistic records has increased greatly, with data entry facilitated through internet portals (e.g. waarneming.nl and observado.org). These data, often labelled as citizen science data, are a potentially valuable source of information on changes in biodiversity (Schmeller et al. 2009; Devictor et al. 2010). However, these data should be used with caution because the non-standardised observation efforts and the often uneven geographical distribution of records make national trend assessments unreliable (Dennis et al. 1999; Dennis and Thomas 2000; Robertson et al. 2010; Szabo et al. 2010; Hassall 2012). It is even more challenging to assess supranational trends from such opportunistic data, because the imperfections in the data may differ between countries.
In recent years, dynamic occupancy models (MacKenzie et al. 2006; Royle and Kéry 2007) have been proposed to derive reliable trend information from opportunistic data (Kéry et al. 2010; van Strien et al. 2010, 2011). Occupancy models use presence-absence data and yield estimates of the percentage of occupied sites (occupancy), e.g. 1 × 1 km squares, per year. These models take into account the imperfect detection of species and this characteristic makes them useful for analysing opportunistic data. The basic idea behind analysing opportunistic data in this way is that, all else being equal, greater observation effort increases the probability of detecting a species, so variation in observation effort over the years can be translated into variation in species detectability (Kéry et al. 2010). Using such models, Van Strien et al. (2010) demonstrated that in the Netherlands the trends for seven dragonfly species derived from opportunistic records during 1999–2007 were similar to trends derived from standardised monitoring data.
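As an illustration of what "dynamic" means here, the sketch below projects occupancy forward from year to year using the standard recursion for this model class: a site occupied in year t stays occupied with a persistence probability, and an unoccupied site becomes occupied with a colonisation probability. The function name and all numeric values are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
def project_occupancy(psi_first, persistence, colonisation):
    """Project the proportion of occupied sites over successive years.

    psi_first    : occupancy in the first year (proportion of sites occupied)
    persistence  : per-interval probabilities that an occupied site stays occupied
    colonisation : per-interval probabilities that an empty site becomes occupied
    """
    psi = [psi_first]
    for phi, gamma in zip(persistence, colonisation):
        previous = psi[-1]
        # Occupied sites that persist plus empty sites that are colonised.
        psi.append(previous * phi + (1.0 - previous) * gamma)
    return psi


# Hypothetical values for three yearly transitions (illustration only).
trajectory = project_occupancy(0.50, [0.90, 0.90, 0.90], [0.20, 0.20, 0.20])
for year_index, value in enumerate(trajectory):
    print(f"year {year_index}: occupancy {value:.3f}")
```

With persistence 0.9 and colonisation 0.2, occupancy in this toy example rises towards an equilibrium of 0.2 / (0.2 + 0.1), which is why trends in occupancy can be read together with trends in colonisation and persistence.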
To the best of our knowledge, no attempts have so far been made to assess supranational trends from opportunistic data using occupancy modelling. The aim of this study is to explore whether a supranational trend for dragonflies could be generated from opportunistic data while adjusting for the imperfections mentioned. Several European countries have databases with many opportunistic records of dragonflies. We used records of dragonflies from Ireland, Great Britain, the Netherlands, Belgium and France. As a test species we chose Calopteryx splendens, which is a widespread species in all five countries.
Materials and methods
All records used in this study were from adult dragonflies only. Data from Ireland include Northern Ireland and were obtained from the DragonflyIreland dataset managed by the Centre for Environmental Data and Recording (Northern Ireland) with the support of the National Biodiversity Data Centre (Republic of Ireland). Records collected are largely opportunistic and were submitted via email and websites. Data from Great Britain were obtained from the Dragonfly Recording Network of the British Dragonfly Society. Most records are opportunistic and verified by the national network of Vice County Recorders. The opportunistic data from the Netherlands were obtained from the National Database Flora and Fauna, maintained by the National Authority for Data concerning Nature. These data are owned by the Dutch Society for Dragonfly Studies, Dutch Butterfly Conservation, and the European Invertebrate Survey—the Netherlands. Most records are currently collected through the internet portals waarneming.nl and telmee.nl. Data from the Dutch Dragonfly Monitoring Scheme were excluded because these were based on standardised field work. Dragonfly data from Belgium are collected by the Flemish Dragonfly Society and the Walloon Dragonfly Working Group and through the internet portals waarnemingen.be and observations.be which are managed by Natuurpunt and Natagora. Data from France came from the database managed by the French Society of Odonatology (SFO). The French data have been collected within the framework of the Odonata’s national surveys, called INVOD (1980–2004) and CILIF (from 2004 onwards) (Dommanget 2002, 2010).
Different geo-reference systems were used in each country for the observations. Hence, all observations were converted to the Universal Transverse Mercator (UTM) system. Because we used 1 × 1 km as the definition of a site in our analyses, all observations were referenced to 1 × 1 km UTM squares.
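The assignment of observations to 1 × 1 km sites can be sketched as follows, assuming the coordinates have already been converted to UTM eastings and northings in metres (the conversion from national grid systems is not shown and would need a projection library). The function name, the zone labels and the example coordinates are hypothetical.

```python
import math


def utm_square_1km(zone, easting_m, northing_m):
    """Return an identifier for the 1 x 1 km UTM square containing a point.

    The square is identified by the UTM zone and the kilometre coordinates
    of its south-west corner.
    """
    km_east = math.floor(easting_m / 1000)
    km_north = math.floor(northing_m / 1000)
    return (zone, km_east, km_north)


# Hypothetical observations: (zone, easting in m, northing in m).
observations = [
    ("31U", 601234.5, 5790876.2),
    ("31U", 601999.9, 5790001.0),  # same square as the first record
    ("31U", 602000.0, 5790001.0),  # one square further east
]

sites = [utm_square_1km(z, e, n) for z, e, n in observations]
print(sites)
```

Records that share a square identifier are then treated as visits to the same site.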
Generating non-detection data
Almost all data obtained were records of species presence. But occupancy models also require absence data, more precisely non-detection data, to estimate detection probabilities. Detection probability is estimated from the pattern in the detections and non-detections in replicated visits at sites. Valid replicated visits are only those visits made in a period of closure within the year; this is the period during which a site is considered to be either occupied or unoccupied and not abandoned or colonised (MacKenzie et al. 2006).
The non-detection records were generated from the information of sightings of other dragonfly species, following Van Strien et al. (2010, 2011). Any observation of C. splendens was taken as 1 (detection), whereas we rated 0 (non-detection) if any other species but not C. splendens had been reported by an observer at a particular 1 × 1 km site and on a particular date within the closure period. Usually, C. splendens is observed between Julian dates 130–250 and we used as closure period Julian dates 150–220. We made an exception for Ireland, where we used Julian dates 160–260 because the species seems to have a later flight period there. Despite many dragonflies having advanced their phenology in recent decades (Dingemans and Kalkman 2008), data exploration revealed no changes in flight period of our study species during 1990–2008, so the closure period was kept the same for all years.
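The rule described above can be sketched in a few lines. This is a minimal illustration, not the code used in the study: the record layout (site, year, Julian day, observer, species), the function name and the example records are assumptions, and restricting detections as well as non-detections to the closure period is an interpretation of the text.

```python
from collections import defaultdict

TARGET = "Calopteryx splendens"


def detection_histories(records, closure=(150, 220), target=TARGET):
    """Turn presence-only records into detection (1) / non-detection (0) visits.

    records : iterable of (site, year, julian_day, observer, species)
    closure : first and last Julian day of the closure period (inclusive)

    A visit is one observer at one site on one date. It scores 1 if the
    target species was reported and 0 if only other species were reported.
    """
    visits = defaultdict(set)
    for site, year, day, observer, species in records:
        if closure[0] <= day <= closure[1]:
            visits[(site, year, day, observer)].add(species)
    return {visit: int(target in seen) for visit, seen in visits.items()}


# Hypothetical records for illustration.
records = [
    ("A", 2000, 160, "obs1", "Calopteryx splendens"),
    ("A", 2000, 160, "obs1", "Ischnura elegans"),
    ("A", 2000, 190, "obs2", "Ischnura elegans"),   # non-detection
    ("B", 2000, 140, "obs1", "Ischnura elegans"),   # before closure period
    ("B", 2000, 230, "obs1", "Calopteryx splendens"),  # after closure period
]

histories = detection_histories(records)
print(histories)
```

For Ireland the same function would be called with `closure=(160, 260)`.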
We fitted the models in a Bayesian mode of inference using JAGS (Plummer 2009) on the computer cluster LISA (https://subtrac.sara.nl), with essentially the same WinBUGS code (Spiegelhalter et al. 2003) as given by Royle and Dorazio (2008; p. 309), but in addition we estimated the intercept αt as a random year effect. We chose uninformative priors for all parameters, using uniform distributions with values between 0 and 1 for all parameters except δ1 and δ2 (values between −10 and 10), β1, β2 (values between −5 and 5) and αt (values between 0 and 5 for the standard deviation of the normal distribution used as prior for the random year effect; see Kéry (2010) for examples of WinBUGS code for random effects).
For each analysis, we ran three Markov chains with 15,000 iterations to ensure convergence as judged from the Gelman-Rubin Rhat statistic. We discarded the first 10,000 iterations as burn-in and used the remaining iterations for inferences. Model fits were assessed using Bayesian p-values. This value is near 0.5 for a fitting model and values close to 0 or to 1 indicate inadequate fits (Kéry 2010). Our p-values varied between 0.44 and 0.59, suggesting that model fits were adequate. The model produced annual estimates of occupancy, persistence and colonisation per country and their regression coefficients across years were estimated as derived parameters (Kéry 2010).
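The convergence check can be illustrated with a basic version of the Gelman-Rubin statistic, which compares the variance between chains with the variance within chains. The sketch below uses simulated draws rather than JAGS output, and the implementation is the textbook form; the software used in the study may compute a refined variant. The chain length and burn-in follow the text (15,000 iterations, first 10,000 discarded); everything else is hypothetical.

```python
import random
from statistics import mean, variance


def gelman_rubin_rhat(chains):
    """Basic Gelman-Rubin Rhat for one parameter.

    chains : list of equally long lists of posterior draws (burn-in removed)
    """
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])
    between = n * variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(42)


def simulate_chain(centre, n_iter=15000, burn_in=10000):
    """Stand-in for an MCMC chain: independent normal draws around `centre`."""
    draws = [rng.gauss(centre, 1.0) for _ in range(n_iter)]
    return draws[burn_in:]  # discard burn-in, keep the rest for inference


# Three chains sampling the same distribution: Rhat should be close to 1.
mixed = [simulate_chain(0.0) for _ in range(3)]
# One chain stuck elsewhere: Rhat should be clearly above 1.
stuck = [simulate_chain(c) for c in (0.0, 0.0, 3.0)]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(f"well-mixed chains: Rhat = {rhat_mixed:.3f}")
print(f"one stuck chain:   Rhat = {rhat_stuck:.3f}")
```

Values near 1 indicate that the chains agree with each other; values well above 1 suggest that more iterations are needed.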
Table 1 Ingredients to treat relative oversampling and undersampling of countries with respect to Calopteryx splendens

| Country | 10 × 10 km² with C. splendens | 1 × 1 km² surveyed within range of C. splendens | % range/% sampling intensity |
| --- | --- | --- | --- |
| Ireland | 276 (7.8 %) | 1,768 (4.2 %) | |
| Great Britain | 962 (27.4 %) | 15,513 (36.6 %) | |
| The Netherlands | 317 (9.0 %) | 15,798 (37.3 %) | |
| Belgium | 230 (6.6 %) | 3,987 (9.4 %) | |
| France | 1,725 (49.1 %) | 5,275 (12.4 %) | |
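The ratio named in the last column (% range / % sampling intensity) can be recomputed from the counts in the table. The sketch below does only that: it does not reproduce how the weights enter the supranational trend, which is not shown here. The assignment of rows to countries is an assumption, based on the order in which the countries are listed elsewhere in the article.

```python
# Counts from the table; the country labels are an assumption (row order
# taken to be Ireland, Great Britain, the Netherlands, Belgium, France).
range_squares = {          # 10 x 10 km squares with C. splendens
    "Ireland": 276,
    "Great Britain": 962,
    "The Netherlands": 317,
    "Belgium": 230,
    "France": 1725,
}
surveyed_squares = {       # 1 x 1 km squares surveyed within the range
    "Ireland": 1768,
    "Great Britain": 15513,
    "The Netherlands": 15798,
    "Belgium": 3987,
    "France": 5275,
}

total_range = sum(range_squares.values())
total_surveyed = sum(surveyed_squares.values())

weights = {}
for country in range_squares:
    share_of_range = range_squares[country] / total_range
    share_of_sampling = surveyed_squares[country] / total_surveyed
    # Above 1: undersampled relative to range; below 1: oversampled.
    weights[country] = share_of_range / share_of_sampling

for country, weight in weights.items():
    print(f"{country}: weight {weight:.2f}")
```

A country holding a large share of the range but a small share of the surveyed sites receives a weight above 1, so its occupancy estimates count more heavily in the aggregate.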
Table 2 Number of day-lists per country and day-list category in 1990–2008 in 1 × 1 km² where Calopteryx splendens has been observed at least once
Columns: Single records data (%); Short day-lists (%); Comprehensive day-lists (%)
Table 3 Trend in occupancy, colonisation and persistence (±se) of Calopteryx splendens per country in 1990–2008

| Country (no. of sites) | Trend in occupancy | Trend in colonisation | Trend in persistence |
| --- | --- | --- | --- |
| Ireland | −0.002 ± 0.006 | −0.001 ± 0.012 | 0.001 ± 0.011 |
| Great Britain (4,954) | 0.007 ± 0.001a | 0.004 ± 0.003 | 0.001 ± 0.001 |
| The Netherlands (3,294) | 0.007 ± 0.002a | 0.005 ± 0.006 | 0.003 ± 0.002 |
| Belgium | 0.014 ± 0.003a | 0.003 ± 0.004 | 0.005 ± 0.003 |
| France | 0.002 ± 0.002 | 0.000 ± 0.008 | 0.001 ± 0.001 |
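Table 4 Detection probability of Calopteryx splendens (±se) per country and day-list category

| Country (no. of sites) | Single records data | Short day-lists | Comprehensive day-lists |
| --- | --- | --- | --- |
| Ireland | 0.83 ± 0.06 | 0.75 ± 0.09 | 0.65 ± 0.10a |
| Great Britain | 0.44 ± 0.03 | 0.55 ± 0.03a | 0.62 ± 0.02a |
| The Netherlands | 0.38 ± 0.03 | 0.40 ± 0.03 | 0.50 ± 0.03a |
| Belgium | 0.45 ± 0.04 | 0.56 ± 0.04a | 0.61 ± 0.04a |
| France | 0.19 ± 0.02 | 0.44 ± 0.03a | 0.68 ± 0.03a |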
We have described a new approach to compose supranational trends using opportunistic data. The approach takes into account the two main imperfections in opportunistic data. The non-standardised observation effort is dealt with by occupancy modelling and the unequal geographical distribution of sites by a weighting procedure.
In monitoring schemes variation in observation effort is minimized by adopting a standard field methodology, e.g. reporting all species detected at a site and adhering to a particular field method and timing of visits to a site. In contrast, variation in observation effort is substantial in opportunistic data. Many attempts have been made to extract trend information from opportunistic data, e.g. by comparing only sites that had been equally surveyed (see Hassall and Thompson 2010) or by a statistical correction method with a proxy for observation effort (Szabo et al. 2010). Occupancy models provide a more general method to control observation effort by assuming that variation in observation effort will result in a different detection probability of species, whatever its source may be. So, the variation in number and timing of field visits, variation in field efforts during a visit and variation in observer skills and in their readiness to report a species after detection are all assumed to be reflected in variation in detection. We adjusted for these sources of variation by taking into account detection probability in an occupancy model and we also included day-list category and Julian date as a covariate for detection.
Our trial of the new approach showed an increase of C. splendens which agrees well with expert knowledge of the species. As a direct cross-check for the trend in the Netherlands, we used independent monitoring data available for the Netherlands. We selected a subset of squares (n = 105) from which both opportunistic data and monitoring data were available and found a similar trend in occupancy in 1999–2010 (trend ± se −0.005 ± 0.006 and −0.004 ± 0.006 respectively). This confirms our earlier findings that opportunistic data may produce reliable trends if analysed by an occupancy model (Van Strien et al. 2010). Note that the decline found in the data used for comparison contradicts the overall trend found for the Netherlands. This is because the subset of squares was not representative of the whole country.
Sites from which we had opportunistic records were not selected by using a formal sampling design, but instead by the free choice of observers. This might lead to an unequal geographical distribution of sites and to biased results within countries (Yoccoz et al. 2001; Hassall 2012). We have ignored this potential bias, because we identified no clear skewed geographical distribution within countries (Fig. 2a). An exception is Great Britain, where England has a higher density of surveyed sites than Scotland. However, this is not relevant in our case because C. splendens does not occur in Scotland (Fig. 2b). Where apparent geographical bias at the national level exists, this could be treated by a post-stratification of sites, e.g. by using regions or habitat types as strata, followed by weighting of strata (van Swaay et al. 2002; Gregory et al. 2005), much as we did to calculate trends at the supranational level.
Like other dragonflies living in running water, our study species suffered considerably during 1950–1980 from water pollution, deterioration of aquatic vegetation and physical alterations to water bodies. Improvements in these conditions have led to local recovery (Ward and Mill 2004; De Knijf et al. 2006) and here we show that the species has increased on a large spatial scale as well. The lack of any change in France hides a disparity: in several catchments the quality of running waters has improved during the two last decades (Service de l’observation et des statistiques 2010), but in some other catchments water quality has not much or not at all improved (Service de l’observation et des statistiques 2009). In addition, the species has expanded its range northwards in the UK, probably as the result of a combination of the effects of climate change (Hickling et al. 2005) and of improved water quality of rivers and streams in the northern part of the UK, which acted as a barrier to range expansion due to historic water pollution (Ward and Mill 2004).
Some additional assumptions which may invalidate our results need to be addressed. First, in the occupancy modelling, we have assumed a period in the season during which no colonisation or extinction of the study species in sites happened. But dragonflies may disperse during the entire season. A lack of closure may lead to low estimates of detection probability and to positive bias in the occupancy estimate (Rota et al. 2009). This is a problem if occupancy is taken to mean ‘permanent presence’. But if random movement occurs to and from sites that are not permanently occupied, as we believe to be the case with mobile organisms like dragonflies, the occupancy parameter should be interpreted as the proportion of sites “used” by the target species during the period over which closure is assumed (MacKenzie et al. 2006).
Secondly, we assumed that sightings of other species were informative about a non-detection of our study species. Some observers might have surveyed running waters in the 1 × 1 km square, which are possibly inhabited by C. splendens, so any detection of another species is indeed informative about a non-detection of C. splendens. Others, however, might have surveyed only fens or ditches or other habitats unsuitable for C. splendens. In the latter habitat types, the detection of other species is not informative about the detection probability of C. splendens. Nevertheless, we expect that this sampling behaviour does not lead to biased occupancy estimates. Kendall and White (2009) demonstrated that sampling spatial subunits within a site without replacement leads to bias in occupancy estimates, whereas sampling with replacement does not. We consider the collection of opportunistic data by many observers comparable to sampling with replacement, so we expect our estimates to be of reasonable quality.
Thirdly, our procedure to generate non-detections for our study species from sighting of other species will not work in practice if there are only a few species in a site or only rare species. Then day-lists will often have length zero, but such informative non-detections rarely enter the databases. In such situations, many records are presences of the study species leading to unlikely high detection estimates. This happens in Ireland, which is naturally poor in dragonfly species and where C. splendens is often found on its own (Kalkman et al. 2010; Nelson et al. 2011). Single records data form the largest group of records here (Table 2) and detection probability is exceptionally high (Table 4). To a lesser extent this is also true for short day-lists in Ireland. In such cases the mechanism to adjust for variation in observation effort via taking into account detection fails. Some form of standardised monitoring is probably the only option to achieve an unbiased trend estimate for this species in Ireland. Incidentally, any bias in the Irish data will hardly affect the supranational trend estimate, because Ireland contains only a limited share of this species anyway (Table 1).
Our trial demonstrated that supranational annual indices with confidence intervals and a supranational trend can be derived from opportunistic data, while adjusting for observation effort bias and geographical bias. The annual indices with confidence intervals allow the formal testing of trends. These characteristics make our approach superior to previous large-scale assessments of changes in species, such as for dragonflies by Clausnitzer et al. (2009).
Occupancy models, however, can only be applied if the data contain a sufficient number of replicated visits at sites within the season (MacKenzie et al. 2006). Outside Europe, the number of dragonfly records seems quite limited (see e.g. Hassall 2012), so the number of records from replicated visits might be too low for large-scale application of these models. But we suspect that over half of the EU member states currently have useful databases available with considerable amounts of opportunistic dragonfly records. Several other EU countries would be able to join with relatively little extra effort in data collection, for instance by focussing on the collection of records at a limited number of selected sites. This situation might be similar for some other insect groups in the EU, e.g. for grasshoppers, and is likely to be even better for butterflies. We envisage the growing databases of opportunistic data becoming an important source of information to track trends in multiple species groups. When owners of opportunistic data are prepared to cooperate in a Pan-European network, it should be feasible to achieve Pan-European trends in distribution for a number of species groups in the future. The usefulness of databases with opportunistic data can be further enhanced by encouraging the collection of day-lists rather than of single records (van Strien et al. 2010). However, trend information derived from opportunistic data will only be reliable if sufficient attention is given to using appropriate methods of analysis.
Our approach could make it possible to compile large-scale multispecies indicators, based on averaging annual indices per species. Such indicators resemble existing indicators for breeding birds and grassland butterflies (Gregory et al. 2005; European Environmental Agency 2007; van Swaay et al. 2008). But for dragonflies a supranational indicator would be based on changes in distribution rather than in population abundance as in the indicators for birds and butterflies. Finally, occupancy models make it possible to produce annual species distribution maps from opportunistic data (Kéry 2011; van Strien et al. 2011), which may facilitate large-scale studies on climate change, e.g. to compare range shifts of various species groups driven by climatic change (Devictor et al. 2012).
The work of AJvS is facilitated by the BiG Grid infrastructure for eScience (www.biggrid.nl). The DragonflyIreland project (2000–2003) was supported financially by National Parks and Wildlife Service, Northern Ireland Environment Agency and the National Museums Northern Ireland. SP’s post is funded by Countryside Council for Wales, Environment Agency, Esmée Fairbairn Foundation, Natural England and Scottish Natural Heritage. The work of CV is supported by ARCir BioImpact (Arrêté 11000922). We are grateful to Grégory Motte for organising field work in Wallonia, to Willy van Strien for critically reading the draft and to Rita Gircour for improving the English. Marcel Straver helped in preparing Fig. 2. Finally, we thank all contributors to the dragonfly recording databases in each country.