Quality of citizen science data and its consequences for the conservation of skipper butterflies (Hesperiidae) in Flanders (northern Belgium)
Citizen science projects have become important data sources for ecologists. However, opportunistic data are not only characterized by spatial and temporal biases, but probably also contain species identification errors, especially concerning morphologically similar species. Such misidentifications may result in wrongly estimated distribution ranges and trends, and thus in inadequate conservation measures. We illustrate this issue with three skipper butterflies (Hesperiidae) in Flanders (northern Belgium) using photographs uploaded with observations in data portals. Ochlodes sylvanus and Thymelicus lineola records had relatively low identification error rates (1 and 11 %, respectively), but the majority (59 %) of Thymelicus sylvestris records turned out to be misidentified. Using verified records only allowed us to model their distribution more accurately, especially for T. sylvestris whose actual distribution area had hitherto been strongly overestimated. An additional field study on T. sylvestris confirmed the species distribution model output as the species was almost completely restricted to sites with verified records and was largely absent from sites with unverified records. The preference of T. sylvestris for unimproved grasslands was confirmed by the negative correlation between its model-predicted presence and elevated nitrogen and ammonia levels. Thus, quality control of citizen science data is of major importance to improve the knowledge of species distribution ranges, biotope preferences and other limiting factors. This, in turn, will help to better assess species conservation statuses and to suggest more appropriate management and mitigation measures.
Keywords: Aerial ammonia pollution; Nitrogen-induced environmental change; Ochlodes sylvanus; Species distribution modelling; Thymelicus lineola; Thymelicus sylvestris
Citizen science, especially when supported by online tools, is clearly benefitting the fields of ecology and conservation, as it allowed for a step change in both the amount of data and the spatial scale at which they are collected (Dickinson et al. 2012). In Red List assessments, for example, opportunistically collected citizen science data are regularly used to calculate trends and distribution areas of species (Maes et al. 2015), which are two important criteria to estimate a species’ extinction risk (Mace et al. 2008). Opportunistic records, however, often induce problems in citizen science datasets because of their non-random sampling nature (Isaac and Pocock 2015). Different techniques have been applied to correct for spatial (Hill 2012) and temporal biases (Isaac et al. 2014) in such datasets. However, there is a need for increased emphasis on data quality, which is not only impacted by the temporal and/or spatial coverage of surveys, but also by the observer’s ability to correctly identify species (Dickinson et al. 2012; Hochachka et al. 2012; Kelling et al. 2015). Indeed, inexperienced citizen scientists are likely to induce errors in online databases by uploading misidentified species records. Unconditional use of such data may lead to, for instance, overestimations of distribution ranges, and thus underestimations of extinction risks, with erroneous conservation decisions as a possible consequence (Mace 1994). Therefore, an important issue when using citizen science data from online data portals is the reliability of the volunteers’ identification skills. This should not be a problem for some very conspicuous and unmistakable species (e.g. fox squirrel in the USA—Tye et al. 2016). However, some species groups are much more difficult to identify, which can sometimes only be done with certainty by experts and/or involves microscopic (e.g. micro-moths) or genetic analyses (e.g. Dincă et al. 2011).
Butterflies are popular organisms in citizen science, and often represent a large proportion of the records in online data portals (e.g. Maes et al. 2016). Such data allow for the compilation of distribution atlases and Red Lists at continental (Kudrna et al. 2011; van Swaay et al. 2011), national (e.g. The Netherlands—Bos et al. 2006; van Swaay 2006) and regional scales (e.g. Flanders—Maes et al. 2012, 2013). In Flanders (northern Belgium), for example, citizen scientists have increasingly been uploading butterfly records via http://www.waarnemingen.be, the online data portal of Natuurpunt, the largest nature NGO in Flanders. The data from http://www.waarnemingen.be are integrated in the global data portal observado.org. This resulted in a strong increase in the number of records in recent years (on average ca. 43,000 records/year in the period 2000–2010, vs. 101,000 records/year during the period 2011–2014—Maes et al. 2016). Flanders is a small (ca. 13,500 km2) region with only 70 indigenous and/or regular migrant butterfly species, of which 19 are extinct (Maes et al. 2012). Despite this low present-day species diversity (i.e. 51 species) and the existence of excellent field guides (Wynhoff et al. 2014), some species remain difficult to identify by inexperienced volunteers (e.g. Polyommatus icarus vs. Aricia agestis, Colias hyale vs. C. croceus, Pieris spp.). Another group with morphologically similar species are the skipper butterflies (Hesperiidae). Differences among these species are often subtle and some experience is, therefore, needed to correctly identify them in the field (Louy et al. 2007). However, the advantage of data portals is that photographs can be added to the uploaded observations. In Flanders, for instance, 11 % of all butterfly observations are accompanied by photographs (Maes et al. 2016). 
By carefully checking uploaded photographs, the original species identification can be verified, communicated to the observer and, if necessary, corrected in the data portal. This validation step strongly increases the quality of the data and, therefore, the reliability of projects using citizen science data (Kelling et al. 2015).
Here, we address this data quality issue in citizen science projects, by (i) contrasting the outcomes of species distribution modelling using either all records or only verified records of three often misidentified skipper butterflies (Hesperiidae: Ochlodes sylvanus, Thymelicus lineola and Thymelicus sylvestris) in Flanders, and (ii) performing a field survey which contrasts sites with and without photographic evidence of T. sylvestris, which among the three skipper butterflies in Flanders is the rarest and the species whose alleged observations display the highest misidentification rate. This approach allowed us to gain a better insight into the biotope preferences and tolerance levels towards aerial ammonia and nitrogen deposition of these three skipper butterflies in Flanders and also to suggest an update of their regional Red List status. We discuss the importance of correctly identified species records in citizen science projects for conservation and policy actions that make use of opportunistic data collected by volunteers.
Materials and methods
Study area and species
Flanders covers an area of 13,522 km² and is situated in the north of Belgium (Fig. 1). It is mainly covered by agricultural land (51 % of the area) and urban areas (30 %). The population density is very high (477 inhabitants/km²; statbel.fgov.be), and semi-natural areas are not only limited in extent but also highly fragmented (Poelmans and Van Rompaey 2009). This has led to a very high pressure on biodiversity and resulted in strong regional declines for several species groups (e.g. plants: Van Landuyt et al. 2008; butterflies: Maes and Van Dyck 2001).
In Flanders, three skipper butterflies—O. sylvanus (Esper, 1777), T. lineola (Ochsenheimer, 1808), and T. sylvestris (Poda, 1761)—are suspected to be regularly misidentified, and thus incorrectly entered in online data portals by inexperienced volunteers. According to field guides and text books, these three species share similar ecological resources (e.g. host and nectar plants, basking sites) and life-history traits (e.g. number of generations, flight period—Dennis 2010). Their biotope in NW-Europe is usually described as grasslands in the vicinity of woodlands (Bink 1992). Additionally, all three species are rather small and have an orange ground colour (Lafranchis 2004), making it difficult for inexperienced volunteers to correctly distinguish between them. Many recorders, fortunately, add photographs to their uploaded observations, which allows for a posteriori verification and validation by butterfly experts. The validation criteria for a positive identification of the three species, either in the field or from uploaded photographs, are given in Table 1. In order to quantify the extent of misidentifications, we subsampled all Flemish records of the three species from the period 2013–2014 and checked the 1739 photographs uploaded with them.
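Table 1. Validation criteria for a positive identification of the three skipper butterflies. Only fragments of the table survive: the criteria "shape of the antennal club" and "underside of the antennal club", and the entry "orange base/black tip".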
Species distribution modelling
In order to model the potential distribution area of the three skipper butterflies in Flanders, all data of O. sylvanus, T. lineola and T. sylvestris from the period 2011–2015 were retrieved from http://www.waarnemingen.be (n = 18,958). All observations were attributed to 1 × 1 km² grid cells of the Universal Transverse Mercator (UTM) projection (n = 14,344). A species was considered present in a grid cell when it was recorded as such in the data portal. To define grid cells in which the species was absent, we used grid cells that were visited at least 20 times by butterfly experts without any observation of the species (Table 2).
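The presence/absence rule described above can be sketched in a few lines. This is a minimal Python illustration, not the authors' workflow: the function names, the coordinate handling (projected coordinates in metres, floored to 1 km) and the toy data are all assumptions made for the example; only the 20-visit threshold comes from the text.

```python
MIN_EXPERT_VISITS = 20  # threshold stated in the text


def cell_id(x_m, y_m, size_m=1000):
    """Map projected coordinates (metres) to the index of a 1 x 1 km cell."""
    return (int(x_m // size_m), int(y_m // size_m))


def presence_absence(records, expert_visits, species):
    """Return (present_cells, absent_cells) for one species.

    records: iterable of (species_name, x_m, y_m) tuples.
    expert_visits: dict mapping a cell index to the number of expert visits.
    """
    present = {cell_id(x, y) for sp, x, y in records if sp == species}
    absent = {
        cell
        for cell, n_visits in expert_visits.items()
        if n_visits >= MIN_EXPERT_VISITS and cell not in present
    }
    return present, absent


# Toy data (hypothetical coordinates and visit counts)
records = [
    ("T. sylvestris", 1250.0, 3400.0),
    ("T. sylvestris", 1900.0, 3999.0),  # falls in the same cell as above
    ("O. sylvanus", 5100.0, 700.0),
]
expert_visits = {(1, 3): 40, (5, 0): 25, (7, 7): 12, (2, 2): 20}

present, absent = presence_absence(records, expert_visits, "T. sylvestris")
print(sorted(present))  # [(1, 3)]
print(sorted(absent))   # [(2, 2), (5, 0)]
```

Cells with fewer than 20 expert visits, such as (7, 7) in the toy data, are treated as neither presence nor absence and would be left out of the calibration set.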
Table 2. Overview of the number of grid cells in which the species was observed (all records = PrsA; verified records only = PrsV) in the period 2011–2015; well-surveyed grid cells in which the species was considered absent (Abs); total number of grid cells in the calibration set (all records = CalA; verified records only = CalV)
Surviving values (row and column assignment not recoverable): 1733 (60 %); 615 (53 %); 63 (28 %)
Per 1 × 1 km² grid cell, we calculated the area of 11 land use types (source: Biological Valuation Map of Flanders; De Saeger et al. 2014; Table 3) and two soil variables (source: soil map of Flanders; OC-GIS Vlaanderen 2001): (i) soil texture, ranging from 1 (clay) to 8 (sand), and (ii) soil moisture, ranging from 1 (very dry) to 8.5 (very wet). Soil texture is an important variable because it can strongly influence the microclimate (Titeux et al. 2009), while soil moisture is important for the survival of the non-adult life stages (Tjørnløv et al. 2015). Prior to analyses, we normalised all variables using a square root transformation. Subsequently, these variables were used in species distribution models for each of the three skipper butterflies. Collinearity among variables was checked with the Variance Inflation Factor (VIF) function in the R package ‘car’ (version 2.1.1; Fox and Weisberg 2011), and variables with VIF values ≥ 3 were excluded from the analysis. In order to model the distribution of the three skipper butterflies in Flanders, we used the biomod2 package (Thuiller et al. 2012) in R version 3.1.1 (R Core Team 2015). We applied five different modelling algorithms that are frequently used in species distribution modelling (Elith et al. 2006; Li and Wang 2013): Generalized Additive Models (GAM; Hastie and Tibshirani 1987; to avoid overfitting, we limited the number of knots to 5), Generalized Boosted Regression Modelling (GBM; Friedman et al. 2000), Generalized Linear Models (GLM; McCullagh and Nelder 1989), Maximum Entropy (MaxEnt; Phillips et al. 2006) and Random Forest (RF; Breiman 2001). The dataset was split into a calibration set (70 %), with which the models were built, and an evaluation set (30 %), with which they were evaluated. Per species, we performed 20 random splits, resulting in a total of 100 model runs per species (5 modelling techniques × 20 random splits).
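To make the collinearity screen concrete, the sketch below computes variance inflation factors in plain Python. It is an illustration only: the study used the R package ‘car’, whereas here the VIF of each variable is taken as the corresponding diagonal element of the inverse correlation matrix, which is the standard definition for numeric predictors. The helper names and the three toy variables are hypothetical; only the cut-off of 3 comes from the text.

```python
import math

VIF_CUTOFF = 3.0  # variables with VIF >= 3 were excluded in the study


def correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per variable: diagonal of the inverse correlation matrix."""
    corr = [[correlation(a, b) for b in columns] for a in columns]
    inv = invert(corr)
    return [inv[j][j] for j in range(len(columns))]


# Toy predictors (hypothetical): var_b is strongly correlated with var_a,
# var_c is uncorrelated with both.
var_a = [-1, -1, 1, 1, -1, -1, 1, 1]
var_b = [-1.5, -1.5, 0.5, 0.5, -0.5, -0.5, 1.5, 1.5]
var_c = [-1, 1, -1, 1, -1, 1, -1, 1]

values = vif([var_a, var_b, var_c])
flagged = [name for name, v in zip(["var_a", "var_b", "var_c"], values)
           if v >= VIF_CUTOFF]
print(flagged)  # ['var_a', 'var_b']
```

In practice one of the two flagged variables would be dropped and the VIFs recomputed, since removing either one removes the collinearity for the other.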
Apart from the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC—Swets 1988) with which the models were evaluated, we also calculated the sensitivity of the different models as a measure of the correctly predicted presences. Finally, we applied the ensemble modelling approach in biomod2 (Araújo and New 2007) to produce a predicted species distribution map, using only models with an AUC ≥ 0.7.
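The evaluation steps described above (a random 70/30 split, AUC, sensitivity, and the AUC ≥ 0.7 filter for the ensemble) can be sketched as follows. This is a plain-Python stand-in rather than the biomod2 implementation: the function names are hypothetical, the AUC is computed from its rank interpretation, and the 0.5 presence cut-off used for sensitivity is an assumption, since the text does not state how predictions were thresholded.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a presence outscores an absence.

    Ties between a presence and an absence count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity(labels, scores, threshold=0.5):
    """Share of true presences predicted as present (assumed cut-off)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    return sum(s >= threshold for s in pos) / len(pos)


def split_70_30(items, rng):
    """One random calibration (70 %) / evaluation (30 %) split."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def keep_for_ensemble(run_aucs, min_auc=0.7):
    """Indices of model runs that pass the AUC filter."""
    return [i for i, a in enumerate(run_aucs) if a >= min_auc]


# Toy evaluation set: 1 = presence, 0 = absence (hypothetical scores)
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(round(auc(labels, scores), 3))          # 0.889
print(round(sensitivity(labels, scores), 3))  # 0.667

calibration, evaluation = split_70_30(list(range(10)), random.Random(1))
print(len(calibration), len(evaluation))      # 7 3
print(keep_for_ensemble([0.65, 0.70, 0.74]))  # [1, 2]
```

Repeating the split 20 times for each of the five algorithms would give the 100 runs per species mentioned in the text.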
Table 3. Average overall values with 95 % confidence intervals per 1 × 1 km² UTM grid cell for the variables used in the modelling, and their species-specific average values (±SE) for grid cells with recent (2011–2015) verified species records
(The overall-average column and four row labels were not recovered; the species columns are assigned from the comparisons reported in the text.)

Variable                              O. sylvanus       T. lineola        T. sylvestris
Arable land (ha)                      20.17 ± 0.46a     25.20 ± 0.88b     11.68 ± 1.63c
Nutrient-rich grasslands (ha)         16.31 ± 0.33ac    17.50 ± 0.54a     12.36 ± 1.35bc
(label not recovered)                 2.32 ± 0.23a      1.08 ± 0.23b      3.17 ± 1.08a
Small landscape elements (ha)         1.48 ± 0.05       1.53 ± 0.08       1.86 ± 0.28
(label not recovered)                 2.15 ± 0.12       1.77 ± 0.17       2.30 ± 0.59
Unimproved rough grasslands (ha)      2.29 ± 0.11a      2.17 ± 0.16a      6.52 ± 1.26b
(label not recovered)                 2.11 ± 0.11a      2.02 ± 0.23a      3.41 ± 0.57b
(label not recovered)                 22.98 ± 0.52a     20.28 ± 0.80bc    18.74 ± 2.54ac
Nutrient-poor dry grassland (ha)      3.87 ± 0.15       4.09 ± 0.27       5.71 ± 0.97
Nutrient-poor wet grassland (ha)      1.02 ± 0.07a      2.02 ± 0.25b      1.88 ± 0.42b
Woodland edge (ha)                    5.71 ± 0.14a      5.17 ± 0.26b      7.87 ± 0.81c
Soil moisture (1 = dry–8 = wet)       3.82 ± 0.03a      3.67 ± 0.05b      3.48 ± 0.16b
Soil texture (1 = clay–8.5 = sand)    5.37 ± 0.04a      5.13 ± 0.08b      5.70 ± 0.21a
Detailed field study of T. sylvestris
Since T. sylvestris was the rarest of the three skipper butterflies in Flanders (Maes et al. 2013) and since its records were suspected to have the highest identification error rate, we performed a more detailed survey of this species to get a better understanding of its regional distribution and biotope preferences. To select suitable study sites, we compiled all observations of T. sylvestris from the period 2013–2014 and removed duplicate locations (i.e. same date/location but from different observers). The resulting dataset contained 190 records and was split into two groups: (i) type A locations, i.e. records with photographic confirmation (n = 42) and (ii) type B locations, i.e. records without photographic evidence (n = 148). Additionally, we selected a third group, type C locations, which consisted of semi-natural grasslands in the vicinity (1–5 km) of sites with verified records (type A locations) that were identified as potentially suitable habitat for T. sylvestris based on the Biological Valuation Map of Flanders (De Saeger et al. 2014). From each of the location types, we randomly selected a subset of 15 locations with a minimum intragroup distance of 5 km. A minimum intergroup distance of 15 km was used between locations of type B and locations of type A/C (Fig. 2). During the main flight period of all three species (6–17 July 2015), we visited the 45 selected locations (type A, B and C) and recorded the abundance of each species. When necessary, butterflies were caught with a net to ensure correct identification. In each location, a visit consisted of a 45-min walk in grassland habitat within a maximum radius of 500 m from the selected location. These visits only took place during favourable weather conditions (i.e. ambient temperature >20 °C, wind speed <3 Beaufort and sunny conditions; van Swaay et al. 2008). In order to minimize biases caused by weather conditions (cf. Wikström et al. 2009), sites from the three different location types were visited simultaneously by three different butterfly experts (PVT, AK and TM).
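The random selection of 15 locations per type with a minimum intragroup distance of 5 km could be implemented in several ways; the text does not say which one was used. The sketch below shows one simple option, a greedy pass over a shuffled candidate list, with hypothetical function names and toy coordinates expressed in kilometres.

```python
import math
import random


def select_sites(candidates, n_sites, min_dist_km, rng):
    """Greedy random selection under a minimum pairwise distance.

    candidates: list of (x_km, y_km) tuples. A shuffled candidate is kept
    only if it lies at least min_dist_km from every site already kept.
    May return fewer than n_sites if the candidates are too clustered.
    """
    pool = candidates[:]
    rng.shuffle(pool)
    chosen = []
    for site in pool:
        if all(math.dist(site, other) >= min_dist_km for other in chosen):
            chosen.append(site)
        if len(chosen) == n_sites:
            break
    return chosen


# Toy candidates: a 2 km lattice over a 50 x 50 km square (hypothetical)
candidates = [(x, y) for x in range(0, 50, 2) for y in range(0, 50, 2)]
sites = select_sites(candidates, n_sites=15, min_dist_km=5.0,
                     rng=random.Random(42))
print(len(sites))  # 15
```

The 15 km intergroup rule between type B and type A/C locations could be added in the same way, by also checking each candidate against the sites already chosen for the other groups.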
Differences in observed abundance among the three location types, the three species and the three observers were analysed using Generalized Linear Models (GLM) with a negative binomial distribution (McCullagh and Nelder 1989). Since the species × location type interaction was highly significant (χ² = 43.518, df = 4; p < 0.0001), separate models were tested for each of the three skipper butterflies. In order to test relationships between species abundance and nitrogen-related pollution, species-specific GLMs with a negative binomial distribution were run with either the aerial ammonia concentration or the total nitrogen deposition as fixed effects. Ammonia concentration and nitrogen deposition were calculated per UTM 1 × 1 km² grid cell based on air quality data in 2014 (VMM 2015) and refer to the yearly average aerial ammonia concentration (μg NH₃/m³) and the total amount of nitrogen deposition (kg NHx + NOx/ha/year).
Results
Verification of the 1739 records with photographic evidence of the three skipper butterflies showed that only 1 % of the O. sylvanus observations appeared to be incorrect. The misidentification percentage of the T. lineola records was 11 %, with the large majority (95 %) of misidentifications consisting of O. sylvanus. Regarding T. sylvestris, the majority (59 %) of the observations were incorrect, with 68 % of these misidentifications consisting of O. sylvanus and 32 % of T. lineola (Table 4).
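The two kinds of percentage reported above (the share of records of a species that were misidentified, and the breakdown of those errors by the species actually shown) can be computed from pairs of claimed and verified identifications. The sketch below is illustrative only: the function name is hypothetical and the ten toy records are invented to show the arithmetic, not taken from the study.

```python
from collections import Counter


def misidentification_summary(pairs, claimed_species):
    """Return (error rate in %, {actual species: % of the errors}).

    pairs: iterable of (claimed, verified) species names for records
    that were uploaded with a photograph.
    """
    verified = [v for c, v in pairs if c == claimed_species]
    wrong = [v for v in verified if v != claimed_species]
    error_rate = 100 * len(wrong) / len(verified)
    breakdown = {sp: 100 * k / len(wrong) for sp, k in Counter(wrong).items()}
    return error_rate, breakdown


# Toy data (hypothetical): 10 records uploaded as T. sylvestris
pairs = (
    [("T. sylvestris", "T. sylvestris")] * 4
    + [("T. sylvestris", "O. sylvanus")] * 4
    + [("T. sylvestris", "T. lineola")] * 2
)
rate, breakdown = misidentification_summary(pairs, "T. sylvestris")
print(rate)  # 60.0
print({sp: round(p, 1) for sp, p in breakdown.items()})
```

Note that the two percentages have different denominators: the error rate is relative to all checked records of the claimed species, while the breakdown is relative to the misidentified records only.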
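Table 4. Number of records and misidentifications of recent records (2013–2014) of three orange skipper species uploaded with photo in the online data portal http://www.waarnemingen.be
(Record counts were not recovered; the error rates are those reported in the text.)

Originally classified as    Error rate (%)
O. sylvanus                 1
T. lineola                  11
T. sylvestris               59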
Ecological differences among the three skipper butterflies
Comparing grid cells with verified records of the three skipper butterflies revealed subtle differences in their ecological preferences (Table 3): T. lineola occurred in grid cells with more arable land and nutrient-rich grasslands than T. sylvestris and O. sylvanus, and more nutrient-poor wet grasslands compared to O. sylvanus. O. sylvanus occurred in grid cells that are wetter and more urbanized than grid cells with the other two species, while grid cells with T. sylvestris had more unimproved (nutrient-poor) rough grasslands, scrub and woodland edges than grid cells with the other two species. Both O. sylvanus and T. sylvestris occurred on more sandy soils and in grid cells with more heathland than is the case for T. lineola (Table 3). The most important variables explaining the species distribution models were urban area, woodland edge and soil texture (sandy soils) for O. sylvanus, arable land, urban area and nutrient-rich grasslands for T. lineola and unimproved rough grasslands, moisture (dry) and urban area for T. sylvestris.
Species distribution modelling
Applying species distribution models using all (unverified) records for the three skipper butterflies in Flanders resulted in fair models (AUC > 0.7) for O. sylvanus and T. sylvestris but in a poor model (AUC < 0.7) for T. lineola. Using only verified records improved the model performance for O. sylvanus and T. sylvestris, but decreased that of T. lineola (Table 5; Fig. 3). The number of correctly predicted presences (i.e. sensitivity) was slightly higher for O. sylvanus using verified records, but did not differ significantly when using verified versus all records for the two other species (Table 5). The use of verified distribution records only versus all distribution records for modelling the potential distribution of the three skipper butterflies, however, resulted in moderate declines of the number of grid cells with predicted presences for T. lineola (−18 %) and O. sylvanus (−27 %), and in a strong decline for T. sylvestris (−51 %; Table 5; Fig. 3). Results were very comparable when using different threshold values for the selection of grid cells in which the species were considered as absent (i.e. 10 or 30 visits by butterfly experts; results not shown).
Table 5. Summary of average AUC (±SD) of the models with all (AUCa) and with verified records only (AUCv), the sensitivity (i.e. the percentage of correctly predicted presences) with all (SENSa) and with verified records only (SENSv), the number of model runs with AUC ≥ 0.7 with all (n0.7a) and with verified records only (n0.7v) and modelled distribution area (number of 1 × 1 km grid cells) using all (Na) and verified records only (Nv). Difference indicates the difference (in %) between the number of predicted grid cells using all records (Na) and using verified records only (Nv)
(Only the AUC columns were recovered.)

Species          AUCa              AUCv
O. sylvanus      0.703 ± 0.02      0.732 ± 0.02***
T. lineola       0.695 ± 0.02**    0.687 ± 0.02
T. sylvestris    0.704 ± 0.03      0.729 ± 0.05***
Grid cells in which T. sylvestris was predicted as present had a lower aerial ammonia concentration (−33.3 %) and a slightly lower nitrogen deposition (−4.4 %) than grid cells in which the species was predicted as absent (Table 6). For O. sylvanus and T. lineola, the opposite was true with grid cells in which both species were predicted as present having higher aerial ammonia concentrations (+10.4 and +83.5 %, respectively) and slightly higher or similar nitrogen deposition values (+4.4 and +0.4 %, respectively—Table 6). The grid cells in which T. sylvestris was predicted as present had on average the lowest values for aerial ammonia concentration and T. lineola the highest values, with O. sylvanus being intermediate (Table 6). For nitrogen deposition, the grid cells in which T. sylvestris was predicted as present had, on average, the lowest values and O. sylvanus the highest, with T. lineola being intermediate (Table 6).
Table 6. Average aerial ammonia concentration (in μg NH₃/m³ ±SE) and nitrogen deposition (in kg N/ha/year ±SE) in 1 × 1 km² grid cells where species were predicted as present versus absent, based on all records or using verified records only

Aerial ammonia concentration
Species          All: present        All: absent        Verified: present    Verified: absent
O. sylvanus      a 4.70 ± 0.03***    3.45 ± 0.04        a 4.58 ± 0.04***     4.15 ± 0.04
T. lineola       b 5.44 ± 0.04***    2.79 ± 0.02        b 5.69 ± 0.04***     3.10 ± 0.03
T. sylvestris    c 3.25 ± 0.09       4.45 ± 0.03***     c 2.95 ± 0.14        4.42 ± 0.03***

Nitrogen deposition
Species          All: present        All: absent        Verified: present    Verified: absent
O. sylvanus      a 23.1 ± 0.04***    22.2 ± 0.07        a 23.2 ± 0.05***     22.3 ± 0.05
T. lineola       b 22.8 ± 0.05       22.9 ± 0.06        b 22.9 ± 0.05        22.8 ± 0.05
T. sylvestris    c 22.2 ± 0.15       22.9 ± 0.04***     c 21.9 ± 0.19        22.9 ± 0.04***
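The relative differences in aerial ammonia concentration quoted in the text can be recomputed from the Table 6 means for the models based on verified records. The short Python check below is only a worked example of that arithmetic; the function and variable names are hypothetical, and the values are the present/absent means from Table 6.

```python
def pct_diff(present, absent):
    """Relative difference (%) of the 'present' mean versus the 'absent' mean."""
    return 100 * (present - absent) / absent


# (mean where predicted present, mean where predicted absent), in ug NH3/m3,
# taken from Table 6 for the models built with verified records only
ammonia_verified = {
    "O. sylvanus": (4.58, 4.15),
    "T. lineola": (5.69, 3.10),
    "T. sylvestris": (2.95, 4.42),
}

diffs = {sp: round(pct_diff(p, a), 1) for sp, (p, a) in ammonia_verified.items()}
print(diffs)
# {'O. sylvanus': 10.4, 'T. lineola': 83.5, 'T. sylvestris': -33.3}
```

These reproduce the +10.4 %, +83.5 % and −33.3 % given in the text. The same calculation on the nitrogen deposition means reproduces the T. lineola and T. sylvestris figures, while the O. sylvanus figure differs slightly, presumably because the tabulated means are rounded to one decimal.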
For T. sylvestris, plot types differed in abundance (χ² = 51.635, df = 2, p < 0.0001), with type A plots (plots with photographic evidence) containing 99 and 96 % more individuals than type B plots (plots without photographic evidence) and type C plots (semi-natural grasslands in the vicinity (1–5 km) of type A plots), respectively (Fig. 4). For the two other species, no differences in abundance were detected among the different plot types (p > 0.05) (Fig. 4).
Differences in the abundance of the three skipper butterflies were not explained by the site-specific amounts of nitrogen deposition (p > 0.05) (Fig. 5), and there was no difference in total nitrogen deposition among the plot types (p = 0.18). Plots with verified records of T. sylvestris (plot type A) were characterised by a lower aerial ammonia concentration than plots with unconfirmed records (plot type B) and plots in the vicinity of locations with confirmed records (plot type C) (A–B: −40 %, t = 1.97, p = 0.056; A–C: −30 %, t = 1.21, p = 0.023). Aerial ammonia concentrations did not have an influence on the abundance of T. lineola (p = 0.90). O. sylvanus was more abundant in locations with high aerial ammonia concentrations (z = 2.33, p = 0.020) and T. sylvestris was more abundant in locations with low aerial ammonia concentrations (z = −2.07, p = 0.039). T. sylvestris reached high abundances when aerial ammonia concentrations were below 3 μg NH₃/m³, low abundances at aerial ammonia concentrations of 3–7 μg NH₃/m³ and was absent when aerial ammonia concentrations were higher than 7 μg NH₃/m³ (Fig. 5).
Discussion
Citizen science has become a very useful contribution to ecological research and conservation biology (Dickinson et al. 2012). The often opportunistic nature of the citizen science data, however, can cause spatial and/or temporal biases. A third possible bias is observer quality, i.e. the variation in identification skills among citizen scientists (Isaac and Pocock 2015; Kelling et al. 2015). Here, we showed that the use of verified records (using uploaded photographs in a regional online data portal) for species distribution modelling of three often misidentified skipper butterflies (Hesperiidae: O. sylvanus, T. lineola and T. sylvestris) in Flanders (northern Belgium), resulted in smaller distribution ranges than previously estimated using non-verified records. A field study on T. sylvestris, the rarest of the three species whose records turned out to be most often misidentified, confirmed its more restricted distribution and more specialist biotope preferences compared to the other two species. Data quality control is an often neglected issue in citizen science projects, but is of major importance when using opportunistic data in Red List assessments (Maes et al. 2015) or for management and policy recommendations.
Verification of records of the three skipper butterflies showed that, as predicted, T. sylvestris records were most often misidentified (Table 4). Surprisingly, given the difference in body size, a high proportion of alleged T. sylvestris records turned out to be O. sylvanus. A possible explanation for this misidentification is that T. sylvestris and O. sylvanus resemble each other in the underside of the antennal club (orange) and in the androconial stripe (long and curved) (Table 1), when observed from the front or from above, respectively.
In NW Europe, the number of butterfly species is relatively low and good field guides are available in the local languages (e.g. Wynhoff et al. 2014). One would expect this to lead to a large proportion of correctly identified observations in citizen science data portals. Beginning and/or inexperienced volunteers, however, are not as skilled in the field as trained butterfly experts, especially when it comes to butterflies in flight, and they could thus introduce errors into citizen science datasets. Apart from the three skipper butterflies we discussed here, other groups of similar-looking species are also likely to be misidentified by beginning recorders. Examples of such species groups are whites and yellows (e.g. Pieris spp., Colias spp.), blues (e.g. Polyommatus icarus and Aricia agestis) and satyrids (e.g. Maniola jurtina and Pyronia tithonus). In more species-rich areas (e.g. southern or eastern Europe, mountainous regions in Central Europe), however, we expect even higher identification error rates due to the much larger number of morphologically similar species (e.g. Pyrgus, Melitaea, Polyommatus spp.).
A recommendation to improve the quality of citizen science data portals is to encourage volunteers to add photographs to their uploaded observations (nowadays even smartphone cameras are usually of high enough quality for this purpose), to allow for an a posteriori verification by butterfly experts. Currently, observers are much more likely to upload photographic proof the rarer a species is. However, here we show that uploading photographs should become standard practice even for species that are allegedly common, as population and distribution declines can go largely unnoticed due to misidentifications of morphologically similar species. Researchers should properly communicate the importance of verifiable records to citizen scientists and give feedback/training on species identification, which in turn may be an extra motivation for citizen scientists to participate and further enhance the quality (and quantity) of the data they contribute (Tweddle et al. 2012). Obviously, this applies not only to butterflies but to all species groups for which opportunistic citizen science data are regularly used in ecological and conservation studies.
Ecological differences among the three skipper butterflies
We found subtle species differences in ecological preferences, which are often not described in detail in the literature (e.g. Bink 1992) or for which existing literature is not regionally applicable. In SW Germany, for example, T. sylvestris and O. sylvanus are described as being rather generalist species (occurring in dry to wet grasslands, not necessarily near woodland edges) whereas T. lineola is called a more specialist species (only dry grasslands in the neighbourhood of woodlands—Ebert and Rennwald 1993; Louy et al. 2007; Engler et al. 2014), while the inverse seems to be true in Flanders. In the UK, the biotopes of T. sylvestris and T. lineola are described as dry, while the habitat of O. sylvanus is described as more damp than for the two Thymelicus species (Asher et al. 2001), which resembles the situation in Flanders. For the Netherlands, the habitat of the three skipper butterflies is described rather broadly: sheltered, damp and rough grasslands for O. sylvanus, rough grasslands for T. lineola, rough grasslands, woodlands, marshes and reed beds for T. sylvestris (Bos et al. 2006).
In Flanders, we show that O. sylvanus and T. lineola occur in grid cells with a higher amount of arable land and nutrient-rich grasslands, reflecting a greater tolerance to agricultural intensification, compared to T. sylvestris. The higher amount of unimproved rough grassland, scrub and woodland edge also shows the more stringent habitat requirements of T. sylvestris compared to the two other skippers (Table 3). This probably explains its much more restricted distribution, which is mainly situated in the eastern part of Flanders, a region where nutrient-poor biotopes (e.g. unimproved grassland, heathland) are more widespread than in the more urbanised and even more agricultural western part of Flanders (Maes et al. 2013). The, on average, higher amount of urban area in grid cells with O. sylvanus, compared to the two other species, is probably explained by its greater dispersal capacity, which allows reaching, for example, urban road verges or urban parks (Asher et al. 2001), two habitat types that are either unsuitable for the habitat specialist T. sylvestris and/or unreachable for the less mobile Thymelicus species (Dennis 2010; Engler et al. 2014).
Species distribution modelling
Species distribution modelling is increasingly used in conservation decisions and planning (Guisan et al. 2013; Tulloch et al. 2016). Many algorithms are now readily available in open access statistical packages (e.g. R) that allow the combined use of different modelling techniques (e.g. Araújo and New 2007; Thuiller et al. 2012). The use of incomplete (e.g. Hamilton et al. 2015) or biased datasets (e.g. Beck et al. 2014), however, can lead to erroneous outcomes, resulting in over- or underestimations of distribution ranges or trends. For all three skipper butterflies for which we made species distribution models, we observed a moderate (−18 and −27 % for T. lineola and O. sylvanus, respectively) to strong decline (−51 % for T. sylvestris) in the number of grid cells in which the species was predicted as present when using only verified records. This clearly emphasizes the importance of data quality control prior to analyses when using volunteer observations from citizen science projects (Isaac and Pocock 2015). Restricting the models to verified records only probably allows for a better fit between the species’ presences and the explanatory variables. Conversely, using unverified records, and thus partly the wrong species, will inevitably blur the relationships between species and the (a)biotic variables used in the models. The small decrease in model performance for T. lineola using only verified records might be due to its occurrence in a broader range of biotopes than the two other species.
Consequences for conservation
Red List status: distribution range and population trends
In Flanders, opportunistic data collected by citizen scientists are the main source for distribution atlases (e.g. Maes et al. 2013) and Red List assessments (e.g. Maes et al. 2012). The recently published atlas of butterflies in Flanders (Maes et al. 2013) gives species distribution maps based on records in http://www.waarnemingen.be that were not all verifiable with uploaded photographs. Although many butterfly distribution atlases mention that maps of T. lineola and T. sylvestris may contain errors due to misidentifications (e.g. Ebert and Rennwald 1993; Asher et al. 2001; Bos et al. 2006), the identification error rate for T. sylvestris records in Flanders (59 %) was much higher than expected. Overestimations of distribution ranges could have clear consequences in, for example, Red List assessments (Maes et al. 2015). In Flanders, both T. lineola and T. sylvestris have been assessed as Vulnerable in the most recent IUCN Red List of butterflies, based on a presumed declining trend and/or a restricted distribution range (Maes et al. 2012). Although historical data are relatively scarce, verified museum specimens show that T. sylvestris used to be present in the whole of Flanders before 1980. The present-day distribution, however, is concentrated within the (north-)eastern part of Flanders (Maes et al. 2013). An overestimated present-day distribution range (criterion B in the IUCN Red List assessments) on the one hand, and the use of such overestimated present-day ranges to calculate a population trend (criterion A in the IUCN Red List assessments) on the other hand, could result in a lower Red List category and thus in wrong prioritisations in conservation policy. In NW Europe, (strong) declines in abundance have been observed in monitoring schemes for the three skipper butterflies, especially for the two Thymelicus species (UK—Brereton et al. 2015; the Netherlands—van Swaay et al. 2016), which show particularly strong negative associations with neonicotinoid usage, either due to a causal link or because neonicotinoid usage is a proxy for other environmental factors associated with intensive agriculture (Gilburn et al. 2015). Flanders has only a limited number of butterfly transects, and changes in abundance could, therefore, not be calculated (Maes et al. 2012). Population trends similar to those in the Netherlands and in the UK (for instance, −88 % for T. lineola and −75 % for T. sylvestris over a recent 40-year period—Brereton et al. 2015), however, are expected for T. sylvestris and T. lineola in Flanders. The use of verified records in combination with species distribution modelling allows us to direct citizen scientists to a set of grid cells for targeted surveys of particular (often misidentified) species to check their presence. The outcomes of such targeted surveys are necessary to ground-truth these models and to iteratively improve them by gradually incorporating more reliable data.
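The targeted-survey idea can be illustrated with a minimal sketch. It assumes that each grid cell carries a modelled suitability value and counts of verified and unverified records. The names `GridCell` and `select_survey_cells`, the suitability threshold and all example values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCell:
    cell_id: str
    suitability: float  # modelled probability of presence, 0..1 (hypothetical)
    verified: int       # number of photo-verified records
    unverified: int     # number of records without a verifiable photograph


def select_survey_cells(cells, min_suitability=0.5, top_n=10):
    """Return cells without verified records, ranked for a targeted survey.

    A cell qualifies if it has unverified records (possible misidentifications)
    or if the model predicts it to be suitable although no records exist.
    """
    candidates = [
        c for c in cells
        if c.verified == 0
        and (c.unverified > 0 or c.suitability >= min_suitability)
    ]
    # Cells with unverified records come first, then higher suitability.
    candidates.sort(key=lambda c: (c.unverified == 0, -c.suitability, c.cell_id))
    return candidates[:top_n]


# Hypothetical example data.
example_cells = [
    GridCell("A1", 0.82, verified=3, unverified=1),  # already verified
    GridCell("A2", 0.75, verified=0, unverified=2),  # unverified, suitable
    GridCell("B1", 0.10, verified=0, unverified=4),  # unverified, unsuitable
    GridCell("B2", 0.64, verified=0, unverified=0),  # no records, suitable
    GridCell("C1", 0.20, verified=0, unverified=0),  # no records, unsuitable
]

for cell in select_survey_cells(example_cells):
    print(cell.cell_id, cell.suitability, cell.unverified)
```

The ordering used here is one possible design choice: cells with unverified records are checked first because they directly affect the current distribution map, and suitable cells without any records follow as candidates for filling gaps. Survey results from either group could then be fed back into the model as verified presences or absences.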
Policy and management measures
Having gained new insights into biotope preferences, and more specifically that T. sylvestris has a preference for grid cells with unimproved grasslands, we subsequently tested correlations at a 1 × 1 km² scale between model-predicted presence and both aerial ammonia concentration and nitrogen deposition. These tests indicated that acidification and eutrophication may be limiting factors with regard to the occurrence of T. sylvestris, as this species turned out to be the most sensitive of the three skipper butterflies to high aerial ammonia concentrations and nitrogen deposition levels (Table 6). This probably explains its disappearance in the western part of Flanders, where much higher values are measured for these pollutants than in the eastern part (VMM 2015). Nitrogen excess can impact butterflies in different ways, such as via microclimatic cooling in early spring (Klop et al. 2015). This effect has been shown to be especially important for grassland butterflies that overwinter as eggs or larvae (WallisDeVries and van Swaay 2006). Although our three study species are all egg-larva hibernators, the negative effect of nitrogen excess is expected to be stronger in T. sylvestris since its first instar larvae are more exposed to microclimatic cooling than unhatched eggs (T. lineola) or nearly full-grown larvae (O. sylvanus—Klop et al. 2015). Excess of ammonia (leading to acidification) and increased nitrogen deposition (leading to eutrophication) are known causes of biodiversity loss, especially in NW Europe (Oenema et al. 2012). In NW Europe, where both aerial ammonia concentrations and nitrogen deposition are very high (http://www.eea.europa.eu/data-and-maps/indicators/exposure-of-ecosystems-to-acidification-2/exposure-of-ecosystems-to-acidification-3), this is exemplified by their adverse effects on butterflies in general (Maes and Van Dyck 2001) and on species of nutrient-poor grasslands (such as T. sylvestris) in particular (Stevens et al. 2010).
Agriculture is responsible for more than 90 % of the European ammonia emissions (e.g. livestock, manure management, fertilizer application—Reis et al. 2009; Skjøth et al. 2011) and could hence strongly contribute to a reduction of such emissions. Optimizing the use of air scrubbers and biofilters, for example, would considerably lower the emissions of ammonia and other pollutants in livestock facilities (Van der Heyden et al. 2015). Although the use of fertilizers has declined in Europe during the last three decades, on average 106 kg N/ha/year is still used in Belgium (http://ec.europa.eu/eurostat/statistics-explained/index.php/Agri-environmental_indicator_-_mineral_fertiliser_consumption). This amount largely exceeds the critical load of semi-natural grasslands and other nutrient-poor biotopes such as heathlands and explains the strong decline of biodiversity in this NW European region (Maes and Van Dyck 2001; Van Landuyt et al. 2008; Desender et al. 2010).
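The correlation between model-predicted presence and pollutant levels per grid cell can be sketched as follows. This is only an illustration under stated assumptions: it uses a Spearman rank correlation, which is an assumption because this section does not restate the test statistic, and the function names and all values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five 1 x 1 km grid cells.
predicted_presence = [0.9, 0.7, 0.6, 0.3, 0.1]  # model output, 0..1
ammonia = [2.0, 3.5, 4.0, 6.5, 8.0]             # arbitrary concentration units

rho = spearman(predicted_presence, ammonia)
print(round(rho, 3))  # strongly negative for these made-up values
```

A rank-based coefficient is used in this sketch because predicted presence is bounded between 0 and 1 and need not relate linearly to pollutant levels. A negative coefficient would correspond to the pattern reported for T. sylvestris, with lower predicted presence in cells with higher ammonia concentration or nitrogen deposition.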
Apart from the environmental mitigation measures described above, appropriate grassland management can also benefit the three skipper butterflies. Since they all hibernate as egg or larva, rotational mowing schemes are advisable to ensure a sufficient number of patches with tall vegetation during the overwintering period. The resulting heterogeneity has the added benefit of a continuous availability of nectar sources, on which they are highly dependent in summer. Road verges can act as functional corridors between such well-managed local habitat patches, but again, inappropriate mowing regimes can hinder exchange and expansion into new areas (Asher et al. 2001). As with mowing, grazing is only suitable for the three skipper butterflies when done at very low intensity, as this provides a sufficient amount of tall grass vegetation (WallisDeVries and Ramaekers 2001). Creating shelter (e.g. large hedgerows) along unimproved rough grasslands could bring further benefits, especially for T. sylvestris, which was shown to be more strongly dependent on woodland edges than the other two skippers.
In summary, although records from citizen science projects are a valuable contribution to ecological research and conservation biology, inexperienced volunteer recorders are likely to introduce errors into public online data portals. Therefore, quality checks of such data are essential to ensure their correct use. With a focus on three morphologically similar skipper butterflies (O. sylvanus, T. lineola and T. sylvestris) in Flanders, we showed that using only verified records resulted in different modelled distribution ranges. T. sylvestris displayed a stronger specialist biotope preference (namely for sheltered unimproved rough grasslands) and a higher sensitivity to nitrogen and ammonia pollution than O. sylvanus and T. lineola, which appear to deal better with landscapes characterised by agricultural intensification and urbanisation. Additionally, a detailed field survey contrasting verified and unverified records of T. sylvestris showed that the species was almost completely restricted to sites with verified records. In general, our study clearly exemplifies how unverified citizen science data may lead to inappropriate conservation and policy measures.
Acknowledgments
We thank all volunteers sharing skipper observations on http://www.waarnemingen.be, and are grateful to Natuurpunt Studie (Wouter Vanreusel and Karin Gielen) and Stichting Natuurinformatie for access to this database. We also thank Hans Matheve (TEREC, UGent) for help with GIS. Finally, we thank Butterfly Conservation Europe and De Vlinderstichting for the opportunity to present preliminary results of this study at the Future4Butterflies conference. We also thank two anonymous reviewers and Marc Pollet for commenting on a previous version of the manuscript.
References
- Asher J, Warren M, Fox R, Harding P, Jeffcoate G, Jeffcoate S (2001) The millennium atlas of butterflies in Britain and Ireland. Oxford University Press, Oxford
- Bink FA (1992) Ecologische atlas van de dagvlinders van Noordwest-Europa. Schuyt & Co Uitgevers en Importeurs bv, Haarlem
- Bos F, Bosveld M, Groenendijk D, van Swaay CAM, Wynhoff I, De Vlinderstichting (2006) De dagvlinders van Nederland. Verspreiding en bescherming (Lepidoptera: Hesperioidea, Papilionoidea). Nederlandse Fauna 7. Nationaal Natuurhistorisch Museum Naturalis; KNNV Uitgeverij; European Invertebrate Survey, Leiden
- Brereton TM, Botham MS, Middlebrook I, Randle Z, Roy DB (2015) United Kingdom butterfly monitoring scheme report for 2014. Centre for Ecology & Hydrology/Butterfly Conservation, Wallingford/East Lulworth
- De Saeger S, Guelinckx R, Van Dam G, Oosterlynck P, Van Hove M, Wils C, Paelinckx D (2014) Biologische Waarderingskaart en Natura 2000 Habitatkaart, uitgave 2014 vol INBO.R.2014.1698392. Rapporten van het Instituut voor Natuur-en Bosonderzoek. Instituut voor Natuur- en Bosonderzoek, Brussel
- Ebert G, Rennwald E (1993) Die Schmetterlinge Baden-Württembergs, Band 2, Tagfalter II. Verlag Eugen Ulmer, Stuttgart
- Fox J, Weisberg S (2011) An R companion to applied regression, 2nd edn. Sage, Thousand Oaks
- Kudrna O, Harpke A, Lux K, Pennerstorfer J, Schweiger O, Settele J, Wiemers M (2011) Distribution atlas of butterflies in Europe. Gesellschaft für Schmetterlingsschutz e.V., Halle
- Lafranchis T (2004) Butterflies of Europe. New field guide and key. Diatheo, Paris
- Maes D, Vanreusel W, Van Dyck H (2013) Dagvlinders in Vlaanderen: nieuwe kennis voor betere actie. Uitgeverij Lannoo nv, Tielt
- OC-GIS Vlaanderen (2001) Bodemkaart van het Vlaams Gewest, schaal 1/20000. Ondersteunend Centrum GIS Vlaanderen, Gent
- Oenema O, Velthof G, Klimont Z, Winiwarter W (2012) Emissions from agriculture and their control potentials, TSAP Report 3, version 2.1. International Institute for Applied Systems Analysis (IIASA), Laxenburg
- Phillips SJ, Anderson RP, Schapire RE (2006) Maximum entropy modeling of species geographic distributions. Ecol Model 190:231–259. doi:10.1016/j.ecolmodel.2005.03.026
- R Core Team (2015) R: a language and environment for statistical computing, 3.1.1 edn. R Foundation for Statistical Computing, Vienna
- Thuiller W, Georges D, Engler R (2012) Biomod2: Ensemble platform for species distribution modeling. R package version 1.3.7/r529.
- Tweddle JC, Robinson LD, Pocock MJ, Roy HE (2012) Guide to citizen science: developing, implementing and evaluating citizen science to study biodiversity and the environment in the UK. Natural History Museum/NERC Centre for Ecology and Hydrology for UK-Environmental Observation Framework, UK
- van Swaay CAM (2006) Basisrapport Rode Lijst Dagvlinders. De Vlinderstichting, Wageningen
- van Swaay CAM, Termaat T, Kok J, Huskens K, Poot M (2016) Vlinders en libellen geteld. Jaarverslag 2015 vol 2016.001. Rapport VS. De Vlinderstichting, Wageningen
- VMM (2015) Verzurende en vermestende luchtverontreiniging in Vlaanderen—jaarrapport 2014. Vlaamse Milieumaatschappij, Aalst
- Wynhoff I, van Swaay CAM, Veling K, Vliegenthart A (2014) De Nieuwe Veldgids Dagvlinders. KNNV Uitgeverij i.s.m. De Vlinderstichting, Zeist/Wageningen