Using citizen science in road surveys for large-scale amphibian monitoring: are biased data representative for species distribution?

Road-based citizen science surveys are increasingly used for long-term monitoring of wildlife, including amphibians, over large spatial scales. However, how representative such data are when compared to the actual species distribution remains unclear. Spatial biases in site selection or road network coverage by volunteers could skew results towards more urbanised areas and consequently produce incorrect or partial trend estimations at regional or national scales. Our objective was to compare and verify potential spatial biases of road-based data using distribution datasets of different origins. We used as a case study the common toad (Bufo bufo), a fast-declining species and the main amphibian targeted by conservation action on roads in Europe. We used Maxent models to compare road survey data obtained from the 35 year-long “Toads on Roads” project in Great Britain with models using national-scale toad distribution records as well as with models using randomly generated points on roads. Distribution models that used data collected by volunteers on roads produced similar results to those obtained from overall species distribution, indicating the lack of selection bias and high spatial coverage of volunteer-collected data on roads. Toads were generally absent from mountainous areas and, despite the high availability of potential recorders, showed nearly complete absence of road-based records in large urban areas. This is probably the first study that comparatively evaluates species distribution models created using datasets of different origin in order to verify the influence of potential spatial bias of data collected by volunteers on roads. Large-scale declines of widespread amphibians have been demonstrated using data collected on roads and our results indicate that such data are representative and certainly comparable to other existing datasets. We show that for countries with high road network coverage, such as Great Britain, road-based data collected by volunteers represent a robust dataset and a critical citizen science contribution to conservation.


3
Introduction Worldwide, roads are constructed at a rapid pace, accelerating the global transition towards increasingly fragmented landscapes and causing complex and long-lasting effects on ecosystems and wildlife (Coffin 2007;Ibisch et al. 2016). In addition to direct mortality from vehicle collisions, roads also have other, less obvious impacts on wildlife, including barrier effects, soil, water, light and noise pollution, habitat loss, and increased anthropogenic disturbance (Gibbs 1998;Foreman 2003;Fahrig and Rytwinski 2009). Road impacts are exceptionally high on amphibians due to their particular physiological traits, namely their permeable skin and the need to access diverse terrestrial and aquatic habitats in order to complete their life-cycles (Glista et al. 2008;Beebee 2013;Consetino et al. 2014). The vulnerability of amphibian species to road impacts is exacerbated by the long-distance mass migrations of both adults and juveniles to and from the breeding areas, as well as by their slow movement overland (Hels and Buchwald 2001;Mazerolle et al. 2005;Petrovan and Schmidt 2019). In Europe, the common toad (Bufo bufo) is one of the most wide-ranging and abundant amphibians (Agasyan et al. 2009), but similar to other formerly common taxa, it currently appears to be undergoing dramatic declines over large areas, with road impacts as an important potential factor (Carrier and Beebee 2003;Bonardi et al. 2011;Petrovan and Schmidt 2016). Toads generally select large ponds or lakes for breeding and often prefer deciduous woodland or rough pasture as terrestrial habitat (Denton and Beebee 1994;Hartel et al. 2008). However, they can also inhabit a large variety of other habitats including arable farmland, urban gardens and parks, alpine areas and coastal dunes, migrating up to 1 km or more between terrestrial and aquatic sites (Heusser 1960;Sinsch 1988). In densely populated regions, such extensive migration patterns can involve multiple road crossings at each migration stage, greatly increasing the possibility of traffic collisions (Hels and Buchwald 2001;Bonardi et al. 2011). Consequently, amphibians, and toads in particular, represent the highest percentage of vertebrate roadkill across much of Europe (Schmidt and Zumbach 2008;Sillero 2008;Matos et al. 2012), and this has led in turn to the formation of large scale, volunteer-led projects attempting to rescue toads from road traffic.
Millions of amphibians, mostly toads, but increasingly also other amphibian species, are annually collected and moved by people away from roads and towards the wetlands where they breed (Langton 2015). Often known as "Toads on Roads", these projects have taken place for decades across Central and Western Europe and currently represent the most comprehensive spatial and temporal datasets for inferring long-term trends at regional and national scales for these species in Europe (Bonardi et al. 2011;Petrovan and Schmidt 2016;Kyek et al. 2017). In the UK, the "Toads on Roads" project has a centralised database, and the volunteers that collect and move amphibians on roads during spring migrations record the data using standardised survey sheets (Wormald et al. 2015). Numerous other volunteer-driven initiatives that rescue and record amphibians on roads exist across the globe for other amphibian species, especially in the USA (Sterrett et al. 2019;Petrovan and Schmidt 2019).
However, similar to other wildlife road-based monitoring projects, both for amphibians and other taxa (Wembridge et al. 2016), volunteers do not select in a stratified or 1 3 survey-focused manner the sites where amphibians are moved across roads (Helldin and Petrovan 2019). Sites can be a single point or multiple points over a short stretch of road and vary in the number of people taking part each year but, typically, a single person is responsible for data collation and submission at each site. Given that volunteers repeatedly survey roads at night and on foot, often for an average of 20 nights per season, year after year (Wormald et al. 2015;Kyek et al. 2017), road sites are potentially selected to facilitate such patrolling actions. Therefore, logistical factors could influence site selection and introduce substantial spatial bias. Equally, detecting significant road mortality or live amphibians on roads might favour road segments with high early evening traffic and therefore close to or even integrated into urbanised areas. By contrast, more remote sites might be under-represented as traffic volume, numbers of potential observers and amphibian road mortality are all likely to be lower (Hels and Buchwald 2001). In addition, amphibian carcass persistence on roads can be very low, particularly in wet conditions (Santos et al. 2011) and thus, remote sites might evade detection. Overall, the potential sources of spatial bias can be categorised as: 1. All volunteer amphibian rescue projects tend to be performed on roads but these records might not be representative of wider amphibian populations. 2. Some roads and road sectors are more likely to be selected by volunteers than others (near where people live, easy to access, not too dangerous, high enough toad count). 3. Roads have potentially different detectability, partly due to different toad-killing propensity-traffic volume and time of traffic, but also dead toads might have different detectability to live toads.
These sources of bias can lead to significant uncertainty about the representativeness of road-based datasets collected by volunteers as primary sources of data for large-scale monitoring, compared to species' ranges from standard distribution atlases, which use both volunteer and professional surveys (e.g. Sillero et al. 2014). Exploring such potential biases is important for two reasons: to allow a better understanding of the general validity of road-based projects for long-term species monitoring (Bonardi et al. 2011;Petrovan and Schmidt 2016;Kyek et al. 2017), and to develop ecological niche models (Sillero 2011) that could be applied more widely for such datasets. Ecological niche models are very sensitive to the density of records (Veloz 2009;Mercx et al. 2011;Varela et al. 2014); it is, therefore, essential that the dataset of records accurately represents the species distributions, with a uniform sampling effort over the entire study area (Barbosa et al., 2012). Clusters of records in particular parts of the study area must be real and not an artefact of the sampling effort (i.e. where surveyors repeatedly monitor a defined area). Citizen science projects such as "Toads on Roads" are typical examples of uneven datasets. The development of accurate ecological niche models is also important for identifying overlooked areas and undetected road mortality hotspots. Population declines and local extinctions might occur in such hotspots in the absence of conservation actions (Cooke 2011).
Using common toads in Great Britain and the "Toads on Roads" project as case studies, our main aim was to assess the potential of citizen science for providing reliable information for modelling and conservation. By comparing models built with datasets of different origin, we aimed to understand the effects of potentially biased data on ecological niche models. Our specific objectives were to: 1 3 (1) Develop an accurate ecological niche model for the observed distribution of common toads at national scale for Great Britain (GB). We hypothesised that this model should be as close as currently possible to the true distribution of toads in GB, as it would be built using all information available for the species' presence (i.e. publicly available records that are collected using both volunteer and professional surveys).
(2) Identify the main variables driving toad distribution in GB. We expected that landscape and habitat type, followed by road density and degree of urbanisation would be the main drivers of toad distribution. (3) Build an ecological niche model of the "Toads on Roads" volunteer-selected sites at a national scale in GB. We hypothesised that these will be distributed in a non-random fashion, related to road type and landscape type as well as to the degree of urbanisation. (4) Assess the effect of road-based bias in models built with the "Toads on Roads" database, by comparing them to a model produced from full range data and a null model built with a set of random road points.
Ultimately, we hope that this work will allow us to understand if volunteer-recorded data collected on roads affect the accuracy of ecological niche models built at national scales, thus providing valuable information that can be applied more widely when using volunteer-based data, especially for rare and endangered species.

Datasets
For the "Toads on Roads" sites dataset collected by volunteers in Great Britain, hereafter TOR (632 points; Fig. 1a), we used the national database which is collated and managed by Froglife Trust. All road-based points in this dataset were collected with GPS devices or map referenced to a specific location with an error of less than 100 m. For comparison, we used all publicly available distribution data for common toads, hereafter Range (21,580 points with a spatial resolution of 1 km; Fig. 1c), accessible via the National Biodiversity Network (NBN Atlas) data repository ('NBN Atlas website at http://www.nbnat las.org Accessed March 2016.'). NBN Atlas represents "UK's largest collection of freely available biodiversity data" and is a collation of survey data from multiple sources, including volunteers and professional surveys, either species-focused or multi-taxa. Data in both datasets were recorded after 1975 with the majority of records in 1990-2015. While toads have suffered declines over this period (Petrovan and Schmidt 2016), we expect those declines to affect abundance rather than distribution. Both TOR and Range datasets were compared with 10 datasets created with 632 points each (the same sample size as TOR), placed randomly on roads, hereafter called Roads ( Fig. 1b; Supplementary material Fig. S1). This Roads model acted as a null model in order to assess biases in the TOR and Range models.
As previously stated, ecological niche models are very sensitive to the density of records. Thus, as the TOR dataset is the observed distribution of crossing sites, we did not delete the records below a particular distance to other records. On the other hand, the Range dataset was filtered and cleaned (no more than one record in 1 km 2 ), to ensure its relatively unbiased character. After filtering, the Range dataset included 8452 points. This approach was chosen as it has consistently outperformed other methods such as bias file or targeted background in reducing sampling bias particularly with high sample sizes 1 3 (Kramer-Schadt et al 2013;Fourcade et al 2014). The Range dataset should represent the expected observed distribution of the species given its wide coverage and complexity; it is a combination of multiple sources of information. The Roads datasets were not filtered in order to guarantee that they were truly random, without including any type of condition.

Environmental variables
The ecogeographical variables used to build Realised Niche Models (RNMs, sensu Sillero 2011) were: (1) human (distance to urban areas and roads); (2) hydrology (distance to rivers and lakes) and (3) land cover (distance to arable areas, broadleaved forests, and coniferous forests). These variables were obtained as categorical variables from EDINA Digimap under a UK Higher Education access license (Ordnance Survey 2012) Some correlative modelling algorithms, such as Maxent (see below) accept both continuous and categorical variables. However, categorical variables are difficult to interpret because it is not possible to disentangle the importance of each class included in the categorical variable. Thus, it is preferable to work with continuous variables. Hence, we converted all categorical variables into continuous ones by calculating distances of any pixel to the closest individual class. This was done using the function Raster distance of QGIS software 2.18.3 for all variables with a spatial resolution of 100 m. The environmental variables used for modelling the TOR and Roads datasets had a spatial resolution of 100 m, and those for the Range model were aggregated to a resolution of 1 km, in concordance with the spatial resolution of the species records (TOR and Roads: 100 m; Range: 1 km). All variables had a Spearman's correlation lower than 0.75 (Dorman et al. 2013).

Fig. 1
Datasets used for ecological niche modelling: (A) TOR (registered "Toads on roads" sites in Great Britain from the national registry managed by Froglife, with 632 points); (B) Road records (presence records placed randomly on roads; here it is shown one of ten random datasets with 632 points each); (C) Range records (all publicly available common toad Bufo bufo range records from the NBN Gateway data repository, with 21,580 points).

Ecological niche modelling
We built a set of Realised Niche Models (RNMs; sensu Sillero 2011) for each dataset (TOR, Roads and Range). All models were built using the same set of environmental variables but with different spatial resolutions: TOR and Roads with 100 m, and Range with 1 km (see above). All RNMs were built using the Maximum Entropy approach implemented in the software Maxent (Phillips et al. 2004(Phillips et al. , 2006. We used Maxent as it is a general-purpose machine learning method that uses presence-only occurrences and background data (Phillips et al. 2004(Phillips et al. , 2006. It consistently outperforms other modelling techniques (e.g, Bioclim, Domain, GAM or GLM;Elith et al. 2006). Additionally, it is particularly well suited to noisy or sparse information (Phillips et al. 2004(Phillips et al. , 2006. Models were performed with Maxent 3.4.1 software (http://www.cs.princ eton.edu/~schap ire/maxen t). We ran Maxent in cloglog format with default parameters (auto-features, 500 iterations, and a regularization multiplier of 1) using 70% of the presence records from each dataset as training data (TOR: 439, Roads: 437, Range: 5890), and the remaining records (30%) as test data (TOR: 188, Roads: 187, Range: 2562). Although Maxent raw format is recommended (Royle et al. 2012;Merow et al. 2013), model comparisons would not be possible as the scales are different. The sum of all background points of the raw output is equal to 1, thus presence points are not normalised. As such, we used the new output "cloglog", which avoids the problems associated with the logistic output (Phillips et al. 2017). Duplicated records (i.e. records inside the same pixel forming the raster) were removed. As Maxent is a probabilistic modelling method, we calculated the arithmetic mean and the standard deviation of a set of ten replicates per dataset (TOR, Roads and Range) through an iterative process. For Roads, we have run ten replicates for each of the 10 Roads datasets. We chose 10 replicates as a compromise among statistical analysis power, computation time, and physical storage. The cloglog output gives an estimated probability of presence, ranging from 0.0 to 1.0 (Phillips et al. 2017). We identified the importance of each environmental variable for explaining the species distribution by factor analysis: (1) jack-knife analysis of the average AUC with training and test data; and (2) average percentage contribution of each variable to the models. For this purpose, variables were excluded in turn and a model was created with the remaining variables; then, a model was created using each variable. We did not apply an arbitrary threshold (Liu et al. 2005) to obtain a habitat suitability map (sensu Sillero 2011), where the raw model is transformed in a map with two categories: species presence and absence. Arbitrary thresholds are another source of errors in ecological niche modelling, and there is no fixed rule to choose one (Liu et al. 2005). In nature, the change from suitable to unsuitable habitats is gradual. Applying thresholds may be too reductionist. In order to prevent introducing more noise in the resulting models, all the analyses were performed with the original values of the models.

Model validation
We used the area under the curve (AUC) of the receiver operating characteristic (ROC) plots as a measure of the overall fit of the Maxent models (Liu et al. 2005). AUC is used to discriminate a species' model from a random model. AUC was selected because it is independent of prevalence (the proportion of presence in relation with the total dataset size) as assessed by its mathematical definition (Bradley 1997;Forman and Cohen 2005;Fawcett 2006). Random models have an AUC equal to 0.5; good fitting models get AUC values close to 1. In addition, we calculated with R 3.51. statistical software (R Core Team 2017) a set of 100 null models, following the methodology by Raes and Steege (2007) in order to compare all modelled AUCs with random AUC values. For this, we created 100 different datasets with the same number of random points as the dataset presences following a Poisson distribution.

Model comparisons
The RNMs of 10 average replicates per dataset were compared using a Spearman's correlation (ρ) analysis, performed in R and using ENMTools package (Warren et al. 2017) and by subtracting them by pairs: TOR-Roads, TOR-Range, and Roads-Range. RNMs produce output with continuous values between 0 and 1. Therefore, to visualise spatial differences between pairs of models we calculated the absolute value of a mathematical subtraction. Values close to 0 indicate total similarity; values close to 1, maximum dissimilarity.

Results
TOR and Roads models obtained AUC values higher than 0.88 for the training and testing datasets ( Table 1). The Range model had lower AUC values (Table 1). Maxent null models always had significantly lower AUC values than the TOR, Roads and Range datasets (Table 1). Distance to roads had a high importance in all three models (Table 2): it was the most important variable in both TOR and Roads models, and the second one in the Range model. As expected for a null model, in the case of the Roads models, the rest of the variables only had a minimal contribution. The second most important variable was distance to broadleaved forests in the TOR model; distance to urban areas was the most important variable for the Range model.
In general, the Roads and Range models predicted urban areas as suitable areas (Fig. 2b, c) while the TOR model did not (Fig. 2a). TOR model forecast a lower suitability for London than for other urban areas (e.g. Liverpool, Manchester, and Birmingham), although the areas surrounding Greater London had a high suitability. The Range model (Fig. 2c) predicted that most of Great Britain would be suitable for the occurrence of the common toad, except for areas of uplands and mountains in Scotland, some upland areas of centralnorthern England (mostly in the Pennines), and in mountains of Wales. The areas with the highest habitat suitability were in central and southern parts of England. The Roads model (Fig. 2b) predicted almost perfectly the road network in Great Britain, identifying clearly all major urban centres with high road density, such as London, Birmingham, Manchester, Liverpool, and Edinburgh. There were clear similarities and high correlation values between TOR and Range models (ρ = 0.83; Fig. 2a, c): both models excluded mountains, but the Range model predicted Greater London. The subtraction between these two models ( Fig. 3b) highlighted some major urban centres (at least London, Birmingham and Liverpool), that were predicted by the Range model, but not as strongly or even not at all by TOR (Fig. 2), and habitats of Highlands in Scotland as areas of similarity. A similar pattern was found when comparing the TOR model with the Roads (ρ = 0.86; Fig. 3a) and the Roads with the Range models (ρ = 0.60, Fig. 3c), highlighting again some major urban centres not predicted by the TOR models.

Discussion
For widespread species which are heavily impacted by roads, road-based data represent vital resources for long-term trend estimators, especially as they can be the only available dataset at large enough temporal and spatial scales (Petrovan and Schmidt 2016;Kyek et al. 2017). Road-based surveys can also outperform other survey designs in terms of potentiality to detect changes in large-scale population trends (Roos et al. 2012). However, such data are rarely assessed concerning their representativeness in the wider landscape. For "Toads on Roads" type projects, which currently take place across the majority of European countries and involve millions of animals and thousands of volunteers, this is especially relevant, as site selection and annual data collection are non-random and are entirely dependent on volunteer efforts. Our results indicate that, at least for the common toad in Great Britain, road-based surveys can offer a suitable and valuable representation of the wider species distribution. Our TOR and Range models were highly correlated, as well as, of course, the TOR and Roads models. Roads and Range had a lower correlation, but still relatively high (0.6). Further, the most important variable for TOR and Roads models was distance to roads, while this variable was the second most important one in the Range model. These results were expected, as TOR models and Roads models had the same bias given that both datasets use records collected along roads. Despite being important, distance to roads was not the most important variable for the Range model, which may indicate its less "biased" character. Thus, we can conclude that the TOR dataset is representative of the observed distribution of the common toad in Great Britain. Several factors are probably contributing to this, including that the UK has a high human population density (B) between TOR and Range models; (C) between Roads and Range models. TOR model was calculated with the "Toads on Roads" dataset; Road model with the Roads dataset (one of 10 random replicates is shown); and Range model with the Range dataset. Values close to 0 (blue colours) indicate total similarity; values close to 1 (yellow colours), indicate lowest similarity or maximum dissimilarity.

3
and a very dense road network (OECD 2013) as well as a long tradition in citizen science and volunteer-led conservation projects, especially for birds but also butterflies, mammals, invasive species or wildlife health (Harris et al. 2014;Brereton et al. 2011;Roos et al. 2012;Lawson et al. 2015;Woodward et al. 2018). Numerous opportunities exist to detect road-crossing sites for amphibians at large spatial scales as toads travel long distances overland, are often killed en masse on roads, and are easily identifiable by non-specialists (Petrovan and Schmidt 2016). Additionally, the overall species' range datasets also suffer to some degree from biases towards roads because most distribution records are collected near roads, namely at 1 km grid squares. At lower resolution scales (grid squares of 10 km or higher), road biases are completely lost in chorological datasets (Sillero et al. 2014). The identified spatial biases related to road proximity, which were apparent in both datasets we used for creating niche models, could be resolved by targeted stratified surveys that ensure adequate spatial coverage at national scale, as already the case for other taxa such as birds (Woodward et al. 2018). However, it is currently unclear if additional large-scale monitoring effort is feasible or required, at least for the common toad, in order to target metapopulation conservation adequately and to be able to infer population trends.
Amphibians are the most threatened vertebrate group (Stuart et al. 2004) and currently, even some widespread and formerly abundant species, such as the common toad or common frog (Rana temporaria) appear to be rapidly declining in parts of Europe (Bonardi et al. 2011;Petrovan and Schmidt 2016;Kyek et al. 2017). While conservation efforts prioritize rare and threatened species, the marked declines of widespread and abundant species can have significant implications on ecosystem functioning given their disproportionate contribution in terms of biomass, structure and energy turnover (Gaston 2010;Baker et al. 2019). In this context, both the focusing of conservation action where it is most needed and the collection of robust monitoring data at adequate spatial scales are important for reversing population and species declines. Toads are highly adaptable and can survive and breed in private gardens, parks and allotments (Cooke 1975). Yet, the TOR model excludes the largest urbanised areas and cities from toad distribution. The expectation would be that the higher human density in such areas would promote a higher level of observation effort and therefore increase the chance collection of local records of toads, including on roads bordering public parks and private gardens. However, urbanisation is also a major factor promoting biodiversity decline (McDonald et al. 2008;Montgomery 2008;Elmqvist et al. 2013) and continued urbanisation threatens numerous amphibian species worldwide (Hamer and McDonnell 2008). Our results indicate that indeed toads may have been declining or largely eliminated from such areas, something originally indicated for London as early as the 1960s (Cooke 1975). The reasons for the apparent toad rarity in large areas of towns and cities in the UK are unclear but are probably a consequence of continuous habitat loss, fragmentation and degradation, unsustainable road mortalities, pollution and reduction in breeding areas such as large ponds (Scribner et al. 2001). Large roads with heavy traffic can significantly alter amphibian distribution by creating a roadzone exclusion effect of up to 1 km distance (Eigenbrod et al. 2009), and even secondary roads can cause rapid chemical pollution in nearby amphibian habitats and road mitigation systems (White et al. 2017). Toad populations require 'green corridors' that allow them to move between terrestrial and breeding sites (Hartel et al. 2008) but such corridors can rapidly disappear with increased urbanisation. However, the fact that the Range model predicted some large urban areas, and especially Greater London, as suitable for toads, suggests that they are still present in such areas but at low density and thus large-scale mortality on roads was not recorded. Alternatively, volunteers might not select locations for amphibian road rescues in large urban areas, although some examples exist, in both London and elsewhere.
Long-term datasets are needed for adequate estimations of population trends of amphibians given that populations naturally fluctuate between years (Green 2003). While "Toads on Roads" projects appear to be insufficient to prevent long-term and very significant amphibian declines (Petrovan and Schmidt 2016;Kyek et al. 2017), they are probably able to slow down declines and at the same time provide data that would otherwise be entirely missing. Such projects also contribute to a stronger local involvement of citizens in wildlife-related issues and generate interest in groups of people for different reasons (animal welfare, conservation of species, a sense of community action, intergenerational activities, learning, etc.) (Dickinson et al. 2010;Haywood and Besley 2014;Petrovan and Schmidt 2019). However, even if, as shown here, the spatial distributions derived from the different datasets are broadly similar, the temporal patterns may differ if populations in road-fragmented habitats decline at greater rates; this would be an important area of future research. Another valuable application of such data would be to prioritise migration corridors strategically and to select sites where more effective road mitigation measures should be implemented, such as amphibian tunnels or underpasses. If adequately installed, such systems can dramatically reduce road mortality and promote metapopulation connectivity (Schmidt and Zumbach 2008;Beebee 2013;Matos et al. 2017;Jarvis et al. 2019;Helldin and Petrovan 2019).
Our results support the relevance of road-based surveys as a highly valuable source of data that, while imperfect and suffering from bias, are recorded at exceptionally large spatial and temporal scales. We urge other organisations in Europe and further afield to promote such projects, verify the data in comparison to other distribution records, and use the collected data for monitoring population trends. In countries or regions with a high coverage of the road network, the data collected in relation to amphibian road surveys and conservation action can represent a realistic and adequate image of the wider species distribution and, as such, of long-term trend estimations.