How useful are volunteers for visual biodiversity surveys? An evaluation of skill level and group size during a conservation expedition
The ability of volunteers to undertake different tasks and accurately collect data is critical for the success of many conservation projects. In this study, a simulated herpetofauna visual encounter survey was used to compare the detection and distance estimation accuracy of volunteers and more experienced observers. Experience had a positive effect on individual detection accuracy. However, lower detection performance of less experienced volunteers was not found in the group data, with larger groups being more successful overall, suggesting that working in groups facilitates detection accuracy of those with less experience. This study supports the idea that by optimizing survey protocols according to the available resources (time and volunteer numbers), the sampling efficiency of monitoring programs can be improved and that non-expert volunteers can provide valuable contributions to visual encounter-based biodiversity surveys. Recommendations are made for the improvement of survey methodology involving non-expert volunteers.
Keywords: Volunteer data, Conservation voluntourism, Observer performance, Visual survey, Sampling efficiency
Scientists have increasingly engaged volunteers to assist and support their research (Silvertown 2009). Data collected by volunteers often contribute significantly to research projects, especially when guided by experienced scientists (Foster-Smith and Evans 2003). This ‘citizen-science’ model has also been applied within the tourism industry to generate what has been termed ‘conservation voluntourism’, ‘scientific tourism’ or, relatedly, the ‘conservation holiday’ (Brown and Morrison 2003). Volunteer data collection schemes have been successfully adopted by organizations such as Earthwatch and The School for Field Studies. This form of volunteer program has proven to be a valuable option for conservation research from a financial point of view (Brightsmith 2008), and is also beneficial with regard to the completion of ecological studies (Holt et al. 2013).
The accuracy and consistency of data collected by volunteers is a critical aspect of these projects, as data are often used to support scientific publications and management planning decisions. Many components of scientific research can be learned relatively quickly and volunteers, if sufficiently trained, can gather high quality data (Foster-Smith and Evans 2003; Newman et al. 2003). For some tasks, however, learning is more protracted and expertise is slow to accumulate, and the data collected by novice volunteers have sometimes been questioned (Cohn 2008; Léopold et al. 2009).
A large part of volunteer contributions to research in the tropics entails reinforcing biodiversity surveys of understudied natural regions. To estimate the abundance of many terrestrial organisms including herpetofauna in tropical forests, visual encounter surveying is one of the most common and efficient techniques (Doan 2003). The distance sampling method is widely applied to estimate population size and/or density of targeted species (Fewster et al. 2009). This entails employing a ‘line transect’ method, in which observers walk along transects estimating or measuring perpendicular distance of each detected animal from the center of the transect (Cassey and McArdle 1999). Assessment of population abundance and density are closely related to estimations of the distance from the transect (Fewster et al. 2009). As such, inaccurate detection or estimations of distance are likely to result in biased assessments of population size and/or density. Furthermore, modeling the probability of detection is an important step in the analysis of distance sampling data (Thomas et al. 2010).
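The dependence of density estimates on perpendicular distances can be illustrated with a minimal sketch. It assumes a half-normal detection function with no truncation, which is one common choice in distance sampling but is not a model specified in this study; the function name and the example distances are hypothetical.

```python
import math

def halfnormal_density(perp_distances_m, transect_length_m):
    """Line-transect density estimate (animals per square metre).

    Assumes a half-normal detection function g(x) = exp(-x^2 / (2 sigma^2))
    with no truncation. This is an illustrative assumption, not the
    model used in the study.
    """
    n = len(perp_distances_m)
    if n == 0:
        raise ValueError("at least one detection is required")
    if transect_length_m <= 0:
        raise ValueError("transect length must be positive")
    # Maximum-likelihood estimate of sigma^2 for the half-normal model
    sigma_sq = sum(x * x for x in perp_distances_m) / n
    if sigma_sq == 0:
        raise ValueError("all distances are zero; sigma cannot be estimated")
    # Effective strip half-width: integral of g(x) from 0 to infinity
    esw = math.sqrt(sigma_sq * math.pi / 2)
    # Detections divided by the effectively surveyed area (both sides of the line)
    return n / (2 * transect_length_m * esw)

# Hypothetical perpendicular distances (m) recorded on a 200 m transect
true_distances = [0.5, 1.0, 1.5, 2.0, 3.0]
inflated = [2 * d for d in true_distances]  # observer overestimates by a factor of 2

density_true = halfnormal_density(true_distances, 200)
density_biased = halfnormal_density(inflated, 200)
```

Under this estimator the effective strip half-width scales linearly with the recorded distances, so doubling every distance halves the estimated density. The same logic applies in the other direction: systematic underestimation of distance inflates density.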
Detection issues have historically been ignored (Stauffer et al. 2002), but have been acknowledged more recently as important factors in survey methodologies and statistical models (MacKenzie and Kendall 2002; de Solla et al. 2005; Lind et al. 2005). Although some studies have highlighted areas of unreliability in data collected by volunteers, few objective comparisons between novice volunteers and experts have been conducted in the field (Fitzpatrick et al. 2009). There are many observer-related factors that can affect the accuracy of data collection in biodiversity surveys. Understandably, some researchers argue that the experience of the person collecting data has an impact on the ability to detect target species (MacKenzie and Kendall 2002; Fitzpatrick et al. 2009), but data accuracy can also be affected by an individual’s abilities and characteristics, irrespective of skill level (Newman et al. 2003; Pipino et al. 2002; Schmitt and Sullivan 1996). If novice volunteers’ detection rates are low, this introduces an undocumented source of variation and bias (Fitzpatrick et al. 2009). Thus, the relationship between observer experience and detection in field research requires further investigation (McCarthy et al. 2013).
Detection can also be affected by characteristics of the survey protocol, including duration and number of participants (Gooch et al. 2006; Schmeller et al. 2009). It has been suggested that a larger volunteer sampling effort, which increases with group size, could counterbalance measurement errors in the data collected (Hochachka et al. 2000). But, surprisingly, the relationship between survey duration and detection efficiency has received little attention (Pierce and Gutzwiller 2004). Similarly, being able to determine the optimum group size has important implications for the way monitoring protocols are designed, especially when human and temporal resources are limited (Ryan et al. 2002). The quality of data collected by volunteers is, in fact, more likely to be affected by survey protocols and design than by volunteer ability per se (Schmeller et al. 2009). Understanding the optimal sampling effort required for specific studies could reduce the likelihood of missing specimens during data collection (de Solla et al. 2005). It will also have important implications for the design and organization of voluntourism projects.
Against this background, we examined the efficiency of volunteers in detecting amphibians and reptiles in a tropical forest. We used an experimental design with imitation animals along a transect in the Honduran cloud forest during a conservation ‘voluntourism’ expedition and compared detection and distance estimation accuracy of volunteers and more experienced observers in different group sizes. This study had two major goals: (1) to evaluate the importance of skill level (along a gradient from high school students to trained scientists and local guides) on biodiversity data collection and (2) to examine the effects of group size on the accuracy of survey data.
Materials and methods
Each year Operation Wallacea, a UK-based, volunteer-driven conservation organisation, monitors the biodiversity in Cusuco National Park (CNP) along a fixed set of transects emanating from different camps. Surveys are led by professional scientists joined by volunteers, including high school students and university students. High school students, who generally only spend a week in the Park accompanied by their teachers, undertake their own programs as school expeditions and go through skills training, academic lectures and practicals to demonstrate the differing types of surveys being undertaken. University students, who spend between 2 and 8 weeks in the forest depending on their objectives, join the program to strengthen their resumé, gain course credit, or collect data for a dissertation or thesis. High school students, university students, professional scientists specialised in different taxa (therefore considered experienced in collecting ecological data using various monitoring techniques) and local guides were all invited to survey the experimental transect.
A total of 280 people were involved in the study: 238 student volunteers (181 high school students and 57 university students), 30 members of staff (mostly scientists and university academics specializing in different taxa), 6 experienced herpetologists and 6 local guides. Participants ranged in age from 16 to 43 years, although the great majority were no older than 25 years. Operation Wallacea provided ethics approval to work with student volunteers, and all participants were made aware of the purpose of the study before undertaking the experiment.
Before undertaking the experiment, examples of the models were shown to participants for a brief period of time (less than a minute) and instructions on how to complete the recording sheet were provided. Participants were asked to walk the 200 m transect at their chosen pace and try to detect the models, recording the taxonomic group (e.g., snake) and the color. They were also asked to estimate the distance of the models from the center of and along the transect. To facilitate the estimation of distance along the transect, the path was marked every 25 m with pink flagging tape. Local guides were asked only to detect species and estimate the perpendicular distances from the transect. As the models did not resemble any particular species present in the park, participants were not required to identify any of the models. Data were collected between 9 a.m. and 12 p.m. or between 1 p.m. and 4 p.m. in order to limit variation in ambient illumination.
Experiment replicates for the different subgroups
Data collection and analysis
For every “survey” we collected the following data: survey time (minutes), group size (1, 2, 4, 8), skill level (high school, university, staff, scientists, guides), each observation of a species, height (low, medium, high), distance from the transect (0, 1, 2, 3 or 4 m), estimated distance along transect and perpendicular distance from the center of the transect.
The central dependent variables analyzed were the number of models detected and the average accuracy of estimated distance along and perpendicular to the transect. Distance accuracy was operationalized as the mean absolute value of the difference between estimated and objective distance. There were two indicators, one for distance along transect and one for perpendicular distance from the center of the transect.
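The distance accuracy indicator described above can be sketched in a few lines. The function name and the example values are hypothetical and serve only to show the calculation for one survey.

```python
from statistics import mean

def mean_absolute_error(estimated_m, actual_m):
    """Mean absolute difference between estimated and true distances (metres)."""
    if len(estimated_m) != len(actual_m):
        raise ValueError("estimated and actual distances must have the same length")
    if not estimated_m:
        raise ValueError("at least one detected model is required")
    return mean(abs(e - a) for e, a in zip(estimated_m, actual_m))

# Hypothetical record for one survey: only detected models contribute.
estimated_perp = [0.5, 2.0, 3.5, 1.0]       # observer's perpendicular estimates (m)
true_perp = [0.0, 2.0, 3.0, 1.0]            # measured placement of the models (m)
estimated_along = [30.0, 80.0, 120.0, 175.0]  # estimates along the transect (m)
true_along = [28.0, 81.0, 125.0, 175.0]

perp_error = mean_absolute_error(estimated_perp, true_perp)
along_error = mean_absolute_error(estimated_along, true_along)
```

Taking the absolute value means that overestimates and underestimates do not cancel, so the indicator measures the size of the error rather than its direction.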
Given this description of the data, one might expect them to be analyzed using an omnibus univariate general linear model in which the effects of group size, skill level, distance and height are examined simultaneously, allowing for main effects and interactions. This was not feasible for several reasons. First, not all participants provided all data. For example, the guides did not estimate distance along the transect for detected targets. Of greater importance, the number of individuals providing data for each “cell” of the design varied greatly because there were far more volunteers than staff, herpetologists and guides. This very common reality in fieldwork, in which many cells contain no data (e.g., guides working in larger groups), created a non-orthogonal design. There is controversy about the use of “ignoring” and “allowing” tests in such cases (Maxwell and Delaney 2003). As such, although variants on the general linear model were employed for all analyses, separate analyses were used to examine the main effects of group size and skill level.
Tests of the main effects of skill and group size were conducted using between-subjects Analyses of Variance (ANOVA) and Analyses of Covariance (ANCOVA), the latter to control for differences in search duration. To test the effect of skill level, performance was examined with one analysis that ignored group size and included all skill levels (hereafter named the ‘group dataset’, because only one response was recorded for each group) and a second analysis that included only the data from those working as individuals, which included all skill levels (hereafter referred to as the ‘individual dataset’). All analyses were performed in SPSS v.22 (IBM 2013).
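For readers unfamiliar with the between-subjects ANOVA, the F ratio it reports can be sketched as follows. This is a plain one-way ANOVA only: it does not include the covariate adjustment of an ANCOVA, it does not compute a p value, and it is not the SPSS procedure used in the study. The function name and the detection counts are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical numbers of models detected per survey, by skill level.
# Group sizes are deliberately unequal, as in the study.
detections = {
    "high school": [5, 7, 6, 8],
    "university": [8, 9, 7, 10],
    "guides": [11, 12, 10],
}
f_stat, df_b, df_w = one_way_anova_f(list(detections.values()))
```

Unequal group sizes are handled by the weighting in the between-group sum of squares, but cells with no observations cannot be included at all, which is one reason a full factorial model was not feasible here.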
The experiment was repeated 148 times, with a total of 844 targets detected. On average, observers detected 38 % of the models (range 0 to 75 %), comparable to other work (Foster-Smith and Evans 2003). The majority of models detected were snakes (41 %), followed by frogs (33 %) and lizards (26 %). Not surprisingly, the largest models were detected with greatest frequency. Models were detected with greater likelihood in the middle level (43 %), followed by ground (29 %) and top (28 %).
No significant difference was found in the mean number of models detected or mean time spent walking each of the two transects (independent t test p > 0.05) and, therefore, data for the two transects were combined.
In the group data, the mean time spent to walk a single transect was 29.6 min. It differed significantly among the five skill groups (F = 4.81, p = 0.001). Post hoc tests found that the herpetologists (mean time spent = 37.83 min) spent significantly more time than both high school students (mean time spent = 28.48 min) and university students (mean time spent = 28.03 min). In the individual data, time spent walking the transect was related to group skills (F = 2.63, p = 0.047). However, the post hoc tests revealed that the only significant difference was between the herpetologists and high school students (p = 0.026).
When analyzing the group data, the time taken to walk the transect was also correlated with group size (r = −0.208, p = 0.011), with larger groups spending less time to complete the task.
Effects of skill level
Time spent on the transect had a moderate positive correlation with the number of models detected for both the group data (r = 0.199, p = 0.015), and individual data (r = 0.389, p = 0.04). Consequently, to examine detection accuracy as a function of skill level, a between-subjects Analysis of Covariance (ANCOVA) was conducted, using time spent as the covariate.
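The correlation coefficient used to justify the covariate can be sketched as below. The function name and the survey values are hypothetical; the sketch computes Pearson's r only and does not test its significance.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical surveys: minutes spent on the transect and models detected
minutes = [22, 25, 28, 31, 35, 40]
models_detected = [6, 7, 6, 9, 10, 11]
r_time_detection = pearson_r(minutes, models_detected)
```

A positive r means that slower surveys tended to detect more models, so a comparison of skill levels that ignored time could confound skill with search effort.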
A different picture emerges within the individual data (see Fig. 3, lower panel). Time spent on the survey explained some of the variance in performance (F = 5.97, p = 0.018), but the ANCOVA showed that skill level still had a significant facilitative effect on detection (F = 2.59, p = 0.049). Those with more experience were more likely to detect targets, although there are only trivial differences among university students, staff and herpetologists.
Effects of group size
Independent t tests examined differences in estimation accuracy between the two transects. No difference (p = 0.21) was found for the accuracy of perpendicular distance estimations between transect A (mean error = 0.45 m) and B (mean error = 0.52 m). There was, however, a difference (p = 0.02) in the accuracy of estimations for distance along the transect, with a slight improvement in transect B (mean error = 1.26 m) compared to transect A (mean error = 1.66 m). This difference may well be due to practice, because transect order was fixed and transect B was always surveyed last.
This study found skill level to have a positive effect on detection, corroborating the view that the number of detected targets in a survey is associated with expertise, particularly in the case of low-density populations (Shirose et al. 1997; Fitzpatrick et al. 2009). However, this effect was observed only when analyzing the individual data. It should be borne in mind that skill, as operationalized here, is a combination of familiarity with the task and the context. This can be seen in the observation that local guides searching as individuals had the best detection rates.
Our study also illustrates how detection performance of relatively untrained volunteers can be augmented by using larger groups. Thus, these data corroborate the notion, already discussed by Freilich and LaRue Jr (1998), that inexperienced volunteers can perform straightforward tasks, such as a visual detection survey, as competently as more experienced observers, albeit in larger numbers.
In contrast, the ability to accurately estimate distance was no different between experienced and inexperienced observers, suggesting that experience alone does not ensure greater accuracy in the survey estimations. This finding is at odds with other evaluations (Shirose et al. 1997; Alldredge et al. 2007). It should be emphasized that the presence of longitudinal distance markers greatly simplified the estimation of distance along the transects and that distance estimation at these distances is known to be quite accurate, even among untrained observers (Wiest and Bell 1985). There may well be group and/or skill differences in more demanding estimation tasks. Additionally, local guides were not asked to estimate the distance of models along the transect.
Elements of the survey protocol were also examined to understand which ones might have an effect on detection and accuracy of estimations. A significant effect of survey duration on detection efficiency was observed. As suggested in previous research, detection increased when the survey lasted longer (Pierce and Gutzwiller 2004; Gooch et al. 2006). This reinforces the importance of standardised surveys, where the recording of survey time is essential to interpret the results. Contrary to data reported by Pierce and Gutzwiller (2004), survey duration had no significant effect on the accuracy of distance estimations but, again, the distance estimation task was quite simple.
There was a facilitative effect of group size on performance. The number of models detected initially increased with group size but then leveled off, with no statistically significant difference between groups of four and eight participants. As groups grew larger, participants may have been able to share task responsibility and focus on different sections of the footpath. Personal observations during the trials suggested that distraction might be the cause of the similarity in detection ability of the larger groups, as with an increased number of people, disturbance and interference between members increased. Distance estimation accuracy, however, did not seem to be associated with a greater number of observers.
The experiment disclosed significant differences in the positions of models detected and also in the proportion of ‘species’ detected. The experiment was designed to replicate real field conditions. Amphibians and reptiles in Cusuco National Park are terrestrial as well as arboreal, which is why models were placed at different heights in the canopy. Despite being equally distributed among the three height categories, models were found more often at the eye-to-knee level. Different factors might have affected the detection rate at different heights, including foliage density, light availability and participant expectations of finding models clinging to the vegetation.
A cursory examination of these data indicates that the assumption of perfect detection on the transect is violated by both groups. This can be accommodated in programs such as Distance (Cassey and McArdle 1999). The more challenging observation is that the function that can be best fitted to the data is very different for these groups. For example, a second-order polynomial function fits the data of scientists with an r² of 0.85. The same function produces an r² of only 0.53 for the high school students. At a less statistical level, it is clear that the decline in detection with distance is much greater for high school students than for scientists. Any estimates of true density must then take into account both skill level and group size in order to arrive at unbiased conclusions regarding species numbers in a region.
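The kind of second-order polynomial fit and r² mentioned above can be sketched as follows. The study does not report how its fit was computed, so ordinary least squares via the normal equations is an assumption here, and the function names and detection proportions are hypothetical rather than the study's data.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2; returns (a, b, c)."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three (x, y) pairs")
    # Power sums for the 3x3 normal equations
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("need at least three distinct x values")
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [u - factor * v for u, v in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def r_squared(xs, ys, coeffs):
    """Proportion of variance in ys explained by the fitted quadratic."""
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x + c * x * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("r squared is undefined when ys has no variation")
    return 1 - ss_res / ss_tot

# Hypothetical proportion of models detected at each perpendicular distance (m)
distances_m = [0, 1, 2, 3, 4]
prop_detected = [0.62, 0.50, 0.41, 0.22, 0.06]
coeffs = fit_quadratic(distances_m, prop_detected)
fit_quality = r_squared(distances_m, prop_detected, coeffs)
```

Fitting each skill group separately and comparing both the coefficients and the r² values is one way to see whether a single detection function is adequate for all observers.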
From an analytic point of view, a limitation of this study was that we were unable to obtain data from a large number of participants in the more experienced groups; less experienced people were more numerous than experts. This is a common occurrence in all domains, including field ecology, and a consequence of the fact that in such projects there is always a large disparity in numbers of experts and novices. Additionally, local guides conducted the experiment with a slight difference in the protocol, which might have affected the results on distance estimations. It would be optimal to have data collected using the same procedures throughout all groups.
The models used did not represent species commonly seen in the field, which could have disadvantaged those experts who have very specific strategies for very specific target species with which they are familiar. It would therefore be useful to replicate this work using models that more closely resemble the field target species.
This study supports the view that when working in groups, non-specialist volunteer researchers can perform simple tasks and collect data as proficiently as more experienced observers with regard to object detection in complex natural habitats. In our study an optimum number of volunteers hovered around four individuals. This is not to suggest, however, that all observers are identical. In fact, some experienced observers were remarkable in their ability to detect models and make accurate distance estimations. We also acknowledge that experts are essential for identifying species, as well as training and leading the volunteers.
As the experiment focused mainly on detection ability, in confirming that volunteers can be as capable as their experienced counterparts in collecting reliable data for a baseline herpetology visual survey, this study reinforces the view that novice volunteers are able to bring valuable contributions to field research, not only financially, but also in practical terms. In so doing, it strengthens the idea that voluntourism expeditions can play an important role in global conservation and research programmes (Pattengill-Semmens and Semmens 2003) by accelerating data collection.
Also highlighted is the potential value of involving local communities when conducting field studies. Local guides demonstrated an excellent ability to detect models, showing the value of their contribution to field research, thereby reinforcing the notion that local expert knowledge is becoming increasingly important for field conservation projects (Starr et al. 2011) and validating the belief that local experts can be used for quantitative wildlife studies (Gilchrist et al. 2005). Moreover, the involvement of local community members can be extremely beneficial for the socio-economic sustainability of the projects (Andrianandrasana et al. 2005).
This study has shown how the characteristics of monitoring protocols can have important implications for detection probability and sampling efficiency. Survey duration and number of surveyors had a substantial impact on detection probability during the experiment. Group size in particular appeared to be positively correlated to the increase in detection. This relationship suggests that examining and eventually adjusting these elements of survey protocols would improve the sampling efficiency of the research. For instance, in our results an optimum number of volunteers “per survey group” hovered around four individuals.
This information is extremely important for the development of long-term monitoring programs (Crouch III and Paton 2002) and for the design of studies involving volunteers (Foster-Smith and Evans 2003).
Drawing on these findings, the following recommendations can be made for managers planning future conservation voluntourism research. The performance of volunteers in collecting data for monitoring studies should be evaluated. This should be done not only to compare volunteers to their professional counterparts and to evaluate overall data validity, but also to improve the protocols used for data collection. A thoughtful analysis and management of available resources can enable sampling efforts to be optimized and the efficiency of such studies to be improved. Volunteer training clearly contributes significantly to the success of monitoring programs (Genet and Sargent 2003), and it is strongly recommended that detection probabilities be incorporated into survey design and analysis in order to improve the accuracy of wildlife population estimation.
The authors would like to thank all volunteers who took part in the study; without their effort and goodwill this research would not have been possible. We would also like to acknowledge the considerable effort of all staff members involved in the running of the expedition, particularly the volunteer coordinators. A special note of thanks must also be made to the people of the community of Buenos Aires, Cofradia, Honduras for their constant support of the project and their joyful company.
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.