Introduction

Wildlife monitoring is important for addressing wildlife-human conflicts (Delaney et al. 2008), and monitoring programs provide essential data for decision-making in the conservation and management of wildlife populations (Barea-Azcón et al. 2007).

Monitoring has mainly been performed by experts in the targeted species, which reduces the typical biases of wildlife surveys (Newman et al. 2003). However, expert monitoring is usually expensive and may not be feasible in many sampling situations (e.g., large-scale sampling). When logistic and financial resources are limited, a potential alternative is the use of non-expert volunteers (Newman et al. 2003). Several studies have highlighted the quality of surveys carried out by trained volunteers, in which reliable data were produced (Genet and Sargent 2003; Newman et al. 2003), but see also Sauer et al. (1994), who argued that surveys performed solely by volunteers should be treated with caution for certain target species (e.g., bird surveys).

Carnivore species are difficult to follow and study due to their elusive behavior and typically low abundances (Delaney et al. 2008). Most direct methods, in which the survey consists of counting the animals themselves, have been considered unsuitable. This is especially a problem for large-scale sampling, where effort, costs, and the burden of obtaining survey permits are high (Wilson and Delahay 2001; Barea-Azcón et al. 2007; Newman et al. 2003). Furthermore, it may not be possible to use direct methods in all habitats, and they cannot be applied by non-qualified observers or over large spatial scales (Sadlier et al. 2004; Barea-Azcón et al. 2007). As a result, non-invasive methods based on indirect abundance measures have increasingly been used to estimate carnivore presence and abundance (Vanak and Gompper 2007). These indirect methods are based on detecting field signs of carnivore presence (so-called sign surveys), e.g., breeding refugia and recovery of hairs and feces (Delaney et al. 2008). Indeed, fecal counts can provide relative estimates of animal abundance (Cavallini 1994; Gros et al. 1996; Stander 1998; Sharp et al. 2001; Tuyttens et al. 2001) and can be used to estimate occupancy (e.g., Karanth et al. 2011; Reid et al. 2013).

Sign survey methods are less expensive than direct methods (Barea-Azcón et al. 2007), and non-experts or volunteers can be trained to use them. However, sign surveys have been severely criticized for several reasons, mainly related to species misidentification (Davison et al. 2002; Harrington et al. 2010), lack of index validation against population abundance (Anderson 2001), or failure to account for imperfect detection (Rhodes et al. 2011). To account for possible biases in the data collected by expert or non-expert surveyors, wildlife monitoring programs can also implement supplementary protocols to estimate detectability (e.g., an appropriate modeling framework) that incorporate variation in detectability when estimating abundance (e.g., in comparisons over time or across space).

The impact that red foxes (Vulpes vulpes) have on human interests is important for conservation biology because persecution of foxes can greatly reduce threatened non-targeted predators (Virgós and Travaini 2005). In addition, human interests are affected by red fox presence due to (i) problems associated with rabies transmission; (ii) effects on key game species; (iii) impacts on endangered species; and (iv) their potential status as an invasive species in some habitats. Thus, it is urgent to develop methods that are quick, efficient, simple, and economical to monitor red fox abundance and current population trends. The red fox is widely distributed and abundant throughout its entire distribution range. However, direct observation is often difficult as the red fox is nocturnal and usually elusive (Cavallini 1994; Reynolds and Tapper 1995). Therefore, camera trapping and counting scats along transects are the most suitable methods for detecting foxes in the wild. In a study comparing different methods for detecting foxes in low-density areas, Vine et al. (2009) highlighted the suitability of spotlight counts and camera trapping. However, they did not test the efficiency of scat counting on transects, which is “a priori” one of the most cost-effective methods (see above comments; Barea-Azcón et al. 2007). Despite the drawbacks and potential pitfalls of the sign survey method, it is considered to be the most suitable indirect method for measuring red fox relative abundance or occupancy over large spatial scales and for the subsequent management of their populations (see Webbon et al. 2004). To date, tests relating density (e.g., estimated by capture-recapture) to scat abundance have not been performed, and these are needed to determine the problems and biases associated with sign surveys, especially when non-expert volunteer observers are used.

Several aspects need to be improved in order to apply the sign survey method with higher confidence. A critical issue in any survey method, and probably the most important in sign surveys, is how detectability is affected by different sources of error (Clark and Bjørnstad 2004; Buckland et al. 2007; Sӕther et al. 2007; Ahrestani et al. 2013). For example, sources of error can be related to scat visibility, differences in observer ability to detect scats, or the effects of scat abundance on detectability rate. All of these factors can affect scat counts and therefore both the abundance index and the occupancy estimates derived from them. Hence, sources of error in detectability are critical to evaluating the use of scats in both relative abundance studies and occupancy models.

The aim of this study is to analyze how multiple sources of bias in scat detectability can affect sign surveys of red foxes performed by volunteers with a minimum training period. Specifically, we test the effects of the following factors on the probability of scat detection: (1) the microsite of scat deposition; (2) observer personal differences in red fox scat detection; and (3) scat abundance. This is the first study to evaluate how these factors can affect detectability through an experimental approach in which scats were laid at particular places and densities, in order to estimate detectability without the biases that arise when studies are performed under natural situations.

Methods

Study area

We conducted the fieldwork at four localities (Braojos, Madarcos, Cinco Villas, and Prádena del Rincón) in the north of the Madrid province in the Sierra de Guadarrama, Central System, Spain (Fig. 1). The study area is situated in the Supramediterranean belt in the Madrid Autonomous Region. This belt is defined by an average annual temperature of 18.2 °C with hot summers and cold winters, and the annual precipitation varies from 44 to 52 mm (Rivas-Martínez 1983). The human population density is 33 inhabitants per km². Vegetation primarily consists of Pyrenean oak (Quercus pyrenaica) forests interspersed with pastures for cattle grazing, plantations of several pine species (Pinus spp.), and areas with patches of Holm oak forests (Quercus ilex). In this area, red foxes share the habitat with other small vertebrates such as wildcats (Felis silvestris), common genet (Genetta genetta), European badger (Meles meles), and beech marten (Martes foina), among others.

Fig. 1

Study area in the Madrid Autonomous Region, Spain. Black stars indicate the four specific localities where scat collection transects were established

Study design

We designed a transect survey following different sources of literature (Cavallini 1994; Webbon et al. 2004). Following previously established methods (Carreras-Duro et al. 2016; McHenry et al. 2016), we chose nine man-made footpaths as transects to conduct the survey in the study area. Transects chosen in these habitats are tracks that are between 1.5 and 3 m wide and scarcely used by vehicles from the community. Although red foxes can defecate at different landscape elements, the literature on carnivore marking behavior highlights the importance of transects, roads, and similar landmarks for carnivore communication through feces (Macdonald 1980; Barja et al. 2004). In addition, a previous study showed a higher prevalence of scat depositions along transects than in cross-country elements of the landscape (Webbon et al. 2004). The average transect length was 915 m (range 730 to 1100 m), and the tracks were made for access to agriculture in the adjacent areas. We conducted the study during 1 month in March 2011. Every volunteer conducted a consecutive survey within a maximum time of 2 weeks. Before starting the survey, all visible red fox scats in the transects were collected and cleared. The collected scats, plus some additional fox feces collected in a breeding center, were used to perform our experiments on scat detectability. We placed red fox scats in five different microsites where red foxes tend to deposit their scats in the nine cleaned transects: (1) central area of transects; (2) exposed edge of transects, within the first 50 cm on any side of the transect; (3) non-exposed edge of transects (covered by vegetation; > 50 cm on any side of the transect); (4) top part of small shrubs (< 0.5 m) on the edge of transects; and (5) rocks on the edge of transects (see Fig. 2). We also used three different red fox scat densities (low, medium, and high) to simulate realistic differences in fox density. 
These three categories were defined according to our previous experience of scat abundances along transects in Spain and should be considered normal abundance levels for typical Mediterranean ecosystems. Low density was defined as < 5 scats per kilometer, medium density as 5–10 scats per kilometer, and high density as > 10 scats per kilometer. Three transects were used for each scat abundance category. To facilitate the interpretation of results, each transect was divided into 100-m segments by marking a visible rock or branch with colored tape. For example, this allowed us to determine whether some of the observed scats were scats missed during transect cleaning. These scats were not considered in our analyses because they were not experimentally placed at selected microsites or densities. Volunteers walked each transect individually without prior information on the number of scats detected by other volunteers. Moreover, enough time was left between surveys so that volunteers could not observe how and where previous volunteers detected scats, and volunteers were instructed not to manipulate scats or the environment so as not to affect the observations of the remaining observers. Hence, the probability of a particular scat being detected was independent for each observer (thus, detectability observer 1 ≠ detectability observer n). We then obtained detectability for each microsite and scat abundance category based on the estimates of 12 different observers without prior scat survey experience. Each observer conducted from two to four transects (3.08 transects on average). We watched every observer from the end point of each transect to verify that they followed the survey instructions given during training. Before conducting the study, in order to form an improved fox scat search image, each observer received minor training consisting of 1 h of fox scat searching in the field along different transects and explanations about where red foxes tend to deposit their feces. 
The training was conducted by one of the authors of this study (EV), who has extensive experience in scat surveys of carnivores. Each observer surveyed multiple transects containing all the different types of microsites. This mimicked a sample of the typical skills of non-experienced volunteers who participate in large-scale carnivore surveys. Observers walked along the transects counting the number of red fox scats per microsite and red fox scat abundance category. Observers also recorded the distance from the start of the transect to where they found each scat, as well as its position within the 100-m segment.
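
The density categories defined above can be expressed as a simple classification rule. The following Python sketch is illustrative only and is not part of the original analysis; the function names and the example counts are hypothetical, and placing exactly 5 or exactly 10 scats per kilometer in the medium category is an assumption based on the stated 5–10 range.

```python
def scats_per_km(n_scats, transect_length_m):
    """Convert a scat count on a transect of given length (m) to scats per km."""
    if transect_length_m <= 0:
        raise ValueError("transect length must be positive")
    return n_scats * 1000.0 / transect_length_m


def density_category(rate_per_km):
    """Classify a scat rate using the thresholds stated in the text:
    low < 5, medium 5-10, high > 10 scats per kilometer.
    Boundary handling (5 and 10 counted as medium) is an assumption."""
    if rate_per_km < 5:
        return "low"
    if rate_per_km <= 10:
        return "medium"
    return "high"


# Hypothetical example: 12 scats placed on an 800 m transect.
example_rate = scats_per_km(12, 800)
example_category = density_category(example_rate)
```

Because transects ranged from 730 to 1100 m, the same number of placed scats can correspond to different per-kilometer rates, which is why the sketch converts counts to a rate before classifying.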

Fig. 2

A schematic illustration of a typical transect with the different microsites where the scats were deposited: (1) central area of transect, (2) exposed edge of transect, (3) non-exposed edge of the transect, (4) top part of the shrubs, (5) rocks on the edge of the transect. On the right side of the illustration are photographs of three of the transects included in this study. The modified illustrations show the microsites where potentially the scats were deposited

Red fox scat detectability and model selection

We estimated scat detectability by dividing the number of red fox scats observed by the number of red fox scats deposited in each microsite in each transect by each independent observer. Across all transects, we deposited 120 scats in total.
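
As a minimal sketch of this ratio, the Python snippet below pools detected and deposited counts per grouping factor (here, microsite) and divides one by the other. It is not the authors' code; the function name, the record layout, and the counts are hypothetical and chosen only to illustrate the calculation.

```python
from collections import defaultdict


def detectability_by_group(records):
    """Return detected/deposited per group.

    records: iterable of (group, n_detected, n_deposited) tuples, e.g. one
    tuple per observer x transect x microsite combination (hypothetical layout).
    """
    detected = defaultdict(int)
    deposited = defaultdict(int)
    for group, n_detected, n_deposited in records:
        if n_detected < 0 or n_detected > n_deposited:
            raise ValueError("detected count must lie between 0 and deposited count")
        detected[group] += n_detected
        deposited[group] += n_deposited
    # Groups with no deposited scats have undefined detectability and are skipped.
    return {g: detected[g] / deposited[g] for g in deposited if deposited[g] > 0}


# Hypothetical counts for two microsites surveyed by two observers.
records = [
    ("central", 1, 4),
    ("central", 1, 4),
    ("exposed edge", 3, 4),
    ("exposed edge", 3, 4),
]
rates = detectability_by_group(records)
```

Pooling counts before dividing weights each scat equally; averaging per-observer ratios instead would weight each observer equally, and the two can differ when observers surveyed different numbers of scats.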

We tested the effects of microsite, fox scat abundance, and observer identity on scat detectability using generalized linear mixed models (GLMMs). Scat detectability was the response variable, modeled with binomial denominators: the number of detected scats was expressed relative to the total number of scats placed in each microsite (the binomial denominator). This procedure enabled us to use a GLMM with binomial errors and a logit link. Model selection followed Bolker et al. (2009) and Zuur et al. (2009). Briefly, we constructed the “beyond optimal” model, including all possible interactions between fixed factors. For this full model, we used microsite, scat abundance, and their interaction. With this structure of fixed effects, we then optimized the structure of the random effects (effect of observer identity on the estimate of the intercept of the model and on the estimate of microsite), and the random structure with the lowest Akaike information criterion (AIC) was retained for further analyses. Once random effects were optimized, we performed model selection for fixed effects, with models fitted by maximum likelihood (ML). Among all possible combinations of independent variables given the beyond optimal model, we selected the best-fitting model as the one that minimized the second-order AIC (AICc, i.e., AIC corrected for small sample sizes). Other models with ΔAICc < 2 relative to the best model were considered approximately equivalent in explanatory power (Burnham and Anderson 2003). To control for observer effects, observer identity was used as a random factor in all models. All statistical analyses were conducted in the R environment (R Core Team 2018) using the package “lme4” (Bates et al. 2015) and visualized using the package “ggplot2” (Wickham 2009).
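
The AICc ranking step can be illustrated with a short Python sketch. This is not the R/lme4 workflow used in the study: it only applies the standard small-sample correction, AICc = AIC + 2k(k + 1)/(n − k − 1), to log-likelihoods that are assumed to have been obtained elsewhere. The model names, log-likelihoods, parameter counts, and sample size below are invented for illustration, and the appropriate choice of n for a mixed model is itself an assumption.

```python
def aicc(log_likelihood, k, n):
    """Second-order AIC for a model with k estimated parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(models, n):
    """Rank models by AICc.

    models: dict mapping a model name to (log_likelihood, k).
    Returns a list of (name, AICc, delta AICc), best model first.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in models.items()}
    best = min(scores.values())
    ranked = [(name, score, score - best) for name, score in scores.items()]
    return sorted(ranked, key=lambda row: row[2])


# Hypothetical candidate set (values are NOT taken from the study).
candidates = {
    "microsite": (-150.0, 6),
    "microsite + abundance": (-149.5, 8),
    "microsite * abundance": (-147.0, 16),
}
ranking = rank_models(candidates, n=180)

# Models within 2 AICc units of the best are treated as roughly equivalent.
equivalent = [name for name, _, delta in ranking if delta < 2]
```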

Results

Scat detectability per transect and microsite

Our results show an average detectability of 0.43 (SD 0.14) of red fox scats across all transects, ranging from 0.26 to 0.65 between observers (Table 1). Detectability rates varied among the volunteers who performed the sign surveys, but the differences between observers were not significant for the different microsites (see Table S1). In total, 120 scats were deposited along all transects combined (Table 2; Table S2). Scats were detected differently depending on the microsite. We found the highest detectability at the exposed and non-exposed edge microsites (0.73 and 0.45, respectively) and the lowest detectability in the central area of the transects (0.21) and on the rocks (0.27) (Table 3; Fig. S1).

Table 1 Observer average detectability estimation per microsite and transect. Observer ID, observer identity from 1 to 12. Transect ID, transect identity from 1 to 9. Detectability, average detectability per observer over the performed transects. Scats Obs, number of scats observed by every observer per transect, with no distinction between microsites
Table 2 Scats deposited per microsite per transect. Microsite ID, microsite identity from 1 to 5 ((1) central area of the transect, (2) exposed edge of the transect, (3) non-exposed edge of the transect, (4) top part of small shrubs, (5) rock on the edge of the transect). Transect ID, transect identity from 1 to 9. Scat deposited, number of scats deposited per microsite. Total scats, total number of scats deposited per transect, simulating three different densities (low < 5 scats; medium from 6 to 10 scats; high > 10 scats)
Table 3 Detectability estimates, standard errors, and z values for a generalized linear mixed model (GLMM) performed using detectability as the response variable and microsite as a fixed factor

To optimize the structure of the random effects, we tested different variables as random factors in our models (scat density, transect, and observer). Observer identity, included as a random effect on the intercept of the model, was selected as the best random structure. The best model in the model selection, according to the criteria of Bolker et al. (2009) and Zuur et al. (2009), included only microsite as a fixed effect (see Table 4). All models that are meaningful from an ecological point of view, which include scat detectability as the response variable, red fox scat abundance, microsite, and their interaction as fixed factors, and observer identity as a random factor, are shown in Table 4.

Table 4 GLMM effects table using red fox scat detectability as the response variable. The model containing only microsite as a fixed factor has the lowest AIC. To account for observer effects, observer identity is treated as a random factor in all models

Discussion

This study is one of the first to test the effects of different sources of bias on the efficiency of sign surveys that use scats to estimate relative abundance and establish occupancy. Four key results can be highlighted: (i) differences in red fox scat detectability were dependent on the microsite of deposition; (ii) red fox scat detectability was not affected by scat abundance (e.g., different red fox densities); (iii) observer identity had a relatively weak effect and, because of this weak effect, was not included as a fixed factor in any model with ΔAICc < 2; and (iv) scat detectability was low when non-expert volunteers were used in the surveys.

Our results showed some heterogeneity in detectability rates when different volunteers were used for sign surveys. Although most of the observers performed equally well, there were some moderate, non-significant differences among observers. This indicates that inter-observer trials are needed before red fox sign surveys are undertaken, either to correct for biases or to disqualify people whose detectability deviates strongly from the mean. This is an important factor to take into account when designing large-scale surveys, because a great diversity of observer skills can be expected in these types of studies and can affect the reliability of comparisons among sites or times when observers change (Robbins et al. 1989; Sauer and Droege 1990; Sauer et al. 1994). Furthermore, our observers detected scats at an average rate of 42%, which is much lower than the previously shown detectability of dog scats (overall > 80%; Mackay et al. 2008) or of red fox spotlight censuses (53–75%; Ruette et al. 2003). Indeed, with scat detectability values around 20% for some of the transects, it is important to consider that the probability of falsely recording a species as absent is high (e.g., false absence recording in occupancy models). With a mean scat detectability of 42%, the probability of false absence recording is not so high, but the probability of false negatives is still elevated (see discussion on this topic in Reynolds and Tapper 1995). Therefore, the low detectability found in our study might raise concern about the use of non-expert volunteers for large-scale surveys of red foxes. Our findings show that the detectability rate of non-expert volunteers needs to be improved before they can participate in a sign survey. An easy way to mitigate this problem is to increase the training period. 
We suggest a training period with tests that the volunteers should pass before they are allowed to participate, for example, one week per group of 8–12 observers. During training, volunteers would be supplied with technical information, including background theory about the study area and focal species behavior, and a practical demonstration. During this period, tests of the performance and reliability of the surveys could be undertaken to identify the sources of bias in each case. Furthermore, because erroneous species identification of scats can be a large source of bias in sign surveys (Davison et al. 2002; Harrington et al. 2010), we urge the inclusion of training for volunteers or surveyors to correctly identify scats from different species. Volunteers should only be cleared to participate in surveys when their misidentification rate is lower than a specific threshold (e.g., identification rate above 85%).
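
The false-absence concern raised above can be made concrete with a simple calculation. If each scat on a transect is detected independently with probability p, the probability that an observer misses all n scats present (and so records a false absence) is (1 − p)^n. The Python sketch below applies this to the mean detectability of 42% reported in this study; the independence assumption, the use of a single p for all microsites, the function names, and the 5% target are simplifications introduced here and not part of the original analysis.

```python
import math


def miss_probability(p_detect, n_scats):
    """Probability of detecting none of n_scats, assuming independent
    detections with the same per-scat probability p_detect."""
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must be between 0 and 1")
    if n_scats < 0:
        raise ValueError("n_scats must be non-negative")
    return (1.0 - p_detect) ** n_scats


def scats_needed(p_detect, max_miss):
    """Smallest number of scats for which the probability of missing
    all of them is at most max_miss (same independence assumption)."""
    if not 0.0 < p_detect < 1.0 or not 0.0 < max_miss < 1.0:
        raise ValueError("p_detect and max_miss must lie strictly between 0 and 1")
    return math.ceil(math.log(max_miss) / math.log(1.0 - p_detect))


# Mean detectability from this study (42%); the 5% target is illustrative.
p_mean = 0.42
miss_three = miss_probability(p_mean, 3)   # chance of missing all of 3 scats
n_required = scats_needed(p_mean, 0.05)    # scats needed for <= 5% false absence
```

Under these assumptions, false absences become a practical risk mainly where very few scats are present, which corresponds to the low-density situation in which observers are also most likely to be surveying.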

The effect that scat abundance has on sign detectability has rarely been evaluated in sign surveys. However, detectability could be expected to vary non-monotonically with the density of signs (e.g., through observer saturation when signs are abundant or lack of concentration during surveying when signs are scarce). The effect of scat abundance on the transects was non-significant in our study. This is an important finding, as red fox scat detectability was not affected by factors associated with a low or high number of signs on the transects (e.g., changes in detectability due to saturation effects or distraction effects linked to high and low abundance of signs, respectively).

We also observed low detectability in the central area of the transects, which could be viewed as surprising. However, this is probably due to observers maximizing their concentration to avoid missing scats in more difficult locations (e.g., those with cover at the side of the transects), and it is also related to the composition of the transect surface (Kluever et al. 2015). For instance, it has been confirmed in laboratory settings that the size and texture of transects, as well as transect width, can directly influence detection probabilities (de la Rosa et al. 2011). During surveys, we noted that observers tended to concentrate their inspection on the sides of the transects, thereby showing a strong tendency to miss scats in the open and apparently visible middle of the transects. This point could be easily mitigated by clarification during the training period.

All these results suggest that red fox scat surveys can be affected by differences in detection rate among observers and by differences in red fox marking behavior and scat deposition. This is very important because relative abundance indices or occupancy models derived from scat counts can be used for habitat modeling or other spatial comparisons, as well as for developing population monitoring strategies. In all of these critical cases, the differences observed can produce misleading conclusions about wildlife-habitat relationships or population trends (Anderson 2001). From our study, we can conclude that although both spatial and temporal comparisons can be performed by different non-expert observers, some caution is needed due to differences in detection rates. To improve these large-scale surveys, all projects should include training periods in which volunteers can improve their skills in detecting scats. Moreover, all large-scale comparisons should include and correct for among-observer differences in detectability, even after a training period designed to minimize the bias (see also Sauer and Droege 1990; Sauer et al. 1994). Furthermore, red foxes can modify their marking behavior depending on habitat structure, spatial organization, or density (Gosling and Roberts 2001). This needs to be considered in further research on the applicability of red fox scat sign surveys, as it offers a very exciting interplay between the disciplines of wildlife monitoring and behavioral ecology. We need more information about how marking may depend on landscape structure and how this may affect detectability rates between expert and non-expert observers, as well as within-class differences.

Finally, for future studies, it would be interesting to test whether observers' ability to detect scats varies when the experiment is performed in different habitats, comparing experts and volunteers over a period of days, under different weather conditions, and using shorter and longer transects and transects with different features. All these factors can produce large differences in detectability, which can lead to strong variation in the reliability of large-scale surveys for carnivores. We encourage researchers and conservation managers to apply the most suitable indirect surveying method to estimate abundances by using volunteers and to always investigate further the uncertainties and sources of error in carnivore monitoring programs before their implementation.