From species detection to population size indexing: the use of sign surveys for monitoring a rare and otherwise elusive small mammal

Monitoring the occupancy and abundance of wildlife populations is key to evaluating their conservation status and trends. However, estimating these parameters often involves time- and resource-intensive techniques, which are logistically challenging or even unfeasible for rare and elusive species that occur patchily and in small numbers. Hence, surveys based on field identification of signs (e.g. faeces, footprints) have long been considered a cost-effective alternative in wildlife monitoring, provided they produce reliable detectability and meaningful indices of population abundance. We tested the use of sign surveys for monitoring rare and otherwise elusive small mammals, focusing on the Cabrera vole (Microtus cabrerae) in Portugal. We asked how sampling intensity affects true positive detection of the species, and whether sign abundance is related to population size. We surveyed Cabrera vole latrines in 20 habitat patches known to be occupied, and estimated ‘true’ population size at each patch using DNA-based capture-recapture techniques. We found that a searching rate of ca. 3 min/250 m2 of habitat based on adaptive guided transects was sufficient to provide true positive detection probabilities > 0.85. Sign-based abundance indices were at best moderately correlated with estimates of ‘true’ population size, and even so only for searching rates > 12 min/250 m2. Our study suggests that surveys based on field identification of signs should provide a reliable option to estimate occupancy of Cabrera voles, and possibly of other rare or elusive small mammals, but caution should be exercised when using this approach to infer population size. Where practical constraints prevent the use of more accurate methods, a considerable sampling intensity is needed to reliably index Cabrera vole abundance from sign surveys.


Introduction
Monitoring wildlife populations is critical to understand species responses to environmental change, and to inform conservation planning and management (Nichols and Williams 2006;Lindenmayer and Likens 2010). Typically, wildlife monitoring involves estimates of occupancy or population size within an area of interest (e.g. Yoccoz et al. 2001;Mackenzie et al. 2002). Occupancy can be assessed as the proportion of an area containing a species, based on repeated observations of its presence or absence (or more properly detection or non-detection) at several sites (Mackenzie et al. 2002). Higher quality data for estimating population size usually requires more demanding study designs and sophisticated sampling techniques (e.g. Mills et al. 2000;Royle and Young 2008;O'Brien 2011), which is why many studies often use simplified methods providing indices of population abundance (Engeman 2005;Jareño et al. 2014). Methods for estimating occupancy and population size or abundance can provide meaningful information even when a proportion of occupied patches or individuals remains undetected. However, they still pose a number of challenges related to the appropriate monitoring strategies, scales, and sampling techniques considered (e.g. Pollock et al. 2002;Joseph et al. 2006;Steenweg et al. 2018). This is particularly true for rare or elusive species (i.e. patchily distributed species occurring at low abundance, or secretive and difficult-to-observe species, Thompson 2004), for which accurate population monitoring typically requires time and resource intensive techniques (e.g. live-trapping; camera-trapping; non-invasive DNA sampling). These techniques are often difficult and/or costly to implement over large spatial and temporal scales (Witmer 2005;Perkins et al. 2013), and, in the case of rare or hardly detected species, they frequently provide only sparse data, from which robust estimation of demographic parameters often remains challenging (Engeman 2005).
Although surveys based on field identification of signs (e.g., faeces, footprints, hair, dens) cannot deliver information on important population parameters such as age class structure, sex-ratios and reproduction, they are generally considered a cost-effective alternative to more demanding wildlife monitoring techniques for many terrestrial mammals (Wemmer et al. 1996;Wilson and Delahay 2001;Stanley and Royle 2005). However, to provide comparable and informative inferences on population status and trends, sign surveys require the use of explicit and easy-to-replicate field sampling protocols, based on adequate searching strategies and sampling intensity. Low detectability of signs in occupied patches, resulting for instance from inadequate sampling intensity, may lead to bias and imprecision in occupancy estimates (MacKenzie et al. 2002;Ward et al. 2017), and prevent the use of sign surveys for indexing local population abundance (e.g. Rhodes and Jonzén 2011). Therefore, studies assessing the impact of sampling intensity (e.g. duration of surveys and/or spatial coverage of surveyed area), on sign detection success, and on the utility of sign surveys to infer population size are essential for designing efficient monitoring programs, and to correctly interpret ecological and evolutionary processes (e.g. Holbrook et al. 2015;Carreras-Duro et al. 2016;Bowden et al. 2000).
Here we address these issues for the globally near-threatened, Iberian endemic Cabrera vole (Microtus cabrerae), a low-abundance small mammal with a fragmented distribution, associated with tall and humid grassy habitat patches, and showing a metapopulation-like spatial structure and dynamics in Mediterranean agricultural landscapes (Pita et al. 2014). Because detectability from live-trapping is relatively low, even under large capture efforts (e.g. Sabino-Marques et al. 2018), Cabrera vole population sampling is often based on sign surveys to infer occupancy and relative abundance within habitat patches (San Miguel 1992; Santos et al. 2006; Pita et al. 2007, 2016; Valerio et al. 2020). However, there are still uncertainties regarding, for instance, the survey effort required for detecting the species at low densities, as well as whether sign abundance can be used as a proxy for population size. Notably, it is still largely unknown how sampling intensity may influence species detectability and the strength of inferences regarding population abundance (e.g. Gopalaswamy et al. 2015). Understanding these issues is important for evaluating the effectiveness and potential limitations of surveys based on field identification of signs for monitoring Cabrera vole populations.
We investigated how sampling intensity affects the detection probability (or more precisely the true positive detection probability) of the Cabrera vole in occupied patches from Mediterranean farmland, and whether sign abundance provides a proxy to infer 'true' population size at habitat patches. Population size was estimated from capture-recapture (CR) data based on genetic non-invasive sampling (gNIS) of vole faeces, which is known to provide reliable estimates of Cabrera vole demographic parameters (Sabino-Marques et al. 2018; Proença-Ferreira et al. 2019). Despite its cost-effectiveness relative to other sampling methods such as live-trapping (see Ferreira et al. 2018), gNIS still requires too much time and laboratory expense (e.g. DNA extraction kits, PCR, species, sex and individual identification through genotyping faeces) to be easily implemented at large scales (Proença-Ferreira et al. 2019). Testing how much information is lost when using more practical and rapid methods relying solely on field identification of signs is therefore important for researchers and conservation practitioners (Pita et al. 2014).
Overall, we expect that sampling intensity should have a major overriding effect on true positive detection probability of Cabrera voles based on sign surveys, as well as the strength of relationships between sign abundance indices and population size within habitat patches. Specifically, we expect that detection and inferences on population size should improve with increasing sampling intensity at least up to a certain level beyond which there might be no significant improvements in information quality (e.g. Jones 2011; Reynolds et al. 2011;Green et al. 2020). If true, these expectations will affect the way population monitoring programs based on sign surveys should be implemented in order to maximize information gain and minimize sampling costs (Legg and Nagy 2006; Jones 2011).

Study area and species
The study was carried out in south-western Portugal (Fig. 1), where Cabrera voles typically occur in small marginal habitat patches (often < 0.2 ha) amid a matrix of unsuitable agricultural habitats, largely dominated by irrigated crops, pastures and greenhouses (Pita et al. 2007). Within patches, voles are typically grouped in subpopulations or colonies consisting of a few individuals (often < 30 animals/ha, Sabino-Marques et al. 2018), which are usually organized as a monogamous breeding pair and their offspring (Pita et al. 2010, 2014). Individuals generally show strong site fidelity, with average home-ranges around 400 m2 (Pita et al. 2010). Home-ranges are scent-marked by deposition of faeces in latrines containing up to several dozen faeces, which are thought to be related to individual communication for territory defence and mate advertisement (Gomes et al. 2013).

Study design and sampling
The study focused on 20 habitat patches occupied by Cabrera voles (Fig. 1; Table S1 and Fig. S1, Supplementary Information). Habitat patch sizes ranged between ca. 320 and ca. 3184 m2 and were generally dominated by a dense cover of perennial herbs from the genera Juncus, Carex, Scirpus, Agrostis, Festuca, and Briza, among others, together with scattered shrubs mostly from the genera Rubus, Cistus, Ulex, Genista and Dittrichia. Each patch was examined once between December 2016 and February 2018, except during the hottest and driest months (May-August 2017), when population densities and activity of voles tend to be low (Ventura et al. 1998; Pita et al. 2011a; Grácio et al. 2017). In each patch, surveys were resumed within a mean (± SD) of 6.0 ± 1.1 days, so as to assume demographic closure of local populations.
Surveys involved initial mapping of habitat boundaries and the characterization of the internal vegetation structure by visually estimating the percentage cover by herbs and shrubs in randomly placed circular plots of 5 m-radius (see e.g. Peralta et al. 2016). Within each plot we also recorded herb and shrub heights in four sampling points orthogonally located at about 2 m from the plot centre. The number of plots considered in each patch was approximately proportional to its size, varying between 2 and 18 (mean = 8.0 ± 4.8), and in each case the mean measurements were taken to represent vegetation structure of the patches.
Because suitable habitat patches in intensively used farmland are easily identified and delimited, these were taken as our fundamental spatial units for sampling sign abundance and estimating population size. Sign surveys were repeated in each patch on three consecutive days in order to obtain a larger sample size and increase the robustness of our findings. Cabrera vole signs are easily identifiable (particularly based on the size, shape and colour of their faeces), and in our study region (as in most of its distribution range) these signs can hardly be confounded with those of other species (e.g. Garrido-García and Soriguer 2015). In each patch and day, our survey protocol involved intensive searches for Cabrera vole latrines, by slowly walking crouched through the patch and carefully inspecting areas with microhabitats suitable for the species and other more conspicuous signs of vole activity, such as burrows, runways on vegetation, and grass clipping accumulations (see e.g. Santos et al. 2006; -Larena and Lopez 2007; Pita et al. 2011b), where (or near which) latrines tend to be found. This resulted in zigzag-like transects guided by the continuous tracking of those more conspicuous signs, which were then carefully inspected for the presence of latrines. Our sampling protocol is therefore consistent with a guided adaptive sampling (e.g. Ringvall et al. 1998; Ståhl et al. 2000; Maxwell et al. 2012; Pacifici et al. 2016), through which places with conspicuous signs are used as priors to enhance detection effectiveness and get more precise information (in our case, on vole latrines). In these adaptive sign surveys, we considered a latrine as any cluster of ≥ 3 droppings of different ages (indicating reuse at different times), where each dropping is at less than 10 cm from any other dropping from that cluster, and thus at more than 10 cm from any other cluster. We focused on latrines instead of individual droppings because isolated droppings are rarer and do not indicate repeated use of a site by animals (e.g. St-Laurent and Ferron 2008).
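The latrine definition above can be illustrated with a short Python sketch. It is not the procedure used in the field, where latrines were identified by eye. It assumes that each dropping has been recorded with hypothetical planar coordinates in centimetres and a hypothetical age class, and it reads the 10 cm rule as a chaining rule (a dropping joins a cluster if it lies less than 10 cm from at least one member), which is only one possible reading of the definition.

```python
import math

def group_latrines(droppings, max_gap_cm=10.0, min_droppings=3):
    """Group droppings into candidate latrines.

    droppings: list of (x_cm, y_cm, age_class) tuples (hypothetical format).
    Returns a list of clusters, each a list of dropping indices, keeping only
    clusters with >= min_droppings droppings and at least two age classes.
    """
    n = len(droppings)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of droppings closer than the distance threshold.
    for i in range(n):
        for j in range(i + 1, n):
            dx = droppings[i][0] - droppings[j][0]
            dy = droppings[i][1] - droppings[j][1]
            if math.hypot(dx, dy) < max_gap_cm:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)

    latrines = []
    for members in clusters.values():
        ages = {droppings[i][2] for i in members}
        # A latrine needs enough droppings and evidence of reuse over time.
        if len(members) >= min_droppings and len(ages) >= 2:
            latrines.append(members)
    return latrines

# Made-up example: three nearby droppings of mixed age, plus an isolated pair.
example = [(0, 0, "fresh"), (5, 0, "old"), (9, 3, "fresh"),
           (100, 100, "fresh"), (104, 100, "fresh")]
latrines = group_latrines(example)
```

In this made-up example only the first three droppings qualify as a latrine; the isolated pair is too small to count.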
Our adaptive sampling protocol was conducted in each patch and day by 2-3 well-trained, experienced observers (DP, TVF, TM) simultaneously searching different parts of the patch, such that virtually the whole habitat surface was thoroughly covered. The total survey duration within a patch (hereafter TSTime, given as the sum of searching times by each observer) was similar across days, ranging between 20 and 240 min across patches, largely depending on their size (see Fig. S1 and Table S1 in Supplementary Information). Sign sampling intensity (estimated as the ratio of TSTime to patch size) remained, therefore, roughly similar across patches and days, averaging (95% confidence interval) 17.4 (16.4-18.4) min/250 m2 (see Tables 1 and S1 in Supplementary Information). We present sampling intensity rates scaled to 250 m2 habitat units for ease of understanding and replication across patches of variable size, even though our adaptive sampling protocol does not involve any prior field delimitation of fixed area units within habitat patches.
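As a worked illustration of this scaling, the following Python sketch converts a total survey time and a patch area into a sampling intensity per 250 m2, and inverts the relationship to obtain the search time implied by a target intensity. The function names and the example patch are hypothetical; only the 250 m2 reference unit and the formula (TSTime divided by patch size) come from the text.

```python
def sampling_intensity(ts_time_min, patch_area_m2, unit_area_m2=250.0):
    """Sampling intensity in minutes per unit_area_m2 of habitat."""
    if patch_area_m2 <= 0:
        raise ValueError("patch area must be positive")
    return ts_time_min / patch_area_m2 * unit_area_m2

def required_search_time(target_intensity, patch_area_m2, unit_area_m2=250.0):
    """Total search time (min, summed over observers) for a target intensity."""
    if patch_area_m2 <= 0:
        raise ValueError("patch area must be positive")
    return target_intensity * patch_area_m2 / unit_area_m2

# Hypothetical patch of 1000 m2 searched for a total of 70 observer-minutes.
si = sampling_intensity(70, 1000)      # 17.5 min/250 m2
# Time needed to reach 3 min/250 m2 in the smallest study patch (ca. 320 m2).
t_min = required_search_time(3, 320)   # 3.84 min
```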
During sign surveys, genetic non-invasive sampling (gNIS) was performed by collecting vole faeces for species and individual identification (Proença-Ferreira et al. 2019). Specifically, we collected faeces from all latrines that were at least 2 m apart from the nearest collected sample. This strategy was used to increase the chance of recording as many distinct individuals as possible, thereby achieving a reasonable balance between the potential numbers of gNIS-based 'captures' and 'recaptures' to allow population size estimation (Proença-Ferreira et al. 2019). Each sample consisted of up to 12 of the freshest faeces (mean = 5.11 ± 1.68), collected using sterilized tweezers from each latrine into individual 2 mL microtubes containing 96% alcohol. In most cases (> 90%; see Results), latrines were not completely removed, so they remained potentially detectable by each observer throughout the 3-day survey. Therefore, considering that new faeces were certainly being deposited by voles during the survey days, we assumed no time effects on sign detection probability. During faecal sample collection, the chronometers used to time latrine counts were paused, and timing restarted when searches were resumed. To increase the probability of detecting individuals and guarantee that enough material was collected for genetic analyses, faecal sample collection in each patch and sampling day often extended beyond the duration of searches directed to latrine counts. In addition, gNIS was further repeated in each patch a few days later (mean of 3.0 ± 1.1 days) to collect additional faecal samples for genetic identification. While we used roughly the same reference sampling intensity in these additional surveys, the searching strategy was not consistent with the guided sampling protocol used in previous days, often involving additional inspections at lower quality microhabitats (e.g. drier and less vegetated spots). Therefore, we did not use these additional gNIS surveys to derive latrine counts. All faecal samples were stored at −20 °C until DNA extraction and individual genotyping. The numbers of captures and recaptures of individual genotypes in each patch were then used to estimate local population size (e.g. Sabino-Marques et al. 2018; Proença-Ferreira et al. 2019), and to assess how these estimates relate to different sign abundance indices.

Sign abundance indices
Sign abundance indices in each habitat patch were directly obtained from latrine counts, a method frequently used to estimate the relative abundance of small mammals, including voles (e.g. Woodroffe et al. 1990; Bonesi et al. 2002). In addition, in order to integrate information on the spatial distribution of latrines (e.g. Lambin et al. 2000), besides latrine counting, we also estimated the extent of the area where vole latrines were found within patches (hereafter, extent of occurrence), as described in Pita et al. (2016). Briefly, the procedure consists in creating and merging 10 m radius buffers centred on each latrine location (see also Poccock et al. 2003), so defined to provide circular areas close to the mean home range size of the species in the study area (Pita et al. 2010).

Table 1 Sampling intensity considering the total time duration of surveys (TSTime) in 20 patches ranging between 320 and 3184 m2, and respective estimates after successive shortening of TSTime into 10% to 90% fractions (see Table S1 in Supplementary material for detailed estimates per patch). Sampling intensity is given as the rate between total survey duration in minutes and habitat area (scaled to 250 m2 area units). Columns: Survey duration; Sampling intensity (SI).
To assess the influence of sampling intensity on the suitability of sign surveys for inferring local population size, we calculated the number of latrines that would be detected in each patch and day under shorter survey time durations (e.g. Green et al. 2020). Specifically, for each patch and day, we resampled the original survey data after reducing TSTime to 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20% and 10%. This resulted in a total 10 measurements of latrine counts and extent of occurrence per patch and day, overall comprising a tenfold change in sampling intensity (see Tables 1, and S1 in Supplementary Information).
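The resampling step can be sketched in Python as follows. The text does not detail how latrine detections were assigned to shortened surveys, so this sketch assumes that each latrine carries a hypothetical time stamp (minutes elapsed on the paused/resumed search clock when it was found) and counts those found before each cut-off. The function name and the example data are made up for illustration.

```python
def counts_under_truncation(detection_times_min, ts_time_min, percents=None):
    """Latrine counts that would result from shorter surveys.

    detection_times_min: search-clock time (min) at which each latrine was found
        (assumed to be recorded; not a format given in the text).
    ts_time_min: total survey duration (TSTime) in minutes.
    percents: fractions of TSTime to evaluate, as integer percentages.
    Returns a dict mapping percentage -> number of latrines found by then.
    """
    if percents is None:
        percents = list(range(100, 0, -10))  # 100, 90, ..., 10
    counts = {}
    for pct in percents:
        cutoff = ts_time_min * pct / 100.0
        counts[pct] = sum(1 for t in detection_times_min if t <= cutoff)
    return counts

# Made-up survey: six latrines found during a 40 min search.
times = [2, 5, 11, 18, 26, 39]
counts = counts_under_truncation(times, ts_time_min=40)
# Detection/non-detection at each level follows from whether the count is > 0.
detected = {pct: n > 0 for pct, n in counts.items()}
```

The same truncated set of latrine locations could then be used to recompute the extent of occurrence at each level of survey effort.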

DNA extraction and genotyping
DNA was extracted from faecal samples using the E.Z.N.A.® Tissue DNA Kit (OMEGA bio-tek) following the manufacturer's instructions, with an initial digestion step using a lysis washing buffer (Maudet et al. 2004) for 15 min at 56 °C. Only faecal samples with potential for being successfully genotyped, as judged by their apparent freshness, were considered for analysis, with a maximum number of ca. 100 samples per patch. Selected samples were genotyped for a set of 11 microsatellites following a stepwise approach, which involved an initial screening for sample quality based on a set of three loci (see Ferreira et al. 2018 for genotyping details). Samples that failed to amplify this first set were discarded from subsequent analyses. Samples that amplified well were then tested for the remaining set of 8 microsatellites (see Table S2, Supplementary Information). Previous results showed that the set of 11 microsatellites is highly informative in providing accurate individual identification and diversity values. Species ID was confirmed using a small fragment of mitochondrial DNA, Dloop (Alasaad et al. 2011). The samples were also sexed using two small-sized sex chromosome introns (DBX5-S and DBY7-S, Ferreira et al. 2018). To account for genotyping errors (e.g. allele dropout and false alleles) and to obtain a consensus genotype, each multiplex reaction was replicated four times (three times for the sex chromosome intron amplification). PCR reactions were performed in a final volume of 10 μL, consisting of 4 μL of Qiagen© Multiplex PCR Kit Master Mix, 1 μL of DNA, and primer concentrations and thermal profiles according to Ferreira et al. (2018). All products were sequenced on an ABI3130 Capillary Sequencer (Applied Biosystems). The extractions and PCR reactions of the non-invasive samples were performed in physically isolated rooms, and all the equipment used was sterilized with bleach and ethanol and exposed to UV light before and after usage. Aerosol-resistant pipette tips were used, and negative controls were included in each manipulation, maintaining conditions to monitor and reduce the risk of DNA contamination (Beja-Pereira et al. 2009; Barbosa et al. 2013; Costa et al. 2017).
Allele calling of the microsatellite loci and sex chromosome introns was performed using GeneMapper (v.4.0; Applied Biosystems), while Dloop sequences were analysed with Geneious (v.8.0; Kearse et al. 2012). Consensus genotypes for each sample were obtained by analysing all replicate genotypes with the software Gimlet (v.1.3.3, Valière 2002). For genotypes differing by up to two loci or with up to two missing data, additional PCR replicates were performed to try to complete the genotypes and check for genotyping errors. Genotyping error rates were estimated using the software Pedant (Johnson and Haydon 2007), with 10,000 search steps. Since the software only allows the comparison of two replicates, all possible pairwise comparisons were performed and the results were averaged. Sample consensus genotypes were then compared with each other to identify individuals. The criterion used to assign samples to individuals was very strict, with only genotypes differing in more than two alleles assigned as new captures. Less strict criteria were not evaluated here, as these have little impact on population size estimates of the species (Sabino-Marques et al. 2018). The expected heterozygosity (HE) and observed heterozygosity (HO) for each locus were calculated using the software GenAlEx, and overall inbreeding (FIS) was estimated in the program INEST 2.0 (Chybicki and Burczyk 2009) using 2 × 10^5 iterations, with 50 iterations of thinning and a burn-in of 2 × 10^4 iterations.

Population size estimates
To assess Cabrera vole population size at each habitat patch, we pooled the gNIS-based estimated number of genotypes from each sampling day into a single session. We then used the Eggert accumulation curve for capture-recapture (CR) data (Eggert et al. 2003), which is based on the exponential function given as $E(x) = a(1 - e^{bx})$, where x represents the number of genotyped samples, E(x) is the cumulative number of unique genotypes found in x genotyped samples, a is the asymptote of the function that represents the estimated population size, and b is the non-linear slope of the function. The Eggert estimator performs relatively well compared to other popular accumulation curves such as the hyperbolic curve proposed by Kohn et al. (1999), which tends to overestimate population size and result in less precise estimates (Eggert et al. 2003; Frantz et al. 2004), as confirmed in preliminary analyses of our data (not shown here). We used this approach rather than closed-CR because the small sample sizes from most habitat patches prevented the use of algorithms directly estimating recapture probability (e.g. Lukacs and Burnham 2005). Since the order in which the identified genotypes are added may influence the shape of the accumulation curves (Eggert et al. 2003), we randomized each dataset 100 times, and fitted Eggert's curve to each randomized dataset using least squares regression. Estimates of a (population size) for each dataset were taken as the average of all replicates, and respective point estimates were used as reference of the 'true' population size in each patch.
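The accumulation-curve procedure can be illustrated with a minimal Python sketch. It is not the code used in the study. For a fixed b, the least-squares value of a has a closed form, so the sketch searches over b alone (a coarse log-spaced grid followed by a golden-section refinement). The function names, the search bounds for b and the example data are assumptions made for illustration.

```python
import math
import random

def fit_eggert(x, y, b_min=1e-4, b_max=5.0, n_grid=200, n_refine=60):
    """Least-squares fit of E(x) = a * (1 - exp(b * x)) with b < 0.

    Returns (a, b). The search bounds on |b| are illustrative assumptions.
    """
    def profile(u):
        # u = log(-b); for this b the best a is available in closed form.
        b = -math.exp(u)
        f = [1.0 - math.exp(b * xi) for xi in x]
        a = sum(fi * yi for fi, yi in zip(f, y)) / sum(fi * fi for fi in f)
        sse = sum((yi - a * fi) ** 2 for fi, yi in zip(f, y))
        return sse, a, b

    lo_u, hi_u = math.log(b_min), math.log(b_max)
    step = (hi_u - lo_u) / (n_grid - 1)
    grid = [lo_u + i * step for i in range(n_grid)]
    k = min(range(n_grid), key=lambda i: profile(grid[i])[0])
    lo, hi = grid[max(k - 1, 0)], grid[min(k + 1, n_grid - 1)]

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):  # golden-section refinement around the best grid point
        c = hi - phi * (hi - lo)
        d = lo + phi * (hi - lo)
        if profile(c)[0] < profile(d)[0]:
            hi = d
        else:
            lo = c
    _, a, b = profile((lo + hi) / 2.0)
    return a, b

def eggert_population_size(sample_genotypes, n_orders=100, seed=1):
    """Mean asymptote a over random orderings of the genotyped samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_orders):
        order = list(sample_genotypes)
        rng.shuffle(order)
        seen, cumulative = set(), []
        for genotype in order:
            seen.add(genotype)
            cumulative.append(len(seen))  # unique genotypes after each sample
        a, _b = fit_eggert(list(range(1, len(order) + 1)), cumulative)
        estimates.append(a)
    return sum(estimates) / len(estimates)

# Made-up patch: 6 individuals, each genotyped 8 times (48 samples).
samples = [ind for ind in "ABCDEF" for _ in range(8)]
n_hat = eggert_population_size(samples)
```

With a made-up dataset in which every individual is recaptured many times, the estimate should lie close to the number of distinct genotypes, as reported for most patches in the study.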

Modelling
We used Generalised Linear Mixed Effects Models (GLMM) implemented in the 'lme4' R package (Bates et al. 2015; R Core Team 2020) to model species detection probability based on detection/non-detection data of the species across different sampling intensities, and to assess the effects of sampling intensity on the relationships between population size estimates and abundance indices derived from latrine counts.
We considered detection probability as the probability that a species will be detected at a site, given its presence, i.e. the probability that a species both occupies and is detected in a survey (as in Mackenzie et al. 2002, Mackenzie and, which essentially corresponds to true positive detection probability. This definition assumes that species are never falsely detected (no false positives) and that they may or may not be detected at a site when present (true positives and false negatives, respectively) (Mackenzie et al. 2002). Given that false negative detection probability is the complement of the true positive detection probability, and that false positive detection probability is the complement of true negative detection probability (e.g. Miller et al. 2013), when assuming no false positives in a species survey, the true positive detection probability may fairly describe detection uncertainty (Mackenzie et al. 2002). In our study, because the data were conditioned on occupied patches, and a non-detection was assumed to represent the overlooking of vole signs rather than a true absence, we directly modelled detection events using GLMM, and did not need to account for possible non-occurrence based on a site-occupancy model (MacKenzie et al. 2002; see e.g. Chen et al. 2009).
True-positive detection probability of vole latrines was modelled using a binomial error distribution (logit link function) and considering the maximal random-effects structure justified by our sampling design, so as to better control variation, increase the power of the analyses, and optimize generalization of the findings (e.g. Gillies et al. 2006). Therefore, we included in the random component the patch and the month of sampling, as well as the identity of the observer that first detected Cabrera vole signs in each patch and day. We then built a set of models including as fixed factors the main and additive effects of sampling intensity and the variables describing vegetation structure within patches, which may also affect true detection probability (e.g. higher shrub cover may prevent or slow the progression of observers across the habitat and therefore affect sign searching efficiency), while also considering the model including only the random effects (null model). To avoid multicollinearity among vegetation variables, we used a principal component analysis (PCA) to quantify main patterns of variation among patches, and used the results as predictors of true-positive detections. We implemented the PCA using the 'prcomp' function in R and used the Kaiser criterion (Legendre and Legendre 2012) to keep only principal component axes with associated eigenvalues > 1. We then performed a varimax rotation of the retained PCA axes using the 'principal' function in the R package 'psych' (Revelle 2015).
The support of each candidate model was based on the Akaike Information Criterion corrected for small samples (AICc; Burnham and Anderson 2002), with ΔAICc < 2 indicating equally supported models (Burnham and Anderson 2002). We also assessed the conditional probability of each model being the best model by estimating the respective AICc weights (Burnham and Anderson 2002). The goodness-of-fit of the best model was assessed by computing marginal and conditional R2 for GLMM (Nakagawa and Schielzeth 2013), using the MuMIn package (Barton 2018). Marginal R2 shows the proportion of variance explained by the fixed effects, while conditional R2 provides the proportion of variance explained by both fixed and random effects.
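For reference, the quantities used in this model comparison can be computed as in the following Python sketch, which uses the standard definitions AICc = −2 log L + 2k + 2k(k + 1)/(n − k − 1) and Akaike weights proportional to exp(−ΔAICc/2). The analyses themselves were run in R; the function names and the AICc values in the example are hypothetical and are not results from this study.

```python
import math

def aicc(log_lik, k, n):
    """AICc from a model log-likelihood, k estimated parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def akaike_weights(aicc_values):
    """Return (delta_aicc, weights) for a set of candidate models."""
    best = min(aicc_values)
    deltas = [v - best for v in aicc_values]
    rel_lik = [math.exp(-0.5 * d) for d in deltas]  # relative likelihoods
    total = sum(rel_lik)
    return deltas, [r / total for r in rel_lik]

# Hypothetical AICc values for four candidate models.
deltas, weights = akaike_weights([100.0, 104.5, 110.0, 120.0])
```

Models with ΔAICc < 2 would be treated as equally supported under the criterion described above.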
The relationship between population size estimates (rounded to the nearest integer and taken as the dependent variable) and each sign abundance index calculated for each sampling day under variable sampling intensities (fixed effects) was modelled with the negative binomial distribution (log link function), as observations were overdispersed with respect to the Poisson distribution (the residual deviance of the null model decreased from 334.7 to 306.8). Initial models included sampling month and patch as random effects. However, patch effects were removed from the random structure to avoid singular fits (Bates et al. 2015). Models including each abundance index were compared with the respective null model based on AICc (ΔAICc and AICc weights). Model fit was assessed by computing marginal and conditional R2 for GLMM.

Results
We genotyped a total of 101 different Cabrera voles (58 females and 43 males), with 5.01 ± 2.96 (1-12) individuals genotyped per patch. Population size estimates based on accumulation curves were computed for all but one patch, which appeared to be occupied only by a solitary male that was genotyped 46 times (see Table S3, Supplementary Information). Overall, the population size estimates obtained through Eggert's accumulation curve were notably close to the number of vole genotypes enumerated through gNIS, totalling 118.3 ± 4.0 animals, with 6.0 ± 4.1 (1-16) individuals per patch (see Table S3 and Fig. S2, Supplementary Information).
The PCA on variables describing vegetation structure produced a single principal component with an eigenvalue of 2.13, explaining 53% of the variation in vegetation data. This principal component (PC-Veg) described a gradient of vegetation structure contrasting patches with higher shrub cover and height with those largely dominated by a well-developed herbaceous layer (Table 2), and was considered, together with sampling intensity, as a predictor of true positive detection probability. From the set of four candidate models (Table 3), the one including only the effect of sampling intensity received the greatest support, with an AICc more than 4 units lower than the second most supported model, which also included PC-Veg (Table 3). The top-ranked model returned a marginal R2 of 94%, and revealed significant positive effects of sampling intensity on true positive detection probability (Table 3). This model suggested that a sampling intensity rate of ca. 3 min/250 m2 provides a true positive detection probability always > 0.85 (lower 95% CI), while shorter searching times per unit area result in lower and much more variable detectability of the species when present. According to this model, sampling intensities higher than ca. 6 min/250 m2 can achieve virtually perfect detectability of the species' true presence in a given patch (Fig. 2), suggesting that non-detections under such sampling intensity should represent true negatives.
Sign abundance indices and estimates of local population size were in general positively correlated, although the strength of correlation was strongly dependent on sampling intensity. In particular, our results suggest that correlation between abundance indices and population size gets stronger with increasing sampling intensity up to searching rates around 12 min/250 m2, tending to stabilize or increase much more slowly thereafter (Table 4). Both indices provided moderately strong relationships with estimates of population size only under greater sampling intensity, with slopes reaching 0.34 ± 0.06 and 0.40 ± 0.06 in the case of latrine counts and extent of occurrence, respectively (Table 3). In addition, the extent of occurrence explained a higher proportion of the variance in population size estimates than latrine counts, with marginal R2 reaching 0.39 and 0.31, respectively.

Discussion
The development of cost-effective methods based on field identification of animal signs for monitoring wildlife populations has long been a priority in ecological and conservation studies (Witmer 2005). However, for most species such methods are largely lacking or, when available, they are often poorly calibrated and tested (e.g. Hopkins and Kennedy 2004; Gervais 2010), making it difficult to properly infer population changes, determine species status, and inform conservation management (Thompson 2004). Focusing on Cabrera voles, our study provides evidence that, where other vole species producing similar signs are absent (i.e. no false positives are likely to occur), and when other more accurate methods like live-trapping or gNIS are not available, sign surveys may provide a useful low-cost alternative for monitoring occupancy and inferring local population abundance. However, our study also showed that, when employing sampling protocols based on continuous zigzag-like tracking paths adapted to improve vole latrine detectability, the rate of time spent searching for signs per unit area strongly affects the reliability of this method. This suggests that differences in population size indices may arise due to differences in sampling intensity, thus affecting the quality of inferences on population status and trends (Bowden et al. 2000; Holbrook et al. 2015; Carreras-Duro et al. 2016). Our study thus supports the idea that standardized survey protocols and time-based sampling intensities designed to enhance species detectability and population size indexing from sign surveys are needed to provide comparable and informative population assessments of Cabrera voles, and other similar species, across space and time (Yoccoz et al. 2001; Pollock et al. 2002). Therefore, we believe our approach provides important insights on how sign-based population monitoring should be implemented in a cost-effective way, particularly as regards the optimal allocation of survey time, which is critical to designing and planning monitoring studies over large spatial and temporal scales.

Cabrera vole detectability
Cabrera vole population assessments are often limited to single-visit presence-absence surveys within suitable habitat patches, based on sign searches conducted within short (though often poorly defined) time intervals (e.g. Santos et al. 2006;Pita et al. 2007;Valerio et al. 2020). Although such surveys may raise concerns about detectability, our study confirmed that, where the species is present, its signs may be readily detected within the very first minutes of sampling, suggesting that this method provides a reliable approach for studying species distribution and occupancy patterns, at least where other species producing similar signs are absent (no false positives). However, our results also showed that, as expected, sampling intensity had a major influence on the true positive detection probability of Cabrera vole signs, overriding other potential sources of variability, such as vegetation structure. This suggests that careful consideration of sampling intensity is needed to enhance sign detection probability, and hence the value of sign surveys in monitoring programs based on species detection histories. While standard methods accounting for observation error in occupancy modelling can efficiently deal with imperfect detection, low true positive detection rates may result in less accurate and precise estimates of species occupancy (Mackenzie et al. 2002;Mackenzie and Royle 2005), thus making the choice of sampling intensity a critical step in such studies. Several studies have recommended that survey duration in occupancy studies be set so that the probability of detection exceeds 0.8 (e.g. Long et al. 2008;Steenweg et al. 2018).
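The effect of per-visit detection probability on occupancy inference can be illustrated with a short numerical sketch. This is not part of the study's analysis: it assumes a constant true positive detection probability p per visit, independent visits and no false positives, and the function names and example values (true occupancy of 0.5; p of 0.85 and 0.40) are hypothetical.

```python
def cumulative_detection(p: float, visits: int = 1) -> float:
    """Probability of at least one detection in `visits` independent surveys
    of an occupied patch, given per-visit detection probability `p`."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if visits < 1:
        raise ValueError("visits must be >= 1")
    return 1.0 - (1.0 - p) ** visits


def expected_naive_occupancy(psi: float, p: float, visits: int = 1) -> float:
    """Expected proportion of patches with at least one detection when a
    fraction `psi` is truly occupied (no false positives assumed)."""
    return psi * cumulative_detection(p, visits)


def visits_needed(p: float, target: float = 0.8) -> int:
    """Smallest number of visits whose cumulative detection reaches `target`."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie in (0, 1)")
    visits = 1
    while cumulative_detection(p, visits) < target:
        visits += 1
    return visits


if __name__ == "__main__":
    # Illustrative (assumed) per-visit detection probabilities.
    for p in (0.85, 0.40):
        print(p, expected_naive_occupancy(0.5, p), visits_needed(p))
```

Under these assumed values, a single visit with p = 0.85 would be expected to record an occupancy of about 0.425 when the true value is 0.5, whereas p = 0.40 would yield about 0.2 and would require four visits for the cumulative detection probability to exceed 0.8.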
Our results indicated that an experienced observer surveying at a sampling intensity of up to 3 min/250m2 of suitable habitat will likely achieve a true positive detectability > 0.85 (lower 95% CI), suggesting that this sampling intensity could be used as a reference threshold for guiding vole sign surveys based on detection/non-detection data, assuming no false positives. Sampling intensities of up to 6 min/250m2 apparently guarantee virtually perfect true positive detectability. We stress, however, that achieving nearly perfect detectability for monitoring occupancy may be desirable only in studies relying on naïve occupancy estimates, or in single-location surveys focused, for instance, on determining whether voles are present at a given patch where some management activity is likely to affect local habitat quantity or quality (de Solla et al. 2005). Otherwise, for distributional studies conducted over large scales, a lower level of detectability involving sampling

Population size indices
Our results suggest that both latrine counts and the estimated area used by voles within patches (extent of occurrence) correlated positively with population size estimates. These relationships were weak at low sampling intensities, but strengthened with survey durations of up to ca. 12 min/250m2, tending to stabilize at moderate levels thereafter. This indicates that sign surveys based on guided sampling under low sampling intensity should not be used for indexing Cabrera vole population size. The optimal sampling intensity should thus be ca. 12 min/250m2, above which there may be limited gains in population size information.
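As a rough planning aid, the searching rates discussed above can be converted into total search time for a patch of known area. The sketch below is only an illustration: the function name, the objective labels and the 1,000 m2 example patch are hypothetical, while the reference rates (3 min/250m2 for detection and 12 min/250m2 for abundance indexing) are those suggested by our results.

```python
REFERENCE_AREA_M2 = 250.0

# Searching rates in minutes per 250 m2 of suitable habitat.
# Labels are hypothetical; values follow the rates discussed in the text.
RATES_MIN_PER_250M2 = {
    "detection": 3.0,         # true positive detectability > 0.85
    "abundance_index": 12.0,  # moderate correlation with population size
}


def search_minutes(patch_area_m2: float, objective: str) -> float:
    """Total search time (minutes) for a patch, scaled linearly with area."""
    if patch_area_m2 <= 0:
        raise ValueError("patch_area_m2 must be positive")
    if objective not in RATES_MIN_PER_250M2:
        raise KeyError(f"unknown objective: {objective!r}")
    return RATES_MIN_PER_250M2[objective] * patch_area_m2 / REFERENCE_AREA_M2


if __name__ == "__main__":
    area = 1000.0  # hypothetical patch area in m2
    for objective in RATES_MIN_PER_250M2:
        print(objective, search_minutes(area, objective), "min")
```

For the hypothetical 1,000 m2 patch, this gives 12 min of searching for detection and 48 min for abundance indexing, which makes the fourfold difference in survey effort between the two objectives explicit. The linear scaling with area is an assumption of the sketch.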
Our results also indicate that the spatial extent of latrines within patches (extent of occurrence) is more strongly related to the estimates of population size than latrine counts are, and so the former might be preferred in Cabrera vole monitoring. This is probably because the estimated area occupied by voles integrates information on the spatial distribution of latrines (e.g. Lambin et al. 2000), thus likely accounting to some extent for sources of variability related to the non-uniform distribution of voles within patches (St-Laurent and Ferron 2008), individual variation in marking behaviour within occupied territories (e.g. Ferkin et al. 2004), or possible heterogeneity among individuals in sign detection (Watkins et al. 2010).
Despite the significant relationship of Cabrera vole abundance with both latrine counts and extent of occurrence, the magnitude of this relationship was at best moderate. The reasons are uncertain, but besides the individual variation and social context that may affect individuals' spatial marking behaviour (Ferkin et al. 2004), the uncertainties in population size estimates based on gNIS and asymptotic estimators may also have affected the strength of the relationships found. Although gNIS was based on an optimized protocol for reducing genotyping error rates, a large number of extracted samples did not produce results for one or more microsatellites, or were contaminated, resulting in a relatively low genotyping success, which can affect population estimates (Waits and Leberg 2000;Luikart et al. 2010). On the other hand, asymptotic approaches provide the simplest population size estimators from CR data, relying on overly naive assumptions (e.g. random spatial distribution, homogeneous capture probabilities in space and time) that are seldom met in natural systems (Waits and Leberg 2000;Miller et al. 2005;Luikart et al. 2010). However, since the total number of individuals identified in each patch was very close to the estimates produced by asymptotic estimators, we assumed that our sampling procedures and rarefaction-based methods provided a reliable basis to infer local population size, potentially allowing the identification of most (if not all) of the individuals present in each patch. Nearly complete censuses are considered a useful approach for inferring abundance of small populations (Gerber et al. 2014), because traditional CR-based estimation methods are difficult to apply at the patch level, as previously shown for the Cabrera vole (e.g. Fernández-Salvador et al. 2005;Ferreira et al. 2018;Sabino-Marques et al. 2018).

[Figure caption fragment: ... Table 1). In each case, lines are mean values; grey areas are 95% confidence intervals; circles are observed values (see also Fig. S3 in Supplementary Information).]

Conclusions and implications
Overall, our study suggests that whenever field sign identification is made with negligible error (no false positives), and more accurate tools are not available, sign searches by experienced observers may provide an adequate alternative to help understand the distribution changes and occupancy dynamics of small mammals like the Cabrera vole, making this method suitable for population monitoring across large spatial and temporal scales (e.g. Pita et al. 2007, 2016). Furthermore, our work also shows that, despite being potentially time-consuming, sign surveys based on the abundance indices analysed in this study may also provide a useful approximation of Cabrera vole local population size, at least in systems where habitat patches are mostly small, easily recognised and delimited, and where other vole species producing similar signs are absent, as is the case in our study area. However, whenever detailed information on local population size is needed, gNIS combined with CR modelling should provide a more adequate approach (e.g. Sabino-Marques et al. 2018;Proença-Ferreira et al. 2019). We thus stress that the decision of whether or not to use abundance indices based on sign searches in any particular study should involve a cost-benefit analysis accounting for the specific study objectives (e.g. Pollock et al. 2002;Falcy et al. 2016;Ferreira et al. 2018).