Crested porcupine (Hystrix cristata) abundance estimation using Bayesian methods: first data from a highly agricultural environment in central Italy

Wildlife abundance estimation is one of the key components in conservation biology. Bayesian frameworks are widely used to adjust the potential biases derived by data collected in the field, as they can increase the precision of model parameter as a consequence of the combination of previous pieces of knowledge (priors) combined with data collected in the field to produce an a-posteriori distribution. Capture-recapture is one of the most common techniques used to assess animal abundance. However, the implementation with camera traps requires that animals present unique phenotypic traits for individual-based recognition. The crested porcupine Hystrix cristata is a semi-fossorial rodent with a continuous, but patchily distribution across Italy. Despite the species does not present evident individual-specific phenotypic traits, the information gathered using presence-only data obtained from camera traps, opportunistic observations, and road-killing events could be used to provide a rough estimate of the species abundance within an area. The main purpose of the present research was hence to provide the first preliminary estimate of the abundance of the crested porcupine in central Italy using presence-only data obtained from the above different monitoring methods. The results obtained estimated an average minimum number of 1803 individuals (SD = 26.89, CI 95% = 1750–1855) within an area covering about 17,111 km2. Since the porcupine is considered as “potentially problematic” because of damages to croplands and riverbanks, assessing its abundance is even more important to delineate adequate conservation and management actions to limit the potential trade-off effects over human activities.


Introduction
Estimating the abundance of wild species is a key issue in animal ecology. Because humans cannot constantly observe and/or monitor animals, assessing wildlife abundance is difficult and the field methods used to achieve this goal may be prone to bias (Iijima 2020). Block count and aerial surveys are some of the most common methods used to assess the abundance of wild species. Nevertheless, it is conceivable that data obtained from direct counts may contain notable variation depending on population dynamic and distribution, in turn, linked to the biological and ecological features of the target species (e.g., reproduction, migration) (Iijima 2020). Bayesian analyses are frequently used to overcome the potential imprecisions derived by field-work data collection (Iijima 2020). Modeling using Bayesian statistical inferences can increase the precision of model parameter because of the combination of previous knowledge (known as priors) with data collected in the field to produce a-posteriori distribution (McCarthy and Masters 2005;Martin et al. 2013; Morris et al. 2015). Capture-recapture is one of the most common methods used to assess animal abundance and it is frequently implemented with Bayesian analyses (i.e., spatially explicit capture-recapture models) (Efford 2004;Royle et al. 2014) to estimate the number of individuals inhabiting an area (Royle et al. 2017;Romairone et al. 2018). Although capture-recapture methods were originally developed to mark individuals, this approach can be extended without the need to physically capture animals (Royle et al. 2017). For example, throughout the use of camera traps, individualbased recognition can be done for those species presenting specific phenotypic traits in the form of stripes or spots (e.g., tigers, leopards) (Royle et al. 2017). Nevertheless, for species in which such traits are absent or hardly recognizable, individual identification is impractical and potentially prone to considerable biases. In these cases, assessing species abundance at a site without individual recognition could be less reliable and requires the use of alternative analyses. These so-called presence-only data are often collected during opportunistic surveys and are generally stored in online data bases and/or museums (Dorazio 2014). These data can be gathered using different opportunistic (e.g., road-killing events, citizen-science platforms) and/or systematic monitoring methods (e.g., radio-telemetry, camera traps) (Abadi et al. 2012;Wilson et al. 2016). However, the main limit is that they do not take into account for the errors in the detection of individuals (Chen et al. 2013). In spite of this limitation, analyses of presence-only data are partially motivated by the difficulties and expenses of conducting planned surveys of wild populations (Dorazio 2014) and could be used to provide a rough abundance estimate of those species, like crested porcupines Hystrix cristata, in which the absence of specific phenotypic markers do not allow for an accurate individual-based recognition.
The crested porcupine in Italy is strictly protected under the National Law 157/1992, whereas at the European level it is included within the Annex II of the Bern Convention (1979) and the Annex IV of the "Habitat" Directive 1992/43/ EEC. Before 1970, in the Italian Peninsula, it was present only in the central-southern regions facing the Tyrrhenian coast, and in the southern areas facing the Adriatic Sea (Toschi 1965). In Italy, the species is well and continuously distributed in the central regions (including Tuscany), while both north and southward presents a much more fragmented distribution . Nonetheless, to the best of our knowledge, to date information regarding its abundance at both small and large scale are still widely lacking. Factors including the abandonment of rural areas favored the species expansion because of the increase of woodland habitats, particularly suitable for porcupines especially for food and denning (Monetti et al. 2005;Lovari et al. 2013;Mori et al. 2014aMori et al. , 2017Mori and Fattorini 2019). In addition, although agricultural habitats are generally avoided, extensive cultivations may provide important food resources (especially during the warmer months) (Lovari et al. 2013;Mori et al. 2014a), as long as they are intermixed with patches of natural or semi-natural vegetated areas (Torretta et al. 2021). As a consequence, although crop damages are occasionally reported (Laurenzi et al. 2016), the crested porcupine is widely considered as one of the main agricultural pests and is frequently subjected to poaching (Laurenzi et al. 2016;Cerri et al. 2017). Therefore, assessing the species abundance (especially in highly human-altered habitats) assumes remarkable importance to delineate adequate management and conservation strategies.
Using presence-only data obtained from camera traps, opportunistic observations, and road-killing events, here we present the first attempt to provide a rough estimate of the crested porcupine abundance within a highly agricultural environment in central Italy.

Study area and data collection
The study was carried out in the Tuscany region (central Italy), within an area of approximately 18,073 km 2 . Such an area was defined through a 100% Minimum Convex Polygon (100% MCP) and starting from the coordinates identifying camera trap locations and independent opportunistic observations of porcupines ( Fig. 1).
For what concerns camera-trapping, we used independent presence-only data collected from previous studies realized in the whole Tuscany region during the year 2013. Because in these studies cameras were placed to detect multiple species and researches were not tailored to specifically address porcupine abundance (data available on www. inatu ralist. org, and published works: e.g., Mori et al. 2014b;Franchini et al. 2017;Mori and Menchetti 2019;Viviano et al. 2021), there is a mismatch in terms of number of cameras placed in each province. Overall, we used information obtained from 134 cameras located in nine provinces: Arezzo (n = 19), Florence (n = 19), Grosseto (n = 67), Livorno (n = 3), Lucca (n = 6), Massa Carrara (n = 1), Pisa (n = 4), Pistoia (n = 1), and Siena (n = 14). Because cameras were distributed across the whole region ( Fig. 1), the average distance between them was of about 74,749 m (Standard Deviation (SD) = 41,887 m). Cameras were placed along trails and/or in the near proximity of denning sites at a height of 30-50 cm above the ground, activated 24 h per day, and checked every 15 days to download data and check for their functionality. The overall monitoring effort was of 4319 camera-trap/night recording an average number of 38 (SD = 22) porcupine independent records per camera. To be defined as an independent event, images of porcupines were filtered considering a time span of 30 min. between pictures at the same site (Meek et al. 2014).

Spatial analyses
Spatial analyses were conducted using the QGIS Software (v. 3.18). Two buffers of 180 and 4896 m, respectively, were applied to each coordinate representing independent opportunistic observations and camera trap locations. The decision to apply such buffer sizes was taken considering the minimal and maximal dispersal distance known for both adult and sub-adult crested porcupines in central Italy (Mori and Fattorini 2019). Within each buffer, we counted the number of points representing either independent camera trap positive detections and/or independent opportunistic observations. Those buffers including only one point were considered as representative of only one individual roaming in the buffer area. On the contrary, more points falling within the same buffer were considered as spatially autocorrelated. Therefore, to avoid potential pseudo-replication biases, only one individual was assumed to roam within the buffer area.
For what concerns the habitat composition analyses, we used the 100% MCP calculated for each area (see "Statistical analyses" section and Fig. 2 for details) and we extracted the percentage of each land cover class starting from the Corine Land Cover 2012 (CLC) (EEA 2018) of the Tuscany region (excluding all the islands) and represented through a 100-m raster layer. We then re-classified the original CLC categories into five macro-habitat categories: agricultural areas, urban areas, canopy-covered and open areas (i.e., woodlands, forests, shrublands, grasslands), water bodies, and cliffs, glaciers, and coastal areas (Table 1). Canopy-covered and open areas were considered as "suitable habitats," agricultural areas as "moderately suitable habitats," while urban areas, water bodies, and cliffs, glaciers, and coastal areas as "unsuitable habitats" (Monetti et al. 2005;

Statistical analyses
To estimate the abundance of porcupines in each monitoring area (A1, A2, and A3 -see Online Resources, "Results" section, and Fig. 2 for details) we used a simple Bayesian model with just one parameter and considering only the surfaces covered by suitable and moderately suitable habitats for porcupines. The choice to divide the whole study area into three main areas was taken to best account for the spatial mismatch in terms of camera-trap locations and opportunistic observations, which may have strongly biased the calculation of the prior distribution and, consequently, the posterior one. Our count data, C, which indicate the likelihood, come from a Poisson distribution expressed through a parameter λ: λ represents the expected number of porcupines in the area surveyed and is calculated from the number of animals counted (or estimated) within each monitoring area, N, and the proportion of the area surveyed, a: The prior for N is represented by a Gamma distribution with parameters S (shape) and R (rate):  The Gamma distribution represents the conjugate prior distribution for a Poisson likelihood. Consequently, even the posterior distribution will be a Gamma distribution.
To obtain a better rough estimate of the number of porcupines inhabiting the suitable and moderately suitable habitats of the whole monitoring area (A3 - Fig. 2c), the analysis was divided into three steps (i.e., repeated for each of the three different monitoring areas) and using first a mathematical approach (to assess for the relationship among likelihood, prior, and posterior distributions), and then a computational one (to assess the goodness-of-fit of each model), through the implementation of the Software R (v. 4.0.1) (R Development Core Team 2021) in JAGS (Plummer 2003) using the "jagsUI" package (Kellner and Meredith 2021): A1. Mathematical approach: we used a 100% MCP to calculate the area defined by each coordinate representing independent opportunistic observations and camera trap locations in the northern area falling within the provinces of Grosseto and Siena (A1 - Fig. 2a). Another 100% MCP was used to calculate the area defined by each coordinate representing the recovery sites of each road-kill porcupine (A1.1 - Fig. 2a) falling within the same larger area (i.e., A1 - Fig. 2a). λ was calculated starting from the number of road-kill porcupines multiplied by the ratio between A1 and A1.1 (which expresses the proportion of the area surveyed) and only considering the surface covered by suitable and moderately suitable habitats for the species. We then created a vector defining the interval input, i.e., the minimum and the maximum number of crested porcupines estimated within the northern area of the provinces of Grosseto and Siena (i.e., A1), obtained from the two buffers (i.e., 180 and 4896 m, respectively) applied to those coordinates representing both camera trap locations and independent opportunistic observations defining A1. The minimum value obtained from the interval was subtracted to λ, while the maximum value was added to the same. This was done in order to obtain a rough estimate of the range of individuals inhabiting A1. Since the number of individuals obtained from λ may be overestimated while the range of individuals obtained from the two buffers is most likely underestimated, using this approach we partially reduced such a bias. Thereafter, we calculated the mean and the standard deviation of the range. These data were used as input information to define both the shape and rate of the prior distribution. The combined information obtained from the likelihood and prior distribution were then used to produce the posterior distribution.
N ∼ Gamma (S, R) Computational approach: the model was run in JAGS using three chains, 20,000 iterations, zero burnin, and one thinning. The diagnostic plot used to assess the goodness-of-fit of the model was visualized using the diagPlot function, implemented in the "wiqid" package (Meredith 2017). The goodness-of-fit of the model was then evaluated considering the Rhat index, which represents the scale reduction factor indicating convergence between chains (Gelman et al. 2004;Gelman and Hill 2007;Kruschke 2014) and the MCEpc value, which expresses the error in the mean estimation (percentage value) compared to the target distribution (Lunn et al. 2013).
Step-by-step calculations are provided in the Online Resource 1. A2. Mathematical approach: we used a 100% MCP to calculate the area defined by each coordinate representing independent opportunistic observations and camera trap locations in the whole area including the provinces of Grosseto and Siena (A2 - Fig. 2b). The new λ was calculated starting from the average number of individuals estimated within A1 (posterior distribution obtained from the first-step analysis) multiplied by the ratio between A2 and A1 (proportion of area surveyed) and only considering the surface covered by suitable and moderately suitable habitats for the species. We then created a vector defining the range which, in turn, refers to the minimum and maximum number of crested porcupines estimated within the southern area of the provinces of Grosseto and Siena ( Fig. 2a -light blue dots), obtained from the two buffers applied to those coordinates representing both camera trap locations and independent opportunistic observations. As done in step 1, the minimum value of the range was subtracted to the λ, while the maximum value was added to the same. Subsequently, we calculated the mean and the standard deviation of the range. These data were used as input information to define both the shape and rate of the prior distribution and implemented with the information obtained from the likelihood to obtain the posterior distribution.
Computational approach: the model was run in JAGS using three chains, 20,000 iterations, zero burnin, and one thinning. The diagnostic plot was visualized using the diagPlot function, implemented in the wiqid package (Meredith 2017), and the goodness-offit of the model was assessed based on both the Rhat index (Gelman et al. 2004;Gelman and Hill 2007;Kruschke 2014) and the MCEpc value (Lunn et al. 2013).
Step-by-step calculations are provided in the Online Resource 2. A3. Mathematical approach: we used a 100% MCP to calculate the area defined by each coordinate representing independent opportunistic observations and camera trap locations in the whole monitoring area (A3 - Fig. 3c). As done in step 2, the new λ was calculated starting from the average number of individuals estimated within A2 (posterior distribution obtained from the second-step analysis) multiplied by the ratio between A3 and A2 (proportion of area surveyed) and only referring to the surface covered by suitable and moderately suitable habitats for the species. We then created a vector defining the range which, in turn, refers to the minimum and maximum number of crested porcupines estimated within the remaining monitoring area (Fig. 2b -purple dots), obtained from the two buffers applied to those coordinates representing both camera trap locations and independent opportunistic observations. The minimum value of the range was subtracted to λ, while the maximum value was added to the same. Afterward, we calculated the mean and the standard deviation of the range. These data were used as input information to define both the shape and rate of the prior distribution and implemented with the likelihood information to produce the posterior distribution.
Computational approach: the model was run in JAGS using three chains, 20,000 iterations, zero burnin, and one thinning. The diagnostic plot was visualized using the diagPlot function, implemented in the wiqid package (Meredith 2017), and the goodness-offit of the model was assessed based on both the Rhat index (Gelman et al. 2004;Gelman and Hill 2007;Kruschke 2014) and the MCEpc value (Lunn et al. 2013).
Step-by-step calculations are provided in the Online Resource 3.

Results
The habitat analysis revealed that, in each area (A1.1, A1, A2, or A3), agriculture constituted much of the land covered, being preponderant in A2 and A3 while reaching the "second place" only in A1 and A1.1. Furthermore, in A1, such a value is comparable to the percentage of land covered by canopy-covered and open areas (Table 1).
Following the subdivision provided in the "Statistical analyses" section, results are presented in three steps: A1. Within A1.1 (278 km 2 ) we opportunistically collected 27 carcasses of road-kill porcupines. Starting from the mathematical product between these 27 animals and the ratio between the area covered by suitable and moderately suitable habitats found in A1 (1996.17 km 2 ) and A1.1 (273.97 km 2 ) (proportion of area surveyed), we obtained a λ corresponding to 197 individuals. From the application of the two buffers (180 and 4896 m, respectively) to each coordinate representing independent opportunistic observations and camera trap locations falling within A1, we obtained a range spanning from 17 to 40 individuals. By subtracting the lower value (i.e., 17) to λ and adding the higher one (i.e., 40) to the same, we obtained an interval ranging from 180 to 237 individuals. The model output obtained from both the mathematical and computational approach (showing the posterior distribution) allowed us to estimate an average minimum number of 207 porcupines (SD = 15.41, 95% confidence interval (95% CI) = 176-237) within the suitable and moderately suitable habitats in A1 (Fig. 3) (see the Online Resource 1 for details).
A2. Through the multiplication of the average minimum number of porcupines estimated during the first-step analysis (i.e., 207 individuals) and the ratio between the area covered by suitable and moderately suitable habitats found in A2 (6236.37 km 2 ) and A1 (proportion of area surveyed), we obtained a new λ corresponding to 647 individuals. From the application of the two buffers to each coordinate representing independent opportunistic observations and camera trap locations falling within the southern area of A2 (see Fig. 2a light blue points), we obtained a range spanning from 29 to 50 individuals. By subtracting the lower value (i.e., 29) to the λ and adding the higher one (i.e., 50) to the same, we obtained an interval ranging from 618 to 697 individuals. The model output obtained from both the mathematical and computational approach (showing the posterior distribution) allowed us to estimate an average minimum number of 655 porcupines (SD = 20.56, CI 95% = 614-695) within the suitable and moderately suitable habitats in A2 (Fig. 4) (see the Online Resource 2 for details).
A3. Through the multiplication of the average minimum number of porcupines estimated during the secondstep analysis (i.e., 655 individuals) and the ratio between the area covered by suitable and moderately suitable habitats found in A3 (17,111.52 km 2 ) and A2 (proportion of area surveyed), we obtained a new λ corresponding to 1797 individuals. From the application of the two buffers to each coordinate representing independent opportunistic observations and camera trap locations falling within the central and northern area of A3 (see Fig. 2b purple points), we obtained a range spanning from 42 to 57 individuals. By subtracting the lower value (i.e., 42) to λ and adding the higher one (i.e., 57) to the same, we obtained an interval ranging from 1755 to 1854 individuals. The model output obtained from both the mathematical and computational approach ( within the suitable and moderately suitable habitats in A3 (Fig. 5) (see the Online Resource 3 for details).

Discussion
To date, the crested porcupine is classified as "Least Concern" by the International Union for Conservation of Nature (IUCN) (Amori and De Smet 2016), despite being almost rare in Central African countries (Viviano et al. 2020). Nevertheless, in spite of this classification, data referring to its population trend are still widely lacking (Amori and De Smet 2016). Following the Resource Dispersion Hypothesis (RDH) (Macdonald 1983;Carr and Macdonald 1986), the distribution and quality of resources (e.g., food, shelters, partners, and/or sites for reproduction) affect species presence and distribution at a site. In fact, individuals are more inclined to occupy the smallest areas which contain all the resources they need (Harestad and Bunnel 1979). In the case of the crested porcupine, as reported by Lovari et al. (2013), the home-range size of a male may vary from 10.0 to 398.7 ha while that one of a female range from 18 to 478.15 ha. This means that, on average, the home-range of an individual (male or female) is of about 226.21 ha. Following these data and assuming all individuals as territorials, if we consider the surface covered by suitable and moderately suitable habitats in each area, A1 would be expected to host at least 883 individuals, while A2 and A3 would be expected to host 2759 and 7571 individuals, respectively, with an average minimal density of about 0.44 ind./100 ha for each area. Nevertheless, we believe that our estimates (0.10 ind./100 ha per area) are most closely related to the true number of individuals because of four reasons: (i) both intra-and interspecific competition, along with the low reproductive rate of the species (from one to three births per pair per year, each composed by on average one or two porcupettes) (Coppola and Felicioli 2021), may play a key role in shaping species distribution and abundance at a site; (ii) road-killing and poaching events may substantially affect the species' survival capacity; (iii) our analysis included also sub-adult individuals which notoriously disperse before settling and, hence, do not show a territorial behavior (Mori and Fattorini 2019); and (iv) each area is covered by a considerable percentage of agriculture. Therefore, because homogeneous agricultural areas are considered as non-optimal for the species (Torretta et al. 2021), we believe that the carrying capacity of each area is lower than expected. However, in spite of these considerations, extensive cultivated fields may become hospitable for the species as long as they are intermixed with either natural or semi-natural vegetated areas, which in turn provide shelters and abundant food resources (Lovari et al. 2013;Mori et al. 2014a;Torretta et al. 2021). The porcupine is considered as a "potentially problematic species" because of damages to croplands, riverbanks, and tree debarking (Laurenzi et al. 2016; Lovari and Riga 2016;  Lovari et al. 2017) and is subjected to persecution by local farmers (Cerri et al. 2017). Seeds, fruits, epigeal parts, roots, and other underground vegetables constitute the staple of the diet of the porcupine in Italy (Zavalloni and Castellucci 1994;Bruno and Riccardi 1995). However, corn, potatoes, pumpkins, sunflowers, and melons are consumed if locally available (Mori et al. 2013;Bertolino et al. 2015), thus reducing the tolerance of local farmers (Cerri et al. 2017). Our data suggest that the density of the species is still relatively low within the monitoring areas. However, the information obtained may help managers and conservationists to delineate appropriate actions aimed to reduce the potential negative impacts of porcupines over human activities, through the implementation of mitigation measures especially in those areas characterized by extensively cultivated fields and where the stable presence of the species may lead to agricultural damages. Our research represents the first important contribution in the assessment of the crested porcupine abundance in a highly agricultural environment of central Italy. Furthermore, the Bayesian method we proposed in conjunction with spatial analyses (implemented to define the minimum and maximum number of individuals roaming within each buffer area), to our knowledge, represents the first attempt to estimate the abundance of a species using presence-only data (i.e., Bayesian model with just one parameter). Specifically, it is advantageous being of relatively simple implementation thus allowing to obtain a rough estimate of the target species abundance within an area, even in the absence of presence/ absence data. Nevertheless, despite the results presented, we are aware that our study presents some limitations: (i) as stated above to estimate porcupine abundance we used presence-only data which, compared to presence-absence data, does not account for the imperfect detection (i.e., detection probability) (Chen et al. 2013) and/or the recovery probability (in the case of road-kill individuals); (ii) independent opportunistic observations were collected from citizen-science platforms which, despite their usefulness in the analysis of ecological data has already been assessed (e.g., Franchini et al. 2021), present several limitations (e.g., absence of a survey protocol, unknown sampling effort, no standardization, and poor a-priori control over observer quality) (Kery and Royale 2020); (iii) the mismatch in the spatial extent of camera traps and independent opportunistic observations did not allow us to provide stronger ecological inferences, especially regarding the potential difference in terms of the species abundance at a small-scale level. Indeed, the most likely different availability of resources (not evaluated in this study) may lead porcupines to be mostly abundant in some habitats and to a lesser extent in others; (iv) without being supported by field-data collection and validation, the number of porcupines estimated within each area could be most likely underestimated and, therefore, needs to be considered as a minimum referring value. In this sense, further researches involving spatially homogeneous field-based monitoring activities among areas and in conjunction with more appropriate statistical methods that take into account imperfect detection, are strongly suggested to provide further and detailed inferences regarding the abundance of the species in central Italy.