Introduction

Road infrastructures generate benefits for society, improve accessibility to goods and services, and generate direct contributions to productive sectors. Highways are key to facilitating population growth and economic development in any country (Obregón 2008). If a state has a good road network, this allows increased accessibility and correction of territorial imbalances (Sánchez Galindo 1996). For this reason, investments in various road networks are directly related to economic development (García 2007), as evidenced by a positive and statistically significant relationship between investments made in road infrastructure and economic growth in the municipalities of Antioquia department, Colombia (Durango Agudelo et al. 2016).

Despite these benefits, linear infrastructure also has negative effects on the surrounding ecosystems, including, but not limited to, habitat loss and fragmentation, barrier effects, population isolation, changes in land cover, and road mortality (Prus and Sikora 2021; Liu et al. 2019; Lala et al. 2021). These effects have been linked with the alteration of reproductive patterns and the isolation of populations (Roedenbeck et al. 2007). Due to the magnitude of these effects, wildlife–vehicle collisions (WVC) have been identified as one of the main factors for biodiversity loss (Laurance et al. 2014) and one of the major causes of wildlife death (Poledník et al. 2011; Seiler and Helldin 2006; Sommer et al. 2005).

Collisions between vehicles and wildlife can have a wide range of consequences for both drivers and animals. Drivers can experience serious consequences, emotional distress, and economic costs, while different wildlife species are affected and their probability of survival can be drastically reduced (Huijser et al. 2008). To reduce the impacts of roads, mitigation measures such as speed reducers, wildlife crossings, and warning signs, among others, have been proposed (Pomareda et al. 2014). However, the studies necessary for the successful implementation of these measures can have high costs (Santos et al. 2013). One common approach is to estimate the probability of fatalities by making associations between WVC point locations and environmental features (Ha and Shilling 2018; Phillips and Dudík 2008).

To mitigate the impacts of roads on biodiversity, different methodologies have been developed to determine where wildlife–vehicle collisions occur. The identification of spatial–temporal patterns, the structure of the landscape, the biological characteristics of the species, and the alterations in environmental factors are fundamental to analyzing the creation of high-risk areas (Freeman 2010; Gunson et al. 2011; Teixeira and Gunson 2015). Road segments with high mortality rates (hot spots) are identified as priority locations when proposing mitigation measures (Coelho et al. 2008). However, the identification of hot spots is not sufficient to understand the movement patterns of the species through the landscape surrounding the roads (Clevenger et al. 2003). For this reason, integrating pattern analysis and remote sensing is important for understanding the interactions between habitat and wildlife (Ascensão et al. 2019; Buyantuyev et al. 2009).

Commonly, two categories of methods are distinguished for species distribution modeling: (1) methods using presence–absence data, and (2) methods using presence-only data (Tsoar et al. 2007). There is a big difference between presence-only data and presence–absence data, since presence data indicate the area where the target occurrence was observed but do not define the areas where there are no occurrences (Dettmers and Bart 1999). Niche-based estimation models are capable of modeling complex relationships based on presence data and environmental variables (Ha and Shilling 2018). MaxEnt is niche-based software that uses a deterministic machine learning approach, based on maximum entropy, to model the interactions between the landscape and the species that inhabit it (Ancillotto et al. 2019; Fekrat and Farashi 2022; Shameer and Sanil 2023; Ha 2022). This method has been used to investigate vehicle collisions with birds, mammals, and amphibians (Ha and Shilling 2018; Fabrizio et al. 2019; Kantola et al. 2019; Chyn et al. 2023).

In the context of this study, we hypothesize that the utilization of ecological niche modeling, in conjunction with environmental variables, will facilitate the prediction of areas that exhibit an increased likelihood of collisions between wildlife and vehicles on the roads of the eastern region of the Antioquia Department. To examine this hypothesis, we employed the MaxEnt modeling methodology, utilizing systematically obtained roadkill data from the Red Colombiana de Seguimiento de Fauna Atropellada (RECOSFA) app, along with environmental descriptors.

Within this study, we develop models to accomplish the following objectives: (1) Predict areas of elevated risk, potentially serving as priority sites for mitigation efforts. (2) Identify the specific environmental factors contributing to wildlife–vehicle collisions (WVCs) on roads within the eastern region of the Antioquia Department, Colombia. (3) Determine the optimal model for predicting high-risk zones within the study area.

Materials and methods

Study area

This work focused on the roads that connect the municipalities of Envigado (1575 masl), La Ceja (2200 masl), El Carmen de Viboral (2150 masl), and Rionegro (2135 masl) in Antioquia, Colombia (see Fig. 1). The study includes 92.5 km of road, of which, according to the Instituto Nacional de Vias, 78.48 km are considered primary roads and 13.71 km are considered secondary roads. These roads usually have a high vehicle flow, especially on weekends.

The municipalities contained in the study area are located in the San Nicolás Valley, also known as the "Altiplano de Oriente de Antioquia". The San Nicolás Valley has temperatures ranging from 9 to \(24\,^{\circ }\)C and is characterized by semi-agricultural activities, especially flower cultivation. Additionally, the region has had considerable growth in its economy in recent years, even doubling the growth of the department and the nation. This growth has driven an expansion of the urban centers of multiple municipalities and, in turn, of the road infrastructure. The valley is characterized by a predominance of secondary vegetation cover, cultivation mosaics with natural spaces, pastures, forest plantations, and open forests. According to the Corporación Autónoma Regional de las Cuencas de los Ríos Negro y Nare (CORNARE), the most frequently seen animal species in the area are the Red-tailed Squirrel (Notosciurus granatensis), the Central American Agouti (Dasyprocta punctata), the Common Opossum (Didelphis marsupialis), the Paca (Cuniculus paca), and the Kinkajou (Potos flavus), among others (García-Morera and Giraldo-Iral 2018).

Roadkill data

Roadkill data were collected through surveys from a moving vehicle carrying three observers, not counting the driver. A GPS was used to register the coordinates of every carcass found on the road. Each roadkill was manually recorded, with the following variables registered: the hour of the day, date, time of year, the class and species of the carcass, estimated time of run-over, climatic conditions, type of road, number of lanes, land cover adjacent to the road, maximum posted speed limit, and the location of the carcass. Surveys were made at an average speed of 30 km/h, starting at 5 AM local time, with 15 days between surveys, from October 2015 until November 2016, for a total of 51 surveys covering dry and wet seasons.

Each carcass was removed from the road after being registered to prevent double-counting. In cases of uncertainty in the identification process, the general taxonomic features were registered with photography and then consulted with a specialist in the suspected animal class. Data collection was carried out within the framework of project P20249 financed by the Instituto Tecnologico Metropolitano de Medellin–ITM. The roads surveyed included private roads administered by the National Agency of Infrastructure (ANI) and secondary roads administered by the local government. As a result, the data collection may have been affected by cleaning parties. However, the research team reached out to each of the road administrators to let them know about the project and the need to postpone cleaning duties until after the research team had completed their data collection and removal of dead animals.

Point pattern analysis

Ripley’s K statistic was used to identify the distance bands where there was a significant, i.e., non-random, grouping of points (Clevenger et al. 2003; Coelho et al. 2008; Ripley 1981). Ripley’s K centers a neighborhood of a given radius on a WVC, sums the number of records within the neighborhood, and multiplies that sum by a correction factor based on the length of road within the neighborhood. This process is carried out for each WVC on the road, and the results are then added up to calculate the aggregation intensity for the radial scale evaluated. To compare different subsets of the same road, the sum is multiplied by the total length of the road and divided by \(n(n-1)\), where \(n\) is the total number of roadkill records (Coelho et al. 2008). The analysis is repeated with an increased radius until the total length of the road is reached, evaluating the occurrence of aggregation at multiple scales (Coelho et al. 2014). We employed an initial radius of 100 m, a radius increment of 400 m, a confidence level of 95%, and 100 simulations. This approach aimed to discern whether the pattern exhibited clustering, dispersion, or randomness. We derived accumulations based on the minimum distance at which significant clustering was detected.
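The computation described above can be illustrated with a minimal Python sketch. It is not the Siriema implementation: the function names, the representation of each record as a distance along a single road, and the exact form of the edge correction (ratio of the full window length to the road length covered by the window) are assumptions made for illustration only.

```python
import numpy as np

def linear_ripley_k(positions, road_length, radius):
    """Edge-corrected K for events located by distance along one road (same units)."""
    x = np.asarray(positions, dtype=float)
    n = x.size
    total = 0.0
    for xi in x:
        # Other records inside the neighborhood centred on record i.
        neighbours = np.count_nonzero(np.abs(x - xi) <= radius) - 1
        # Road length actually covered by the window [xi - r, xi + r].
        covered = min(xi + radius, road_length) - max(xi - radius, 0.0)
        # Assumed correction: up-weight windows truncated by the road ends.
        total += neighbours * (2.0 * radius / covered)
    # Scale by road length and n(n - 1) so that different roads are comparable.
    return road_length * total / (n * (n - 1))

def k_envelope(n_events, road_length, radius, n_sim=100, level=0.95, seed=0):
    """Confidence envelope of K under random placement of the same number of events."""
    rng = np.random.default_rng(seed)
    sims = [
        linear_ripley_k(rng.uniform(0.0, road_length, n_events), road_length, radius)
        for _ in range(n_sim)
    ]
    alpha = (1.0 - level) / 2.0
    return np.quantile(sims, alpha), np.quantile(sims, 1.0 - alpha)
```

Under this sketch, an observed K above the upper envelope at a given radius would indicate clustering at that scale, and a value below the lower envelope would indicate dispersion. Repeating the call over a sequence of radii reproduces the multi-scale evaluation.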

Finally, to detect the segments with high mortality rates, the 2D hotspot analysis was used (Coelho et al. 2014). We made 1000 divisions for each road segment, using a radius corresponding to the distance band selected by the spatial correlation analysis. As a result, an accumulation value (HS), an upper confidence limit (UCL), and a lower confidence limit (LCL) were calculated for each division. Then, we classified each division as Hotspot when HS > UCL (Meza-Joya et al. 2019) and as Non-Hotspot when HS < LCL. We evaluated the significance of potential aggregations at a 95% confidence level. In this section, we used the road mortality software Siriema v2.0 (Coelho et al. 2014).
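The classification rule above can be sketched in Python as follows. This is a simplified illustration, not the Siriema algorithm: here HS is taken as the raw count of records within the radius of each division centre, and the confidence limits come from randomly redistributing the same number of records along the road. The function name and defaults are hypothetical.

```python
import numpy as np

def hotspot_flags(positions, road_length, radius,
                  n_divisions=1000, n_sim=100, level=0.95, seed=0):
    """Return (hs, lcl, ucl, is_hotspot) for each road division."""
    rng = np.random.default_rng(seed)
    # Centre of each equal-length division along the road.
    centres = (np.arange(n_divisions) + 0.5) * road_length / n_divisions

    def counts(events):
        events = np.sort(np.asarray(events, dtype=float))
        upper = np.searchsorted(events, centres + radius, side="right")
        lower = np.searchsorted(events, centres - radius, side="left")
        return (upper - lower).astype(float)

    hs = counts(positions)
    # Null model (assumption): records placed uniformly at random on the road.
    sims = np.array([
        counts(rng.uniform(0.0, road_length, len(positions)))
        for _ in range(n_sim)
    ])
    alpha = (1.0 - level) / 2.0
    lcl = np.quantile(sims, alpha, axis=0)
    ucl = np.quantile(sims, 1.0 - alpha, axis=0)
    return hs, lcl, ucl, hs > ucl
```

Divisions where the last returned array is true would be the candidate hotspots. The radius passed in would be the distance band selected from the K analysis.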

Environmental data

Different spatial variables were used, including satellite images and spectral indices. The list of environmental variables is shown in Table 1. The Digital Elevation Model (DEM-GTOPO30) was obtained from the United States Geological Survey (USGS) with a spatial resolution of 30 m (downloaded from https://earthexplorer.usgs.gov/) (USGS 2020). Distance to rivers and slope layers were calculated from a watershed model and the DEM layer, respectively. Two 30 m resolution Landsat 8 scenes dated 10–24 February 2020 were downloaded from the USGS platform, each with less than 20% cloud cover (downloaded from https://earthexplorer.usgs.gov/) (USGS 2020). From these images, the Normalized Difference Vegetation Index (NDVI) was derived (Meneses-Tovar 2011; Li et al. 2019). The tree cover layer for 2019 was downloaded from the Department of Geographic Sciences of the University of Maryland (DGSUM) (downloaded from https://earthenginepartners.appspot.com/science-2013-global-forest/download_v1.7.html) (Hansen et al. 2013). Distance to the forest was then calculated from this layer after subtracting the year loss layer from the original forest data set for 2000. We used the land cover layer, which provides information on the country’s land use and was created by the Instituto de Hidrología, Meteorología y Estudios Ambientales (IDEAM) for the year 2018 (downloaded from http://www.siac.gov.co/catalogo-de-mapas) (Instituto de Hidrología 2020). The resistance layer and the corridor distance were obtained from a least-cost map, and are considered as the accumulated resistance on each pixel that impedes or facilitates animal movement (Wade et al. 2015).
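For reference, NDVI is the normalized difference between near-infrared (NIR) and red reflectance, which for Landsat 8 OLI correspond to bands 5 and 4. The following Python sketch shows the calculation on arrays already loaded in memory. The function name is hypothetical, and reading the scenes and converting digital numbers to reflectance are assumed to have been done beforehand.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), with NaN where the denominator is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out
```

Values range from -1 to 1, with dense green vegetation toward the upper end of the range.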

Least cost path map

The least-cost path analysis was performed following the methodology proposed by Isaacs-Cubides et al. (2017). A reclassification of different landscape and urban planning variables was performed. The following layers of geographic information were collected:

  • A layer of protected areas (downloaded from the mapping service of the Natural National Parks Department of Colombia in August 2019).

  • A land use layer (downloaded from the Instituto Geográfico Agustín Codazzi geoportal in August 2019).

  • A hydrographic model of the area (made from the ASTER V3 2016 digital elevation model (DEM)).

  • A layer of roads in the area (Obtained from DANE Geoportal).

  • A road distance layer (Calculated from the road layer of the area).

  • A distance to rivers layer (calculated from the hydrographic model of the area).

  • A land cover layer (Obtained from the Map of Continental, Coastal, and Marine Ecosystems made by IDEAM in 2017).

  • A municipal Cadastre Layer (Downloaded from the August 2019 Antioquia Cadastre portal for the municipalities in the study area).

From these layers, a reclassification of the variables was made according to expert criteria: 5 different experts on mammals were consulted about the preferences of the target species. The reclassification values can be seen in the supplementary material. The target species for this model is the Crab-eating Fox (Cerdocyon thous), which was selected due to its high frequency of roadkill in the study area, as well as the wide home range it needs. C. thous is the canid with the widest range in South America, being found in a wide variety of habitats up to 3000 masl, such as mountain and tropical forests, savannas, livestock areas, and inland wetlands. Additionally, it has been shown that it is increasingly adapted to intervened areas (urban and semi-urban areas) (Lucherini et al. 2015). This species takes advantage of the available food resources (small mammals, frogs, reptiles, birds, and insects, as well as eggs, carrion, seeds, and fruits). However, it can behave like a specialist when it is in a productive habitat (Lucherini et al. 2015).

The resulting resistance map was used to generate a least-cost path model in the Circuitscape software, which, based on Ohm’s law, calculates the current value in each cell of the map, starting from the protected areas (nodes).
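To illustrate the notion of accumulated resistance on a raster, the sketch below computes an accumulated-cost surface from a set of source cells using Dijkstra's algorithm on a 4-connected grid. This is a plain least-cost accumulation, a deliberately simpler technique than the circuit-theory current computed by Circuitscape, and it is not the workflow of Isaacs-Cubides et al. (2017). The function name, the 4-neighbour connectivity, and the use of the mean resistance of the two cells as the step cost are assumptions.

```python
import heapq
import numpy as np

def accumulated_cost(resistance, sources):
    """Least accumulated resistance from any source cell to every cell of the grid."""
    resistance = np.asarray(resistance, dtype=float)
    rows, cols = resistance.shape
    cost = np.full((rows, cols), np.inf)
    heap = []
    for r, c in sources:  # e.g. cells inside protected areas (nodes)
        cost[r, c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > cost[r, c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Assumed step cost: mean resistance of the two adjacent cells.
                nd = d + 0.5 * (resistance[r, c] + resistance[nr, nc])
                if nd < cost[nr, nc]:
                    cost[nr, nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return cost
```

Cells with low accumulated cost between two sources would trace the least-cost corridor, while cells with high values would correspond to areas that impede movement.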

Roadkill risk model

MaxEnt software was used to model the suitability of hot spots. This technique determines patterns in the data based on different environmental constraints and then selects the most likely system configuration by maximizing Shannon entropy (Ali et al. 2021; Merow et al. 2013). The results of MaxEnt are sensitive to initial settings, as noted by Merow et al. (2013). Five different models were created and the results were compared with each other. The models were created with the following settings: we combined the linear (L), quadratic (Q), and product (P) feature classes, and the regularization multiplier was set to 0.1, 0.5, 1, 2, and 5. We used 10,000 randomly selected pseudo-absences and 50 repetitions for each model. Threshold and hinge features were removed because their combination has been considered to produce unrealistic overfitting (Heumann et al. 2013; Ha et al. 2016).

To evaluate the predictive performance of each model and reduce potential spatial biases in our data, we used a test set of data from road segment 1, which was previously unknown to the model. Finally, we applied jack-knife analysis for all environmental variables to estimate the contribution of each (Fabrizio et al. 2019; Kantola et al. 2019; Ha and Shilling 2018). While evaluating the performance of different species distribution methods has become a challenge for ecology and conservation biology (Srivastava et al. 2019), there are different techniques to evaluate them. In this study, to define the best predictor of hot spot aggregations we used area under the receiver operating characteristic curve (AUC-ROC) and the true skill statistic (TSS) (Allouche et al. 2006).

AUC-ROC is the method most commonly used to characterize model performance in the articles we reviewed. This metric was developed for binary classification of presence and absence (Ha et al. 2016; Phillips et al. 2006). When only presence data exist, as in our research, MaxEnt computes a different AUC-ROC value, which only takes into account presence data (Yates et al. 2010); we refer to this measure as AUC-ROCpo. Such a measure does not have much meaning for this study because it compares the results with the random background data, which may or may not represent true absences (Ha et al. 2016). Therefore, the AUC-ROCpo cannot be interpreted as a measure of true performance, but as the ability of the predictor to distinguish a pattern different from the background (Ha and Shilling 2018; Ha et al. 2016; Lobo et al. 2008; Elith et al. 2010). The limitations of the AUC-ROCpo, which also apply to our research, have been extensively detailed by other authors (Yackulic et al. 2013; Lobo et al. 2008). Therefore, a correction of the values of AUC-ROCpo obtained from MaxEnt was made. To calculate the corrected value, which we named AUC-ROCcorr, we followed the process shown in Fig. 2.

We exported the hotspot suitability map obtained from the MaxEnt logistic output to ArcGIS Pro 2.6, where the actual data obtained by the 2D Hotspot analysis were overlaid. Probabilistic values were extracted from each location and converted to binary values. Therefore, the selection of an "optimal threshold" for the model was important. The logistic output is symmetrical about a turning point that is located at a probabilistic value of 0.5 (Real et al. 2006; Guillera-Arroita et al. 2014). This value is the correct threshold when the data in the model have the same proportion of presences and absences (Lobo et al. 2008), but if one of the two events tends to be more frequent, the probabilities are biased toward the common event (Cramer 1999), so a threshold of 0.5 may be incorrect (Lobo et al. 2008). For this reason, a threshold is required that depends on the event prevalence (Jiménez-Valverde and Lobo 2006, 2007). In our study, we found the value that minimizes the difference between sensitivity and specificity, and we identified it as the optimal threshold for the classifier, as proposed by Jiménez-Valverde and Lobo (2007), Bosso et al. (2022), Liu et al. (2013), and Ahmadi et al. (2023). Finally, we created a confusion matrix with the results, calculated the evaluation metrics described for each model, and analyzed the results to identify the model with the best fit according to the AUC-ROCcorr metric.
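The threshold selection and evaluation metrics described above can be sketched in Python as follows. This is a minimal illustration, assuming the suitability values extracted at each location are in an array `scores` and the Hotspot/Non-Hotspot classes from the 2D analysis are in a boolean array `labels`. The function names are hypothetical and the code does not reproduce the ArcGIS Pro workflow of Fig. 2.

```python
import numpy as np

def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity of the binary map obtained at a threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pred = scores >= threshold
    tp = np.sum(pred & labels)    # hotspots predicted as hotspots
    fn = np.sum(~pred & labels)   # hotspots missed
    tn = np.sum(~pred & ~labels)  # non-hotspots predicted as non-hotspots
    fp = np.sum(pred & ~labels)   # non-hotspots predicted as hotspots
    return tp / (tp + fn), tn / (tn + fp)

def optimal_threshold(scores, labels):
    """Threshold that minimizes |sensitivity - specificity|."""
    candidates = np.unique(scores)
    diffs = []
    for t in candidates:
        sens, spec = sens_spec(scores, labels, t)
        diffs.append(abs(sens - spec))
    return candidates[int(np.argmin(diffs))]

def tss(scores, labels, threshold):
    """True skill statistic: sensitivity + specificity - 1."""
    sens, spec = sens_spec(scores, labels, threshold)
    return sens + spec - 1.0

def auc_roc(scores, labels):
    """AUC as the probability that a hotspot scores higher than a non-hotspot."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

Because both classes come from observed data rather than random background points, an AUC computed this way compares hotspots against observed non-hotspots, which is the sense in which the corrected metric differs from the presence-only value.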

Results

Roadkill data

The surveyed areas in this research were 4 important road segments with lengths between 12 and 39 km that connect municipalities in eastern Antioquia. We surveyed 92.5 km of primary and secondary roads at a mean speed of 30 km/h. We found 499 wildlife carcasses. In 2015, we found 74 roadkills (14.83%), and in 2016, we found 425 (85.17%). The segments with WVC were commonly surrounded by dense forests, low and medium vegetation, and houses. The segments studied can have temperatures between 10.4 and 27.0\(^{\circ }\)C.

In this study, 221 mammals (44.29%), 161 birds (32.26%), 77 amphibians (15.43%), and 40 reptiles (8.02%) were found. The main species recorded were common opossums (Didelphis marsupialis) (15.43%), cane toads (Rhinella marina) (14.22%), and Turdus ignobilis (4.6%). We also observed that 97% of vertebrate deaths occurred on road segments with two lanes, while the remaining deaths occurred on roads with only one lane, and that 60.72% of the accidents occurred during the wet season. Finally, 417 reports were made between 5 a.m. and 12 p.m., and 51% of these carcasses had been dead for less than 2 days.

Point pattern analysis

Based on the Ripley’s K analysis, the distribution of wildlife–vehicle collisions (WVCs) on the studied roads does not appear to be random. Road segments 1, 2, and 4 showed significant WVC aggregations within a radius of 0.1 km. However, segment 3 did not exhibit significant aggregations within a range suitable for prevention or mitigation measures. This suggests that roadkill occurrence is spatially clustered on segments 1, 2, and 4 (see Fig. 3). The 2D hotspot analysis identified a total of 160 hotspots, enabling the identification of high-risk areas for roadkill, as shown in Fig. 4.

Roadkill risk model

MaxEnt evaluated different scenarios that allowed for excellent discrimination between sites with high accumulations of wildlife–vehicle collisions and background locations. The AUC-ROCpo values ranged from 0.980 to 0.985. After making corrections to these values, which will be considered the actual values for this study, the AUC-ROCcorr values ranged from 0.54 to 0.77 and the true skill statistics (TSS) values ranged from 0.38 to 0.44. These outcomes collectively underscore the robustness of our models in distinguishing high-collision risk areas from background settings, as well as provide a comprehensive picture of the predictive performance, considering both AUC-ROCcorr and TSS metrics (see Table 2).

The percentage contribution of the variables for the modeling of hot spots on the described roads is shown in Table 2. The resistance of the map of ecological connectivity, the distance to the forest, and the land cover of the area were dominant for the occurrence of WVC. The contribution of other environmental characteristics varied, but we did not exclude any variables from the analysis as all made an acceptable contribution to the jackknife analysis.

The probability map of hot spots predicted by the selected models is shown in Fig. 5. The model projected the largest number of hotspots on road segments near the municipalities of El Retiro and El Carmen de Viboral. The response curves showed that features such as distance to forest, land cover, the resistance of the connectivity map, and the soil vocation index can increase the probability of WVC. In particular, large distances to forests and high resistance values in the ecological connectivity map increase the risk of being run over. On the other hand, short distances to rivers and corridors can also increase the likelihood of the landscape becoming a high-risk area. Of the 37 types of ecosystems, the mosaics of crops and pastures (coverage number 16) increase the probability of collisions, as does the vocation to transitional and semi-permanent crops in a cold climate (vocation number 3) (see supplemental material).

Discussion

Roadkill data

Road mortality is now considered a threat to the survival of many species (Kamler et al. 2003; Beaudry et al. 2008; Grilo et al. 2013). We evaluated the frequency of roadkill because it may vary depending on various factors, such as the time of year or weather conditions (Kantola et al. 2019). A total of 92.5 km were surveyed, yielding 499 wild vertebrates from at least 60 species. We recorded the largest number of roadkills in 2016, with a total of 425 animals. However, this difference reflects the greater number of months of data collection in 2016 compared to 2015. The group of wildlife most frequently involved in roadkill incidents was mammals, followed by birds, amphibians, and reptiles. Studies conducted in Brazil and Mexico report similar findings, where mammals and birds represented the highest proportion of roadkill (Omena Junior et al. 2012; Grosselet et al. 2004). Our results also resemble what was found in the department of Meta, Colombia (Tejeiro-M 2018), and differ from what was reported by Seijas et al. (2013), where the most reported group was reptiles.

Our study highlights the importance of road segments with two lanes, which accounted for 97% of the vertebrate deaths, suggesting that the type of road infrastructure can increase the risk of WVC. The wet season accounted for a significant proportion of accidents, indicating that environmental factors, such as rainfall and temperature, can impact the frequency of WVC. Moreover, the data show that the segments studied were surrounded by dense forests, low and medium vegetation, and houses, which indicates that the WVC problem is not only a threat to wildlife but also to humans.

It was important to have reports on bird roadkills because collisions between birds and vehicles pose a high risk, causing driver distraction and serious accidents (Ha and Shilling 2018). Direct observation has shown that birds tend to stay at the edge of the roads (Castillo-R et al. 2015). The birds’ foraging behavior and their ability to fly at heights similar to those of the vegetation next to the roads increase their likelihood of colliding with vehicles (Adárraga-Caballero and Gutiérrez-Moreno 2019; De La Ossa-V and Galván-Guevara 2015). As reported by Castillo-R et al. (2015) and Bedoya-V et al. (2018), Turdus ignobilis was one of the most frequently reported bird species, but some carcasses were found at the roadside, suggesting that not all birds died from being run over. It can be inferred that, as stated by Trejo and Seijas (2003), birds can collide with vehicle windows and be injured while retaining the ability to move to the roadside.

The vertebrate species most impacted by vehicle collisions were D. marsupialis and R. marina, as in the study conducted by Payan et al. (2013), indicating that these animals may be particularly vulnerable to road collisions. The high frequency of opossum roadkill coincides with that reported in studies such as Meza-Joya et al. (2019), Payan et al. (2013), Osório and Mauhs (2004), Gottdenker and Wallace (2001), Delgado-V (2007), Castillo-R et al. (2015), De La Ossa-V and Galván-Guevara (2015), and Seijas et al. (2013). This is possibly due to the abundance of the species in the region or its ability to feed on roadkill. In addition, it has been argued that, because of their habit of feeding at night, opossums can be dazzled by vehicles, which increases the probability of being run over (Castillo-R et al. 2015; Wendy and Pedro 2006; Delgado-V 2007).

Reptiles possess characteristics that affect their likelihood of being involved in vehicle collisions, including age, sex, body size, and activity pattern, among others (Andrews et al. 2004). It has been argued that reptiles use roads for thermoregulation purposes (Andrews et al. 2004), as is the case with Australian lizards (Klingenböck et al. 2000) and snakes (Castillo-R et al. 2015). This shows that roads can attract certain species, which in turn can increase their likelihood of being run over by vehicles (Ashley and Robinson 1996).

As in Castillo-R et al. (2015), Fernando and Ingrid (2011), Shine et al. (2004), and Omena Junior et al. (2012), R. marina was the most reported amphibian. We observed that the number of reports in the wet season doubled compared to the dry season, which could be a consequence of its opportunistic behavior. In addition, these toads tend to use the roads as dispersal routes, which increases the probability of dying by accident (Brown et al. 2006; Forman and Alexander 1998).

Point pattern analysis

Although no statistically significant groupings were evident in road segments 1, 2, and 4, the 2D Hotspot analysis demonstrated that they contain segments with high mortality. Roadkill hot spots can be used to plan the implementation of mitigation measures. Crossing structures are efficient in mitigating wildlife–vehicle collisions (Clevenger and Ford 2010; Forman et al. 2003), and it has been argued that different species of amphibians, reptiles, and mammals would use safe crossing structures (Grilo et al. 2013; Huijser and McGowen 2010). However, underground passages, fencing, traffic speed reducers, traffic signs, and driver education campaigns should also be implemented (Payan et al. 2013; Oliveira et al. 2018). In the case of older roads, the use of hot spot analysis to identify areas with high aggregations of fatalities may not be the most appropriate measure, since, due to their age, these segments become little traveled (Teixeira et al. 2017).

Roadkill risk model

Our results indicated that the structural quantification of the landscape plays a vital role in the occurrence of roadkill aggregations in the department of Antioquia, Colombia. According to the contribution percentages of each variable obtained through the jackknife analysis, vertebrate roadkill on the analyzed roads is fundamentally affected by the resistance of the ecological connectivity map, the distance to forests, and the land cover, unlike characteristics such as slope and tree cover, which had an insignificant effect. In addition to these three most important variables, we highlight three other characteristics (vocation, distance to corridors, and distance to rivers), which also increase the risk of wildlife–vehicle collision.

Due to the lack of applied research in the study area, a detailed comparison of results is not easy. However, some comparisons can be made with similar reports based on species distribution. Our results differ from those obtained by Ramírez-Fráncel et al. (2020) and Ferro Muñoz et al. (2018), where elevation was one of the main drivers of the models. It should be noted that the most important role was played by the resistance of the map, with a higher probability of large aggregations of collisions in areas with high resistance values. Our results corroborate the inferences made by Meza-Joya et al. (2019), which indicate that the location of these points of conflict appears to be associated with areas of low structural connectivity, where current flow is influenced by the presence of human-modified habitats. However, this variable may have little to do with bird roadkill, given how birds move from one place to another. This paper also showed that short distances to rivers can increase the likelihood of roadkills, since species such as Rhinella marina take advantage of water resources near roadsides for their reproduction (Fernando and Ingrid 2011).

The size of the study area may have greatly influenced the AUC-ROC values, since it may include background points with environmental characteristics distant from the species requirement, resulting in an erroneous increase in model validation (Schoener and Schoener 1968). The choice of this area should be made with caution and be pertinent to the ecology of the species and the purpose of the research (Fourcade et al. 2014). The values of AUC-ROCpo and AUC-ROCcorr differed markedly. For this reason, we support the analyses carried out by Lobo et al. (2008), which indicate that this metric presents disadvantages when applied to species distribution models, and that the AUC-ROCpo does not provide information about how well the model performs, especially when working with background data. It is pertinent to underscore that while TSS values may be lower than desired, the comprehensive evaluation provided by the Area Under the Receiver Operating Characteristic curve (AUC-ROC) and TSS collectively enriches the interpretation of the model’s performance. The adjustments applied to the AUC-ROC values, leading to the AUC-ROCcorr measurements, underline the commitment to accuracy in the assessment process.

During data collection, no specific mitigation measures for wildlife crossing (other than traffic signs) were observed on the different types of roads. Furthermore, the signs observed were scattered and very specific to large vertebrates (Didelphis marsupialis), overlooking the importance of amphibians such as Rhinella marina. In this context, the methodology proposed in this research allowed for the identification and prioritization of the road segments with the highest roadkill risk, which should be intervened immediately.

Our study also has some limitations: (I) the lack of information on the distribution of the different species, which could have helped to delimit the study area; and (II) limited resources, which made it difficult to collect data that included absences. Our methodology was therefore based on presence-only data, which commonly exhibit a strong spatial bias (Schulman et al. 2007; Dennis and Thomas 2000), meaning that some sites are more likely to be surveyed than others (Phillips et al. 2009). In addition, the frequency of records may be influenced by animals that remove carcasses from the road, making registration difficult (Teixeira et al. 2013). These types of bias, also reported as sample selection bias, can affect the quality of the model. To address these limitations, one could choose background data with the same bias as the occurrence data or apply rigorous statistical methods that correct for the probability of actual occurrences in the background data.
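The first of these corrections, drawing background data with the same bias as the occurrence data, can be sketched as follows. This is only an illustrative example: the road-segment identifiers and the survey-effort values are hypothetical, and the approach assumes that a measure of sampling effort per segment is available.

```python
import random

# Minimal sketch of bias-matched background sampling.
# Assumption: survey effort per road segment is known; the segment
# identifiers and effort values below are hypothetical.

def sample_biased_background(segment_effort, n_points, seed=0):
    """Draw background segments with probability proportional to survey
    effort, so the background shares the occurrence data's bias."""
    rng = random.Random(seed)
    segments = list(segment_effort)
    weights = [segment_effort[s] for s in segments]
    return rng.choices(segments, weights=weights, k=n_points)


# Hypothetical effort: number of survey visits per road segment
effort = {"seg_A": 30, "seg_B": 10, "seg_C": 0, "seg_D": 5}
background_segments = sample_biased_background(effort, n_points=1000)

# Segments that were never surveyed (seg_C) are never drawn as background
print(sorted(set(background_segments)))
```

With this design, heavily surveyed segments contribute more background points, so the model contrasts occurrences against conditions that were actually sampled rather than against the whole study area.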

Based on the results of our study, we offer some practical management recommendations. The areas with high wildlife–vehicle collision potential identified through our hotspot analysis present an opportunity for immediate action. Focusing mitigation efforts on these locations could yield impactful results, particularly through the implementation of crossing structures tailored to local species’ movement patterns. Our roadkill risk model emphasizes the significance of ecological connectivity resistance, distance to forests, and land cover in shaping collision occurrences. Based on these variables, we propose habitat restoration initiatives and the establishment of green corridors that align with wildlife movement pathways. Additionally, the influence of factors such as vocation, distance to corridors, and proximity to rivers underscores the importance of comprehensive planning that considers not only the road network but also adjacent ecosystems. These recommendations are intended to help reconcile road infrastructure development with the conservation of natural ecosystems.

Fig. 1 Location of the road segments studied in eastern Antioquia, Colombia

Fig. 2 Illustration of the methodology used for calculating the corrected AUC-ROC (AUC-ROCcorr). The first image corresponds to the critical points identified using Hotspot analysis in 2 dimensions. The second image corresponds to the prediction of critical points using Maxent

Fig. 3 Roadkill aggregations according to the radius. Aggregation is considered to occur when the function L(r) (shown as the turquoise line) exceeds the upper confidence limit (represented by the black line)

Fig. 4 Location of hotspots obtained by 2D Hotspot analysis. Zones with a red index represent road segments with a higher intensity of significant accumulations of wildlife–vehicle collisions, while blue zones represent less frequent collision zones

Fig. 5 Roadkill risk map predicted by the model. The probability of vehicles hitting and running over animals increases from blue to red. The zoomed-in area shows the details of the risk associated with the zone in which the model did not have data on certain hotspots. The circular areas below the map show actual hotspots of animal–vehicle collisions, as determined by the 2D hotspot technique

Table 1 List of environmental variables and their descriptions
Table 2 Percentage contribution of each variable for the prediction of hot spots using environmental variables

Conclusion

Our research presents a novel methodology for predicting the roadkill of different vertebrates by modeling ecological niches. The MaxEnt approach, which adapts the theory of maximum entropy, is useful for testing interactions between habitat and fauna characteristics. We demonstrate how different factors affect the probability of vehicle–wildlife collision aggregations and highlight the possibilities of integrating citizen science initiatives such as the RECOSFA app, machine learning techniques, and remote sensing to develop predictive models of areas with high aggregations of vehicle–wildlife collisions.

We took into account observations made in the literature on the AUC values calculated by MaxEnt (AUC-ROCpo) and proposed a methodology to correct them. However, future research should integrate different metrics for evaluating results and examine sampling bias in greater depth. The mortality of different taxonomic groups should be investigated individually to determine the environmental characteristics that increase roadkill risk for each species. Our approach could easily be extended to larger study areas, and involving government entities that apply systematic collection methods could be crucial in mitigating roadkill.

This paper provides ecological information on wildlife mortality from vehicle collisions in the Department of Antioquia. This information can be used to mitigate the impact on the ecosystem caused by the death of different vertebrates, optimize mitigation zones, and raise awareness of this problem. Finally, we did not find any articles in the literature that apply this methodology in Colombian territory; given Colombia's importance for biodiversity, this research represents an important contribution to the state of the art.