Introduction

Strategic conservation planning is essential for protecting threatened species (Marcot and Flather 2007; Rao et al. 2007; IUCN-SSC 2008), as it can formulate threat mitigation measures based on the potential habitats of the species. The caracal (Caracal caracal), a medium-sized cat, is widely distributed across many African and Asian countries (Avgan et al. 2016; Khandal et al. 2020) but is under tremendous threat from several anthropogenic pressures and illegal trade in South Asia (Singh et al. 2014; Avgan et al. 2016). These pressures have led to the extinction of the species in Kuwait and in some regions of Turkmenistan and North Africa, and the species faces a potential extinction risk in India in the foreseeable future (Lukarevsky 2001; Cuzin 2003; Sheikh and Molur 2004; Ray et al. 2005). Despite its wide geographical distribution (Thorn et al. 2011; Avgan et al. 2016), sightings or presence records from South Asia are few and far between.

Globally, there are eight distinct subspecies of the caracal (C. caracal), classified based on their molecular structure (Wilson and Reeder 2005; Hassan-Beigi 2015). The Asian subspecies, the Asiatic caracal Caracal caracal schmitzi (Matschie 1912), has a patchy distribution across the arid and semi-arid areas of the Indian subcontinent, the Middle East and South Asia (Wilson and Reeder 2005; Hassan-Beigi 2015). Currently, the species is included in the "Schedule-I" category (highest protection) of the Indian Wildlife (Protection) Act, 1972 and the "Near Threatened" category of the Conservation Assessment and Management Plan (CAMP) and the IUCN Red List assessment in India (Molur et al. 1998; Wilson and Reeder 2005). Historically in India, the species was extensively captured and trained for game hunting by Indian royalty (Divyabhanusinh 1993; Sunquist and Sunquist 2002). Earlier research focused on ecological aspects of the species, such as home range, diet, and prey base status (Grobler 1981; Avenant and Nel 1998; 2002; van Heezik and Seddon 1998; Mukherjee et al. 2004; Farhadinia et al. 2007; Albayrak et al. 2012); however, reliable information on its distribution and population is largely missing from the literature.

A recent broad-scale assessment of caracal distribution in India showed substantial contractions in the distribution range (Khandal et al. 2020). The study reported that the caracal was widely distributed in many central and northern Indian states during the twentieth century, but became restricted to three states, Rajasthan, Gujarat and Madhya Pradesh, by the beginning of the twenty-first century (Fig. 1). Even within these states, recent records of the caracal are reported only from a few regions, mainly the Malwa and Hadoti plateaus and the Aravalli hills, which are administered under the Sawai Madhopur, Karauli, Dholpur, Bharatpur, Alwar, Chittaurgarh, Pratapgarh, Udaipur, Pali, and Rajsamand districts of Rajasthan, the Kachchh district of Gujarat, and the Chhattarpur and Bhind districts of Madhya Pradesh (Fig. 1; Khandal et al. 2020). The records come from a wide array of habitats, including ravines, dry deciduous forests, scrublands, grasslands, and teak (Tectona grandis) forests (Khandal et al. 2020). Specific knowledge of habitat selection by caracals in India is also limited, with only one study, from Ranthambhore Tiger Reserve, investigating this aspect (Singh et al. 2014). Beyond this, no other information about the species is available for the Indian region. It is thus reasonable to believe that the species is highly understudied and faces a significant threat of extinction, given its rarity in the wild. Increasing human interference, loss of natural habitats and illegal trade in caracals are considered significant threats to the species (Kolipaka 2011; Avgan et al. 2016). The CAMP assessment report placed caracals at Level 3 for captive breeding recommendations, which is not meant for immediate conservation action but can be implemented for husbandry and research purposes (Molur et al. 1998). However, before any such intervention, it is essential to understand the potential regions where the species is distributed in India and conduct in-depth population-level studies.

Fig. 1

Map depicting the states and districts with historical (20th century) and present (21st century) reported distribution ranges of Asiatic caracals in India

The distribution of a species can be estimated using various climatic, environmental, terrain, and anthropogenic variables. A variety of statistical modeling techniques can be used for this purpose, depending on the available datasets and the ecology of the species (Qiao et al. 2015). MaxEnt is the most commonly used algorithm for widely distributed and endemic species (Phillips et al. 2006; MacCarthy et al. 2015; Jhala et al. 2020). However, Random Forest, Generalized Linear Model (GLM), BioClim, Climate Space Model (CSM), Envelope Score Model (ESM), Artificial Neural Network (ANN), Multivariate Adaptive Regression Splines (MARS), Generalized Additive Model (GAM), Gradient Boosted Machine (GBM) and many other algorithms are also known to provide precise distribution ranges for rare and cryptic species (Guisan et al. 2006; Williams et al. 2009; Mi et al. 2017; Oleas et al. 2019; Warren et al. 2020). Such analyses of a species' distribution pattern help to narrow down population surveys and focus conservation efforts on specific areas, making planning and implementation more specific and effective (Lyet et al. 2013; Eyre et al. 2018; Giné and Faria 2018; Valerio et al. 2020). Linking predictive modeling with ecological knowledge of the species can lead to targeted conservation efforts, such as legal protection of suitable habitats by establishing or maintaining protected areas. This strategy has worked for various species in India, where protected areas have performed relatively better at conserving threatened species.

This study aims to identify the potential regions where the caracal might exist in India, based on fine-scale ensemble modeling of its distribution. Further, we intend to evaluate various protected areas where conservation efforts for caracals can be planned from the management and conservation outlook.

Methods

Study area

Based on the historical and recent distribution of caracal, the region falling in nine western and central Indian states, viz., Chhattisgarh, Delhi, Gujarat, Jharkhand, Haryana, Madhya Pradesh, Maharashtra, Rajasthan, and Uttar Pradesh, was chosen for modeling (Fig. 2). The area fully or partially overlaps with the reported extent of the species. The study area predominantly comprises four biogeographic zones, viz., Desert, Semiarid, Deccan Peninsula and Gangetic Plains, and is partially covered by the Western Ghats, with rainfall ranging from < 250 to 2000 mm (Mohapatra et al. 2021). Maximum temperatures rise sharply to over 45 °C by the end of May and early June, resulting in torrid summers in the north and north-west regions. During summer, parts of Gujarat, Maharashtra, Rajasthan and Madhya Pradesh exhibit high day-time and low night-time temperatures, resulting in a temperature difference of > 15 °C in many areas (Attri and Tyagi 2010). The dominant vegetation type is tropical thorn forest, corresponding to the arid and semi-arid climates. Caracals are known to use a wide range of habitat types, including river and riverine habitat, scrubland, hilly dhonk (Anogeissus pendula) forest, teak forest, Prosopis juliflora thickets and agricultural lands (Khandal et al. 2020), out of more than 70 land use types (Ray et al. 2005), which makes the species a habitat generalist. In India, caracals are known to feed upon various mammals, rodents, birds, reptiles, invertebrates and vegetable matter (Mukherjee et al. 2004).

Fig. 2

Map showing the study area spread over the western and central Indian states with spatially thinned presence records of Asiatic caracals (n = 69) used for modeling the habitat suitability

Conventional species distribution modeling vs. ensemble modeling to improve species detection probability

Several spatial modeling algorithms have been proposed to accurately predict the ecological niche of a species (Ho and Pepyne 2002; Thuiller et al. 2009; Elith et al. 2011). However, all these algorithms have pros and cons, and no single optimization approach is effective under all circumstances (Segurado and Araújo 2004; Qiao et al. 2015). Principally, the model performance hinges on the ecological characteristics of a species, e.g., dispersal capacity, eco-climatic specialization, biotic interactions, etc., that in turn affect the species occurrence and its relation with the type and resolution of predictor variables (Gaston 2003; Gilman et al. 2010; Peterson et al. 2011). Given the need to run multiple models for understanding species distribution, ensemble models are considered reliable (Barai and Reich 1999; Araújo and New 2007; Hao et al. 2020). The ensemble model carries virtues of pooled evaluating criteria, their variable responses, and pooled distribution matrices of better-fitted models (Hao et al. 2020). Thus, we used an ensemble modeling approach for the caracals, as this method provides a far more accurate distribution range of rarely detected species (Guisan et al. 1999; Pouteau et al. 2012; Breiner et al. 2015; Siders et al. 2020; Xie et al. 2021). Ensemble models avoid under- or over-prediction of niche estimates (Campos et al. 2019), which in turn provides detailed and precise spatial information, and ultimately aids in increasing the detection probability of species by narrowing down the search area for the species.

Spatial modeling

Species presence events

Presence records of caracal (n = 138) were acquired from published and grey literature in India. We discarded historical records (pre-1995) from the analysis, as they might confound our predictions and older records can be spatially inaccurate. Records after 1995 were selected to reflect the recent decline in the range of the species due to anthropogenic factors (Ravikanth et al. 2000). To avoid clustering bias in locality records, the presence locations were filtered at a scale of 1 km, retaining one randomly selected location per square kilometer (Coxen et al. 2017). After this spatial thinning, 69 of the 138 validated presence records were used for further analysis.
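
The thinning step described above can be illustrated with a short sketch. It is not the authors' workflow: it assumes the records are already in projected coordinates in metres, and the function name `thin_records`, the fixed random seed, and the toy coordinates are hypothetical.

```python
import random
from collections import defaultdict


def thin_records(points, cell_size=1000.0, seed=42):
    """Keep one randomly chosen record per grid cell.

    points: iterable of (x, y) in projected metres (an assumption).
    cell_size: grid cell edge in metres (1 km here).
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for x, y in points:
        # Integer index of the grid cell that contains the record
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append((x, y))
    # One random record from each occupied cell, in sorted cell order
    return [rng.choice(cells[key]) for key in sorted(cells)]


# Toy coordinates: the first two records share a 1 km cell
records = [(100.0, 200.0), (450.0, 900.0), (1500.0, 300.0), (2500.0, 2500.0)]
thinned = thin_records(records)
print(len(thinned))  # 3 occupied cells
```

A grid-based filter of this kind guarantees at most one record per cell, although two retained records in adjacent cells can still lie less than 1 km apart.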

Variable identification

Previously published literature on the caracal's ecology provides information on its limiting factors, which may potentially affect the fundamental niche of the species (Adibi et al. 2014; Singh et al. 2014, 2015; Ramesh et al. 2016; Hemami et al. 2018; Khandal et al. 2020). Temperature seasonality and precipitation seasonality were acquired from the Worldclim dataset version 2.0 (www.worldclim.org; Fick and Hijmans 2017; Additional file 1: Fig. S1), as these variables affect the regional vegetation types. Since recent records are from arid and semiarid regions, the Aridity Index, i.e., the ratio of mean annual precipitation to mean annual evapotranspiration (www.cgiarcsi.community; Zomer et al. 2008; Additional file 1: Fig. S1), was selected to represent the aridity status of the landscape. A mean Normalized Difference Vegetation Index (NDVI), taken at the peaks of three seasons (i.e., January for winter, May for summer, September for monsoon) over the period 2001 to 2020, was used to capture the trend in vegetation phenology in the study area (www.modis.gsfc.nasa.gov; Bao et al. 2014; Additional file 1: Fig. S1). To relate distribution to topographic complexity, terrain ruggedness was derived from the digital elevation model acquired from the Worldclim database (www.worldclim.org; Riley et al. 1999; Hemami et al. 2018; Additional file 1: Fig. S1). Prior to modeling, the predictor variables were bilinearly resampled to a uniform spatial resolution of 1 km and projected to the WGS1984 geographic coordinate reference system (Bivand et al. 2008). Pearson correlation among the predictor variables was checked to avoid collinearity. Where a pair of variables was highly correlated (|r| ≥ 0.7), the variables were processed through principal component analysis; otherwise, the variables were retained as provided (Chu et al. 2018).
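
The collinearity screen can be sketched as below. This is an illustration under stated assumptions rather than the code used in the study: the helper names `pearson_r` and `correlated_pairs` are hypothetical, and the predictor values are toy numbers, not the study's rasters.

```python
from itertools import combinations
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlated_pairs(predictors, threshold=0.7):
    """Return the predictor pairs whose |r| meets or exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Toy values sampled at five hypothetical cells (not the study's data)
predictors = {
    "temp_seasonality": [1.0, 2.0, 3.0, 4.0, 5.0],
    "precip_seasonality": [2.1, 3.9, 6.2, 8.1, 9.8],
    "ruggedness": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged = correlated_pairs(predictors)
print(flagged)
```

In this toy case only the two seasonality variables are flagged; an empty list would correspond to retaining all predictors as provided.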

Modeling algorithms

A total of 3600 random background points (roughly 50 times the number of presence locations) were generated to denote pseudo-absences. The number of background points was chosen with model performance in mind, so that it would suit regression, machine learning, and tree-based classification approaches simultaneously (Barbet-Massin et al. 2012; Li and Guo 2013; Konowalik and Nosol 2021). In addition, 20% of the total records (presence and background points) were set aside for testing the model trained with the remaining 80% of the dataset, by examining sensitivity and specificity (Hijmans and Elith 2017). Models were evaluated using the area under the receiver operating characteristic curve (AUC) and non-overpredicted performance (Elith et al. 2006; Konowalik and Nosol 2021). We ran several algorithms on these datasets, including Generalized Linear Model with a binomial distribution (using the R package “stats” version 4.1.0), Random Forest (“randomForest” package version 4.6), Maximum Entropy (“dismo” version 1.3 and “rJava” version 1.0), Bioclim (“dismo” version 1.3), Domain (“dismo” version 1.3), Mahalanobis (“dismo” version 1.3), Generalized Additive Model (“dismo” version 1.3), and Support Vector Machine (“dismo” version 1.3) (Hijmans and Elith 2017; Hijmans et al. 2017).
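
The 80/20 partition and the AUC criterion can be sketched as follows. The study used R packages for this step; the Python below is only an illustration, and the function names, the seed, and the toy scores are assumptions. AUC is computed here in its rank-based form, i.e., the probability that a randomly chosen presence scores higher than a randomly chosen background point.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=1):
    """Shuffle the records and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def auc(presence_scores, background_scores):
    """Rank-based AUC; ties between a presence and a background count half."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# 69 presences plus 3600 background points, labelled 1 and 0
records = [1] * 69 + [0] * 3600
train, test = train_test_split(records)
print(len(train), len(test))

# Toy model scores for held-out presences and background points
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
print(round(toy_auc, 3))
```

A simple random split like this one does not guarantee that presences are represented proportionally in the test set; whether the original partition was stratified is not stated in the text.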

Model ensembling

The top models (with the highest AUC) generated from this workflow were ensembled using AUC values as weights. Models were also evaluated based on their non-overpredicted performance (Barbet-Massin et al. 2012; Hijmans and Elith 2017; Kumari et al. 2021).

Model correction

Anthropogenic disturbance has already been identified as a limiting factor for the caracal’s distribution (Kolipaka 2011; Albayrak et al. 2012; Farhadinia et al. 2012; Khandal et al. 2020). We used nightlight data (collected from NOAA-NCEI for the year 2015) as a proxy for anthropogenic disturbance and urbanization. The data were classified into two ordinal categories, i.e., less disturbed areas (nightlight values < 20%) and highly disturbed areas (nightlight values > 20%). Given its nocturnal nature (Singh et al. 2014), the caracal avoids highly lit areas (Moqanaki et al. 2016; Ashrafzadeh et al. 2020). The nightlight data are also related to the presence of free-ranging dogs, which are known predators and competitors of caracals. Thus, areas with high nightlight values were masked from the output of the ensemble model. Consequently, the current output represents the realized niche of caracals in India. The output probabilities or pixel values of the model (0–100%) were classified into four ordinal categories, i.e., very low (0–25%), low (25–50%), moderate (50–75%), and high (75–100%). Taking a conservative approach, areas of very low probability (0–25%) were eliminated from the model, since our ultimate aim was to find the prioritized suitable habitats of caracals in India.
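
The masking and classification rules above can be expressed per cell as in the sketch below. It is a minimal illustration, not the study's raster workflow. The function name `classify_suitability` is hypothetical, and the treatment of boundary values (a nightlight value of exactly 20%, and probabilities that fall exactly on a class limit) is an assumption, because the text does not specify it.

```python
def classify_suitability(probability, nightlight, nightlight_cutoff=20.0):
    """Return the suitability class of one cell, or None if the cell is dropped.

    probability: ensemble output in percent (0-100).
    nightlight: nightlight value in percent (0-100).
    Boundary handling is assumed: lower class limits are inclusive.
    """
    if nightlight > nightlight_cutoff:
        return None  # highly disturbed cell, masked out
    if probability < 25.0:
        return None  # very low suitability, eliminated
    if probability < 50.0:
        return "low"
    if probability < 75.0:
        return "moderate"
    return "high"


# Toy cells given as (probability %, nightlight %)
cells = [(80.0, 5.0), (80.0, 35.0), (10.0, 5.0), (60.0, 5.0)]
print([classify_suitability(p, n) for p, n in cells])
# ['high', None, None, 'moderate']
```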

Post-processing with vegetation height in highly suitable habitats

To determine the realized niche of caracals within the highly suitable areas, we surveyed the literature for information on vegetation height in areas used by caracals. Due to logistic constraints, we selected a total of 30 points in the Ranthambhore-Kuno Landscape (RKL) and collected ground truth data on vegetation height using 30 × 30 m plots. Mean vegetation height from the ground truth points was regressed against remotely sensed forest height data for 2019 obtained from the Global Land Analysis and Discovery (GLAD) team at the University of Maryland (www.glad.umd.edu; Schwarz and Zimmermann 2005; Wu et al. 2015; Potapov et al. 2021). The variables were z-transformed for the generalized linear model using the Gaussian distribution (MacCullagh and Nelder 1989). The beta estimate was used to convert the remotely sensed vegetation height in the Ranthambhore-Kuno landscape at 30 m spatial resolution.

Additionally, a generalized linear regression model between the predicted probability of the ensemble model (0–100%) and the converted vegetation height was fitted to examine their relationship, using the R package “stats” (version 4.1.0). If the relationship was significant, the converted vegetation height data were classified into three equal classes based on height, i.e., short (0–4.5 m), moderate (4.5–9 m), and high (9–13.5 m). The class associated with higher suitability for the species was retained as refined potential habitat for fine-scale surveys; the remaining range of values was eliminated from the spatial extent between Ranthambhore TR and Kuno NP. These selected areas can be surveyed intensively to find caracals, since the potential area is narrowed down after multiple corrections. Area computations were carried out in the WGS1984 UTM43N projected coordinate reference system.

Status of suitable habitats for caracals in and around protected areas

Euclidean distances from the functional protected areas to the nearest available high or moderate suitability areas were computed using the “Near” tool in ArcMap version 10.8. This identified protected areas that are suitable for caracals and can be given conservation priority.
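
The proximity calculation can be approximated with a nearest-distance search, sketched below. This is not the ArcMap tool itself: it treats both the protected area and the suitable areas as points in projected coordinates (metres), which is a simplifying assumption, and the function name and coordinates are hypothetical.

```python
from math import hypot


def nearest_distance_km(origin, targets):
    """Euclidean distance in km from one point to the nearest target point.

    origin and targets are (x, y) pairs in projected metres (an assumption).
    """
    ox, oy = origin
    return min(hypot(tx - ox, ty - oy) for tx, ty in targets) / 1000.0


# Toy coordinates: one protected area and two suitable cells
protected_area = (0.0, 0.0)
suitable_cells = [(3000.0, 4000.0), (10000.0, 0.0)]
print(nearest_distance_km(protected_area, suitable_cells))  # 5.0
```

Measuring from polygon boundaries rather than from single points would generally give shorter distances than this point-based sketch.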

Results

The final choice of areas

States with current and historical records of caracals were selected for modeling. We collated a total of 138 validated presence records of the species from India, out of which 69 records were selected after the spatial thinning at 1 km spatial resolution for analysis.

Prediction models

As none of the five predictor variables were highly correlated (|r| < 0.7), all of them were retained for spatial analysis (Additional file 2: Fig. S2). After AUC-based comparisons, two models, MaxEnt (AUC 0.89) and Random Forest (AUC 0.91), were selected for ensembling (Additional file 3: Fig. S3). The remaining models, GLM (AUC 0.88), GAM (0.88), Mahalanobis (0.81), Domain (0.74), Bioclim (0.77), and SVM (0.76), were discarded as they showed either a relatively lower AUC or high overprediction (Additional file 3: Fig. S3).

Ensemble spatial model

The two best models differed in their prediction performance, and they were therefore ensembled using an AUC-based weighted mean (50.56% and 49.44% weights for the Random Forest and MaxEnt models, respectively). The True Skill Statistic (TSS) of the two top models was 0.83 (MaxEnt) and 0.84 (Random Forest).
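
The reported weights follow directly from normalizing the two AUC values so that they sum to one. The sketch below reproduces that arithmetic and applies the weights to toy per-cell predictions; the function names and the toy suitability values are hypothetical, and only the AUC values come from the text.

```python
def auc_weights(aucs):
    """Normalize AUC values so that the weights sum to 1."""
    total = sum(aucs.values())
    return {name: value / total for name, value in aucs.items()}


def ensemble(predictions, weights):
    """Weighted mean of per-model suitability for each cell."""
    names = list(predictions)
    n_cells = len(predictions[names[0]])
    return [
        sum(weights[m] * predictions[m][i] for m in names)
        for i in range(n_cells)
    ]


aucs = {"random_forest": 0.91, "maxent": 0.89}  # AUC values reported in the text
weights = auc_weights(aucs)
print({m: round(100 * w, 2) for m, w in weights.items()})
# {'random_forest': 50.56, 'maxent': 49.44}

# Toy suitability values for two cells (not the study's output)
toy_predictions = {"random_forest": [0.8, 0.2], "maxent": [0.6, 0.4]}
combined = ensemble(toy_predictions, weights)
print([round(v, 3) for v in combined])
```

Because the two AUC values are so close, the weighted mean is nearly identical to a simple average of the two models.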

Suitable areas for caracals

In total, an area of 346,726.8 km2 of the study area was classified as highly disturbed, as indicated by the nightlight data. After masking out the highly disturbed areas from the ensemble model, the areas of the three retained suitability classes were computed: 1207.83 km2 was highly suitable, 7453.65 km2 moderately suitable, and 39,984.6 km2 of low suitability (Fig. 3; Table 1).

Fig. 3

Map depicting the potentially suitable habitats for caracals after the model correction using nightlight information in the study area. Colors indicate the gradient of habitat suitability for the species, where areas having < 25% suitability are ignored as our ultimate target is to find potentially suitable habitats in dryland states in India

Table 1 Area of suitable habitat for caracals in each suitability class, by state in India

The largest extent of potential habitat for caracals was found in the state of Rajasthan with an area of 25,221.38 km2, followed by Gujarat (area: 16,652.1 km2), Madhya Pradesh (area: 6416.01 km2), Haryana (area: 191.36 km2), Uttar Pradesh (area: 131.11 km2), and Maharashtra (area: 34.17 km2; Fig. 3; Table 1). At the district level, high suitability areas for caracals were identified in Kachchh district of Gujarat; Sheopur, Morena, and Shivpuri of Madhya Pradesh; and Sirohi, Jalore, Alwar, Karauli, Sawai Madhopur, Kota, Dholpur, Bundi, Baran, Jaipur, Tonk, and Dausa of Rajasthan (Additional file 4: Table S2).

The linear model between ground-verified vegetation height and remotely sensed vegetation height showed a significant positive relationship (gr_height = 7.02 + 0.91rem_height, p = 0.005, AIC = 158.71, R2 = 0.19). The coefficients of this model were used to convert the remotely sensed vegetation height data (rem_height) into ground-verified vegetation height information (gr_height). A significant positive association was observed between the predicted probability of the corrected model and the converted vegetation height (suitability = 0.44 + 0.03(vegetation_height), p = 0.02, AIC = − 17.29, R2 = 0.27), which suggests that shorter forest height is significantly associated with relatively low suitability, and vice versa, in the Ranthambhore-Kuno landscape. Areas with shorter vegetation (0–4.5 m) were clipped out. Within the remaining area, where moderate and tall vegetation occur, 567.65 km2 was found to be of higher potential, followed by 1623.85 km2 of moderate potential and 4847.45 km2 of least potential (Fig. 4).
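
The conversion and the height classes can be written out as a short sketch using the reported coefficients. Several points are assumptions rather than statements from the text: the input is taken to be on the scale used to fit the model (the Methods state that variables were z-transformed, so a z-score is assumed here), class limits are treated as inclusive at the lower bound, and the function names are hypothetical.

```python
def convert_height(rem_height, intercept=7.02, slope=0.91):
    """Apply the reported linear model to a remotely sensed height value.

    rem_height is assumed to be on the scale used when fitting the model.
    """
    return intercept + slope * rem_height


def height_class(gr_height):
    """Assign the three equal-width height classes given in the Methods."""
    if gr_height < 4.5:
        return "short"
    if gr_height < 9.0:
        return "moderate"
    return "high"


# Toy inputs spanning the range of a standardized variable
for value in (-3.0, 0.0, 3.0):
    converted = convert_height(value)
    print(value, round(converted, 2), height_class(converted))
```

Cells falling in the "short" class correspond to the areas that were clipped out before the final area estimates.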

Fig. 4

Map depicting the potentially suitable habitats prior and after correction with vegetation height information in the Ranthambhore-Kuno Landscape (RKL) in central India

Predictors for caracal distribution

The ensemble model was influenced the most by precipitation seasonality (0.41 ± 0.01), followed by aridity (0.39 ± 0.01), terrain ruggedness (0.36 ± 0.01), temperature seasonality (0.18 ± 0.01), and vegetation index (0.11 ± 0.01). Habitat suitability had a strong positive association with temperature seasonality, precipitation seasonality, and terrain ruggedness (Fig. 5), while vegetation index showed a slightly positive effect on habitat suitability (Fig. 5). Within the sampling frame, less arid areas were found to be more suitable for the species (Fig. 5).

Fig. 5

Plots depicting the responses of species habitat suitability towards the predictor variables (Aridity Index, Normalized Difference Vegetation Index, Precipitation Seasonality, Temperature Seasonality and Terrain Ruggedness, from top left to mid down, respectively) used for forming the ensemble model for habitat suitability of Asiatic caracals in the study area

Status of suitable habitats for caracals in and around protected areas

The model output showed multiple protected areas close to regions of high and moderate suitability (Fig. 3), which calls for immediate attention and planning for in situ conservation of the species. A large number of protected areas are situated within the high-potential region for caracals, such as Kachchh Desert WLS, Wild ass WLS, Balaram Ambaji WLS, Rampara Vidi WLS, Jessore WLS in Gujarat; Kuno NP in Madhya Pradesh; Ranthambhore TR, Mukundra hills TR, Sariska TR, Ramgarh Vishdhari TR, Shergarh WLS, Kumbhalgarh WLS, Mount Abu WLS, Todgarh-Raoli WLS, Jaisamand WLS, Bassi WLS, Jamwa Ramgarh WLS, Shakambhari CR, Bandh Baretha WLS, Keoladeo NP, Bisalpur CR, Sundhamata CR and Jawai CR in Rajasthan. A few protected areas, such as Gandhi Sagar WLS (2.5 km from the nearest potential area) and Ghatigaon WLS (3.9 km) in Madhya Pradesh and Bhainsrorgarh WLS (4.2 km), Van Vihar WLS (6 km) and Phulwari ki Nal WLS (5.7 km) in Rajasthan, have potential caracal habitat in their proximity, which can be considered in habitat conservation planning (Fig. 3).

Discussion

Distribution and efforts for finding caracals

The historical distribution of the Asiatic caracal in India was extensive and overlapped with that of the blackbuck, chinkara and cheetah (Divyabhanusinh 1993). Despite this large historical range, caracals are currently on the brink of local extinction in several regions of India (Ranjitsinh and Jhala 2010). This is evident from the magnitude of the distributional decline shown by Khandal et al. (2020; Fig. 1). This situation demands focused research and conservation of the species and its habitat. The decline of the species has coincided with the large-scale conversion of grasslands, shrublands and forests into agricultural land since the early twentieth century (Tian et al. 2014; Vanak et al. 2017). Prima facie, habitat depletion has pushed the caracal towards less disturbed habitats, as identified by the ensemble model covering several regions in Gujarat, Rajasthan and Madhya Pradesh (Fig. 3). Recent records from the twenty-first century support species occurrence in many of these regions. The current ensemble model helps identify potential habitats where caracal populations could occur, assess their status, and plan targeted priority conservation actions in these regions; it can also help update the extent of occurrence estimates and the IUCN Red List assessment of the species. Previous camera trapping exercises in Panna TR and Kuno NP (both within the historical distribution range of caracals; Divyabhanusinh 1993) did not record the species (Jhala et al. 2020; Khandal et al. 2020), which was attributed to low detection or inadequate sampling (Singh et al. 2015), or to seasonal migration (Adibi et al. 2014). However, Kuno holds high-potential habitats for caracals, as does Ranthambhore TR, where sightings are frequent (Parashar 2020; Khandal et al. 2020; Tanwar et al. 2021), and previous studies indicate the availability of functional wildlife movement corridors between Ranthambhore TR and Kuno NP (Qureshi et al. 2014).
In the case of Mirzapur (Uttar Pradesh), the once suitable habitats are no longer suitable for caracals, indicating the possible extinction of caracals from the region. The frequent sighting reports from Kachchh and Sawai Madhopur may also reflect long-term camera trapping efforts in these landscapes (e.g., AITE-2018 used 53 camera traps at 150 sites with 5341 trap nights in Ranthambhore TR; Jhala et al. 2020) and high tourist footfall, whereas less effort was deployed in other areas (such as Kuno NP, which has 85 sites with 1792 trap nights). In addition, species-specific behaviors in the field can result in later detection by camera traps than for other mammals (Tourani et al. 2020), and elusive terrestrial species, particularly rare ones, require considerable sampling effort to be captured by camera traps (Chatterjee et al. 2021). However, lower availability of prey species and/or large populations of competitors could also explain the relatively negligible population of caracals in Kuno NP (Avenant and Nel 1998; Mukherjee et al. 2004; Moqanaki et al. 2016). A maximum entropy-based species distribution model for caracals was recently built from presence records from Ranthambhore TR and Kachchh (Jhala et al. 2020), which ultimately led the model towards under-prediction. The presence of caracals in highly suitable areas must be validated on the ground through surveys or by referring to newly published records from such places. Meanwhile, new efforts using systematic camera trapping can help find caracals (Rondinini et al. 2011) in regions where they were not historically reported, as the ensemble model highlighted high-potential zones, e.g., Kuno NP. Recent records of the species are from specific habitats, such as ravines, grasslands, dry deciduous forests, etc. (Avgan et al. 2016; Khandal et al. 2020).
For Kuno, the unit sampling area for camera trapping needs to be reduced to 1 km2, smaller than that used for tigers and leopards (i.e., 2 km2; Jhala et al. 2020). This is supported by the smaller home ranges of caracals (Avenant and Nel 1998) compared with big cats such as tigers, leopards, and cheetahs (Broomhall et al. 2003; Sankar et al. 2010; Majumder et al. 2012; Kumbhojkar et al. 2020).

Large areas of suitable caracal habitat fall within protected areas, providing an excellent opportunity for conserving the species within the existing management and conservation setup. India has a robust network of protected areas, which are frequently surveyed by the government-initiated All India Tiger Estimation and by many wildlife organizations, and these surveys can assist in monitoring the status of caracals. These areas can also be prioritized if species reintroduction is planned.

Modeling rationalization

Given the evaluation criteria of the models, weighted results are more robust than any single optimized model for informing precise habitat suitability for caracals (Breiner et al. 2015; Qiao et al. 2015). In the case of a small sample size or a rarely detected species, the possibility of model overfitting or overprediction may increase, which ultimately results in low accuracy of models on the ground (Lomba et al. 2010; Hardy et al. 2011; Breiner et al. 2015). The ensemble model narrows down the search area when a species needs to be searched for in projected or predicted areas (Mi et al. 2017). Here, the model helped identify the areas where intensive surveys for caracals can be conducted, which is both cost- and labor-efficient. It would be relatively easier to detect the species in these microhabitats if it exists there.

We also evaluated a potential issue in our study, namely that the number of background points was uniform across all executed models (i.e., 3600; Barbet-Massin et al. 2012). Although this number was kept optimum, to address the issue, the AUC for the Random Forest algorithm was also computed using a number of randomly selected background points equal to the number of rarefied presence records (i.e., 69). Random Forest is the only algorithm here that requires an almost equal number of pseudo-absences and presences, unlike the regression-based or other machine learning-based algorithms (Barbet-Massin et al. 2012; Li and Guo 2013; Konowalik and Nosol 2021).

On-ground predictors for identifying caracal habitats

The response curves of the predictor variables indicate that areas with high temperature and precipitation seasonality, highly rugged terrain, moderately to highly dense forests, and low-to-intermediate aridity are climatically suitable for caracals (Fig. 5). Studies from Iran reported the use of highly rugged areas with good vegetation cover by caracals (Adibi et al. 2014; Hemami et al. 2018); our results support these findings at a larger scale. Precipitation seasonality can also indicate water availability in the landscape, which is known to limit caracals’ distribution (Najafi et al. 2019). These analyses helped identify a better approximation of the realized niche of the species within the study extent (Peterson et al. 2007). High anthropogenic pressures and free-ranging dogs are known limiting factors that affect habitat use by caracals in a real system (Albayrak et al. 2012; Farhadinia et al. 2012; Adibi et al. 2014; Ramesh et al. 2016; Khandal et al. 2020); hence, highly disturbed areas were clipped out from the ensemble model.

Species occurrence probability relies upon the abiotic and biotic factors, including prey species. Caracals are found in similar habitats to their prey too, as they rely on a wide variety of dietary choices, e.g., they are known to feed upon rodents, ungulates, small carnivores, other mammals like hares, shrews, hyrax, wild and domestic goats and birds (Palmer and Fairall 1988; Avenant and Nel 2012; Braczkowski et al. 2012; Momeni et al. 2019; Jansen et al. 2019), and this pattern was also reported from India (Mukherjee et al. 2004). The type of taxa stays constant throughout the studied areas from Africa, the Middle East to India. The population status of caracals’ prey base species needs to be monitored regularly as a fundamental step toward species conservation (IUCN SSC 2013). However, the population density of caracals depends not solely on the availability of prey, but also the habitat type and degree of anthropogenic pressures (Avenant and Nel 2002).

Furthermore, selecting the areas correlated with vegetation height brought the ensemble model closer to the realized niche of the species, ultimately shrinking the prospective survey areas (also suggested by Peterson (2006)). The potential areas identified in the selected districts of the dryland states of India need to be adequately surveyed to find caracals. The areas found suitable in the output map call for an appropriate framework of fieldwork, especially in Kuno NP. If caracals are not detected in Kuno, the translocation of individual caracals to Kuno NP could be the next option, since this is one of the best habitats for caracals in India. Ultimately, inferring highly suitable habitats from the ensemble model may provide insights for strategic management planning in conservation priority areas for caracals in India.

Conclusion

In brief, this study suggests that very few and sporadic areas (i.e., 567 km2) remain highly suitable for caracals within their known historical range. Intensive surveys and conservation efforts should be prioritized in these areas, considering that the species is currently threatened.