Introduction

Estimating animal habitat use has become an achievable task thanks to two technological advances of the early 2000s that have become standard practice: GPS technology and remote sensing (Cagnacci et al. 2010). Over the last 2 decades, geographic layers have been based more frequently on satellite data (Kuenzer et al. 2014; Pettorelli et al. 2014) and have facilitated standardized and large-scale ecological analyses regarding the interaction between animals and their environment (Pettorelli et al. 2005; Handcock et al. 2009; Dodge et al. 2013). However, for a given landscape, different layers may show different habitat patterns, depending on the purpose for which they were created (e.g., human land use, vegetation productivity), on the nature of the data source (e.g., number and type of sensors, spatial and spectral resolution), and on the data processing used to develop them (e.g., type of classification algorithm, filtering rules). For these reasons, accurate assessment of animals’ habitat use is potentially very susceptible to the inconsistent classification of animal locations within the landscape by different geographic layers (Ewald et al. 2014). For example, Oeser et al. (2020) found that habitat suitability maps for red deer, roe deer and lynx (Cervus elaphus, Capreolus capreolus, and Lynx lynx, respectively) directly derived from Landsat imagery markedly differed from those based upon standard land-cover layers, potentially leading to different management decisions. Yet, surprisingly little attention has been given to the effect of using different geographic products on habitat use estimates (but see Fleming et al. 2004; Neumann et al. 2015; Remelgado et al. 2018), even for habitat assessments of ecotone species (i.e., species exploiting transition areas between different vegetation communities, like roe deer) that may be particularly sensitive to small errors in mapping. 
Ecotone species often prefer landscapes with small habitat patches and high amounts of edge, such as fragmented forest units within human-modified agricultural matrices (Tufto et al. 1996; Rivrud et al. 2009), that may be misclassified in simplified, categorical land cover layers (Pekkarinen et al. 2009).

Patterns in habitat use are also affected by human disturbance, with evidence pointing towards species becoming more nocturnal with increasing human activity. For example, red and roe deer in European landscapes have been shown to be more active during crepuscular and nighttime hours and to alternate their use of forested habitat to avoid human disturbance: deer have been observed moving from forested, less anthropized refugia during the day to food-rich, yet also more anthropized, open habitats under the cover of darkness (e.g., Rivrud et al. 2009; Bonnot et al. 2013; Padié et al. 2015; Dupke et al. 2017). This observed diel cycle of forest use has been used not only to measure deer responses to human activity (Bonnot et al. 2013), but also as a proxy for examining effects of hunting and predation risk (Gehr et al. 2018), grazing damage (Rivrud et al. 2009), and temporal exposure to vehicle collision (Bruinderink and Hazebroek 1996; Balčiauskas et al. 2020). Carbillet et al. (2020) showed that roe deer ranging closer to roads and human infrastructure during daytime have higher cortisol levels (evidence of higher stress levels), but only when in open habitats. In this study, we aimed to assess two scales of forest use, fine (using raw GPS locations) and broad (within the home range derived from the same locations), of red and roe deer across European landscapes, relying on and comparing the outputs from two pan-European, freely available geographic layers derived from satellite imagery: Corine Land Cover (CLC) and Tree Cover Density (TCD) (Table 1).

Table 1 Most relevant features of CLC 2012 and TCD 2012

Due to its standardized land cover classification, CLC has been widely used to assess habitat use and selection (e.g., Lundy et al. 2012; De Groeve et al. 2016), habitat suitability (e.g., Schadt et al. 2002; Falcucci et al. 2009; Bosch et al. 2012), and connectivity (e.g., Vogt et al. 2007; Saura et al. 2011) of animal populations ranging in Europe. However, as pointed out by Pekkarinen et al. (2009), CLC filters out forest patches smaller than 25 ha (250,000 m²), implying that small habitat components, such as hedgerows, edges, small forest or open habitat patches, go undetected or get oversimplified. While simplification and generalization are essential for defining broad land cover classes, they might also lead to errors in the environmental classification of animal movement trajectories. In contrast, the high-resolution layer Copernicus TCD is not based upon a minimum size of mapped features, but on the percentage of canopy cover per raster cell (resolution of 20 m). Unlike CLC, TCD does not give any information on land cover types, but only provides a tree cover index. While available for use since 2015, TCD remains underutilized in movement ecology studies based in Europe, to the best of our knowledge (but see De Groeve et al. 2020; Fenton et al. 2021).

To assess the inconsistencies in the estimates of deer habitat use based on two contrasting geographic layers, TCD and CLC, we first evaluated the classification mismatch of red and roe deer GPS locations between the two layers; we then modelled day/night forest use at the GPS location and home range (HR) level using both layers and compared the outputs (Fig. 1). Our research questions, predictions and analytical steps are outlined in Table 2.

Fig. 1

Graphical summary of the analyses of red and roe deer habitat use based on TCD and CLC geographic layers. The upper box (Data Preparation) describes the extraction of red and roe deer GPS locations during daytime and nighttime from the Eureddeer and Eurodeer databases, the computation of Kernel Density Estimates for HRs, and their classification with both TCD and CLC as forest or open. As a result, the proportion of forest at the GPS location and HR level, using both layers, is shown. The lower box describes our analysis workflow, which consists of three parts. The upper panel (Classification mismatch) indicates the comparison of GPS locations classified using the two layers, to identify mismatches. These happen when one GPS location is classified as open by TCD and as forest by CLC (OF in the figure, light blue points), or vice versa (FO in the figure, dark blue points). The central panel (Validation) shows the workflow of the validation of samples of mismatching GPS locations (light blue/dark blue) using Google Earth orthophotos. The lower panel (Analysis-GLSM) indicates modelling of forest use, at day and night, by the two species (full model), both at GPS location and HR level. The model was applied separately to data classified with TCD and with CLC. (Color figure online)

Table 2 List of research questions, Q, and relative predictions, P, with corresponding analytical steps

Materials and methods

Movement data

From five populations of each species (i.e. 10 populations in total) we derived space use metrics for a total of 85 red deer and 105 roe deer (Fig. 1, Data Preparation) from the Eureddeer and Eurodeer databases (euromammals.org, e.g., Peters et al. (2019); see Online Appendix S1 and Fig. 2 for a description of the populations and Urbano et al. (2021) for a general presentation of the database). The use of different populations distributed across Europe allowed us to have a good representation of the different forested landscapes across the continent. For each individual deer, we sub-sampled two GPS locations a day, at noon and midnight (12 h ± 1 h 30 min, 0 h ± 1 h 30 min), so as to exclude GPS locations that might occur during crepuscular periods and therefore ensure an unambiguous representation of day and nighttime habitat use. We only included summer GPS locations (from May to October) to avoid the inclusion of seasonal migration movements, which could confound our analyses. We considered GPS locations from the same individual for different years separately, leading to an average of 160 ± 26 and 170 ± 21 fixes for day and night, respectively. This resulted in a dataset of 117 animal-years for red deer and 141 animal-years for roe deer (Fig. 1), with a total of 39,522 and 45,529 GPS locations, respectively (see Online Appendix S2 for a complete list of the individuals, summer GPS location samples and monitoring periods).
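The sub-sampling rule described above (one fix near noon and one near midnight, within 1 h 30 min, May to October only) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names nearest_target and subsample are hypothetical, timestamps are assumed to be in local time, and the rule of keeping the fix closest to the target time is an assumption.

```python
from datetime import datetime, timedelta

SUMMER_MONTHS = range(5, 11)                  # May to October, as in the text
TOLERANCE = timedelta(hours=1, minutes=30)    # +/- 1 h 30 min window


def nearest_target(ts):
    """Return (label, target) for the noon or midnight nearest to ts.

    Returns None when ts is outside the tolerance window or the summer season.
    """
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    candidates = [
        ("night", day_start),
        ("day", day_start + timedelta(hours=12)),
        ("night", day_start + timedelta(days=1)),
    ]
    label, target = min(candidates, key=lambda c: abs(ts - c[1]))
    if abs(ts - target) > TOLERANCE or target.month not in SUMMER_MONTHS:
        return None
    return label, target


def subsample(timestamps):
    """Keep, for each noon/midnight target, the single fix closest to it (assumed rule)."""
    best = {}
    for ts in timestamps:
        hit = nearest_target(ts)
        if hit is None:
            continue
        if hit not in best or abs(ts - hit[1]) < abs(best[hit] - hit[1]):
            best[hit] = ts
    return best
```

Fixes taken at dawn or dusk fall outside both windows and are dropped, which is what keeps the day and night samples unambiguous.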

Fig. 2

Red and roe deer GPS locations (in ochre and brown respectively) mapped on the TCD geographic layer in five different populations per species. Red deer populations are from SE-Germany (1A), N-Italy (2A), SW-Belgium (3A), SE-Belgium (4A) and N-Germany (5A). Roe deer populations are from SE-Germany (1B), N-Italy (2B), Switzerland (3B), SW-France (4B) and SW-Germany (5B). See Online Appendix S1 for a description of the study areas

We computed the day and night home ranges (HRs) for each summer GPS location sample using Kernel Density Estimation (KDE). A KDE produces a utilization distribution resulting from a sum of unimodal distributions centred around each GPS location, whose spread is controlled by a smoothing parameter. KDEs were calculated using the function kernelUD of the R package adehabitatHR (Calenge 2011), using a bivariate Gaussian kernel, the plug-in estimate for the smoothing parameter (s_avg × n^(-1/6), with s_avg the average standard deviation of the x and y coordinates and n the number of GPS fixes) and the 90% isopleth (Börger et al. 2006).
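As a rough illustration of how the smoothing parameter scales with sample size, the sketch below computes a bandwidth of the form s_avg × n^(-1/6). It assumes the exponent is -1/6 and that s_avg is the plain mean of the two coordinate standard deviations, following the wording above; the exact computation inside adehabitatHR may differ, and the function name reference_bandwidth is hypothetical.

```python
from statistics import stdev


def reference_bandwidth(xs, ys):
    """Smoothing parameter h = s_avg * n**(-1/6) for projected coordinates (e.g. metres).

    Assumption: s_avg is the simple mean of the sample standard deviations of x and y.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, with at least two fixes")
    s_avg = (stdev(xs) + stdev(ys)) / 2
    return s_avg * n ** (-1 / 6)
```

Because of the negative exponent, the bandwidth shrinks slowly as more fixes are added, so a sample of about 160 fixes is smoothed only moderately less than a much smaller one.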

Classification of movement data with environmental data: TCD and CLC

To estimate the diel use of forest by deer (Fig. 1, Data Preparation), we classified each sampled GPS location as forest or open, and described each HR by its proportion of forest and open habitats. For this purpose, we intersected GPS locations and HRs with both Copernicus (TCD 2012; Table 1) and Corine (CLC 2012; Table 1), extracting the habitat values at their corresponding positions. We chose the 2012 layers because they were the closest in time to the monitoring periods of most of the GPS-tracked roe and red deer (Table S2.1).

To estimate the use of forest vs open habitat by deer we needed to reclassify the original TCD and CLC layers into binary layers of forest and open, as is common practice in habitat use or suitability studies focused on forest species (e.g., Bosch et al. 2012; Falcucci et al. 2009). We reclassified the TCD raster to a binary layer where a pixel is considered forest at a minimum canopy cover threshold of 10%, in line with the FAO (Food and Agriculture Organisation) official definition of forest (Global Forest Resources Assessment 2020), and as a conservative choice after a sensitivity analysis that showed marginal change in the output classification with thresholds up to 40% (see Online Appendix S3). Pixels with a forest canopy cover < 10% were therefore considered as open habitats. The CLC original vector layer was rasterized to a resolution of 20 m to match the grid of TCD, based on the class of the vector feature covered by the raster pixels, and considering the class of the feature with the larger area when two features were juxtaposed. We assigned the class ‘forest’ to all pixels belonging to the classes broad-leaved, coniferous and mixed forest (311, 312, 313), and ‘open’ to all others. We compared our raster layer to the original 100 m CLC layer, and to a CLC classification including also agro-forestry class as ‘forest’, finding no relevant differences (Online Appendix S3). Note that in both layers, open habitats may also include non-habitats, such as roads and infrastructure.
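The two reclassification rules can be written down compactly. The sketch below applies them to single cell values; the threshold and the CLC codes come from the text, while the function names and the "F"/"O" labels are illustrative choices.

```python
TCD_FOREST_THRESHOLD = 10            # minimum canopy cover (%) for a forest pixel
FOREST_CLC_CODES = {311, 312, 313}   # broad-leaved, coniferous and mixed forest


def tcd_to_binary(canopy_cover_pct):
    """Label a TCD pixel 'F' (forest) or 'O' (open) from its canopy cover percentage."""
    return "F" if canopy_cover_pct >= TCD_FOREST_THRESHOLD else "O"


def clc_to_binary(clc_code):
    """Label a rasterized CLC pixel 'F' or 'O' from its three-digit class code."""
    return "F" if clc_code in FOREST_CLC_CODES else "O"
```

Keeping the threshold and the class set as named constants makes it straightforward to rerun the classification with alternative values, as in a sensitivity analysis.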

Classification mismatch between TCD and CLC in red and roe deer GPS locations

We compared the TCD and CLC classifications of red and roe deer GPS locations (Fig. 1, Analysis panel, part A) by computing a confusion matrix for each species. A confusion matrix typically aims to compare a predicted class to an actual (‘true’) class. In this first objective (Table 2, Q.A1), we used it to compare two forest-open classifications that can both be considered predictions. Hence, the mismatch categories in the matrix (FO and OF, see below) indicate the relative inconsistency between the two classifications. Specifically, our confusion matrices include four categories: GPS locations that were consistently classified by TCD and CLC as either forest or open habitat (FF and OO; Fig. 1), and GPS locations oppositely classified by TCD and CLC, i.e., classified as forest by TCD but as open by CLC, or vice versa, resulting in a classification mismatch (FO and OF; Fig. 1).
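The four categories can be tallied directly from paired labels. The following sketch assumes each GPS location already carries an "F" or "O" label from each layer, with the TCD label written first as in the text; the function names are hypothetical.

```python
from collections import Counter


def mismatch_table(tcd_labels, clc_labels):
    """Count FF, FO, OF and OO pairs; the first letter is the TCD class, the second the CLC class."""
    if len(tcd_labels) != len(clc_labels):
        raise ValueError("both layers must label the same GPS locations")
    counts = Counter(t + c for t, c in zip(tcd_labels, clc_labels))
    return {key: counts.get(key, 0) for key in ("FF", "FO", "OF", "OO")}


def mismatch_share(table):
    """Proportion of locations classified oppositely by the two layers (FO + OF)."""
    total = sum(table.values())
    return (table["FO"] + table["OF"]) / total
```

Note that the marginal forest proportions of the two layers can be nearly identical even when the mismatch share is large, because FO and OF errors offset each other.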

To test whether the size of the habitat patches was associated with a classification mismatch between the two layers (see Table 2), we used the FRAGSTATS software for landscape structure analysis (McGarigal et al. 2012) to measure the size of the open or forest units used by deer (i.e., with GPS locations falling therein), as determined by TCD. We then compared the median size of units that were consistently classified by TCD and CLC (i.e., FF, OO) with the median size of units that were oppositely classified (i.e., FO, OF), using a Wilcoxon rank sum test.
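For readers unfamiliar with the test statistic, the sketch below computes the rank-sum statistic W in its pair-counting form: the number of pairs in which a value from the first sample exceeds a value from the second, with ties counting one half. It covers only the statistic, not the p-value, and the function names are illustrative.

```python
from statistics import median


def rank_sum_w(x, y):
    """Wilcoxon rank sum (Mann-Whitney) statistic for sample x against sample y.

    Counts pairs with x_i > y_j, plus half of the tied pairs. Quadratic in sample size,
    which is fine for a sketch but slow for tens of thousands of locations.
    """
    w = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                w += 1.0
            elif xi == yj:
                w += 0.5
    return w


def compare_patch_sizes(mismatched_ha, consistent_ha):
    """Return both medians and W for patch sizes (ha) of mismatched vs consistent units."""
    return median(mismatched_ha), median(consistent_ha), rank_sum_w(mismatched_ha, consistent_ha)
```

A W close to zero indicates that patches in the first group are almost always smaller than those in the second.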

Classification mismatch validation with Google Earth

To further compare inconsistent classifications between both layers, we also conducted a ground-truth validation with a random sub-sample of 500 GPS locations per species (i.e., 100 GPS locations per population, equal to 1000 GPS locations in total) with mismatched classification (i.e., FO, OF; Fig. 1, Analysis panel, part B). We overlaid these GPS locations with orthophotos from Google Earth (Gorelick et al. 2017), which we assumed to represent the “ground truth”. To match the reference time of orthophotos with that of raster layers, we used the Google Earth function “Historical Imagery” to retrieve older images. We then visually interpreted whether a GPS location was in forest or open habitat in the Google Earth Pro software. In this second objective (Table 2, Q.A3), we used a confusion matrix to compare opposite predictions from the two geographic layers to ground-truth values, instead of comparing a single geographic layer to ground-truth, as customarily done to estimate accuracy (Stehman 1997; Pekkarinen et al. 2009). Hence, in this case the matrix indicates which of the two predictions was correct. Thus, we could determine in retrospect which of the two layers had correctly classified each GPS location, and we summarised the results of this validation as a post-validation confusion matrix for each species, with two dimensions (GPS locations oppositely classified by TCD and CLC, and Google Earth) and two classes (forest and open habitat).
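Drawing the validation sample amounts to a stratified random draw without replacement, with populations as strata. A minimal sketch follows, assuming the mismatched locations are available as identifiers grouped by population; the function name, the seed argument and the fallback for small populations are assumptions, not part of the described method.

```python
import random


def stratified_sample(mismatches_by_population, per_population=100, seed=0):
    """Draw up to per_population mismatched locations from each population, without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    sample = {}
    for population, location_ids in mismatches_by_population.items():
        ids = list(location_ids)
        k = min(per_population, len(ids))  # assumed fallback if a population has fewer mismatches
        sample[population] = rng.sample(ids, k)
    return sample
```

With five populations per species and 100 locations per population, this yields the 500 locations per species described above.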

Day versus night use of forest in roe and red deer at the GPS and home range scales as determined by TCD and CLC

We modelled the proportion of forest used by red deer (n = 117 animal-years) and roe deer (n = 141 animal-years) separately, using Generalized Least Squares (GLS) models with a Gaussian distribution of residuals (Aitken 1935; Fig. 1, Analysis panel, part C). Our response variables were FGPS, namely the proportion of GPS locations in forest habitat per individual, and FHR, namely the proportion of forest habitat within the HR per individual, modelled as a function of the explanatory variables time of the day (daytime vs. nighttime) and population (five populations for each species; Fig. 1, Analysis panel, part C; Fig. 2). We used GLS, rather than an Ordinary Least Squares model, to correctly describe the differing variances among times of day and populations. Our data met the GLS model assumptions. We compared all models with the different combinations of the two explanatory variables time of the day and population, including both their additive effects and their two-way interaction. We added two further models with a variance term for population and for time of day, respectively, when the predictors were included as additive factors. This resulted in eight models for each species (red and roe deer), space use metric (GPS locations and HRs), and layer used (TCD and CLC). The models were ranked according to the Akaike Information Criterion corrected for small sample sizes (AICc, Burnham and Anderson 2002; see Online Appendix S5, Tables S5.1–S5.8). Predictions of day vs nighttime forest use are visualized in Fig. 4, comparing predictions derived from CLC and TCD within each panel, for each metric and species combination (GPS locations for roe deer; GPS locations for red deer; HRs for roe deer; HRs for red deer).
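The ranking step relies on the standard small-sample correction, AICc = AIC + 2k(k + 1)/(n − k − 1), where k is the number of estimated parameters and n the sample size. The sketch below ranks candidate models by their difference from the best AICc; the model names and log-likelihood values used in any call are hypothetical inputs, not results from this study.

```python
def aicc(log_likelihood, k, n):
    """Akaike Information Criterion corrected for small sample sizes."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(models, n):
    """Rank models by delta AICc.

    models maps a model name to a (log_likelihood, k) tuple; returns a list of
    (name, delta_aicc) pairs sorted from best (delta = 0) to worst.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in models.items()}
    best = min(scores.values())
    return sorted(((name, score - best) for name, score in scores.items()),
                  key=lambda pair: pair[1])
```

The correction term grows with k relative to n, so models with many variance parameters are penalized more heavily in samples of a little over 100 animal-years.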

Results

Classification mismatch between TCD and CLC in red and roe deer GPS locations

For red deer, 23% of GPS locations were oppositely classified by TCD and CLC layers (8959 out of 39,522 GPS locations, Table 3.A, top-left panel). Specifically, 13% of the GPS locations were classified as forest by TCD but as open by CLC (Table 3.A. dark blue, ‘FO’), and 9% as open by TCD but as forest by CLC (Table 3.A. light blue, ‘OF’). Similarly, 18% of roe deer GPS locations were oppositely classified by TCD and CLC (8294 out of 45,529 GPS locations, Table 3.A, bottom-left panel), with 10% classified as forest by TCD but as open by CLC, and vice versa for 8% of the GPS locations. Thus, prediction P.A1 was not confirmed, since both species presented a similar percentage of GPS locations oppositely classified by TCD and CLC, and even a slightly greater percentage of mismatch for red deer than for roe deer. Interestingly, despite such mismatched classification of single GPS locations, the overall proportion of GPS locations classified as forest or open was very similar when using the two layers (overall proportion of GPS locations classified as forest for red deer: 67% with TCD, 63% with CLC; for roe deer: 38% with TCD, 36% with CLC). This result is showcased with individual examples in Fig. 3 (upper panels: red deer; lower panels: roe deer). In the right-hand panel, the dark-blue and light-blue habitat patches are oppositely classified by TCD and CLC. Consequently, the intersecting GPS locations show a mismatched classification. Visual inspection in Google Earth shows that oppositely classified habitat patches, in this example, correspond to small fragments or linear features. In particular, the patches classified as forest by TCD, but open by CLC (dark blue), are small woody fragments, whereas patches classified as open habitat by TCD, but forest by CLC (light blue), are small openings or roads.
This example was also confirmed at the population level, since GPS locations oppositely classified by TCD and CLC were found in habitat units significantly smaller than average, confirming our prediction P.A2. Indeed, the median size of used forest units (as determined by TCD) that were oppositely classified by TCD and CLC (FO) was 3468 ha and 68 ha for red and roe deer respectively, while the median size of used forest units consistently classified by TCD and CLC (FF) was 5302 ha and 170 ha for red and roe deer respectively (Wilcoxon Tests: red deer W = 40,459,941, p < 0.001; roe deer W = 17,327,785, p < 0.001). Similarly, used open habitat units that were oppositely classified by TCD and CLC (OF) had a median size of 4 ha in red deer and 102 ha in roe deer, while those that were consistently classified by TCD and CLC (OO) had a median size of 753 ha in red deer and 856 ha in roe deer (Wilcoxon Tests: red deer W = 5,359,340, p < 0.001; roe deer W = 26,076,120, p < 0.001). Hence, oppositely classified GPS locations primarily occurred in small openings for red deer and both small forest patches and openings for roe deer. This is likely linked with the prevalent use of small forest patches by roe deer, when forest is used (23% of roe deer GPS locations classified as forest by TCD were in forest patches smaller than 25 ha (see above, minimum mapping unit for CLC); 4% of GPS locations for red deer) and of small clearings by red deer, when open habitat is used (32% of red deer GPS locations classified as open by TCD were in open patches smaller than 25 ha; 6% of GPS locations for roe deer).

Table 3 (A) Confusion matrices (See also Fig. 1—Analysis Classification Mismatch) comparing the TCD and CLC classifications of red and roe deer GPS locations as forest or open habitats (output for all individuals across the five populations combined).
Fig. 3

Summer GPS locations and HRs (HR, 90% KDE; yellow for day and black for night) for one red deer (upper panels: ID 762, adult female, SW-Belgium, summer 2010) and one roe deer (lower panels: ID 2285, adult male, Switzerland, summer 2013) overlaid onto TCD (left panel) and CLC (middle panel) layers. The bar charts below the plots indicate the proportion of forest (dark green) and open habitat (light green) within HRs and for GPS locations as classified by the respective layers. The right column illustrates the classification mismatch between the two raster layers, with consistent classification between TCD and CLC indicated in green (dark for forest, light for open), and opposite classifications between TCD and CLC in blue (dark blue: TCD classification as forest but CLC classification as open; light blue: TCD classification as open but CLC classification as forest). See also the colour codes in Table 3. (Color figure online)

Classification mismatch validation with Google Earth

Finally, the validation analysis on 500 GPS locations per species that were oppositely classified by TCD and CLC (i.e., OF and FO) showed that, according to the ground-truth layer, the TCD classification was more often correct than the CLC classification, confirming P.A3: 81% of the interpretable GPS locations for red deer (386 out of 475) and 74% for roe deer (344 out of 468) were correctly classified by TCD (Table 3.B). Conversely, 19% of the GPS locations for red deer (89 out of 475) and 26% of the GPS locations for roe deer (124 out of 468) were correctly classified by CLC. For about 6% of the oppositely classified GPS locations (57 out of 1000) we could not determine the habitat type through visual interpretation of the orthophotos. When comparing populations, the proportion of GPS locations correctly classified by either TCD or CLC slightly differed, but TCD outperformed CLC in all cases (see Online Appendix S4 for more details).
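The reported shares follow directly from the counts given above, and the bookkeeping can be checked with a few lines. The counts in the sketch are those stated in the text; the dictionary layout and the function name summarise are illustrative.

```python
# Counts of validated (interpretable) locations, as reported in the text.
VALIDATED = {
    "red deer": {"tcd_correct": 386, "clc_correct": 89},
    "roe deer": {"tcd_correct": 344, "clc_correct": 124},
}
SAMPLED = 1000  # 500 oppositely classified locations per species


def summarise(validated, sampled):
    """Per-species share of locations correctly classified by each layer, plus undetermined count."""
    summary = {}
    interpreted = 0
    for species, counts in validated.items():
        n = counts["tcd_correct"] + counts["clc_correct"]
        interpreted += n
        summary[species] = {
            "n": n,
            "tcd_pct": 100 * counts["tcd_correct"] / n,
            "clc_pct": 100 * counts["clc_correct"] / n,
        }
    # Locations whose habitat could not be determined from the orthophotos
    summary["undetermined"] = sampled - interpreted
    return summary
```

Because every validated location was classified oppositely by the two layers, the TCD and CLC shares sum to 100% within each species.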

Day versus night use of forest in roe and red deer at the GPS and home range scales as determined by TCD and CLC

Confirming our prediction P.B1, red and roe deer forest use was greater during daytime than nighttime. Forest use also varied across populations, although the difference observed between daytime and nighttime forest use was consistent across populations (no two-way interaction retained between time of the day and population, see Online Appendix S5). These patterns were more pronounced with TCD, especially when GPS locations were used instead of HRs (Fig. 4 and Table S5.9, S5.10).

Fig. 4

Model predictions of red and roe deer forest use during day and night. Upper and lower panels are respectively the model predictions for red and roe deer and left and right for GPS location and HR levels. Model predictions based on TCD and CLC layers are indicated respectively by circles and triangles, and the colour distinguishes the predictions for day and night (in yellow and black respectively). All best models included a Generalized Least Squares estimate of the variance, except HR-TCD for roe deer (grey bars, see also Table S5.9). (Color figure online)

The difference in the use of forest from day to night in both species was larger when using TCD than with CLC, confirming our prediction P.B2. For red deer, the difference in the use of forest from day to night showed on average a decrease of 39% with TCD (circles in Fig. 4, top left panel; Table S5.10), compared to a 22% decrease with CLC (triangles in Fig. 4, top left panel; Table S5.10), at the GPS location level. Similarly, for roe deer, the difference in the use of forest between day and night showed on average a decrease of 53% with TCD (circles in Fig. 4, bottom left panel; Table S5.10), and 46% with CLC (triangles in Fig. 4, bottom left panel; Table S5.10). Also, CLC-based estimates showed a larger within-population variability: the average standard error across populations for red deer was 0.03 and 0.06 for TCD and CLC, respectively; for roe deer, 0.03 and 0.04 for TCD and CLC, respectively (see Table S5.10). On average, the absolute difference between estimated forest use by TCD and CLC was 11.5% for red deer and 6.95% for roe deer during daytime, and respectively 7.01% and 8.02% during nighttime. Hence, while we confirmed our prediction P.B2 that the estimates of diel forest use differed between the two layers, we did not find greater discrepancy between TCD and CLC layers for roe deer (P.B4) given that estimates showed a similar difference between layers in roe and red deer. Interestingly, during daytime forest use estimates were prevalently larger for TCD (9 out of 10 populations), while during nighttime CLC estimates were often larger (6 out of 10 populations). The differences between day and night use of forest were much less evident at the HR level, both for red and roe deer (Fig. 4, right panels; Table S5.10). In addition, differences between forest use estimates obtained with TCD and with CLC were also reduced at the HR level compared to the GPS location one (P.B3), although not markedly (i.e. circles and triangles are closer together in the right panels than in the left panels of Fig. 4; see Table S5.10).

Discussion

We estimated the diel use of forest in five populations of red deer and five populations of roe deer across Europe with two different geographic layers, using two spatial scales of analysis. First, we showed that about 20% of the GPS locations (23% for red deer and 18% for roe deer) were classified oppositely by CLC and TCD, although the overall proportion of forest and open habitat was similarly estimated by the two layers (Table 1A), raising the need for further evaluations on the use of these layers for animal ecology applications. Second, our case study shows that both deer species consistently used more forest habitat during the day than at night across Europe (as much as 40–50% more; Table 1B). Ungulates inhabiting European human-dominated landscapes must cope with high human densities and an extensive road network, and have consequently adjusted their activity pattern and habitat use to decrease the exposure to human disturbance (Bonnot et al. 2020). Our findings indicate that open areas, generally rich in herbaceous plants but also riskier in terms of exposure to human activities (Abbas et al. 2011; Bonnot et al. 2013), are visited substantially more during nighttime and, conversely, that forested, less anthropized habitats are visited more during the day. However, this crucial behavioural strategy, measured as the proportion of forest use, was estimated differently by the two geographic layers considered here, TCD and CLC, and at the two spatial scales of analysis, GPS locations and HRs.

By sampling 1000 of the GPS locations that were oppositely classified by CLC and TCD and annotating them as forest or open according to orthophotos of Google Earth, we found that TCD was more accurate than CLC (Table 3 and Online Appendix S4; P.A3). Although TCD and CLC have an overall similar classification accuracy (Table 1), deer often use areas, such as forest openings and edges, that are more prone to misclassification. Indeed, we found that the GPS locations classified oppositely by CLC and TCD were located in significantly smaller patches than the consistently classified GPS locations (P.A2). This highlights that the sensitivity of animal habitat use analysis to misclassification will depend on how well geographic layers describe the small-scale habitat features used by animals.

When we modelled deer differential use of forest in diurnal and nocturnal hours using the two layers, we obtained different results at the GPS location and HR level, as expected (P.B3). In particular, TCD pointed to a greater difference in the use of forest between daytime and nighttime at the GPS location level, i.e. a greater deer day-night shift, than CLC, resulting in potentially contrasting conclusions depending on the layer used. Indeed, using CLC to estimate day-night shifts in the use of forest by deer would lead to an underestimation of this behavioural strategy, which TCD instead identified consistently across individuals and populations. On the other hand, at the HR level, forest use estimates were only marginally different between the two layers, and a day/night shift was hardly detectable with either layer. To sum up, we found that the forest use estimates depended not only on the geographic layer used, but also on the spatial scale of analysis (i.e., GPS locations or HR). The analysis at the GPS location scale allowed better detection of the day-night shift in the use of forest, but was also more sensitive to the specific geographic layer used. For this reason, special attention should be given when habitat use is evaluated at the level of GPS locations or trajectories, for example in Resource Selection Analysis (especially Step Selection Functions, Thurfjell et al. 2014, and integrated Step Selection Analysis, Avgar et al. 2016), or in the analysis of sequential habitat use (Sequence Analysis Methods, De Groeve et al. 2016, 2020).

Best practice in the use of geographic layers for movement ecology analysis: considerations and limitations

The potential problem of obtaining different results from different geographic layers has been given relatively little attention in movement ecology. Since maps are only representations of reality, each with their own limitations and biases (Monmonier 2018), we advise treating them critically when estimating animals’ habitat use. Our results go beyond the comparison of two specific geographic layers as they demonstrate the need to account for potential sources of errors inherent in the use of geographic layers in animal ecology.

A first general recommendation is to always visually compare layers and select those that better describe the habitat features used by the species of interest. Fleming et al. (2004) suggested starting with the highest resolution imagery available to better assess local-scale relationships when examining habitat associations. Such comparisons can be performed in various GIS environments such as QGIS, ArcGIS, and common programming environments specialized in spatial analysis such as R and Python. A typical example of a comparison between layers is provided in Fig. 3, where we showcase the mismatch between CLC and TCD and their respective reclassifications both with maps and proportional barplots.

Next, users should measure the accuracy of geographic layers, when possible, with respect to the habitat features of interest. Remote sensing products used for habitat or land features identification are often validated through visual interpretation of orthophotos for a random or stratified sample of locations/areas (Pekkarinen et al. 2009). For applications in movement ecology, we recommend applying a validation using orthophotos both for random locations and for locations used by the animals. The former will show the general accuracy of a habitat layer, while the latter will allow assessing whether those areas specifically used by animals (for example, small habitat patches) are more sensitive to misclassification. Here, for the general accuracy we relied on the official accuracy report that accompanied the products (see Table 1), and focussed on locations used by animals. However, local differences in general accuracy have been reported (De Groeve 2018, Online Appendix S3), so random locations should also be evaluated where possible.

Another typical critical step in the workflow of spatial analysis is the reclassification of the original geographic layer. This step, which is often necessary before performing animal habitat use analysis (e.g., Falcucci et al. 2009; Bosch et al. 2012; De Groeve et al. 2016), may consist of merging land cover classes (e.g. CLC) or defining a specific value as the threshold (e.g. TCD) to distinguish different habitats in a raster layer. Reclassifications can introduce further inaccuracy and should be carefully defined, as we did in the present work through a sensitivity analysis (Online Appendix S3) that compared different aggregations of CLC classes and different forest percentage thresholds of TCD. In this work, a reclassification was essential for directly comparing CLC and TCD; however, we recommend using the original input values of a layer where possible.

While validation is essential, researchers need not limit their analysis to a single geographic layer. Ecological models can be run in parallel with multiple geographic layers expressing the same environmental covariate, as done in this study (Fig. 4). This could further help to disentangle the ecological effect of the covariate from the characteristics of the data source. Our results suggest that TCD performs better for the analysis of forest use and allowed us to identify the day/night shift by deer. However, CLC provides information on many other land use categories for which harmonized European layers are still missing, such as agricultural fields. Its use in movement ecology studies is valuable, but should take into account the limitations in spatial resolution and class aggregation highlighted in this work.

Here we used static layers that refer to a relatively long period (6 years for CLC and 3 years for TCD) to match instantaneous animal relocations. The temporal mismatch between the habitat types represented by the geographic layers and those experienced by animals can represent a source of bias. Static layers can become outdated within a relatively short interval of time, for example because of local or large-scale changes in forest landscapes due to insect outbreaks (Oeser et al. 2021), storms (Gaillard et al. 2003), fires (Silva et al. 2014) and logging activities. New satellite sensors releasing almost real-time observations, together with remote processing engines (e.g. Google Earth Engine), represent the next generation of opportunities (Oeser et al. 2020), allowing more dynamic mapping to match animal movement and behaviour.

In conclusion, the choice of the geographic layer to use should be considered a crucial step in habitat use and selection studies (Oeser et al. 2020), one that requires careful evaluation of the layer-specific characteristics with respect to the ecology and behaviour of the target species. Here, we suggest carefully evaluating geographic layers, paying attention to spatial resolution, temporal match, and classification accuracy with respect to the spatial scale of the animal space use analysis.