Mapping terrestrial ecosystem health in drylands: comparison of field-based information with remotely sensed data at watershed level

Combining field-based assessments with remote-sensing proxies of landscape patterns provides the opportunity to monitor terrestrial ecosystem health status in support of sustainable development goals (SDG). Linking qualitative field data with quantitative remote-sensing imagery to map terrestrial ecosystem health (SDG15.3.1 “land degradation neutrality”). A field-based approach using the Interpreting Indicators of Rangeland-Health (IIRH) protocol was applied to classify terrestrial ecosystem health status at the watershed level as “healthy”, “at-risk”, and “unhealthy”. Quantitative complex landscape metrics derived from Landsat spaceborne data were used to explore whether similar health statuses can be retrieved on a broader scale. The assignment of terrestrial ecosystem health classes based on field and the remotely sensed metrics were tested using multivariate and cluster analysis methods. According to the IIRH assessments, soil surface loss, plant mortality, and invasive species were identified as important indicators of health. According to the quantitative landscape metrics, “healthy” sites had lower amounts of spectral heterogeneity, edge density, and resource leakage. We found a high agreement between health clusters based on field and remote-sensing data (NMI = 0.91) when using a combined approach of DBSCAN and k-means clustering together with non-metric multi-dimensional scaling (NMDS). We provide an exemplary workflow on how to combine qualitative field data and quantitative remote-sensing data to assess SDGs indicators related to terrestrial ecosystem health. As we used a standardized method for field assessments together with publicly available satellite data, there is potential to test the generalizability and context-dependency of our approach in other arid and semi-arid rangelands.


Introduction
Arid and semi-arid ecosystems cover ∼45% of the globe and provide important ecosystem services related to water supply, preserving biodiversity, flood protection, food security, and the storage and capture of carbon (UN 2019; Berdugo et al. 2020). However, the ecological functioning of these drylands is currently threatened by ecosystem degradation, unsustainable land management practices, accelerated urbanization rates, and changing demographics (IPBES 2018(IPBES , 2019Berdugo et al. 2020). These factors are recognized as the most critical drivers of the depletion of natural resources (van der Zanden et al. 2016) and terrestrial ecosystem health (Lausch et al. 2016). The ecosystem degradation in drylands is negatively affecting ecosystem functions and services and indirectly threatening the well-being of at least 3.2 billion people worldwide with costs of more than 10% of the global gross product per year (IPBES 2018;UN 2019).
Thus, understanding how arid and semi-arid ecosystems respond to ongoing environmental change is crucial in achieving sustainable development goals (SDGs) (https:// sdgs. un. org/ goals). Each SDG is assigned certain indicators that are ideally sensitive, unambiguous, quick and easy to sample, temporally consistent, and applicable to a wide range of ecosystems (Prince 2019). More specifically, fulfilling SDGs should be measurable by employing both qualitative and quantitative indicators (Hák et al. 2016). For example, SDG15 "Life on Land" addresses the urgency of stopping the degradation of natural habitats along with the integration of biodiversity and ecosystem values into development processes and local planning (UN 2019). SDG 15.3.1, in particular, addresses the need to accomplish land degradation neutrality (LDN) (Sims et al. 2021). The degree of land degradation can be quantified by sub-indicators such as (i) land cover and land cover change, (ii) land productivity, and (iii) carbon stocks above and below ground (Prince 2019;UN 2019). Therefore, spatial indicators of terrestrial ecosystem health as monitoring and decision-making tools are important for land managers to combat land degradation.
However, scaling issues can apply when translating terrestrial ecosystem information from the field level to broader levels in space and time (Wu and Hoobs 2002). Linking ecological information between finer and larger scales constitutes a fundamental challenge in landscape ecological research as well as a practical land management challenge because some ecological processes might only be noticeable at a certain level, and interactions might occur between levels. Additionally, to assess the effects of land management, it is important to select appropriate spatial units for the studied ecological processes considering the fact that there is a spatial hierarchy of ecosystems and that there are linkages between systems, for example via sediment transport, groundwater movement, runoff, and microclimate (Bailey 1985). In a broader sense, spatially-explicit knowledge of scaling is an essential requirement to understand patterns and processes of ecosystem health status.
Terrestrial ecosystem health can be understood as the degree to which vegetation, climate, and soil integrity are capable of sustaining self-organized structures and processes (NRSC 1994). Thus, a healthy ecosystem can be regarded as resilient against disturbance (Gunderson 2000). In contrast, degradation occurs when different ecosystem states related to energy flow, nutrient cycling, and hydrological regimes are negatively affected. Particularly for semiarid lands, protecting the structure and functionality of the ecosystems and actively restoring degraded drylands are essential to ensure soil integrity, nutrient cycling, and water retention (Pyke et al. 2002;Briske et al. 2005). As a result, a wide range of terrestrial ecosystem health indicators has been developed (van der Zanden et al. 2016;Estevez et al. 2017) with the emphasis on utilizing both qualitative and quantitative indicators (Duniwayet al. 2010). The Interpreting Indicators of Rangeland Health (IIRH) protocol (see Pyke et al. 2002) is regarded as an international standard to assess land health and degradation in drylands (Duniway et al. 2010;Herrick et al. 2019). IIRH is a qualitative assessment of ecological processes using 17 observable indicators and it is designed for assessing ecosystem health and functions in rangelands. However, assessing ecosystem functioning based on qualitative field data has distinct spatio-temporal limitations. These drawbacks can be investigated by properly integrating quantitative metrics from remotely sensed data (Toevs et al. 2011). A variety of remote sensing data has been used to assess and monitor ecosystem health across a range of scales (Li et al. 2014). However, remote sensing data has certain limitations in accurately assessing the state of ecosystem health (Li et al. 2014). Thus, methodological approaches combining and verifying remote sensing data with ground-based field sampling information on ecosystem health require further investigation.
The question is, however, how to scale up from the qualitative field-based measures to a broader scale using quantitative remote sensing data without losing crucial functional information. An integrated approach that combines field data, experimental manipulations, GIS, remote sensing, and modeling is essential for advancing science of scale (Wu and Hobbs 2002). Integrating remote sensing with in situ data is challenging, but it has a large potential to deepen our understanding of sustainable use of ecosystems and ecosystem resilience (Cavender-Bares et al. 2022). Common remote sensing products are land-use land cover (LULC) maps. LULC change analysis allows for mapping changes of landscape structural and functional characteristics (Matsushita et al. 2006) and supports monitoring SDG 15.3.1 (Prince 2019). LULC changes are among the main anthropogenic factors needed for health assessment in semi-arid land ecosystems (Soffianian and Madanian 2015;Baranian et al. 2017;Molaeinasab et al. 2018;Safaei et al. 2021). Beyond the commonly used Normalised Difference Vegetation Index (NDVI) (Tucker 1979), complex remote sensing-based indices that are linked to landscape functionality and degradation (Tanser and Palmer 1999;Bastin et al. 2002) as well as landscape metrics (Frank et al. 2012;Inkoom et al. 2018) have been applied to map ecosystem functioning. An example of a complex remote-sensing index is the Moving Standard Deviation Index (MSDI). It can capture the spectral diversity across images and has been applied in rangelands to map ecosystem degradation caused by overgrazing and improper management (Tanser and Palmer 1999;Jafari et al. 2008). Another example is the Leakiness Index (LI) (Bastin et al. 2002;Ludwig et al. 2007). The LI can be applied to assess sub-catchment-level resource leakage based on the hydrological regime, digital elevation models (DEM), and vegetation cover. As a result, the LI has emerged as a promising ecological index with a high potential to monitor terrestrial ecosystem health status across arid and semi-arid landscapes (Ludwig et al. 2007). To our knowledge, the LI has been underutilized in terrestrial ecosystem health assessments.
Because all remote sensing approaches present scaling issues at least to some extent (Torgerson 1958), and as data availability, transferability, and uncertainty in estimating valid health indicators remains challenging (Li et al. 2014), more research is required on using landscape metrics or complex indices for valid upscaling of field assessments based on remote sensing data to aid mapping terrestrial ecosystem health. In this study, we used multivariate methods to compare field-based, terrestrial ecosystem health indicators from the watershed scale with the landscape level by means of remote sensing. Based on a case study for central Iran, the main aim of our study is to link qualitative field data with quantitative remote sensing data to identify indicators terrestrial ecosystem health related to SDG 15.3.1 that can be monitored from space. By quantifing LULC changes over 28 years using remote sensing metrics and assigning health classes in each sub-catchment, the proportion of land degradation was estimated. We use a standardized field protocol, publicly available satellite data and mainly free and open-source software so the methodology described here can be tested and applied in other arid and semi-arid rangelands as a decision support tool to conduct land degradation zoning and to prioritize regions for restoration efforts.

Study area
This study was carried out in central Iran across an area of 2238 square km located between 33°04′18" and 32°36′23" E and 49°38′40" and 50°19′25" N ( Fig. 1). The region has a mean annual precipitation and temperature of 538 mm and 10.1 °C, respectively, and the climate is considered to be Mediterranean type (DeMartonne 1962). The natural landscape of this region, which is part of the vast central Zagros Fig. 1 A Study region in western Isfahan Province, Iran. The region was divided into 40 sub-catchments. Each sub-catchment has been assigned a health class that is derived from five replications of IIRH (Interpreting Indicators of Rangeland Health) protocol assesments. B Sites of health assessment in the forest and rangelands were distributed using stratified random sampling. Photos show examples of landscape views with the recording of a few health indicators at some evaluation sites. C Rangeland in near reference state condition domi-nated by tall grass with shrubs, subdominant annual grasses, and rarely connected patches of bare ground. D Shrub community (Astragalus versus) with dead and dying branches and leaves (shown in red circles). E Native shrub with good vigor and reproductive capability (red circle). F Rills in a steep slope are the result of geologically sensitive formation. G Degraded rangeland site with large patches of bare ground. H Open forest dominated by oak trees Mountains, traditionally provides forage for livestock, water, and other ecosystem services to the local people. Overgrazing and droughts with varying severities and magnitudes along with the dieback of oak tree forests (Quercus persica Jaub. & Spach), and other valuable rangeland species such as Astragalus verus (Olivier) and Astragalus adscendens (Boiss. & Hausskn. ex Boiss) have resulted in substantial changes in LULC (Safaei et al. 2021). Although many oak forests in Iran suffer from oak charcoal rot, which is caused by two opportunistic fungi, Biscogniauxia mediterranea and Obolarina persica, the incidence of this disease has not been reported in the study area so far (Ahmadi et al. 2020).
We developed an exemplary workflow that links qualitative field data ( Fig. 2, Step 1) with quantitative remote sensing data (Fig. 2, Step 2) for identifying and assessing SDG indicators related to terrestrial ecosystem health in semi-arid lands using different multivariate analyses (Fig. 2, Step 3). Finally, we tested whether this combination of analyses can be used to assess the state of terrestrial ecosystem health in the past (Fig. 2, Step 4). All steps of our workflow are illustrated in Fig. 2.

Field data collection through IIRH protocol and analysis of terrestrial ecosystem health
A field survey was conducted at the sub-catchment scale to determine the most important indicators of health status according to the IIRH assessment given by Pyke et al. (2002). We considered sub-catchments as our sampling unit because their hydraulic gradient is crucial for the calculation of functionality (Ludwig et al. 2007). Of the 40 sub-catchments, rangelands were dominating in 35, and forests were dominating in 5 sub-catchments. Since the study area was large (2238 km 2 ), careful selection of health sites at the sub-catchment scale that reflected the overall condition of each sub-catchment was important. To avoid misevaluations, the scaling of health data from sampling point to sub-catchment was done using stratified random strategy to cover the representative site conditions in each sub-catchment and the 17 Interpreting Indicators of Rangeland Health (IIRH) were assessed (Table 1). Based on the IIRH protocol (Pyke et al. 2002;Pellant et al. 2020), two sites, one rangeland, and one forest, were selected as ecological reference sites. An ecological reference site is a well-managed landscape unit in which ecological processes (i.e., energy flow, nutrient cycling, and hydrological condition) are working within the natural range of variability, and the vegetation has adequate resilience and/or resistance to typical disturbances (Pyke et al. 2002). The 17 ecological indicators were utilized to identify three ecosystem attributes that reflect ecosystem functionality including (i) soil and site stability, (ii) hydrological function, and (iii) biotic integrity (Table 1) (Pyke et al. 2002;Pellant et al. 2020). They were manually recorded on evaluation sheets in five replicates at different locations in each sub-catchment. The five replicates were representative for the degree of health in each sub-catchment. The deviation of the 17 measured indicators in each sub-catchment from the reference area conditions was rated in five quality categories (extreme to total, moderate to extreme, moderate, slight to moderate, and none to slight) (see Pellant et al. (2020) for a detailed IIRH description). As it is shown in Table 1, the deviation of nine indicators were used to rate soil and site stability, 10 indicators to rate hydrological function, and nine indicator to rate biotic integrity based on a preponderance of evidence approach using a two way table in the R package "base" (R Core Team 2021). Finally a health class including "healthy", "at-risk" and "unhealthy" status was assigned to sub-catchment using the preponderance of evidence of three health attributes ((i) soil and site stability, (ii) hydrological function, and (iii) biotic integrity). To prevent or reduce observer bias, fieldwork was performed by two fixed observers who worked together on the same site with a sound ecological knowledge of the study area in two months during June and July 2016. Information related to the 40 sub-catchments, as well as the assigned health status for the three ecosystem attributes, are given in the Supplementary file 1 (Table S.1).

Remotely sensed landscape metrics across terrestrial ecosystem health classes
Two LULC maps with a pixel size of 30 m including forest, rangelands in good, moderate and poor conditions, agriculture, residental area, snow and rock classes were produced based on 2016 Landsat OLI imagery with 6 bands (bands 2 to 7) with an overall accuracy of 94% (see Safaei et al. 2021) and1987 Landsat TM imagery with 4 bands (bands 1 to 4) (Soffianian and Madanian 2015) with an overall accuracy of 84%, respectively, applying the supervised Maximum Likelihood Classification (MLC) algorithm. Both images were downloaded from the United States Geological Society (USGS) Earth Resources Observation and Science (EROS) Center (https:// earth explo rer. usgs. gov/). We consider two-time steps to see whether the estimation of terrestrial ecosystem health in the past is possible. The validity of the 2016 LULC map was assessed using 401 field points assessed in 2016 (Safaei et al. 2021). The validity of the 1987 LULC map was tested using 86 reference points from aerial photographs of 1985 (National Cartographic Center; ncc.gov.ir).
Five landscape metrics including the (i) number of patches, (ii) total edge, (iii) edge density, (iv) largest patch area, and (v) total class area (see Table 2) were quantified at landscape level in each sub-catchment using Fragstats 4.2 (McGarigal et al. 2012). (vi) Entropy and (vii) contagion metrics, two indicators of fragmentation, were measured using the Guidos Toolbox 3 (Vogt and Riitters 2017). The (viii) MSDI, a proxy for landscape degradation patterns (Tanser and Palmer 1999), was computed using a 3 × 3 window filter moving across the Landsat OLI 2016 red band. The (ix) LI, a functionality indicator, was calculated based on a DEM with a spatial resolution of 30 m using 1:25,000 topographic maps (National Cartographic Center; ncc.gov.ir) (Safaei et al. 2021) ( Table 2), and the NDVI (Tucker 1979) from Landsat OLI imagery for 2016. See Ludwig et al. (2007) for 15. Annual production: total above-ground biomass as a measure of the vegetation available to harvest the sun's energy 4. Bare ground: exposed mineral or organic soil not covered by vegetation, gravel /rock, litter or biological crust 5. Gullies: channel that has been cut into the soil by moving water 16. Invasive plants: invasive exoticand noxious weed 6. Wind-scoured, blowout, and/or depositional areas: finer soil particles have been redistributed from depositional areas interspaces and deposited near an obstruction 14. Litter amount: dead organic material in contact with the soil surfacethat influences several ecological processes 7. Litter movement: redistribution of litter by water or wind 10. Plant community composition & distribution relative to infiltration reflects the contributions of functional/structural groups and their associated species in modifying infiltration 17.The reproductive capability of perennial plants: measure of potential for seed or tiller production, not presence of seedlings/new clonal plants 8. Soil surface resistance to erosion: reduced soil surface stability usually reflects lower soil biotic integrity because of organic matter and biological decomposition processes 9. Soil surface loss or degradation: loss or degradation of soil surface (organic matter) affects site potential 11. Compaction layer: near-surface layer of dense soil caused by repeated impact or disturbance of the soil surface 2 N = the pixel number of the filter (N = 9), DNi = the digital number of pixel (i) in pixels, DN' = the average digital number value of pixels Capture the spectral diversity across images and has been applied in rangelands to map ecosystem degradation caused by overgrazing and improper management details about the LI. The reasons for choosing metrics are explained in Table 2. All metrics were calculated at landscape level in each sub-catchment. Analysis of variance (ANOVA) followed by a posthoc Tukey test for unequal sample sizes (α = 0.05) was performed to assess whether landscape metrics showed significant differences among health classes. Only the landscape metric ED met the assumptions for ANOVA. For all other remotely sensed indices, a Kruskal-Wallis test followed by a Dunn's test with a bh p-value adjustment (α = 0.05) for the multiple comparisons was applied. Autocorrelations among the remotely sensed data were checked using the R packages "tseries" (R Core Team 2021; Trapletti and Hornik 2020).

Linking field-based indicators with remote sensing metrics
To test if the sites can be grouped into the three health classes "healthy", "at-risk" and "unhealthy" based on the NMDS, an unsupervised cluster analysis was conducted on the remotely sensed data and a supervised cluster analysis was conducted on the field data. (Tables 1 and 2). Non-Metric Multi-Dimensional Scaling (NMDS) (Kruskal 1964) with different distance measures including Euclidean, Manhattan, Mahalanobis, and Bray-Curtis were employed. The returning stress values for the different dissimilarity indices were used to identify the best dimensional ordination fit of field and remotely sensed data. Minimum stress solution was sought with Procrustean rotation (Gower 1971). The two-dimentional NMDS with 20 runs was performed in the "vegan" R package (Dixon 2003; R Core Team 2021). Terrestrial ecosystem health field indicators and remotely sensed metrics that were significantly correlated with the ordination axes were plotted in the two-dimensional ordination space.
Since it is challenging to identify the best algorithm a priori (e.g. Kozakura et al. 2017), different clustering algorithms were employed. Common types of clustering methods were tested including DBSCAN (Hahsler et al. 2019), k-means clustering (Johnson 1967), hierarchical clustering (Johnson 1967), and Self-Organizing Map (SOM) (Vialaneix et al. 2020). The clustering algorithm DBSCAN can identify clusters of any shape containing outliers and noise in the dataset. The goal of DBSCAN is to measure the number of objects close to a given point and then to identify dense regions (Ester et al. 1996). k-means clustering aims to partition "n" observations into "k" clusters based on the nearest mean (Kriegel et al. 2016), whereas hierarchical clustering is one of the most popular clustering techniques in machine learning. It places similar observations close to each other with no need to pre-specify the number of clusters as in k-means clustering (Johnson 1967). Finally, the Self Organizing Map (SOM) algorithm which is part of Artificial Neural Networks (ANNs) techniques, includes dimensionality reduction and data clustering. The SOM can be used for providing the geographic topology of the input data (Kohonen 2001;van der Zanden et al. 2016). We conducted the DBSCAN, the combination of DBSCAN + k-means clustering, hierarchical clustering, and SOM clustering analysis for both field qualitative indicators and remotely sensed quantitative metrics using the R packages "dbscan" (Hahsler et al. 2019; R Core Team 2021), "factoextra" (Kassambara and Mundt 2020; R Core Team 2021), "SOMbrero", and "kohonen" (Kohonen 2001;R Core Team 2021;Vialaneix et al. 2020).
We compared the patterns of terrestrial ecosystem health using different multivariate and cluster analyses considering the challenges of monitoring landscapes using remote sensing data to quantify and identify the spatio-temporal characteristics at watershed scale (Wu and Qi 2000). The combination of DBSCAN + k-means clustering was applied to check if the performance of clustering can be increased when combining methods.

Comparison of qualitative field indicators and quantitative remote sensing indices
The degree to which ecological processes and spatial patterns of both field and remote sensing level are related can be identified by testing if the assignment of terrestrial ecosystem health classes based on field indicators matched the characterization based on remotely sensed indicators. The multivariate spaces of both levels were compared using Procrustes analysis (PA) (Gower 1971). In PA, the ordination solutions are rotated to minimize the sum of squared residuals and maximize similarity (Gower 1971).
Procrustes analysis has been successfully applied to test the relation between multivariate data sets in ecology including both field and remote sensing data (Feilhauer et al. 2010;Magiera et al. 2013;Große-Stoltenberg et al. 2018). The clustering was compared using the Normalized Mutual Information measure (NMI) (Chiquet et al. 2020). The Normalized Mutual Information measure shows the consistency between two clustering results (Formula. 1).
Formula 1 where represent the cardinality of the p-th and q-th clusters in U and V clusters on a set of data, respectively (Kozakura et al. 2017). A zero NMI value corresponds to no mutual information (e.g. no correlation between the clustering results of the field and remote sensing based NMDS ordinations), while a value of one indicates a perfect correlation. Thus, both NMI and PA help to evaluate the similarity of datasets generated by multivariate analysis and cluster analysis, respectively (Kozakura et al. 2017). The PA was performed using the function "protest" in the R package "vegan" (Dixon 2003; R Core Team 2021) and NMI was calculated in "NMI" and "aricode" packages of R (Chiquetet al. 2020; R Core Team 2021).

Transferability of quantitative remote sensing indices to the past time
In addition, we tested whether this combination of indicators and clustering methods can be used to assess the state of terrestrial ecosystem health in the past. Therefore, after validating the remotely sensed indicators and clustering methods against field data for 2016 (see previous paragraph), metrics and clusters were computed for satellite imagery in 1987 to investigate health status in the past. The reason for choosing 1987 was because shortly thereafter extensive LULC conversions (e.g. converting rangelands to dry farmlands) caused a sharp decline in the size of rangelands and heavy grazing occurred that considerably affected the rangeland health (Baranian et al. 2017;Safaei et al. 2021), so distinct changes could be  Safaei et al. 2021) to health classes derived from NMDS. Expert assessments from the combination of these three data sets have been used to validate health classes in the past. Finally, to determine the efficacy of integrating field and remotely sensed data for estimating SDG 15.3.1, we quantified site-based data to assess the three health classes " healthy", "at-risk" and "unhealthy" derived from geospatial information. The measurement unit for SDG indicator 15.3.1 is the spatial extent (km 2 ) calculated as the proportion of land that is degraded relative to the total land area (percentage) (

Results
Qualitative field assessment following the IIRH protocol: indicators and health class assignment According to the IIRH protocol, nine, 11, and 20 sites were classified as "healthy", "at-risk", and "unhealthy" at the watershed level, respectively (Fig. 3). Among the 17 indicators, the presence of structural and functional groups, vegetation composition, productivity, and litter volume was capable of discriminating healthy sites from unhealthy ones. The most important indicators in unhealthy sites were compaction layer, plant mortality, rills and water flow pattern, invasive species, and surface soil condition. The region has no gully formation, making this factor ineffective in site discrimination and health evaluation.

Remotely sensed landscape metrics across terrestrial ecosystem health classes
Healthy sites had the largest patch sizes (LPI) (Fig. 4A) as well as the lowest edge length (Fig. 4C, and G) and lowest number of patches (NP) (Fig. 4B) compared to at-risk and unhealthy sites (p < 0.05), while the total class area (CA), (Fig. 4D) was similar across health classes. The entropy metric showed that the degree of disorder in each site was low in healthy and high in at risk and unhealthy sites (Fig. 4E), indicating that these were more fragmented than healthy ones. Accordingly, contagion showed an opposite trend relative to entropy, highlighting that regions with a low spatial contagion, i.e. unhealthy sites, were severely fragmented (Fig. 4F). Both the MSDI index and the LI were significantly lower for healthy sites than for unhealthy ones (Fig. 4H, I) indicating lower degradation with lower water and soil resource loss at healthy sites.

Comparison of qualitative field indicators and quantitative remote sensing indices
The multivariate analysis (NMDS) coupled with clustering (DBSCAN, k-means clustering, hierarchical clustering, and SOM) performed well (Table 4) to categorize both field qualitative and remotely sensed quantitative data (Fig. 5). Among the different distance measures (Euclidean, Manhattan, Mahalanobis, and Bray-Curtis), Manhattan performed best with the lowest stress value (0.022) for the qualitative field data, while Euclidean performed best with the lowest stress value (0.067) for remotely sensed data (Supplementary file 5; Fig S.3 and S.4).Based on the NMDS for field health indicators (Fig. 5A), higher values for soil surface loss and degradation, plant mortality and decadence, compaction layer, soil surface resistance to erosion, rills, and water flow patterns were related to unhealthy sites. In contrast, higher values for structural and functional groups, annual production, the reproductive capability of perennial plants, and plant community composition were related with healthy sites. The NMDS revealed that larger patch values were associated with healthy sites (Fig. 5B). High edge density, number of patches, entropy, MSDI, and leakiness values were related to unhealthy sites. These data confirmed the results obtained from the univariate analyses of single measures (Fig. 4).
We used PA to determine if similar NMDS relationships of health indicator data and sites can be retrieved from both field-based indicators and remotely sensed metrics to validate the metrics derived from remote sensing analyses. There was a high Procrustes correlation derived from the symmetric Procrustes residual (0.88) between terrestrial ecosystem health groups based on field indicators (Fig. 5A) and derived from remotely sensed metrics (Fig. 5B) (Supplementary file 4; Fig. S.2).
The NMI differed between clustering approaches. The minimum was 0.54 (DBSCAN and k-means clustering, raw data) while the combination of DBSCAN and k-means clustering on two-dimensional NMDS data performed best (NMI = 0.91). The NMI value for the SOM on NMDS data showed the second-best performance (NMI = 0.84) (see Table 4, Supplementary file 5 and 6; Fig. S3, and S4). Collectively, NMI differed between methods and data types, although using a combination of clustering methods based on ordinated data performed best. The proportion of land that is degraded (%) Compared to the field-based map, both DBSCAN + k-means clustering and SOM overestimated the health status for one site ("At-risk" instead of "Unhealthy", site 16), SOM additionally underestimated the status for a further site ("Unhealthy" instead of "At-risk", site 21). Generally, both field and remote-based maps of terrestrial ecosystem health were consistent (Fig. 6A, B, C; Table 3), and inconsistencies of the RS-based maps accounted for 2.5% (DBScan and k-means clustering) and 5% (SOM), respectively. Finally, a LULC map with an overall accuracy of 84% (See Supplementary file 3; Fig.  S1) was created for 1987, and a terrestrial ecosystem health status map for 1987 was produced following the classification approach for 2016 ( Fig. 6D and see Supplementary file 5 and 6; Fig. S3, 4). Compared to 2016, there were 23 "healthy" sites, 11 "unhealthy" sites, and six "at-risk". There was a clear degradation of ecosystem health status in the Western part of the study area (Fig. 6D). The interpretation of the aerial images and the LULC maps (Supplementary file 3; Fig. S1), indicated that degradation occurred due to the conversion of forest in 1987 to poor and moderate rangelands in 2016 (Supplementary file 3; Fig. S1A, C, D). The increasing number of patches and total edge, as well as the decreasing largest patch index, has resulted in a higher level of fragmentation.  Table 4 Normalized mutual information value for each clustering method A zero NMI value corresponds to no mutual information, whereas one indicates a perfect correlation. Each NMI value is calculated based on the correlation of clusters between remotely sensed quantitative data and field qualitative assessment data. *Graphical results of different methods are shown in Supplementary file 5; Fig. S3. Hierarchical clustering is presented by heatmaps based on ward linkage and Manhattan distance. Since DBSCAN + k-means clustering and SOM performed best, they were used to produce health status maps for 2016 (Fig. 6A, B, C,

Discussion
The overall results of this study illustrate that standardized field assessments of terrestrial ecosystem health in drylands following the established IIRH protocol can be accurately reproduced with satellite data by combining multivariate methods together with unsupervised cluster analysis in this region. So we present an exemplary workflow on how to integrate a standardized in situ data set with publicly available satellite imagery to monitor land degradation (SDG indicator 15.3.1). The best approach, DBSCAN + k-means clustering, reached a very high correlation (PA = 0.88 and NMI = 0.91) between field and spaceborne data. Only the status of one out of 40 sites, which accounted for 0.93% of the total area, was misclassified by one health class ("at-risk" instead of "unhealthy"). Therefore, if field indicators of ecosystem health are linked with the appropriate RS metrics, there is potential for combining the data at different scales and extents, which could be tested in further studies in other rangelands to explore the transferability of our approach. Indeed, integrating in situ ecological data with remote sensing is both promising and challenging (Cavender-Bares et al. 2022). There is an urgent need for identifying indicators that anticipate change in terrestrial ecosystem health caused by human impacts to allow for timely management at the landscape scale (Briske et al. 2005;Estevez et al. 2017), because human, social and economic development relies on the health of natural habitats (UN 2019). Due to the complexity and the variety of available indicators, assessing terrestrial ecosystem health status across field and RS scale is H: MSDI (moving standard deviation index); I: LI (leakiness index). Lowercase letters indicate significant differences across health classes after Tukey test (P < 0.05). n.s. in Fig. 4D indicates no significant differences across classes a challenging task, but integrating expert knowledge and ecological data with remote sensing data has a large potential (Herrick et al. 2019), and in this study we present an exemplary methodological approach on how to deal with this challenge. In the following discussion, we describe the efficacy of combining field-based and remote sensing methods in determining health indices regarding structural and functional changes across health status.

Field assessment and IIRH protocol: health classes and indicators influencing terrestrial ecosystem health
Even though the IIRH protocol is a well-established method, adequate information of the studied ecological units is essential to interpret its indicators (Pyke et al. 2002). The classification of our study sites into three groups ("healthy", "at-risk", and "unhealthy") based on the indicators defined in the IIRH protocol directly provides useful information for land managers and decision-makers. Here, both the unhealthy and the at-risk sites were characterized by a clear deviation of indicators from the optimal state for all three attributes (soil and site stability, hydrological function, and biotic integrity). On the other hand, the healthy sites were characterized by good status of indicators related mainly to biotic integrity. The differences in the ratings of IIRH attributes were likely due to a combination of disturbances associated with the proximity to nomadic and rural areas, land-use/ cover conversion to abandoned rainfed lands, and decreasing coverage of under-canopy perennial plants (Duniway et al. 2010;Molaeinasab et al. 2018). Such disturbances lead to soil degradation and erosion, decreasing the frequency of productive, perennial, and palatable grass species like Bromus tomentellus and Agropyron tricophorum, and increasing the frequency of annual, toxic, and invasive plants in the study area (Safaei et al. 2021). This combination of disturbances affects various ecosystem functions and explains the combination of indicators in all three attributes, which are relevant for the unhealthy ecosystem status. Thus, integrating "moment-in-time assessments" such as the IIRH (Pyke et al. 2002) with scientific knowledge of ecosystem processes of a region can help to stimulate the understanding of terrestrial ecosystem health status. Since healthy ecosystems significantly reduce the destructive impact of human and natural disasters in arid and semi-arid ecosystems (Li et al. 2014;Lausch et al. 2016;Estevez et al. 2017;Molaeinasb et al. 2018), this understanding could help to develop expert knowledge-based benchmarks (Herrick et al. 2019) to support UN sustainable development indicators (Hák et al. 2016) in terms of land degradation neutrality (LDN, SDG.15.3.1).

Assessment of terrestrial ecosystem health using remotely sensed metrics
In the studied area, remote sensing methods have been successfully applied to assess terrestrial ecosystem health status and to investigate impacts on ecosystem functioning and services. There has been a lot of discussion about how to relate landscape metrics to ecological processes (Bastin et al. 2002;Lausch et al. 2016;Robinson and Weckworth 2016;Wu and Hobbs 2002). However, scale issues and data availability as well as uncertainty in estimating health indicators from remote sensing can limit the applicability and accuracy (Li et al. 2014). For example, the diagnostic value of vegetation indices such as the NDVI to assess ecosystem degradation can be increased when changes in the spectral reflectance are linked to the underlying ecosystem processes using scientific knowledge (Herrick et al. 2019). In semi-arid lands, there are various methods to determine terrestrial ecosystem health using structural and functional indicators (Estevez et al. 2017;Berdugo et al. 2020).
Here, we examined changes in landscape metrics related to landscape function (LI) and landscape degradation (MSDI) in rangeland and forest ecosystems. The LI can be used to assess and monitor changes in the leakiness status of various landscapes and at different scales. Combining coarse or fine-scale satellite imagery with low or high-resolution DEM allows for LI to be derived at different spatial scales (Ludwig et al. 2007). Most landscape indices such as the LI and MSDI differed significantly between health classes and showed the same trend as, for example, the number of patches and the largest patch area. Indeed, decreasing terrestrial ecosystem health in this region can lead to an increase in the number of fragmented patches and leakage of resources, showing less integrated ecosystems with a higher degree of fragmentation (Safaei et al. 2021). The spectral heterogeneity was relatively high in degraded and unhealthy sites. Results from this and other studies all found that the MSDI serves as an appropriate complementary or alternative method to expensive field investigation for measuring landscape status and land degradation in arid and semi-arid regions (Tanser and Palmer 1999;Jafari et al. 2008). On the basis of these findings, the current structural condition of these ecosystems, which stems from past changes and conversions, is an indication of increasing land degradation. Given that one significant consequence of landscape degradation and conversion is health reduction, numerous ecological consequences are expected to rise (Tanser and Palmer 1999). Due to the importance of landscape structure and functionality for land management, it would be beneficial to consider RS based landscape metrics such as LI and MSDI in future research to outline management practices and to identify appropriate management strategies.

Integration of field-based indicators with remote sensing metrics to map and assess terrestrial ecosystem health status across spatio-temporal scales
The validation of remotely sensed information on terrestrial ecosystem health with expert field assessments can contribute to a science-based and datadriven assessment of rangeland health (Herrick et al. 2006(Herrick et al. , 2019. In this study, comparing field data with remote sensing metrics using multivariate statistical methods turned out to be a promising approach for clustering terrestrial ecosystem health classes at the landscape scale in the studied area. We compared eight clustering methods to identify the appropriate clustering approaches on both field-based and remote sensing metrics, as the performance of different classification algorithms might vary clearly for a chosen dataset (Abbas 2008;Wiwie et al. 2015). The accuracy between maps using field and remote sensing data differed between best-performing clustering methods and pre-processing procedures. Here, DBSCAN combined with k-means clustering using NMDS ordination data showed the best performance in terms of concordance between field and remote sensing assessments of terrestrial ecosystem health gradients. Our results outline the potential of NMDS to deal with mixed data types (McCune and Grace 2002), the utility of Procrustes rotation of ordination data coupled with cluster analysis to assess survey data at larger spatial scales (Gonzalez-Mejia et al. 2018), and of combining distance and densitybased clustering algorithms (Dash et al. 2001). Our results particularly highlight the recent popularity of DBSCAN (Ester 1996;Saha 2021). It has the typical characteristic to group data points with many nearby neighbors and to accurately identify clusters containing outliers and noise in the dataset (Ester et al. 1996). The results of this and other studies (Kozakura et al. 2017) demonstrated that combining clustering algorithms is a promising avenue to approximate clusters and to find the optimal clustering method suited to different types of ecological data. This approach enabled us to identify appropriate RS indicators of land degradation such as complex vegetation indices that cannot be calculated from field data alone (Toevs et al. 2011). The linkage between field and remotely sensed indices helped to better interpret the numerical appearance of metrics and the respective ecological patterns and processes. With the assignment of health status clusters to the landscape indices using a combination of multivariate analysis and cluster analysis, we go beyond the patch-based metrics (Wu and Hobbs 2002) and illustrate how landscape metrics can be applied to assess ecological processes at the watershed level, which highlights the potential to monitor land degradation from space. Further studies to test the transferability of our approach to other rangelands or across spatial scales are possible as we used a standardized field protocol, mainly free and open-source software as well as publicly available satellite data. Future directions could include identifying ecological thresholds (Andersen et al. 2009) or spatial early warning signals of ecosystem transitions to alternative stable states (Kéfi et al. 2014) based on complex remote-sensing indices and the IIRH protocol to test the integrity of land degradation monitoring in rangelands in the field, from space and based on ecological conceptual frameworks and theory. The primary challenge in defining the benchmark of land degradation is the lack of historical data (Herrick et al. 2019). However, from a remote sensing perspective, the use of archive data from long-lasting satellite programs such as Landsat (Wulder et al. 2016) can partly fill such data gaps. Here, the remote sensing-based model for the assessment of terrestrial ecosystem health in 2016 was validated by field surveys and projected to Landsat data from 1987 (Fig. 6). From 1987 to 2016, decreasing areas of forest and dense rangeland together with increasing fragmentation exert negative trends on the ecosystem health status (Supplementary file 3; Fig. S1). Briefly, terrestrial ecosystem health maps derived from field information and remotely sensed indicators were consistent. This allowed for a transfer of the methods to historical satellite data, which produced plausible results. Providing an overview of the current and past state of ecosystem conditions is important because it affects the delivery of services and, therefore, human well-being (Rendon et al. 2019). Decreasing soil quality (Molaeinasab et al. 2018), increasing soil erosion, dust problems, and tree mortality in unhealthy sites have been attributed to the long history of over-grazing with domestic livestock like sheep and goats, and the conversion of forests and rangelands to arable lands (Baranian et al. 2017). When transferring our remote sensing approach to assess terrestrial ecosystem health from 2016 to satellite data from 1987, we found 32% of degradation and, therefore, a deviation from SDG 15.3.1 and its main aim to achieve land degradation neutrality (Prince 2019;UN 2019). In fact, it has been estimated that the actual number of livestock in some parts of the western rangelands of Iran including the study area exceeds considerably the grazing capacity (Farahpour 2002). It is partly because herders rely more on other feed sources for their livestock such as hay from farming lands (Abolhassani 2011). It is worth mentioning that rangeland management in the study area changed considerably following the Nationalization of Rangelands and Forests Act (Ghorbani et al. 2013). Before the land reform, rangelands were managed by nomads in common, public rangelands, while private rangelands were controlled by landlords. As a result, the rangelands were in better condition due to appropriate management, specifically due to the lower number of livestock and herders compared to the present time. Based on the land reform law, all rangelands were allocated to the government, resulting in "tragedy of the commons" dilemma (Baland and Platteau 1994). Another cause of rangeland health decline is extensive land conversion to farms by local farmers and villagers whose primary intention is to claim ownership of land rather than agricultural goods. These farmers are well aware that rainfed agriculture in degraded rangeland is not profitable but would help them acquire more land by taking advantage of an inefficient long-lasting land management system in the region (Baranian et al. 2017). The current approach holds bright prospects for outlining policies that contribute to the range and forestry development efforts, integrated management of resource exploitation, and reducing resource degradation (Matsushita et al., 2006;Miller 2008).
In summary, the presented workflow on how to combine ecological information from the field with ecological information from space to map ecosystem health in space and time is in line with Wu and Hobbs (2002), who introduced an integrated approach that combines field data, experimental manipulations, GIS, remote sensing, and modeling for developing a science of scale. Using a combination of multivariate and cluster analysis to compare field and remotely sensed data turned out to be suitable to map ecosystem degradation. As the quality and availability of data over large areas and extended time periods are critical for modeling (Wu and Hobbs 2002), and continuously gathering field data over large areas has limitations, complex remotely sensed indices that are verified against field data provide one of the best opportunities to overcome these challenges. Future research could investigate the combined use of different sensors and platforms including the analysis of time-series data, and the transfer to different sites and ecosystems.

Conclusion
In this paper, we showed that assessing rangeland health based on satellite data can coincide accurately with an assessment based on common field approaches (IIRH protocol). Thus, there is potential for the integration of standardized in situ data with remote sensing imagery to enable continuous monitoring of land degradation (SDG 15.3.1). We presented a methodological approach on how this information on ecosystem health can be linked using multivariate statistics, cluster analysis, and complex remote sensing metrics based on publicly available satellite data. As spatial data-driven monitoring approaches are essential to achieve SDGs, this approach could contribute to the development of benchmarks for land health and monitoring of indicators for land degradation neutrality based on remote sensing. Identification of spatial indicators of degradation could also help to establish links to conceptual frameworks such as ecological regime shifts with the objective to determine spatial early warning signals of shifts to alternative stable states. In the next step, our approach could be tested at the national or even continental scale in arid and semi-arid regions or could be transferred to other study areas and ecosystems to support monitoring land degradation neutrality from space.