1 Introduction

The main problem in the existing landslide risk management practice in Croatia is the lack of analytical landslide maps and the absence of their application in the spatial planning and civil protection system (Mihalić Arbanas et al. 2023). Therefore, the main objective of the project “Methodology development for landslide susceptibility assessment for land-use planning based on LiDAR technology (LandSlidePlan)”, funded by the Croatian Science Foundation (Bernat Gazibara et al. 2022), was the development of landslide susceptibility assessment on a local scale using high-resolution remote sensing data. Within the project, part of the landslide susceptibility modelling (LSM) was carried out with LAND-SUITE, a suite of tools for statistically-based susceptibility zonation developed by Rossi et al. (2021a). As identified by Malamud et al. (2014), the literature often lacks comprehensive assessments of landslide susceptibility modelling. LAND-SUITE addresses this gap by providing reliable susceptibility modelling, comprehensive model evaluation and validation tools, and the possibility to prepare high-quality susceptibility maps.

Together with the definition of the most important landslide causal factors, the selection of an appropriate mapping unit and statistical method is a critical phase in LSM (Reichenbach et al. 2018). In this work, the selection of appropriate landslide causal factors was the first step to prepare an LSM. Moreover, we considered 5 m × 5 m regular grid cells and slope units (Alvioli et al. 2016), and investigated the effect of different statistical methods on training LSM using both types of mapping units for application on a local level, i.e. scale 1:5000.

The LSM was conducted for a study area in Hrvatsko Zagorje (NW Croatia), characterised by small and shallow landslides, where most known failures endanger buildings, roads and other infrastructure. For the study area, high-resolution LiDAR DTM was used to prepare a detailed landslide inventory and some causal factors (Sinčić et al. 2022). Furthermore, based on the landslide susceptibility assessment, it was crucial to disseminate location-based disaster risk information to decision-makers and the general public in an appropriate format that can easily be understood and implemented (Mihalić Arbanas et al. 2023).

2 Study Area

The study area of 20.2 km² is located in the Hrvatsko Zagorje region, Croatia (Fig. 1). The area is part of the mega-geomorphological region of the Pannonian Basin and the macro-geomorphological region of the Mountain valley area of NW Croatia (Bognar 2001). The area is characterised by a steep morphology, with 71% of its surface having a slope steeper than 12°.

Fig. 1
4 parts. A map of Europe highlights Croatia. A map of Croatia highlights the study area in the Hrvatsko Zagorje region. The elevation map of the study area, 20.2 kilometers square. The elevation ranges from 222 to 682. A bar graph of area versus slope angle. The highest bar value is 13 from 12 to 32.

Location of the Hrvatsko Zagorje study area

According to the Basic Geological Map of Croatia, the area is composed of Miocene (78%), Quaternary (14%) and Triassic sediments (7%) (Aničić and Juriša 1984; Šimunić et al. 1982). The north-eastern part is composed of Triassic sediments: sandstones, shales and dolomites of Lower Triassic, and dolomites, limestones and dolomitised breccias of Middle Triassic. Most of the area is composed of Miocene sediments represented by sandstones; marls; sands; tuffs (Burdigalian); biogenic, sandy, and marly limestones; calcareous marls; and sandstones (Tortonian). The youngest sediments are the Quaternary formations, predominantly composed of sands, silts and gravels, and associated with flat valley terrain around streams and rivers.

The climate is continental with a mild maritime influence, with a mean annual precipitation (MAP) of approx. 830 mm, according to the meteorological station located 30 km east of the area (DHMZ 2022). Rainfall and rapid snow melting are considered the main landslide-triggering factors, as is common in NW Croatia (Bernat et al. 2014).

According to the land use information obtained by Sinčić et al. (2022), the area is covered by forest and high vegetation (73%), agricultural zones and low vegetation (24%) and artificial zones (3%). The area is not densely populated, but landslides cause significant damage to buildings, roads and other infrastructure.

3 Methods and Materials

3.1 LAND-SUITE Software

LAND-SUITE is open-source software consisting of three modules: LAND-SIP (LANDslide Susceptibility Input Preparation), LAND-SVA (LANDslide Susceptibility Variable Analysis), and LAND-SE (LANDslide Susceptibility Evaluation). The software allows the user to execute different landslide susceptibility applications and zonation using statistically based analyses. In this paper, we briefly summarise the main software characteristics, which are entirely derived from Rossi et al. (2021a), where the software code, a user guide and major functionalities are explained in detail.

LAND-SIP is used to prepare training and validation datasets. Different parameters enable numerous scenarios to be applied, based upon the selection of: (i) balanced or unbalanced random sampling; (ii) subsampling or sampling reducing partitions; and (iii) spatially or temporally based dataset partition.

LAND-SVA enables the development of multicollinearity tables and correlograms for evaluating the correlation between landslide causal factors. Moreover, outputs such as conditional density analysis, density plots, conditional density plots, histogram plots and mosaic plots provide insight into the causal factors, leading to the selection of the most significant thematic data for the analysis.

The main advantage of LAND-SE is the possibility to apply four commonly used statistical methods, i.e., Logistic Regression model (LRM), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA) and Neural Network analysis (NNM), and to combine them in a model ensemble (CFM), considering different mapping units, i.e., vectors and rasters. LAND-SUITE allows the preparation of susceptibility maps of the highest Susceptibility Quality Level (SQL index), as defined by Guzzetti et al. (2006a, b) and additionally illustrated in Reichenbach et al. (2018). LAND-SUITE or some of its modules have already been used to perform landslide susceptibility assessments in other test areas (e.g., Rossi et al. 2021b; Bornaetxea et al. 2018, 2023; Schlögel et al. 2018).

3.2 Landslide Inventory and Unstable Slope Units

The geomorphological landslide inventory map of the Hrvatsko Zagorje is the result of the visual interpretation of high-resolution LiDAR DTM derivatives (Krkač et al. 2022). The LiDAR DTM for landslide mapping was derived at 0.3 m resolution from a point cloud with a density of 16 pt./m². Considering the small and shallow landslides that predominantly occur in NW Croatia, detailed landslide mapping was performed using the highest possible DTM resolution.

Initially, 904 landslides were mapped and predominantly classified as shallow soil slides. Of the total number of landslides, 24% were randomly selected and checked in the field. Of the 214 checked phenomena, 20 landslides were rejected, and 28 were additionally mapped in the field and added to the inventory. The total number of landslides is therefore 912 (Fig. 2). The total area of the mapped landslides is 0.408 km², corresponding to 2.02% of the study area with a mean density of approx. 45 slope failures per square kilometre.
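The inventory figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative and not part of LAND-SUITE.

```python
# Figures taken from the text above; variable names are illustrative.
mapped_initially = 904      # landslides mapped on LiDAR DTM derivatives
rejected_in_field = 20      # rejected after field checking
added_in_field = 28         # additionally mapped in the field
study_area_km2 = 20.2
landslide_area_km2 = 0.408

total_landslides = mapped_initially - rejected_in_field + added_in_field
area_share_pct = 100 * landslide_area_km2 / study_area_km2
density_per_km2 = total_landslides / study_area_km2

print(total_landslides)             # 912
print(round(area_share_pct, 2))     # 2.02
print(round(density_per_km2, 1))    # 45.1
```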

Fig. 2
A map of the Hrvatsko Zagorje study area. It represents the LiDAR DTM landslide inventory, stable slopes, and unstable slopes. The stable slopes are displayed more. The LiDAR DTM landslide inventory is high at the top portion of the landscape.

LiDAR-based landslide inventory, and stable and unstable slope units used for landslide susceptibility modelling for the study area in Hrvatsko Zagorje

The area of the recorded landslides ranges from a minimum value of 3.3 m² to a maximum of 13,779 m², with an average of 448 m² (median = 173 m², std. dev. = 880 m²). The most frequent landslide size in the inventory is 200 m² and almost 85% of the landslide bodies showed a size ranging between 40 and 2000 m². The small size of the landslides can probably be explained by the geological (mainly Miocene marls covered with residual soils) and geomorphological conditions, with the differences between the valley bottoms and the top of the hills rarely higher than 100 m (Krkač et al. 2022).

For the slope unit delineation, we used the r.slopeunit software described in Alvioli et al. (2016). The software is a GRASS GIS module designed for the automatic and adaptive delineation of slope units based on a digital terrain model and a set of user-defined input parameters. In the study area, the most suitable size of the slope units was obtained with the following input parameters: flow accumulation area threshold (t) = 6000 m²; minimum slope planimetric area (a) = 1000 m²; minimum circular variance of terrain aspect within each slope unit (c) = 0.2; reduction factor (r) = 2; threshold value for the cleaning procedure (clean size) = 448 m².

The result of r.slopeunit is a partition of 2994 slope units ranging from 517 m² to 47,415 m² with an average size of 6747 m². The percentage of landslide area in each slope unit was used to classify the units as stable or unstable. In particular, slope units containing >5% of landslide area were classified as unstable. As a result, 342 slope units (11.4%) were classified as unstable.
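The stable/unstable rule described above can be sketched as follows. This is a minimal illustration and not the r.slopeunit or LAND-SUITE code: the `SlopeUnit` class and the example units are hypothetical, and only the 5% threshold and the unit counts come from the text.

```python
from dataclasses import dataclass

UNSTABLE_THRESHOLD = 0.05  # ">5% of landslide area", as stated in the text


@dataclass
class SlopeUnit:
    """Hypothetical container for one slope unit."""
    unit_id: int
    area_m2: float
    landslide_area_m2: float


def is_unstable(unit: SlopeUnit, threshold: float = UNSTABLE_THRESHOLD) -> bool:
    """A unit is unstable when its landslide area fraction exceeds the threshold."""
    return unit.landslide_area_m2 / unit.area_m2 > threshold


# Invented example units, for illustration only.
units = [
    SlopeUnit(1, 6747.0, 448.0),    # 6.6% landslide area -> unstable
    SlopeUnit(2, 6747.0, 173.0),    # 2.6% landslide area -> stable
    SlopeUnit(3, 47415.0, 2000.0),  # 4.2% landslide area -> stable
]
labels = {u.unit_id: is_unstable(u) for u in units}
print(labels)

# Share of unstable units reported in the text: 342 of 2994.
unstable_share_pct = round(100 * 342 / 2994, 1)
print(unstable_share_pct)  # 11.4
```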

3.3 Landslide Causal Factors

Landslide causal factor maps used for landslide susceptibility mapping can be divided into four main groups: (1) geomorphological factors; (2) geological factors; (3) hydrological factors; and (4) anthropogenic factors. A list of 22 factor maps, with their minimum and maximum values, class increments and the number of classes, is shown in Table 1.

Table 1 Landslide causal factor maps prepared for the study area in Hrvatsko Zagorje

Maps of the geomorphological factors (elevation, slope gradient, slope orientation, contour density, curvature, roughness and terrain dissection) were derived from the 5 m resolution LiDAR DTM.

Maps of the geological factors were initially derived from the Basic Geological Map of the Republic of Croatia at 1:100,000 scale. However, field verifications showed significant deviations between the geological map and actual geological contacts (Sinčić et al. 2022). Therefore, engineering geological formations were additionally mapped by modifying the geological contacts and adding superficial slope-wash and talus sediments using LiDAR DTM derivatives, according to Jagodnik et al. (2020a, b). Combining small-scale geological maps and LiDAR DTM derivatives resulted in significantly reduced deviations from the actual environmental conditions, except for the geological structures, i.e., fault data set (Sinčić et al. 2022).

The hydrological group of causal factors includes the drainage network, springs, permanent and temporary streams, and a topographic wetness index. The topographic wetness index and proximity to the drainage network were derived from the LiDAR DTM. In addition, springs and permanent and temporary streams were digitised from the topographic maps of the Republic of Croatia at 1:25,000 scale.

The information related to the anthropogenic factors was derived from: (1) 0.5 m orthophoto of the Republic of Croatia; (2) 0.3 m resolution LiDAR DTM; (3) Open Street Map (OSM); and (4) land-use planning maps. The complex acquisition process of deriving high-resolution land use data, buildings and road inventory, with different processing methods such as image classification and digitisation, is described in detail in Sinčić et al. (2022).

4 Results

The landslide training dataset for pixel and slope-unit landslide susceptibility analysis was defined in LAND-SUITE by a random selection of 50% of the unstable mapping units. Regarding the pixel-based analysis, landslides were pixel-sampled (see case C, Fig. 1 in Rossi et al. 2021a), i.e., 50% of each landslide polygon was used as a training dataset. Then, an equal number of stable mapping units for the landslide training dataset was randomly selected in the remaining study area.
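The balanced sampling described above can be illustrated with a short sketch for the slope-unit case. It is not the LAND-SIP implementation: the function name, the seed and the integer unit identifiers are assumptions, and the per-polygon pixel sampling used for the grid-cell analysis is not reproduced here.

```python
import random


def balanced_training_sample(unstable_ids, stable_ids, fraction=0.5, seed=0):
    """Pick a fraction of the unstable units and an equal number of stable units.

    Hypothetical helper; the seed only makes the example repeatable.
    """
    rng = random.Random(seed)
    n_train = round(len(unstable_ids) * fraction)
    train_unstable = rng.sample(unstable_ids, n_train)
    train_stable = rng.sample(stable_ids, n_train)
    return train_unstable, train_stable


# Unit counts from the text: 342 unstable out of 2994 slope units.
unstable_ids = list(range(342))
stable_ids = list(range(342, 2994))

train_unstable, train_stable = balanced_training_sample(unstable_ids, stable_ids)
print(len(train_unstable), len(train_stable))  # 171 171
```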

In this analysis, we first adopted grid cells as mapping units and applied LAND-SVA to ensure that only significant landslide causal factors were used for susceptibility modelling. The continuous variables were classified heuristically based on the researchers’ previous studies (Sinčić et al. 2022). For categorical variables, we computed frequency ratio (FR) values for each class and used them as relative values for their transformation into continuous variables. Next, the significance of the 22 potential causal factors (Table 1) for the susceptibility modelling was evaluated by observing the loss of model fitting performance, expressed as the area under the Receiver Operating Characteristic curve (AUCROC), when removing each single factor from the analysis (i.e., performing a Leave One Out test, see Fig. 3). Variables were considered pairwise correlated when the correlation coefficient was greater than 0.5 at a significance level of 0.01. In such a case, the variable with the lower or irrelevant influence on the overall model fitting performance (AUCROC) was excluded, which provided an objective selection criterion. The final set of variables used for grid-based statistical models was also used for the slope unit modelling. The following set of variables was selected to produce pixel and slope-unit based landslide susceptibility maps: slope orientation, terrain curvature, engineering formations, proximity to engineering formation contact, proximity to drainage network, proximity to permanent and temporary streams, and proximity to land use boundary (Table 1). In addition, LAND-SVA showed that multicollinearity problems between the selected variables were not present.
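As an illustration of the frequency ratio transformation mentioned above, the sketch below uses the common definition of FR, i.e., the share of landslide cells falling in a class divided by the share of all cells falling in that class. The exact formulation used in the study is not given in this section, so this definition is an assumption, and the class names and cell counts are invented.

```python
def frequency_ratio(class_cells, class_landslide_cells):
    """Return FR per class: (landslide share in class) / (area share of class)."""
    total_cells = sum(class_cells.values())
    total_landslide_cells = sum(class_landslide_cells.values())
    fr = {}
    for name, n_cells in class_cells.items():
        landslide_share = class_landslide_cells.get(name, 0) / total_landslide_cells
        area_share = n_cells / total_cells
        fr[name] = landslide_share / area_share
    return fr


# Invented counts for three hypothetical classes of a categorical factor.
class_cells = {"class_a": 6000, "class_b": 3000, "class_c": 1000}
class_landslide_cells = {"class_a": 150, "class_b": 40, "class_c": 10}

fr = frequency_ratio(class_cells, class_landslide_cells)
print(fr)  # FR > 1 marks classes with more landslides than their area share suggests
```

Each cell of the categorical map would then carry the FR value of its class, which turns the factor into a continuous variable.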

Fig. 3
2 parts. a. A 22 by 22 matrix titled pairwise correlation. The gradient bar at the bottom ranges from negative 1 to 1. b. The error plot titled leave one out test depicts AUCROC versus 22 variables. It plots for models LDA, NNM, LRM, and QDA.

Results of LAND-SVA evaluations used to define the final set of causal factors for LSM. Twenty-two variables were available for the Hrvatsko Zagorje study area. The figure shows the results of: (a) the pairwise correlation test; and (b) the Leave One Out test
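The way the two tests in Fig. 3 interact can be sketched as a simple screening rule: for every pair of variables with an absolute correlation above 0.5, drop the one whose removal costs less AUCROC in the Leave One Out test. This is a simplified illustration and not the LAND-SVA code: it uses the Pearson coefficient, skips the significance test at the 0.01 level, and all names and values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def screen_variables(data, auc_loss, threshold=0.5):
    """Keep, from each correlated pair, the variable with the larger AUC loss.

    data: variable name -> list of values per mapping unit.
    auc_loss: variable name -> drop in AUC when that variable is left out.
    """
    kept = set(data)
    names = sorted(data)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in kept and b in kept and abs(pearson(data[a], data[b])) > threshold:
                kept.discard(a if auc_loss[a] < auc_loss[b] else b)
    return sorted(kept)


# Invented values: factor_a and factor_b are strongly correlated, factor_c is not.
data = {
    "factor_a": [5, 10, 15, 20, 25, 30],
    "factor_b": [1.0, 2.1, 2.9, 4.2, 5.0, 6.1],
    "factor_c": [180, 20, 300, 90, 250, 130],
}
auc_loss = {"factor_a": 0.020, "factor_b": 0.002, "factor_c": 0.010}

selected = screen_variables(data, auc_loss)
print(selected)  # ['factor_a', 'factor_c']
```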

LAND-SE was used to prepare landslide susceptibility maps, i.e., four single statistical models (LRM, LDA, QDA, NNM) and one combined model (CFM) for the two mapping units (5 m × 5 m grid cells and slope units). The fitting performance of the LSM zonations was evaluated quantitatively with the AUCROC and the Cohen’s Kappa index (Table 2).

Table 2 Model evaluation metrics for ten landslide susceptibility models prepared with LAND-SUITE

Using the training landslide pixel sample, we evaluated the model fitting performance (AUC) for the pixel-based susceptibility maps. The AUC values indicate similar model fitting performance for the LRM (AUC = 0.775) and the LDA (AUC = 0.773) models, with the same Cohen’s Kappa index (k = 0.389 for both; Table 2). The lowest model fitting performance was obtained for the QDA model (AUC = 0.759 and k = 0.350), while the highest was obtained for the NNM model (AUC = 0.793 and k = 0.431). The combined model (CFM) has a fitting performance similar to that of the NNM (AUC = 0.79 and k = 0.432).

We evaluated the model fitting performance (AUC) for the slope-based susceptibility maps using the slope-unit training sample. The AUC values indicate similar model fitting performance (Table 2) for the LRM (AUC = 0.764) and the QDA (AUC = 0.745) models and slightly higher performance for the LDA model (AUC = 0.766). The highest model fitting performance was obtained for the NNM model (AUC = 0.854). The Cohen’s Kappa index is similar for all four models (k = 0.314–0.364). The combined model (CFM) has a fitting performance similar to that of the NNM (AUC = 0.837 and k = 0.383).
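For readers less familiar with the two metrics reported above, the sketch below computes them from scratch. AUC is obtained as the probability that a randomly chosen unstable unit receives a higher score than a randomly chosen stable unit, and Cohen's Kappa is computed from the confusion matrix at a probability cutoff. The 0.5 cutoff and the example labels and scores are assumptions for illustration; this section does not state the cutoff used in LAND-SE.

```python
def auc_roc(labels, scores):
    """AUC as the share of (unstable, stable) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(labels, scores, cutoff=0.5):
    """Cohen's Kappa for a binary classification at the given cutoff (assumed 0.5)."""
    preds = [1 if s >= cutoff else 0 for s in scores]
    n = len(labels)
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    tn = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 0)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 0)
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected)


# Invented example: 1 = unstable unit, 0 = stable unit.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.8, 0.3, 0.2, 0.1]

auc = auc_roc(labels, scores)
kappa = cohen_kappa(labels, scores)
print(auc, kappa)  # 0.8125 0.5
```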

The probability of landslide occurrence obtained for each of the ten LSMs was classified into five susceptibility classes (Fig. 4): very low (values ranging from 0 to 0.2), low (0.2–0.45), medium (0.45–0.55), high (0.55–0.8) and very high (0.8–1). Mapping units with a probability value ranging from 0.45 to 0.55 correspond to the locations where the model shows the highest classification uncertainty. Considering the pixel-based LSMs, the QDA has the lowest share of mapping units classified in the medium class (7.20%), and the LDA model has the highest (14.53%). Generally, slope-based LSMs show a higher share of mapping units classified in the medium class, with the LRM, LDA and CFM models having the lowest shares (from 21.17% to 23.07%) and the NNM model the highest (61.39%).
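The five-class scheme above maps directly onto a lookup over the class breaks. In the sketch below the breaks are taken from the text, while the treatment of values lying exactly on a break (assigned to the upper class) and the example probabilities are assumptions.

```python
from bisect import bisect_right
from collections import Counter

BREAKS = [0.2, 0.45, 0.55, 0.8]  # class limits from the text
CLASSES = ["very low", "low", "medium", "high", "very high"]


def susceptibility_class(probability):
    """Return the susceptibility class; values on a break go to the upper class."""
    return CLASSES[bisect_right(BREAKS, probability)]


# Invented probabilities for eight mapping units.
probabilities = [0.05, 0.30, 0.47, 0.50, 0.62, 0.91, 0.10, 0.85]
counts = Counter(susceptibility_class(p) for p in probabilities)

# Share of units in the medium class, i.e. the most uncertain predictions.
medium_share_pct = 100 * counts["medium"] / len(probabilities)
print(dict(counts))
print(medium_share_pct)  # 25.0
```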

Fig. 4
10 maps of the study area depict the 5 meters pixel-based landslide susceptibility maps and slope-based landslide susceptibility modeling of LRM, LDA, QDA, NNM, and CFM. It represents the landslide susceptibility zonation as follows. Very low, low, medium, high, and very high.

Pixel and slope unit-based landslide susceptibility maps for the study area located in Hrvatsko Zagorje prepared using LAND-SUITE. Legend: LDA = Linear discriminant analysis; LRM = Logistic regression model; QDA = Quadratic discriminant analysis; NNM = Neural network analysis; CFM = Combined model

Considering only the AUC values, the NNM model combined with slope units could be considered the best combination, but this zonation shows the largest number of mapping units classified in the medium class, which corresponds to a very large part of the study area. In addition, the pixel-based combined susceptibility zonation (CFM) shows the best fitting performance (AUC = 0.793, k = 0.432, u = 9.58%) among the ten models obtained with LAND-SUITE.

5 Discussion

The focus of this work was to test LAND-SUITE for landslide susceptibility modelling, including the selection of the appropriate causal factors and the analysis of various statistical models and mapping units. The analysis was conducted for a small study area (20 km²), which can be considered representative for proposing a methodology for large-scale LSM and as an example for the Croatian spatial planning system.

In previous work, Sinčić et al. (2022) prepared and analysed high-quality input data derived from high-resolution remote sensing data, i.e., LiDAR data and orthophoto maps, useful for landslide susceptibility assessment in the study area. We have evaluated a large number of geomorphological, geological, hydrological and anthropogenic data as variables, since general criteria or guidelines for the selection of causal factors for susceptibility modelling are not available.

LAND-SVA multicollinearity analysis and correlograms facilitate the selection of relevant explanatory variables. All explanatory variables with a low impact on model fitting performance and/or a correlation coefficient higher than 0.5 were omitted from the analysis. From the 22 original factor maps, only 6 were selected, given the low relevance of the remaining variables for the susceptibility model and following the rule “less is more” to avoid over-parameterisation issues.

All model evaluation tests indicate that pixel-based models have better model fitting and accuracy than slope-unit based models, regardless of the statistical method. However, the overall performances of the pixel-based susceptibility maps presented are of good-to-fair fitting capacity, with AUC values ranging between 0.759 and 0.793 and a Cohen’s Kappa index ranging between 0.35 and 0.43, which can be described as moderate (Table 2). Nevertheless, our main interest was not preparing a definitive landslide susceptibility map for the study area, but testing the possibility of using LAND-SUITE for large-scale landslide susceptibility modelling, applying different mapping units and analysing differences in the final landslide susceptibility maps. In the future, a more detailed analysis of the input data for LSM will be performed because the availability of new and relevant geo-environmental information can result in better and more reliable susceptibility models and zonations (Reichenbach et al. 2018).

Statistical landslide susceptibility zonations were quantitatively assessed for the entire study area by comparing the total area (Table 2) and the spatial distribution of the susceptibility classes (Fig. 4). The pixel and slope unit-based susceptibility assessments differ significantly in the type of information provided. To emphasise the large-scale application and to highlight the differences between the ten models, detailed views of the derived maps are presented in Fig. 5. The close-up extents show the distribution of susceptibility classes together with existing landslides, buildings, roads and water bodies. Moreover, the susceptibility zonation on the close-up extents can be discussed with regard to its further applicability in the spatial planning system. One important conclusion is that a pixel-based map represents a better option due to higher model accuracy and a better relation with existing landslides. However, pixel-based maps require “post-processing” of the susceptibility zones to produce more clustered and homogeneous information for the final purpose, i.e., application in a local-level spatial planning system. On the other hand, slope-based maps have a closer relation with the topography and reduced mapping errors, and are more appropriate for further applications. The problem is that the size of the slope units derived for this analysis does not correspond well with the scale of the landslide susceptibility assessment and would be more appropriate at the regional level.

Fig. 5
10 maps of the study area depict the 5 meters pixel-based landslide susceptibility maps and slope-based landslide susceptibility modeling of LRM, LDA, QDA, NNM, and CFM. It exhibits the landslide and landslide susceptibility zonation as follows. Very low, low, medium, high, and very high.

Close-up views for pixel and slope unit-based landslide susceptibility maps of the study area in Hrvatsko Zagorje prepared using LAND-SUITE. Legend: LDA = Linear discriminant analysis; LRM = Logistic regression model; QDA = Quadratic discriminant analysis; NNM = Neural network analysis; CFM = Combined model

6 Conclusion

The landslide susceptibility analysis for part of the Hrvatsko Zagorje (20 km²) was performed using the LiDAR-based landslide inventory (Krkač et al. 2022) and causal factor maps derived and/or modified with high-resolution remote sensing data (Sinčić et al. 2022). We applied five statistical methods in LAND-SUITE (Rossi et al. 2021a), i.e., four single statistical models (LRM, LDA, QDA, NNM) and one combined model (CFM), for two different types of mapping unit, i.e., 5 m × 5 m regular grid cells and slope units. In both cases, an equal number of unstable and stable units were randomly selected as training datasets. The original 22 causal factors were checked for mutual collinearity, and only six were classified as significant for LSM (i.e., slope orientation, terrain curvature, engineering formations, proximity to engineering formation contact, proximity to drainage network, proximity to permanent and temporary streams, and proximity to land use boundary).

For the pixel training samples, the best model fitting performance was obtained for the NNM (AUC = 0.793 and k = 0.431) and the combined (CFM) models (AUC = 0.79 and k = 0.432). Similarly, the best model fitting performance for the slope-unit training samples was obtained for the NNM (AUC = 0.854 and k = 0.364) and the combined (CFM) models (AUC = 0.837 and k = 0.383).

The probability of landslide occurrence for each of the ten models was classified into five susceptibility classes. The pixel and slope unit-based maps displayed differences in the information detail, indicating that pixel-based models are more appropriate for the local-level spatial planning system.

In this work, LAND-SUITE proved to be a helpful tool for testing landslide susceptibility hypotheses and searching for new insights in landslide applications. Moreover, LAND-SUITE allowed efficient statistically-based modelling and massive code execution, mainly due to the availability of its command line interface (Rossi et al. 2021a).