# Does parameterization influence the performance of slope stability model results? A case study in Rio de Janeiro, Brazil


DOI: 10.1007/s10346-016-0783-6

- Cite this article as:
- de Lima Neves Seefelder, C., Koide, S. & Mergili, M. Landslides (2016). doi:10.1007/s10346-016-0783-6


## Abstract

We produce factor of safety (*FOS*) and slope failure susceptibility index (*SFSI*) maps for a 4.4-km^{2} study area in Rio de Janeiro, Brazil, in order to explore the sensitivity of the model outcomes to the geotechnical and geohydraulic parameterization. To this end, we consider parameter spaces instead of combinations of discrete values. *SFSI* is defined as the fraction of tested parameter combinations within a given space yielding *FOS* <1. We repeat our physically based calculations for various parameter spaces, employing the infinite slope stability model and the sliding surface model of the software r.slope.stability for testing the geotechnical parameters and the Transient Rainfall Infiltration and Grid-Based Regional Slope-Stability Model (TRIGRS) for testing the geohydraulic parameters. Whilst the results vary considerably in terms of their conservativeness, the ability to reproduce the spatial patterns of the observed landslide release areas is relatively insensitive to the variation of the parameterization as long as there is sufficient pattern in the results. We conclude that landslide susceptibility maps yielded by catchment-scale physically based models should not be interpreted in absolute terms and suggest that efforts to develop better strategies for dealing with the uncertainties in the spatial variation of the key parameters should be given priority in future slope stability modelling efforts.

### Keywords

Parameter sensitivity, Parameter space, Slope failure susceptibility index, Slope stability model, TRIGRS, Uncertainty

## Introduction

Landslides starting from unstable slopes affect the safety of life as well as of private and public assets. Computer models are employed to identify potentially unstable areas in order to facilitate decision-making at various levels. Whilst statistical models explore the relationships between the spatial patterns of landslide occurrence and a set of predictor layers, physically based models attempt to reproduce or to predict the physical mechanisms involved (Guzzetti et al. 1999; Van Westen 2000; Guzzetti 2006; Van Westen et al. 2006). Physically based models are frequently employed to estimate landslide susceptibility at the scale of small catchments (Van Westen et al. 2006). As long as shallow landslides are considered, these approaches mostly rely on the infinite slope stability model. It is commonly used in raster-based geographic information system (GIS) environments to derive a factor of safety for each pixel. However, the infinite slope stability model is unconditionally suitable only for those areas where shallow translational landslides with a length-to-depth ratio L/D >16–25 are expected (Griffiths et al. 2011; Milledge et al. 2012). As shallow landslides are most commonly triggered by extreme hydrometeorological events, such modelling tools are often coupled with more or less complex hydraulic models (e.g., Montgomery and Dietrich 1994; Van Westen and Terlien 1996; Burton and Bathurst 1998; Pack et al. 1998; Wilkinson et al. 2002; Xie et al. 2004a; Baum et al. 2008; Godt et al. 2008; Muntohar and Liao 2010; Mergili et al. 2012).

For areas with deep-seated landslides, models assuming spherical, ellipsoidal or complex sliding surfaces reproduce the stability situation in a more appropriate way. Whilst they are standard in geotechnical engineering, their implementation with GIS is non-trivial so that catchment-scale applications are less common (e.g. Xie et al. 2003, 2004b, 2006; Jia et al. 2012; Mergili et al. 2014a, b).

Even simple slope stability or hydraulic models rely on parameters which are highly uncertain in their horizontal and vertical distribution. One possible concept to account for parameter uncertainty is the probability of failure (Tobutt 1982) which has started to complement the conventional factor of safety with increasing computational power, considering parameter spaces using random or regular sampling of uncertain parameters (Mergili et al. 2014a). Various authors have introduced and used different types of probability density functions (pdfs) of geotechnical (El-Ramly et al. 2005; Petrovic 2008; Mergili et al. 2014a) and geohydraulic parameters (Mesquita et al. 2002, 2007; Mesquita and Moraes 2004) which can be employed for parameter sampling. Whilst such functions are a smart way to deal with uncertain information, they are not necessarily transferable between different locations and therefore commonly suffer from small sample sizes and, consequently, weakly supported means and standard deviations.

As the challenge of uncertain parameters is encountered in many fields of geosciences, various approaches have been developed in the previous decades to test the sensitivity of the model results or the model performance to the input parameters or to optimize (calibrate) the input parameters in order to bring the model results in line with reference observations. Testing one parameter at a time is thereby considered inappropriate as both the optimum value and the sensitivity may strongly interrelate with the values of other parameters (Saltelli and Annoni 2010). Multi-parameter strategies are therefore required (e.g., Duan et al. 1992; Eberhart and Kennedy 1995; Hay et al. 2006; Vrugt et al. 2008; Fischer 2013). Optimized parameters or parameter sets, however, are not necessarily meaningful from a physical point of view. Particularly when calibrating many parameters at once, a good model performance in terms of reproducing the observation can be achieved despite a poor process understanding. The sensitivity of local-scale slope stability model results to selected input parameters was tested, e.g. by Griffiths and Fenton (2004) or by Wang et al. (2010). Guimarães et al. (2003) and Formetta et al. (2015) have applied parameter optimization strategies at catchment scale.

Almost all documented parameter sensitivity and optimization strategies target discrete parameter values. We think that, particularly at broader scales, sensitivity analysis and optimization of parameter values is inappropriate as it disregards the inherent fine-scale spatial variability of the parameters. Instead, we suggest performing sensitivity analysis and optimization of parameter ranges.

The present article demonstrates such a strategy, employing a modification of the probability of failure concept. We investigate how the considered ranges of geotechnical and geohydraulic input parameters influence the results and performance of GIS-based catchment-scale slope stability models. For this purpose, we apply the infinite slope stability model, the sliding surface model of the tool r.slope.stability and the software Transient Rainfall Infiltration and Grid-Based Regional Slope-Stability Model (TRIGRS) to the Quitite and Papagaio catchments, Rio de Janeiro, Brazil. The findings are thought to be useful to identify suitable parameterization strategies for future slope stability modelling efforts.

Next, we introduce the study area (“Study area and data” section) and describe the components of the proposed work flow (“Methods” section). We then demonstrate (“Results” section) and discuss (“Discussion” section) the results obtained before drawing our conclusions (“Conclusions” section).

## Study area and data

The study area, comprising the Quitite and Papagaio catchments, covers 4.4 km^{2}, extending between 12 and 995 m a.s.l. The climate in the area is tropical humid (Guimarães et al. 2009). Due to the influence of ocean moisture, the area receives a higher amount of rainfall than the central part of Rio de Janeiro (Hurtado Espinoza 2010). Granitic bedrock dominates both watersheds. The homogeneous, colluvial yellow soil is characterized by sandy-clay features (Hurtado Espinoza 2010; Galindo 2013; Galindo and Campos 2014) and a depth of 1–3 m (Guimarães et al. 2003). Native forest is still the dominant type of vegetation whilst the anthropogenic influence on the land cover is of limited importance (Guimarães et al. 2003; Hurtado Espinoza 2010).

Guimarães et al. (2003) estimated the effective cohesion *c*′ (kN m^{−2}), normalized to depth *d*, the effective angle of internal friction *φ*′ and the specific weight of the saturated soil *γ*_{s} (kN m^{−3}) using published parameters for geomorphologically comparable adjacent areas and back-calculations with the software SHALSTAB. These authors arrived at best-fit values of *c*′/*d* = 2 kN m^{−3}, *φ*′ = 45° and *γ*_{s} = 15 kN m^{−3}, but they also indicated that, in general, low values of *c*′/*d*, high values of *φ*′ and values from 15 to 17.5 kN m^{−3} for *γ*_{s} would be appropriate for the area. They proposed a general frame of parameter values realistic for the area (in the sense of a parameter space) summarized in Table 1 and, with some modifications, applied to tests A and B (see “Methods” and “Results” sections). Hurtado Espinoza (2010) measured a dry specific weight around 15 kN m^{−3} for some undisturbed samples taken at 1 m depth. The same authors stated that the slopes in the lower areas would be weaker whilst those in the higher areas would be stronger.

Table 1 Range of parameter values applicable to the Quitite and Papagaio watersheds (Guimarães et al. 2003)

Parameter | Minimum value | Maximum value |
---|---|---|
*c*′/*d* | 0 kPa m^{−1} | 8 kPa m^{−1} |
*φ*′ | 25° | 45° |
*γ*_{s} (N m^{−3}) | 15,000 | 25,000 |

Values of the saturated hydraulic conductivity of the soil *K*_{s} were measured by Fernandes et al. (2001) using a Guelph permeameter. The results showed a high variability with values ranging from 10^{−6} to 10^{−4} m s^{−1} as well as some important discontinuities in the profiles, possibly influencing groundwater flow.

The landslides triggered by the rainfall event of 13 and 14 February 1996 cover a total area of 0.14 km^{2} (3.1% of the entire area). Table 2 summarizes the main characteristics of the landslide inventory. Most landslides occurred in the native forest areas dominating the study area. Shallow landslides, debris flows and debris avalanches were most common. The sliding surfaces of most landslides coincided with the soil-rock interface (Guimarães et al. 2003; Miqueletto and Vargas 2009; Hurtado Espinoza 2010). The landslide inventory displays the entire extent of the directly affected areas without distinguishing between release, transit and deposition areas.

Table 2 Main characteristics of the inventoried landslides triggered by the rainfall event of 13 and 14 February 1996 in the Quitite and Papagaio watersheds

Landslide characteristics | Values |
---|---|
Number of landslides | 93 |
Total landslide area (fraction of total area) | 0.14 km^{2} (3.1%) |
Average (minimum–maximum) landslide area projected to ground plot | 1520 m^{2} |
Average (minimum–maximum) landslide length in down-slope direction, projected to ground plot | 65.9 m (8.0–220.0 m) |
Average (minimum–maximum) landslide width in cross-slope direction | 20.0 m (4.0–96.0 m) |
Average (minimum–maximum) landslide inclination in down-slope direction | 31.9° (8.5–45.8°) |

Besides the geotechnical and geohydraulic information and the landslide inventory, we use a 2-m resolution digital elevation model (DEM).

## Methods

### Work flow and software

We compute a slope failure susceptibility index (*SFSI*) (dimensionless number in the range 0–1) based on sets of factor of safety (*FOS*) values derived through the controlled variation of selected key parameters within a defined parameter sub-space. This procedure is repeated for various sub-spaces. The resulting *SFSI* values are evaluated against the inventory of observed landslides, and the findings are compared and interpreted.

In a first step, we vary the geotechnical parameters (tests A and B) and in a second step, we vary the geohydraulic parameters (test C). Test D uses a simple statistical model for the sake of comparison. Test A builds on the infinite slope stability model, test B on the sliding surface model of the tool r.slope.stability (Mergili et al. 2014a, b), designed as a raster module of the open source GRASS GIS software (Neteler and Mitasova 2008; GRASS Development Team 2016). Test C makes use of TRIGRS (Transient Rainfall Infiltration and Grid-Based Regional Slope-Stability Model; Baum et al. 2008), which is a grid-based tool simulating the permanent and transient rainfall influences on slope stability. Python scripting is used to derive *SFSI*, and the R Project for Statistical Computing (R Core Team, 2016) is employed for the evaluation of the results. Test D relies entirely on Python and R scripting.

### Geotechnical model

The factor of safety (*FOS*) is computed as the ratio between resisting forces *R* and driving forces *T*:

$$FOS = \frac{R}{T} \qquad (1)$$

When *FOS* = 1, the slope is in static equilibrium. Values of *FOS* <1 indicate potential failure (in reality, such slopes do not exist), values of *FOS* >1 indicate stable slopes. The use of this method requires the prior definition of a slip surface, and the soil is considered as rigid material.

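*FOS* can be expressed in various ways. For fully saturated soil, the equation may be formulated as follows (modified after Baum et al. 2008):

$$FOS = \frac{\tan\varphi'}{\tan\alpha} + \frac{c' - u\,\tan\varphi'}{\gamma_{s}\, d\, \sin\alpha\, \cos\alpha} \qquad (2)$$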

where *α* is the slope angle, *u* (N m^{−2}) is the pore water pressure, *γ*_{s} (N m^{−3}) is the specific weight of the saturated soil and *d* (m) is the depth of the sliding surface.
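The infinite slope calculation can be sketched in a few lines. This is a minimal illustration, assuming the widely used formulation *FOS* = tan *φ*′/tan *α* + (*c*′ − *u* tan *φ*′)/(*γ*_{s} *d* sin *α* cos *α*). The function name, the value of the specific weight of water and the default pore pressure (water table at the surface with slope-parallel seepage) are assumptions made for this example, not values taken from r.slope.stability or TRIGRS.

```python
import math

GAMMA_W = 9810.0  # specific weight of water (N m^-3), assumed value


def infinite_slope_fos(c_eff, phi_deg, alpha_deg, gamma_s, d, u=None):
    """Factor of safety of an infinite slope.

    c_eff: effective cohesion (N m^-2); phi_deg: friction angle (degrees);
    alpha_deg: slope angle (degrees); gamma_s: saturated specific weight
    (N m^-3); d: sliding surface depth (m); u: pore water pressure (N m^-2).
    """
    alpha = math.radians(alpha_deg)
    tan_phi = math.tan(math.radians(phi_deg))
    if u is None:
        # Assumption: water table at the surface, slope-parallel seepage.
        u = GAMMA_W * d * math.cos(alpha) ** 2
    frictional = tan_phi / math.tan(alpha)
    cohesive = (c_eff - u * tan_phi) / (
        gamma_s * d * math.sin(alpha) * math.cos(alpha))
    return frictional + cohesive


# Hypothetical pixel: c' = 4.5 kN m^-2, phi' = 45 deg, 35 deg slope, d = 3 m
fos = infinite_slope_fos(4500.0, 45.0, 35.0, 16000.0, 3.0)
```

Under these assumptions, *d* cancels out of the expression when *c*′ = 0, so cohesionless soil yields the same *FOS* for any sliding surface depth.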

In the present work, we use the infinite slope stability model implemented with r.slope.stability and with TRIGRS. Alternatively, we also apply the sliding surface model of r.slope.stability. Thereby, the slope stability is tested for a large number of randomly selected ellipsoid-shaped potential sliding surfaces, truncated at the depth of the soil. *R* and *T* are summarized over all pixels intersecting a given sliding surface, and *FOS* is computed for each surface in a way analogous to Eqs. 1 and 2, applying a modification of the Hovland (1977) model. Finally, the minimum value of *FOS* resulting from the overlay of all sliding surfaces is applied to each pixel. For a more detailed description of the sliding surface model of r.slope.stability, we refer to Mergili et al. (2014a, b).

### Geohydraulic model

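In TRIGRS, *FOS* is computed for one or more user-defined depths. The Richards equation is used to calculate the transient infiltration for saturated and unsaturated soil conditions (Iverson 2000):

$$\frac{\partial\psi}{\partial t}\,\frac{\mathrm{d}\theta}{\mathrm{d}\psi} = \frac{\partial}{\partial x}\left[K_{L}(\psi)\left(\frac{\partial\psi}{\partial x} - \sin\alpha\right)\right] + \frac{\partial}{\partial y}\left[K_{L}(\psi)\,\frac{\partial\psi}{\partial y}\right] + \frac{\partial}{\partial z}\left[K_{z}(\psi)\left(\frac{\partial\psi}{\partial z} - \cos\alpha\right)\right] \qquad (3)$$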

where *ψ* (m) is pressure head, *θ* is soil volumetric water content, *t* (s) is time, *K*_{L} (m s^{−1}) is lateral soil conductivity and *K*_{z} (m s^{−1}) is soil conductivity in *z* direction.

To solve the Richards equation, TRIGRS uses an approach developed by Iverson (2000), considering homogeneous soil, isotropic flow, relatively shallow depth, one-dimensional vertical downslope flow and soil moisture close to saturated conditions (Baum et al. 2008; Park et al. 2013), following the heat conduction approach described by Carslaw and Jaeger (1959). We refer to Baum et al. (2008) for a detailed description of the procedure.

For computing the groundwater level, TRIGRS compares the infiltrated water volume *V*_{I} and the maximum drainage capacity of the soil *V*_{D}. If *V*_{D} ≥ *V*_{I}, the water table remains constant. Otherwise, the water table rises, depending on *K*_{s} and the transmissivity *T*. For unsaturated conditions, the maximum value of *ψ* is the new water level multiplied with *β* (value set according to the adopted flow condition). The amount of water exceeding the maximum infiltration rate is considered surficial runoff. However, surficial runoff is not taken over from one time step to the next (Baum et al. 2008).

### Slope failure susceptibility index

The slope failure susceptibility index (*SFSI*) in the range 0–1 refers to the fraction of geotechnical and/or geohydraulic parameter combinations resulting in *FOS* <1, out of an arbitrary number of tested parameter combinations. This means that *SFSI* for a given pixel increases with each parameter combination where *FOS* <1 and, finally, low values of *FOS* correspond to high values of *SFSI*. The principal concept of the *SFSI* is identical to the concept of the slope failure probability yielded by r.slope.stability (Mergili et al. 2014a). However, we refer to it as a susceptibility index in the context of the present study as we simply use a uniform probability density function throughout all the computations. Such a distribution does not necessarily capture the real-world parameter distribution (which is unknown) and its use does therefore not justify applying the concept of probability in a strict sense.
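The definition of *SFSI* for a single pixel can be illustrated as follows. This is a sketch only: it assumes regular sampling of a two-dimensional (*c*′, *φ*′) space, and `toy_fos` is a hypothetical dry infinite-slope function used to exercise the index, not the model applied in the study.

```python
import math
from itertools import product


def linspace(lo, hi, n):
    """n regularly spaced values from lo to hi (inclusive)."""
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def sfsi(fos_func, c_range, phi_range, n=10):
    """Fraction of the n x n sampled (c', phi') combinations with FOS < 1."""
    combos = list(product(linspace(*c_range, n), linspace(*phi_range, n)))
    failures = sum(1 for c, phi in combos if fos_func(c, phi) < 1.0)
    return failures / len(combos)


def toy_fos(c_eff, phi_deg, alpha_deg=35.0, gamma=16000.0, d=3.0):
    """Hypothetical dry infinite-slope FOS, only used to exercise sfsi()."""
    alpha = math.radians(alpha_deg)
    return (math.tan(math.radians(phi_deg)) / math.tan(alpha)
            + c_eff / (gamma * d * math.sin(alpha) * math.cos(alpha)))


# c' in N m^-2 and phi' in degrees, spanning the full space used in test A
pixel_sfsi = sfsi(toy_fos, c_range=(0.0, 24000.0), phi_range=(21.0, 45.0))
```

Because every sampled combination carries the same weight, the index corresponds to a uniform distribution over the sampled space, which is why it is treated here as a susceptibility index rather than a probability.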

### Statistical model

A classified slope map is overlaid with the map of the observed landslide release areas (ORA) and, for each slope class, we compute the fraction *f*_{C} of observed landslide release pixels related to all pixels. *SFSI*, referred to as release probability by Mergili and Chu (2015) who employed a comparable approach, is then computed by applying *f*_{C} to all pixels of the corresponding slope class. Thereby, it is important to use two different areas for the derivation of *f*_{C} and for the computation and evaluation of *SFSI* (“Test layout” section).
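The slope-class approach can be sketched as follows. The 5° class width, the function names and the pixel values are assumptions chosen for illustration; the class boundaries actually used are not stated in the text.

```python
def class_fractions(slopes_deg, is_release, class_width=5.0):
    """Fraction f_C of release pixels per slope class, keyed by class index."""
    totals, releases = {}, {}
    for slope, rel in zip(slopes_deg, is_release):
        k = int(slope // class_width)
        totals[k] = totals.get(k, 0) + 1
        releases[k] = releases.get(k, 0) + (1 if rel else 0)
    return {k: releases[k] / totals[k] for k in totals}


def apply_fractions(slopes_deg, f_c, class_width=5.0):
    """Assign each pixel the f_C of its slope class (0 for unseen classes)."""
    return [f_c.get(int(s // class_width), 0.0) for s in slopes_deg]


# Hypothetical pixels: one catchment provides f_C, the other receives SFSI.
train_slopes = [12.0, 14.0, 31.0, 33.0, 34.0, 32.0]
train_release = [False, False, True, False, True, False]
f_c = class_fractions(train_slopes, train_release)
sfsi_other = apply_fractions([13.0, 30.5, 50.0], f_c)
```

Deriving `f_c` in one area and applying it in another keeps model development and evaluation separate, as required by the approach.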

Table 3 Summary of all tests performed

Test | Description |
---|---|
A1 | The infinite slope stability model is applied with a constant soil depth; OIA considered for evaluation |
A2 | Infinite slope stability model, constant soil depth; ORA considered for evaluation |
A3 | Infinite slope stability model, variable soil depth (as a function of the topographic wetness index, following Guimarães et al. 2003) and OIA considered for evaluation |
A4 | Infinite slope stability model, variable soil depth and ORA considered for evaluation |
B | The sliding surface model implemented in r.slope.stability is applied along with the optimized parameters derived from the tests A1–A4. The ellipsoid density per pixel (Mergili et al. 2014a, b) is set to 2500. The ellipsoid dimensions are constrained by the dimensions of the release areas of the observed landslides. All ellipsoids are truncated at the depth of the soil. |
C1 | TRIGRS, rectangular hydrograph and an assumed rainfall duration of 6 h. The rainfall is considered constant throughout the entire period, resulting in an intensity of 24 mm/h. |
C2 | TRIGRS, rectangular hydrograph and rainfall duration of 10 h (intensity 14.4 mm/h) |
C3 | TRIGRS, triangular hydrograph with central peak and rainfall duration of 6 h, resulting in a peak intensity of 48 mm/h |
C4 | TRIGRS, triangular hydrograph with central peak and duration of 10 h and peak intensity of 28.8 mm/h |
D | Simple statistical model employing the slope angle as the only predictor layer. Overlay of a classified slope map with the ORA map and, for each class, computation of the fraction of observed landslide release pixels related to all pixels. |

### Model evaluation

The landslide inventory for the Quitite and Papagaio watersheds displays the entire observed landslide impact areas (OIAs), i.e. the release, transit and deposition areas without any differentiation. We approximate the observed landslide release area (ORA) as the upper third of each OIA polygon. Depending on the test (“Test layout” section and Table 3), either the OIA map or the ORA map is overlaid with the corresponding *SFSI* map. When using the ORA map, the lower two thirds of the OIA are not considered for evaluation. The true positive (TP), true negative (TN), false positive (FP) and false negative (FN) pixel counts are derived for selected levels of *SFSI*. A receiver operating characteristic (ROC) curve is produced by plotting the true positive rates TP/(TP + FN) against the false positive rates FP/(FP + TN) derived with each combination of parameters. The area under the ROC curve *AUROC* indicates the predictive capacity of the model: *AUROC* = 1.0 (the maximum) means a perfect prediction, *AUROC* = 0.5 (corresponding to a straight diagonal line) indicates a random prediction, i.e. model failure. *AUROC* refers to the entire area used for model evaluation.
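The ROC evaluation can be sketched as follows. This is a simplified illustration with hypothetical pixel values: a pixel counts as predicted positive when its *SFSI* reaches a given threshold, and the area is integrated with the trapezoidal rule. The study itself used R for the evaluation, so the function names here are assumptions.

```python
def roc_points(sfsi, observed, thresholds):
    """(FPR, TPR) pairs; a pixel is predicted positive if SFSI >= threshold."""
    points = []
    for thr in thresholds:
        tp = sum(1 for s, o in zip(sfsi, observed) if s >= thr and o)
        fn = sum(1 for s, o in zip(sfsi, observed) if s < thr and o)
        fp = sum(1 for s, o in zip(sfsi, observed) if s >= thr and not o)
        tn = sum(1 for s, o in zip(sfsi, observed) if s < thr and not o)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


def auroc(points):
    """Trapezoidal area under the ROC curve, anchored at (0,0) and (1,1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical SFSI values and observed release flags for six pixels
sfsi_values = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
observed = [True, True, False, True, False, False]
area = auroc(roc_points(sfsi_values, observed, [i / 10 for i in range(11)]))
```

In this toy case, eight of the nine possible (landslide, non-landslide) pixel pairs are ranked correctly, so the area equals 8/9.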

The conservativeness of the model results is expressed by the factor *FoC*:

$$FoC = \frac{\mu_{SFSI}}{r_{OP}} \qquad (4)$$

where μ_{SFSI} is the average of *SFSI* over the entire study area, and *r*_{OP} is the observed positive rate, i.e. the fraction of observed landslide pixels out of all pixels in the study area. If *FoC* >1, the model overestimates the landslide susceptibility compared to the observation, whilst values of *FoC* <1 indicate an underestimation of the landslide susceptibility.
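The conservativeness measure can be computed directly from the two quantities defined above. The sketch below assumes that *FoC* is the ratio of the mean *SFSI* to the observed positive rate; the pixel values are hypothetical.

```python
def factor_of_conservativeness(sfsi, observed):
    """Ratio of the mean SFSI to the observed positive rate r_OP."""
    mean_sfsi = sum(sfsi) / len(sfsi)
    r_op = sum(1 for o in observed if o) / len(observed)
    return mean_sfsi / r_op


# Ten hypothetical pixels, one of which is an observed landslide pixel
sfsi_map = [0.9, 0.6, 0.5, 0.4, 0.3, 0.3, 0.2, 0.1, 0.1, 0.1]
observed = [True] + [False] * 9
foc = factor_of_conservativeness(sfsi_map, observed)
```

Here the mean *SFSI* is 0.35 against an observed positive rate of 0.1, so the toy model is conservative by a factor of about 3.5.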

### Test layout

Table 4 Parameter values and ranges applied in tests A–C

Symbol | Description | A1–A4 | B | C1–C4 |
---|---|---|---|---|
*φ*′ | Effective angle of internal friction | 21–45° | 21–45° | 45° |
*c*′ | Effective cohesion | 0–24 kN m^{−2} | 0–24 kN m^{−2} | 4.5 kN m^{−2} |
*γ*_{d} | Dry specific weight | 13.5 kN m^{−3} | 13.5 kN m^{−3} | N/A |
*γ*_{s} | Saturated specific weight | N/A | N/A | 16.0 kN m^{−3} |
*θ*_{s} | Saturated water content | 40 Vol.% | 40 Vol.% | 40 Vol.% |
*θ*_{r} | Residual water content | N/A | N/A | 5 Vol.% |
*K*_{s} | Saturated hydraulic conductivity | N/A | N/A | 10^{−7}–10^{−4} m s^{−1} |
*D* | Diffusivity | N/A | N/A | 200 *K*_{s} |
*I*_{0} | Initial infiltration rate | N/A | N/A | 1.3 × 10^{−6} m s^{−1} |
*d* | Sliding surface depth | 3 m | 3 m | 3 m |
| Depth of water table | 0 m | 0 m | N/A |
*d*_{i} | Initial depth of water table | N/A | N/A | 0–3 m |

In tests A and B, the sensitivity of *SFSI* and the associated model performance to the geotechnical parameters *c*′ and *φ*′ and the shape of the sliding surface is explored, assuming fully water-saturated soils and a sliding surface depth corresponding to the soil depth. The infinite slope stability model and the sliding surface model implemented in r.slope.stability are employed for this purpose. We introduce a two-dimensional parameter space constrained by lower boundaries of *c*′ = 0 kN m^{−2} and *φ*′ = 21°, and upper boundaries of *c*′ = 24 kN m^{−2} and *φ*′ = 45° (Fig. 3a; Table 4). This parameter space accounts for the full ranges of *c*′ and *φ*′ considered representative for the area (“Study area and data” section). We note that the resulting values of *FOS* vary according to *φ*′ and *c*′/*d*, so that the value of *FOS* obtained with *d* = 3 m and with a given value of *c*′ is identical (infinite slope stability model) or similar (sliding surface model) to the value of *FOS* with other values of *c*′ and *d*, but the same *c*′/*d* ratio. The dry specific weight of the soil *γ*_{d} = 13.5 kN m^{−3} and the volumetric saturated water content *θ*_{s} = 40 vol.% are set to constant values. We neglect the weight of the trees and the effects of their root systems on the cohesion: sliding surfaces are assumed to develop beneath the rooting depth.

The ranges of both *c*′ and *φ*′ are (i) considered in their entire extent, (ii) subdivided into two sub-ranges of equal extent and (iii) subdivided into three sub-ranges of equal extent (Fig. 4a, b). Considering all possible combinations of sub-ranges of the two parameters results in 36 partly overlapping parameter sub-spaces with 25 corner points.

*SFSI* is computed for each parameter sub-space, with ten sampled parameters in each dimension (Fig. 4c). This procedure may be extended to three or more dimensions or repeated at a finer level by employing the sub-space with the best model performance as the entire space for the next level. For reasons to be explained in the “Results” section, only one level is applied in the present work. This work flow is repeated for two assumptions of soil depth and two versions of the landslide inventory used for evaluation, resulting in a total of four sub-tests (Table 3).
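The counts of 36 sub-spaces and 25 corner points follow from combining the full range, its two halves and its three thirds for each parameter. The sketch below enumerates them; the function names and the regular sampling scheme are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def sub_ranges(lo, hi):
    """Full range plus its halves and thirds: 1 + 2 + 3 = 6 sub-ranges."""
    out = []
    for parts in (1, 2, 3):
        for i in range(parts):
            a = lo + (hi - lo) * Fraction(i, parts)
            b = lo + (hi - lo) * Fraction(i + 1, parts)
            out.append((a, b))
    return out


def sample(lo, hi, n=10):
    """n regularly spaced values across [lo, hi] (sampling scheme assumed)."""
    return [float(lo + (hi - lo) * Fraction(i, n - 1)) for i in range(n)]


c_ranges = sub_ranges(Fraction(0), Fraction(24))     # c' in kN m^-2
phi_ranges = sub_ranges(Fraction(21), Fraction(45))  # phi' in degrees

sub_spaces = list(product(c_ranges, phi_ranges))
corners = {(c, p) for (c0, c1), (p0, p1) in sub_spaces
           for c in (c0, c1) for p in (p0, p1)}

print(len(sub_spaces), len(corners))  # prints "36 25"
```

Exact fractions are used so that boundaries shared between halves and thirds compare equal; each parameter then has five distinct boundary values, giving 5 × 5 corner points.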

Test C explores the sensitivity of *SFSI* and the associated model performance to *K*_{s} and the initial depth of the water table *d*_{i} (m). We introduce a two-dimensional parameter space constrained by lower boundaries of *K*_{s} = 10^{−7} m s^{−1} and *d*_{i} = 0 m and upper boundaries of *K*_{s} = 10^{−4} m s^{−1} and *d*_{i} = 3 m (Fig. 3b; Table 4). The ranges of values used are based on the works of Saxton and Rawls (2006) and Guimarães et al. (2003). We set *γ*_{s} = 16 kN m^{−3}, *θ*_{s} = 40 vol.%, *θ*_{r} = 5 vol.%, *c′* = 4.5 kN m^{−2}, *φ′* = 45° and *d* = 3 m to constant values. The choice of these values is supported by data from Guimarães et al. (2003) and Hurtado Espinoza (2010). We further assume constant values of diffusivity (*D* = 200*K*_{s}; Park et al. 2013) and initial infiltration rate (*I*_{0} = 1.3 × 10^{−6} m s^{−1}; Conti 2012).

In a way analogous to the geotechnical parameters, the ranges of both *K*_{s} and *d*_{i} are (i) considered in their entire extent, (ii) subdivided into two sub-ranges of equal extent and (iii) subdivided into three sub-ranges of equal extent, resulting in 36 partly overlapping parameter sub-spaces with 25 corner points. *SFSI* is computed for each parameter sub-space, with five sampled parameters in each dimension. The landslide inventory used for evaluation is ORA.
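The sub-range boundaries reported for *K*_{s} in the results (10^{−5.5} and 10^{−5} m s^{−1}) suggest that the range is subdivided in log10 space. The sketch below works under that reading; the regular spacing of the five samples within each sub-range is an additional assumption, as are the function names.

```python
def sub_range_exponents(exp_lo, exp_hi):
    """Exponent bounds of the full range, its halves and its thirds."""
    width = exp_hi - exp_lo
    return [(exp_lo + width * i / parts, exp_lo + width * (i + 1) / parts)
            for parts in (1, 2, 3) for i in range(parts)]


def sample_ks(exp_lo, exp_hi, n=5):
    """n values of K_s (m s^-1), regularly spaced in log10 (assumed scheme)."""
    step = (exp_hi - exp_lo) / (n - 1)
    return [10.0 ** (exp_lo + i * step) for i in range(n)]


ks_exponents = sub_range_exponents(-7.0, -4.0)
upper_third = ks_exponents[-1]                  # 10^-5 to 10^-4 m s^-1
ks_values = sample_ks(*upper_third)
diffusivity = [200.0 * ks for ks in ks_values]  # D = 200 K_s
```

The same six sub-ranges apply to *d*_{i} on a linear scale from 0 to 3 m, so the 36 sub-spaces arise exactly as for the geotechnical parameters.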

This procedure is repeated for four combinations of rainfall duration and hydrograph shape (Table 3). We assume rainfall durations of 6 and 10 h and a total rainfall amount derived from the measurements at the Jacarepaguá and Boa Vista stations on 13 and 14 February 1996 (Conti 2012). The Thiessen method is applied for estimating the precipitation in the catchment, and 20% is deducted for interception (Coelho Netto 2005). The total rainfall considered for the analysis is 144 mm in all the scenarios C1–C4.
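The intensities of the four rainfall scenarios follow from the total of 144 mm and the assumed durations. A short sketch, with hypothetical function names: a rectangular hydrograph spreads the total evenly, whereas a triangular one with a central peak reaches twice the mean intensity.

```python
def rectangular_intensity(total_mm, duration_h):
    """Constant intensity (mm/h) of a rectangular hydrograph."""
    return total_mm / duration_h


def triangular_peak_intensity(total_mm, duration_h):
    """Peak intensity (mm/h) of a triangular hydrograph: area = base * peak / 2."""
    return 2.0 * total_mm / duration_h


TOTAL_MM = 144.0
scenarios = {
    "C1": rectangular_intensity(TOTAL_MM, 6.0),
    "C2": rectangular_intensity(TOTAL_MM, 10.0),
    "C3": triangular_peak_intensity(TOTAL_MM, 6.0),
    "C4": triangular_peak_intensity(TOTAL_MM, 10.0),
}
```

The resulting values (24, 14.4, 48 and 28.8 mm/h) match the intensities listed for C1–C4 in Table 3.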

In test D, we apply the statistical model introduced in the “Statistical model” section for the purpose of comparison (Table 3). *f*_{C} is derived for one of the two catchments. *SFSI* is then computed for the other catchment and evaluated against the corresponding ORA. The entire procedure is repeated in the reverse way, so that a clear separation between the model development and model evaluation areas is ensured.

## Results

### Tests A and B: geotechnical parameterization

Figure 5 summarizes the model performance (*AUROC*) and conservativeness (*FoC*) obtained in tests A1–A4. Assuming a constant soil depth, the model performs significantly better when considering only the ORA (test A2; *AUROC* ≤ 0.741; Fig. 5b) instead of the entire OIA (test A1; *AUROC* ≤ 0.691; Fig. 5a). This result clearly indicates that the OIA is unsuitable as reference for evaluation, and an appropriate inventory sub-setting is essential. Focusing on Fig. 5b, we note that the model performance in terms of *AUROC* is insensitive to the variation of the geotechnical parameterization within much of the tested ranges. In particular, the sub-spaces along a diagonal line from medium-high values of *c*′ and low values of *φ*′ to low values of *c*′ and high values of *φ*′ display almost identical *AUROC* values to the entire parameter space and to those sub-spaces including broad ranges of *c*′ or broad ranges of *φ*′ with medium-low values of *c*′. Only those sub-ranges limited to high values of *c*′ or low values of *c*′ and *φ*′ yield significantly lower *AUROC* values. These sub-ranges result in poorly patterned, relatively non-conservative and extremely conservative predictions, i.e. they display very low and very high *FoC* values, respectively. In general, the model results are very conservative, indicated by *FoC* ≫ 1. At a lower level of *AUROC* (and a lower level of *FoC* caused by a higher number of OP pixels), similar patterns are observed in Fig. 5a.

Varying *d* as a function of the topographic wetness index exerts contrasting effects on the patterns of *AUROC*, depending on whether the OIA or the ORA is used as reference. With the ORA as reference (Test A4; Fig. 5d), the sub-spaces with low values of *c*′ perform comparably to test A2 (Fig. 5b). This is not surprising as the influence of *d* on *FOS* increases with *c*′ (with *c*′ = 0, *d* has no influence). However, *AUROC* and also *FoC* decrease significantly with increasing *c*′, resulting in a very poor performance associated with those sub-spaces with high *c*′, and a reduced performance associated with those sub-spaces with broad ranges of *c*′, compared to Fig. 5b. This trend clearly indicates that most ORA pixels spatially coincide with areas of relatively low topographic wetness index and therefore low values of *d* (Table 3), resulting in high values of *FOS* and low values of *SFSI* in cohesive soils.

The reverse effect occurs when using the entire OIA as reference (test A3; Fig. 5c): many pixels in the lower portions of the landslide polygons coincide with high values of the topographic wetness index. Consequently, *d* and the resulting values of *SFSI* are comparatively high for many of the OP pixels, resulting in an improved model performance, compared to the tests A1–A3 (*AUROC* ≤ 0.742; Fig. 5b). However, since most of the lower parts of the landslide polygons most likely do not represent release areas, the increased performance represents an artefact of inappropriate assumptions rather than an indicator of model success.

As the entire parameter space does not perform substantially worse in terms of *AUROC* than do some of the sub-spaces, there is no basis to support the choice of a particular sub-space in this specific case. The parameter values used and optimized by Guimarães et al. (2003) are mostly located within the parameter sub-spaces with the higher values of *AUROC*, indicating a certain plausibility of the results (Fig. 5b). Figure 6a shows the spatial patterns of *SFSI* derived in the tests A1 and A2 with the full parameter space of *c*′ and *φ*′. We note that the results of those tests are similar in terms of *SFSI*, as only the reference information for validation is varied. The same is true for the *SFSI* maps derived through the tests A3 and A4 (Fig. 6b).

The spatial patterns of *SFSI* derived with the sliding surface model of r.slope.stability (test B) are illustrated in Fig. 6c. Applying the full parameter space of *c*′ and *φ*′ along with constant soil depth and the ORA as reference, the associated value of *AUROC* is almost identical to the value yielded with the infinite slope stability model (0.735 vs. 0.734 in test A2). However, the results yielded with the sliding surface model are more conservative: *FoC* = 59.5, compared to a value of 48.3 yielded with the infinite slope stability model (Fig. 5b).

### Test C: geohydraulic parameterization

Figure 7 summarizes the performance (*AUROC*) and conservativeness (*FoC*) of the model results for the various parameter sub-spaces of *K*_{s} and *d*_{i}. Firstly, we note that the results are largely insensitive to the four assumptions of rainfall duration and hydrograph shape (C1–C4): the patterns yielded are identical for all four scenarios, even though the numbers vary slightly. Within each scenario, the model performance is highly sensitive to variations of *K*_{s} and *d*_{i}: it peaks at *AUROC* = 0.719–0.724 for the upper sub-range of the hydraulic conductivity (*K*_{s} = 10^{−5}–10^{−4} m s^{−1}) and the lower sub-range of the initial depth of the water table (*d*_{i} = 0–1 m). However, the model performance drops only slightly when the full range of both parameters *K*_{s} and *d*_{i} is applied (*AUROC* = 0.711–0.712). Figure 8 presents the *SFSI* maps produced in test C1 with the full space of *K*_{s} and *d*_{i}. The *SFSI* maps resulting from tests C2, C3 and C4 are very similar to the map resulting from test C1 and are therefore not shown.

Constraining the model input to the lower ranges of hydraulic conductivity or to deeper initial water tables leads to a significant drop in the model performance. Considering *K*_{s} ≤ 10^{–5.5} leads to model failure (*AUROC* = 0.494), independently of the range applied for *d*_{i} and the rainfall scenario. In this case, *FoC* = 3.9 (blue font colour in Fig. 7). As expected, *FoC* is highest for the configurations with high *K*_{s} and shallow *d*_{i} and lowest for the configurations with low *K*_{s} and deep *d*_{i}. Its maximum coincides with the best model performance (*FoC* = 48.0–48.9).

These outcomes reflect the fact that, with *K*_{s} ≤ 10^{–5.5}, too little water propagates through the soil to substantially influence slope stability. The effect is similar with higher values of *K*_{s} if the initial water table is too deep. A shallower initial water table and higher values of *K*_{s} facilitate increased values of *u* over broad parts of the study area and, consequently, lead to less stable slopes (Eq. 2) and higher values of *FoC*. Only combinations of high *K*_{s} and shallow *d*_{i} lead to a sufficient signal to reproduce the observed landslide release patterns with a fair performance. As for tests A and B, all results of test C are very conservative (*FoC* ≫ 1).

### Test D: statistical model

The statistical model yields an *AUROC* value of 0.737 (values of 0.736 and 0.738 for the two catchments) whilst, as prescribed by the approach chosen, *FoC* ≈ 1. The model performance corresponds remarkably well to the performance of the physically based models (tests A2 and B in particular), underlining the fact that the slope angle also strongly dominates the pattern of *SFSI* derived with the physically based models (Fig. 9).
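To illustrate how a slope-only predictor can be scored against a landslide inventory, the following minimal sketch computes *AUROC* through the rank-sum (Mann-Whitney) identity. It is not the procedure used in this study: the function name `auroc` and the toy values are hypothetical, and the slope angle is simply used as the susceptibility score.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: per-pixel susceptibility values (here: slope angle in degrees)
    labels: 1 for observed landslide release pixels, 0 otherwise
    """
    # Sort pixel indices by score and assign average ranks to tied scores.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both landslide and non-landslide pixels")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Hypothetical toy data: slope angles (degrees) and release-area flags.
slope_deg = [5.0, 12.0, 18.0, 25.0, 31.0, 38.0, 42.0, 47.0]
observed = [0, 0, 0, 1, 0, 1, 0, 1]
print(auroc(slope_deg, observed))
```

A value of 0.5 corresponds to a random predictor and 1.0 to a perfect separation of landslide and non-landslide pixels, so a failed model such as the one with *AUROC* = 0.494 is indistinguishable from chance.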

## Discussion

We have demonstrated that the performance of the slope failure susceptibility index *SFSI* derived with physically based models in our study area is conditionally sensitive to variations in the considered spaces of selected geotechnical and geohydraulic input parameters and state variables. Parameter configurations that yield insufficient pattern in terms of simulated landslide vs. non-landslide areas lead to a significantly poorer performance. With regard to the geotechnical information, comparable *AUROC* values are displayed throughout much of the parameter space considered relevant for the study area (Guimarães et al. 2003), except for those sub-spaces with low *c*′ and low *φ*′ (μ_{SFSI} close to 1) and those with high *c*′ and high *φ*′ (μ_{SFSI} close to 0). This constellation underlines a well-known negative relationship between *c*′ and *φ*′. Model performance in terms of *AUROC* responds very sensitively to variations in *K*_{s} and *d*_{i} within the tested ranges but is insensitive to the variations in the rainfall scenarios applied. Whilst we consider the findings for the geotechnical parameters to be broadly valid, those for *K*_{s} and *d*_{i} may strongly depend on the assumed rainfall duration and intensity in relation to the water capacity of the soil. In this sense, the pattern displayed in Fig. 7 might change for different rainfall events.

Our findings suggest that any further parameter optimization efforts in terms of *AUROC* may be unnecessary: the pattern of *SFSI* derived with the entire parameter space performs approximately as well in reproducing the observed landslide areas as the patterns of *SFSI* derived with various sub-spaces do. Applying broad ranges of the key parameters for physically based catchment-scale landslide susceptibility modelling is on the “safe” side as it yields results comparable in quality to those derived with the best-fit narrower ranges. Acknowledging the fact that geotechnical and geohydraulic parameters are spatially highly variable, uncertain and often poorly known, applying a narrow parameter space, or even a single combination of parameters, bears a considerable risk of being off target. The direct effects of the vegetation (not accounted for in the present study) increase the level of uncertainty, particularly in forested areas.
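The idea of working with a parameter space rather than a single parameter combination can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in r.slope.stability: the factor of safety uses the textbook infinite-slope form (which may differ in detail from Eq. 2), the parameter space is sampled on a regular grid, and the unit weight, depth and parameter ranges are hypothetical values chosen only for illustration.

```python
import math


def factor_of_safety(c_eff, phi_deg, slope_deg, depth_m, gamma=18000.0, u=0.0):
    """Textbook infinite-slope factor of safety (assumed form, SI units).

    c_eff: effective cohesion (Pa); phi_deg: effective friction angle (deg)
    depth_m: vertical depth of the sliding surface (m)
    gamma: unit weight of the soil (N m^-3); u: pore water pressure (Pa)
    """
    if slope_deg <= 0.0:
        return math.inf  # no driving shear stress on flat terrain
    beta = math.radians(slope_deg)
    normal = gamma * depth_m * math.cos(beta) ** 2
    shear = gamma * depth_m * math.sin(beta) * math.cos(beta)
    return (c_eff + (normal - u) * math.tan(math.radians(phi_deg))) / shear


def sfsi(slope_deg, depth_m, c_range, phi_range, steps=11, u=0.0):
    """Fraction of sampled (c', phi') combinations yielding FOS < 1."""
    failures = 0
    for i in range(steps):
        c_eff = c_range[0] + (c_range[1] - c_range[0]) * i / (steps - 1)
        for j in range(steps):
            phi = phi_range[0] + (phi_range[1] - phi_range[0]) * j / (steps - 1)
            if factor_of_safety(c_eff, phi, slope_deg, depth_m, u=u) < 1.0:
                failures += 1
    return failures / (steps * steps)


# Hypothetical pixel: 35 degree slope, 2 m soil depth,
# c' between 0 and 10 kPa, phi' between 25 and 45 degrees.
dry = sfsi(35.0, 2.0, (0.0, 10000.0), (25.0, 45.0))
wet = sfsi(35.0, 2.0, (0.0, 10000.0), (25.0, 45.0), u=5000.0)
print(dry, wet)
```

Because a positive pore water pressure lowers the factor of safety for every parameter combination, the index for the wet case can never be smaller than for the dry case, which mirrors the role of *u* described for test C.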

The conservativeness of the result in terms of *FoC* strongly depends on the parameter sub-spaces used as input. μ_{SFSI} is generally much higher than *r*_{OP}, indicating that the model results tend to be very conservative. The ideal result should correspond to *FoC* = 1. Theoretically, this could be achieved by increasing the upper thresholds of the geotechnical parameters, i.e. by broadening the parameter spaces considered. However, substantially higher parameter thresholds are not realistic for the soil materials involved. We believe that the key to bringing μ_{SFSI} in line with *r*_{OP} lies in appropriately capturing the fine-scale spatial variation of the geotechnical parameters: sliding surfaces most likely coincide spatially with geotechnically susceptible areas, layers or interfaces, spaced in a more or less irregular way. We consider it almost impossible to parameterize such patterns in a deterministic way. In this context, we note that in Figs. 6 and 8, some landslides coincide spatially with areas of low *SFSI*. Such mispredictions are most probably related to localized patches of low soil strength, increased water input or increased hydraulic conductivity, or to the effects of the vegetation. Whilst the variation in the local slope angle explains much of the pattern of *SFSI*, the residual part is most likely explained by fine-scale spatial variations of the soil and, possibly, the vegetation.
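The relation between μ_{SFSI}, *r*_{OP} and *FoC* can be made concrete with a short sketch. It assumes that *FoC* is the ratio of the mean *SFSI* over all pixels to the observed fraction of landslide release pixels, which is consistent with the ideal value of *FoC* = 1 stated above. The definition is not restated in this section, so the function below and its toy input should be read as hypothetical.

```python
def factor_of_conservativeness(sfsi_values, observed):
    """Assumed definition: FoC = mean SFSI / observed positive rate.

    sfsi_values: per-pixel SFSI in the range 0..1
    observed: 1 for observed landslide release pixels, 0 otherwise
    """
    if len(sfsi_values) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sfsi = sum(sfsi_values) / len(sfsi_values)
    r_op = sum(observed) / len(observed)
    if r_op == 0.0:
        raise ValueError("no observed landslide pixels")
    return mean_sfsi / r_op


# Hypothetical toy raster: mean SFSI of 0.5, one landslide pixel in four.
print(factor_of_conservativeness([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 0]))
```

Under this assumed definition, values well above 1 indicate that the model flags a much larger share of the area as potentially unstable than was observed to fail.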

*c*′ and *φ*′ (Mergili et al. 2015), *K*_{s} (Mesquita et al. 2002, 2007; Mesquita and Moraes 2004) or soil depth (McBratney et al. 2003; Frohn and Müller 2015). More precisely, at this time, there are no means to appropriately regionalize the key input parameters of slope stability models. We have demonstrated that ad hoc assumptions of parameter variations (soil depth) may result in a decreased model performance or, in combination with inappropriate reference data (an inventory including transit and deposition areas), may give a false impression of improved model performance. Notwithstanding any possible future progress in this field, we highlight two strategies to deal with the challenges identified:

1. Accepting the limitations described and interpreting the outcomes of physically based landslide susceptibility models in a relative way. The *SFSI* as suggested in the present work is one possibility to do so; other ways were introduced earlier with SHALSTAB (Montgomery and Dietrich 1994) or SINMAP (Pack et al. 1998). In principle, all slope stability software tools can be used to derive relative indices from multiple results.
2. Using probabilistic approaches to deal with the spatial parameter variation, i.e. resulting in the identification of the possible size of weak regions (Fan et al. 2016). Fibre bundle models may then be used to simulate the associated patterns of slope failures (Cohen et al. 2009). However, this method also relies on various assumptions of spatial parameter variability.

One may argue that statistical models, which employ a black box in relating predictor layers to a landslide inventory, would also do the job of producing relative landslide susceptibility maps. In fact, those approaches may be considered a more honest strategy than physically based calculations with uncertain or even unknown geotechnical and geohydraulic parameters. We have shown that even a simplistic statistical model, employing the local slope as the only predictor layer, performs comparably to the more complex physically based models used. This finding reflects the dominant effect of the slope also in the physically based models, as long as the majority of the other key parameters are assumed constant in space. It recalls the statement of Box (1976) that simple and evocative models, rather than over-elaborated, over-parameterized ones, push science forward. However, it is clear that statistical models would hardly do the work for dynamic analyses such as, with the data usually available, predicting the slope stability response to a particular rainfall event.

## Conclusions

We have tested the sensitivity of catchment-scale slope stability model results to variations in the geotechnical and geohydraulic parameters. In contrast to many previous studies, we have focused on parameter spaces instead of combinations of discrete parameter values. The results produced with broad parameter sub-spaces show comparable levels of performance in terms of *AUROC* to those produced with narrow sub-spaces, even though the results vary considerably in terms of *FoC*. In general, the *SFSI* maps are classified as very conservative (*FoC* ≫ 1). It therefore seems unnecessary to optimize the parameters tested by means of statistical procedures.

Considering the uncertainty inherent in all geotechnical and geohydraulic data, and the impossibility of capturing the spatial distribution of the parameters in sufficient detail by means of laboratory tests, we conclude that landslide susceptibility maps yielded by catchment-scale physically based models should not be interpreted in absolute terms. We suggest that efforts to develop better strategies for dealing with the uncertainties in the spatial variation of the key parameters should be given priority in future slope stability modelling efforts. Even though we consider it likely that many of our results are valid for most types of landslides or geological settings, more tests covering a broad spectrum of situations would be necessary to confirm all statements.

## Acknowledgements

Open access funding provided by University of Natural Resources and Life Sciences Vienna (BOKU). We are grateful to Prof. Renato Guimarães for kindly providing the DEM and the landslide inventory. Further, we thank the National Department of Transportation Infrastructure (DNIT, Brazil) for enabling this work and Capes (Coordination of Higher Education Training, Brazil) for their generous support.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.