Effects of soil heterogeneity on susceptibility of shallow landslides

Uncertainties in parameters of landslide susceptibility models often hinder them from providing accurate spatial and temporal predictions of landslide occurrences. Substantial contribution to the uncertainties in landslide assessment originates from spatially variable geotechnical and hydrological parameters. These input parameters may often vary significantly through space, even within the same geological deposit, and there is a need to quantify the effects of the uncertainties in these parameters. This study addresses this issue with a new three-dimensional probabilistic landslide susceptibility model. The spatial variability of the model parameters is modeled with the random field approach and coupled with the Monte Carlo method to propagate uncertainties from the model parameters to landslide predictions (i.e., factor of safety). The resulting uncertainties in landslide predictions allow the effects of spatial variability in the input parameters to be quantified. The performance of the proposed model in capturing the effect of spatial variability and predicting landslide occurrence has been compared with a conventional physical-based landslide susceptibility model that does not account for three-dimensional effects on slope stability. The results indicate that the proposed model has better performance in landslide prediction with higher accuracy and precision than the conventional model. The novelty of this study is illustrating the effects of the soil heterogeneity on the susceptibility of shallow landslides, which was made possible by the development of a three-dimensional slope stability model that was coupled with random field model and the Monte Carlo method.


Introduction
Landslides are one of the major hazards in the world causing adverse consequences to society, such as fatalities (e.g., Haque et al. 2016;Petley 2012), injuries to people, economical losses (e.g., Nadim et al. 2006), and environmental damages. Among the different types of landslides, shallow landslides are one of the most detrimental types due to their high frequency on hillsides, and the capacity to evolve in destructive debris flows. Shallow landslides can be initiated by extreme events of rainfall, snowmelt, or a combination of rainfall and snowmelt.
In the landslide hazard and susceptibility mapping, physicalbased models are being increasingly employed as the hydrological and geotechnical aspects of the landslide can be explicitly considered. A wide range of physical-based landslide susceptibility models have been developed ranging from local (i.e., single slope to 10 km 2 ) to national scales (i.e., hundreds to thousands of km 2 ). Some of the most commonly used models include the distributed Shallow Landslide Analysis Model (dSLAM) (Wu and Sidle 1995), the Shallow Slope Stability Model (SHALSTAB) (Montgomery and Dietrich 1994), the Stability Index Mapping (SINMAP) (Pack et al. 2005), the Shallow Landslides Instability Prediction (SLIP) (Montrasio and Valentino 2008), GEOtop-FS (Simoni et al. 2008), the Transient Rainfall Infiltration and Grid-Based Regional Slope Stability (TRI-GRS) (Baum et al. 2002(Baum et al. , 2008 model, TRIGRS-P (Raia et al. 2014), the High Resolution Slope Stability Simulator (HIRESS) (Rossi et al. 2013), and the r.rotstab (Mergili et al. 2014b, a).
Significant uncertainty in the geotechnical and hydrological parameters of these models has been reported in the literature (e.g., Burton et al. 1998;Mergili et al. 2014b;Arnone et al. 2016). The uncertainties represent one of the major challenges in the accurate spatial and temporal prediction of rainfall-induced shallow landslides. The uncertainties originate often due to the lack of field and laboratory investigations and the inherent natural variability linked to the parameters (e.g., Melchiorre and Frattini 2012). Avoiding quantification of uncertainty by employing a set of deterministic values for the model parameters might result in unrealistic or too conservative estimates (e.g., Raia et al. 2014). In addition to the uncertainties in the geotechnical and hydrological parameters, uncertainties can arise from different sources including initial hydrological conditions (e.g., Grelle et al. 2014). Bossi et al. (2019) investigated the uncertainties in the slope stability modeling due to soil stratigraphy heterogeneity. The results show that soil stratigraphy heterogeneity has a significant effect on the safety of slopes. The uncertainties originating from GIS data sources (Sandric et al. 2019), raster resolution, and sample size (Shirzadi et al. 2019) have been also reported to be significant in landslide susceptibility assessment.
The uncertainties in the spatially variable model parameters can be statistically modeled with random fields (Fenton and Griffiths 2008). Random fields model spatially variable parameter by assigning a probability density function (pdf) to statistically describe the uncertainties in the parameter and using a covariance function to account for spatial dependence of the parameter. Variability of landslide model parameters (e.g., geotechnical and hydrological parameters) has been mainly incorporated by considering different homogeneous geological units over the terrain (Salciarini et al. 2006;Baum et al. 2010;Melchiorre and Frattini 2012;Schilirò et al. 2021) without explicitly modeling spatial variability within a single geological unit. Additionally, some of the abovementioned physical-based models, such as SINMAP, GEOtop-FS, HIRESSS, and TRIGRS-P, have the capacity to model the variability of the model parameters with the single random variable approach, where the parameters are uncertain but homogeneous within a single geological unit, thus not accounting for spatial variability (Hammond et al. 1992;Haneberg 2004;Raia et al. 2014;Arnone et al. 2016). Spatial variability of physical-based model parameters such as cohesion as a geotechnical parameter has been investigated by extensive field measurements in Burton et al. (1998). The results of the study reveal the significance of the spatial variability on the landslide modeling. Additionally, Fenton and Griffiths (2008) investigated the effects of spatial variability and showed that nonconservative results are obtained without accounting for the spatial variability.
The need of accounting for the spatial variability of the geotechnical and hydrological parameters on the susceptibility of landslides has been addressed by many researchers (e.g., Burton et al. 1998;Mergili et al. 2014a, b;Arnone et al. 2016). In the study of Lizárraga and Buscarnera (2020), spatial variability of hydraulic conductivity, K S , has been accounted in regional modeling of shallow landslide. The physical-based model described in Lizárraga and Buscarnera (2019) has been combined with random field approach and Monte Carlo realizations to account for the spatial variability of K S . The results indicate that accounting for spatially varying K S affects the shallow landslide susceptibility assessment significantly. However, there is yet to make an attempt to study the spatial variability of geotechnical parameters on shallow landslide susceptibility using physical-based models. Accounting the variability of model parameters without spatial dependence (e.g., Rossi et al. 2013;Raia et al. 2014), as homogeneous through space, would result in an overestimated factor of safety, F S , as the failure might occur through weak zones resulting in lower F S in case of heterogeneity. This work aims to evaluate the effects of spatial variable model parameters on the estimates of susceptibility of shallow landslides with the development of a three-dimensional landslide susceptibility model accounting for the variability of model parameters over spatial extent.
The study is presented in such a way that the proposed 3D soil column-based limit equilibrium model, capable of modeling the spatial variability over the problem domain, is introduced first. Details on the hydrological model, the slope stability model, and the statistical methods integrated into the model are provided. This is followed by the validation of the slope stability model. Additionally, the effects of the spatial variability of the model parameters on the susceptibility of shallow landslides will be introduced, and the capacity of the proposed model to capture these effects will be validated on a simplified problem by performing an extensive study in a finite element method software. Finally, the model will be tested on a case study in an area prone to shallow landslides, and the results will be provided.

The 3-Dimensional Probabilistic Landslide Susceptibility (3DPLS) model
The 3-Dimensional Probabilistic Landslide Susceptibility (3DPLS) model is a Python code developed for landslide susceptibility assessment. The 3DPLS model evaluates the landslide susceptibility on a local to a regional scale (i.e., single slope to 10 km 2 ) and allows for the effects of variability of the model parameters on slope stability to be accounted for.
The 3DPLS model couples the hydrological and slope stability models. The hydrological model calculates the transient pore pressure changes due to rainfall infiltration using Iverson's linearized solution of the Richards equation (Iverson 2000) assuming tension saturation. The slope stability model calculates the F S by utilizing the extension of Bishop's simplified method of slope stability analysis (Bishop 1955) to three dimensions, proposed in the study of Hungr (1987). The 3DPLS model requires topographic data (e.g., DEM, slope, groundwater depth, depth to bedrock, geological zones), hydrological parameters (e.g., steady background infiltration rate, permeability coefficient, diffusivity), geotechnical parameters (e.g., soil unit weight, cohesion, friction angle), and rainfall data.
The model can have a grid containing hundreds or thousands of cells depending on the problem size and refinement. The smallest unit of the grid is called a grid cell having its own model parameters. The developed model calculates the F S of an ellipsoidal sliding surface consisting of grid cells over a discretized problem domain, while equivalent cell-based models, as the name states, perform the calculations per each cell individually. The model generates a large number of ellipsoidal sliding surfaces centered at each grid cell over the terrain and calculates the F S of all sliding surfaces. After the calculation, each cell is involved in several ellipsoidal sliding surfaces with different F S values. Among all F S values, the minimum F S representing the critical ellipsoidal sliding surface is assigned to each cell. Each simulation results in a F S map over the terrain. After a number simulation, the 3DPLS model provides the F S map of each simulation, the mean F S , F S , and the probability of failure, P f , of each cell.

The 3D slope stability model
The model assumes an ellipsoidal sliding surface, as shown in Fig. 1, and calculates the corresponding F S . The ellipsoidal sliding surface is characterized by the lengths of three principal semi-axes, a e , b e , and c e and the inclination of the ellipsoid in the direction of motion, , aspect of the motion, , and the geographical coordinates of the center with a perpendicular offset of the ellipsoid center, z e , above the ground, as presented in Fig. 2. The a e is the principal semi-axis in the direction of steepest slope, i.e., in the sliding direction, and b e is the other principal semi-axis perpendicular to the direction of the steepest slope. c e is the third principal semi-axis perpendicular to the other two principal semi-axes.
After an ellipsoidal sliding surface is specified, the depth to the sliding surface is calculated for each cell inside the ellipsoidal sliding zone. Then, the 3DPLS model truncates the ellipsoidal sliding surface at the cells where depth to the sliding surface is greater than depth to bedrock, i.e., thickness of the cell.
The lengths of three principal semi-axes, a e , b e , and c e , are defined by the user considering the characteristics of the shallow landslides that occurred over the study area. The direction of the motion, , can be obtained by using the aspect of the cells inside the sliding zone or using reference points inside the ellipsoidal zone (e.g., Mergili et al. 2014b), but it is assigned by the user in the current version of the 3DPLS model. The inclination of the ellipsoid in the direction of motion, , is calculated by taking the average slope over a rectangular zone with the dimensions of 2a e × 2b e located at the center of the ellipsoid. In the 3DPLS model, when the number of cells in the slope stability calculation for a single ellipsoid is less than a given threshold value, the model can sub-discretize the cells by halving the cell size until the threshold value is reached. To have a reasonable ellipsoidal sliding surface in the analysis, the number of cells inside the ellipsoidal zone should be sufficiently high. Otherwise, the generated sliding surface represents a combination of discrete planes. There is a trade-off between the threshold cell number and runtime. Therefore, a reasonable threshold should be selected by investigating the effect of cell number on the F S . The slope stability model of the 3DPLS model employs the 3D extension of Bishop's simplified method of slope stability analysis (Bishop 1955), proposed by Hungr (1987). This extended method, Bishop 3D, is a three-dimensional soil column-based limit equilibrium method. The Bishop 3D relies on the same assumptions of Bishop's simplified model. These assumptions are (i) that vertical shear forces acting on the vertical faces of the soil column (both longitudinal and lateral) can be ignored and (ii) that the vertical equilibrium of forces for individual soil columns and the moment equilibrium of the entire system of soil columns are sufficient conditions for the identification of unknowns such as the normal force and shear force at the base of soil columns and F S .
After placing the ellipsoidal sliding surface, the depth of the ellipsoidal sliding surface, d e , is calculated using the coordinate and elevation data of each cell. The thickness of the sliding for a given cell, d , is equal to the d e if the maximum depth to bedrock, Z max , is greater than d e . Otherwise, the sliding surface is truncated at Z max and d = Z max . Then, the total weight of the soil column is calculated as W = d(Δ x Δ y ), where Δ x and Δ y are the lateral soil column dimensions. In the case of truncation, the same aspect of the ellipsoidal sliding surface, i.e., , and the slope angle of the truncated cell are assigned to the sliding base of the truncated cell for simplicity.
Considering a single soil column shown in Fig. 1b, the vertical force equilibrium equation can be derived as follows: where N z and S z are the vertical components of the total normal force and shear force at the base, A is the area of the base slip surface, c is the cohesion, is the friction angle, u is the pore pressure at the base center, z is the angle between normal force and vertical axis, and is the inclination of soil column base in direction of motion. Then, normal force, N , can be derived as follows: where The area of the soil column base can be calculated using Δ x and Δ y , and apparent dip angles, x and y , as follows: The angle between normal force, N , and vertical axis can be calculated by the following equation: 1∕2 cos x cos( y ) For the entire sliding volume divided into j number of soil columns, the moment equilibrium equation can be expressed as follows: Then, the F S can be derived by using the equilibrium equation (Eq. 1) and normal force equation (Eq. 2): The above equation is implicit in F S , and this requires one to solve the equation iteratively for F S . The details can be seen in Hungr (1987).

The hydrological model
In the 3DPLS model, Iverson's linearized solution of the Richard equation for tension-saturated soils (Iverson 2000) is employed in the hydrological model to calculate pore pressure changes. The time-dependent pressure head, (t, y) , at a given depth, y , and time, t , composed of a long-term head response (steady component), 0 (y) , and a short-term head response (transient component), 1 t, y , is as follows: The steady component, 0 (y) , is defined as a linear function of y with respect to the initial groundwater depth, H W , assuming slopeparallel flow with a background infiltration rate, I Z : where is a constant calculated as follows: where is the slope angle in degrees, K s is the saturated permeability. The transient component is based on the reduced form of the Richard equation assuming tension saturation with a saturated coefficient of permeability: where D 0 is the maximum diffusivity observed when the soil becomes saturated. Solving the above equation with the boundary conditions defined in Iverson (2000) results in: where I y is the precipitation (surface flux), t * is normalized time, T * is the normalized duration of the precipitation, and R is the response function. These parameters are determined as follows: where D = 4D 0 cos 2 is an effective hydraulic diffusivity and erfc is the complementary error function. As the abovementioned model may result in unrealistic pressure heads at shallow depths, calculated pressure heads are restricted by specifying that it cannot exceed the -line calculated by = y as stated in Iverson (2000). The details and the limitations of Iverson's linearized solution of the Richard equation can be seen in Iverson (2000).

The statistical framework
Soils exhibit heterogeneity and anisotropy in space due to varying deposition and formation processes in geological history. Composition, strength parameters, and physical and chemical properties can vary for the same soil type at the same site because of the inherent variability resulting from the randomness of geological processes. The soil properties display variability through space, and a continuous random field model is often used to model it. The spatial variability of the model parameter is defined by a covariance function to account for spatial dependence of the parameter and a pdf to describe the uncertainties in the parameter. In the 3DPLS model, the variability of the model parameters is modeled in the horizontal directions with two-dimensional Gaussian and lognormal random fields. The variability in the vertical direction is not modeled for simplicity. As it is relatively straightforward to transform a lognormal random field to a Gaussian random field, the Gaussian random field is only explained here. A Gaussian random field of a model parameter, X = X(l 1 ), … , X(l k ) T , where the set of values, l = l 1 , … , l k T , represent the spatial coordinates, defines the joint pdf of X as a multivariate normal pdf, f X (X): where is the mean values of the random field, = (l 1 ), … , (l k ) T , C is the covariance matrix, and k is the number of elements in X. The elements of the covariance matrix, C , are expresses as: where (l m ) is the standard deviation of the parameter at coordinate l m and l m , l n is the correlation coefficient between parameters at coordinates l m and l n . Depending on the problem size, either the two-dimensional ellipsoidal autocorrelation function (Eq. 18) or the two-dimensional separable Markov autocorrelation functions (Eq. 19) has been employed: (17) C mn l m , l n = 2 l m , m = n l m l n l m , l n , m ≠ n where x and y are separation distances between coordinates l m and l n and l x and l y are spatial correlation lengths in the direction of x and y, respectively. To generate random fields, the covariance matrix decomposition method (Fenton and Griffiths 2008) with the ellipsoidal autocorrelation function is utilized for problems with a low number of elements. The soil property, X , at k locations can be obtained as: where L k×k is the lower triangle matrix obtained by the Cholesky decomposition method and U k×1 is a vector of Gaussian random variables with zero mean and unit variance.
The computational time and required memory space to generate correlation matrices and Cholesky decomposition can be demanding as the number of elements in the problem increases (e.g., 50 × 50 ). To overcome this issue, the stepwise covariance matrix decomposition method proposed by Li et al. (2019) is implemented in two-dimensional random field generation. The stepwise matrix decomposition method uses a separable autocorrelation function (Eq. 19) which allows the method to disassemble the correlation matrix, R , into the number of dimensions used in the problem: where R x and R y are the one-dimensional correlation matrices in x and y directions and ⊗ is the Kronecker product. Similarly, the lower triangle matrix, L , can be written as: where L x and L y are the lower triangle matrices in x and y directions satisfying L x L T x = R x and L y L T y = R y . Then, using the matrix-array multiplication operations, Eq. 21 can be rewritten as: The method reduces the computational time and required memory space significantly. The details of the stepwise covariance decomposition method can be seen in Li et al. (2019). In the present study, the stepwise covariance matrix decomposition method was implemented when the number of the elements of the problem exceeds a limit causing unfeasible run time (e.g., more than 500 elements).
The 3DPLS model propagates uncertainties from the spatially variable model parameters to the model output in terms of F S . The 3DPLS model is relatively flexible and can be coupled with a wide range of methods for uncertainty quantification (e.g., Monte Carlo, Importance Sampling). For simplicity, the 3DPLS model was coupled with the Monte Carlo method in this study due to its robustness and straightforward implementation. The 3DPLS model conducts a series of slope stability analyses, in which the values of model parameters are randomly selected based on realizations from statistical models of spatially variable properties (e.g., random fields). The result of the Monte Carlo analysis is a set of F S values that are statistically analyzed to evaluate the resulting uncertainty in model predictions by calculating modes (e.g., mean, standard deviation) or estimating failure probability. The probability of failure, P f , for each cell individually is calculated as follows: where N S is the number of simulations, F S,i is the factor of safety of i th simulation, and I is the indicator function providing 1 in case of failure, when F S,i ≤ 1 , otherwise 0.

Validation of the slope stability model
The performance of the slope stability model on evaluating the F S of slopes has been tested using three validation problems, all of which feature homogeneous model parameters in space. The first validation problem (Fig. 3a, d) involves a spherical sliding surface in a purely cohesive slope with a 2:1 inclination. In Fig. 3a, R is the radius of the sphere centered 0.5 R above the ground surface. The 3D F S was calculated as 1.402 by using the closed-form solution proposed by Baligh and Azzouz (1975) and Gens et al. (1988). Using the developed 3DPLS model, the computed 3D F S ranged from 1.386 to 1.471 depending on the refinement of the sliding volume. In Table 1, the F S values reported for the validation problem by various researchers are provided for comparison.
The second validation study was conducted on a spherical sliding surface in a c � − � slope (Fig. 3b, e) reported in the study of Hungr et al. (1989). In the 3DPLS model, the center of the ellipsoidal sliding surface is introduced with an offset perpendicular to the ground surface. Therefore, it is not possible to introduce the center of the sliding surface shown in Fig. 3b. With a small modification to the proposed model, the center could be defined, and the 3D F S was calculated as 1.207 using a cell size of 0.01 m. In Table 1, the corresponding 3D F S values reported in the literature are given.
The third validation problem is the example from Fredlund and Krahn (1977), shown in Fig. 3c and f. The problem has been used by many researchers (Xing 1988;Hungr et al. 1989;Lam and Fredlund 1993;Huang et al. 2002;Xie et al. 2006;Griffiths and Marquez 2007;Mergili et al. 2014b) to verify their proposed 3D models. This validation problem has been investigated for two sliding surfaces shown in Fig. 3f, spherical sliding surface in pure homogeneous drained material (Slip-1) and composite sliding surface due to the existence of a weak layer (Slip-2). Slip-1 without water table and Slip-2 with/without water table were analyzed by using the 3DPLS model. The pore pressures were calculated assuming hydrostatic conditions for the soil columns with a sliding surface below the water table. Negative pore pressure, i.e., suction, was ignored. The composite sliding surface, Slip-2, was obtained by assigning the maximum depth at the level of (25) the weak layer. Then, the soil columns were truncated when the sliding spherical surface is below the weak layer, and the slope is assigned as zero to those truncated cells instead of surface slopes as originally implemented in the 3DPLS model. In the current study, the 3D F S yielded 2.276 for Slip-1 without water  Table 1, it can be seen that the 3DPLS model results are within the range reported in the literature for the abovementioned three relatively simple validation problems with slight differences. These differences are attributed to the assumption in the models and the level of discretization.

Validation of the model capacity to capture the effects of spatial variability
In this study, the capacity of the 3DPLS model to capture the effect of spatial variability on slope stability has been validated with an extensive study in the finite element method (FEM) based program, PLAXIS. Throughout this paper, the 3DPLS model will be compared with its equivalent cell-based model utilizing the same hydrological model and the statistical framework as the proposed model does, but employing widely used infinite slope stability method on a cellby-cell basis (e.g., Griffiths et al. 2011). In the infinite slope stability method, the F S is calculated at maximum depth for each cell using the pore pressure values obtained from the hydrological model. Additionally, for the cells with low slope angle, i.e., flat zones, the

Reference
Validation problem -1 Validation problem -2 Validation problem -3 a Slip-1 (dry) Slip-2 (dry) Slip-2 (wet) Baligh andAzzouz (1975) Gens et al. (1988) 1.402 ---- Xing (1988) --2.122 1.548 1.441 Hungr et al. (1989) -1.23 -1.62 1.54 Lam and Fredlund (1993)  F S values are limited at 10.0 to prevent possible confusion of the results. The analyses with the infinite slope stability method will be called the "Cell-based model" hereafter. A simplified case, with a domain of 100 × 100m with 25° inclination, was examined. The thickness of the soil is 2 m with a groundwater table at the ground surface. The domain is discretized into a grid of 20 × 20 equally sized cells (cell size of 5 m) as shown in Fig. 4. The parametric study has been conducted for both drained (effective stress-based) and undrained (total stress-based) cases with three variability levels: low, moderate, and high. Table 0 shows the parameters used for the validation problem in both drained and undrained cases. In the analyses, the saturated unit weight of soil and the unit weight of water were employed as 20 kPa and 10 kPa, respectively. In the drained case, a slope parallel groundwater flow was modeled, while total stress analysis was performed for the undrained case. The saturated permeability was assigned as 1 • 10 −6 m/s. The two-dimensional Gaussian random field model was employed to generate random fields for soil strength parameters, c ′ and ′ in the drained case and S u in the undrained case. Spatial dependence was modeled using the two-dimensional ellipsoidal autocorrelation function (Eq. 18) with a spatial correlation length, l , ranged between 0 and 1000 m. The spatial correlation length of 0 m means no spatial dependence across the study domain. As the spatial correlation length increases, the spatial dependence among the parameters increases. The spatial correlation length of 1000 m is used as an upper limit where the soil approaches homogeneous conditions over the 100 m × 100 m study domain. The spatial correlation length was assumed to be equal in both directions, i.e., l x = l y . Analyses were conducted for a range of coefficient of variation, CoV, values with 1000 random field realizations for each combination of correlation lengths and CoV values. There exists a trade-off between the mesh size and the run time for each simulation in the FEM model. Due to a large number of simulations (1000 Monte Carlo simulations for each correlation length for each variability level for both drained and undrained cases; 48,000 simulations in total), the mesh size was optimized considering the convergence of the F S values and the run time.
In the 3DPLS model, the lengths of three principal semi-axes that define the ellipsoid were assumed to be 20 × 20 × 2m without any offset from the ground surface, i.e., z e = 0 . The aspect of the motion, , was assigned to be 90 • as the sliding was through the downslope direction. The inclination of the ellipsoid in the direction of motion, , was assigned as 25 • as the slope is constant over the problem domain. In the 3DPLS model analyses, the problem domain was extended so that the ellipsoidal sliding surfaces placed close to the boundaries would not be truncated at the sides. Then, the F S values within the original problem domain were employed in the performance assessment. In the Cell-based model, the calculations are carried out per each cell individually, and therefore, no extension was needed.
The landslide initiation process is a complex process including the formation of the initial weak zone and its propagation prior to landsliding. A thin zone of intense shearing, i.e., shear band, starts to propagate as the initial weak zone is not able to resist the driving load. Then, the driving load exceeding the capacity of the initial weak zone is distributed to the surrounding neighboring initially stable zones. If the neighboring zones cannot withstand this additional load transferred from the initial weak zone, the shear band propagates. Then, the propagation of the shear band might stop if the excess driving load is compensated by the surrounding neighboring zones; otherwise, it leads to the landslide initiation with the movement of upper soil on the shear band. In addition to the complex processes of load distribution and shear band propagation, the presence of soil heterogeneity adds another complexity to the shear band propagation. That is, the shear band tends to propagate along the weakest path with low capacity to withstand the additional loading.
Without the consideration of these complex processes involved in the landslide initiation, relating the predictions of the cell-based physical-based models to a single F S representing the whole simplified validation problem, that is called the global factor of safety, F g S , is quite challenging. In the FEM analyses, the load distributions and the propagation processes are explicitly satisfied, and therefore, reliable and consistent F g S can be obtained directly from analyses, i.e., F g S,FEM = F S,FEM . However, the physical-based limit equilibrium models do not consider these complex processes included in the landslide initiation. Thus, approximate approaches are required here to relate the F S values over the problem domain, F Si , i = 1, … , n , to the F g S where n is the number of cells in the discretized problem domain. In Fenton and Griffiths (2008), the harmonic average was empirically demonstrated to be best suited to characterize the  effects of heterogeneity. This was explained by the harmonic average being dominated by the low values similar to the tendency of shear band propagation being along the weak zones.
In the Cell-based model, the F S values are calculated for each cell individually. Therefore, the F g S is assumed to be related to the harmonic average of the F S values of the cells F Si , i = 1, … , n over the problem domain based on the empirical observations (Fenton and Griffiths 2008): In the 3DPLS model, the F S is calculated over an ellipsoidal sliding surface including spatially varying strength parameters inside. The F S of a cell represents the most critical ellipsoidal sliding surface intersecting that cell. Therefore, the F g S is related to the minimum F S , F S,min over the problem domain that represents the most critical ellipsoidal sliding surface with the lowest safety for the entire model: Then, the mean global factor of safety, g , is calculated by averaging the F g S values of 1000 Monte Carlo simulations for comparison of the models. Both deterministic and probabilistic analyses were conducted for the simplified validation problem to validate the capacity of the 3DPLS model to capture the effect of spatial variability on slope stability. In the deterministic analyses, the soil is assumed homogeneous over the problem domain, and the mean values are employed only. In the probabilistic analyses, the shear strength parameters shown in Table 0 are treated as random variables. Table 3 presents the deterministic results of the simplified validation problem for the FEM, Cell-based, and 3DPLS models. The deterministic model results show that the 3DPLS model results in higher F S for both drained and undrained cases compared to the Cell-based model. Figure 5 presents the results of the probabilistic analyses on the effects of spatial variability on the g for drained case in Fig. 5a and b and undrained case in Fig. 5c and d. The FEM model results are compared with the Cell-based model results in Fig. 5a and c and with the 3DPLS model results in Fig. 5b and d. The FEM results indicate that the g are higher where there exists no spatial dependence over the problem domain, i.e., 0 m correlation length. It is less likely that there will be large weak zones due to a lack of spatial dependence. As the correlation length increases up to a value of 50 m, the g decreases significantly for both drained and undrained cases. This is due to the weak zones having a larger spatial extent and low capacity to withstand driving loading. That is, there is a higher probability of having a weak zone which can lead to a local failure and a lower F S . Further increase of correlation length from 50 up to 1000 m causes an increase in the g as the model becomes more homogeneous over space. That is, with high correlation lengths, the F S is no longer dominated by local weak zones; eventually, this leads to high F S values.
In addition to the effect of correlation length, the effect of variability level can be also seen in Fig. 5. The FEM model results show that as the variability level increases from low level to high level, the g decreases except for 0 m correlation length. That is, high variability level results in higher g when the correlation length is zero. This is attributed to the weak zones having neighboring zones with higher shear strength values due to the high variability level. Nevertheless, the main trend is that g decreases with increasing variability level. In addition, it can be observed that the effect of variability level decreases as the correlation length increases.
The comparison of the FEM model and the Cell-based model results in Fig. 5a and c shows that the Cell-based model results are not in agreement with the FEM model results. As the undrained case has only one parameter treated as a variable, the effect of variability level on the g and the increase in the g with the increase in correlation length can be observed. For the drained case, the results do not vary significantly. As the calculations are performed for each cell individually, the effect of weak zones leading to local failure is not captured by the Cell-based model. Therefore, the Cell-based model is not able to capture the effect of spatial variability on the g .
When comparing the FEM model and the 3DPLS model results in Fig. 5b and d, it can be detected that the trends in both methods are similar. The g starts with a high value when the correlation length is 0 m. Then, the increase in correlation length up to 50 m causes local failures to dominate the F g S and leading to lower g . When the correlation length increases from 50 to 1000 m, the parameters over the problem domain become more homogeneous, and local failures stop dominating the F g S . When there is no spatial dependence, i.e., 0 m correlation length, the FEM model results in higher g values for high variability level but not the 3DPLS model. This is attributed to that the 3DPLS model does not explicitly model the landslide initiation process. The effect of variability level, i.e., the decrease in g with the increase of variability level, is the same for both models. Needless to say, there is no perfect fit in the results due to the FEM model involving all processes explicitly, and the 3DPLS model being a simplified soil-column-based limit equilibrium model. Besides, the lengths of three principal semi-axes were assumed to be 20 × 20 × 2m in the 3DPLS model. A better fit can be obtained by changing the dimensions of the ellipsoid. However, the main aim here is to show the capacity of the 3DPLS model to capture the effect of spatial variability on slope stability not the perfect fit in the results.

Case study: "Kvam landslides"
The performance of the 3DPLS model, capable of accounting for the spatial variability of soil parameters in landslide susceptibility assessment, will be tested on the Kvam landslides that took place in 2011. For the comparison purpose, the analyses were also conducted with the Cell-based model utilizing the same hydrological model to calculate the pore pressure build-up due to rainfall infiltration and employing an infinite slope stability model to calculate F S value for each cell individually. In the 3DPLS model, the values of the ellipsoid dimensions were determined by considering the observations of previous landslides. The lengths of three principal semi-axes were selected as 100 × 20 × 2.5m without any offset from the ground surface. The aspect of the motion, , was assigned to be zero considering the aspect of the study area being dominantly in the direction of east. The inclination of the ellipsoid in the direction of motion, , was calculated by taking the average slope over a rectangular zone ( 2a e × 2b e ) located at the center of the ellipsoid.

Study area
Kvam is a village in central southern Norway, situated along the Gudbrandsdalslågen River, within the Gudbrandsdalen Valley. The valley has been re-shaped by glaciers, featuring steep edges covered with glacial deposits, during the Quaternary period. Based on the available geological quaternary map (NGU 2020), the materials present at the terrain surrounding Kvam are classified as moraine, glaciofluvial deposits, fluvial deposits, humus/peat cover, and subcropping bedrock (Fig. 6). The valley is characterized by fluvial deposits at the base and by moraine cover at the hillsides. Above the zone covered by moraine material, there exists sub-cropping bedrock with a thin layer of humus and peat.
The area surrounding the Kvam has high landslide susceptibility as the landslide scars are visible on the hillsides mainly in the moraine type geological unit as shown in Fig. 6. This study will focus on the shallow landslides that occurred following the rainfall event in 2011 ( Fig. 9) causing multiple landslides in the valley and the flooding of the village. For the detailed study, the study area shown in Fig. 6 is selected as the landslides following the rainfall event in 2011 concentrated inside the selected zone ( 0.57 km 2 ) . The average slope of the study area is 26.20° with a maximum value of 45.71°.
Aerial photos in Fig. 7 show the studied area in Kvam before the rainfall event in 2010 and after the event in 2011. From Fig. 7a, it can be observed that the hillside is covered by dense vegetation with the existence of channels through the slope. After the rainfall event in 2011, several shallow landslides occur in the studied zones as shown in Fig. 7b. The landslide initiation and runout zones were identified considering high-resolution aerial pictures and orthophotos (Schilirò et al. 2021). The runout zones were not included as the 3DPLS model cannot model the post-failure behavior of the sliding mass. Then, the landslide initiation zones are discretized with respect to the DEM discretization as shown in Fig. 8. Between June 9 and 10 in 2011, the Kvam area receives total precipitation of 61.72 mm/day as presented in Fig. 9. Hourly precipitation amounts are presented within the period of 24 h starting from the beginning of the rainfall event on 9 of June, with the averaged value of I Z = 2.572 mm∕h . In the analyses, the average value of precipitation over the 24-h period was used to simplify the implementation of the developed model. The rainfall is assumed to be constant over the study area. The background infiltration rate, I Z steady , was obtained from the Norwegian Water Resources and Energy Directorate (NVE) as the average inflow (source: nve.no).
Digital elevation model (DEM) with a resolution of 10 m was obtained from hoydedata.no and utilized for the application of the model. The slope and aspect were derived using the DEM. The thickness of the soil, i.e., depth to bedrock, Z max , was calculated using an empirical relationship between the soil thickness and the tangent of the slope angle in degrees, tan(δ) , derived based on the field data from Holm (2012) and Edvardsen (2013). The empirical equation was utilized as follows: where Z min is the minimum thickness and Z is the mean defined with a linear trend function. Based on the field data, the minimum thickness was assigned as 0.4 m to represent surficial cover at the steep slopes. Considering the estimated thickness of the water table and the degree of saturation according to the gridded water balance model of NVE-Xgeo (source: xgeo.no), the groundwater table was assumed to be at half of the thickness of soil at each cell.
The model parameters are provided in Table 4. Due to the lack of field investigations and laboratory tests, the values in Table 4 are selected from literature sources for the considered moraine type geological unit. Only soil strength parameters, cohesion, and friction angle were treated as random variables, and the others were kept constant at their mean value (i.e., treated as deterministic). The normal and lognormal random field models were employed for the friction angle and cohesion, respectively. In the analyses, the saturated soil unit weight of 20 kPa was employed with a 10 kPa unit weight of water. The random fields were created according to the DEM discretization of the studied area in the Kvam area by using the two-dimensional separable Markov autocorrelation function shown in (Eq. 19). Correlation length was assumed to be equal in both horizontal directions with a value of 50.0 m for both cohesion and friction angle considering the values reported in Phoon and Kulhawy (1999).

Results and discussion
The studied area in the Kvam area was analyzed by the 3DPLS model and the Cell-based model. Both models performed 1000 Monte Carlo simulations while accounting for the spatial variability of soil strength parameters. The results include the mean factor of safety, F S , and probability of failure, P f , maps of the study area before and after the rainfall event shown in Fig. 9, for both models separately. The comparison of the models is done by using the metrics in the confusion matrix shown in Fig. 12 and the receiver operating characteristic (ROC) graph (e.g., Baum et al. 2010;Mergili et al. 2014b;Raia et al. 2014). Figure 10 presents the F S map of the studied area for the 3DPLS model in Fig. 10a and b and for the Cell-based model in Fig. 10c and d. The results are presented for the study area before (Fig. 10a, c) and after (Fig. 10b, d) the rainfall event. From Fig. 10a and b, it can be observed that the 3DPLS model smooths the transition of the F S values that are calculated for an ellipsoidal sliding surface. In the Cell-based model, however, the calculations are performed on a cell-by-cell basis, and therefore, there are rapid transitions between F S values over the study area (Fig. 10c, d). For both models, the F S values are higher than unity. Comparison of Fig. 10b and Fig. 10d shows that more critical value of the F S , between 1.0 and 1.1, was obtained using the 3DPLS model for the study area after the rainfall event although 3D models are thought to provide higher F S values. Similarly, Zhang et al. (2013) reported that 3D F S might be less than the 2D F S for specific conditions. One can also observe that the 3DPLS model results in a greater number of cells with F S values closer to unity.
The results of the 1000 Monte Carlo simulations enable one to calculate the standard deviation of the F S value, F S , for each cell. For flat zones, the F S values are almost zero for the 3DPLS model (~ 0.09) or even zero for the Cell-based model in which the F S is restricted by 10.0 as an upper limit. Besides, the average F S values over the study area were calculated as 0.128 and 0.123 for the 3DPLS model before and after the rainfall event, respectively, while they were calculated as 0.312 and 0.306 for the Cell-based model. It is observed that the 3DPLD model results in less variability of F S values, meaning more stable results. This is mainly due to the smoothing effect of the 3DPLS model, i.e., spatial averaging that occurs over the ellipsoidal failure surface. Figure 11 shows the probability of failure, P f (%) maps of the studied area before (Fig. 11a, c) and after (Fig. 11b, d) the rainfall event using the 3DPLS and Cell-based models. For the 3DPLS model, the values of P f range from 0.0 to 8.8% before the rainfall event (Fig. 11a) and from 0.0 to 22.5% after the rainfall event (Fig. 11b). The ranges of P f for the Cell-based model are 0.0-22.8% before the rainfall event (Fig. 11c) and 0.0-31.9% after the rainfall event (Fig. 11d). It has been detected that the Cell-based model results in quite high P f values, up to 22.8%, even before the rainfall event when the study area is stable. Figure 11a shows that the values of P f before the rainfall are fairly low using the 3DPLS model. From Fig. 11c and Fig. 11d, one can observe that there exist sharp changes in the P f values over the study area as the Cell-based model analyzes each cell individually without consideration of the neighboring zones. However, the F S value is calculated for an ellipsoidal sliding surface in the 3DPLS model; i.e., a zone is analyzed instead of a single cell. Therefore, the transition of the P f values in Fig. 11a and Fig. 11b is smoother.
If the results of a landslide susceptibility model can be converted to binary results, i.e., stable or unstable, the receiver operating characteristic (ROC) graph can be utilized to evaluate the performance of the model in predicting stable and unstable zones over the study area (Fawcett 2006). The model performance is measured, in this study, using the parameters in the confusion matrix (Fig. 12) calculated by comparing the model predictions and discretized landslide initiation map shown in Fig. 8. If a cell is predicted to be unstable by the model, and the field observation is consistent with the model prediction, it is considered as "true positive, TP". However, it is considered as "false positive, FP", if the cell is outside of the discretized landslide initiation zone. Similarly, if a cell is predicted as stable and the cell is outside of the discretized landslide initiation zone, it is a "true negative, TN". Otherwise, it is a "false negative, FN". Out of these four possible outcomes assigned to each cell, an additional set of parameters, namely "true positive rate, TPR", "false positive rate, FPR", "accuracy, AC", and "precision, PR", can be also calculated to assess the performance. The TPR is the proportion of the correctly predicted unstable cells inside the discretized landslide initiation zone to the total number of cells in the initiation zone (P) and calculated as TPR = TP∕P where P = TP + FN . The FPR is the ratio of cells predicted as unstable outside of the discretized landslide zone to the number of cells without landslide observation (N) and calculated as FPR = FP∕N where N = FP + TN . The ROC curve is plotted by using TPR and FPR as shown in Fig. 13. The upper left corner of the plot ( TPR = 1 and FPR = 0) represents the perfect performance, and the diagonal line, or the reference line, represents the random classification or "no skill". As the prediction skill plotted on the ROC graph is closer to the upper left, the prediction capacity of the model is better. Another metric to assess the performance of the model is the AC that is the proportion of correctly predicted cells to the total number of cells and calculated as AC = (TP + TN)∕(P + N) . In addition to AC, the metric PR represents how precise the model predicts the unstable cells and is calculated as PR = TP∕(TP + FP) . Both AC and PR vary in the range of [0, 1] where a higher value means a better model performance of accuracy and precision.
In the literature, there exist both deterministic (e.g., TRIGRS, SHALSTAB) and probabilistic (e.g., SINMAP, GEOtop-FS, HIRESSS, TRIGRS-P) landslide susceptibility models providing the results mainly in terms of either F S or P f to assess the stable and unstable zones. In deterministic models, the stability of each cell is evaluated based on the value of F S . That is, a cell is considered to be stable if F S > 1.0 and unstable if F S < 1.0 . Unlike deterministic models, the probabilistic models might employ different assessment criteria to determine the stable and unstable zones. In the study of Rossi et al. (2013), a zone was considered to be unstable if a sub-zone has more than 1% area with a P f higher than 80.0%. Different classification systems were also employed to assess the stability, such as the reliability index (Haneberg 2004) and the stability index (Michel et al. 2014). However, there are no widely recognized assessment criteria for the probabilistic models. In this study, the performance of the 3DPLS model has been evaluated by employing different levels of probability limits, P f ,limit , to estimate landslide stability such that a cell is considered as stable if the P f < P f ,limit and unstable if P f > P f ,limit .
As the values of P f obtained by the 3DPLS model and the Cellbased model are relatively low due to utilizing the model parameters from literature (Table 4) and the bias in the model itself, lower values of P f ,limit were employed for the comparison. For different P f ,limit values from 2.5 to 15.0%, the P f results were converted to binary values, i.e., stable or unstable, and the metrics in Fig. 12 were calculated. Figure 13 shows the ROC curves of the 3DPLS and the Cell-based models using different P f ,limit values. The metrics such as TPR, FPR, AC, and PR are tabulated in Table 5 for each P f ,limit value for both models. In the ROC graph, a point has better performance than the other if the TPR value is higher and FPR is lower. From both Table 5 and Fig. 13, it can be detected that the TPR values of the Cell-based model are higher than that of 3DPLS for each P f ,limit , but the FPR values are also very high. This means that the Cellbased model overpredicts the spatial extend of the unstable zones, and therefore, the Cell-based model can predict nearly all cells of the landslide but with a high FPR. The performance of the models can be compared by the ratio of TPR∕FPR . The larger the ratio, the better the model performance is (e.g., Baum et al. 2010). The 3DPLS model has a higher ratio of TPR∕FPR than the Cell-based model for all values of P f ,limit . This indicates that the performance of the 3DPLS model is better although it has lower TPR. From Table 5, it  Melchiorre and Frattini (2012), [b] Lacasse and Nadim (1996), [c] Janbu (1989)  can be seen that the 3DPLS model has higher AC and PR values than the Cell-based models for all P f ,limit values. Therefore, the 3DPLS model is more accurate and more precise in predicting despite its low TPR values.

Overall discussion
Investigating the effect of spatial variability on the safety of slopes necessitated the implementation of a 3D model as the parameters vary over the space. Therefore, the 3DPLS model was developed to capture the effects of spatial variability in landslide susceptibility. The 3DPLS model couples the hydrological model calculating the transient pore pressure changes due to rainfall infiltration and the slope stability model utilizing a 3D extension of Bishop's simplified method of slope stability. In the 3DPLS model, the failure surface is assumed to be at a depth that is not necessarily the critical path in the soil volume. Therefore, the assumption of the ellipsoidal sliding surface with given dimensions may result in an overestimated F S or underestimated P f (Griffiths et al. 2011). In the present study, the analyses were performed by placing the ellipsoidal sliding surface at the center of each cell over the discretized problem domain for each simulation. As the number of analyzed ellipsoidal sliding surfaces increases over the study area, there is a higher chance of a cell intersecting a more critical ellipsoidal sliding surface. Further investigation of the number of ellipsoids sufficient to cover a region can be done to improve the efficiency of the model. Another limitation is that the 3DPLS model currently does not support cross-correlation among model parameters which may be important in certain situations as shown in Javankhoshdel and Bathurst (2016).
In this study, the uncertainties associated with meteorological, environmental, and geomorphological factors have not been considered. These uncertainties originating from other sources than the model parameters might affect the results significantly (e.g., Bossi et al. 2019;Sandric et al. 2019). The 3DPLS model is very flexible, and with minor modifications, it allows for modeling of uncertainties in a wide range of parameters including meteorological (e.g., rainfall), hydrological (e.g., permeability), and geomorphological (e.g., depth to bedrock) parameters.
The slope stability models in the 3DPLS model were first validated using the example problems in the literature. Then, the capacity of the 3DPLS model to capture the effects of spatial variability of the soil strength parameters was validated by an extensive study in the FEM-based software. The results given in Fig. 5 reveal that the 3DPLS model is able to capture the effects of spatial variability on the g contrary to the Cell-based model. From Fig. 5, one can also detect that the effect of the spatial variability on the g lowers when the correlation length increases up to 1000 m where Fig. 11 The probability of failure, P f , map of the study area: (a) before and (b) after the rainfall event using the 3DPLS model; (c) before and (d) after the rainfall event using the Cell-based model the soil approaches homogeneous conditions. This shows that the models in the literature attempting to include the variability of the model parameters without spatial dependence of the parameters might overestimate the g depending on the degree of spatial dependence.
The performance of the 3DPLS model was tested on the Kvam landslides that took place in 2011. The rainfall was assumed to be constant over the study area which is a reasonable assumption considering the spatial extent of the study area. However, it should be noted that the effect of spatial variation of rainfall can be significant for problems with a larger extent.
From the results of the case study, it is seen that more critical zones with lower F S can be obtained using the 3DPLS model when the spatial variability is included in the analyses. Despite the lower values of F S with the 3DPLS model, the P f values are lower due to the results having less variability. The Cell-based model, however, has a much higher P f values over the study area although it has higher F S . From Fig. 13 and Table 5, one can detect that the 3DPLS model has better performance, accuracy, and precision than the Cell-based model. While the Cell-based model classified nearly all positives correctly regardless of P f ,limit , it has a higher false-positive rate. This indicates that the Cell-based model overpredicts the extent of the unstable zones. On the contrary, the 3DPLS model predictions are more accurate and precise regardless of having low TPR. In addition, the 3DPLS model has a higher ratio of TPR∕FPR than the Cell-based model, meaning better performance in prediction.
Despite its capacity to improve the landslide prediction, and its better performance than the cell-based approach, one of the main limitations of the 3DPLS model is the computational efficiency. The 3DPLS model performs a high number of Monte Carlo simulations, and this process is time-consuming and computationally demanding. Likewise, the excessive time and memory requirement of probabilistic analysis with Monte Carlo simulations were also addressed in the literature (Rossi et al. 2013;Raia et al. 2014).
Another possible improvement of the 3DPLS model is to utilize parallel computing using the message passing interface (MPI) libraries (e.g., Alvioli and Baum 2016). In this study, the 3DPLS model was implemented on a detailed scale case study to show its power and capacity in prediction. Nevertheless, it is possible, in the future, to implement the model on large-scale problems by optimizing the computational time using parallel computing (e.g., Rossi et al. 2013;Sandric et al. 2019).

Conclusions
This study presented the 3-Dimensional Probabilistic Landslide Susceptibility (3DPLS) model which is a Python-based threedimensional soil-column-based limit equilibrium model being able to model the spatial variability of the model parameters on the susceptibility of shallow landslides. The study presented the importance of the spatial variability on the safety of the shallow landslides, and the capacity of the 3DPLS in capturing these effects was validated. The study demonstrated that the spatial variability of the model parameters might lower the overall safety of the slopes and affect the landslide susceptibility analyses significantly.  Additionally, it was shown that the conventional cell-based approach is not capable of capturing the spatial variability effect as the analyses are performed on a cell-by-cell basis. The 3DPLS model was tested on the Kvam landslides that occurred in 2011, and the results indicated that the proposed 3DPLS model leads to more realistic results with a better prediction performance than its cell-based equivalent model. The study showed that the 3DPLS model contributed to the landslide susceptibility analyses by considering the spatial variability as higher TPR/FPR ratio, AC, and PR were calculated for the model.