1 Introduction

“The past and present are keys to the future” has been the underlying principle over three decades to support statistically-based landslide susceptibility studies (e.g., Calvello et al. 2013; Ercanoglu and Gokceoglu 2004; Ermini et al. 2005; Varnes et al. 1984). This hypothesis implies time-invariance of the slope response. However, if on the one hand it may be true that the effect of predisposing factors and triggers does not change over time because the laws of physics stay the same; it is certainly true that the space-time patterns of the triggers change from one event to another (Van Westen et al. 2008). And yet, the temporal dimension is rarely accounted for (Corominas and Moya 2008; Del Gaudio et al. 2003; Lee et al. 2008) although the susceptibility of a given area is likely to change as a function of the space-time realization of the trigger (Ghosh et al. 2012; Lombardo et al. 2020a).

In physically-based approaches for earthquake-induced landslide susceptibility and hazard assessment, more emphasis has been given to this time variant aspect. Jibson et al. (2000) are one of the first that present a method for producing probabilistic seismic landslide hazard maps, based on Newmark displacement modeling, using detailed data of landslide inventories, strong motion records, geological, geotechnical and topographic data. For the Northridge earthquake they computed the probability of failure in relation to Newmark displacement. Several authors have used ground motion data for different return periods in combination with the Newmark displacement model to analyze shallow landslide probability (e.g., Del Gaudio and Wasowski 2004; Jibson and Michael 2009). Rathje and Saygili (2008) developed displacement hazard curves, that show the exceedance probability of Newmark displacement levels. To account for the uncertainty in input parameters, several approaches have been proposed, such as Monte Carlo simulation (see, Refice and Capolongo 2002) or logic tree approach (see, Wang and Rathje 2015). Nevertheless, despite these advances the application of physically-based approaches for earthquake-induced landslide susceptibility remains problematic in many areas, due to the scarcity of geotechnical data to characterize the soil materials, the lack of soil depth information, lack of landslide inventories for different earthquake events, and the limitation of the Newmark model to shallow slope failures. This paper focuses on the possible contribution of statistical models to earthquake-induced landslide hazard assessment. Several authors have applied statistical techniques for analyzing the relationship between coseismic landslides and causative factors for a given earthquake (e.g., Lee et al. 2008).

To deal with such complexity, the research community dealing with data-driven landslide susceptibility assessment typically follows two lines of analysis:

  • The signal of the trigger is ignored in the landslide susceptibility models (e.g., Cama et al. 2015; Reichenbach et al. 2018). This procedure results in susceptibility maps of landslide occurrence for a given area, ignoring the specific impacts produced by a given trigger. The strength of this procedure consists of delivering simple realizations of the geomorphological responses. And, in practice this is often the only possibility, when the available landslide inventory data lacks the information on date of occurrence (Guzzetti et al. 2012). However, the main downside is due to the fact that the spatial (Lombardo et al. 2018a) and temporal dependence of the landslide distribution is entirely neglected. For instance, the spatial distribution of the trigger intensity induces dependence in the landslide distribution (Lombardo et al. 2019) and the resulting landslides may further induce temporal dependence to subsequent occurrences because of local disturbance, re-activation and re-mobilization processes (Samia et al. 2017).

  • The signal of the trigger is incorporated in the predictive models. This is generally done in near-real time applications where the space-time dynamics of the trigger are added to a baseline susceptibility. Such statistically-based examples can be found in Kirschbaum et al. (2010); Kirschbaum and Stanley (2018) in case of rainfall-induced landslides and in Nowicki Jessee et al. (2018); Tanyas et al. (2019) for the earthquake counterpart. The strength of this type of applications resides in a much closer realization of the hazard definition (Varnes et al. 1984; Fell et al. 2008), where the probability of occurrence is estimated in a given time period and area, though it still does not account for the landslide event magnitude (Guzzetti et al. 1999). For this reason, we will refer to susceptibility models that feature the triggering effect of the ground motion as coseismic susceptibility models in the rest of the manuscript.

The latter case is inevitably much more comprehensive than the former, but it comes with an added level of complexity, and at times even non-feasibility because of the lack of information on on the spatial distribution and intensity of the trigger, and also often a lack of the associated landslide pattern. In fact, an initial stage is required where a susceptibility model is calibrated by using an event-based landslide inventory together with traditional morphometric and thematic properties, as well as the actual pattern of the trigger. On the basis of the calibration stage, the regression coefficients associated with the trigger are estimated and used in a second phase to make a prediction. The prediction is essentially a constantly changing susceptibility/hazard model, where the change is driven by substituting the trigger with near-real time estimates coming from remotely-sensed precipitation, for rainfall-induced landslides (Kirschbaum and Stanley 2018), or by ground motion parameters for seismically induced landslides (Nowicki Jessee et al. 2018).

Currently, these procedures have been implemented by keeping the regression coefficient of the trigger fixed. In other words, the uncertainty around the estimation of this model component has been neglected. In this work, we take a very similar starting point, but accounting for the uncertainty of predisposing and triggering effects via statistical simulations. More specifically, by definition any Bayesian model returns the distribution of each model component. By taking advantage of this structure, we calibrate and validate an initial model where the earthquake-induced landslides caused by the 2017 Jiuzhaigou earthquake (\(M_w = 6.5\)) are modeled via a Bayesian Generalized Additive Model (GAM) by incorporating conditioning factors together with a triggering factor expressed in the form of the Peak Ground Acceleration (PGA) of the Jiuzhaigou earthquake. As a result, we extract the full distribution of the regression coefficients, also associated with the PGA, and simulate thousands realizations of the coseismic susceptibility by substituting the trigger pattern with the analogous ground shaking parameters belonging to any past earthquake, for which we have accessible records, for a time window between 1933 and 2017.

Under the assumption that the functional relationship between the trigger and the landslides is well estimated, and that other causative factors have not significantly changed through time, this procedure allows one to reconstruct the space-time variation of the probability of landslide occurrence under different environmental stresses and to retrieve the distribution of unstable slope throughout the investigated time window, for a given study area.

The manuscript is arranged as follows: in Sect. 2, we describe the study area, the data on the previous earthquakes in the period since 1933 as the characteristics of the inventory associated with the Jiuzhaigou earthquake. In Sect. 3 we describe the subdivision of the area in mapping units, their status (landslide/no-landslide assignment), the explanatory variables’ selection, the modeling framework and the simulations’ structure. Section 4 presents the results which will be discussed and interpreted in Sect. 5. Section 6 summarizes the take home message of this work and the implications of what we propose.

2 Study area

2.1 Geomorphological settings

The study area, known as Jiuzhaigou National Geopark, is located in the Jiuzhaigou County, near the northern boundary of Sichuan province in the southwest of China (see Fig. 1a). It is a part of the Min Mountains range between the Tibetan Plateau and Sichuan Basin, approximately 285km north of the city of Chengdu. Jiuzhaigou was recognised as a World Heritage Site by UNESCO in 1992 and a World Biosphere Reserve in 1997. It is one of the most popular tourist destinations in the region and more than five million tourists visit this outstanding natural landscape each year. Jiuzhaigou is the main tributary of Baishui River, and is one of the sources of Jialing River via the Bailong River, part of the Yangtze River system.

Fig. 1
figure 1

a Geographic context; b study area and regional tectonic setting; c Geological map (acronyms are explained in Table 1), showing the tectonic setting as well as the Jiuzhaigou-earthquake-induced landslide inventory; d zoom-in detail of the area with the highest concentration of landslides

Table 1 Overview of geological formations where the acronym used throughout the text is associated with the corresponding age and lithotype. This information is obtained from the geological map provided by the Jiuzhaigou Management Bureau and published by Cui et al. (2005)

The area features high-altitude karst landscape shaped by glacial, hydrological and tectonic activity (Fig. 1b). The latter results from the influence of the Minjiang fault (northwestern section), the Wenxian-Maqin fault and the Huya fault (Yi et al. 2018). The Huya fault is dominated by left-lateral strike slip. Previous studies have pointed out the north-west section of the Huya fault to be the specific seismogenic source of the Jiuzhaigou earthquake (Fan et al. 2018; Yusheng et al. 2017; Wu et al. 2018). More generally, the Jiuzhaigou area is a seismically active hilly mountainous region which is subjected to more than 50 events with \(M\ge 5\) in the past century (Fan et al. 2018).

The study area extends from latitudes 32\(^{\circ }\)5421 N to 33\(^{\circ }\)169 N and from longitudes 103\(^{\circ }\)4624 E to 104\(^{\circ }\)354 E, covering an extent of approximately 653 km\(^2\) (Fig. 1c). This region belongs to a typical cold semi-arid monsoon climate with annual precipitation of about 704.3 mm. The mean annual temperature is 7.8\(^{\circ }C\), with a minimum of \(-3.7^{\circ }C\) in January and maximum of 16.8\(^{\circ }C\) in July.

Topographically, the elevation ranges from 2000 m to 4828 masl. The study area encompasses three valleys namely, Shuzheng, Rize and Zechawa valleys, arranged in a Y shape. The slope gradients, derived from a SRTM Digital Elevation Model with a spatial resolution of \(\sim\) 30 range from 0\(^{\circ }\) to 78\(^{\circ }\) being generally higher than 30\(^{\circ }\) on average. The main lithological formations in the study area are Devonian, Carboniferous, Permian, Triassic, Dolomite and Quaternary and consist of carbonate rocks such as dolomite and tufa, as well as some sandstone and limestone (see Fig. 1 and Table 1). Because of the complex tectonic settings, ten main and several small faults dissect the carbonatic lithotypes, leaving the rock masses weakened by a large number of joints and cracks.

2.2 Jiuzhaigou earthquake

On 8th August 2017, an earthquake of magnitude \(M_w~6.5\) struck the Jiuzhaigou county, belonging to the Sichuan province, China. It was the third strong earthquake in the region in the past 11 years, after the 2008 \(M_w\) 7.9 Wenchuan earthquake and the 2013 \(M_w\) 6.6 Lushan earthquake (Fan et al. 2018). The epicentre of this earthquake was only 5 km west of the Jiuzhaigou National Park, where the touristic infrastructure of the UNESCO world heritage site was damaged, 31 person were killed and 525 were injured (Wang et al. 2018). This event affected to different extents more than 175,000 people, both tourists and locals. More than 73,000 buildings were damaged 76 of which collapsed. The scenic spot was temporarily closed after the earthquake and reopened only after two years (on October 8, 2019), which severely affected the economy around Jiuzhaigou and the Aba Prefecture.

The ground motion not only directly damaged properties and lives but, in a cascading effect, it also caused widespread landsliding, which in turn contributed to increase the overall losses.

Due to the destructive impact of this earthquake, several studies have focused on studying its characteristics and the chain of hazards that cascaded in the area, but also pre-existing conditions. For instance, Cao et al. (2020) has recently modeled the pre-earthquake landslide susceptibility between 2000 and 2015, whereas Yi et al. (2019) modeled the coseismic landslide susceptibility. Conversely, Chen et al. (2020) reconstructed the spatial distribution of coseismic unstable slopes by using the Newmark approach. Chen et al. (2018) and Wang et al. (2018) focused instead on studying coseismic landslide occurrences and their impact on infrastructure. Hu et al. (2019a) and Lu et al. (2019) automatically detected the coseismic landslide signature and Hu et al. (2019b) numerically simulated potential post-earthquake debris-flows.

2.3 Earthquake history

The Jiuzhaigou National Geopark is located in the transition zone of the western margin of the Sichuan Basin and the Qinghai-Tibet Plateau which features tectonically active mountains characterized by narrow and steep valleys. Numerous moderate to large earthquakes have been recorded in the past century. Specifically, spanning a 200km radius from the Jiuzhaigou epicenter, the USGS earthquake catalog (Earle et al. 2009) reports 76 earthquakes with magnitude between 5 and 8. Among them, seven earthquakes have a magnitude \(6.0\le M_W \le 7.0\), and only one above 7.0, which corresponds to the Diexi earthquake.

To reconstruct the coseismic susceptibility patterns due to past earthquakes, we examined all the available earthquakes which have affected the study area and for which the associated ground motion characteristics were available at the ShakeMap service of the US Geological Survey (USGS) (Allen et al. 2008; Wald et al. 2008). We collated 7 past earthquakes here listed in Table 2 and geographically shown in Fig. 2. The \(M_W\) range since 1933 spans from a minimum of 5.7 to a maximum of 7.3, with most of the epicenters located to the south and outside the boundary of the study area. And, two more recent earthquakes which occurred to the north and within the study area.

Table 2 List of eartquake which have affected the study area and for which the shaking levels were digitally accessible in the ShakeMap Atlas (Allen et al. 2008)
Fig. 2
figure 2

a Location of the epicenters and b associated PGA levels (shown as contour lines) of all the earthquakes available in the ShakeMap Atlas (Allen et al. 2008), which have struck the area since 1933)

This is more evident in Fig. 3 where we focus on the study area. In fact the highest PGAs are recorded in the northern sector, these being associated with the 2017 Jiuzhaigou earthquake. The remaining patterns generally show a northward increase of PGA levels.

Fig. 3
figure 3

PGA patterns affecting the study area for each of the earthquakes under examination (Source: USGS ShakeMap system García et al. 2012). Notably, there is no strict minimum PGA reported in the literature to trigger landslides (Sassa et al. 2007). However, several articles have reported that the vast majority of earthquake-induced landslides trigger with a minimum threshold of 0.02 g (Jibson and Harp 2016; Tanyaş and Lombardo 2019), which is also contained in every PGA map shown in this figure: a \(0.03<PGA< 0.07\); b \(0.03<PGA<0.1\); c \(0.04<PGA<0.25\); d \(0.02<PGA<0.1\); e \(0.04<PGA<0.2\); f \(0.04<PGA<0.09\); g \(0.02<PGA<0.05\); h \(0.08<PGA<0.37\)

2.4 Landslide mapping

The preparation of a reliable and accurate landslide inventory map recording the location, spatial extent and landslide characteristics is essential for any susceptibility analysis (Guzzetti et al. 2012; Harp et al. 2011). In case of earthquake-induced landslides (EQIL), the quality and completeness of an inventory affects the coseismic susceptibility estimates of any landslide affected area (Tanyaş et al. 2018).

For this reason, we undertook a multi-source mapping procedure to discriminate pre- and co-seismic landslides. The landslide inventory was carried out through visual interpretation of high-resolution images with different resolutions and sources (see Table 3). Remotely sensed scenes, Unmanned Aerial Vehicle (UAV) photographs and subsequent detailed field surveys were used for mapping the landslides source and deposition areas.

More specifically, the existence of landslides prior to the earthquake was investigated by using a Spot-5 scene (2.5 m resolution) acquired on 21 December 2015. These pre-earthquake inventories were used to isolate coseismic landslides (new activations and re-activations) mapped via Gaofen-1 and Gaofen-2 satellite images (1m resolution) acquired on 16 August and 9 August 2017, respectively. And, we also refined the mapping procedure by additionally examining UAV photos (0.2m resolution) of key areas acquired during field surveys (Table 3). Some examples of identified landslides are shown in Fig. 4.

Table 3 Overview of the data used for mapping landslides. EQ stands for earthquake

Overall, we mapped 1022 landslides associated with the Jiuzhaigou earthquake. The total landslide area sums up to 3.88 km\(^2\) covering approximately 0.6% of the study area. They mainly occurred along the valleys, roads or rivers (see Fig. 4). Notably, Fan et al. (2018), report 1883 landslides triggered for the same earthquake, although this information is the result of a much wider study area, furter extended to the North West (see Figure 5 in their article) compared to the area where we focus in the present work.

Fig. 4
figure 4

Examples of landslide interpretation based on pre- and post-earthquake high resolution satellite and UAV images. Panels a, b and c correspond to pre-earthquake conditions; A, B, C shows post-earthquake satellite images; UA, UB, UC are photographs acquired using drones

The failure mechanisms consisted of shallow rock or debris slides with minor rockfall occurrences (Varnes 1978; Hungr et al. 2014) spanning from small to moderate landslide in size. This is shown in Fig. 5 where the Frequency Area Distribution shows a rollover point at approximately 100 m\(^2\), a minimum landslide area of around 30 m\(^2\) and a well recognizable distribution, traditionally explained by a Inverse-Gamma distribution (Fan et al. 2019). These characteristics have been recently recognized for earthquake-induced landslide inventories of good quality and completeness (Tanyaş et al. 2018; Tanyaş and Lombardo 2020).

Fig. 5
figure 5

Frequency area distribution of coseismic landslides

3 Modeling strategy

3.1 Mapping unit

The first requirement of any landslide susceptibility model is the choice of the type of mapping unit used in the statistical analysis. The most common choice corresponds to a regular squared lattice or grid (e.g., Cama et al. 2016; Hussin et al. 2016; Reichenbach et al. 2018). However, this mapping unit is sensible to mapping errors (Steger et al. 2016) and the assignment of the instability status is inherently uncertain as it is often subjectively chosen between the centroid of the landslide body (e.g., Hussin et al. 2016; Castro Camilo et al. 2017) or the highest point along the landslide polygon (e.g., Amato et al. 2019; Lombardo et al. 2014).

To avoid the issues concerning the grid-based choice, here we opted for a Slope Unit (SU) partition (Carrara et al. 1995). We computed the SUs by using the r.slopeunits software made accessible by Alvioli et al. (2016). More specifically, we parameterized r.slopeunits with a very large Flow Accumulation threshold (800,000 m\(^2\)) to scan the area with a large configuration of possible SU arrangements, whose conversion to the final setup we controlled via a small minimum slope unit area of 10000 m\(^2\).

When at least one centroid of a landslide initiation polygon fell inside a SU, the mapping unit was assigned a positive landslide status, and vice-versa for the non-landslide status.

3.2 Covariates

The covariate set we chose features eight morphometric properties, two related to the geological setting, and one related to the Peak Ground Acceleration (Table 4).

The morphometric covariates were derived from the a 30 m resolution DEM provided by Sichuan Bureau of Surveying, Mapping and Geoinformation, using a number of different methods, following the references indicated in Table 4.

The same institute provided the Geological Map of the area, which we rasterized to coincide with the DEM resolution. We also generated a bedding map by digitizing strike and dip measurements collected in the field and subsequently crossing this parameters together with the exposition of any given lithotype reported in the geological map (see Table 5). More specifically, we followed the idea introduced by Ghosh et al. (2010) and the approach later proposed by Santangelo et al. (2015) exploiting a total sample of 1490 dip measurement, which we then grouped by lithology, as follows: 125 in Q, 123 in T, 133 in P, 368 in Cp, 560 in C, 79 in Dc, 102 in Dd.

From the fault map provided by the Exploitation of Mineral Resources, we computed the Euclidean distance from each pixel in the study area to the nearest fault line. And, we repeated the same operation to compute the Euclidean distance from the river network.

Because the SUs actually contain a distribution of pixel values for each covariate, we summarized the covariate information at the SU level by computing mean (\(\mu\), hereafter) and standard deviation (\(\sigma\), hereafter) of continuous properties. As for the categorical covariates such as bedding and geology, we extracted the most represented class per SU.

Here we will share the rational behind our choice of the covariates, with a particular emphasis on DEM-derived attributes. The covariate set features the elevation as a proxy for relief effects that carry the signal of the potential energy distribution across the landscape (Görüm 2019). The slope steepness influences the balance between retaining and destabilising forces (Wu and Sidle 1995). Profile and planar curvatures control divergence and convergence of both surface runoff and shallow gravitational stresses (Ohlmacher 2007). Eastness and Northness bring the signal of the slope exposition, being a proxy for the distribution of dry and wet soils as a function of solar lighting (Epifânio et al. 2014). Relative Slope Position informs the model about the site location along the topographic profile (Böhner and Selige 2006). The Topographic Wetness Index is a proxy for the terrain influence on retained water coming from upslope contributing areas (Gokceoglu et al. 2005). Distance to Stream can express the natural effect of overland flows as a predisposing factor for slope undercutting (Donati and Turrini 2002).

Table 4 List of covariates used for the modeling phase. The Covariate Name we report corresponds to the covariate we computed at the pixel level, followed by its Resolution or Scale. The Reference describes the method to compute it and the Acronym reports how we will refer to each covariate throughout the text

Each of the continous covariates listed in Table 4 was rescaled with mean zero and unit variance, or in other words, by subtracting every covariate value per SU with the mean covariate value of all the SUs in the study area and ultimately dividing the result by the standard deviation of the covariate values of all the SUs in the area. This procedure ensures that the covariate effects estimated during the modeling phase are all in the same scale, enabling the interpretation of dominant properties on the slope stability process.

Ultimately, before running any analysis we have tested for potential collinearity issues between the \(PGA_{\mu }\) and the other contrinuous properties. This is summarized in Fig. 6 where the maximum Pearson correlation coefficient, in absolute value, is found with respect to \(Elevation_{\mu }\), being equal to \(\sim\)0.56, and distant from the common 0.7 correlation threshold deemed to bring collinearity issues (Castro Camilo et al. 2017).

Fig. 6
figure 6

Pairwise correlations between the ground motion and the other continuous properties

3.3 Landslide susceptibility via generalized additive model

A Generalized Additive Model (GAM) is an extension of the common Generalized Linear Model used in the vast majority of landslide susceptibility studies (Reichenbach et al. 2018). The added value corresponds to the ability of estimating both linear and nonlinear relationships between covariates and landslide occurrences. Nonlinearities can be modeled as pure categorical, or more precisely as independent and identically-distributed random variables (iid), as well as ordinal, corresponding to a model with adjacent inter-class dependence, which we here implemented as a First Order Random Walk (RW1), details provided in Bakka et al. (2018); Krainski et al. (2018). Such implementation allows one to obtain different regression coefficients for each of the considered portions of a covariate reclassified from a continuous property while simultaneously constraining the ordinal dependence between adjacent classes via a spline interpolation. This procedure has been recently introduced and explained in (Lombardo et al. 2018a) in a similar setting although similar modeling approaches have already been tested for landslide susceptibility via machine learning routines such as Multivariate Adaptive Regression Splines (see, Conoscenti et al. 2016).

Here we summarize some definitions that will be useful later through the text. A Generalized Linear Model which assumes a Bernoulli probability distribution as the underlying stochastic process can be summarized as follows:

$$\begin{aligned} \eta (P)=\beta _0 + \beta _1 x_1 + \cdots + \beta _jx_j + f(Xm) \end{aligned}$$
(1)

where \(\eta\) is the logic link, \(\beta _0\) is the global intercept, \(\beta\) are regression coefficients estimated assuming a linear relationship between landslide occurrence and the given covariate \(x_j\). and f can be any function we implement to model discrete covariates (\(X_m\)). In this work we used a iid implementation in case of discrete classes independent from each other, and a RW1 implementation for ordinal cases, where adjacent classes retain a ordered structure which we want to account for in the modeling stage.

It is important to note that the right term of Eq. (1) is referred to as the “linear predictor” and corresponds to the combination of each model component, or in other words, to the sum of all the terms namely, intercept, fixed (covariates linearly modeled) and random effects (covariates nonlinearly modeled).

Once the model estimates the linear predictor, the conversion into probabilities is obtained via the logic link \(\eta\) as follows:

$$\begin{aligned} P= \frac{e^{\beta _0 + \beta _1 x_1 + \cdots + \beta _jx_j + f(Xm)}}{1 + e^{\beta _0 + \beta _1 x_1 + \cdots + \beta _jx_j + f(Xm)}} \end{aligned}$$
(2)

The estimated probabilities can then be intersected with a known (for calibration) and unknown (for validation) vector of presence/absence cases to assess the performance of any model. Here we calibrated over each SU in the area. And, we implemented a 10-fold cross validation scheme where the model is trained with 90% of the available SUs but constraining the random sampling of the complementary 10% to extract each SU only once. In turn, the combination of the 10 subsets used for validation returns the whole study area, producing a fully predicted map.

Before moving to the explanation of the statistical simulations, we remind here the reader that any Bayesian model returns a distribution of potential values for each model component, be it the global intercept, each class of a discrete property (both iid and RW1), the regression coefficients of covariates used linearly and even the outcome (here a probability of landslide occurrence) itself (Lombardo et al. 2020b).

3.4 Statistical simulations

Once a statistical model is estimated, one can always generate any number of simulations from it by randomly sampling from the distribution of each model component and solving for the specific predictive equation.

This is particularly intuitive in the Bayesian setting where each model component is expressed with a distribution of values. Therefore, a posteriori, one can generate any number of regression coefficient values following the estimated distribution, compute the series of products and sums to calculate the linear predictor and finally transform the result into susceptibilities via the logit link fuction. In this work, we use the same structure explained above. However, we apply some changes to retrieve the coseismic susceptibility patterns according to past ground shaking, for which we have digital information throughout the history of the Jiuzhaigou area.

More specifically, we developed an initial model fitted and validated by using coseismic landslides and the set of covariates listed in Table 4, where the PGA corresponded to the Jiuzhaigou earthquake. Once we tested the prediction skill of the model (reported in Sect. 4) we implemented the following simulation stages, also graphically summarized in Fig. 7:

  • We simulated 1000 realizations of the linear predictor estimated from our binomial GAM implementation for each of the SUs, from the initial model where the ground motion effect onto the coseismic susceptibility map was carried by the product of the PGA regression coefficient and the PGA values. Each simulation essential draws one random sample from all regression coefficients’ distributions, multiply the sample for the corresponding covariate value and sum all separate model components together.

  • From the 1000 simulations we subtracted the product between each randomly generated \(\beta _{PGA_{\mu 2017}}\) sample and the \({PGA}_{\mu 2017}\) values for each SU.

  • We added the product between the each randomly generated \(\beta _{PGA_{\mu 2017}}\) sample and the vector of \({PGA}_{\mu }\) values for each SU, this time coming from one of the 7 past alternatives of PGA pattern listed in Table 2.

  • We stored the 1000 simulations, converted them all into coseismic susceptibilities by using Equation (2).

  • We calculated the mean coseismic susceptibility of the 1000 simulations for each SU and each past earthquake.

  • We calculated the 95% Credible Interval (the difference between the 97.5% and the 2.5% percentiles of the 1000 simulations) for each SU and each past earthquake.

  • We calculated the mean and 95% Credible Interval of all the mean coseismic susceptibility maps obtained from 1933 to 2017.

Fig. 7
figure 7

Graphical sketch of the simulation routine

4 Results

This section includes the assessment of the 10-fold cross-validation routine. This is followed by the description of the significant covariate effects (for which the model is at least 95% certain that the contribution is either positive or negative, i.e., the estimated distribution does not contain zero). This is complemented by summarizing the reference coseismic susceptibility model for the 2017 Jiuzhaigou earthquake in map form and the resulting descripting statistics for the 1000 simulations for each of the 7 earthquake under examination. Ultimately, we show the summary maps for the coseismic susceptibility model which combines the signal of all possible susceptibility arrangements for the period between 1933 and 2017.

4.1 Predictive performance

Figure 8 shows the 10-fold cross-validation we performed, which classifies as outstanding according to Hosmer and Lemeshow (2000). The x-axis reports the False Positive Rate (FPR or 1-Sensitivity) whereas the y-axis corresponds to the True Positive Rate (TPR or Specificity), measured at various probability cutoffs (for more details, see Fluss et al. 2005; Rahmati et al. 2019). The 10 AUCs (computed as the integral under the ROC curves) we obtained are all confined above 0.9 (with a median of approximately 0.93). And, the variability associated with each cross-validated subset is small. This can be seen both in Fig. 8a where the ROC curves do not spread over the 2D space defined between sensitivity and specificity. The same is also valid for Fig. 8b where the interquartile distance is approximately 0.03 and the difference between minimum and maximum AUC is only 0.07.

Fig. 8
figure 8

Left panel shows the ten cross-validated Receiver Operating Characteristics curves of the reference model featuring the PGA from the Jiuzhaigou earthquake. The right panel summarizes the associated AUC distribution

4.2 Covariates’ effects

This section reports the covariates effects for which we recall that positive regression coefficients contribute to increase the coseismic landslide susceptibility and negative coefficients contribute to reduce it. As for regression coefficients equal or near the zero value, these are representative of covariates estimated to play a negligible role with respect to the coseismic landslide susceptibility. Being the modeling routines framed in a Bayesian formulation, the description made above needs to be considered with respect to the whole regression coefficients’ distributions.

Figure 9 shows the significant fixed effects, or the regression coefficients of those covariates that have been used linearly.

Fig. 9
figure 9

Significant fixed effects shown through their estimated posterior distributions summarized with mean (blue rhombi) and 95% CI (in black)

The distribution of all regression coefficients appears quite narrow despite the relatively small sample size, and well determined by the model. The most striking characteristics is associated to the \(\beta _{PGA_2017}\) which appears to dominate the coseismic susceptibility pattern. In fact, being the covariates rescaled (see, Sect. 3.2), the fixed effects reported in Fig. 9 are directly comparable, which makes the PGA contribution, in absolute value, much larger than any other covariate, taking aside the \(Elev_\mu\), which has an opposite role to the PGA.

With regards to the random effects, or those that have been modeled nonlinearly, Fig. 10 shows the descriptive statistics of two categorical and two ordinal cases.

Fig. 10
figure 10

The first row reports the random iid effects, for each class of Bedding and Geology (the acronyms are explained in Tables 5 and 1 ), through their estimated posterior distributions summarized with mean (blue rhombi) the 95% CI (in black). The second row shows the random ordinal (RW1) effects, of \(Slope_\mu\) and \(RSP_\mu\) (Relative Slope Position), via their posterior distributions where the mean is highlighted in blue and the the 95% CI in black

Overall, both Bedding and Geology do not appear to be significant (the zero lines cross the distribution of each categorical class) although certain classes have a posterior mean quite far (both positively and negatively) from being null. Therefore, despite the overall non-significance, on average some classes can still contribute to the spatial pattern of the final coseismic susceptibility model/map. A similar consideration can be made for \(Slope_\mu\) and \(RSP_\mu\). Both covariates have their distribution crossed by the zero line. However, the posterior mean of \(RSP_\mu\) is clearly aligned with zero making its impact negligible with respect to the final coseismic susceptibility model/map. As for the posterior mean of \(Slope_\mu\), this is constantly quite far from zero and shows a clear, nonlinear and increasing trend from low to high slope steepness values. More specifically, the \(Slope_\mu\) contribution becomes increasingly positive for slopes above 27 degrees of steepness.

4.3 Coseismic landslide susceptibility mapping

Figure 11 shows the summary statistics of the reference model for the 2017 Jiuzhaigou earthquake in map form (posterior mean and 95% CI), together with the error plot. The mean coseismic landslide susceptibility map for the Jiuzhaigou earthquake shows a southward decreasing trend due to the dominant contribution of the \(PGA_{\mu _{2017}}\). The uncertainty associated to the mean appears relatively small in the southernmost sector of the study area although it shows a much larger spread to the north. The error plot or mean versus 95% CI (Fig. 11c) is meant to evaluate whether these estimates are reasonable. In fact, an ideal model should reliably predict very low and very high probability values. In other words, the left and right tail of the posterior mean probability distribution should be associated to a very limited uncertainty. Conversely, the central portion of the posterior mean probability distribution is intrinsically much more difficult (i.e., uncertain) to be determined (Reichenbach et al. 2018) and therefore a larger spread can be reasonably accepted (Rossi et al. 2010).

Fig. 11
figure 11

Posterior mean (panel a) and 95% CI (panel b) of the reference coseismic susceptibility model generated with the Peak Ground Acceleration of the Jiuzhaigou earthquake. Panel c shows the error plot where the two maps in the first and second panel are plotted against each other, the colorbar indicates the point density per pixel in the two-dimensional space defined between the posterior mean and 95% CI

4.4 Statistical simulations

For each of the seven other earthquakes (see Table 2) that occurred in the study area before the Jiuzhaigou earthquake, we also simulated 1000 realizations of the coseismic susceptibility patterns by replacing the PGA map of the 2017 Jiuzhaigou earthquake with the ones for the particular earthquakes. To summarize the 1000 simulations we show in Fig.  12 the main statistical moment as well as the 95% CI, separately for each scenario.

Fig. 12
figure 12

Mean simulated coseismic susceptibility maps (a to g) and 95% CI maps (A to G), for each of the seven earthquakes occurred in the area prior to the Jiuzhaigou earthquake

As a result, the coseismic susceptibility pattern clearly changes as a function of the various PGA patterns of the earthquake, some of which are located to the south of the study area. More specifically, being the majority of past epicenters located to the south, the largest coseismic susceptibility values are shown near the catchment outlet. And, similarly to the spatial patterns shown in Fig.  11, the uncertainty closely follows the coseismic susceptibility trend by increasing as the probability increases and decreasing towards the highest mean probability again. This is expected because being the PGA map the largest and positive contributor to the reference model (see Fig. 9), i.e., whenever the PGA levels are low, the slopes are estimated with proportionally low coseismic susceptibilities. The opposite is also true, for whenever the PGA levels are high, this effect dominates the coseismic susceptibility and the other model components have a negligible effect, hence low variations. Conversely, whenever the PGA is in between these two extreme situations, the model becomes more uncertain because the contributions from the other model components becomes more relevant and lead to an increased variability, hence larger uncertainty around the mean coseismic susceptibility behavior.

Ultimately, we generated a combined probability which would account for all the possible variations in the coseismic susceptibility patterns as a result of the contributions of the ground motions experienced from 1933 to 2017. To achieve this, we combined the mean coseismic susceptibility map of the reference model calibrated on the Jiuzhaigou earthquake (Fig. 11) and the mean simulated coseismic susceptibility of the other seven earthquakes (Fig. 12). Ultimately, to compress the information carried by the multiple scenarios, we compute the mean and 95% interval across the whole time series. This is shown in Fig.  13 where we choose to show the 95% CI in its separate components, i.e., the 2.5 percentile and the 97.5 percentile to plot the best and worst case scenarios that the area would exhibit across the considered time span. As for most representative pattern since 1933, the mean map delivers such information.

Fig. 13
figure 13

Quantile 0.025 (a), mean (b) and quantile 0.975 (c) of the distribution of all the possible posterior means featuring different ground motion scenarios from 1933 to 2017

5 Discussion

5.1 Reference model interpretation

In this work, we attempted to combine the ground motion patterns into an overall earthquake-induced landslide susceptibility map for the time period between 1933 and 2017. The reference model which was validated for the Jiuzhaigou earthquake performs with outstanding results (Hosmer and Lemeshow 2000) suggesting that the influence of each model component is well determined (Fig. 8). We prove this in Fig. 9 where the regression coefficients can be easily interpreted with respect to the slope instability process, although the same cannot be said for the categorical cases corresponding to the Geology and Bedding. This could be an effect due to the complexity of representing thematic information at the SU level. In fact, one of the most difficult tasks when creating susceptibility models with mapping units different from the grid-cell case consists of the loss of high-resolution information. In fact, at the SU level or catchment or any large mapping unit, one needs to approximate the distribution of properties that vary at small spatial scales (e.g., Dreyfus et al. 2013). For a continuous factor such as Slope or any other terrain derivative, this comes relatively easy by computing some descriptive statistics such as the mean and standard deviation (same as we did here) or a much finer description into quantiles (e.g., Amato et al. 2019). However, for a geological map or a bedding measurement, representing these two properties at the SU level is more complex. One could opt to assign to a given SU the most representative categorical class (same as we did here) or one could compute the percentage extent of each categorical class with respect to the given SU (e.g., Castro Camilo et al. 2017; Guzzetti et al. 2006). Here, we opted for the most representative class to minimize the model complexity due to the subsequent simulations stages. However, this may have neglected a more informative use of the Bedding and Geology in the model, which we remind here, outstandingly performed nevertheless.

With respect to each model component, the mean distance to all the tectonic lines shows a negative contribution (mean \(\beta _{Dist2F_{\mu }} = -0.60\)). This is generally counter-intuitive as one would expect the closer the seismogenic fault, the higher the probability of landslide occurrence. However, this assumption is not valid for two reasons. The first is that the proximity to the rupturing fault is already part of the PGA information. Therefore, we computed the distance from any fault dissecting the carbonate rocks in the area. As a result, the weakening effect of the fault traces onto the rock mass strength, increases the chance that the loose material draping over the landscape gets removed by regular or common erosion. The regression coefficient of the mean Elevation per SU appears to be negative (mean \(\beta _{Elev_{\mu }} = -2.17\)). This could be a characteristic of the landslides in study area, which have mostly been recognised in the lower portions of the topography range (see Figs. 1 and 4 ). It is also worth noting that the PGA map, does not account for topographic amplification. Therefore, some confounding effect may still be present between Elevation and PGA. A much more reliable model could actually be obtained by using a ground shaking parameter which includes topographic and possibly soil amplification. By doing this, any landslide predictive model should experience an increase in performance and should provide a much clearer interpretation of each covariate role.

The mean planar (mean \(\beta _{PLC_{\mu }} = -0.63\)) and profile (mean \(\beta _{PRC_{\mu }} = 0.62\)) curvatures show an opposite contribution to the model, where the former favors slope instability in SUs which are predominantly sidewardly concave and the latter contributes to increase the coseismic susceptibility in SUs which are upwardly concave.

The standard deviation of the RSP appears to play a major role in explaining the slope instability, with a significant and positive mean coefficients which implies that as the \(RSP_{\mu }\) increases, the co-seismic landslide susceptibility increases as well. Being the RSP a normalized elevation where the minimum values is assigned to the theoretical floodplain and the maximum value assigned to local ridges, a large standard deviation of this covariate in a given SU implies a large topographic roughness. As a result, the large and positive mean regression coefficient (mean \(\beta _{RSP_{\sigma }} = 0.69\)) is a reasonable result. A similar signal is carried by the standard deviation of the slope steepness per SU. This covariate is also a proxy for topographic roughness and here (Fig.  9) it is reasonably shown to positively contribute to slope instability (mean \(\beta _{Slope_{\sigma }} = 0.38\)). The topographic wetness index expresses the morphometric effect to retain water as a function of local slope steepness values and the upslope contributing area. Hence, as the TWI increases it generally expresses locations in the landscape corresponding to flat areas where water flows receiving water flows from upslope, i.e., rivers or floodplains. In this work, we have used the most recent version of the software r.slopeunits by Alvioli et al. (2016), which directly removes flat topographies from the SU calculations. And yet, due to the extremely rough landscape, no SUs have been removed. By removing floodplains, one could expect the TWI effect to be positive and to express the effect in pore pressure increase in portion of the slope hanging in the highest sectors of the landscape. However, because r.slopeunits could not remove any large flat area, a negative contribution of the TWI may hint for localized conditions near to the river network. This is why we interpret the negative role of the TWI as reasonable in our model (mean \(\beta _{TWI_{\mu }} = -0.65\)). Ultimately, the PGA effect is shown to have the largest impact on the coseismic susceptibility estimates with a positive contribution (mean \(\beta _{PGA_{\mu }} = 2.51\)). Because the model recognizes its contribution to be significant and with a narrow credible interval around the mean regression coefficient, we enabled the simulations for previous earthquake occurrences.

Contrary to our expectations, the covariates related to the Lithology (Geology) and structural geology (Bedding) did not appear to be significant. We initially expected a much stronger and significant lithological and structural contributions on the basis of the analyses reported in Fan et al. (2018). There, for the Jiuzhaigou earthquake, albeit for a much larger area, they highlight that a lithological control on landslides stands out in their data, especially for limestones which show the highest values of landslide densities. The differences between the results shown in Fan et al. (2018) and ours, may be due to the different size of the study area but they may also be related to the different approaches and below we will provide a brief interpretation on this matter. First of all, our choice of a SU partition requires an aggregation step from the pixel based resolution. Here we opted to assign the most representative lithology to each SU. Therefore, other lithotypes that may occupy just a small portion of a SU will not be assigned to it, although they may still be responsible for the initial failure mechanism. As for the structural information, three causes may have “diluted” its effect on the final coseismic susceptibility. The first one may corresponds to the number of measurements used to generate the structural geological classes (B1 to B5). We collected 1490 measurements of strike angle, dip angle and dip direction, but they still might have been too limited with respect to the whole study area. The second reason may be due to the procedure we used to regionalize the data into meaningful bedding classes. In the end, any model produces some degrees of errors and uncertainties. And, the aggregation required by the mapping unit we chose may have also played a role, smoothing out this signal when we assigned the most representative bedding to each SU. Some improvements can be made by collecting more information, increasing the density of measurements over the study area. For similar reasons, other relevant factors such as soil types and soil depth could not be taken into account due to lack of input data. Overall, Bedding and Geology did not appear to be significant across each respective classes, which could also be sue to the shallow nature of the landslides we mapped. Nevertheless, taking the significance aside, some of the classes showed a posterior mean quite different from zero, which implies that the contribution to the posterior coseismic susceptibility mean would still be sensitive to such classes. It is worth to note that significance strongly depends on sample size and being the SU 1234 in number, a relatively large credible interval is to be expected. Therefore, here we try to provide an interpretation of the Bedding and Geology roles solely based on the posterior mean contribution, disregarding the rest of the posterior distribution of each class.

For Bedding, the largest contribution to the the mean coseismic susceptibility is carried by B5, or Down-Dip slope (see Fig.  14 and Table 5), with a mean regression coefficient of 0.24. Despite the limited contribution compared to the other classes in absolute value, B3 (or Reverse oblique slope) also increases the mean coseismic susceptibility with a mean regression coefficient of 0.10. This is surprising because the most intuitive bedding type would have been B4, also in consideration of the vast majority of landslides being rock-slides. However, the meaning of a nonsignificant covariate indicates that the model is not certain of the covariate role (negative or positive) and therefore, being also the Bedding coefficients close to zero across classes, we can disregard these limited differences and their interpretation with respect to the expected bedding behaviour.

Fig. 14
figure 14

Graphical sketch of Bedding types obtained in the study area

Table 5 Overview of bedding types listed with the same acronyms used throughout the manuscript. The rational corresponds to the relationship we used to calculate the bedding, where Asp is the slope aspect [0\(^\circ\),360\(^\circ\)), Dip is the dip direction [0\(^\circ\), 360\(^\circ\)). From these two values we compute the absolute difference \(AbsDiff=|Asp-Dip|\) and reclassify it to obtain the strata attitude or bedding

Similarly, Geology highlights two positive classes, on average, these being C, or Carboniferous limestone, and Cp, or Carboniferous-Permian limestone (see Table 1), with respective mean coefficients of 0.25 and 0.10. This result is well in line with Fan et al. (2018). Overall the area essentially comprises calcareous formations whose difference is mainly driven from the fracture system dictated from the tectonic compressive regime. As a consequence, we may infer from a positive mean regression coefficient that lithologies C and Cp may have a higher degree of weathering and fracture network.

A common test in susceptibility studies to assess how reasonable a model is consists of checking the regression coefficient of the slope steepness. If the slope is estimated to contribute negatively to the model, then either the model is wrong or some variable interaction effects need to be dealt with prior to the modeling phase. Our reference model for the Jiuzhaigou earthquake estimates a positive contribution of the mean slope per SU (see Fig. 10), supporting our assumption that the model reliably recognizes the functional relations between causative factors and landslides. Being the \(Slope_\mu\) modeled as a nonlinear ordinal covariate, the posterior mean and 95% CI trends support this choice. More specifically, the slope classes between 9 and 20 degrees contributes negatively to the coseismic susceptibility; and, as of 20 degrees to 34 the contribution to slope instability increases quite linearly becoming positive at 27. From 34 degrees to 46 the contribution does not substantially change. This nonlinear trend well aligns with the observations made by Parise and Jibson (2000). There, the authors summarized the slope distribution for the landslides triggered by the 1994 Northridge earthquake and showed that, for steeper slopes, the curve rolls over because very steep slopes are composed made of hard rocks.

The \(RSP_\mu\) appears to be not significant and even the posterior mean aligns along the zero line (Fig. 10), which in turn means that taking aside the significance, the average contribution of this covariate to the coseismic susceptibility is negligible. This position index is a variable that should capture whether landslides are located closer to ridges (high values), mid slope, or lower on the slopes. Therefore, for positions close to the ridge, the RSP should be sensitive to slope failures where topographic amplification occurs (see, Meunier et al. 2008). The fact that the relation is weak may be due to the Slope Unit scale. In fact, within a slope unit, the RSP signal can be disrupted because of the aggregation step from pixels to Slope Units. Smoothing the the original pixel-based RSP signal to an extent where the link between landslides and their triggering location along the topographic profile gets lost.

5.2 Coseismic susceptibility overview

The coseismic susceptibility map associated to the Jiuzhaigou earthquake shows a spatial pattern well correlated to the Jiuzhaigou PGA (Fig. 11), implying the dominance of the shaking signal onto the final model. We could only build and validate our model for the Jiuzhaigou case because the only earthquake-induced landslide inventory available in the area corresponds to the coseismic Jiuzhaigou landslides. Therefore, this model is our reference which we used to infer the PGA effect in the study area over the landslide occurrences and retro-project it to the previous seven earthquakes. It is important to note, that we could have modeled the PGA as a nonlinear ordinal property by reclassifying it and using a RW1 same as we did for the \(Slope_\mu\) and \(RSP_\mu\). However, in doing this we would have calibrated the model on a predefined PGA range, specific of the Jiuzhaigou earthquake (range between 0.08 and 0.36 g). As a result, it would have been a complex task to extrapolate the PGA effect for the other PGA maps (overall range between 0.02 and 0.2 g) outside the Jiuzhaigou PGA classified values. For this reason, here we opt to use the Jiuzhaigou \(PGA_\mu\) as a linear property, to extrapolate the PGA effect outside the Jiuzhaigou \(PGA_\mu\) limit. Thanks to this we simulated by using one single parameter distribution for the PGA effect and retrieved one thousand simulated scenarios for each past earthquake (see Fig. 12). As mentioned before, being the past epicenters mostly located to the south of the study area, there the mean simulated coseismic susceptibility show the highest values.

This is also reflected in the combined coseismic susceptibility maps shown in Fig. 13. The novelty in the simulation procedure we propose is clearly highlighted in this maps which, unless simulated could have not been produced otherwise. In fact, by incorporating different PGA contributions, our combined coseismic susceptibility essentially shows the best, average and worst susceptibility scenarios that the study area has theoretically experienced for almost a century. It is important to note that since the other covariates we have used are static (do not significantly change over time), we consider this approach to be valid. However, if other factors such as landuse, roads (cuts and embankments) and buildings (cutslopes) would be incorporated, these would have experienced significant changes in the period since 1933, as the development of the national park has led to many human interventions that might also have contributed to landslide occurrence. Therefore, we suggest that whether simulations in different periods would encompass time-varying covariates, their variation through time should also be expressed and included in the modeling procedure.

Nevertheless, the combined map we present in Fig. 13 is not exactly a conventional susceptibility map as it can be found in many other studies (e.g., Ercanoglu and Gokceoglu 2004; Lee et al. 2004; Van Westen et al. 2008). Our combined susceptibility incorporates a temporal dimension (limited to the availability of past scenarios) which makes it much closer to the definition of a landslide hazard map. By definition, the landslide hazard should include a return time or the expected frequency of a widespread landsliding event. Here, we propose a map which delivers the slope instability at the SU level for a period of 84 years (1933–2017). However, if a significant earthquake would occur in the direct surroundings of the study area, the landslide pattern might be still quite different, so the map can still not be considered a full predictive map for the coming century.

6 Conclusions

Projecting statistically-derived landslide susceptibility maps over temporal scales different from the one responsible for the specific event for which the model is calibrated is quite uncommon in the landslide literature (Lombardo and Tanyas 2020b). The very few cases where this is performed correspond to future times, where the land use is expected to change. This is the example of Reichenbach et al. (2014) and Pisano et al. (2017), however the implementation they propose neglects the uncertainty that affects the estimation of any regression parameters whereas our implementation follows a greater statistical rigor.

Our proposed approach is able to depict time-varying susceptibility patterns as a function of the space-time ground motion variability. However, we could only validate the reference model over one specific coseismic inventory. To test, whether simulations could be reliably made over different temporal scales, more earthquake-induced landslide inventories for the same area should be included to validate the simulations. And, a further improvement could also be made by accounting for additional ground motion effects such as topographic and soil amplifications.

Notably, our approach differs not only from traditional statistically-based susceptibility approaches but also from traditional PSHA methods used in ground-motion probability assessment. The latter includes an analogous plug-in scheme while using fixed empirical relations derived for physically-based properties. As a result, models that feature empirical relations suffer from uncertainty in the empirical relations but also in terms of model parameterization, as demonstrated by Wang and Rathje (2015). Our work derives statistical relations instead of empirical ones and essentially translates the uncertainty estimation routine in Wang and Rathje (2015) into the binomial GAM context.

Nevertheless, implementing statistical simulations for earthquake scenarios was never tested so far and especially in the study area, where the main landslide trigger is due to the strong seismicity. Therefore, our proposed method may deliver a much more relevant information to local authorities compared to traditional susceptibility models, especially in the case where the scarcity of data on soil and rock characteristics limit the application of physically-based methods for earthquake-induced landslide hazard assessment, such as those presented by Wang and Rathje (2015). In fact, the usual procedure consists of building a susceptibility model trained by using past landslides and either including the responsible ground motion (thus being overly specific) or without it (thus neglecting the spatial dependence in the model induced by the shaking levels).

We also stress here that retrieving past coseismic susceptibility patterns are just one application of statistical simulations. In fact, one could also project the simulations towards the future by incorporating scenario-based ground motion. By doing so, one could estimate future landslide-prone areas prior to a potential earthquake occurrence and plan ahead structural design of infrastructure. An example that goes in this research direction can be found in the companion paper submitted by Lombardo and Tanyas (2020a).