Introduction

An earthquake-induced landslide event refers to landslides triggered by a particular earthquake. Such landslides are one of the most destructive secondary hazards associated with earthquakes in mountainous environments (e.g., Jibson et al. 2000). Therefore, the estimation of earthquake-induced landslide hazard is an important risk mitigation component in seismically active mountainous areas (Wasowski et al. 2011).

Earthquake-induced landslide (EQIL) inventories are the primary data source to extend our knowledge of the relationship between earthquakes and the landslides they can trigger (e.g., Tanyaş et al. 2017). Using an EQIL inventory, we can assess the distribution of landslides and better evaluate the total earthquake impacts considering this secondary seismic hazard (e.g., Robinson et al. 2017). An EQIL event is characterized by the distribution of landslides caused by a single earthquake.

The impact of EQIL events can be quantified using landslide inventories (e.g., Malamud et al. 2004). Keefer (1984) used the number of triggered landslides (NLT) to define an EQIL-event magnitude scale (mLS), which quantifies the severity of the event, and it is defined as follows:

$$ \mathrm{mLS}=\log {\mathrm{N}}_{\mathrm{LT}} $$
(1)

According to the method proposed by Keefer (1984), an EQIL event that triggers 10²–10³ landslides is classified as “class 2,” one that triggers 10³–10⁴ landslides as “class 3,” and so on. This concept is important because a quantitative approach simplifies a complex phenomenon into a single, or a few, standard values (i.e., landslide-event magnitudes) that can be compared between triggering events, which allows a better evaluation of the relation between landslide causes and impacts (Tanyaş et al. 2018a).
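
As a minimal sketch of Eq. 1 and the class assignment described above: the base-10 logarithm is assumed from the class boundaries at powers of ten, the lower boundary of each class is assumed to be inclusive, and the function names are illustrative rather than taken from the cited works.

```python
import math

def landslide_event_magnitude(n_landslides: int) -> float:
    """Eq. 1: mLS = log10(N_LT); base 10 is an assumption of this sketch."""
    if n_landslides < 1:
        raise ValueError("at least one triggered landslide is required")
    return math.log10(n_landslides)

def keefer_class(n_landslides: int) -> int:
    """Return k such that 10**k <= N_LT < 10**(k + 1) (integer arithmetic only)."""
    if n_landslides < 1:
        raise ValueError("at least one triggered landslide is required")
    k = 0
    while 10 ** (k + 1) <= n_landslides:
        k += 1
    return k

print(landslide_event_magnitude(5000))  # between 3 and 4
print(keefer_class(5000))               # class 3
```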

Malamud et al. (2004) used Keefer’s (1984) method to define mLS (Eq. 1) and improved this method using the size statistics of the landslides associated with various triggers such as earthquakes, rapid snowmelt, or large storms. They established that the frequency-area distribution of landslides follows an inverse power law for medium- to large-sized landslides, while the distribution shows a rollover at smaller landslide sizes. They modeled the frequency-area distribution of three well-documented event inventories and defined empirical curves to identify mLS. Tanyaş et al. (2018b) examined the frequency-area distributions of 45 earthquake-induced landslide inventories and showed that the form of the rollover does not follow the modeled empirical distribution curves. They noted that the power-law tail is the most important part of the frequency-area distribution because it gives insight into the characteristics of the landslide size distribution and contains the greatest volume of material (e.g., Bennett et al. 2012).

Many studies make use of the empirical distribution of landslide sizes, independently of the trigger of the landslide event (Malamud et al. 2004). For example, Guzzetti et al. (2005) extracted the probability of landslide size from frequency-size statistics of landslides and used this information for quantitative analysis of landslide hazard. The power-law region of the distribution can also be reproduced by different physically based models (Alvioli et al. 2014, 2018b; Hergarten 2012).

A magnitude scale for landslide events can be defined by identifying the power-law fits for medium and large landslides. Thus, even if the examined landslide inventory is partial (i.e., some small landslides are missing), the assigned mLS is equivalent to the one associated with a complete landslide event, because it is based on a frequency-area distribution obtained by properly rescaling a frequency density curve to the measured distribution in the power-law region, as in Malamud et al. (2004). Malamud et al. (2004) also proposed equations to estimate the maximum landslide area (ALmax) (Eq. 2) and total landslide area (AT) (Eq. 3) triggered by one event (e.g., earthquake, rainstorm) in relation to mLS, defined as follows:

$$ {A}_{\mathrm{Lmax}}=1.10\times {10}^{-3}\times {N}_{\mathrm{LT}}^{0.714} $$
(2)
$$ {A}_{\mathrm{T}}=3.07\times {10}^{-3}\times {10}^{\mathrm{mLS}} $$
(3)
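
Eqs. 2 and 3 can be evaluated directly from mLS, as in the following minimal sketch. It assumes that N_LT is recovered from mLS by inverting Eq. 1 with a base-10 logarithm and that areas are in km², consistent with the values reported later in the text; the function names are illustrative.

```python
def number_of_landslides(m_ls: float) -> float:
    """Invert Eq. 1 (assumed base 10): N_LT = 10**mLS."""
    return 10.0 ** m_ls

def max_landslide_area_km2(m_ls: float) -> float:
    """Eq. 2: A_Lmax = 1.10e-3 * N_LT**0.714 (km2 assumed)."""
    return 1.10e-3 * number_of_landslides(m_ls) ** 0.714

def total_landslide_area_km2(m_ls: float) -> float:
    """Eq. 3: A_T = 3.07e-3 * 10**mLS (km2 assumed)."""
    return 3.07e-3 * 10.0 ** m_ls

for m_ls in (2.0, 3.0, 4.0):
    print(m_ls, max_landslide_area_km2(m_ls), total_landslide_area_km2(m_ls))
```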

Regarding the estimation of mLS, Tanyaş et al. (2018a) introduced an updated method that better fits the observations. They determined a slope (power-law exponent) of the power-law fit for each specific landslide inventory and used this value instead of the average value (2.4) used by Malamud et al. (2004) to define the empirical frequency-area distribution curves. To construct the empirical curves, Tanyaş et al. (2018a) rotated the power-law fits around a reference point identified considering the most reliable EQIL inventories. They then determined the mLS using the constructed empirical frequency-area distribution curves. They also checked the variation in mLS in their proposed method based on different reference points and identified 95% confidence limits for various mLS intervals (Table 1). The mLS values determined by Tanyaş et al. (2018a) are presented in Table 2.

Table 1 Variation in mLS (Tanyaş et al. 2018a)
Table 2 EQIL inventories used in this study

Tanyaş et al. (2018a) also proposed an updated equation to estimate total landslide area (AT) triggered by an earthquake in relation with mLS (Eq. 4):

$$ {A}_{\mathrm{T}}=0.0125{e}^{\left(1.7651\times \mathrm{mLS}\right)} $$
(4)
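
To illustrate how Eq. 4 differs from Eq. 3, the sketch below evaluates both for a few mLS values. The function names are illustrative, and the km² unit is an assumption based on the areas reported later in the text.

```python
import math

def total_area_eq3_km2(m_ls: float) -> float:
    """Eq. 3 (Malamud et al. 2004): A_T = 3.07e-3 * 10**mLS."""
    return 3.07e-3 * 10.0 ** m_ls

def total_area_eq4_km2(m_ls: float) -> float:
    """Eq. 4 (Tanyas et al. 2018a): A_T = 0.0125 * exp(1.7651 * mLS)."""
    return 0.0125 * math.exp(1.7651 * m_ls)

# Side-by-side comparison for a range of event magnitudes.
for m_ls in (2.0, 3.0, 4.0, 5.0):
    print(f"mLS={m_ls:.1f}  Eq.3={total_area_eq3_km2(m_ls):.2f}  "
          f"Eq.4={total_area_eq4_km2(m_ls):.2f}")
```

Because exp(1.7651) is smaller than 10, Eq. 4 grows more slowly with mLS than Eq. 3, so the two estimates diverge for large events.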

However, calculating mLS requires a landslide inventory, which is not available for most landslide-triggering earthquakes. The preparation of a landslide inventory is a tedious process (e.g., Wasowski et al. 2011), despite advances in mapping techniques, and it may take months to complete when based on visual image interpretation, or weeks when based on (semi-) automated image classification (Martha et al. 2010). In any case, the time required to create an EQIL inventory is too long to provide information for the rapid emergency response phase after an earthquake (Robinson et al. 2017).

To capture the effect of an EQIL-event without having an inventory, some statistical relations were proposed, using a global dataset, between earthquake magnitude and the area affected by landslides or the maximum landslide distance, either from the epicenter or the rupture zone (Keefer 1984; Rodriguez et al. 1999). However, Jibson and Harp (2012) found that the proposed landslide distance buffers differ between plate-boundary earthquakes and intraplate earthquakes, where seismic wave attenuation is generally much lower and thus the proposed relation could not be used for accurate estimation of any of these landslide distance limits.

Marc et al. (2016) proposed an expression to estimate the total volume and area of EQIL. Their expression is based on seismogenic characteristics (e.g., seismic moment and asperity depth), landscape steepness, and material sensitivity (rock strength and pore pressure). However, the required inputs such as the parameters describing rock strength, earthquake asperity depth, and ground motion attenuation are often not precisely known (Li et al. 2017).

Given these circumstances, rapid prediction of the mLS of EQIL events could provide valuable information not only for studies of landscape evolution (e.g., Malamud et al. 2004) and hazard assessment (Guzzetti et al. 2005) but also for emergency response applications. With a rapid prediction of mLS, we could evaluate the severity of an EQIL event in near-real time.

In this study, we used 23 EQIL inventories and their mLS values calculated by Tanyaş et al. (2018a). We propose a method to predict mLS that can lead to estimates of the total triggered landslide area, total landslide volume, and frequency-area distribution of landslides. We construct a stepwise linear regression model using both seismogenic and morphologic predictors. We predict the mLS of EQIL events and validate our method using the leave-one-out technique.

Materials

Available data

An EQIL inventory database including 66 inventories from around the world was presented by Tanyaş et al. (2017), which included detailed information regarding their mapping methodologies. From this database, Tanyaş et al. (2018a) examined the inventories for which landslide area information is available and calculated the mLS values for 45 EQIL inventories from 32 earthquakes (Fig. 1). We examined those 45 EQIL inventories which were analyzed by Tanyaş et al. (2018a) in terms of their mLS values and excluded some of them following the inventory selection criteria presented below. The list of EQIL inventories, their main characteristics, and references are presented in Table 2.

Fig. 1 Distribution of examined earthquakes with a landslide inventory listed in Table 2

We used both seismogenic and morphologic independent variables in a linear regression analysis. As seismogenic variables, we collected earthquake magnitudes and the estimated values of peak ground acceleration (PGA), peak ground velocity (PGV), and Modified Mercalli Intensity (MMI) from the US Geological Survey (USGS) ShakeMap system (Allen et al. 2008; Garcia et al. 2012). The ShakeMap system provides the deterministic estimates of ground-motion parameters in near-real time. Additionally, we used Global Centroid-Moment Tensor (CMT) half duration (the duration of the rupture process) (Dziewonski et al. 1981; Ekström et al. 2012) as another seismogenic variable.

We used the Shuttle Radar Topography Mission (SRTM) digital elevation model (about 30-m resolution) (NASA Jet Propulsion Laboratory (JPL) 2013) to create morphologic variables.

Selection of inventories

Each of the available EQIL inventories has a varying level of quality and completeness, which are difficult to assess both quantitatively and qualitatively due to the lack of metadata regarding mapping preferences and the subjectivity of the mapping procedure. We checked the mapping techniques of the selected landslide inventories to get a general idea about the quality of mapping. In each inventory, the landslide-affected area was analyzed systematically by visual interpretation of satellite images and/or aerial photography. In addition, Tanyaş et al. (2017) introduced an evaluation system to help users assess the suitability of the available inventories for different types of studies. They listed four essential criteria to check whether an inventory is suitable for a landslide susceptibility or hazard assessment, or for investigating the distribution, types, and patterns of landslides in relation to morphological and geological characteristics (Table 3). Based on this approach, Tanyaş et al. (2017) assigned scores to each inventory. We report those scores in Table 2 to give a general idea about the quality of mapping in the examined inventories. The scores show that each inventory meets at least half of the criteria, and we therefore decided to use these inventories in this study.

Table 3 Evaluation scheme for EQIL inventories (Tanyaş et al. 2017)

Considering other available information about inventories provided by Tanyaş et al. (2017), we discarded several of them to increase the reliability of the applied method. The list of selected EQIL inventories and the exclusion criteria are presented in Table 2.

We excluded incomplete EQIL inventories for which we know that only part of the landslide-affected area was mapped. For example, the 1989 Loma Prieta EQIL inventory is such a partial inventory where McCrink (2001) only mapped part of triggered landslides to test a dynamic slope stability method. Similarly, part of the landslide-affected area associated with the 2006 Kiholo Bay earthquake was mapped in detail by Harp et al. (2014) to check if the landslide-distribution pattern is predictable using a high-resolution ground-motion simulation model. EQIL inventories that can be attributed to more than one earthquake were also excluded, such as the 1980 Mammoth Lakes (Harp et al. 1984), the 1993 Finisterre (Meunier et al. 2008), the 1997 Umbria-Marche (Marzorati et al. 2002), and the 2004 Mid-Niigata (GSI of Japan 2005; Sekiguchi and Sato 2006; Yagi et al. 2007). In each of these inventories, the earthquake associated with the triggered landslides is not clear, and thus this can cause a problem in the representation of seismogenic variables regarding these inventories. Also, we excluded the 2007 Niigata Chuetsu-Oki inventory (Kokusai Kogyo 2007) because pre-earthquake landslides were not eliminated in this inventory (Collins et al. 2012). If we have more than one inventory for the same earthquake, we only included the one that has the largest number of landslides and covers the largest area (Table 2). We also excluded the earthquakes without ShakeMap data, such as the 1998 Jueili and 2007 Aysen Fjord earthquakes. For the rest of the inventories, we checked the uncertainties of the ShakeMaps data. The relative uncertainty level of each ShakeMap output is described by a quality grading developed by Wald et al. (2008). The grades of the selected ShakeMaps data (Table 2) show that none of them belongs to the poorest grades, which are D and F.

Methods

Delineation of the geographical boundary of a landslide event is usually no trivial task. For example, in the case of inventories prepared by field campaigns, a crucial step is to determine the area that was actually surveyed by the researchers (Bornaetxea et al. 2018; Guzzetti et al. 2012). Inventories prepared by visual interpretation of aerial or satellite imagery (Alvioli et al. 2018c; Casagli et al. 2017; Guzzetti et al. 2012), as is the case for many of the inventories considered in this work, should indicate the boundary of the available images, or the actual area mapped. However, in many cases, this information is not available.

The peak ground acceleration (PGA) contours, which show a correlation with landslide density (e.g., Meunier et al. 2007), were used to identify the landslide-affected area. Wilson and Keefer (1985) were the first to propose a minimum PGA threshold of 0.05 g for such a boundary. They used the data gathered by Keefer (1984) regarding the 40 EQIL inventories. However, EQIL inventory maps were only available for a few of the 40 reported earthquakes (Tanyaş et al. 2017), and the general relations and conclusions reported were pieced together from various resources, listed in Keefer and Tannaci (1981). Similar minimum PGA thresholds that cover all triggered landslides were also reported for individual EQIL inventories: 0.01 g for the 1980 Irpinia earthquake (Del Gaudio and Wasowski 2004) and 0.02–0.04 g for the Mineral, Virginia earthquake (Jibson and Harp 2012). Recently, Jibson and Harp (2016) analyzed six EQIL events and explored the absolute minimum PGA value considering the very smallest failures (< 1 m³) triggered by the corresponding earthquakes. They examined four of those inventories by field studies and showed that the PGA contour covering all landslides ranges from 0.02 to 0.08 g. They investigated the two other inventories using aerial-photographic interpretation and reported a PGA range of 0.05–0.11 g as an absolute outermost limit of triggered landslides.

Jibson and Harp (2016) also stated that the proposed outermost limits of triggered landslides can only be valid where susceptible slopes are extensive. Yet, the actual area that is affected by landslides depends on the local topographic, lithologic, climatic, and land cover conditions, which are different for each earthquake-affected area, and the interaction between these features and ground shaking causes the specific landslide distribution pattern. Thus, for some of the inventories such a common PGA limit could be larger or smaller than the real landslide-affected area. In this study, we also assumed that the susceptible slopes are extensive in our examined sites to estimate the boundary of a landslide-affected area.

Note that in the case of EQIL, there can be a significant difference between the area that includes the entire landslide population and one that includes the vast majority (e.g., 90%) of it. Hancox et al. (2002) use the term “main area affected by landslides.” Although that paper does not explain the parameter, we adopted the term here, modifying it slightly to the main landslide-affected area, and defined it as the area containing 90% of the mapped landslides. To delineate the main landslide-affected area, we examined the inventories and systematically calculated the percentage of the total number of landslides contained within various PGA contours. We examined the PGA contours provided by the USGS ShakeMap system from highest to lowest and continued until we found the PGA contour covering 90% of the mapped landslides. All other analyses were conducted for the identified main landslide-affected areas.
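
The contour search described above can be sketched as follows. The sketch assumes that a PGA value has already been sampled at each mapped landslide and that a landslide counts as inside a contour when its PGA is at least the contour value; the function name, the sample values, and the contour levels are hypothetical.

```python
def main_affected_contour(pga_at_landslides, contour_levels, target=0.90):
    """Scan contours from highest to lowest PGA and return the first one
    enclosing at least `target` of the landslides, with its coverage."""
    n_total = len(pga_at_landslides)
    for level in sorted(contour_levels, reverse=True):
        # A landslide is inside the contour if its PGA is >= the contour value.
        covered = sum(1 for pga in pga_at_landslides if pga >= level)
        if covered / n_total >= target:
            return level, covered / n_total
    return None  # no contour reaches the target coverage

# Hypothetical PGA values (in g) sampled at ten mapped landslides.
pga_samples = [0.50, 0.45, 0.40, 0.40, 0.35, 0.30, 0.30, 0.25, 0.20, 0.10]
levels = [0.12, 0.24, 0.36, 0.48]
print(main_affected_contour(pga_samples, levels))
```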

Eliminating flat regions as zones not susceptible to landsliding is a generally accepted approach in landslide modeling studies (e.g., Kritikos et al. 2015). Thus, we delineated those regions and subtracted them from the main landslide-affected areas. To identify the flat areas, we used the GRASS GIS module r.geomorphon by Jasiewicz and Stepinski (2013) to extract the “flat” landform class, together with an algorithm developed by Alvioli et al. (2018a) that removes sparse pixels from the result. The algorithm starts from the pixels classified as “flat” by r.geomorphon, shrinks the borders of the flat raster map by a few pixels, and then grows it again; the procedure is repeated until sparse pixels disappear.
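
The shrink-and-grow idea can be illustrated with a small pure-Python sketch on a boolean grid. This is not the implementation of Alvioli et al. (2018a): the 3 × 3 neighborhood, the handling of raster edges (cells outside the raster are ignored), and the single shrink-grow pass are assumptions made here for illustration.

```python
def _neighbors(r, c, nrows, ncols):
    """Yield the in-bounds cells of the 3x3 window centered on (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < nrows and 0 <= cc < ncols:
                yield rr, cc

def shrink(mask):
    """Keep a flat cell only if every in-bounds neighbor is flat."""
    nrows, ncols = len(mask), len(mask[0])
    return [[all(mask[rr][cc] for rr, cc in _neighbors(r, c, nrows, ncols))
             for c in range(ncols)] for r in range(nrows)]

def grow(mask):
    """Mark a cell as flat if any in-bounds neighbor is flat."""
    nrows, ncols = len(mask), len(mask[0])
    return [[any(mask[rr][cc] for rr, cc in _neighbors(r, c, nrows, ncols))
             for c in range(ncols)] for r in range(nrows)]

def remove_sparse_flat_pixels(mask, pixels=1):
    """Shrink the flat mask by `pixels` cells, then grow it back."""
    out = [[bool(v) for v in row] for row in mask]
    for _ in range(pixels):
        out = shrink(out)
    for _ in range(pixels):
        out = grow(out)
    return out

# Hypothetical 6x6 "flat" mask: a 3x3 flat patch plus one isolated flat pixel.
flat = [[r < 3 and c < 3 for c in range(6)] for r in range(6)]
flat[4][4] = True
cleaned = remove_sparse_flat_pixels(flat)
print(sum(map(sum, cleaned)))  # number of flat cells left after cleaning
```

In this toy case the compact patch survives the pass while the isolated pixel is removed, which is the behavior the paragraph above describes.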

In our regression model, we did not use variables such as lithology, land cover, or climate, whose contribution to landsliding we could not evaluate. For example, we did not include lithologic units because, without knowing their geotechnical properties, the description of a lithologic unit is not enough to evaluate its role in the landslide initiation process. Instead, we used morphologic variables that have been used in statistical landslide probability assessments (e.g., Budimir et al. 2015; Reichenbach et al. 2018). For example, Budimir et al. (2015) examined EQIL causal factors in their review paper. They investigated nine studies and presented the percentages at which covariates were found to be significant. Budimir et al. (2015) stated that slope was found to be a significant variable in all those studies. Distance to streams was found to be significant in at least 20% of those studies, while profile curvature, topographic wetness index (TWI), and surface roughness were found to be significant in at least 10% of them. Tanyaş et al. (2017) analyzed about 554,000 landslide initiation points from 46 EQIL events and examined the frequency of earthquake-induced landslides in intervals of slope, surface roughness, local relief, and distance to streams. They stated that the highest landslide frequencies are concentrated in particular intervals for all of these parameters. This implies that these variables are good candidates to test for significance in our regression analysis as well.

Slope is a factor controlling the normal and shear stresses, which play a role in slope stability. Local relief is the maximum difference in height in a local neighborhood of each pixel and can be related to slope instability caused by tectonic uplift. It partially correlates with slope. Both slope and local relief are related to the magnitude of static stress loading in hillslopes (Parker et al. 2015). TWI (Moore et al. 1991) is a proxy for potential soil wetness used to estimate the spatial variability of wetness within a landscape (e.g., Nowicki Jessee et al. 2018). It can play a role in slope stability by changing the pore water pressure. We used the vector ruggedness measure (VRM) to account for surface roughness. It quantifies local variation in terrain more independently of slope than other methods such as the land surface ruggedness index or the terrain ruggedness index (Sappington et al. 2007). Tanyaş et al. (2017) showed that the majority of EQIL are initiated at low VRM values, and the number of observed EQIL decreases as VRM increases. Distance to stream is a proxy related to fluvial undercutting (e.g., Kritikos et al. 2015), which causes high shear stress as a result of the loss of lateral support (Korup 2004). Tanyaş et al. (2017) showed that the majority of EQIL are initiated close to river channels and that the frequency of observed landslides gradually decreases with distance from channels. Profile curvature is a measure describing the concavity/convexity of a slope along the vertical direction. A concave surface can increase slope instability by increasing the subsurface drainage, which can cause high water pressure (Pierson 1980).

To create the morphologic variables used as covariates in our regression model, we worked with several modules of GRASS GIS (Neteler and Mitasova 2013) and SAGA GIS (Conrad et al. 2015). In total, we derived six DEM derivatives (Table 4), using the module given in parentheses: slope (r.slope.aspect) (Hofierka et al. 2009), topographic wetness index (r.topidx) (Cho 2000), vector ruggedness measure (r.vector.ruggedness) (Sappington et al. 2007), distance to stream (r.watershed and r.grow) (Ehlschlaeger 1989), local relief (r.geomorphon) (Jasiewicz and Stepinski 2013), and profile curvature (r.param.scale) (Wood 1996).

Table 4 List of independent variables

We also tested five seismogenic variables (PGA, PGV, MMI, earthquake magnitude, and half duration) in the linear regression analysis (Table 4). MMI is a scale classifying the shaking strength observed at a site. PGA is the largest peak acceleration recorded in a strong-motion accelerogram of an earthquake, while PGV is the largest increase in velocity experienced by a particle on the ground during an earthquake (Bormann et al. 2013). If variables such as the fault-rupture mechanism and fault geometry are known, they are also taken into account when a ShakeMap is created (e.g., Wald 2013). Therefore, we can assume that the fault-rupture mechanism and fault geometry are represented by the resultant ground-motion parameters provided by ShakeMap. One of these ground-motion parameters is used in almost all statistically based EQIL prediction models (e.g., Nowicki Jessee et al. 2018; Nowicki et al. 2014; Robinson et al. 2017; Tanyas et al. 2019). PGA, PGV, and MMI are collinear variables, and thus we considered all three of them to identify the most significant ground-motion parameter for this study. The other two seismogenic variables, earthquake magnitude and half duration, are proxies for the energy released by rupturing and the duration of rupturing, respectively.

Apart from two independent variables (earthquake magnitude and half duration) which do not have any variation within a landslide-affected area, we calculated both mean value and its standard deviation for each independent variable to represent the characteristics of main landslide-affected areas.

We evaluated the significance of each variable used in the linear regression model based on p values. We selected a significance level of 5%, that is, a p value threshold of 0.05, below which the relation between the examined independent and dependent variables was considered significant (Moore et al. 2012). To decide on the best predictor subset, we ran the stepwise linear regression algorithm provided by Matlab (Version R2017b). We applied a forward feature selection method, which searches for covariates to add to the model based on p value. At each step, the algorithm tests the model with and without a potential covariate. It tests not only the individual terms but also their interactions (e.g., multiplication of variables). If any of the candidate covariates not yet in the model has a p value less than 0.05, the one with the smallest p value is added to the model, and this procedure is repeated until all significant covariates are included. This procedure provided us with the set of covariates giving the best model performance. We then checked the collinearity between those variables using the variance inflation factor (VIF) (Chatterjee and Hadi 2012); a VIF larger than 10 is taken as an indication of collinearity.
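
As a minimal illustration of the VIF check, the sketch below covers only the special case of two predictors, where the VIF of either predictor reduces to 1 / (1 − r²), with r the Pearson correlation between them. With more predictors, the R² in the VIF comes from regressing each predictor on all the others, which is not shown here. The data and function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 equals r^2."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictor values for five events.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_predictors(x1, x2)
print(vif, "collinear" if vif > 10 else "acceptable")
```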

Because we have limited observations, to validate our model, we used the leave-one-out methodology and predicted mLS values for each earthquake using the described stepwise linear regression algorithm. Considering p values, we selected the best predictor subset and the corresponding best model.
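
The leave-one-out procedure can be sketched as follows. For brevity, a single-predictor ordinary least-squares fit stands in for the stepwise multi-variable model used in this study; that substitution, the function names, and the data are assumptions of the sketch.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def leave_one_out_predictions(x, y):
    """Predict each observation from a model fitted to all the others."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]   # drop observation i
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        predictions.append(intercept + slope * x[i])
    return predictions

# Hypothetical predictor values and calculated mLS values.
x = [5.5, 6.0, 6.6, 7.0, 7.6, 7.9]
y = [2.1, 2.8, 3.1, 3.9, 4.6, 5.0]
for observed, predicted in zip(y, leave_one_out_predictions(x, y)):
    print(f"calculated={observed:.2f}  predicted={predicted:.2f}")
```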

Results

To define the main landslide-affected area, we compared the PGA values of the contours covering various fractions of the landslide population. For example, for the Haiti inventory (Harp et al. 2016), PGA contours of 0.23 g, 0.36 g, 0.41 g, and 0.48 g contain 100%, 90%, 80%, and 70% of the entire mapped landslide population, respectively. We calculated these values for all inventories. Table 5 shows the PGA values and the percentage of the total number of landslides falling within these limiting PGA contours for each inventory. Table 5 shows that, except for the 2007 Pisco, Peru earthquake (Mw 8.0), 0.12 g is the minimum PGA contour covering at least 90% of the mapped landslides in each inventory. The 2007 Pisco earthquake is an offshore event for which a significant part of the area with large PGA is located at sea. As a result, for this earthquake, the 0.12-g PGA contour covers only about 80% of the mapped landslides (Table 5). Given these observations, we took the 0.12-g PGA contour as an estimate for the boundary of the main landslide-affected area. This PGA value is slightly larger than the PGA range (0.05–0.11 g) indicated in the literature (e.g., Jibson and Harp 2016) as the outermost limit of EQIL, as expected for a contour that encloses 90% rather than all of the landslides, and is thus consistent with the literature.

Table 5 PGA contours and percentages of their landslide coverage for each inventory. The italicized PGA values are the ones that are higher than PGA 0.12 g

We calculated our predictors for the area bounded by the 0.12-g PGA contour in each landslide-affected area. The stepwise regression algorithm identified five predictors as the best subset explaining our dependent variable: Earthquake magnitude (EqM), profile curvature (mean), profile curvature (std), TWI (mean), and EqM × TWI (mean) (Table 6). The regression equation is as follows:

$$ \mathrm{mLS}=-262.6393+40.3712\times \left[\mathrm{EqM}\right]+9160.0595\times \left[\mathrm{profile}\ \mathrm{curvature}\ \left(\mathrm{mean}\right)\right]-204.9325\times \left[\mathrm{profile}\ \mathrm{curvature}\ \left(\mathrm{std}\right)\right]+40.0981\times \left[\mathrm{TWI}\ \left(\mathrm{mean}\right)\right]-6.0393\times \left[\mathrm{EqM}\times \mathrm{TWI}\ \left(\mathrm{mean}\right)\right] $$
(5)
Table 6 Results of the model developed using the selected five covariates

The regression model run using these predictors shows that each predictor has a p value less than 0.05, and thus they are all significant in our model. We checked the collinearity between predictors using VIF. We excluded our interaction term (EqM × TWI (mean)) from the collinearity evaluation (Friedrich 1982). The results show that VIF values for all other variables are less than two, and thus collinearity is not an issue for the selected variables. Among the selected variables, earthquake magnitude (EqM), profile curvature (mean), and TWI (mean) have explicit physical meaning in our regression equation in addition to their statistical significance. On the other hand, profile curvature (std) and the interaction term (EqM × TWI (mean)) have only statistical significance.
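
Eq. 5 can be written as a small function, as sketched below. The coefficient on EqM is entered with a positive sign, which is an assumption of this sketch: with a negative sign the equation cannot return mLS values in the observed range for realistic mean TWI values. The function name, argument names, and input values are hypothetical and are not taken from any inventory.

```python
def predict_mls(eqm, curv_mean, curv_std, twi_mean):
    """Evaluate the regression of Eq. 5.

    eqm       : earthquake magnitude
    curv_mean : mean profile curvature over the main landslide-affected area
    curv_std  : standard deviation of profile curvature over that area
    twi_mean  : mean topographic wetness index over that area
    """
    return (-262.6393
            + 40.3712 * eqm            # sign taken as positive (assumption)
            + 9160.0595 * curv_mean
            - 204.9325 * curv_std
            + 40.0981 * twi_mean
            - 6.0393 * eqm * twi_mean)  # interaction term EqM x TWI (mean)

# Purely illustrative inputs, not measured values.
print(predict_mls(eqm=6.6, curv_mean=0.0001, curv_std=0.01, twi_mean=7.0))
```

Because of the interaction term, the net sensitivity of the predicted mLS to earthquake magnitude is 40.3712 − 6.0393 × TWI (mean), so it depends on the terrain wetness of the affected area.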

We present the adjusted R², root-mean-square error (RMSE), and mean absolute error (MAE) values for the best-fit line (Fig. 2). The adjusted R² value shows that the model explains 86% of the variability of the response data around its mean. The RMSE is 0.39, and the mean absolute difference between predicted and calculated mLS values (MAE) is 0.30.
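
For reference, the three performance measures can be computed as in the following sketch, which uses the standard definitions (adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), with n observations and p predictors). The function names and the data are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def adjusted_r2(observed, predicted, n_predictors):
    """Adjusted coefficient of determination for a model with p predictors."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

# Hypothetical calculated and predicted mLS values.
calculated = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
predicted = [2.5, 2.5, 4.5, 4.5, 6.5, 6.5, 8.5, 8.5]
print(rmse(calculated, predicted), mae(calculated, predicted),
      adjusted_r2(calculated, predicted, n_predictors=1))
```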

Fig. 2 Graph showing the model result. The confidence intervals, shown by vertical error bars, are calculated for each prediction separately. Uncertainties in calculated mLS values are shown by ± 2σ error bars. Calculated mLS values are obtained from Tanyaş et al. (2018a)

To validate this model, for each predictor subset, we followed the leave-one-out technique and predicted the entire mLS array. Results show that adjusted R² is 0.79, RMSE is 0.50, and MAE is 0.40 (Fig. 3a). The residuals show a random distribution around a constant value without a distinct pattern, and the average residual value is 0.0004 (Fig. 3b). This supports our assumption that a linear dependence exists between mLS and the variables. The average uncertainty for the calculated mLS values, which is shown by horizontal error bars in Fig. 3a and vertical error bars in Fig. 3b, is 0.15. In a few cases (e.g., EQIL Inventory IDs 2, 10, 12, 20, and 21), the residuals are lower than the uncertainties in the calculated mLS values. These are the cases in which our predictions are most successful. In all cases, our predictions stay within the 95% confidence limits for the best-fit line passing through the origin (Fig. 3a).

Fig. 3 Graphs showing the results of validation using the leave-one-out methodology: (a) the distribution of calculated versus predicted mLS values and the best-fit line passing through the origin and (b) the residuals for predicted mLS values. The confidence intervals, shown by vertical error bars, are calculated for each prediction separately in (a). The uncertainties in calculated mLS values (± 2σ) are given by horizontal error bars in (a) and vertical error bars in (b). Calculated mLS values are obtained from Tanyaş et al. (2018a). The numbers in the lower graph refer to the EQIL inventory IDs listed in Table 5

We can predict mLS and other measures that we can estimate using mLS, soon after an earthquake, in four steps (Fig. 4): (i) the PGA map of an investigated earthquake is obtained from USGS ShakeMap system and the SRTM DEM is obtained for the areas bounded by minimum PGA value of 0.12 g; (ii) the independent variables listed in Table 6 are collected/derived for non-flat areas; (iii) the proposed regression equation (Eq. 5) is run using the coefficients listed in Table 6 and mLS is predicted for the examined earthquake; and (iv) the maximum landslide area (Eq. 2) and the total landslide area (Eq. 4) are estimated using existing methodologies (Malamud et al. 2004; Tanyaş et al. 2018a). Further, the variation ranges for the estimated mLS are calculated using the confidence intervals given in Table 1. Frequency-size distribution of the examined landslide event can be estimated using the empirical curves proposed by Malamud et al. (2004).

Fig. 4 Flowchart for the proposed method

We used the 2004 Mid-Niigata earthquake as an example to show the application (Fig. 5) of the proposed method presented in Fig. 4. We have three landslide inventories (GSI of Japan 2005; Sekiguchi and Sato 2006; Yagi et al. 2007) for this earthquake, but all of them include landslides triggered by a sequence of earthquakes rather than a single main shock. Therefore, we discarded these inventories in the modeling stage (see Table 2): because they may include more landslides than a single earthquake would trigger, the mLS predicted for a single earthquake may be lower than the calculated mLS.

Fig. 5 Illustration of the four-step procedure (Fig. 4) to predict mLS and related parameters for the 2004 Mid-Niigata earthquake. In STEP-1, the 0.12-g PGA contour is identified as an estimate for the boundary of the main landslide-affected area. In STEP-2, the independent variables are determined for non-flat regions located within the main landslide-affected area. In STEP-3, the regression equation (Eq. 5) is run to predict mLS. In STEP-4, the number of triggered landslides (NLT) is estimated (Eq. 1) to predict the maximum landslide area (ALmax) (Eq. 2) and the total landslide area (AT) (Eq. 4). The empirical frequency-size distribution curve corresponding to the estimated mLS is also identified in STEP-4. The empirical frequency-size distribution curves presented in STEP-4 are taken from Malamud et al. (2004)

We predicted mLS using our proposed regression equation (Eq. 5). We also predicted the maximum landslide area (ALmax in Fig. 5) and the total landslide area (AT in Fig. 5) based on existing methodologies (Malamud et al. 2004; Tanyaş et al. 2018a). The predicted mLS (3.06 ± 0.33), maximum landslide area (ALmax = 0.16 km² (−0.07, +0.11)), and total landslide area (AT = 2.82 km² (−1.24, +2.12)) are close to the values calculated from the 2004 Mid-Niigata inventory map created by Yagi et al. (2007) (mLS = 3.11 ± 0.04; ALmax = 0.17 km²; AT = 3.80 km²). As expected, our predictions are lower than the values calculated from the three inventories (see Table 2) because those inventories contain overprinted landslides from different earthquakes. We did not predict total landslide volume using the equation suggested by Malamud et al. (2004) because the examined inventories do not include volume information against which to validate the prediction. We also estimated the frequency-area distribution of landslides (Fig. 5), which can be useful for quantitative landslide hazard assessment. Note that the form of the frequency-area distribution curve may not be representative for small landslides (Tanyaş et al. 2018b), and thus we suggest focusing on the power-law tail of this estimate.
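The first part of STEP-4, converting a predicted mLS into a number of triggered landslides, follows directly from inverting Eq. 1. The sketch below assumes a base-10 logarithm (consistent with the class boundaries of Keefer 1984) and treats the reported ± value as a symmetric interval on mLS; the function names are illustrative. Eqs. 2 and 4 are not reproduced here.

```python
import math


def landslide_count_from_mls(mls, half_width=0.0):
    """Invert Eq. 1 (mLS = log10 N_LT); return (low, central, high) counts."""
    return tuple(10.0 ** (mls + d) for d in (-half_width, 0.0, half_width))


def keefer_class(mls):
    """Order-of-magnitude class: 10^3 to 10^4 landslides is class 3, etc."""
    return math.floor(mls)


low, central, high = landslide_count_from_mls(3.06, 0.33)
print(f"N_LT ~ {central:.0f} (range {low:.0f} to {high:.0f}), "
      f"class {keefer_class(3.06)}")
```

For the Mid-Niigata prediction (mLS = 3.06 ± 0.33), this corresponds to roughly 1150 landslides, with a range of about 540 to 2450. Because the interval is symmetric in log space, it is asymmetric in the number of landslides.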

Discussion

The main advantage of our method is that it relies on both static and dynamic parameters, all of which are publicly available. The static predictors are DEM derivatives and can thus be easily derived for any location on the globe. Earthquake magnitude and ShakeMap data can be obtained from the USGS ShakeMap system in near-real time.

The proposed method also has limitations. As shown in Fig. 3, it gives poor predictions in a few cases, for several reasons. First, offshore events may not be well characterized. For offshore earthquakes, most of the area bounded by the 0.12-g PGA contour is not on land, and thus our morphological predictors may not represent the landslide-affected area well. Figure 3 shows that two offshore earthquakes have residuals larger than the MAE (0.40): the 1978 Izu Oshima Kinkai (3) and the 2010 Eastern Honshu (18) earthquakes give residuals of 0.57 and 0.50, respectively. Second, the quality of the ShakeMap data may affect model performance, since we identify the main landslide-affected area using the PGA values from the raster files provided by the USGS ShakeMap system. The relatively poor quality of the ShakeMap for the 2010 Yushu earthquake may explain why its residual (0.69) is larger than the average. Third, the inventories used to calculate mLS may be incomplete or may contain landslides that were not triggered by the specific earthquake. If these landslides are medium or large in size, the calculated mLS may be affected. Landslide mapping is a subjective procedure (e.g., Tanyaş et al. 2017): each inventory can be subject to various levels of amalgamation, and, depending on the quality of the inventory, the delineated polygons may differ to a minor or major degree from the actual landslide boundaries. However, evaluating the quality and completeness of the inventories is not possible without examining the landslides in the original imagery from which the inventories were made, which is very time-consuming. This implies an uncertainty in mLS that we could not assess quantitatively and that further studies need to address.

Fourth, the simplicity of the proposed method may be the main reason for poor prediction in some cases. We used earthquake magnitude (EqM), profile curvature (mean and standard deviation), topographic wetness index (TWI) (mean), and EqM × TWI (mean) (Table 6) to derive our regression equation. The mean values and standard deviations of these variables may not represent the landslide-affected area in a few cases, which affects the prediction performance. Moreover, we could not consider some variables that may play an important role in landslide initiation and thus affect the resulting landslide-event magnitude. For example, shear strength parameters of slope material are not available globally. Although a global lithologic map is available (e.g., Hartmann and Moosdorf 2012), estimating strength parameters from lithologic descriptions alone is not reliable. Similarly, we could not account for the effect of previous earthquakes (Parker et al. 2015) or of previous landslides (Samia et al. 2017) because no globally available datasets exist to quantify the effect of such variables.
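To make the simplicity of the model concrete, its structure can be sketched as a linear predictor with one interaction term built from the variables named above. This is an illustration only: the presence of an intercept is an assumption, the dictionary keys and function name are hypothetical, and the coefficient values are not reproduced here and must be taken from Table 6.

```python
def predict_mls(coef, eq_m, curv_mean, curv_std, twi_mean):
    """Evaluate a linear model with an EqM x TWI interaction term.

    coef maps hypothetical keys to fitted coefficients (see Table 6).
    """
    return (
        coef["intercept"]
        + coef["eq_m"] * eq_m                    # earthquake magnitude
        + coef["curv_mean"] * curv_mean          # profile curvature, mean
        + coef["curv_std"] * curv_std            # profile curvature, std. dev.
        + coef["twi_mean"] * twi_mean            # topographic wetness index, mean
        + coef["eq_m_x_twi"] * eq_m * twi_mean   # interaction term
    )
```

Each event is thus described by only four summary numbers, which is why strongly heterogeneous landslide-affected areas may be poorly represented.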

Finally, a considerable drawback of this study is the limited number of digital landslide inventories available. Although we worked with the largest EQIL dataset available (Tanyaş et al. 2017), the number of selected inventories is still small given the variation in the seismogenic and environmental characteristics of the examined landslide events. With a larger EQIL inventory database, landslide events could be categorized based on common features, and separate regression coefficients could be calculated for each category. For example, offshore earthquakes could be analyzed separately to address the drawback mentioned above. Similarly, categorizing earthquakes by faulting mechanism would be possible with a larger database. Although the ground motion estimates provided by ShakeMap take into account characteristics of the faulting mechanism such as fault type and geometry (e.g., Wald 2013), categorizing the inventories by these features may also help to improve our mLS predictions. Currently, we have only 10 landslide events associated with strike-slip faulting and 13 events associated with thrust faulting, and no EQIL inventory associated with normal faulting (Table 2), which is not sufficient to define separate categories.

Conclusions

We analyzed 23 EQIL inventories to develop an approach for the near-real-time prediction of the landslide-event magnitude scale. We restricted our analyses to non-flat regions located within the main landslide-affected areas, which were identified using the PGA contour containing 90% of the landslides and the largest PGA values. For each of the main landslide-affected areas, we calculated the mean values and standard deviations of three independent seismogenic and six morphologic variables (Table 4). Additionally, we gathered the earthquake magnitude and half duration for each earthquake, and thus examined 20 variables in total. We assumed that mLS depends linearly on the variables and identified five variables as the best subset of the independent parameters using a stepwise linear regression algorithm. Using the selected subset of variables, we identified the coefficients of the regression model and, because the number of observations is limited, validated the model using the leave-one-out approach.
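The leave-one-out idea can be sketched as follows: the model is refitted once per event, each time predicting the event that was held out. This is a generic illustration using ordinary least squares in NumPy, not the exact software or variable-selection procedure used in this study; the function name is illustrative.

```python
import numpy as np


def leave_one_out_errors(X, y):
    """Refit ordinary least squares n times, each time predicting the held-out row.

    Returns the held-out residuals together with their RMSE and MAE.
    """
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the model includes an intercept.
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])

    residuals = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i  # all rows except the i-th
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        residuals[i] = y[i] - X[i] @ beta

    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))
    return residuals, rmse, mae
```

With a small sample, this uses every event for validation while never scoring a model on an event it was fitted to.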

Validation results show that our proposed approach provides a relatively good prediction of mLS (adjusted R² = 0.79, RMSE = 0.50, and MAE = 0.40). Although the approach has not been tested in practice, near-real-time predictions are possible because the required predictors can be derived rapidly after an earthquake.

Rapid prediction of mLS can improve our ability to estimate the intensity of a landslide event within a day after an earthquake, and thus it can provide useful information in the emergency response phase. Using the predicted mLS, we can also estimate the maximum landslide area, total landslide area, and volume, which can help us to better understand the balance between crustal advection and seismically induced mass wasting, and thus the landscape evolution process (e.g., Hovius et al. 2011). We can also estimate the frequency-size distribution of a landslide event using the empirical curves of Malamud et al. (2004). Tanyaş et al. (2018a) emphasized the variation in the slope of frequency-size distribution curves and argued that modeling the frequency-size distribution of landslides using an average slope, as Malamud et al. (2004) did, may not be accurate. However, in the absence of a landslide-event inventory, the empirical curves of Malamud et al. (2004) can be quite useful for estimating the size distribution of landslides. Our method needs further calibration using a larger dataset to ensure its validity globally. With a larger EQIL database, the model can be improved by addressing some of the drawbacks mentioned above, and it can then predict mLS with smaller uncertainties.